Why are there 2 requests from my browser? - http

I have a simple node server. All it does is log the req.headers and res (I am learning!).
let http = require('http');
function handleIncomingRequest(req, res) {
    console.log('---------------------------------------------------');
    console.log(req.headers);
    console.log('---------------------------------------------------');
    console.log();
    console.log('---------------------------------------------------');
    res.writeHead(200, {'Content-Type': 'application/json'});
    res.end(JSON.stringify( {error: null}) + '\n');
}
let s = http.createServer(handleIncomingRequest);
s.listen(8080);
When I use curl to test the server, it sends 1 request. When I use Chrome, it sends 2 different requests:
{ host: 'localhost:8080',
connection: 'keep-alive',
'cache-control': 'max-age=0',
'upgrade-insecure-requests': '1',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
accept: 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'accept-encoding': 'gzip, deflate, sdch, br',
'accept-language': 'en-GB,en;q=0.8' }
and
{ host: 'localhost:8080',
connection: 'keep-alive',
'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36',
accept: 'image/webp,image/*,*/*;q=0.8',
referer: 'http://localhost:8080/',
'accept-encoding': 'gzip, deflate, sdch, br',
'accept-language': 'en-GB,en;q=0.8' }
This is in incognito mode; in normal mode there are 3 requests!
What is the browser doing and why?

Hard to tell without seeing the full transaction data (for example, the request line, i.e. what came after GET or POST, and what the server answered).
But it could be caused by the 'upgrade-insecure-requests': '1' header:
When a server encounters this preference in an HTTP request’s headers,
it SHOULD redirect the user to a potentially secure representation of
the resource being requested.
(Quoted from the W3C Upgrade Insecure Requests specification.)
accept: 'image/webp,image/*,*/*;q=0.8'
On the other hand, the second request is probably for an image only, most likely favicon.ico, or possibly a (bigger) icon for iPad/iPhone (that could explain the 3 requests). You should check the full request data to be sure.
You can press F12 and select the Network tab in the browser to see what is really happening.
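To see what each request is actually for, it helps to log the request line (method and path) next to the headers. Below is a minimal sketch of that idea in Python rather than Node (in the question's handler the same information is available as req.method and req.url). The names classify_request, LoggingHandler and make_server are made up for this illustration, and the Accept-based guess is only a heuristic:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json


def classify_request(path, headers):
    """Guess why the browser sent this request (heuristic, for logging only)."""
    if path == "/favicon.ico":
        return "favicon"
    accept = headers.get("accept", "")
    if accept.startswith("image/"):
        return "image"
    return "page"


class LoggingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Header names are case-insensitive; lower-case them for the lookup.
        headers = {name.lower(): value for name, value in self.headers.items()}
        kind = classify_request(self.path, headers)
        print(f"{self.command} {self.path} -> {kind}")
        if kind == "favicon":
            # There is no icon to serve, so answer with an empty 204.
            self.send_response(204)
            self.end_headers()
            return
        body = (json.dumps({"error": None}) + "\n").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port=8080):
    """Build (but do not start) the server; call .serve_forever() to run it."""
    return HTTPServer(("localhost", port), LoggingHandler)
```

Calling make_server().serve_forever() and then loading http://localhost:8080/ in Chrome should print one line for / and, typically, a second one for /favicon.ico.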

Related

Scraping data from https://cardano.ideascale.com webpage, but server noticed I am using Internet Explorer

I am scraping the content of this link. My procedure is:
GET-TOKEN to get a Bearer token.
GET Fork Gitcoin and deploy on Cardano using the above token in the header and get json content in response.
My issue is that when I run the code below, the response to the GET /detail request says that I am using Internet Explorer. That is weird, because my request header has "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36".
<div id="ie-unsupported-alert" class="ie-d-none">
<p>We noticed you are using Internet Explorer. We don\'t have support for this browser in Incoming Moderation!
</p>
<p>We recommend using the Microsoft Edge Browser, Chrome, Firefox or Safari. <a
href="https://help.ideascale.com/knowledge/internet-web-browsers-supported-by-ideascale">Click for more
info.</a></p>
</div>
Can anyone explain the error and teach me how to fix it?
Below is my python code.
import requests


def get_content(url):
    s = requests.session()
    response = s.get(f"https://cardano.ideascale.com/a/community/api/get-token")
    if response.status_code != 200:
        print(f"\033[4m\033[1m{response.status_code}\033[0m")
        return None
    cookies = response.cookies
    headers = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36",
        'Accept': 'application/json,',
        'Accept-Language': 'en-US,en;q=0.5',
        'Accept-Encoding': 'gzip, deflate',
        'Cache-Control': 'no-cache',
        'Pragma': 'no-cache',
        'Authorization': f'Bearer {response.content.decode("utf-8")}',
        'Alt-Used': 'cardano.ideascale.com',
        'Connection': 'keep-alive',
        'Referer': url,
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Site': 'same-origin',
        'TE': 'trailers',
    }
    #import ipdb; ipdb.set_trace()
    response = s.get(f"{url}/detail", headers=headers, cookies=cookies)
    print(response.content)


get_content("https://cardano.ideascale.com/c/idea/317821")

casperjs failed to access certain websites that even wget can

A very simple example link: https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm.
Even wget without any header information can successfully scrape the page.
However, CasperJS just does not work:
var casper=require("casper").create();
var mouse=require("mouse").create(casper);
var link="https://www.accessdata.fda.gov/scripts/cder/daf/index.cfm";
casper.start().then(function() {
    this.open(link);
    this.wait(5000);
});
casper.run(function(){
    this.echo(this.getPageContent()).exit();
});
It always outputs
<html><head></head><body></body></html>
Adding header information does not help, for example:
this.open(link, {
    method: 'get',
    authority: 'www.accessdata.fda.gov',
    path: '/scripts/cder/daf/index.cfm',
    scheme: 'https',
    headers: {
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36',
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
        'accept-encoding': 'gzip, deflate, br',
        'accept-language': 'en-US,en;q=0.9,zh-TW;q=0.8,zh;q=0.7,zh-CN;q=0.6,ja;q=0.5',
        'cache-control': 'max-age=0',
        'sec-fetch-dest': 'document',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-site': 'none',
        'sec-fetch-user': '?1',
        'upgrade-insecure-requests': '1'
    }
});
I tried many combinations of header styles, but none of them work.
However, it is noteworthy that the CasperJS code above works for certain websites, such as http://docs.casperjs.org/en/latest/selectors.html
I just noticed that adding --ssl-protocol=any
casperjs --ssl-protocol=any yourScript.js
solved the issue.
The question "CasperJS/PhantomJS doesn't load https page" has more explanation.

Why doesn't my HTTP request body get transferred to the server?

I made an AJAX HTTP POST request and tried it in Fiddler, where it worked. But when I ran the exact same request in Dart, the request body did not get transferred to the server. Is something wrong with my Dart request body?
Response response = await client.post(
    'https://intranet.tam.ch/krm/timetable/ajax-get-timetable',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Accept': 'application/json, text/javascript, */*; q=0.01',
      'Accept-Language': 'de-ch',
      'Accept-Encoding': 'gzip, deflate, br',
      'Host': 'intranet.tam.ch',
      'Origin': 'https://intranet.tam.ch',
      'User-Agent':
          'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15',
      'Connection': 'keep-alive',
      'Referer': 'https://intranet.tam.ch/krm/calendar',
      'Content-Length': '83',
      'Cookie':
          'school=krm; sturmsession=xx; sturmuser=xx; username=xx',
      'X-Requested-With': 'XMLHttpRequest'
    },
    body:
        'startDate=1597615200000&endDate=598133600000&studentId%5B%5D=x&holidaysOnly=0');
client.close();
print(response.body);
Any answers are highly appreciated
There seems to be some strange behavior when you define Content-Length in the headers manually: the body is never sent. If you remove that header and let the library handle Content-Length, it works.
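A note on why a hand-written Content-Length is fragile in general: the value has to equal the number of bytes in the encoded body, so any change to the body invalidates a hard-coded number. Whether such a mismatch played a role in the question is an assumption; the post does not say. A small Python sketch (the helper name form_body_and_length is hypothetical) that derives the value from the body instead of typing it in:

```python
from urllib.parse import urlencode


def form_body_and_length(fields, encoding="utf-8"):
    """Return (body, content_length) for a form-urlencoded POST.

    The length is counted in bytes after encoding, which is what the
    Content-Length header must carry.
    """
    body = urlencode(fields, encoding=encoding)
    return body, len(body.encode(encoding))


# Field values copied from the question (including its "x" placeholder).
body, length = form_body_and_length([
    ("startDate", "1597615200000"),
    ("endDate", "598133600000"),
    ("studentId[]", "x"),
    ("holidaysOnly", "0"),
])
print(body)
print("Content-Length:", length)
```

HTTP client libraries normally do this computation themselves, which is why leaving the header out is the safer choice.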

How to make an HTTP request with "Sec-Fetch-Mode: no-cors" in Blazor WebAssembly

How is it possible to make a request with HttpClient that carries the HTTP request header Sec-Fetch-Mode: no-cors in Blazor WebAssembly?
My current code is:
var hc = new HttpClient();
var responseHTTP = await hc.GetAsync("https://www.somedomain.com/api/");
But this produces the following HTTP request headers:
:authority: www.somedomain.com
:method: GET
:path: /api/json?input=test&key=AIzaSyDqWvsxxxxxxxxxxxxxxxxx1R7x2qoSkc&sessiontoken=136db14b-88bd-4730-a0b2-9b6c1861d9c7
:scheme: https
accept: */*
accept-encoding: gzip, deflate, br
accept-language: en-US,en;q=0.9
origin: http://localhost:5000
referer: http://localhost:5000/places
sec-fetch-dest: empty
sec-fetch-mode: cors
sec-fetch-site: cross-site
user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36
x-client-data: CJS2yQxxxxxxxxxxxxxxxxxxxxxxxxygEI7bXKAQiOusoBCObGygE=
To specifically answer your question, you need to create an HttpRequestMessage first, e.g.:
// SetBrowserRequestMode and SetBrowserRequestCache are extension methods from
// the Microsoft.AspNetCore.Components.WebAssembly.Http namespace.
var request = new HttpRequestMessage(HttpMethod.Get, "https://www.somedomain.com/api/");
request.SetBrowserRequestMode(BrowserRequestMode.NoCors);
request.SetBrowserRequestCache(BrowserRequestCache.NoStore); //optional
using (var httpClient = new HttpClient())
{
    var response = await httpClient.SendAsync(request);
    var content = await response.Content.ReadAsStringAsync();
}
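This will correctly set the sec-fetch-mode header to no-cors.
I've found, however, that the response comes back empty, even though inspecting the traffic in Fiddler shows that the response is there. That is consistent with how browsers treat no-cors requests: a cross-origin response is opaque, so its body cannot be read from script.
The closest I got to understanding the problem is through this issue here, but unfortunately the bug was closed.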

POST raw to server Processing

I have an Intel Edison running a Node.JS server that is printing everything I post to it into the console. I can successfully post to it using Postman and see the sent raw data in the console.
Now I'm using Processing to POST to it, which will fire off different events on the Node.JS server.
My problem is that I can't seem to successfully POST the raw body to the server, I've been trying to get this working for several hours already.
import processing.net.*;
String url = "192.168.0.107:3000";
Client myClient;
void setup(){
    myClient = new Client(this, "192.168.0.107", 3000);
    myClient.write("POST / HTTP/1.1\n");
    myClient.write("Cache-Control: no-cache\n");
    myClient.write("Content-Type: text/plain\n");
    //Attempting to write the raw post body
    myClient.write("test");
    //2 newlines tells the server that we're done sending
    myClient.write("\n\n");
}
The console shows that the server received the POST, and the correct headers, but it doesn't show any data in it.
How do I specify that "test" is the raw POST data?
The HTTP code from Postman:
POST HTTP/1.1
Host: 192.168.0.107:3000
Content-Type: text/plain
Cache-Control: no-cache
Postman-Token: 6cab79ad-b43b-b4d3-963f-fad11523ec0b
test
The server output from a POST from Postman:
{ host: '192.168.0.107:3000',
connection: 'keep-alive',
'content-length': '4',
'cache-control': 'no-cache',
origin: 'chrome-extension://fhbjgbiflinjbdggehcddcbncdddomop',
'content-type': 'text/plain',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36',
'postman-token': 'd17676a6-98f4-917c-955c-7d8ef01bb024',
accept: '*/*',
'accept-encoding': 'gzip, deflate',
'accept-language': 'en-US,en;q=0.8' }
test
The server output from my POST from Processing:
{ host: '192.168.0.107:3000',
'cache-control': 'no-cache',
'content-type': 'text/plain' }
{}
I just figured out what was wrong: I needed to add the Content-Length header to tell the server how many bytes of body to expect, followed by a blank line before the data.
Final code:
import processing.net.*;
String url = "192.168.0.107:3000";
Client myClient;
void setup(){
    myClient = new Client(this, "192.168.0.107", 3000);
    myClient.write("POST / HTTP/1.1\n");
    myClient.write("Cache-Control: no-cache\n");
    myClient.write("Content-Type: text/plain\n");
    // Length of the body in bytes ("test" is 4 bytes)
    myClient.write("content-length: 4\n");
    // An empty line ends the header section
    myClient.write("\n");
    // The raw POST body
    myClient.write("test");
    myClient.write("\n\n");
}
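For reference, HTTP/1.1 specifies CRLF (\r\n) line endings, requires a Host header, and separates the headers from the body with one empty line; the body is then exactly Content-Length bytes, and nothing needs to follow it. Many servers tolerate bare \n, which is presumably why the code above works against the Node.JS server. A minimal Python sketch of assembling such a request by hand (build_post and send_raw are hypothetical helper names; the address is the one from the question):

```python
import socket


def build_post(host, path, body, content_type="text/plain"):
    """Assemble a complete HTTP/1.1 POST request as bytes."""
    payload = body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Type: {content_type}\r\n"
        # Content-Length counts the bytes of the body, not its characters.
        f"Content-Length: {len(payload)}\r\n"
        "Connection: close\r\n"
        # The empty line marks the end of the headers.
        "\r\n"
    )
    return head.encode("ascii") + payload


def send_raw(address, port, request_bytes):
    """Send the request over a plain TCP socket and return the raw response."""
    with socket.create_connection((address, port), timeout=5) as sock:
        sock.sendall(request_bytes)
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # the server closed the connection
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

With the server from the question running, send_raw("192.168.0.107", 3000, build_post("192.168.0.107:3000", "/", "test")) would send the same request that Postman produces, minus the Postman-specific headers.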
