I am trying to pull a web page in my client (not a browser) with the following settings in the HTTP request headers:
Accept: "text/html;charset=UTF-8"
Accept-Charset: "ISO-8859-1"
User-Agent: "Mozilla/5.0"
However, I get a 406 error code.
I also tried changing to
Accept: "text/html"
with no success; the status code and status message in the response are:
statusCode: 406
statusMessage: "Not Acceptable"
Any idea what the correct header settings should be? The page loads fine in a browser.
Finally figured it out. I ran a sniffer to see which header settings worked, and here is what worked in every case:
headers: {
'User-Agent': 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X; de-de) AppleWebKit/523.10.3 (KHTML, like Gecko) Version/3.0.4 Safari/523.10',
'Accept-Charset': 'ISO-8859-1,UTF-8;q=0.7,*;q=0.7',
'Accept-language' : 'de,en;q=0.7,en-us;q=0.3'
}
You should add an Accept-Language header.
Why are you sending contradictory headers? You are requesting a representation that is both UTF-8 and ISO-8859-1 at the same time. I suppose you could interpret the request as being for a 7-bit ASCII representation.
In this case I would omit the Accept-Charset header and change the Accept header to text/html, */*;q=0.1, so that you will get something back with a strong preference for HTML. See the Content Negotiation section of RFC 7231 for details about these headers.
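For illustration, a minimal sketch of such a request in Python using the requests library (the URL is a placeholder, and the Accept-Language value is just an example):

import requests

# Hypothetical target URL; replace with the page being fetched.
url = "https://example.com/page"

headers = {
    # Strongly prefer HTML, but accept anything else at a low priority.
    "Accept": "text/html, */*;q=0.1",
    # No Accept-Charset header: let the server pick the charset it supports.
    "User-Agent": "Mozilla/5.0",
    "Accept-Language": "en,de;q=0.7",
}

response = requests.get(url, headers=headers)
print(response.status_code, response.headers.get("Content-Type"))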
Related
Working on some code that I inherited from a non-responsive initial developer. My ASP.NET and web.config knowledge is at best "dated", so I'm turning to the community for some help. One of the first things I had to change was to force this website to operate over SSL (https:), as it deals with sensitive data. The program immediately stopped working, and I had to make some undesirable changes to code that "already worked". It still seems broken, and the changes won't make the client happy.
This is an ASP.NET project that seems hand-rolled.
Sending a POST command with some body text that (I think) is JSON, setting additional parameters for the POST command, such as: "indexID=8379fcd1-5083-4d1c-a6ee-5812f134a505".
As far as I can tell, this works as intended over non-SSL (i.e. http: requests). However, when running over SSL (i.e. https: requests), it appears that the body (the JSON text) isn't getting decoded into the HttpContext.Current.Request parameters (which does seem to happen over http:).
However, the post_data that I can read from the input stream has the JSON body text (as clear text?) with the parameters, which my 'fix' adds to the incoming HttpContext.Current.Request parameters as a combined dictionary.
[Here is the RAW command intercepted with Fiddler]
POST https://vmdev-xpp/BuilderQC/Services/Data.svc/QueryGridResults?typename=ImportReadyForDownload HTTP/1.1
Host: vmdev-xpp
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:92.0) Gecko/20100101 Firefox/92.0
Accept: application/json, text/javascript, */*; q=0.01
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Content-Type: application/x-www-form-urlencoded
X-Requested-With: XMLHttpRequest
Content-Length: 378
Origin: https://vmdev-xpp
Connection: keep-alive
Referer: https://vmdev-xpp/BuilderQC/BREDFileManagement.aspx
Cookie: ASP.NET_SessionId=ehi2ccsdmekvkgegmfh11n1v; .ASPXLMPTest=2226D5725FBC10FBCCD606108CE5A4E32990EEA8FF8A1864496F69874F116D7E1ABF48A8BFD05EE683FE3F456D4475E88A61B19B299CB557209129BD25E87AC38CECA5303C7E2035E64C1F5A4AD2605D8581181A9C7E48680371F83BC7A93D7A63D8748EA4761A608F424578C20127D01DE0E2FBFBD5F079575E86FD506925D541026B7C8713FDEFE108BCEADBFC1DA0
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-origin
_search=true&nd=1629052554858&rows=40&page=1&sidx=&sord=asc&QC_ProjectID=14&FileReadyForDownload=1&filters=%7B%22fields%22%3A%5B%7B%22field%22%3A%22QC_ProjectID%22%2C+%22op%22%3A+%22cn%22%2C+%22value%22%3A%2214%22%7D%2C%7B%22field%22%3A%22FileReadyForDownload%22%2C+%22op%22%3A+%22cn%22%2C+%22value%22%3A%221%22%7D%5D%7D&indexID=8379fcd1-5083-4d1c-a6ee-5812f134a505&entity=false
[Here is the post_data I obtained from the input stream; I think that this being in clear text is suspect]
Post_data = _search=true&nd=1629052767566&rows=40&page=1&sidx=&sord=asc&QC_ProjectID=14&FileReadyForDownload=1&filters=%7B%22fields%22%3A%5B%7B%22field%22%3A%22FileName%22%2C+%22op%22%3A+%22cn%22%2C+%22value%22%3A%22th%22%7D%2C%7B%22field%22%3A%22QC_ProjectID%22%2C+%22op%22%3A+%22cn%22%2C+%22value%22%3A%2214%22%7D%2C%7B%22field%22%3A%22FileReadyForDownload%22%2C+%22op%22%3A+%22cn%22%2C+%22value%22%3A%221%22%7D%5D%7D&indexID=8379fcd1-5083-4d1c-a6ee-5812f134a505&entity=false&FileName=th
Here is the incoming HttpContext.Current.Request.Params.AllKeys. Notice that "indexID" is missing, among other parameters.
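For what it's worth, the workaround described above (reading post_data from the input stream and merging its parameters into the request) amounts to URL-decoding the form body. A minimal sketch of that parsing step, in Python rather than ASP.NET and with a truncated body, just to illustrate the idea:

from urllib.parse import parse_qs

# Raw body as read from the request input stream (truncated example).
post_data = "_search=true&rows=40&page=1&QC_ProjectID=14&indexID=8379fcd1-5083-4d1c-a6ee-5812f134a505&entity=false"

# parse_qs maps each field name to a list of values; take the first of each.
params = {key: values[0] for key, values in parse_qs(post_data).items()}

print(params["indexID"])   # 8379fcd1-5083-4d1c-a6ee-5812f134a505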
I have a problem with recreating the headers; everything seems identical, but it just doesn't work. I need those headers to access the Instagram API.
I tried to use Charles to intercept the traffic from a mobile device and it's working as expected, but I'm struggling to recreate the same headers.
URL is https://i.instagram.com/api/v1/feed/user/7499201770/reel_media/
Headers are
:method: GET
:scheme: https
:path: /api/v1/feed/user/7499201770/reel_media/
:authority: i.instagram.com
content-type: application/json
authority: i.instagram.com
accept: */*
path: /api/v1/feed/user/7499201770/reel_media/
accept-language: en-IN;q=1.0
accept-encoding: gzip;q=1.0, compress;q=0.5
content-length: 2
user-agent: Instagram 10.29.0 (iPhone7,2; iPhone OS 9_3_3; en_US; en-US; scale=2.00; 750x1334) AppleWebKit/420+
referer: https://www.instagram.com/
x-ig-capabilities: 3w==
cookie: ds_user_id=6742557571; sessionid=IGSCf716eb61bf2a6d41f...
I tried to use Postman to recreate this request, but every time I get the same error, "Login required". How should I paste those headers? I can't figure it out.
It was the user-agent: Instagram 10.29.0 (iPhone7,2; iPhone OS 9_3_3; en_US; en-US; scale=2.00; 750x1334) AppleWebKit/420+ that I didn't copy.
With the user-agent it works, so the HTTP headers will look like this (in case someone is writing an Instagram story saver):
["Content-Type": "application/json",
"Accept-encoding": "gzip, deflate",
"User-agent": "Instagram 10.29.0 (iPhone7,2; iPhone OS 9_3_3; en_US; en-US; scale=2.00; 750x1334) AppleWebKit/420+",
"Cookie": "ds_user_id=67425...; sessionid=IGSCf716eb61b....]
I'm printing out all the headers and I get:
map[Cookie:[_ga=GA1.2.843429125.1462575405] User-Agent:[Mozilla/5.0
(Macintosh; Intel Mac OS X 10_11_3) AppleWebKit/601.4.4 (KHTML, like Gecko)
Version/9.0.3 Safari/601.4.4] Accept-Language:[en-us]
Accept-Encoding:[gzip, deflate] Connection:[keep-alive]
Accept:[text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8]]
which means my browser is sending "Cookie", "User-Agent", "Accept-Language", "Accept-Encoding", "Connection", and "Accept" but there is no "Host" value.
How can I get virtual hosting (https://en.wikipedia.org/wiki/Virtual_hosting) working without this value?
I'm using https://github.com/gin-gonic/gin
It's stated in the Go net/http docs:
For incoming requests, the Host header is promoted to the Request.Host
field and removed from the Header map.
So you can get the host by accessing
http.Request.Host
Check here for details: https://golang.org/pkg/net/http/
That's from Wikipedia:
For version 1.1 of the HTTP protocol, the chunked transfer mechanism is considered to be always and anyways acceptable, even if not listed in the TE (transfer encoding) request header field
That's what I get from clients (Mozilla, Opera):
GET http://www.google.com/ HTTP/1.1
Host: www.google.com
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:45.0) Gecko/20100101 Firefox/45.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Apparently there is no Transfer-Encoding field there, nor do I see any chunks (I've checked with a hex editor; no additional symbols).
I open the connection as follows (Python):
socket.socket(socket.AF_INET, socket.SOCK_STREAM)
Is it the lower-level handling that joins chunks into a message? If so, how can I know where the HTTP message ends, so that I can stop reading the request and start handling it?
You should read the specification.
But simply, in this case, since it's a GET and there's no content, there's not going to be a Content-Length header. So you stop reading when you get the empty line with just a CR/LF.
Otherwise, you read past that blank line, and read Content-Length bytes.
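A minimal sketch of that framing logic in Python, assuming no chunked encoding and that conn is an already-accepted socket:

def read_http_request(conn):
    # Read until the blank line (CRLF CRLF) that ends the header section.
    data = b""
    while b"\r\n\r\n" not in data:
        chunk = conn.recv(4096)
        if not chunk:
            break
        data += chunk
    head, _, body = data.partition(b"\r\n\r\n")

    # Look for a Content-Length header; a plain GET normally has none.
    content_length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            content_length = int(value.strip())

    # Read the rest of the body, if any was announced.
    while len(body) < content_length:
        chunk = conn.recv(4096)
        if not chunk:
            break
        body += chunk

    return head, body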
Given this snippet:
(defroutes main-routes
(POST "/input/:controller" request
(let [buff (ByteArrayOutputStream.)]
(copy (request :body) buff)
;; --- snip
The value of buff will be a non-empty byte array iff the Content-Type header is present in the request. The value can be nonsensical; the header just has to be there.
However, I need to dump the body (hm... that came out wrong) if the request came without a content type, so that the client can track down the offending upload. (The uploading software is not under my control and its maintainers won't provide anything extra in the headers.)
Thank you for any ideas on how to solve or work around this!
EDIT:
Here are the headers I get from the client:
{
"content-length" "159",
"accept" "*/*",
"host" (snip),
"user-agent" (snip)
}
Plus, I discovered that Ring, using an instance of Java's ServletRequest, fills in the content type with the standard default, x-www-form-urlencoded. I'm now guessing that HTTPParser, which supplies the body through HTTPParser#Input, can't parse it correctly.
I face the same issue. It's definitely one of the middleware layers not being able to parse the body correctly and transforming :body. The main issue is that the Content-Type suggests the body should be parsable.
Using ngrep, I found out how curl confuses the middleware. The following, while intuitive (or rather sexy) on the command line, sends a wrong Content-Type which confuses the middleware:
curl -nd "Unknown error" http://localhost:3000/event/error
T 127.0.0.1:44440 -> 127.0.0.1:3000 [AP]
POST /event/error HTTP/1.1.
Authorization: Basic SzM5Mjg6ODc2NXJkZmdoam5idmNkOQ==.
User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3.
Host: localhost:3000.
Accept: */*.
Content-Length: 13.
Content-Type: application/x-www-form-urlencoded.
.
Unknown error
The following, however, forces the Content-Type to be opaque, and the middleware will not interfere with the :body.
curl -nd "Unknown error" -H "Content-Type: application/data" http://localhost:3000/event/error
T 127.0.0.1:44441 -> 127.0.0.1:3000 [AP]
POST /event/error HTTP/1.1.
Authorization: Basic SzM5Mjg6ODc2NXJkZmdoam5idmNkOQ==.
User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3.
Host: localhost:3000.
Accept: */*.
Content-Type: application/data.
Content-Length: 13.
.
Unknown error
I'm considering replacing the middleware with a more liberal one because even though the request is wrong, I'd still like to be able to decide what to do with the body myself. It's a really weird choice to zero the request body when the request doesn't make sense. I actually think a more correct behavior would be to pass it to an error handler which by default would return a 400 Bad Request or 406 Not Acceptable.
Any thoughts on that? In my case I might propose a patch to Compojure.
According to:
http://mmcgrana.github.com/ring/ring.middleware.content-type-api.html
the default content type is application/octet-stream. Unless you actively support that content type, can't you just check if the content type matches that one, and then dump whatever you need based on that?
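For what it's worth, here is the same check-and-dump idea sketched outside of Ring, as a small Python/WSGI analogue (the fallback content type and the logging are assumptions for illustration, not part of the original setup):

import io
import logging

FALLBACK_TYPE = "application/octet-stream"  # default filled in by the content-type middleware

def dump_unknown_bodies(app):
    # WSGI middleware: if a request arrives with the fallback content type,
    # log the raw body for debugging before handing the request on.
    def wrapper(environ, start_response):
        if environ.get("CONTENT_TYPE", FALLBACK_TYPE) == FALLBACK_TYPE:
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            logging.warning("Request without a usable Content-Type, body: %r", body)
            # Put the consumed bytes back so the downstream app can still read them.
            environ["wsgi.input"] = io.BytesIO(body)
        return app(environ, start_response)
    return wrapper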