http client got bad request page from google - http

I sent the following HTTP request to Google, and it returned a Bad Request page. Was there anything wrong with my request? I was implementing a proxy server in C++. I redirected clients' requests to the servers they wanted to connect to. Before redirecting, I inserted "\r\nConnection: close" into the request. Was the position I inserted it at wrong? Thanks. (I use "###" to surround the request.)
###GET http://www.google.com.tw/ HTTP/1.1
Host: www.google.com.tw
User-Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:11.0) Gecko/20100101 Firefox/11.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-tw,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip, deflate
Proxy-Connection: keep-alive
Connection: close
Cookie:***
###

What you have in there is not correct per spec, although I wouldn't be surprised if some servers actually responded to it (but not Google's).
Proxy-Connection is a misnomer and not needed at all.
The GET request should provide a relative path, not an absolute one. To be clear: the client does need to send a full address in the GET header, but the proxy needs to extract it and rewrite it such that GET carries the path, and Host header carries the hostname.
To try a couple of simple experiments, simply telnet google.com 80 and copy-paste your request followed by a few CRLFs.
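To make the rewrite concrete, here is a minimal Python sketch of what the proxy should do to the request line before forwarding (the function name is mine, and Python stands in for the C++ the question uses; the logic is the same):

```python
from urllib.parse import urlsplit

def rewrite_request_line(request_line):
    """Turn an absolute-form request line, as a browser sends to a proxy,
    into the origin-form the target server expects."""
    method, target, version = request_line.split(" ")
    parts = urlsplit(target)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return "{} {} {}".format(method, path, version)

# "GET http://www.google.com.tw/ HTTP/1.1" becomes "GET / HTTP/1.1";
# nothing is lost, since the Host header already carries the hostname.
```

The proxy would also drop the Proxy-Connection header at the same point, since the origin server has no use for it.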

Related

Postman gets different response on same API endpoint vs browser

I've been trying to create an app which does some requests against the Wizzair API, and found that there is this endpoint at /Api/search/search. While searching for flights in the browser, this endpoint returns a list of flights based on the parameters provided, as a JSON response. When accessing the same endpoint from Postman, copying the same headers and body as the original request, I get a 428 response. That seems kind of odd, since the headers and body are exactly the same as in the Network tab of the Developer Tools.
Here's a reference URL: https://wizzair.com/#/booking/select-flight/LTN/VIE/2022-07-23/2022-08-05/1/0/0/null
The added headers are:
Host: be.wizzair.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:101.0) Gecko/20100101 Firefox/101.0
Accept: application/json, text/plain, */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Referer: https://wizzair.com/
Content-Type: application/json;charset=utf-8
X-RequestVerificationToken: <token>
Content-Length: 254
Origin: https://wizzair.com
Connection: keep-alive
Cookie: <some_cookies>
Sec-Fetch-Dest: empty
Sec-Fetch-Mode: cors
Sec-Fetch-Site: same-site
TE: trailers
And the body is added as raw json:
{"isFlightChange":false,"flightList":[{"departureStation":"LTN","arrivalStation":"VIE","departureDate":"2022-07-24"},{"departureStation":"VIE","arrivalStation":"LTN","departureDate":"2022-08-05"}],"adultCount":1,"childCount":0,"infantCount":0,"wdc":true}
The response from postman is:
{"sec-cp-challenge": "true","provider":"crypto","branding_url_content":"/_sec/cp_challenge/crypto_message-3-7.htm","chlg_duration":30}
Could anyone explain to me why there is a different behavior on the browser vs postman on the exact same request and if possible replicate the proper response in postman?
Don't know if it is still relevant.
But this one
{"sec-cp-challenge": "true","provider":"crypto","branding_url_content":"/_sec/cp_challenge/crypto_message-3-7.htm","chlg_duration":30}
is a fingerprint of Akamai bot protection. AFAIK it uses JS to tell a real browser from scripted requests. It stores the result in cookies, obfuscating it in every possible way. The good thing is that you can copy the cookies from your browser session, and that way get several requests with meaningful results. After that, Akamai starts changing the cookies again, and you'll have to start all over.

"Serializing" an HTTP(S) Request

The software I'm using is completely blind when it comes to HTTP. I'd like to add a really basic way to send GET requests over an existing TCP or TLS connection.
I've had a look at http.cap from the Wireshark wiki, and it seems to me that an HTTP message (when no WebSocket upgrades are involved, that is) is simply TCP payload. Therefore, could I simply get away with sending the following text as a hex stream over TCP? (1)
GET /download.html HTTP/1.1\r\n
Host: www.ethereal.com\r\n
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113\r\n
Accept:text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7\r\n
Keep-Alive: 300\r\n
Connection: keep-alive\r\n
Referer: http://www.ethereal.com/development.html\r\n
\r\n
If so, would TLS cause an issue? (2) I can't see how, but I'd rather ask now than face the problem later.
Also (3) Is the TCP header any different? Again, same reasoning as above. As far as I know it's a normal TCP packet.
Thanks in advance for your input. I'm a complete HTTP beginner, if my questions are odd that's why.
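For what it's worth, the idea in the question can be sketched in a few lines of Python (the host and path are just the values from the capture above; any HTTP/1.1 server behaves the same way):

```python
import socket
import ssl

REQUEST = (
    "GET /download.html HTTP/1.1\r\n"
    "Host: www.ethereal.com\r\n"
    "Connection: close\r\n"
    "\r\n"
)

def fetch_raw(host, port=80, use_tls=False):
    """Send a hand-written GET request as plain bytes and read the reply.
    HTTP/1.1 really is just text over TCP; for HTTPS the only change is
    wrapping the socket in TLS before writing the exact same bytes."""
    sock = socket.create_connection((host, port))
    if use_tls:
        # TLS answers question (2): the request text is unchanged, it is
        # simply written through an encrypting wrapper around the socket.
        sock = ssl.create_default_context().wrap_socket(sock, server_hostname=host)
    with sock:
        sock.sendall(REQUEST.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

As for (3): nothing about the TCP segments themselves is special; the kernel's socket layer handles them, and HTTP only defines the bytes of the payload.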

Trying to understand how to respond to CORS OPTIONS request with 403 and when

I really want some validation, after reading many websites on CORS, to see if I have this right.
OPTIONS /frog/LOTS/upload/php.php HTTP/1.1
Host: staff.curriculum.local
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:14.0) Gecko/20100101 Firefox/14.0.1
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Origin: http://frogserver.curriculum.local
Access-Control-Request-Method: POST
Access-Control-Request-Headers: cache-control,x-requested-with
Pragma: no-cache
Cache-Control: no-cache
Here is when I think I respond with 403:
1. If the origin set on the server is not * and not that domain, I respond with 403.
2. If I do not support POST (I may support GET), I respond with 403.
3. If I do not support any ONE of the request headers, I respond with 403.
For #1, if the domain is not supported, I will NOT send any Access Control headers in the response. I think that is ok.
for #2 and #3, I would send these headers assuming Origin request header was a match
Access-Control-Allow-Origin: http://frogserver.curriculum.local
Access-Control-Allow-Credentials: {exists and is true IF supported on server}
Access-Control-Allow-Headers: {all request headers we support and not related to incoming 'Access-Control-Request-Headers' in the request}
Access-Control-Allow-Methods: {all methods supported NOT related to incoming method POST that came in?}
Access-Control-Expose-Headers: {all headers we allow browser origin website to read}
Is my assumption correct here? or are some of the response headers related to the request headers in some way I am not seeing? (The names are similar but I don't think the behavior is tied to each other).
I would find it really odd that the request even needs to contain Access-Control-Request-Method & Access-Control-Request-Headers if we did not send back a 403 in the cases where we don't support all requested information on that endpoint, right? This is why I 'suspect' we are supposed to return a 403 along with what we do support?
thanks,
Dean
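The three refusal rules in the question can be written down as a small decision function. A Python sketch, where the allowed sets are assumptions based on the preflight request shown:

```python
# Assumed server-side configuration, matching the request in the question.
ALLOWED_ORIGINS = {"http://frogserver.curriculum.local"}
ALLOWED_METHODS = {"POST"}
ALLOWED_HEADERS = {"cache-control", "x-requested-with"}

def preflight_status(origin, method, requested_headers):
    """Return the status code for an OPTIONS preflight, following the
    three 403 cases listed in the question."""
    if origin not in ALLOWED_ORIGINS:
        return 403  # case 1: origin not allowed; send no Access-Control-* headers
    if method not in ALLOWED_METHODS:
        return 403  # case 2: method not supported
    if not {h.lower() for h in requested_headers} <= ALLOWED_HEADERS:
        return 403  # case 3: at least one requested header not supported
    return 204      # preflight OK; respond with the Access-Control-Allow-* headers
```

Worth noting: nothing in the spec forces a 403 here. A server may also just omit the Access-Control-Allow-* headers and let the browser fail the preflight check on its own; 403 is simply a common, explicit convention.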

How to verify if the user and password part is readable by intermediary servers when accessing a url

If I want to use URL authentication of the form:
http://user:password@domain.tld
Is it guaranteed that the user and password are not seen by any servers between me and the requests's destination?
I'd like to know how to check this myself.
This is what I tried so far:
I've set a netcat server to listen on port 9090 and sent request with both python's request library and curl to check out what is received by a server.
Python request:
requests.get("http://foo:bar#localhost:9090")
What nc shows me:
GET / HTTP/1.1
Host: localhost:9090
User-Agent: python-requests/2.9.1
Connection: keep-alive
Accept-Encoding: gzip, deflate
Accept: */*
Authorization: Basic Zm9vOmJhcg==
curl request:
curl 'foo:bar@localhost:9090'
nc:
GET / HTTP/1.1
Host: localhost:9090
Authorization: Basic Zm9vOmJhcg==
User-Agent: curl/7.47.0
Accept: */*
I noticed that the credentials are converted into an Authorization header. Netcat shows me the headers and body of the request, but not what happens to the request on its way there.
Is it possible for intermediate servers to find out my credentials?
Is it possible for intermediate servers to find out my credentials?
Yes, if you are communicating over an unencrypted channel (plain http). The Authorization header is nothing but the Base64 encoding of "username:password", and Base64 is an encoding, not encryption. You can try decoding "Zm9vOmJhcg==" yourself, for example at https://codebeautify.org/base64-decode. However, if you are communicating over a secure channel (https), intermediate servers cannot see it.
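You can verify the decoding yourself with a couple of lines of Python:

```python
import base64

# Base64 is a reversible encoding, not encryption: anyone who sees the
# header value can recover the original "username:password" pair.
header_value = "Zm9vOmJhcg=="
credentials = base64.b64decode(header_value).decode("ascii")
print(credentials)  # prints "foo:bar"
```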

HTTP 307 redirect does not work in firefox

We have a custom webserver, written in C.
When the browser visits the page http://mydomain.com:30001/index.html,
our webserver redirects it to mydomain.com:30001/login.html by sending an HTTP 307 response; the browser then visits the login URL.
This worked well in IE 8, and Chrome.
But in Firefox (18+), when visiting the page http://mydomain.com:30001/index.html,
the browser cannot load the page (neither /index.html nor /login.html) and seems to stay in the loading state forever. (And the Firebug > Network panel shows nothing.)
I also tried the Firefox setting
Tools > Options > Advanced > General : Accessibility : [ ] "Warn me when web sites try to redirect or reload the page",
but it had no effect; nothing changed.
So I wonder why Firefox behaves differently, or whether there's another reason.
Update: here's firefox HTTP part captured in wireshark
1.REQUEST(when visiting http://mydomain.com:30001/index.html in the browser addressbar)
GET /index.html HTTP/1.1
Host: mydomain.com:30001
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-cn,zh;q=0.8,en-us;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
DNT: 1
Connection: keep-alive
2.RESPONSE
HTTP/1.1 307 Temporary Redirect
Connection: keep-alive
Location: /login.html
that's all, and Firefox does not fetch /login.html with another request.
By comparing with responses from other servers, it looks like adding
Content-Length: 0
to the response headers solved the problem. Thanks.
According to the protocol, when no Content-Length is given, the message length can be determined by the connection being closed.
My original response provides no Content-Length, so the browser waits for the end of the transfer to learn the length, but Connection: keep-alive means the connection never closes.
I guess IE and Chrome start redirect processing as soon as they know it's a 307 redirect, while Firefox does not do so until it has finished reading the response.
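Put together, a well-formed version of the response would look like this (built as a Python string here just to show the exact bytes on the wire):

```python
# With keep-alive, the response must declare its own length; without
# Content-Length the browser cannot tell that the (empty) body has ended.
RESPONSE = (
    "HTTP/1.1 307 Temporary Redirect\r\n"
    "Location: /login.html\r\n"
    "Connection: keep-alive\r\n"
    "Content-Length: 0\r\n"
    "\r\n"
)
```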
Here's a test case for 307 that works with Firefox: http://greenbytes.de/tech/tc/httpredirects/#t307loc. You'll have to find out what's different in your server.
