How to test whether HTTP keep-alive is actually working

I know HTTP keep-alive is on by default in HTTP/1.1, but I want a way to confirm that it is actually working.
Does anyone know a simple way to test this from a web browser (e.g. how to make sense of a Wireshark capture)? I know I need to look for multiple HTTP requests over the same TCP connection, but I don't know how to confirm that in Wireshark or any other way.
Thanks!

As Ron Garrity said on ServerFault, you can use Curl like this:
curl -Iv http://www.aptivate.org 2>&1 | grep -i 'connection #0'
And it outputs these two lines if keep-alive is working:
* Connection #0 to host www.aptivate.org left intact
* Closing connection #0
And if keep-alive is not working, it outputs only this line:
* Closing connection #0
(Newer curl versions print Closing connection 0 without the #, as in the examples further down, so adjust the grep accordingly.)

If you're on Windows Vista or later, you can use Resource Monitor. Its Network tab lists all open TCP connections and the process that started each one. Open a browser with one tab, browse to your page, and watch whether the connection is reused when you reload.
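If you prefer the command line, netstat shows the same information. A quick sketch (assuming a default Windows install; run it once, reload the page, and run it again):
netstat -ano | findstr :80
If keep-alive is working, the reload reuses the existing connection, so the same local port shows up in the ESTABLISHED line instead of a new connection appearing.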

First, try to capture the traffic to the target website in Wireshark and limit it to what you need with a filter like:
tcp port 80 and host targetwebsite.com
Then load the page in a browser or fetch it with any tool you have. If the target web page refreshes itself or one of its values changes, leave it open until you have seen at least one refresh.
Now you have enough data and you can stop the capture in Wireshark.
You should see dozens of records whose protocol is TCP or HTTP. For this quick check you will not need the TCP records, so let's hide them with another filter: type http into the "filter" field at the top of the window, and Wireshark will hide every record whose protocol is not HTTP.
Now select a record and look at the next level of detail, in the second pane below the record list. To be sure you are looking at the right place: the first line there starts with "Frame XYZ" and the fourth line starts with "Transmission Control Protocol". Look for the port numbers after "Source Port:" and "Destination Port:". Depending on the record, one of these belongs to the web server (typically 80) and the other is the port on your end.
Now check a couple of different GET records (the Info column tells you whether a record is a GET request). If the same port on your end appears across several of them, all those requests were made over one TCP connection, which means HTTP keep-alive is working.
Remember that most browsers open multiple connections even when the web server supports keep-alive, so DO NOT conclude your evaluation just because you find one different port.
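If you would rather script this check than click through the GUI, tshark (Wireshark's command-line companion) can dump the relevant fields directly. A sketch, assuming you saved the capture to capture.pcap:
tshark -r capture.pcap -Y 'http.request.method == "GET"' -T fields -e tcp.stream -e tcp.srcport -e http.request.uri
Every GET request sharing a tcp.stream number (and source port) travelled over the same keep-alive connection.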

The most accurate way is to curl the same URL multiple times.
curl -v http://weibo.com -o /dev/null http://weibo.com -o /dev/null
If the output contains Re-using existing connection, then the HTTP keep-alive feature is working. For example,
* TCP_NODELAY set
* Connected to weibo.com (180.149.138.251) port 80 (#0)
> GET / HTTP/1.1
> Host: weibo.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< ...
< ...
<
{ [236 bytes data]
* Connection #0 to host weibo.com left intact
* Found bundle for host weibo.com: 0x56324121d9a0 [serially]
* Can not multiplex, even if we wanted to!
* Re-using existing connection! (#0) with host weibo.com
* Connected to weibo.com (180.149.138.251) port 80 (#0)
> GET / HTTP/1.1
> Host: weibo.com
> User-Agent: curl/7.68.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 301 Moved Permanently
< ...
< ...
<
{ [236 bytes data]
* Connection #0 to host weibo.com left intact
Another quick way is to test with ab (ApacheBench). But some HTTP servers (uwsgi, for instance) do not return the Connection: keep-alive header even when keep-alive is enabled, and in such cases ab does not send keep-alive requests. That means ab can only give a "positive" detection of HTTP keep-alive: a non-zero keep-alive count proves it works, but a zero count does not prove it is off.
ab -c 5 -n 50 -k https://www.google.com/
If the result shows
...
Complete requests: 50
Failed requests: 0
Keep-Alive requests: 50 # Pay attention to this line
Total transferred:
...
Then the HTTP keep-alive is enabled.

Related

How do I find out what Content Types are on offer (for HTTP Content Negotiation)?

What one gets back when resolving a DOI depends on content negotiation.
I was looking at https://citation.crosscite.org/docs.html#sec-3
and I see different services offer different Content Types.
For a particular URL I want to know all the content types it can give me.
Some of them might be more useful than any I am aware of (i.e. I don't want to write a list of preferences in advance).
For example:
https://doi.org/10.5061/dryad.1r170
I thought maybe OPTIONS was the way to do it
but that gave back nothing interesting, only about allowed request methods.
shell> curl -v -X OPTIONS http://doi.org/10.5061/dryad.1r170
* Hostname was NOT found in DNS cache
* Trying 2600:1f14:6cf:c01::d...
* Trying 54.191.229.235...
* Connected to doi.org (2600:1f14:6cf:c01::d) port 80 (#0)
> OPTIONS /10.5061/dryad.1r170 HTTP/1.1
> User-Agent: curl/7.38.0
> Host: doi.org
> Accept: */*
>
< HTTP/1.1 200 OK
* Server Apache-Coyote/1.1 is not blacklisted
< Server: Apache-Coyote/1.1
< Allow: GET, HEAD, POST, TRACE, OPTIONS
< Content-Length: 0
< Date: Mon, 29 Jan 2018 07:01:14 GMT
<
* Connection #0 to host doi.org left intact
I guess there is no such standard yet, but the Link header (https://www.w3.org/wiki/LinkHeader) could expose this information.
Personally, though, I wouldn't rely too much on it: a server could start sending a new content type and still not expose it via this header.
It might be useful to check the API response headers frequently, manually or by automated means, for any changes.
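In the meantime, the only practical approach is to probe candidate types one at a time and see which ones the server honours. A sketch against the DOI from the question; the Accept values are assumptions taken from the crosscite documentation page linked above, not something the server advertises:
curl -sL -o /dev/null -w '%{http_code} %{content_type}\n' -H 'Accept: application/x-bibtex' https://doi.org/10.5061/dryad.1r170
curl -sL -o /dev/null -w '%{http_code} %{content_type}\n' -H 'Accept: text/x-bibliography; style=apa' https://doi.org/10.5061/dryad.1r170
A 200 with a matching content type means the type is supported; a 406 Not Acceptable (or a fallback to HTML) means it is not.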

How to make an HTTP GET request manually with netcat?

So, I have to retrieve the temperature of any one of the cities listed at http://www.rssweather.com/dir/Asia/India.
Let's assume I want to retrieve Kanpur's.
How to make an HTTP GET request with Netcat?
I'm doing something like this.
nc -v rssweather.com 80
GET http://www.rssweather.com/wx/in/kanpur/wx.php HTTP/1.1
I don't know whether I'm even heading in the right direction. I was not able to find any good tutorials on making an HTTP GET request with netcat, so I'm posting it here.
Of course you could dig into the standards you find via Google, but if you only want to fetch a single URL, it isn't worth the effort.
You could also start a netcat in listening mode on a port:
nc -l 64738
(Sometimes nc -l -p 64738 is the correct argument list)
...and then make a request to this port with a real browser: just type http://localhost:64738 into the address bar and see what arrives.
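What lands in the netcat window is the raw request the browser sent, along these lines (the exact headers vary by browser; this sample is illustrative, not verbatim):
GET / HTTP/1.1
Host: localhost:64738
Connection: keep-alive
User-Agent: Mozilla/5.0 (...)
Accept: text/html,application/xhtml+xml,...
This is exactly the format you need to reproduce by hand.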
In your actual case the problem is that HTTP/1.1 doesn't close the connection automatically; it waits for the next URL you want to retrieve. The solution is simple:
Use HTTP/1.0:
GET /this/url/you/want/to/get HTTP/1.0
Host: www.rssweather.com
<empty line>
or use a Connection: request header to say the server you want to close after that:
GET /this/url/you/want/to/get HTTP/1.1
Host: www.rssweather.com
Connection: close
<empty line>
Extension: after GET, write only the path part of the request. The hostname you want to get data from belongs in a Host: header, as you can see in my examples. This is because multiple websites can run on the same web server, so the browser needs to tell the server which site it wants the page from.
This works for me:
$ nc www.rssweather.com 80
GET /wx/in/kanpur/wx.php HTTP/1.0
Host: www.rssweather.com
And then hit <enter> twice: the blank line is what tells the remote HTTP server the request is complete.
source: pentesterlabs
You don't even need to use/install netcat:
Create a TCP socket via an unused file descriptor (I use 88 here)
Write the request into it
Read the response back from the same fd
# open a bidirectional TCP connection to rssweather.com:80 on file descriptor 88 (a bash feature)
exec 88<>/dev/tcp/rssweather.com/80
# write the raw request; Connection: close tells the server to end the connection after responding
echo -e "GET /dir/Asia/India HTTP/1.1\nhost: www.rssweather.com\nConnection: close\n\n" >&88
# read the response from the same fd and strip the HTML tags
sed 's/<[^>]*>/ /g' <&88
On MacOS, you need the -c flag as follows:
Little-Net:~ minfrin$ nc -c rssweather.com 80
GET /wx/in/kanpur/wx.php HTTP/1.1
Host: rssweather.com
Connection: close
[empty line]
The response then appears as follows:
HTTP/1.1 200 OK
Date: Thu, 23 Aug 2018 13:20:49 GMT
Server: Apache
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html
The -c flag is described as "Send CRLF as line-ending".
To be HTTP/1.1 compliant, you need the Host header, as well as the "Connection: close" if you want to disable keepalive.
Test it out locally with python3 http.server
This is also a fun way to test it out. On one shell, launch a local file server:
python3 -m http.server 8000
Then on the second shell, make a request:
printf 'GET / HTTP/1.1\r\nHost: localhost\r\n\r\n' | nc localhost 8000
The Host: header is required in HTTP 1.1.
This shows an HTML listing of the directory, just as you would see from:
firefox http://localhost:8000
Next you can try to list files and directories and observe the response:
printf 'GET /my-subdir/ HTTP/1.1\n\n' | nc localhost 8000
printf 'GET /my-file HTTP/1.1\n\n' | nc localhost 8000
Every time you make a successful request, the server prints:
127.0.0.1 - - [05/Oct/2018 11:20:55] "GET / HTTP/1.1" 200 -
confirming that it was received.
example.com
This IANA maintained domain is another good test URL:
printf 'GET / HTTP/1.1\r\nHost: example.com\r\n\r\n' | nc example.com 80
and compare with: http://example.com/
https SSL
nc does not seem to be able to handle https URLs. Instead, you can use:
sudo apt-get install nmap
printf 'GET / HTTP/1.1\r\nHost: github.com\r\n\r\n' | ncat --ssl github.com 443
See also: https://serverfault.com/questions/102032/connecting-to-https-with-netcat-nc/650189#650189
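If you have openssl but not ncat, its s_client subcommand can wrap the same request in TLS. A sketch (the -servername flag is included for SNI; -quiet suppresses the certificate chatter and keeps the connection open after stdin closes):
printf 'GET / HTTP/1.1\r\nHost: github.com\r\nConnection: close\r\n\r\n' | openssl s_client -quiet -connect github.com:443 -servername github.com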
If you try nc, it just hangs:
printf 'GET / HTTP/1.1\r\nHost: github.com\r\n\r\n' | nc github.com 443
and trying port 80:
printf 'GET / HTTP/1.1\r\nHost: github.com\r\n\r\n' | nc github.com 80
just gives a redirect response to the https version:
HTTP/1.1 301 Moved Permanently
Content-Length: 0
Location: https://github.com/
Connection: keep-alive
Tested on Ubuntu 18.04.

External links URL encoding leads to '%3F' and '%3D' on Nginx server

I got a problem with my server. I got four inbound links to different sites of my dynamic webpage which look something like this:
myurl.com/default/Site%3Fid%3D13
They should look like this:
myurl.com/default/Site?id=13
I do know that those %3F is an escape sequence for the ? sign and the %3D is an escape sequence for the equal sign. But I do get an error 400 when I use those links. What can I do about that?
The four links are for different sites, and I imagine over time there will be more links like that. So one fix for all would be perfect.
The exact same question was actually asked on the nginx-ru mailing list about a year ago:
http://mailman.nginx.org/pipermail/nginx-ru/2013-February/050200.html
The most helpful response, by an Nginx, Inc. employee/developer, Валентин Бартенев (Valentin Bartenev):
http://mailman.nginx.org/pipermail/nginx-ru/2013-February/050209.html
Если запрос приходит в таком виде, то это уже не параметры, а имя запрошенного
файла. Другое дело, что location ищется по уже раскодированному адресу, о чем в
документации написано.
Translation:
If the request comes in such a form, then these are no longer the args, but the name of the requested file. Another thing is that, as documented, the location matching is performed against a normalised URI.
His suggested solution, translated to the sample example from the question here at SO, would then be:
location /default/Site? {
    rewrite \?(.*)$ /default/Site?$1? last;
}

location = /default/Site {
    [...]
}
The following sample redirects all wrongly-looking requests (defined as having a literal ? in the requested filename, arriving encoded as %3F) into less wrongly-looking ones, regardless of URL.
(Please note that, as rightly advised elsewhere, you should not be getting these wrongly-formed links in the first place, so use this as a last resort: only when you cannot correct the wrongly formed links otherwise, and you know that such requests are attempted by valid agents.)
server {
    listen [::]:80;
    server_name localhost;

    rewrite ^/([^?]*)\?(.*)$ /$1?$2? permanent;

    location / {
        return 200 "id is $arg_id\n";
    }
}
This is an example of how it would work: when a wrongly-looking request is encountered, a correction attempt is made with a 301 Moved Permanently response whose Location header carries the corrected URL, which makes the browser automatically re-issue the request to the newly provided location:
opti# curl -6v "http://localhost/default/Site%3Fid%3D13"
* About to connect() to localhost port 80 (#0)
* Trying ::1...
* connected
* Connected to localhost (::1) port 80 (#0)
> GET /default/Site%3Fid%3D13 HTTP/1.1
> User-Agent: curl/7.26.0
> Host: localhost
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: nginx/1.4.1
< Date: Wed, 15 Jan 2014 17:09:25 GMT
< Content-Type: text/html
< Content-Length: 184
< Location: http://localhost/default/Site?id=13
< Connection: keep-alive
<
<html>
<head><title>301 Moved Permanently</title></head>
<body bgcolor="white">
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx/1.4.1</center>
</body>
</html>
* Connection #0 to host localhost left intact
* Closing connection #0
Note that no correction attempts are made on proper-looking requests:
opti# curl -6v "http://localhost/default/Site?id=13"
* About to connect() to localhost port 80 (#0)
* Trying ::1...
* connected
* Connected to localhost (::1) port 80 (#0)
> GET /default/Site?id=13 HTTP/1.1
> User-Agent: curl/7.26.0
> Host: localhost
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.4.1
< Date: Wed, 15 Jan 2014 17:09:30 GMT
< Content-Type: application/octet-stream
< Content-Length: 9
< Connection: keep-alive
<
id is 13
* Connection #0 to host localhost left intact
* Closing connection #0
The URL is perfectly valid; the escaped characters it contains are just that, escaped, which is perfectly fine.
The purpose of the encoding is that you can actually have a request name (in most cases corresponding to a filename on disk) that is literally Site?id=13, rather than Site with the rest as a query string.
I would consider it bad practice to have characters in a filename that make this necessary. In URL arguments, however, it may very well be necessary.
Nevertheless, the request URL is valid, and probably not what you want it to be, which consequently suggests that you should correct the error wherever anybody picked up the wrong URL in the first place.
I do not really understand why you get an error 400; you should rather get an error 404. But that depends on your setup.
There are also cases, especially with nginx, that mostly involve passing on whole URLs and URL parts along multiple levels (for example reverse proxies, matching regular expressions from the URL and using them as variables, etc.) where such an error may occur. But to verify this and fix it we would need to know more about your setup.

HTTP streaming / chunked responses on Heroku with clojure

I'm making a clojure web app that streams data to clients using chunked HTTP responses. This works great when I run it locally using foreman, but doesn't work properly when I deploy it to Heroku.
A minimal example exhibiting this behaviour can be found on my github here. The frontend (in resources/index.html) performs an AJAX GET request and prints the response chunks as they arrive. The server uses http-kit to send a new chunk to connected clients every second. By design, the HTTP request never completes.
When the same code is deployed to Heroku, the HTTP connection is closed by the server immediately after the first chunk is sent. It seems to be Heroku's routing mesh which is causing this disconnection to occur.
This can also be seen by performing the GET request using curl:
$ curl -v http://arcane-headland-2284.herokuapp.com/stream
* About to connect() to arcane-headland-2284.herokuapp.com port 80 (#0)
* Trying 54.243.166.168...
* Adding handle: conn: 0x6c3be0
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x6c3be0) send_pipe: 1, recv_pipe: 0
* Connected to arcane-headland-2284.herokuapp.com (54.243.166.168) port 80 (#0)
> GET /stream HTTP/1.1
> User-Agent: curl/7.31.0
> Host: arcane-headland-2284.herokuapp.com
> Accept: */*
>
< HTTP/1.1 200 OK
< Content-Type: text/html; charset=utf-8
< Date: Sat, 17 Aug 2013 16:57:24 GMT
* Server http-kit is not blacklisted
< Server: http-kit
< transfer-encoding: chunked
< Connection: keep-alive
<
* transfer closed with outstanding read data remaining
* Closing connection 0
curl: (18) transfer closed with outstanding read data remaining
The time is currently Sat Aug 17 16:57:24 UTC 2013 <-- this is the first chunk
Can anybody suggest why this is happening? HTTP streaming is supposed to be supported in Heroku's Cedar stack. The fact that the code runs correctly under foreman suggests that something in Heroku's routing mesh is causing it to break.
Live demo of the failing project: http://arcane-headland-2284.herokuapp.com/
This was due to a bug in http-kit which will be fixed shortly.
https://devcenter.heroku.com/articles/request-timeout may be relevant: "long-polling" requests like yours have to send data every 55 seconds or be terminated.
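If you want to watch the chunks arrive in real time from a terminal, curl's -N (--no-buffer) flag disables curl's own output buffering; against the demo app from the question:
curl -N http://arcane-headland-2284.herokuapp.com/stream
Each chunk should print roughly once per second as the server emits it, and with the http-kit fix in place the connection should stay open past the first chunk (subject to Heroku's 55-second idle limit mentioned above).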

HTTP over TCP using Telnet/Hercules/raw sockets

I'm connecting to real-time data on a remote server as a client. I want to send the following to a server and keep the connection open. This is a 'push' protocol.
http://server.domain.com:80/protocol/dosomething.txt?POSTDATA=thePostData
I can call this in a browser and it's fine. However, if I try to use telnet directly in a windows command prompt, the prompt just exits.
GET protocol/dosomething.txt?POSTDATA=thePostData
The same happens if I use Putty.exe with Telnet selected as the protocol. I can't see a way to do this with Hercules at all, as I don't think the server will interpret the GET.
Is there any way I can do this?
Thanks.
You have to match the HTTP protocol (RFC 2616) to the letter if you want to use telnet. Try something like:
shell$ telnet www.google.com 80
Trying 173.194.43.50...
Connected to www.google.com (173.194.43.50).
Escape character is '^]'.
GET / HTTP/1.1
Host: www.google.com:80
Connection: close
HTTP/1.1 200 OK
Date: Tue, 11 Sep 2012 15:09:51 GMT
...
You need to type the following lines including an "empty line" following the "Connection" line.
GET / HTTP/1.1
Host: www.google.com:80
Connection: close
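If the Windows telnet client proves awkward (by default it does not echo what you type), you can pipe the same kind of request through netcat instead. A sketch using the hostname from the question; omitting Connection: close leaves the HTTP/1.1 connection open so the pushed data keeps flowing (some nc variants close on stdin EOF, in which case check your nc's -q option):
printf 'GET /protocol/dosomething.txt?POSTDATA=thePostData HTTP/1.1\r\nHost: server.domain.com\r\n\r\n' | nc server.domain.com 80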
