HTTP GET request always returns 301 - http

I'm trying to obtain the HTML dump of some RFC's from IETF website, via a simple GET request. However, it responds with status code 301. I'm making use of netcat to simulate the HTTP GET request with the following command :
$ printf 'GET /html/rfc3986 HTTP/1.1\r\nHost: tools.ietf.org\r\nConnection: close\r\n\r\n' | nc tools.ietf.org 80
The following reply is obtained as a result of the above command :
HTTP/1.1 301 Moved Permanently
Date: Wed, 09 Sep 2020 15:36:36 GMT
Server: Apache/2.2.22 (Debian)
Location: https://tools.ietf.org/html/rfc3986
Vary: Accept-Encoding
Content-Length: 323
Connection: close
Content-Type: text/html; charset=iso-8859-1
X-Pad: avoid browser bug
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved here.</p>
<hr>
<address>Apache/2.2.22 (Debian) Server at tools.ietf.org Port 80</address>
</body></html>
However, if I try to send a HTTP/1.0 based HEAD request to the Location value determined in the above reply, I get status 404 in reply. I made use of HEAD method just to check the status code of the reply.
Command :
printf 'HEAD https://tools.ietf.org/html/rfc3986 HTTP/1.0\r\n\r\n' | nc tools.ietf.org 80
Reply:
HTTP/1.1 404 Not Found
Date: Wed, 09 Sep 2020 16:32:18 GMT
Server: Apache/2.2.22 (Debian)
Vary: accept-language,accept-charset,Accept-Encoding
Accept-Ranges: bytes
Connection: close
Content-Type: text/html; charset=iso-8859-1
Content-Language: en
Expires: Wed, 09 Sep 2020 16:32:18 GMT
Is there a mistake in the way I'm making use of GET method to obtain the results?

You are sending a plain text request to port 80, so the URL you are trying is effectively http://tools.ietf.org/html/rfc3986
The response is telling you to instead request https://tools.ietf.org/html/rfc3986. That's not a different path on the same server, but a full URL.
The difference is that it begins https meaning you need to make a TLS-secured connection on port 443.
That's not going to be possible with a trivial use of netcat, so you're better off using an HTTP client like curl or wget

Related

Curl having problem to retrieve data after Nginx restart

My server was working fine until I restarted the server and now my program with cURL API stops working. After troubleshooting for a long time, I figured out what the problem is.
When I use this command:
curl -i https://server.my-site.com/checkConnection
Nginx returns error:
HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Thu, 04 Jul 2019 17:14:40 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: keep-alive
Location: /checkConnection/
but if I use this command:
curl -i -L https://server.my-site.com/checkConnection
Then the server return:
HTTP/1.1 301 Moved Permanently
Server: nginx
Date: Thu, 04 Jul 2019 17:14:40 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: keep-alive
Location: /checkConnection/
HTTP/1.1 200 OK
Server: nginx
Date: Thu, 04 Jul 2019 17:14:40 GMT
Content-Type: text/html; charset=utf-8
Content-Length: 2
Connection: keep-alive
X-Frame-Options: SAMEORIGIN
ok
And if I use a browser, then everything works. I have no clue what the error comes from. and how to fix it.
Any help is appreciated!
This is what happens when the path maps to a directory. In theory, a URL like http://example.org/directory could map to a directory like /wherever/public_html/directory, and being a directory, show an index.html or similar file from there; however, that would cause surprising issues when you go to refer to other things like images in the same directory. <img src="picture.jpg"> would load http://example.org/picture.jpg rather than http://example.org/directory/picture.jpg since it's relative to the URL the browser is actually viewing. Because of this, HTTP servers generally issue a redirect to add a slash at the end, which then both loads the right page and at a URL where relative paths do what humans expect.
Adding -L to your curl commandline causes it to follow the redirect, as browsers do, and you get the result you were expecting. Without -L, curl is a more naive http client and lets you do what you will with the information.
Maybe you have a rule for www.server.my-site.com and that is why this is returning the 301 because it is redirecting from server.my-site.com to www site maybe you should share your configuration to check it
Ok. I finally fix it by adding an internal routing to uwsgi. Everything working fine now.

How does HTTP specify the end of a response for Transfer-encoding: chunked?

From this wikipedia example
a typical HTTP response may look like this
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Content-Type: text/html; charset=UTF-8
Content-Encoding: UTF-8
Content-Length: 138
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
ETag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Connection: close
<html>
<head>
<title>An Example Page</title>
</head>
<body>
Hello World, this is a very simple HTML document.
</body>
</html>
In this case because of this header:
Connection: close
the client will know that the HTTP body has finished, because the TCP connection will be closed, Say for example Connection: close wasn't present, the client would still know the HTTP body was finished because this header:
Content-Length: 138
Is saying, once you get 138 bytes, we're done here. But in the case where neither of these headers are used and instead the server sends Transfer-encoding: chunked how does a browser know when a response has completed, so that it can move onto other things?
From Wikipedia:
The chunked keyword in the Transfer-Encoding header is used to
indicate chunked transfer. Each chunk is preceded by its size. The
transmission ends when a zero-length chunk is encountered.

Telnet HTTP free server restriction

I'd like to get the whole page using telnet:
telnet
o test.bugs3.com 80
GET / HTTP/1.0
Actually I can get almost any website but this one. The same problem occurs with other free servers. I just want to know what exactly causes some restriction like that.
The request is as following:
Connected.
HTTP/1.1 200 OK
Server:
Date: Mon, 11 Nov 2013 04:11:47 GMT
Content-Type: text/html
Content-Length: 328
Last-Modified: Thu, 16 May 2013 12:17:53 GMT
Connection: close
Accept-Ranges: bytes
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>Account unavailable</title>
</head><body>
<h1>Account unavailable</h1>
<p>Maybe account have been moved, deleted, suspended or not activated yet.
<p>The requested resource could not be found but may be available again in
the future.
<hr>
</body></html>
It's because you aren't sending a Host: test.bugs3.com\r\n header. RFC 2616 #14.23: "A client MUST include a Host header field in all HTTP/1.1 request messages."

Is the version of HTTP either 1.0 or 1.1 defined by webserver? How works the HTTP protocol definition?

I have a quick question but in advance I've read the RFC 2616 Chapter 14.22 about Host and HTTP Header but I still not understand where in httpd.conf or configuration file of a webserver should be changed? Please correct me if I'm wrong.
Look at following two HTTP GET I did to an Apache. The first one is GET for HTTP 1.0 , the other one is GET for HTTP 1.1. See the output:
HTTP/1.0 200 OK
Date: Thu, 24 Oct 2013 03:46:22 GMT
Server: Apache/1.3.41 (Unix) mod_gzip/1.3.26.1a PHP/5.2.9 mod_throttle/3.1.2 mod_psoft_traffic/0.2 mod_ssl/2.8.31 OpenSSL/0.9.8b
Vary: *
Last-Modified: Fri, 10 Aug 2012 20:22:30 GMT
ETag: "17c815b-3b-50256d86"
Accept-Ranges: bytes
Content-Length: 59
Connection: close
Content-Type: text/html
<html>
<body>
<center>webli7</center>
</body>
</html>
HTTP/1.1 400 Bad Request
Date: Thu, 24 Oct 2013 04:04:40 GMT
Server: Apache/1.3.41 (Unix) mod_gzip/1.3.26.1a PHP/5.2.9 mod_throttle/3.1.2 mod_psoft_traffic/0.2 mod_ssl/2.8.31 OpenSSL/0.9.8b
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
16e
The HTTP protocol version is decided dynamicaly, not through configuration files. The client send a request specifying the highest protocol version that its support. Then, the server must respond with either the version requested by the client, or any earlier version that it prefers.
Since Apache does support HTTP/1.1, it should therefore match exactly the version provided by the client.
There exist a flag that you may set in Apache's config to force Apache to use HTTP/1.0 in certain situations, even though the browser requested HTTP/1.1. This is used to fix bugs in HTTP/1.1 handling of some very old browser. Today, you should not need to play with this flag.
As for your error, I would suggest that you make sure that your GET does provide the Host: header. This header is required in HTTP/1.1, yet optional in HTTP/1.0, and having it missing would certainly result in a 400 error.

Determine supported HTTP version by the web server

Is there a way to check whether a web server supports HTTP 1.0 or 1.1? If so, how is this done?
You could issue a:
curl --head www.test.com
that will print out the HTTP version in the first line of the output...
e.g.
HTTP/1.1 200 OK
Content-Length: 28925
Content-Type: text/html
Last-Modified: Fri, 26 Jun 2009 16:08:04 GMT
Accept-Ranges: bytes
ETag: "a41944978f6c91:0"
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Fri, 31 Jul 2009 06:13:25 GMT
In Google Chrome you can see protocol of each requests like this
open developers tools with F12
go to Network Tab
right click any where in column headers (like Name in the picture) and from the context menu select Protocol to be displayed as a new column
then you will see values like h2 (HTTP 2) or http/1.1 entry like the following picture in Protocol column
This should work on any platform that includes a telnet client:
telnet <host> 80
Then you have to type one of the following blind:
HEAD / HTTP/1.0
or
GET /
and hit enter twice.
The first line returned should output the HTTP version supported:
telnet www.stackoverflow.com 80
HEAD / HTTP/1.0
HTTP/1.1 404 Not Found
Content-Length: 315
Content-Type: text/html; charset=us-ascii
Server: Microsoft-HTTPAPI/2.0
Date: Fri, 31 Jul 2009 15:15:15 GMT
Connection: close
Read the release notes or the documentation of the webserver to check that. For example Apache Tomcat documentation tells it supports HTTP 1.1
Which webserver are you looking for?
Also are you asking if this can be checked programmatically?
In Google Chrome and Brave, you can easily use the Developer tools (F12 or Command + Option + I). Open the Network tab, find the request, click the Header tab, scroll down to "Response Headers", and click view source. It should show the HTTP version in the first line.
In the screenshot below, the server is using HTTP/1.1, as you can see: HTTP/1.1 200 OK. If that is missing, it's HTTP/2, since there is no readable source, it's in binary instead.
Alternatively, you can also use netcat so that you don't have to type it blindly as in telnet.
user#linux:~$ nc www.stackoverflow.com 80
HEAD / HTTP
HTTP/1.1 400 Bad Request
Connection: close
Content-Length: 0
user#linux:~$
$curl --head https://url:port -k
You get result something like...
HTTP/1.1 200 OK
blah....blah.
blah...blah..
$
So first line shows version it supports..

Resources