I'd like to get the whole page using telnet:
telnet
o test.bugs3.com 80
GET / HTTP/1.0
Actually I can get almost any website but this one. The same problem occurs with other free servers. I just want to know what exactly causes some restriction like that.
The request is as following:
Connected.
HTTP/1.1 200 OK
Server:
Date: Mon, 11 Nov 2013 04:11:47 GMT
Content-Type: text/html
Content-Length: 328
Last-Modified: Thu, 16 May 2013 12:17:53 GMT
Connection: close
Accept-Ranges: bytes
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>Account unavailable</title>
</head><body>
<h1>Account unavailable</h1>
<p>Maybe account have been moved, deleted, suspended or not activated yet.
<p>The requested resource could not be found but may be available again in
the future.
<hr>
</body></html>
It's because you aren't sending a Host: test.bugs3.com\r\n header. RFC 2616 #14.23: "A client MUST include a Host header field in all HTTP/1.1 request messages."
Related
I'm trying to obtain the HTML dump of some RFC's from IETF website, via a simple GET request. However, it responds with status code 301. I'm making use of netcat to simulate the HTTP GET request with the following command :
$ printf 'GET /html/rfc3986 HTTP/1.1\r\nHost: tools.ietf.org\r\nConnection: close\r\n\r\n' | nc tools.ietf.org 80
The following reply is obtained as a result of the above command :
HTTP/1.1 301 Moved Permanently
Date: Wed, 09 Sep 2020 15:36:36 GMT
Server: Apache/2.2.22 (Debian)
Location: https://tools.ietf.org/html/rfc3986
Vary: Accept-Encoding
Content-Length: 323
Connection: close
Content-Type: text/html; charset=iso-8859-1
X-Pad: avoid browser bug
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved here.</p>
<hr>
<address>Apache/2.2.22 (Debian) Server at tools.ietf.org Port 80</address>
</body></html>
However, if I try to send a HTTP/1.0 based HEAD request to the Location value determined in the above reply, I get status 404 in reply. I made use of HEAD method just to check the status code of the reply.
Command :
printf 'HEAD https://tools.ietf.org/html/rfc3986 HTTP/1.0\r\n\r\n' | nc tools.ietf.org 80
Reply:
HTTP/1.1 404 Not Found
Date: Wed, 09 Sep 2020 16:32:18 GMT
Server: Apache/2.2.22 (Debian)
Vary: accept-language,accept-charset,Accept-Encoding
Accept-Ranges: bytes
Connection: close
Content-Type: text/html; charset=iso-8859-1
Content-Language: en
Expires: Wed, 09 Sep 2020 16:32:18 GMT
Is there a mistake in the way I'm making use of GET method to obtain the results?
You are sending a plain text request to port 80, so the URL you are trying is effectively http://tools.ietf.org/html/rfc3986
The response is telling you to instead request https://tools.ietf.org/html/rfc3986. That's not a different path on the same server, but a full URL.
The difference is that it begins https meaning you need to make a TLS-secured connection on port 443.
That's not going to be possible with a trivial use of netcat, so you're better off using an HTTP client like curl or wget
From this wikipedia example
a typical HTTP response may look like this
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Content-Type: text/html; charset=UTF-8
Content-Encoding: UTF-8
Content-Length: 138
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)
ETag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Connection: close
<html>
<head>
<title>An Example Page</title>
</head>
<body>
Hello World, this is a very simple HTML document.
</body>
</html>
In this case because of this header:
Connection: close
the client will know that the HTTP body has finished, because the TCP connection will be closed, Say for example Connection: close wasn't present, the client would still know the HTTP body was finished because this header:
Content-Length: 138
Is saying, once you get 138 bytes, we're done here. But in the case where neither of these headers are used and instead the server sends Transfer-encoding: chunked how does a browser know when a response has completed, so that it can move onto other things?
From Wikipedia:
The chunked keyword in the Transfer-Encoding header is used to
indicate chunked transfer. Each chunk is preceded by its size. The
transmission ends when a zero-length chunk is encountered.
I have a quick question but in advance I've read the RFC 2616 Chapter 14.22 about Host and HTTP Header but I still not understand where in httpd.conf or configuration file of a webserver should be changed? Please correct me if I'm wrong.
Look at following two HTTP GET I did to an Apache. The first one is GET for HTTP 1.0 , the other one is GET for HTTP 1.1. See the output:
HTTP/1.0 200 OK
Date: Thu, 24 Oct 2013 03:46:22 GMT
Server: Apache/1.3.41 (Unix) mod_gzip/1.3.26.1a PHP/5.2.9 mod_throttle/3.1.2 mod_psoft_traffic/0.2 mod_ssl/2.8.31 OpenSSL/0.9.8b
Vary: *
Last-Modified: Fri, 10 Aug 2012 20:22:30 GMT
ETag: "17c815b-3b-50256d86"
Accept-Ranges: bytes
Content-Length: 59
Connection: close
Content-Type: text/html
<html>
<body>
<center>webli7</center>
</body>
</html>
HTTP/1.1 400 Bad Request
Date: Thu, 24 Oct 2013 04:04:40 GMT
Server: Apache/1.3.41 (Unix) mod_gzip/1.3.26.1a PHP/5.2.9 mod_throttle/3.1.2 mod_psoft_traffic/0.2 mod_ssl/2.8.31 OpenSSL/0.9.8b
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
16e
The HTTP protocol version is decided dynamicaly, not through configuration files. The client send a request specifying the highest protocol version that its support. Then, the server must respond with either the version requested by the client, or any earlier version that it prefers.
Since Apache does support HTTP/1.1, it should therefore match exactly the version provided by the client.
There exist a flag that you may set in Apache's config to force Apache to use HTTP/1.0 in certain situations, even though the browser requested HTTP/1.1. This is used to fix bugs in HTTP/1.1 handling of some very old browser. Today, you should not need to play with this flag.
As for your error, I would suggest that you make sure that your GET does provide the Host: header. This header is required in HTTP/1.1, yet optional in HTTP/1.0, and having it missing would certainly result in a 400 error.
I have a script on GAE that requests an XML feed from a partner that's typically 40MB but only 5MB gzipped. GAE is automatically unzipping this content and throwing an error that the response is too big:
HTTP response was too large: 46677241. The limit is: 33554432.
The script is setup to uncompress the response itself. How do I prevent GAE from getting in the way and breaking?
Here's the response header from my partner:
HTTP/1.0 200 OK
Expires: Wed, 27 Jun 2012 05:42:07 GMT
Cache-Control: max-age=10368000
Content-Type: application/x-gzip
Accept-Ranges: bytes
Last-Modified: Wed, 22 Feb 2012 11:06:09 GMT
Content-Length: 5263323
Date: Tue, 28 Feb 2012 05:42:07 GMT
Server: lighttpd
X-Cache: MISS from static01
X-Cache-Lookup: MISS from static01:80
Via: 1.0 static01:80 (squid)
Most likely your partner's server responds with plain XML, because it thinks that http-client sending requests (i.e. GAE URL Fetch service) does not support gzipping. Hence "response was too large" error.
To announce that you actually want to receive gzipped content you need to set Accept-Encoding: gzip header when using URL fetch service.
Is there a way to check whether a web server supports HTTP 1.0 or 1.1? If so, how is this done?
You could issue a:
curl --head www.test.com
that will print out the HTTP version in the first line of the output...
e.g.
HTTP/1.1 200 OK
Content-Length: 28925
Content-Type: text/html
Last-Modified: Fri, 26 Jun 2009 16:08:04 GMT
Accept-Ranges: bytes
ETag: "a41944978f6c91:0"
Server: Microsoft-IIS/7.0
X-Powered-By: ASP.NET
Date: Fri, 31 Jul 2009 06:13:25 GMT
In Google Chrome you can see protocol of each requests like this
open developers tools with F12
go to Network Tab
right click any where in column headers (like Name in the picture) and from the context menu select Protocol to be displayed as a new column
then you will see values like h2 (HTTP 2) or http/1.1 entry like the following picture in Protocol column
This should work on any platform that includes a telnet client:
telnet <host> 80
Then you have to type one of the following blind:
HEAD / HTTP/1.0
or
GET /
and hit enter twice.
The first line returned should output the HTTP version supported:
telnet www.stackoverflow.com 80
HEAD / HTTP/1.0
HTTP/1.1 404 Not Found
Content-Length: 315
Content-Type: text/html; charset=us-ascii
Server: Microsoft-HTTPAPI/2.0
Date: Fri, 31 Jul 2009 15:15:15 GMT
Connection: close
Read the release notes or the documentation of the webserver to check that. For example Apache Tomcat documentation tells it supports HTTP 1.1
Which webserver are you looking for?
Also are you asking if this can be checked programmatically?
In Google Chrome and Brave, you can easily use the Developer tools (F12 or Command + Option + I). Open the Network tab, find the request, click the Header tab, scroll down to "Response Headers", and click view source. It should show the HTTP version in the first line.
In the screenshot below, the server is using HTTP/1.1, as you can see: HTTP/1.1 200 OK. If that is missing, it's HTTP/2, since there is no readable source, it's in binary instead.
Alternatively, you can also use netcat so that you don't have to type it blindly as in telnet.
user#linux:~$ nc www.stackoverflow.com 80
HEAD / HTTP
HTTP/1.1 400 Bad Request
Connection: close
Content-Length: 0
user#linux:~$
$curl --head https://url:port -k
You get result something like...
HTTP/1.1 200 OK
blah....blah.
blah...blah..
$
So first line shows version it supports..