HTTP HEAD with chunked encoding

I have implemented an HTTP/1.1 server. It is an embedded server, so I only support the mandatory features of the RFC. All responses are sent chunked-encoded.
As HEAD is mandatory, it is also supported.
HEAD is a GET without a body, so the server sends a response like the following to a HEAD request:
HTTP/1.1 200 OK
Server: testServer
Connection: keep-alive
Transfer-Encoding: chunked
What I am wondering is: do I have to add the "0\r\n" that is required to signal the end of the chunks, like this:
HTTP/1.1 200 OK
Server: testServer
Connection: keep-alive
Transfer-Encoding: chunked
0
I have tried to collect the relevant parts of the RFC:
"The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response."
"All responses to the HEAD request method MUST NOT include a message-body, even though the presence of entity- header fields might lead one to believe they do."
"1.Any response message which "MUST NOT" include a message-body (such as the 1xx, 204, and 304 responses and any response to a HEAD request) is always terminated by the first empty line after the header fields, regardless of the entity-header fields present in the message."
As far as I understand it, my first solution (without the 0) is the correct one. But it seems strange to send a message with Transfer-Encoding: chunked that does not terminate with the chunk-style 0\r\n.

I guess the problem stated above has been solved, as it's quite old.
Recently I developed a test HTTP server for streaming video, and I wanted to share what I learned in case it helps others.
First, "0\r\n" is not the marker for the end of the chunks; the terminating sequence is "0\r\n\r\n" (the zero-size last chunk followed by the CRLF that ends the empty trailer section).
And the header string should look like this:
"HTTP/1.1 200 OK\r\n" \
"Server: testServer\r\n" \
"Connection: keep-alive\r\n" \
"Transfer-Encoding: chunked\r\n" \
"\r\n"
Note the final CRLF; it indicates the end of the headers.
Hope this helps.
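To tie the two answers together, here is a minimal C sketch of how such a server might answer GET versus HEAD on an already connected socket. The send_all() helper and the hard-coded body are illustrative assumptions, not part of the original server, and error handling is omitted.
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Loop until every byte is written (error handling omitted in this sketch). */
static void send_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = send(fd, buf, len, 0);
        if (n <= 0)
            return;
        buf += n;
        len -= (size_t)n;
    }
}

static void send_response(int fd, int is_head, const char *body)
{
    const char *header =
        "HTTP/1.1 200 OK\r\n"
        "Server: testServer\r\n"
        "Connection: keep-alive\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n";                  /* blank line ends the header block */
    send_all(fd, header, strlen(header));

    if (is_head)
        return;                  /* HEAD: headers only, no chunks, no terminator */

    /* GET: one chunk (size in hex, CRLF, data, CRLF)... */
    char size_line[32];
    int n = snprintf(size_line, sizeof(size_line), "%zx\r\n", strlen(body));
    send_all(fd, size_line, (size_t)n);
    send_all(fd, body, strlen(body));
    send_all(fd, "\r\n", 2);

    /* ...then the zero-size last chunk plus the final CRLF. */
    send_all(fd, "0\r\n\r\n", 5);
}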

Related

How can uTorrent read the body of an HTTP response message without a specified size or the "chunked" option?

I used the SmartSniff tool to capture the HTTP messages between the uTorrent application and a server. I found one server that sends "HTTP/1.0 200 OK" response messages with a body but no header giving the length of the body, and no "chunked" option, yet uTorrent seems to have no trouble with that; it works fine. I wonder how it does it.
I think that maybe uTorrent knows about this "server error" and, when it is expecting a body, after it reads the response line and headers (in this case none), it reads until the server closes the connection. Is this possible?
Captured communication:
GET /announce.php?(a list of parameters here) HTTP/1.1
Host: some.server.here:1234
User-Agent: uTorrent/3320(30416)
Accept-Encoding: gzip
Connection: Close
HTTP/1.0 200 OK
(empty line)
d8:completei176e10:incompletei0e8:intervali3600e5:peers0:e
I researched the link provided and found the answer. It is as follows:
In a response message without a declared message body length, the message body length is determined by the number of octets received prior to the server closing the connection.
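For completeness, here is a rough C sketch of the length-determination order from RFC 2616 section 4.4 (simplified) that a client could follow. struct headers, has_header() and header_value() are hypothetical helpers standing in for whatever parsed-header storage the client actually uses.
#include <string.h>
#include <strings.h>

struct headers;                                                       /* hypothetical parsed-header store */
int has_header(const struct headers *h, const char *name);            /* hypothetical helper */
const char *header_value(const struct headers *h, const char *name);  /* hypothetical helper */

enum body_framing { BODY_NONE, BODY_CHUNKED, BODY_CONTENT_LENGTH, BODY_UNTIL_CLOSE };

/* Decide how the response body is delimited, in the order given by RFC 2616 4.4. */
static enum body_framing decide_framing(int status, int is_head_response,
                                        const struct headers *h)
{
    if (is_head_response || status / 100 == 1 || status == 204 || status == 304)
        return BODY_NONE;                 /* these responses never carry a body */
    if (has_header(h, "Transfer-Encoding") &&
        strcasecmp(header_value(h, "Transfer-Encoding"), "identity") != 0)
        return BODY_CHUNKED;              /* read chunk by chunk */
    if (has_header(h, "Content-Length"))
        return BODY_CONTENT_LENGTH;       /* read exactly that many octets */
    return BODY_UNTIL_CLOSE;              /* read until the server closes, as above */
}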

How to extract data from a multipart HTTP request?

I am writing an HTTP web server. My server has to handle HTTP multipart requests. In my previous implementation, I extracted the data with the help of the Content-Length header present in every part of the request. The client I was using sent a Content-Length header with every part (file) in the multipart request.
But another client does not give the Content-Length of each file. In my implementation I use the Content-Length header to read that many bytes and save them to a file.
Please tell me how I can extract the data now.
The headers I am getting now are:
POST xxxxxxxxxxxxxxxxxxxxxxx&currentTab=PHOTOxxxxxxxxxxxxxxxx HTTP/1.1
Content-Length: 6829
Content-Type: multipart/form-data; boundary=SnlCg9JqTpQIl6t_mPzByTjZ8bD24kUj; charset=UTF-8
Host: host
Connection: Keep-Alive
User-Agent: Apache-HttpClient/xxxxxxxx
Accept-Encoding: gzip
--SnlCg9JqTpQIl6t_mPzByTjZ8bD24kUj
Content-Disposition: form-data; name="file"; filename="imagesCA5L2CL6_jpg(2)_jpg.jpg"
Content-Type: photo/jpg
**Some Data byte array**
--SnlCg9JqTpQIl6t_mPzByTjZ8bD24kUj--
In this request, there is no Content-Length header in the part data.
EDIT:
Earlier this client used to send a Content-Length header in every part, but for some reason it is not sending it any more. Can anybody suggest a reason for that?
thanks
Like this: Reading file input from a multipart/form-data POST
Take a look at RFC 2616 if you want to implement an HTTP/1.1 server. See section 4.4 on how to determine message length. See RFC 2388 on how to implement multipart/form-data.
The real answer is: don't reinvent the wheel, or you'll have to reimplement a few hundred pages of RFCs. There are tons of libraries and servers out there.
If you do want to write your own web server, for example as an exercise, you would have found those RFCs already, right?
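If you do decide to parse it yourself anyway: since the parts carry no Content-Length, the only reliable delimiter is the boundary string from the request's Content-Type header. Below is a rough C sketch, assuming the whole body (whose overall size comes from the request's Content-Length) is already buffered; find_bytes() is a small helper defined here, and the parsing of each part's own headers is omitted.
#include <stdio.h>
#include <string.h>

/* memmem() is not in ISO C, so use a small byte-wise search helper;
   the data may be binary, so strstr() cannot be used. */
static const char *find_bytes(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    if (needle_len == 0 || hay_len < needle_len)
        return NULL;
    for (size_t i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

static void split_parts(const char *body, size_t body_len, const char *boundary)
{
    char delim[128];
    int delim_len = snprintf(delim, sizeof(delim), "--%s", boundary);

    const char *p = find_bytes(body, body_len, delim, (size_t)delim_len);
    while (p) {
        const char *part = p + delim_len;
        size_t remaining = body_len - (size_t)(part - body);

        /* "--" right after the boundary marks the closing delimiter. */
        if (remaining >= 2 && part[0] == '-' && part[1] == '-')
            break;

        /* Skip the CRLF that ends the boundary line. */
        if (remaining >= 2 && part[0] == '\r' && part[1] == '\n') {
            part += 2;
            remaining -= 2;
        }

        /* The part (its own headers plus data) runs until the next boundary.
           A robust parser matches "\r\n--boundary" and trims that CRLF from
           the end of the part; this sketch keeps it simple. */
        const char *next = find_bytes(part, remaining, delim, (size_t)delim_len);
        size_t part_len = next ? (size_t)(next - part) : remaining;
        printf("found part of %zu bytes (headers + data)\n", part_len);

        p = next;
    }
}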

Why does my HTTP/1.0 GET request return OK but no body content?

I'm making a simple HTTP page-requester in C. It uses sockets to send HTTP/1.0 GET requests to hosts and parses the answer to download the HTML file.
However, when I send a request like this:
GET http://stackoverflow.com/questions HTTP/1.0
User-Agent: myRequester/1.0
It returns this
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Date: Mon, 19 Dec 2011 15:28:08 GMT
Content-Length: 54362
Connection: close
But no body content.
Yes, I've put CRLF on the end of every line and a blank line at the end.
I use only one socket through one connection. And I also have to stick to HTTP/1.0.
The most likely explanation is that the server is actually sending a body but you are not reading all of it. Most networking APIs do not necessarily return the whole response in one function call, because it is more useful to return whatever data is immediately available than to wait for the rest.
The Unix system call recv returns zero when the connection has ended. You should keep calling it until you get zero or an error.
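A minimal sketch of that loop, assuming a connected socket fd and simply writing whatever arrives to stdout:
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Keep calling recv() until it returns 0 (peer closed) or an error. */
static void read_whole_response(int fd)
{
    char buf[4096];
    for (;;) {
        ssize_t n = recv(fd, buf, sizeof(buf), 0);
        if (n == 0)
            break;                          /* server closed: response complete */
        if (n < 0) {
            perror("recv");                 /* real code would retry on EINTR */
            break;
        }
        fwrite(buf, 1, (size_t)n, stdout);  /* or append to a growing buffer */
    }
}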

What, at the bare minimum, is required for an HTTP request?

I'm trying to issue a GET command to my local server using netcat by doing the following:
echo -e "GET / HTTP/1.1\nHost: localhost" | nc localhost 80
Unfortunately, I get an HTTP/1.1 400 Bad Request response for this. What, at the very minimum, is required for an HTTP request?
if the request is: "GET / HTTP/1.0\r\n\r\n" then the response contains header as well as body, and the connection closes after the response.
if the request is:"GET / HTTP/1.1\r\nHost: host:port\r\nConnection: close\r\n\r\n"
then the response contains header as well as body, and the connection closes after the response.
if the request is:"GET / HTTP/1.1\r\nHost: host:port\r\n\r\n" then the response contains header as well as body, and the connection will not close even after the response.
if your request is: "GET /\r\n\r\n" then the response contains no header and only body, and the connection closes after the response.
if your request is: "HEAD / HTTP/1.0\r\n\r\n" then the response contains only header and no body, and the connection closes after the response.
if the request is: "HEAD / HTTP/1.1\r\nHost: host:port\r\nConnection: close\r\n\r\n" then the response contains only header and no body, and the connection closes after the response.
if the request is: "HEAD / HTTP/1.1\r\nHost: host:port\r\n\r\n" then the response contains only header and no body, and the connection will not close after the response.
It must use CRLF line endings, and it must end in \r\n\r\n, i.e. a blank line. This is what I use:
printf 'GET / HTTP/1.1\r\nHost: www.example.com\r\nConnection: close\r\n\r\n' |
nc www.example.com 80
Additionally, I prefer printf over echo, and I add an extra header to have the server close the connection, but those aren’t needed.
See Wiki: HTTP Client Request (Example).
Note the following:
A client request (consisting in this case of the request line and only one header) is followed by a blank line, so that the request ends with a double newline, each in the form of a carriage return followed by a line feed. The "Host" header distinguishes between various DNS names sharing a single IP address, allowing name-based virtual hosting. While optional in HTTP/1.0, it is mandatory in HTTP/1.1.
The absolute minimum, if omitting the Host header is allowed ;-), is then GET / HTTP/1.0\r\n\r\n.
Happy coding
I was able to get a response from my Apache server, with only the requested document and no response headers, with just:
GET /\r\n
If you want response headers, including the status code, you need one of the other answers here though.
The 400 Bad Request error by itself does not imply that your request violates HTTP. The server could very well be giving this response for another reason.
As far as I know the absolute minimum valid HTTP request is:
GET / HTTP/1.0\r\n\r\n
Please, please, please, do not implement your own HTTP client without first reading the relevant specs. Please read and make sure that you've fully understood at least RFC 2616. (And if you're ambitious, RFC 7230 through 7235).
While HTTP looks like an easy protocol, there are actually a number of subtle points about it. Anyone who has written an HTTP server will tell you about the workarounds he had to implement in order to deal with incorrect but widely deployed clients. Unless you're into reading specifications, please use a well-established client library; Curl is a good choice, but I'm sure there are others.
If you're going to implement your own:
do not use HTTP/0.9;
HTTP/1.0 requires the query line and the empty line;
in HTTP/1.1, the Host: header is compulsory in addition to the above.
Omitting the Host: header in HTTP/1.1 is the most common cause of 400 errors.
You should add an empty line: \r\n\r\n
http://en.wikipedia.org/wiki/Hypertext_Transfer_Protocol#Client_request
The really, REALLY bare minimum is not using netcat, but using bash itself:
user@localhost:~$ exec 3<>/dev/tcp/127.0.0.1/80
user@localhost:~$ echo -e "GET / HTTP/1.1\n" >&3
user@localhost:~$ cat <&3
HTTP/1.0 200 OK
Server: SimpleHTTP/0.6 Python/2.7.6
Date: Mon, 13 Oct 2014 17:55:55 GMT
Content-type: text/html; charset=UTF-8
Content-Length: 514
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN"><html>
<title>Directory listing for /</title>
<body>
<h2>Directory listing for /</h2>
<hr>
<ul>
</ul>
<hr>
</body>
</html>
user@localhost:~$

Chunked Transfer Encoding problem with Apache Abdera

I'm using Apache Abdera to POST atom multipart data to my server, and am having some odd problems that I can't pin down.
It looks like an issue with chunked transfer encoding, but I'm insufficiently experienced to be certain. The problem manifests as the server throwing an error indicating that the request I sent it contains only one mime part, not two as required. I attached Wireshark to the interface and captured the conversation, and it went like this:
POST /sss/col-uri/2ee98ea1-f9ad-4f01-9b1c-cfa3c4a6dc3c HTTP/1.1
Host: localhost
Expect: 100-continue
Transfer-Encoding: chunked
Content-Type: multipart/related; boundary="1306399868259";type="application/atom+xml;type=entry"
The server's response:
HTTP/1.1 100 Continue
My client continues:
198
--1306399868259
Content-Type: application/atom+xml;type=entry
Content-Disposition: attachment; name="atom"
<entry xmlns="http://www.w3.org/2005/Atom"><title xmlns="http://purl.org/dc/terms/">Richard Woz Ere</title><bibliographicCitation xmlns="http://purl.org/dc/terms/">this is my citation</bibliographicCitation><content type="application/zip" src="cid:48bd9436-e8b6-4f68-aa83-5c88eda52fd4" /></entry>
0
b0e9
--1306399868259
Content-Type: application/zip
Content-Disposition: attachment; name="payload"; filename="example.zip"
Content-ID: <48bd9436-e8b6-4f68-aa83-5c88eda52fd4>
Packaging: http://purl.org/net/sword/package/SimpleZip
And at this point the server responds with:
HTTP/1.1 400 Bad Request
Date: Thu, 26 May 2011 08:51:08 GMT
Server: Apache/2.2.17 (Unix) mod_ssl/2.2.17 OpenSSL/0.9.8l DAV/2 mod_wsgi/3.3 Python/2.6.1
Connection: close
Transfer-Encoding: chunked
Content-Type: text/xml
Indicating the error (which is well understood). My client goes on to stream a pile of base64-encoded bits onto the output stream, but in the meantime the server is no longer listening; it has already decided that the request was erroneous.
Unfortunately, I'm not in charge of the HTTP layer - this is all handled by Abdera using Apache httpclient. My code that does this looks like this:
client.execute("POST", url.toString(), new SWORDMultipartRequestEntity(deposit), options);
Here, the SWORDMultipartRequestEntity is a copy of the standard Abdera MultipartRequestEntity class, with a few extra headers thrown in (see, for example, Packaging in the above snippet); the "deposit" argument is just an object holding the atom part and the inputstream.
When attaching a debugger I get to this line of code fine; then it disappears down a rat hole, and I get this error back.
Any hints or tips? I've pretty much exhausted my angles of attack!
The only thing that stands out for me is that immediately after the atom:entry document, there is a newline with "0" on it alone, which appears to be chunked transfer encoding speak for "I'm finished". Not sure how it got there, or whether it really has any effect. Help much appreciated.
Cheers,
Richard
The lonely 0 may indeed be a problem. My uninformed guess is that it results from some call to flush(), which then writes the whole buffer as another HTTP chunk. Unfortunately, at the point where flush is called, the buffer has already been flushed and its size is therefore zero. So the HttpChunkedOutputFilter (or whatever it is called) should be taught that an empty buffer does not need to be flushed.
[update:] You should set a breakpoint in the ChunkedOutputStream class, especially the flush method. I just looked at its code and it seems to be ok, but maybe I missed something.
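To illustrate that idea at the protocol level, here is a small C sketch of a chunked writer whose flush skips empty buffers, so the 0\r\n\r\n terminator can only come from the explicit finish step. This is only an illustration of the concept, not Abdera's or httpclient's actual code; send_all() is a hypothetical write-everything helper.
#include <stdio.h>
#include <string.h>

void send_all(int fd, const char *buf, size_t len);   /* hypothetical helper */

struct chunked_writer {
    int    fd;
    char   buf[8192];
    size_t used;
};

static void chunked_flush(struct chunked_writer *w)
{
    if (w->used == 0)
        return;                       /* the guard: an empty buffer is not a chunk */
    char head[32];
    int n = snprintf(head, sizeof(head), "%zx\r\n", w->used);
    send_all(w->fd, head, (size_t)n);
    send_all(w->fd, w->buf, w->used);
    send_all(w->fd, "\r\n", 2);
    w->used = 0;
}

static void chunked_finish(struct chunked_writer *w)
{
    chunked_flush(w);
    send_all(w->fd, "0\r\n\r\n", 5);  /* last chunk + final CRLF, written exactly once */
}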
