chunked encoding and connection:close together - http

I'm a bit confused about what to do with the "Connection" header when using chunked-encoding. Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
Sending an empty chunk, means the end of the transfer, but does it mean the end of the connection? I thought that I would need to add Connection:close when my intention is to close the connection after an empty chunk has been sent where if I don't add the Connection header, it would remain open
Thanks

Message Length and Connection Management are two different things that really have nothing to do with each other (except in the one case where the Message Length is completely unknown and closing the connection is the only possible way to denote EOF, but that rarely happens, and is not your situation).
Chunking only applies to Message Length, where a 0-length chunk denotes EOF. If the connection is closed prematurely before EOF is reached, the Message is incomplete, and the receiver can decide whether to keep/process it or not.
The Connection header is used by the client to specify whether it wants the server to close the connection, or leave it open, after sending its response. The same header is used by the server to specify whether the connection is actually closing or staying open after the response has been sent.
Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
That has nothing to do with chunking. Regardless of the format of the Message, the client and server should always indicate their intentions for the current connection, whether it should be closed or left open. That is done via the presence of, or lack of, a Connection header, depending on the HTTP version:
If HTTP 1.0 is used, the default behavior is close unless Connection: keep-alive is explicitly sent.
If HTTP 1.1+ is used, the default behavior is keep-alive unless Connection: close is explicitly sent.
If the client requests a keep-alive, the server decides whether or not to honor it. The server may either leave the connection open, or it may close the connection.
If the client requests a closure, the server MUST honor it and close the connection.
Sending an empty chunk, means the end of the transfer, but does it mean the end of the connection?
No. Only the Connection header does that. Especially in a keep-alive scenario, thus the connection is left open so the client can reuse the existing connection to send a new request after the transfer of the previous response is finished.
I thought that I would need to add Connection:close when my intention is to close the connection after an empty chunk has been sent
Correct, especially in HTTP 1.1+ where keep-alive is the default behavior.
where if I don't add the Connection header, it would remain open
The meaning of an omitted Connection header depends on the HTTP version being used, as described above.

A server is allowed to send Connection: close while using chunked transfer encoding, this should not cause clients to disconnect before the response is finished.
According to RFC2616 https://www.rfc-editor.org/rfc/rfc2616#section-14.10
For example,
Connection: close
in either the request or the response header fields indicates that
the connection SHOULD NOT be considered `persistent' (section 8.1)
after the current request/response is complete.
Since the response is not complete while chunks are being sent, the connection is not forbidden from persisting.

Related

HTTP-Long Polling keep-alive and handshakes

I'm doing a test where I examine how much HTTP-long polling compared to Websockets is affecting the battery performance on my iPhone. Basically what I have is a Node.js with express server that sends out a random string every 0.5 or 10th second to the iPhone. I've inspected the messages in Chrome and I can see the keep-alive header is present. I know keep-alive is a default feature since HTTP/1.1. From what I've understood the TCP-connection will be held open and can be used for pipelining, and this is certainly the case when I'm sending out pings from the server every 0.5 seconds. But when I send out every 10 seconds, will the connection be closed during that time?
How do I know how long the connection is open? This seems to be a crucial part to have in mind when doing the tests.
Will the HTTP-handshake still be made when the TCP-connection is open?
AFAIK, in HTTP 1, the server cannot send a response back to the client if that client didn't send a request first. That might sound irrelevant to your question but bear with me.
The Connection: keep-alive header tells the client that it can reuse the connection if he want to, not that it must. The client can decide to close it any time, it all depends on the client library implementation and you don't have any guarantee.
The only way to force the client to not close the connection is to not finish the response. The only way to do that is to send a response with a Transfer-Encoding: chunked, and never send the final chunk (this has some serious caveats, like a buffer overrun on the client...).
So to answer your 2 points:
You can't, this low-level detail is totally hidden (for good reasons) from the client.
There is no HTTP handshake, there is a TCP handshake which is made when the client socket connects to the server socket. There is the TLS handshake which is made after the TCP connection and before any request is made. Once the connection is open, http requests are sent by the client and the server responds with resources.

Are HTTP keep-alive connections possible without content-length headers?

I understand that in HTTP 1.0, the content of a response is terminated by closing the connection.
In HTTP 1.1, keep-alive connections were introduced, enabling multiple requests and responses in a single TCP connection.
When multiple messages are sent over the same connection, there needs to be a mechanism that defines where one message ends and the next message starts.
By testing, I found out that this works when I set the content-length header in a response. By knowing the content length, the client knows when the content is fully received and can parse the next response.
My question is:
Is it possible to send multiple responses in a keep-alive connection without setting the content-length header?
If yes, how?
For clarification: I am thinking about scenarios where the length of the response is not known when starting to send it to the client and I would like to know if closing the connection is the only way to implement that.
The Transfer-Encoding header is what I was looking for.
By setting the transfer-encoding to chunked, it is possible to omit the Content-Length header.
In the chunked transfer encoding, a message can be sent in multiple chunks for which the length is known. To terminate a message, a chunk with length zero is sent.
This makes it possible to have a keep-alive connection and still send messages where the length is unknown when starting to send them.

Is this HTTP Header wrong?

I am trying to write a torrent downloader and needed to work out how to contact a tracker. I used the Fiddler2 program to intercept a tracker request sent from Vuze to it's tracker.
In the message sent (shown below), the Connection header is declared twice with different values.
Is this a correct use of the Connection header?
What does Connection: keep-alive do?
GET /announce?info_hash=0Z%22...&azver=3&azas=12576 HTTP/1.1
User-Agent: Azureus 4.7.0.2;Windows 7;Java 1.6.0_31
Connection: close
Accept-Encoding: gzip
Host: tracker.update.vuze.com:6969
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
From RFC2616 section 4.2:
Multiple message-header fields with the same field-name MAY be present
in a message if and only if the entire field-value for that header
field is defined as a comma-separated list [i.e., #(values)]. It MUST
be possible to combine the multiple header fields into one
"field-name: field-value" pair, without changing the semantics of the
message, by appending each subsequent field-value to the first, each
separated by a comma. The order in which header fields with the same
field-name are received is therefore significant to the interpretation
of the combined field value, and thus a proxy MUST NOT change the
order of these field values when a message is forwarded.
Edit:
According to section 14.10, Connection is such a field-name, so having multiple Connection headers is technically correct.
From 14.10 the grammar production for the Connection header is Connection = "Connection" ":" 1#(connection-token), so one or more comma separated tokens are valid.
In practise however, it may be that the second Connection header will be ignored, and thus the web server will expect to close the underlying TCP connection once the response has been sent.
For HTTP 1.1 the default mode is for the server to keep the underlying TCP connection open for subsequent requests, although many servers will limit the total number of requests made before closing the connection anyway.
HTTP 1.1 allows multiple Connection headers. The semantics of multiple such headers are defined to be the same as a single header with all values joined together with commas, e.g.
Connection: close
Connection: keep-alive
is the same as:
Connection: close,keep-alive
So technically, these headers are fine. However, I will herewith predict without having made some experiments that there will be a lot of servers out there (especially not so well tested ones such as torrent trackers) who will ignore either one of these headers.
Now, the deeper problem is that "keep-alive" is some http 1.0 extension, and kind of contradicts the "close", so I guess this combination is simply a bug in the torrent client. I guess most trackers will not allow persistent connections anyway, though, so I think the bets way to proceed i to only have a single "Connection: close" header.
Its weird to have 2 Connection Header Fields. I think you can only expect a non-deterministic behavior as the handling of this case depends solely on the implementation of the web server.
It might end up with any of the 2 in (let's say) the hashmap storing the fields.
Basically, keep alive means that the browser is allowed to maintain the connection to the server and to keep retrieving images, scripts over it. Usually this is not the case. Normally, a Connection to a web server is closed after the request was given a response.
Connection: Close
means that after the the request has completed close the connection.
Connection: keep-alive
means keep the connection open for future request.
You should only have one Connection parameter.
So therefore the HTTP header is not correct.

HTTP: shutdown socket for writing after sending the request?

How is a socket used by an HTTP client properly closed after transmitting the request? Or does it have to remain open (bidirectionally) until the complete response has been received? If so, how is the end of the request body determined by the server?
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4, closing the socket is not an option for a request. That doesn't sound logical to me - why should a half-closed TCP connection be a problem for the server, if the client doesn't try to transmit anything after closing its half of the socket? The client can still receive data after all.
It seems to me that shutting down the write part of a socket would be a very practical way of letting the server know that the request has been finished. http://docs.python.org/howto/sockets.html#disconnecting even specifically mentions that use case.
If that's really the wrong way to do it, what's the alternative? Do I really always have to send a "Content-length" or use chunked transport to enable the server to properly find the end of a request? How does that work for requests with unknown body length?
Transfer-Encoding: chunked is specifically designed to allow sending data with an unknown body length, for both requests and responses. The end of the data is determined by receiving a chunk whose payload size is 0. If you do not send a chunked request, then you must send a Content-Length instead.
Are you talking about this?:
Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.
I think the text is talking about full close, you can do a half(write)-close. I'm not sure that's a HTTP compilant way of doing it, but I would think most servers will accept it.
Regarding your second question, simply use chunked encoding:
All HTTP/1.1 applications that receive entities MUST accept the "chunked" transfer-coding (section 3.6), thus allowing this mechanism to be used for messages when the message length cannot be determined in advance.

How do I report an error midway through a chunked http repsonse without closing the connection?

I have an HTTP server that returns large bodies in response to POST requests (it is a SOAP server). These bodies are "streamed" via chunking. If I encounter an error midway through streaming the response how can I report that error to the client and still keep the connection open? The implementation uses a proprietary HTTP/SOAP stack so I am interested in answers at the HTTP protocol level.
Once the server has sent the status line (the very first line of the response) to the client, you can't change the status code of the response anymore. Many servers delay sending the response by buffering it internally until the buffer is full. While the buffer is filling up, you can still change your mind about the response.
If your client has access to the response headers, you could use the fact that chunked encoding allows the server to add a trailer with headers after the chunked-encoded body. So, your server, having encountered the error, could gracefully stop sending the body, and then send a trailer that sets some header to some value. Your client would then interpret the presence of this header as a sign that an error happened.
Also keep in mind that chunked responses can contain "footers" which are just like HTTP headers. After failing, you can send a footer such as:
X-RealStatus: 500 Some bad stuff happened
Or if you succeed:
X-RealStatus: 200 OK
you can change the status code as long as response.iscommitted() returns false.
(fot HttpServletResponse in java, im sure there exists an equivalent in other languages)

Resources