I understand that in HTTP 1.0, the content of a response is terminated by closing the connection.
In HTTP 1.1, keep-alive connections were introduced, enabling multiple requests and responses in a single TCP connection.
When multiple messages are sent over the same connection, there needs to be a mechanism that defines where one message ends and the next message starts.
By testing, I found out that this works when I set the content-length header in a response. By knowing the content length, the client knows when the content is fully received and can parse the next response.
My question is:
Is it possible to send multiple responses in a keep-alive connection without setting the content-length header?
If yes, how?
For clarification: I am thinking about scenarios where the length of the response is not known when starting to send it to the client and I would like to know if closing the connection is the only way to implement that.
The Transfer-Encoding header is what I was looking for.
By setting the transfer-encoding to chunked, it is possible to omit the Content-Length header.
In the chunked transfer encoding, a message can be sent in multiple chunks for which the length is known. To terminate a message, a chunk with length zero is sent.
This makes it possible to have a keep-alive connection and still send messages where the length is unknown when starting to send them.
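As a rough illustration of the framing described above, here is a minimal sketch of a chunk encoder. The function name `encode_chunked` is hypothetical and not part of any library; it only shows the wire format (hex size, CRLF, data, CRLF, then a zero-length chunk).

```python
def encode_chunked(pieces):
    """Yield wire bytes for a chunked body built from an iterable of byte strings."""
    for piece in pieces:
        if not piece:
            continue  # skip empty pieces: a zero-length chunk would end the message
        # chunk = size in hex, CRLF, the data itself, CRLF
        yield b"%x\r\n" % len(piece) + piece + b"\r\n"
    # zero-length chunk followed by a final CRLF terminates the message
    yield b"0\r\n\r\n"


body = b"".join(encode_chunked([b"Wiki", b"pedia"]))
print(body)
```

Because the terminator is part of the message itself, the connection can stay open and the next response can follow immediately on the same socket.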
I'm a bit confused about what to do with the "Connection" header when using chunked-encoding. Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
Sending an empty chunk, means the end of the transfer, but does it mean the end of the connection? I thought that I would need to add Connection:close when my intention is to close the connection after an empty chunk has been sent where if I don't add the Connection header, it would remain open
Thanks
Message Length and Connection Management are two different things that really have nothing to do with each other (except in the one case where the Message Length is completely unknown and closing the connection is the only possible way to denote EOF, but that rarely happens, and is not your situation).
Chunking only applies to Message Length, where a 0-length chunk denotes EOF. If the connection is closed prematurely before EOF is reached, the Message is incomplete, and the receiver can decide whether to keep/process it or not.
The Connection header is used by the client to specify whether it wants the server to close the connection, or leave it open, after sending its response. The same header is used by the server to specify whether the connection is actually closing or staying open after the response has been sent.
Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
That has nothing to do with chunking. Regardless of the format of the Message, the client and server should always indicate their intentions for the current connection, whether it should be closed or left open. That is done via the presence of, or lack of, a Connection header, depending on the HTTP version:
If HTTP 1.0 is used, the default behavior is close unless Connection: keep-alive is explicitly sent.
If HTTP 1.1+ is used, the default behavior is keep-alive unless Connection: close is explicitly sent.
If the client requests a keep-alive, the server decides whether or not to honor it. The server may either leave the connection open, or it may close the connection.
If the client requests a closure, the server MUST honor it and close the connection.
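The version-dependent defaults above can be sketched as a small helper. The name `connection_persists` is made up for illustration, and the function only models what a sender has asked for; as noted, a server may still close a connection the client wanted kept alive.

```python
def connection_persists(http_version, connection_header=None):
    """Return True if the sender asked for the connection to stay open.

    http_version is a string such as "HTTP/1.0" or "HTTP/1.1";
    connection_header is the raw Connection field value, or None if absent.
    """
    raw = connection_header or ""
    tokens = {t.strip().lower() for t in raw.split(",") if t.strip()}
    if "close" in tokens:
        return False  # an explicit close always ends persistence
    if http_version == "HTTP/1.0":
        return "keep-alive" in tokens  # HTTP 1.0: close unless keep-alive is sent
    return True  # HTTP 1.1+: keep-alive unless close is sent


print(connection_persists("HTTP/1.1"))           # True
print(connection_persists("HTTP/1.1", "close"))  # False
print(connection_persists("HTTP/1.0"))           # False
```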
Sending an empty chunk, means the end of the transfer, but does it mean the end of the connection?
No. Only the Connection header governs that. In a keep-alive scenario, the connection is left open so that the client can reuse it to send a new request once the previous response has been fully transferred.
I thought that I would need to add Connection:close when my intention is to close the connection after an empty chunk has been sent
Correct, especially in HTTP 1.1+ where keep-alive is the default behavior.
where if I don't add the Connection header, it would remain open
The meaning of an omitted Connection header depends on the HTTP version being used, as described above.
A server is allowed to send Connection: close while using chunked transfer encoding; doing so should not cause clients to disconnect before the response is finished.
According to RFC 2616, section 14.10 (https://www.rfc-editor.org/rfc/rfc2616#section-14.10):
For example,
Connection: close
in either the request or the response header fields indicates that
the connection SHOULD NOT be considered `persistent' (section 8.1)
after the current request/response is complete.
Since the response is not complete while chunks are still being sent, the connection stays open until the terminating zero-length chunk has been delivered; only then does Connection: close take effect.
I'm working with an HTTP request tool (similar to cURL) and having an issue with the server response. Either the servers are misbehaving, or I misunderstand what the RFC for HTTP 1.1 says about chunked data.
What I'm seeing is chunked data should be in this format:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
e\r\n
 in\r\n\r\nchunks.\r\n
0\r\n
\r\n
what I'm actually seeing is the following:
4\r\n
Wiki\r\n
5\r\n
pedia\r\n
e\r\n
 in\r\n\r\nchunks.\r\n
0
In other words, the few servers I've tested with send no more data after the 0: not a single CRLF, much less CRLFCRLF.
How are we supposed to know it's the end of the chunked data without the proper format of the chunked tags? Timeouts happen while waiting for the CRLFs after the 0, and that's not sufficient.
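Yes, it violates the standard. But if we want to be compatible with as many HTTP servers and clients as possible, we have to understand how the standard ends up being violated.
Chunked encoding is often used to stream content over HTTP 1.1. The standard requires the body to end with a zero-length chunk followed by a final CRLF. A streaming server might therefore look like the following pseudo code:
# Flawed version: the terminator is skipped if anything inside the block raises.
def stream(endpoint)
  Socket.open(endpoint) do |socket|          # pseudo code: the upstream source
    sleep 10
    more_data do |data|                      # placeholder yielding each piece read
      print data.bytesize.to_s(16), "\r\n"   # chunk size in hex
      print data, "\r\n"                     # chunk data
    end
  end
  print "0\r\n\r\n"                          # terminator
end
But the correct code is the following:
# Correct version: ensure writes the terminator even after an exception.
def stream(endpoint)
  Socket.open(endpoint) do |socket|          # pseudo code: the upstream source
    sleep 10
    more_data do |data|                      # placeholder yielding each piece read
      print data.bytesize.to_s(16), "\r\n"   # chunk size in hex
      print data, "\r\n"                     # chunk data
    end
  end
ensure
  print "0\r\n\r\n"                          # terminator, always written
end
In other words, if the input socket is interrupted or any other exception is raised, the first version of the method never writes the terminator to the output socket, whereas the version with ensure always does.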
How are we supposed to know it's the end of the chunked data without
the proper format of the chunked tags? Timeouts happen looking for the
CRLFs after the 0, and that's not sufficient.
Many implementations ignore this violation because they don't need to know the size of the content. They simply try to receive as much data as possible before the socket is closed.
Use Content-Length whenever it is known; for a file download, checking the file size costs next to nothing. For chunked transfer, a receiver does not scan the message body for a CRLF pair. It first reads the number of bytes given by the chunk size, and then reads two more bytes to confirm that they are CR and LF. If they are not, the message body is ill-formed: either the size was specified incorrectly or the data was otherwise corrupted.
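The reading procedure just described can be sketched as follows. This is a minimal, strict decoder written for illustration; `read_chunked` is a hypothetical name, and `stream` is assumed to be a buffered binary file object (for example `io.BytesIO`, or the result of `socket.makefile("rb")`) whose `read(n)` returns fewer than `n` bytes only at end of stream.

```python
import io


def read_chunked(stream):
    """Decode a chunked body from a binary stream; raise ValueError if ill-formed."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("chunk-size line is not terminated by CRLF")
        # ignore optional chunk extensions after ';'
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break
        data = stream.read(size)  # read exactly the announced number of bytes
        if len(data) != size:
            raise ValueError("stream ended in the middle of a chunk")
        if stream.read(2) != b"\r\n":  # then confirm the two bytes are CR LF
            raise ValueError("chunk data is not followed by CRLF")
        body += data
    # after the zero-length chunk: optional trailer lines, then an empty line
    while True:
        line = stream.readline()
        if line == b"\r\n":
            return bytes(body)
        if not line.endswith(b"\r\n"):
            raise ValueError("message ended before the final CRLF")


wire = b"4\r\nWiki\r\n5\r\npedia\r\ne\r\n in\r\n\r\nchunks.\r\n0\r\n\r\n"
print(read_chunked(io.BytesIO(wire)))
```

A strict decoder like this rejects a stream that stops right after the `0`. A more lenient client could instead treat end of stream after the zero-length chunk as end of message, which is a compatibility trade-off rather than something the standard asks for.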
For more information, read the RFC (RFC 2616, section 3.6.1), which says:
A server using chunked transfer-coding in a response MUST NOT use the
trailer for any header fields unless at least one of the following is
true:
a)the request included a TE header field that indicates "trailers" is
acceptable in the transfer-coding of the response, as described in
section 14.39; or,
b)the server is the origin server for the response, the trailer fields
consist entirely of optional metadata, and the recipient could use the
message (in a manner acceptable to the origin server) without
receiving this metadata. In other words, the origin server is willing
to accept the possibility that the trailer fields might be silently
discarded along the path to the client.
Ways to determine the message body length (following RFC 7230, section 3.3.3):
If a Transfer-Encoding header field is present and chunked is the final encoding, the message body length is determined by reading and decoding the chunked data until the transfer coding indicates the data is complete.
If a Transfer-Encoding header field is present in a response and chunked is not the final encoding, the message body length is determined by reading the connection until it is closed by the server.
If a Transfer-Encoding header field is present in a request and chunked is not the final encoding, the message body length cannot be determined reliably; the server MUST respond with the 400 (Bad Request) status code and then close the connection.
If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling or response splitting and ought to be handled as an error. A sender MUST remove the received Content-Length field prior to forwarding such a message downstream.
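The rules above can be condensed into a small decision function. This is a simplified sketch: `body_framing` and its return labels are invented for illustration, `headers` is assumed to be a dict keyed by lower-case field names, and special cases such as HEAD responses or 1xx/204/304 status codes are deliberately left out. The last two branches (Content-Length only, or neither header) are not spelled out in the list above and reflect my reading of the same RFC section.

```python
def body_framing(headers, is_request):
    """Return a label describing how the message body length is determined."""
    transfer_encoding = headers.get("transfer-encoding")
    if transfer_encoding is not None:
        # Transfer-Encoding overrides any Content-Length that is also present
        codings = [c.strip().lower() for c in transfer_encoding.split(",") if c.strip()]
        if codings and codings[-1] == "chunked":
            return "chunked"  # decode chunks until the zero-length chunk
        # chunked is not the final coding
        return "bad-request" if is_request else "until-close"
    if "content-length" in headers:
        return "content-length"
    # neither header: a request has no body, a response runs until close
    return "no-body" if is_request else "until-close"


print(body_framing({"transfer-encoding": "gzip, chunked"}, is_request=False))
print(body_framing({"transfer-encoding": "gzip"}, is_request=True))
```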
I am trying to write a torrent downloader and needed to work out how to contact a tracker. I used the Fiddler2 program to intercept a tracker request sent from Vuze to its tracker.
In the message sent (shown below), the Connection header is declared twice with different values.
Is this a correct use of the Connection header?
What does Connection: keep-alive do?
GET /announce?info_hash=0Z%22...&azver=3&azas=12576 HTTP/1.1
User-Agent: Azureus 4.7.0.2;Windows 7;Java 1.6.0_31
Connection: close
Accept-Encoding: gzip
Host: tracker.update.vuze.com:6969
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
From RFC2616 section 4.2:
Multiple message-header fields with the same field-name MAY be present
in a message if and only if the entire field-value for that header
field is defined as a comma-separated list [i.e., #(values)]. It MUST
be possible to combine the multiple header fields into one
"field-name: field-value" pair, without changing the semantics of the
message, by appending each subsequent field-value to the first, each
separated by a comma. The order in which header fields with the same
field-name are received is therefore significant to the interpretation
of the combined field value, and thus a proxy MUST NOT change the
order of these field values when a message is forwarded.
Edit:
According to section 14.10, Connection is such a field-name, so having multiple Connection headers is technically correct.
From 14.10 the grammar production for the Connection header is Connection = "Connection" ":" 1#(connection-token), so one or more comma separated tokens are valid.
In practice, however, it may be that the second Connection header will be ignored, and thus the web server will expect to close the underlying TCP connection once the response has been sent.
For HTTP 1.1 the default mode is for the server to keep the underlying TCP connection open for subsequent requests, although many servers will limit the total number of requests made before closing the connection anyway.
HTTP 1.1 allows multiple Connection headers. The semantics of multiple such headers are defined to be the same as a single header with all values joined together with commas, e.g.
Connection: close
Connection: keep-alive
is the same as:
Connection: close,keep-alive
So technically, these headers are fine. However, I predict, without having run any experiments, that plenty of servers out there (especially less well-tested ones such as torrent trackers) will ignore one of these headers.
Now, the deeper problem is that "keep-alive" is an HTTP 1.0 extension and contradicts "close", so I guess this combination is simply a bug in the torrent client. I guess most trackers will not allow persistent connections anyway, so I think the best way to proceed is to send only a single "Connection: close" header.
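To make the comma-joining rule concrete, here is a small sketch of how a receiver could merge repeated header fields before interpreting them. The helper names `combine_fields` and `connection_tokens` are hypothetical, and `header_lines` is assumed to be a list of `(name, value)` pairs in the order received.

```python
def combine_fields(header_lines, name):
    """Join the values of all fields called `name`, in order, with commas."""
    values = [value.strip() for field, value in header_lines
              if field.lower() == name.lower()]
    return ", ".join(values)


def connection_tokens(header_lines):
    """Return the lower-cased tokens of the combined Connection field."""
    combined = combine_fields(header_lines, "Connection")
    return [t.strip().lower() for t in combined.split(",") if t.strip()]


request_headers = [
    ("User-Agent", "Azureus 4.7.0.2;Windows 7;Java 1.6.0_31"),
    ("Connection", "close"),
    ("Accept-Encoding", "gzip"),
    ("Connection", "keep-alive"),
]
print(combine_fields(request_headers, "Connection"))  # close, keep-alive
print(connection_tokens(request_headers))             # ['close', 'keep-alive']
```

A receiver that parses the field this way sees both tokens; one that stores headers in a plain map keyed by name would keep only one of them, which is the inconsistency predicted above.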
It's weird to have two Connection header fields. I think you can only expect non-deterministic behavior, as the handling of this case depends solely on the implementation of the web server.
It might end up with any of the 2 in (let's say) the hashmap storing the fields.
Basically, keep-alive means that the browser is allowed to keep the connection to the server open and to go on retrieving images and scripts over it. Without keep-alive, the connection to the web server is closed once the request has been answered.
Connection: Close
means that the connection is closed after the request has completed.
Connection: keep-alive
means that the connection is kept open for future requests.
You should only have one Connection parameter.
Therefore the HTTP header is not correct.
How is a socket used by an HTTP client properly closed after transmitting the request? Or does it have to remain open (bidirectionally) until the complete response has been received? If so, how is the end of the request body determined by the server?
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4, closing the socket is not an option for a request. That doesn't sound logical to me - why should a half-closed TCP connection be a problem for the server, if the client doesn't try to transmit anything after closing its half of the socket? The client can still receive data after all.
It seems to me that shutting down the write part of a socket would be a very practical way of letting the server know that the request has been finished. http://docs.python.org/howto/sockets.html#disconnecting even specifically mentions that use case.
If that's really the wrong way to do it, what's the alternative? Do I really always have to send a "Content-length" or use chunked transport to enable the server to properly find the end of a request? How does that work for requests with unknown body length?
Transfer-Encoding: chunked is specifically designed to allow sending data with an unknown body length, for both requests and responses. The end of the data is determined by receiving a chunk whose payload size is 0. If you do not send a chunked request, then you must send a Content-Length instead.
Are you talking about this?:
Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.
I think the text is talking about a full close; you can do a half (write) close instead. I'm not sure that's an HTTP-compliant way of doing it, but I would think most servers will accept it.
Regarding your second question, simply use chunked encoding:
All HTTP/1.1 applications that receive entities MUST accept the "chunked" transfer-coding (section 3.6), thus allowing this mechanism to be used for messages when the message length cannot be determined in advance.
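For reference, the half (write) close mentioned above looks like this at the socket level. The sketch uses a local `socket.socketpair()` instead of a real HTTP server, so it only demonstrates the TCP-level mechanics; the function name `half_close_exchange` is made up, and whether a given HTTP server or proxy tolerates a half-closed client is an open assumption, not something the sketch proves.

```python
import socket


def recv_until_eof(sock):
    """Read from sock until the peer shuts down its sending side."""
    received = bytearray()
    while True:
        data = sock.recv(4096)
        if not data:  # empty read means the peer sent EOF
            return bytes(received)
        received += data


def half_close_exchange(request, response):
    """Send request, half-close, and still receive a response on the same socket."""
    client, server = socket.socketpair()
    with client, server:
        client.sendall(request)
        client.shutdown(socket.SHUT_WR)     # "no more request bytes", reading still works
        seen_by_server = recv_until_eof(server)
        server.sendall(response)            # the server can still answer
        server.shutdown(socket.SHUT_WR)
        seen_by_client = recv_until_eof(client)
    return seen_by_server, seen_by_client


print(half_close_exchange(b"ping", b"pong"))
```

Chunked encoding or Content-Length remains the approach the specification describes for requests; the half-close is shown only to clarify what the answer is referring to.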
I have an HTTP server that returns large bodies in response to POST requests (it is a SOAP server). These bodies are "streamed" via chunking. If I encounter an error midway through streaming the response how can I report that error to the client and still keep the connection open? The implementation uses a proprietary HTTP/SOAP stack so I am interested in answers at the HTTP protocol level.
Once the server has sent the status line (the very first line of the response) to the client, you can't change the status code of the response anymore. Many servers delay sending the response by buffering it internally until the buffer is full. While the buffer is filling up, you can still change your mind about the response.
If your client has access to the response headers, you could use the fact that chunked encoding allows the server to add a trailer with headers after the chunked-encoded body. So, your server, having encountered the error, could gracefully stop sending the body, and then send a trailer that sets some header to some value. Your client would then interpret the presence of this header as a sign that an error happened.
Also keep in mind that chunked responses can contain "footers" which are just like HTTP headers. After failing, you can send a footer such as:
X-RealStatus: 500 Some bad stuff happened
Or if you succeed:
X-RealStatus: 200 OK
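On the wire, such a footer (trailer) sits between the zero-length chunk and the final CRLF. The sketch below builds that byte sequence; `chunked_with_trailer` is a hypothetical helper, and `X-RealStatus` is the custom header name suggested above, not a standard field. A real response would normally also announce the trailer in a `Trailer: X-RealStatus` header, and the client has to be willing to read trailers at all.

```python
def chunked_with_trailer(pieces, trailers):
    """Build a chunked body from byte strings, ending with trailer fields.

    trailers is a list of (name, value) string pairs.
    """
    out = bytearray()
    for piece in pieces:
        if piece:  # skip empty pieces, which would look like the terminator
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n"  # zero-length chunk: end of the body data
    for name, value in trailers:
        out += f"{name}: {value}\r\n".encode("ascii")  # trailer fields
    out += b"\r\n"  # empty line: end of the message
    return bytes(out)


wire = chunked_with_trailer(
    [b"partial"],
    [("X-RealStatus", "500 Some bad stuff happened")],
)
print(wire)
```

Since the message still ends with a well-formed terminator, the connection can stay open for the next request even though the response reported a failure.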
You can change the status code as long as response.isCommitted() returns false.
(This is for HttpServletResponse in Java; I'm sure an equivalent exists in other languages.)