Is this HTTP Header wrong? - http

I am trying to write a torrent downloader and needed to work out how to contact a tracker. I used the Fiddler2 program to intercept a tracker request sent from Vuze to its tracker.
In the message sent (shown below), the Connection header is declared twice with different values.
Is this a correct use of the Connection header?
What does Connection: keep-alive do?
GET /announce?info_hash=0Z%22...&azver=3&azas=12576 HTTP/1.1
User-Agent: Azureus 4.7.0.2;Windows 7;Java 1.6.0_31
Connection: close
Accept-Encoding: gzip
Host: tracker.update.vuze.com:6969
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive

From RFC2616 section 4.2:
Multiple message-header fields with the same field-name MAY be present
in a message if and only if the entire field-value for that header
field is defined as a comma-separated list [i.e., #(values)]. It MUST
be possible to combine the multiple header fields into one
"field-name: field-value" pair, without changing the semantics of the
message, by appending each subsequent field-value to the first, each
separated by a comma. The order in which header fields with the same
field-name are received is therefore significant to the interpretation
of the combined field value, and thus a proxy MUST NOT change the
order of these field values when a message is forwarded.
Edit:
According to section 14.10, Connection is such a field-name, so having multiple Connection headers is technically correct.
From 14.10 the grammar production for the Connection header is Connection = "Connection" ":" 1#(connection-token), so one or more comma separated tokens are valid.
In practice, however, it may be that the second Connection header will be ignored, and thus the web server will expect to close the underlying TCP connection once the response has been sent.
For HTTP 1.1 the default mode is for the server to keep the underlying TCP connection open for subsequent requests, although many servers will limit the total number of requests made before closing the connection anyway.
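As a rough illustration of the combining rule quoted from section 4.2, here is a minimal Python sketch. The function names `combine_header_fields` and `connection_tokens` are made up for this example, and a real parser has to deal with more edge cases (quoted strings, folded lines) than this does.

```python
def combine_header_fields(headers, name):
    """Join every value of a repeated header with commas, in received order."""
    values = [v.strip() for (k, v) in headers if k.lower() == name.lower()]
    return ", ".join(values)


def connection_tokens(headers):
    """Return the lower-cased connection-tokens from all Connection headers."""
    combined = combine_header_fields(headers, "Connection")
    return [t.strip().lower() for t in combined.split(",") if t.strip()]


# Headers from the captured request, as (name, value) pairs in wire order.
request_headers = [
    ("User-Agent", "Azureus 4.7.0.2;Windows 7;Java 1.6.0_31"),
    ("Connection", "close"),
    ("Accept-Encoding", "gzip"),
    ("Host", "tracker.update.vuze.com:6969"),
    ("Connection", "keep-alive"),
]

print(combine_header_fields(request_headers, "Connection"))  # close, keep-alive
print(connection_tokens(request_headers))  # ['close', 'keep-alive']
```

A server that stores headers in a plain map instead of combining them like this would only see one of the two values.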

HTTP 1.1 allows multiple Connection headers. The semantics of multiple such headers are defined to be the same as a single header with all values joined together with commas, e.g.
Connection: close
Connection: keep-alive
is the same as:
Connection: close,keep-alive
So technically, these headers are fine. However, although I have not run any experiments, I predict that many servers out there (especially less thoroughly tested ones such as torrent trackers) will ignore one of these two headers.
Now, the deeper problem is that "keep-alive" is an HTTP 1.0 extension and contradicts "close", so I guess this combination is simply a bug in the torrent client. Most trackers probably will not allow persistent connections anyway, though, so I think the best way to proceed is to send only a single "Connection: close" header.

It's weird to have 2 Connection header fields. I think you can only expect non-deterministic behavior, as the handling of this case depends solely on the implementation of the web server.
It might end up with any of the 2 in (let's say) the hashmap storing the fields.
Basically, keep alive means that the browser is allowed to maintain the connection to the server and to keep retrieving images, scripts over it. Usually this is not the case. Normally, a Connection to a web server is closed after the request was given a response.

Connection: Close
means that the connection should be closed after the request has completed.
Connection: keep-alive
means keep the connection open for future requests.
You should only have one Connection parameter.
So therefore the HTTP header is not correct.

Related

chunked encoding and connection:close together

I'm a bit confused about what to do with the "Connection" header when using chunked-encoding. Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
Sending an empty chunk, means the end of the transfer, but does it mean the end of the connection? I thought that I would need to add Connection:close when my intention is to close the connection after an empty chunk has been sent where if I don't add the Connection header, it would remain open
Thanks
Message Length and Connection Management are two different things that really have nothing to do with each other (except in the one case where the Message Length is completely unknown and closing the connection is the only possible way to denote EOF, but that rarely happens, and is not your situation).
Chunking only applies to Message Length, where a 0-length chunk denotes EOF. If the connection is closed prematurely before EOF is reached, the Message is incomplete, and the receiver can decide whether to keep/process it or not.
The Connection header is used by the client to specify whether it wants the server to close the connection, or leave it open, after sending its response. The same header is used by the server to specify whether the connection is actually closing or staying open after the response has been sent.
Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
That has nothing to do with chunking. Regardless of the format of the Message, the client and server should always indicate their intentions for the current connection, whether it should be closed or left open. That is done via the presence of, or lack of, a Connection header, depending on the HTTP version:
If HTTP 1.0 is used, the default behavior is close unless Connection: keep-alive is explicitly sent.
If HTTP 1.1+ is used, the default behavior is keep-alive unless Connection: close is explicitly sent.
If the client requests a keep-alive, the server decides whether or not to honor it. The server may either leave the connection open, or it may close the connection.
If the client requests a closure, the server MUST honor it and close the connection.
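The defaults above can be sketched in a few lines of Python. The function name `is_persistent` is hypothetical, and letting "close" win when both tokens appear together is an assumption of this sketch, not something stated in this answer.

```python
def is_persistent(http_version, connection_header=None):
    """Decide whether the connection stays open after the current exchange.

    http_version is a (major, minor) tuple. connection_header is the raw,
    possibly comma-separated Connection value, or None if the header is absent.
    """
    tokens = set()
    if connection_header:
        tokens = {t.strip().lower() for t in connection_header.split(",") if t.strip()}

    if "close" in tokens:
        return False                 # an explicit close is honored (assumed to win)
    if http_version >= (1, 1):
        return True                  # HTTP 1.1+: keep-alive unless close is sent
    return "keep-alive" in tokens    # HTTP 1.0: close unless keep-alive is sent


print(is_persistent((1, 1)))                  # True
print(is_persistent((1, 0)))                  # False
print(is_persistent((1, 0), "Keep-Alive"))    # True
print(is_persistent((1, 1), "close"))         # False
```

Note that this only models what is requested. As said above, a server may still close a connection the client asked to keep alive.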
Sending an empty chunk, means the end of the transfer, but does it mean the end of the connection?
No. Only the Connection header does that. In a keep-alive scenario, the connection is left open so that the client can reuse it to send a new request after the transfer of the previous response has finished.
I thought that I would need to add Connection:close when my intention is to close the connection after an empty chunk has been sent
Correct, especially in HTTP 1.1+ where keep-alive is the default behavior.
where if I don't add the Connection header, it would remain open
The meaning of an omitted Connection header depends on the HTTP version being used, as described above.
A server is allowed to send Connection: close while using chunked transfer encoding; this should not cause clients to disconnect before the response is finished.
According to RFC2616 https://www.rfc-editor.org/rfc/rfc2616#section-14.10
For example,
Connection: close
in either the request or the response header fields indicates that
the connection SHOULD NOT be considered `persistent' (section 8.1)
after the current request/response is complete.
Since the response is not complete while chunks are being sent, the connection is not forbidden from persisting.

Are HTTP keep-alive connections possible without content-length headers?

I understand that in HTTP 1.0, the content of a response is terminated by closing the connection.
In HTTP 1.1, keep-alive connections were introduced, enabling multiple requests and responses in a single TCP connection.
When multiple messages are sent over the same connection, there needs to be a mechanism that defines where one message ends and the next message starts.
By testing, I found out that this works when I set the content-length header in a response. By knowing the content length, the client knows when the content is fully received and can parse the next response.
My question is:
Is it possible to send multiple responses in a keep-alive connection without setting the content-length header?
If yes, how?
For clarification: I am thinking about scenarios where the length of the response is not known when starting to send it to the client and I would like to know if closing the connection is the only way to implement that.
The Transfer-Encoding header is what I was looking for.
By setting the transfer-encoding to chunked, it is possible to omit the Content-Length header.
In the chunked transfer encoding, a message can be sent in multiple chunks for which the length is known. To terminate a message, a chunk with length zero is sent.
This makes it possible to have a keep-alive connection and still send messages where the length is unknown when starting to send them.
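To make the framing concrete, here is a small Python sketch of the sending side. `encode_chunked` and `body_of_unknown_length` are names invented for this example, and trailers and chunk extensions are left out.

```python
def encode_chunked(parts):
    """Yield the wire bytes for an iterable of body fragments (bytes)."""
    for part in parts:
        if not part:
            continue  # a zero-length chunk would end the message early
        size_line = format(len(part), "x").encode("ascii") + b"\r\n"
        yield size_line + part + b"\r\n"
    yield b"0\r\n\r\n"  # terminating zero-length chunk, no trailers


def body_of_unknown_length():
    # Stand-in for content that is produced on the fly.
    yield b"Hello, "
    yield b"world!"


wire = b"".join(encode_chunked(body_of_unknown_length()))
print(wire)  # b'7\r\nHello, \r\n6\r\nworld!\r\n0\r\n\r\n'
```

Each chunk carries its own length in hexadecimal, so the total never has to be known up front, and the connection can stay open after the final `0` chunk.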

HTTP Connection: Keep-Alive

I was reading the part of the HTTP 1.1 spec related to the 'Connection' header. I noticed that the only token specified for the 'Connection' header is "close". After a little digging I found that the 'Keep-Alive' token found in the 'Connection' header in many server implementations, including Vim's which is using Apache 2.2.3, is left over from HTTP 1.0. Given the widespread use of HTTP 1.1, how much value is there in adding Keep-Alive and similar inherited tokens from HTTP 1.0?
Some value; depends on the specific use.
In HTTP 1.1, all connections are considered persistent unless declared otherwise.
In practice, implementations do what they want:
When the client sends another request [after a HTTP Connection: Keep-Alive], it uses the same connection.
This will continue until either the client or the server decides that
the conversation is over, and one of them drops the connection.
So, it's really up to the implementers of the clients and servers to determine how long they keep the TCP connection open for. For example,
The default connection timeout of Apache 2.0 httpd[2] is as little as
15 seconds[3] and for Apache 2.2 only 5 seconds.
It looks like SPDY will form the basis for the upcoming HTTP 2.0. This changes connection handling dramatically.
Sources:
http://en.wikipedia.org/wiki/HTTP_persistent_connection#HTTP_1.1
http://en.wikipedia.org/wiki/SPDY
http://en.wikipedia.org/wiki/HTTP_2.0
https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-http2-08

HTTP: shutdown socket for writing after sending the request?

How is a socket used by an HTTP client properly closed after transmitting the request? Or does it have to remain open (bidirectionally) until the complete response has been received? If so, how is the end of the request body determined by the server?
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4, closing the socket is not an option for a request. That doesn't sound logical to me - why should a half-closed TCP connection be a problem for the server, if the client doesn't try to transmit anything after closing its half of the socket? The client can still receive data after all.
It seems to me that shutting down the write part of a socket would be a very practical way of letting the server know that the request has been finished. http://docs.python.org/howto/sockets.html#disconnecting even specifically mentions that use case.
If that's really the wrong way to do it, what's the alternative? Do I really always have to send a "Content-length" or use chunked transport to enable the server to properly find the end of a request? How does that work for requests with unknown body length?
Transfer-Encoding: chunked is specifically designed to allow sending data with an unknown body length, for both requests and responses. The end of the data is determined by receiving a chunk whose payload size is 0. If you do not send a chunked request, then you must send a Content-Length instead.
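For the receiving side, a minimal Python sketch of how the end of a chunked body can be found without closing anything. `read_chunked` is a hypothetical helper that works on any binary file-like object (for a socket, something like `sock.makefile("rb")`); chunk extensions and trailers are skipped, not interpreted.

```python
import io


def read_chunked(stream):
    """Read one chunked body from a binary file-like object and return it."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line:
            raise EOFError("stream ended before the terminating chunk")
        # Drop any chunk extension after ';' and parse the hexadecimal size.
        size = int(size_line.split(b";", 1)[0].strip().decode("ascii"), 16)
        if size == 0:
            break  # the zero-size chunk marks the end of the body
        data = stream.read(size)
        if len(data) != size:
            raise EOFError("stream ended inside a chunk")
        body += data
        stream.readline()  # consume the CRLF that follows the chunk data
    # Skip optional trailer fields up to the blank line.
    while stream.readline() not in (b"\r\n", b"\n", b""):
        pass
    return bytes(body)


# One chunked body followed by bytes that belong to the next message.
stream = io.BytesIO(b"5\r\nhello\r\n0\r\n\r\nNEXT")
print(read_chunked(stream))  # b'hello'
print(stream.read())         # b'NEXT'
```

The reader stops exactly at the end of the body, so whatever follows on the same connection is left untouched for the next message.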
Are you talking about this?:
Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.
I think the text is talking about a full close; you can still do a half-close (shutting down only the write side). I'm not sure that's an HTTP-compliant way of doing it, but I would think most servers will accept it.
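To show what a half-close looks like at the socket level, here is a small Python sketch using a local `socket.socketpair()` instead of a real HTTP server. `serve_once` and `payload` are names invented for this example, and it says nothing about whether a given HTTP server tolerates this.

```python
import socket
import threading


def serve_once(conn):
    """Read until the peer half-closes, then answer on the still-open direction."""
    received = bytearray()
    while True:
        data = conn.recv(4096)
        if not data:  # recv() returning b"" means the peer shut down its write side
            break
        received += data
    conn.sendall(b"got %d bytes" % len(received))
    conn.close()


payload = b"request body of unknown length"
client, server = socket.socketpair()
worker = threading.Thread(target=serve_once, args=(server,))
worker.start()

client.sendall(payload)
client.shutdown(socket.SHUT_WR)  # half-close: no more writes, reads still work

reply = bytearray()
while True:
    data = client.recv(4096)
    if not data:
        break
    reply += data
worker.join()
client.close()
print(bytes(reply))
```

The client can still read the reply after `shutdown(socket.SHUT_WR)`, which is the difference from a full `close()`.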
Regarding your second question, simply use chunked encoding:
All HTTP/1.1 applications that receive entities MUST accept the "chunked" transfer-coding (section 3.6), thus allowing this mechanism to be used for messages when the message length cannot be determined in advance.

Chunked encoding and content-length header

Is it possible to set the content-length header and also use chunked transfer encoding? and does doing so solve the problem of not knowing the length of the response at the client side when using chunked?
the scenario I'm thinking about is when you have a large file to transfer and there's no problem in determining its size, but it's too large to be buffered completely.
(If you're not using chunked, then the whole response must get buffered first? Right??)
thanks.
No:
"Messages MUST NOT include both a Content-Length header field and a non-identity transfer-coding. If the message does include a non-identity transfer-coding, the Content-Length MUST be ignored." (RFC 2616, Section 4.4)
And no, you can use Content-Length and stream; the protocol doesn't constrain how your implementation works.
Well, you can always send a header stating the size of the file.
Something like response.addHeader("File-Size","size of the file");
And ignore the Content-Length header.
The client implementation has to be tweaked to read this value, but hey you can achieve both the things you want :)
You have to use either Content-Length or chunking, but not both.
If you know the length in advance, you can use Content-Length instead of chunking even if you generate the content on the fly and never have it all at once in your buffer.
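A minimal Python sketch of that idea, assuming the size can be read from the file system: the header is written first, then the file is copied block by block. `send_file_with_content_length` is a hypothetical helper, and `out` stands for any writable binary stream (a `BytesIO` here in place of a socket).

```python
import io
import os
import tempfile


def send_file_with_content_length(path, out, block_size=64 * 1024):
    """Write an HTTP/1.1 response for `path` to `out`, one block at a time."""
    size = os.path.getsize(path)
    head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {size}\r\n"
        "\r\n"
    )
    out.write(head.encode("ascii"))
    with open(path, "rb") as src:
        while True:
            block = src.read(block_size)
            if not block:
                break
            out.write(block)  # only one block is held in memory at a time


# Demo with a temporary file standing in for the large file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200_000)
out = io.BytesIO()
send_file_with_content_length(tmp.name, out, block_size=4096)
os.remove(tmp.name)
response = out.getvalue()
print(response.split(b"\r\n\r\n", 1)[0].decode("ascii"))
```

Nothing here needs chunking, because the length is known before the first body byte is sent.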
However, you should not do that if the data is really large because a proxy might not be able to handle it. For large data, chunking is safer.
These headers can be the cause of a Postman parse error:
"Content-Length" and "Transfer-Encoding" can't be present in the response headers together.
Using a parameterized ResponseEntity<?> instead of a raw ResponseEntity in the controller can fix the issue.
The question asks:
Is it possible to set the content-length header and also use chunked transfer encoding?
The RFC HTTP/1.1 spec, quoted in Julian's answer, says:
Messages MUST NOT include both a Content-Length header field and a non-identity transfer-coding.
There is an important difference between what's possible, and what's allowed by a protocol. It is certainly possible, for example, for you to write your own HTTP/1.1 client which sends malformed messages with both headers. You would be violating the HTTP/1.1 spec in doing so, and so you'd imagine some alarm bells would go off and a bunch of Internet police would burst into your house and say, "Stop, arrest that client!" But that doesn't happen, of course. Your request will get sent to wherever it's going.
OK, so you can send a malformed message. So what? Surely on the receiving end, the server will detect the HTTP/1.1 protocol client-side violation, vanquish your malformed request, and serve you back a stern 400 response telling you that you are due in court the following Monday for violating the protocol. But no, actually, that probably won't happen. Of course, it's beyond the scope of HTTP/1.1 to prescribe what happens to misbehaving clients; i.e. while the HTTP/1.1 protocol is analogous to the "law", there is nothing in HTTP/1.1 analogous to the judicial system.
The best that the HTTP/1.1 protocol can do is dictate how a server must act/respond in the case of receiving such a malformed request. However, it's quite lenient in this case. In particular, the server does not have to reject such malformed requests. In fact, in such a scenario, the rule is:
If the message does include a non-identity transfer-coding, the Content-Length MUST be ignored.
Unfortunately, though, some HTTP servers will violate that part of the HTTP/1.1 protocol and will actually give precedence to the Content-Length header, if both headers are present. This can cause a serious problem, if the message visits two servers in sequence in the same system and they disagree about where one HTTP message ends and the next one starts. It leaves the system vulnerable to HTTP Desync attacks a.k.a. Request Smuggling.
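As a sketch of the rule quoted above, here is a small Python function that decides how a message is framed and flags the case where both headers are present. The name `message_framing` and the returned labels are made up for this example, and rejecting flagged messages is only one possible policy.

```python
def message_framing(headers):
    """Classify how the body length of a message is determined.

    `headers` maps lower-cased header names to their raw values.
    Returns a (mode, length, ambiguous) tuple.
    """
    te = headers.get("transfer-encoding", "").strip().lower()
    cl = headers.get("content-length")
    has_coding = te not in ("", "identity")

    if has_coding:
        # Per the quoted rule, Content-Length is ignored here. The flag lets
        # a stricter front end reject the message instead of forwarding it.
        return ("transfer-coding", None, cl is not None)
    if cl is not None:
        return ("content-length", int(cl), False)
    return ("no-length-header", None, False)


print(message_framing({"transfer-encoding": "chunked", "content-length": "10"}))
# ('transfer-coding', None, True)
print(message_framing({"content-length": "10"}))
# ('content-length', 10, False)
```

If every server in the chain applies the same decision, they agree on where each message ends, which is exactly what goes wrong in the desync scenario described above.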

Resources