HTTP: shutdown socket for writing after sending the request?

How is a socket used by an HTTP client properly closed after transmitting the request? Or does it have to remain open (bidirectionally) until the complete response has been received? If so, how is the end of the request body determined by the server?
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4, closing the socket is not an option for a request. That doesn't sound logical to me - why should a half-closed TCP connection be a problem for the server, if the client doesn't try to transmit anything after closing its half of the socket? The client can still receive data after all.
It seems to me that shutting down the write part of a socket would be a very practical way of letting the server know that the request has been finished. http://docs.python.org/howto/sockets.html#disconnecting even specifically mentions that use case.
If that's really the wrong way to do it, what's the alternative? Do I really always have to send a Content-Length header or use chunked transfer encoding so the server can properly find the end of the request? How does that work for requests with unknown body length?
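For reference, the half-close the question is contemplating would look roughly like this in Python; this is only a sketch of the idea (the host, path, and body are made up), and as the answers below explain, a server is not required to treat the resulting FIN as the end of the request body:

import socket

# Send a request whose body has no length indicator, then half-close.
sock = socket.create_connection(("example.com", 80))
sock.sendall(b"POST /upload HTTP/1.1\r\n"
             b"Host: example.com\r\n"
             b"\r\n"
             b"request body of unknown length")
sock.shutdown(socket.SHUT_WR)   # done writing; the read side stays open

# The response can still be received on the half-closed connection.
response = b""
while True:
    data = sock.recv(4096)
    if not data:
        break
    response += data
sock.close()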

Transfer-Encoding: chunked is specifically designed to allow sending data with an unknown body length, for both requests and responses. The end of the data is determined by receiving a chunk whose payload size is 0. If you do not send a chunked request, then you must send a Content-Length instead.
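To illustrate the framing (the values here are made up), a chunked request looks roughly like this on the wire; line endings are CRLF, each chunk is preceded by its size in hexadecimal, and the zero-sized chunk terminates the body:

POST /upload HTTP/1.1
Host: example.com
Transfer-Encoding: chunked

7
Mozilla
9
Developer
0
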

Are you talking about this?:
Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.
I think the text is talking about a full close; you can do a half (write) close. I'm not sure that's an HTTP-compliant way of doing it, but I would think most servers will accept it.
Regarding your second question, simply use chunked encoding:
All HTTP/1.1 applications that receive entities MUST accept the "chunked" transfer-coding (section 3.6), thus allowing this mechanism to be used for messages when the message length cannot be determined in advance.

Related

chunked encoding and connection:close together

I'm a bit confused about what to do with the "Connection" header when using chunked encoding. Should the "Connection" header be omitted (or set to keep-alive, which amounts to the same thing since we're talking about HTTP 1.1), or is it permitted to set it to Connection: close?
Sending an empty chunk means the end of the transfer, but does it mean the end of the connection? I thought I would need to add Connection: close when my intention is to close the connection after the empty chunk has been sent, whereas if I don't add the Connection header, the connection would remain open.
Thanks
Message Length and Connection Management are two different things that really have nothing to do with each other (except in the one case where the Message Length is completely unknown and closing the connection is the only possible way to denote EOF, but that rarely happens, and is not your situation).
Chunking only applies to Message Length, where a 0-length chunk denotes EOF. If the connection is closed prematurely before EOF is reached, the Message is incomplete, and the receiver can decide whether to keep/process it or not.
The Connection header is used by the client to specify whether it wants the server to close the connection, or leave it open, after sending its response. The same header is used by the server to specify whether the connection is actually closing or staying open after the response has been sent.
Shall the "Connection" header not be added (or set to keep-alive, which is the same as we're talking about HTTP 1.1) or is this authorized to set it to Connection:close
That has nothing to do with chunking. Regardless of the format of the Message, the client and server should always indicate their intentions for the current connection, whether it should be closed or left open. That is done via the presence of, or lack of, a Connection header, depending on the HTTP version:
If HTTP 1.0 is used, the default behavior is close unless Connection: keep-alive is explicitly sent.
If HTTP 1.1+ is used, the default behavior is keep-alive unless Connection: close is explicitly sent.
If the client requests a keep-alive, the server decides whether or not to honor it. The server may either leave the connection open, or it may close the connection.
If the client requests a closure, the server MUST honor it and close the connection.
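As a rough sketch of those rules on the server side (simplified: it assumes the Connection header carries a single token, and the function name is just for illustration):

def connection_should_close(http_version, connection_header):
    # http_version is e.g. "HTTP/1.0" or "HTTP/1.1"; connection_header is
    # the value of the request's Connection header, or None if absent.
    token = (connection_header or "").strip().lower()
    if http_version == "HTTP/1.0":
        # HTTP/1.0 defaults to close; keep-alive only if explicitly requested
        # (and the server may still decide not to honor it).
        return token != "keep-alive"
    # HTTP/1.1+ defaults to keep-alive unless close is explicitly requested.
    return token == "close"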
Sending an empty chunk means the end of the transfer, but does it mean the end of the connection?
No. Only the Connection header does that. This matters especially in a keep-alive scenario, where the connection is left open so the client can reuse it to send a new request after the transfer of the previous response has finished.
I thought I would need to add Connection: close when my intention is to close the connection after the empty chunk has been sent
Correct, especially in HTTP 1.1+ where keep-alive is the default behavior.
whereas if I don't add the Connection header, the connection would remain open
The meaning of an omitted Connection header depends on the HTTP version being used, as described above.
A server is allowed to send Connection: close while using chunked transfer encoding; this should not cause clients to disconnect before the response is finished.
According to RFC 2616, section 14.10 (https://www.rfc-editor.org/rfc/rfc2616#section-14.10):
For example,
Connection: close
in either the request or the response header fields indicates that
the connection SHOULD NOT be considered `persistent' (section 8.1)
after the current request/response is complete.
Since the response is not complete while chunks are being sent, the connection is not forbidden from persisting.
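To illustrate (values made up), a chunked response carrying Connection: close could look like this on the wire; the client still reads up to the zero-sized chunk, and only then is the connection torn down:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: close

5
Hello
6
 world
0
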

Are HTTP keep-alive connections possible without content-length headers?

I understand that in HTTP 1.0, the content of a response is terminated by closing the connection.
In HTTP 1.1, keep-alive connections were introduced, enabling multiple requests and responses in a single TCP connection.
When multiple messages are sent over the same connection, there needs to be a mechanism that defines where one message ends and the next message starts.
By testing, I found out that this works when I set the Content-Length header in a response. By knowing the content length, the client knows when the content is fully received and can parse the next response.
My question is:
Is it possible to send multiple responses in a keep-alive connection without setting the content-length header?
If yes, how?
For clarification: I am thinking about scenarios where the length of the response is not known when starting to send it to the client and I would like to know if closing the connection is the only way to implement that.
The Transfer-Encoding header is what I was looking for.
By setting the transfer-encoding to chunked, it is possible to omit the Content-Length header.
In the chunked transfer encoding, a message can be sent in multiple chunks for which the length is known. To terminate a message, a chunk with length zero is sent.
This makes it possible to have a keep-alive connection and still send messages where the length is unknown when starting to send them.
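For example, two responses sent back to back on the same keep-alive connection could be framed like this (values made up); the first is delimited by Content-Length, the second by chunked encoding, and on the wire the second status line begins immediately after the last body byte of the first response:

HTTP/1.1 200 OK
Content-Length: 5

Hello

HTTP/1.1 200 OK
Transfer-Encoding: chunked

6
chunk1
6
chunk2
0
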

Response sent in chunked transfer encoding and indicating errors happening after some data has already been sent

I am sending a large amount of data in my response to the client using chunked transfer encoding.
How should I be dealing with any errors that occur midway during writing of the response?
I would like to know whether the HTTP spec recommends any practice for this, so that clients can tell that the response is not a successful one and that the server ran into some issue.
Once you have started sending the HTTP headers to the client, you can't send a different response anymore. You have to finish sending the response you intended to send, i.e. the chunked data and associated headers. If an error occurs midway through that, there is no way to report that error to the client. All you can do is close the connection. Either the client does not receive all of the headers, or it does not receive the terminating 0-length chunk at the end of the response. Either way is enough for the client to know that the server encountered an error during sending.
Chunked responses can use a trailer header with some integrity check or post-processing status. This may be better than closing the connection as described in the accepted answer, because you can include a custom error message. However, you only benefit from that if you have control of both the server and the client.

Mapping HTTP requests to HTTP responses

If I make multiple HTTP GET requests to the same server and get HTTP 200 OK responses to each one, how do I tell which request maps to which response using Wireshark?
Currently it looks like an HTTP request is made and the next HTTP 200 OK response is quickly received, so everything is in the proper sequence. I have seen things to the contrary, however. For example, using the Google Maps API v2 I've made several requests for location information and then received the information in an arbitrary order (closely resembling the order in which I requested it, but not necessarily identical).
So my intuition is I cannot assume that my responses will be received in a specific order, even though they may be in order most of the time. So I'm wondering how I can determine this order from the response.
Update: Clarification as to what I need. I just need to know that the server has received the request. It seems like I need to do this by looking at sequence numbers and perhaps even ACKs. The reasoning behind this approach is that I'm basically observing a web app and checking that it is sending the information and that the information is being received.
Update: This has nothing to do with Wireshark specifically. I believe it is confusing people, so I am removing it from the title. It has to do with the HTTP protocol on top of TCP/IP and how we map responses to requests.
Thanks.
After you have stopped capturing packets, follow these steps:
position the cursor on a GET request
Open the Analyze menu
click "Follow TCP Stream"
You get a new window with requests and responses in sequence.
While I was googling for a completely different question, I saw this one, and I think I can provide a more complete answer:
HTTP dictates that responses must arrive in the order they were requested. Therefore, if you are looking at a single TCP connection at a given time, you should be seeing:
Request ; Response ; Request ; Response ...
Also, in HTTP/1.1 there is support for pipelining, where the client doesn't have to wait for a response to arrive before issuing the next request. What could be observed in such cases is:
Request ; Response ; Request ; Request ; Response ; Response ; Request ; Response
In the HTTP response itself, there is no reference to the specific request that triggered it.
Filipo's suggestion is the classic approach when debugging or observing a single TCP connection, but when observing multiple TCP connections you can't just click Follow TCP Stream, because you'd have to do it for each connection.
If you have many TCP connections and many requests/responses, you will have to look at the TCP source port in the request packet and the TCP destination port in the response packet to know which response belongs to which TCP connection, and then apply the HTTP request/response ordering rules.
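A rough sketch of that pairing logic in Python, assuming the captured HTTP messages have already been exported along with their TCP ports (the message format here is made up for illustration):

from collections import defaultdict, deque

def pair_requests_with_responses(messages):
    # messages: dicts in capture order with keys "kind" ("request" or
    # "response"), "src_port" and "dst_port". Within one connection,
    # HTTP/1.x responses come back in request order, so a FIFO queue
    # per connection is enough to match them up.
    pending = defaultdict(deque)
    pairs = []
    for msg in messages:
        if msg["kind"] == "request":
            key = (msg["src_port"], msg["dst_port"])   # client port, server port
            pending[key].append(msg)
        else:
            key = (msg["dst_port"], msg["src_port"])   # flipped for responses
            if pending[key]:
                pairs.append((pending[key].popleft(), msg))
    return pairs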
Also, Wireshark CAN decompress the response body, and it will do so automatically if the entire response body has arrived, but it will not do so in the Follow TCP Stream view.
I always use Wireshark to debug HTTP.
It seems like this ability is not provided by the HTTP protocol at the application layer, so I must go down to the transport layer to determine this. In my case, that means the TCP layer, using sequence numbers.
HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used; the mapping of the HTTP/1.1 request and response structures onto the transport data units of the protocol in question is outside the scope of this specification.
Read more:
http://www.faqs.org/rfcs/rfc2616.html
Don't use Wireshark to debug HTTP; use an HTTP debugger such as Fiddler2.

How do I report an error midway through a chunked HTTP response without closing the connection?

I have an HTTP server that returns large bodies in response to POST requests (it is a SOAP server). These bodies are "streamed" via chunking. If I encounter an error midway through streaming the response, how can I report that error to the client and still keep the connection open? The implementation uses a proprietary HTTP/SOAP stack, so I am interested in answers at the HTTP protocol level.
Once the server has sent the status line (the very first line of the response) to the client, you can't change the status code of the response anymore. Many servers delay sending the response by buffering it internally until the buffer is full. While the buffer is filling up, you can still change your mind about the response.
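A minimal sketch of that buffering idea in Python (the handler and the generate_body callable are hypothetical); because nothing is written to the socket until the body has been produced, the status line can still be changed if an error occurs:

def handle_request(sock, generate_body):
    # Buffer the whole body first, so the status can still change on error.
    try:
        body = generate_body()
        status = b"HTTP/1.1 200 OK"
    except Exception:
        body = b"internal error"
        status = b"HTTP/1.1 500 Internal Server Error"

    head = status + b"\r\nContent-Length: " + str(len(body)).encode("ascii") + b"\r\n\r\n"
    sock.sendall(head + body)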
If your client has access to the response headers, you could use the fact that chunked encoding allows the server to add a trailer with headers after the chunked-encoded body. So, your server, having encountered the error, could gracefully stop sending the body, and then send a trailer that sets some header to some value. Your client would then interpret the presence of this header as a sign that an error happened.
Also keep in mind that chunked responses can contain "footers" which are just like HTTP headers. After failing, you can send a footer such as:
X-RealStatus: 500 Some bad stuff happened
Or if you succeed:
X-RealStatus: 200 OK
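Putting that together, a chunked response that announces and then sends such a footer could look like this on the wire (values made up; X-RealStatus is the custom header suggested above, not a standard one):

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Trailer: X-RealStatus

5
Hello
0
X-RealStatus: 500 Some bad stuff happened
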
You can change the status code as long as response.isCommitted() returns false.
(For HttpServletResponse in Java; I'm sure an equivalent exists in other languages.)

Resources