Is there any rule that says whether an HTTP server implementation must read the request body, or may skip it, before sending a response?
Once you have received the full HTTP request, you are free to do whatever you want with it. If you don't care about the body, you can simply read it and discard it. You should still empty the read buffer, though: on a persistent connection, leftover body bytes would otherwise be parsed as the start of the next request.
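As a rough illustration of "read it and discard it", here is a minimal sketch that drains a body of known Content-Length from a file-like stream. The helper name drain_body and the in-memory stream standing in for a connection are assumptions made for the example, not part of any particular server.

```python
import io

def drain_body(stream, content_length, bufsize=8192):
    """Read and discard up to content_length bytes; return how many were discarded."""
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(bufsize, remaining))
        if not chunk:  # the peer closed early, nothing more to drain
            break
        remaining -= len(chunk)
    return content_length - remaining

# A fake connection: a 5-byte body we do not care about, followed by the next request.
conn = io.BytesIO(b"hello" + b"GET /next HTTP/1.1\r\n")
discarded = drain_body(conn, 5)
next_line = conn.readline()  # parsing can resume cleanly at the next request line
```

Reading in bounded pieces keeps memory use flat even when the unwanted body is large.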
There are similar questions on managing long-running HTTP requests, but my goal here is to avoid streaming or chunked encoding.
The reason is to keep a simpler experience for the API user, where they make an API request and receive back an image file in the response -- no need to reassemble chunked/stream data.
Based on posts elsewhere, sending the response headers immediately, before the body is ready, will keep the connection alive.
Chunked encoding lets you do this, but it puts more work on the user, who has to reassemble the chunked/streamed body data.
Is it possible to send one response header property back, then send the remaining headers when the body is ready?
I need to get a CRC32 checksum of a file I'm downloading through an HTTP GET request - without actually opening the response body.
I am building a proxy app, which gets a request from a client and makes the actual GET call. I'd like the response the proxy gets from the server to contain the checksum, without having to read through the actual data in the response body. I connect the response body's reader stream to the writer stream that I return to the client.
I read about the "Want-Digest" header, which I can add to the request and which should result in the response containing a "Digest" header with a checksum - but it did not work.
I also looked into the Content-MD5 header, but when I try to download some photos, I see I'm not getting it in the response (also, I read that it is deprecated).
Thanks in advance!
Any headers, such as 'Want-Digest' or 'Content-MD5', will be up to the server to implement. Most servers will probably ignore those headers, which is why they aren't working for you. If you want to calculate the CRC32 of the body, you'll have to open the body and calculate it yourself.
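If the proxy is already copying the body from a reader to a writer, the checksum can be accumulated during that copy instead of in a separate pass. A minimal sketch, assuming file-like reader and writer objects; copy_with_crc32 and the in-memory streams are hypothetical names for illustration:

```python
import io
import zlib

def copy_with_crc32(reader, writer, bufsize=64 * 1024):
    """Copy reader to writer in pieces and return the CRC32 of everything copied."""
    crc = 0
    while True:
        chunk = reader.read(bufsize)
        if not chunk:
            break
        crc = zlib.crc32(chunk, crc)  # feed the running value back in
        writer.write(chunk)
    return crc & 0xFFFFFFFF

payload = b"example image bytes" * 1000
upstream = io.BytesIO(payload)   # stands in for the origin server's response body
downstream = io.BytesIO()        # stands in for the stream returned to the client
checksum = copy_with_crc32(upstream, downstream)
```

Note that the value is only known once the last byte has passed through, so it cannot go into a header that is sent ahead of the body.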
If you have access to the TCP headers I suppose you could access the TCP checksum, though that is a relatively weak checksum even compared to CRC32, and it is also a checksum of the entire packet, not just the body.
If I run
r = requests.get('http://github.com', stream=True)
and watch tcpdump, the content of the page is downloaded right after requests.get returns. When I then access r.content, tcpdump shows no further transfer activity. The same happens with requests.Session(stream=True).
Don't use GET if you don't want a response body to be sent by the server. Use a HEAD request instead if all you need is the header information.
All stream=True does is defer reading the response body from the socket. The server can still start sending that body, so the socket receive buffer will already hold some of it for Python to read.
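To see the difference a HEAD request makes without touching the network, here is a small sketch using only the standard library. The throwaway local server, its Handler class, and the 1000-byte body are assumptions for the demonstration; with requests the equivalent call would be requests.head(url).

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b"x" * 1000

class Handler(BaseHTTPRequestHandler):
    def _send_headers(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(BODY)))
        self.end_headers()

    def do_HEAD(self):
        self._send_headers()          # headers only, no body

    def do_GET(self):
        self._send_headers()
        self.wfile.write(BODY)

    def log_message(self, *args):     # keep the demo quiet
        pass

server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
conn.request("HEAD", "/")
resp = conn.getresponse()
head_length = resp.getheader("Content-Length")  # the size is still advertised
head_body = resp.read()                         # but no body bytes are sent
conn.close()
server.shutdown()
server.server_close()
```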
How is a socket used by an HTTP client properly closed after transmitting the request? Or does it have to remain open (bidirectionally) until the complete response has been received? If so, how is the end of the request body determined by the server?
According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4, closing the socket is not an option for a request. That doesn't sound logical to me - why should a half-closed TCP connection be a problem for the server, if the client doesn't try to transmit anything after closing its half of the socket? The client can still receive data after all.
It seems to me that shutting down the write part of a socket would be a very practical way of letting the server know that the request has been finished. http://docs.python.org/howto/sockets.html#disconnecting even specifically mentions that use case.
If that's really the wrong way to do it, what's the alternative? Do I really always have to send a "Content-Length" header or use chunked transfer encoding so the server can find the end of a request? How does that work for requests with unknown body length?
Transfer-Encoding: chunked is specifically designed to allow sending data with an unknown body length, for both requests and responses. The end of the data is determined by receiving a chunk whose payload size is 0. If you do not send a chunked request, then you must send a Content-Length instead.
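The framing is simple enough to sketch by hand: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a size of 0 ends the body. The helper names encode_chunked and decode_chunked are made up for this example, and the decoder ignores trailers for brevity.

```python
def encode_chunked(pieces):
    """Frame an iterable of byte strings as a chunked body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # an empty piece would look like the terminating chunk
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n\r\n"  # zero-size chunk, then the empty line that ends the body
    return bytes(out)

def decode_chunked(data):
    """Reassemble the payload from a complete chunked body."""
    body = bytearray()
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # Drop any chunk extension after ';' and parse the hex size.
        size = int(data[pos:eol].split(b";")[0].decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)

wire = encode_chunked([b"Hello, ", b"world"])
decoded = decode_chunked(wire)
```

The sender never needs to know the total length in advance; it only needs the length of each piece as it becomes available.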
Are you talking about this?:
Closing the connection cannot be used to indicate the end of a request body, since that would leave no possibility for the server to send back a response.
I think the text is talking about a full close; you can still do a half (write) close. I'm not sure that's an HTTP-compliant way of doing it, but I would think most servers will accept it.
Regarding your second question, simply use chunked encoding:
All HTTP/1.1 applications that receive entities MUST accept the "chunked" transfer-coding (section 3.6), thus allowing this mechanism to be used for messages when the message length cannot be determined in advance.
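Going back to the half-close idea, the socket mechanics can be tried locally. This is only a sketch of the TCP-level behaviour using socket.socketpair() as a stand-in for a client and a server; it says nothing about whether a particular HTTP server accepts a request delimited this way.

```python
import socket

client, server = socket.socketpair()

# Client: send a request without Content-Length, then close only the write side.
client.sendall(b"POST / HTTP/1.0\r\n\r\nbody of unknown length")
client.shutdown(socket.SHUT_WR)

# Server: read until EOF, which here marks the end of the request.
request = bytearray()
while True:
    data = server.recv(4096)
    if not data:
        break
    request += data

# The server can still answer, because only one direction was shut down.
server.sendall(b"HTTP/1.0 200 OK\r\n\r\nok")
server.close()

# Client: the read side is still open, so the response arrives normally.
response = bytearray()
while True:
    data = client.recv(4096)
    if not data:
        break
    response += data
client.close()
```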
I have an HTTP server that returns large bodies in response to POST requests (it is a SOAP server). These bodies are "streamed" via chunking. If I encounter an error midway through streaming the response how can I report that error to the client and still keep the connection open? The implementation uses a proprietary HTTP/SOAP stack so I am interested in answers at the HTTP protocol level.
Once the server has sent the status line (the very first line of the response) to the client, you can't change the status code of the response anymore. Many servers delay sending the response by buffering it internally until the buffer is full. While the buffer is filling up, you can still change your mind about the response.
If your client has access to the response headers, you could use the fact that chunked encoding allows the server to add a trailer with headers after the chunked-encoded body. So, your server, having encountered the error, could gracefully stop sending the body, and then send a trailer that sets some header to some value. Your client would then interpret the presence of this header as a sign that an error happened.
Also keep in mind that chunked responses can contain "footers" (trailers), which look just like HTTP headers and come after the last chunk. After failing, you can send a footer such as:
X-RealStatus: 500 Some bad stuff happened
Or if you succeed:
X-RealStatus: 200 OK
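On the wire, such a footer sits between the zero-size chunk and the final empty line. A minimal sketch of producing and reading it, assuming the X-RealStatus name from above; the helper functions are hypothetical, and a real server would also announce the field up front with a "Trailer: X-RealStatus" response header. Whether a given client library exposes trailers at all is something to check before relying on this.

```python
def chunked_with_trailer(pieces, trailers):
    """Frame pieces as a chunked body and append trailer fields after the last chunk."""
    out = bytearray()
    for piece in pieces:
        if piece:
            out += b"%x\r\n" % len(piece) + piece + b"\r\n"
    out += b"0\r\n"                                   # last chunk
    for name, value in trailers.items():
        out += f"{name}: {value}\r\n".encode("ascii")  # trailer fields
    out += b"\r\n"                                     # end of message
    return bytes(out)

def parse_chunked(wire):
    """Return (body, trailers) from a complete chunked message body."""
    body, pos = bytearray(), 0
    while True:
        eol = wire.index(b"\r\n", pos)
        size = int(wire[pos:eol].decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        body += wire[pos:pos + size]
        pos += size + 2
    trailers = {}
    for line in wire[pos:].split(b"\r\n"):
        if not line:  # the empty line ends the trailer section
            break
        name, _, value = line.decode("ascii").partition(":")
        trailers[name.strip()] = value.strip()
    return bytes(body), trailers

# The server failed after sending part of the body and reports it in the footer.
wire = chunked_with_trailer([b"<partial>"], {"X-RealStatus": "500 Some bad stuff happened"})
body, trailers = parse_chunked(wire)
```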
You can change the status code as long as response.isCommitted() returns false.
(This is for HttpServletResponse in Java; I'm sure an equivalent exists in other languages.)