I've read about chunked encoding in a few places, but I still don't quite get why, for example, a radio streaming server such as this one sends its data with chunked Transfer-Encoding. A client of such a server just continuously reads data from the server. At no point does it need to know how much data is currently being sent, since it is always consuming data (as long as it stays connected).
It seems that the client doesn't use the chunk sizes; it just discards them, which adds extra work for it. Can anyone explain this to me?
An HTTP/1.1 response on a persistent connection needs to be framed with either a Content-Length header or chunked encoding. If you can't know the length in advance, you must use chunked encoding. A continuous stream would in theory have an infinite length.
An HTTP client needs one of these to know when the response ends. After the response, a new HTTP request could be sent on the same connection.
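To make the framing concrete, here is a minimal sketch of how a body of unknown total length can be put on the wire as chunks. The function name `encode_chunked` is made up for illustration; only the wire format (hex size, CRLF, data, CRLF, then a zero-size chunk) comes from HTTP/1.1.

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each non-empty part as one HTTP/1.1 chunk (hypothetical helper)."""
    for part in parts:
        if not part:
            # A zero-length chunk means "end of body", so never emit one early.
            continue
        size_line = f"{len(part):X}".encode("ascii") + b"\r\n"
        yield size_line + part + b"\r\n"
    # The terminating zero-size chunk, followed by an empty trailer section.
    yield b"0\r\n\r\n"


wire = b"".join(encode_chunked([b"Hello, ", b"world"]))
print(wire)
```

A streaming server that never stops simply never reaches the final `0` chunk; the client keeps reading one size-prefixed chunk after another.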
You are correct that in the case of a continuous stream there is no need to detect when the stream ends, because it doesn't. The authors of HTTP/1.1 could have included a third 'continuous stream of bytes' option. Perhaps that use case wasn't considered all those years ago.
Note that HTTP/2 doesn't have chunked encoding anymore, but it does something similar and splits the body across multiple DATA frames.
Related
I'm teaching myself how to make a rudimentary HTTP server, and I've recently learned about Transfer-Encoding: chunked. I understand that each chunk reports its own size, but all of the documentation I can find seems to indicate that Content-Length on the initial response is only useful for a standard response body, and is ignored for chunked content, making it effectively impossible for a client to know the size of a chunked response body until it's finished.
However, nearly every file I've ever downloaded in my time on the internet has somehow reported its size to the browser ahead of time, so it's clearly not only possible, but common, to the point where it's odd not to.
Is this a (non-standard?) behavior that common HTTP clients implement, reading the Content-Length (or some other) header as an indicator of the total chunked length, or is it something else entirely?
There are similar questions on managing long-running HTTP requests, but the goal is to avoid streaming or chunked encoding.
The reason is to keep a simpler experience for the API user, where they make an API request and receive back an image file in the response -- no need to reassemble chunked/stream data.
Based on posts elsewhere, sending the response headers immediately, before the body is ready, will keep the connection alive.
Chunked encoding lets you do this, but this imposes more work on the user to reassemble the chunked/stream body data.
Is it possible to send one response header property back, then send the remaining headers when the body is ready?
I would like to understand the usage of 'Transfer-Encoding: chunked' in the case of HTTP requests.
Is it common for requests to be chunked?
My thinking is no since requests need to be completely read before processing, it does not make sense to be sending chunked requests.
It is not that common, but it can be very useful for large request bodies.
My thinking is no since requests need to be completely read before processing, it does not make sense to be sending chunked requests.
(1) No, they don't need to be read completely.
(2) ...and the main reason to compress it to save bytes on the wire anyway.
For an HTTP agent acting as a reverse proxy or a forward proxy, that is, one taking a message from one side and sending it out on the other, chunked transmission means you can forward the parts of the message you already have without storing the whole message locally. You avoid the buffering problems: slowdown and storage.
You can also optimize for each actor's preferred data block size. One actor might like sending blocks of 8000 bytes because that number suits its own kernel settings (TCP windows, internal HTTP server buffer size, etc.), while another actor along the transmission path uses smaller chunks of 2048 bytes.
Finally, you do not need to compute the size of the message: the message ends at the end-of-stream marker, that's all. This is also useful if you are sending something that is compressed on the fly, since you may not know the final size until everything is compressed.
Chunked transmission is used a lot. It is the default mode of most HTTP servers if you ask for HTTP/1.1 rather than HTTP/1.0.
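The compress-on-the-fly case can be sketched as follows. This is an illustration under assumptions, not how any particular server does it: `gzip_chunked` and `frame` are hypothetical names, and the input blocks stand in for data arriving from some upstream source.

```python
import zlib
from typing import Iterable, Iterator


def frame(data: bytes) -> bytes:
    """Wrap data as a single HTTP/1.1 chunk (hypothetical helper)."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


def gzip_chunked(blocks: Iterable[bytes]) -> Iterator[bytes]:
    """Compress blocks as they arrive and emit each compressed piece as a chunk."""
    compressor = zlib.compressobj(wbits=31)  # wbits=31 selects the gzip container
    for block in blocks:
        piece = compressor.compress(block)
        if piece:  # the compressor may buffer and return nothing yet
            yield frame(piece)
    tail = compressor.flush()
    if tail:
        yield frame(tail)
    yield b"0\r\n\r\n"  # end-of-body marker; total size was never computed


wire = b"".join(gzip_chunked([b"a" * 5000, b"b" * 5000]))
print(len(wire))
```

The sender never needs to hold the whole compressed body in memory or know its final length; each piece goes out as soon as the compressor produces it.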
Say I started writing to the response body, but then some error occurred, and I need to indicate that it's an HTTP 500 even though an HTTP 200 OK status line was already written...
How can I write something to the body of the response that's guaranteed to be malformed so that the response is interpreted as some sort of error by the client?
In general, this is impossible. Some clients only care about the response header, and may stop paying attention to what you send after the header.
But with certain clients, in certain cases, this may be possible.
I assume HTTP/1.1 here. HTTP/2 probably gives even more opportunities, because there’s more to screw up in the protocol, and the implementations are often stricter. Conversely, HTTP/1.0 is dumber and laxer, so harder to break.
1. Close the connection before the end of response, as indicated by your framing. If your response is framed with Content-Length: 100, close before you’ve sent the 100th byte of payload. If your response is framed with Transfer-Encoding: chunked, close before you’ve sent the final empty chunk. If the client expects to receive the entire payload, it may (and should) treat this as an error. But some won’t, including very popular client libraries.
2. If the payload is in a structured format, like JSON or XML, then do the same as 1, but before closing, send something that would disrupt that format. For example, no valid JSON text can end with {. Even if the client doesn’t recognize the incomplete payload as an error, it might then fail on trying to parse it.
3. Same as 1, but instead of closing the connection, just stop sending data. The client will “hang” until its receive operation times out, which it may treat as an error. This may be a bad idea if the client is operated by someone who is not prepared for such extravagant timeouts.
4. Only with Transfer-Encoding: chunked: Same as 3, but instead of hanging, send bogus very long chunks and/or keep sending chunks indefinitely, until the client gives up or crashes. Probably a very bad idea, bordering on malicious.
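As an illustration of the first option, the sketch below feeds truncated responses to Python's standard `http.client` parser, which is one client that does report an early close as an error (`IncompleteRead`). The `FakeSocket` class and the helper names are made up for this example; it replays canned bytes instead of using a real connection, and other client libraries may behave differently.

```python
import http.client
import io


class FakeSocket:
    """Hypothetical stand-in for a socket: serves canned bytes, then EOF."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def makefile(self, *args, **kwargs):
        return io.BytesIO(self._data)


def read_body(raw: bytes) -> bytes:
    """Parse a raw HTTP response and return its body."""
    response = http.client.HTTPResponse(FakeSocket(raw))
    response.begin()  # parse the status line and headers
    return response.read()


def is_truncated(raw: bytes) -> bool:
    """True if http.client considers the body cut short."""
    try:
        read_body(raw)
    except http.client.IncompleteRead:
        return True
    return False


# Content-Length promises 100 bytes, but the "server" closed after 10.
short_length = b"HTTP/1.1 200 OK\r\nContent-Length: 100\r\n\r\n" + b"only ten b"
# Chunked body that ends without the final zero-size chunk.
short_chunked = b"HTTP/1.1 200 OK\r\nTransfer-Encoding: chunked\r\n\r\n5\r\nhello\r\n"
complete_chunked = short_chunked + b"0\r\n\r\n"

print(is_truncated(short_length), is_truncated(short_chunked), is_truncated(complete_chunked))
```

Whether the application behind such a client actually surfaces the exception to anyone is a separate matter, which is why the answer calls this approach unreliable in general.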
Can I be sure that a chunked HTTP response will be sent uninterrupted by anything else? I need to differentiate responses (and requests) and this isn't a simple case of reading content length, seeing a closed connection or a no-body response code.
Can I read each chunk and once chunk-size is 0 I will have read exactly one response (or request)? i.e. is it possible for part of any other response to have been sent interleaved? I suspect it is sent consecutively and uninterrupted as there doesn't appear to be any kind of identification in the spec for chunked transfer, so how could more than one be reassembled?
Finally, if a response is sent chunked, does the client send anything more than its original request? I'm thinking along the lines of flow control and error checking but that is all handled at lower layers, so I suspect that the client does not send anything more.
Thanks!
"Interleaved" with what exactly? HTTP/1.1 doesn't allow sending several responses concurrently on the same connection. Even with pipelining, responses are still sent one after another. That is, you will see all the chunks of one response arrive in order before the response to any other request.
As for your final question, no, the client doesn't send anything more than the original request.
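Because the chunks of one response arrive consecutively, reading until the zero-size chunk delimits exactly one body. A minimal sketch, assuming the headers have already been consumed and the whole body is in memory; `read_chunked` is a hypothetical helper, and it skips chunk extensions and trailer fields rather than interpreting them.

```python
from typing import Tuple


def read_chunked(buf: bytes) -> Tuple[bytes, bytes]:
    """Decode one chunked body from buf; return (body, unconsumed bytes)."""
    body = bytearray()
    pos = 0
    while True:
        eol = buf.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; ignore it.
        size = int(buf[pos:eol].split(b";", 1)[0], 16)
        pos = eol + 2
        if size == 0:
            break
        body += buf[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    # Skip optional trailer fields up to the blank line that ends the message.
    while True:
        eol = buf.index(b"\r\n", pos)
        line = buf[pos:eol]
        pos = eol + 2
        if not line:
            break
    return bytes(body), buf[pos:]


# Two chunked bodies back to back, as with pipelined responses.
stream = b"5\r\nhello\r\n0\r\n\r\n" + b"3\r\nabc\r\n0\r\n\r\n"
first, rest = read_chunked(stream)
second, leftover = read_chunked(rest)
print(first, second, leftover)
```

Whatever follows the terminating chunk belongs to the next message on the connection, so no per-chunk identifier is needed.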