Chunked encoding and content-length header

Is it possible to set the Content-Length header and also use chunked transfer encoding? And does doing so solve the problem of the client not knowing the length of the response when chunked encoding is used?
The scenario I'm thinking about is when you have a large file to transfer and there's no problem determining its size, but it's too large to be buffered completely.
(If you're not using chunked encoding, then the whole response must be buffered first, right?)
Thanks.

No:
"Messages MUST NOT include both a Content-Length header field and a non-identity transfer-coding. If the message does include a non-identity transfer-coding, the Content-Length MUST be ignored." (RFC 2616, Section 4.4)
And no, you can use Content-Length and stream; the protocol doesn't constrain how your implementation works.

Well, you can always send a header stating the size of the file.
Something like response.addHeader("File-Size","size of the file");
And ignore the Content-Length header.
The client implementation has to be tweaked to read this value, but hey you can achieve both the things you want :)

You have to use either Content-Length or chunking, but not both.
If you know the length in advance, you can use Content-Length instead of chunking even if you generate the content on the fly and never have it all at once in your buffer.
However, you should not do that if the data is really large because a proxy might not be able to handle it. For large data, chunking is safer.
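The point that Content-Length does not force buffering can be sketched as follows. This is a minimal illustration, not any particular framework's API: the function name `stream_file_with_length` and the 64 KiB block size are assumptions made for the example. The file size is read from the filesystem up front, and the body is then produced one block at a time.

```python
import os
from typing import Dict, Iterator, Tuple


def stream_file_with_length(
    path: str, block_size: int = 64 * 1024
) -> Tuple[Dict[str, str], Iterator[bytes]]:
    """Return (headers, body iterator) for a file whose size is known up front."""
    # The size comes from file metadata, so nothing has to be read yet.
    size = os.path.getsize(path)
    headers = {"Content-Length": str(size)}

    def body() -> Iterator[bytes]:
        with open(path, "rb") as f:
            while True:
                block = f.read(block_size)
                if not block:
                    break
                # Only one block is held in memory at a time.
                yield block

    return headers, body()
```

A server loop would write the headers first and then write each yielded block to the socket as it arrives.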

These headers can cause a Postman parse error:
"Content-Length" and "Transfer-Encoding" can't be present in the response headers together.
Using a parameterized ResponseEntity<?> instead of a raw ResponseEntity in the controller can fix the issue.

The question asks:
Is it possible to set the content-length header and also use chunked transfer encoding?
The RFC HTTP/1.1 spec, quoted in Julian's answer, says:
Messages MUST NOT include both a Content-Length header field and a non-identity transfer-coding.
There is an important difference between what's possible, and what's allowed by a protocol. It is certainly possible, for example, for you to write your own HTTP/1.1 client which sends malformed messages with both headers. You would be violating the HTTP/1.1 spec in doing so, and so you'd imagine some alarm bells would go off and a bunch of Internet police would burst into your house and say, "Stop, arrest that client!" But that doesn't happen, of course. Your request will get sent to wherever it's going.
OK, so you can send a malformed message. So what? Surely on the receiving end, the server will detect the HTTP/1.1 protocol client-side violation, vanquish your malformed request, and serve you back a stern 400 response telling you that you are due in court the following Monday for violating the protocol. But no, actually, that probably won't happen. Of course, it's beyond the scope of HTTP/1.1 to prescribe what happens to misbehaving clients; i.e. while the HTTP/1.1 protocol is analogous to the "law", there is nothing in HTTP/1.1 analogous to the judicial system.
The best that the HTTP/1.1 protocol can do is dictate how a server must act/respond in the case of receiving such a malformed request. However, it's quite lenient in this case. In particular, the server does not have to reject such malformed requests. In fact, in such a scenario, the rule is:
If the message does include a non-identity transfer-coding, the Content-Length MUST be ignored.
Unfortunately, though, some HTTP servers will violate that part of the HTTP/1.1 protocol and will actually give precedence to the Content-Length header, if both headers are present. This can cause a serious problem, if the message visits two servers in sequence in the same system and they disagree about where one HTTP message ends and the next one starts. It leaves the system vulnerable to HTTP Desync attacks a.k.a. Request Smuggling.
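To make the disagreement concrete, here is a small sketch with two toy parsers: one trusts Content-Length, the other trusts Transfer-Encoding. Both parsers, the helper names, and the host `example.test` are hypothetical and heavily simplified (no trailers, no error handling). They only show how the same bytes can be split into messages differently.

```python
def split_headers_body(raw: bytes):
    """Split a raw HTTP message into a lowercase header dict and the remaining bytes."""
    head, _, rest = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, rest


def body_by_content_length(raw: bytes):
    """Toy parser that gives precedence to Content-Length."""
    headers, rest = split_headers_body(raw)
    n = int(headers[b"content-length"])
    return rest[:n], rest[n:]  # (body, bytes left over for the "next" message)


def body_by_chunked(raw: bytes):
    """Toy parser that follows the rule and decodes the chunked body."""
    _, rest = split_headers_body(raw)
    body = b""
    while True:
        size_line, _, rest = rest.partition(b"\r\n")
        size = int(size_line.split(b";")[0], 16)
        if size == 0:
            # Assumes no trailer fields: consume the final CRLF only.
            _, _, rest = rest.partition(b"\r\n")
            return body, rest
        body += rest[:size]
        rest = rest[size + 2:]  # skip the chunk data and its CRLF


# A malformed request carrying both headers.
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.test\r\n"
    b"Content-Length: 6\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
    b"0\r\n\r\nG"
)

cl_body, cl_leftover = body_by_content_length(request)
te_body, te_leftover = body_by_chunked(request)
```

The Content-Length parser treats all six bytes as the body, while the chunked parser sees an empty body and leaves `G` behind as the start of another request. That leftover is what a smuggling attack exploits.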

Related

How to ask an HTTP server to send the `Content-Length` header field?

I am testing the Last.fm API using a raw socket interface.
I noticed that some of the API's HTTP responses do not contain a Content-Length field.
Is there a way to ask the server to send it?
Without it, I can't handle the response elegantly in my program.
Quoth the RFC:
7.2.2 Length
When an Entity-Body is included with a message, the length of that body may be determined in one of two ways. If a Content-Length header field is present, its value in bytes represents the length of the Entity-Body. Otherwise, the body length is determined by the closing of the connection by the server.

The right RFC to look at is RFC 7230 (Section 3.3.2).
And no, in HTTP/1.1 a client has to be able to process chunked encoding (which would be the only legitimate reason not to provide a Content-Length header field).
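As a rough sketch of what "being able to process" both cases means for a client, the function below decides how a response body is delimited. It is a simplification of the rules in RFC 7230 Section 3.3.3 (it ignores status codes such as 204 and 304 and responses to HEAD), and the name `response_body_framing` is made up for this example.

```python
def response_body_framing(headers: dict) -> str:
    """Return how the response body is delimited: 'chunked', 'content-length' or 'until-close'."""
    lowered = {name.lower(): value for name, value in headers.items()}

    transfer_encoding = lowered.get("transfer-encoding")
    if transfer_encoding is not None:
        codings = [c.strip().lower() for c in transfer_encoding.split(",")]
        # Transfer-Encoding takes precedence over any Content-Length.
        if codings[-1] == "chunked":
            return "chunked"
        return "until-close"

    if "content-length" in lowered:
        return "content-length"

    # Neither header: the body ends when the server closes the connection.
    return "until-close"
```

A raw-socket client would call this after parsing the header block and then pick the matching read loop.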

HEAD headers differ from GET, chunked transfer

A web application under test behaves in an odd way. A HEAD request returns the header Content-Length, but the subsequent GET returns Transfer-Encoding: chunked. I expected the headers to be equal, and the RFC says SHOULD, so my question is: how legitimate and how common is this behaviour?
UPDATE: It turns out that the root cause of the problem is HAProxy's behaviour. For a HEAD request, the response is propagated as is from the application underneath. For GET, however, it applies compression and sets chunked transfer encoding. I'll close this question as off-topic and perhaps ask on ServerFault.

If the server uses chunked encoding for GET but returns Content-Length for HEAD, this is IMHO an indication that the information returned for HEAD is unlikely to be correct.

A response to HEAD does not carry an entity-body, whereas a GET response does. If the HTTP server has chunked transfer encoding enabled, it does not send Content-Length in the GET response because it is not needed: the server does not need to know the length of the content before it starts transmitting a response to the client, so it can begin transmitting dynamically generated content before knowing its total size. Perhaps this is the most likely explanation.
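For illustration, the chunked framing itself is simple enough to sketch in a few lines. The generator below (the name `encode_chunked` is just for this example, and it emits no trailer fields) frames each piece as it becomes available. This is why the total size never has to be known in advance.

```python
from typing import Iterable, Iterator


def encode_chunked(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame each non-empty part as an HTTP/1.1 chunk, then emit the terminating chunk."""
    for part in parts:
        if part:  # a zero-length chunk would end the body early, so skip empty parts
            # Chunk size in hexadecimal, CRLF, the data, CRLF.
            yield b"%x\r\n" % len(part) + part + b"\r\n"
    # The last chunk has size zero and is followed by an empty trailer section.
    yield b"0\r\n\r\n"
```

For example, `b"".join(encode_chunked([b"Wiki", b"pedia"]))` produces `b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"`.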

How to signal that something went wrong on the server during a response that started as 200 OK (fail gracefully)

I am curious whether there is any standard method in the HTTP 1.X protocol to signal that a problem occurred on the server during an HTTP response that started as 200 OK.
How can we tell the client about a server error if the 200 OK header has already been returned and we are currently sending the response body? Preferably in some standards-compliant way.
UPD: There is a duplicate, but without a single answer (!): HTTP: error during reply after 200 OK status code.
To be specific: I cannot use Content-Length to check at the end of the response, because the length can't be known when the response starts.
Additionally, I can't cache the whole response on the server before sending it (because it is too big and I would run out of memory, and it takes too long to generate for the user to wait, etc.).

There is no standard method to do what you want.
To be precise, the standard method is to buffer the response on the server, then send a 200 OK and the Content-Length, followed by the content. As stated, this does not work for you.
The only alternative I can think of, is to wrap the content in some format that makes it discoverable whether it was sent correctly. For example, you might end it with a hash or even a digital signature. But obviously, such mechanisms are not part of the HTTP standard.
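A minimal sketch of the "end it with a hash" idea follows. It is an application-level convention, not part of HTTP. The function names and the choice of SHA-256 are assumptions for the example. It also assumes the client can tell where the stream ends (for instance through the chunked terminator or the connection closing).

```python
import hashlib
from typing import Iterable, Iterator, Optional

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def with_trailing_digest(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Yield the payload parts unchanged, followed by a SHA-256 digest of all of them."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
        yield part
    # If the server fails midway, this final block is never sent.
    yield h.digest()


def verify_trailing_digest(received: bytes) -> Optional[bytes]:
    """Return the payload if the trailing digest matches, otherwise None."""
    if len(received) < DIGEST_SIZE:
        return None
    payload, digest = received[:-DIGEST_SIZE], received[-DIGEST_SIZE:]
    if hashlib.sha256(payload).digest() != digest:
        return None
    return payload
```

A response that was cut off, or that ended with an error marker instead of the digest, fails verification on the client, which can then treat the download as failed.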

If I check the "content-length" header, is it 100% accurate?

If not, how accurate is it?
I want to know the size of the image before I download it.

Can the HTTP Content-length header be malformed? Yes.
Should you trust it to be a fair representation of the size of the message body? Yes.

It should be, and usually is, accurate. However, it is entirely possible for a web server to report an incorrect content length, although this obviously doesn't happen often (I recall old versions of Apache returning nonsensical content lengths for files > 2GB).
It is also not mandatory to provide a Content-Length header.
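For the use case in the question (learning the size of an image before downloading it), a HEAD request is the usual approach. The sketch below uses Python's standard `http.client`. The helper names are made up for this example, and the host and path in the usage comment are placeholders. It returns `None` when the header is absent or malformed, since neither case can be ruled out.

```python
import http.client
from typing import Optional


def parse_content_length(value: Optional[str]) -> Optional[int]:
    """Return the declared length, or None if the header is missing or not a plain number."""
    if value is None:
        return None
    value = value.strip()
    if not (value.isascii() and value.isdigit()):
        return None
    return int(value)


def head_content_length(host: str, path: str, port: int = 80) -> Optional[int]:
    """Send a HEAD request and return the Content-Length the server reports, if any."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("HEAD", path)
        response = conn.getresponse()
        return parse_content_length(response.getheader("Content-Length"))
    finally:
        conn.close()


# Hypothetical usage:
# size = head_content_length("example.test", "/images/photo.jpg")
```

The value is still only what the server claims, so it is best treated as a hint rather than a guarantee.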

It had better be - otherwise why have it at all?
If it can't be reliably determined in advance, it shouldn't be served by the server at all. (When dealing with dynamically generated text, for example, something like chunked transfer encoding may be used, which doesn't require the final length to be known when the HTTP header is written at the beginning of the stream.)

Content-Length can be set by the server-side code or by the Apache layer itself.
When the code does not send it, Apache will.
There are known cases of clients crashing when the client connects and then closes the socket because the Content-Length sent was too small.
Since images are usually not generated by code at run time, you can rely on it.

Browsers can be unforgiving if the content-length is incorrect.
I was having a problem here, where the server was sometimes returning a content-length that was too low. The browsers just wouldn't handle it.
So yes, you can assume that the server is setting the content-length correctly, based on the knowledge that browser clients work on the same assumption.

Sharing something I discovered recently.
In my case, I was using NodeJS to post an HTTP request to another server, and I was able to set the Content-Length value in the header myself (to a value lower than the actual size).
When the other server received my HTTP request, it only processed the request parameters/body up to the length that Content-Length declared.
E.g. if my request's actual length is 100 but Content-Length says 80, the receiving server only processes bytes 1-80; the last 20 are not processed, which causes errors such as "Invalid parameter".
Nowadays the server side (or your HTTP library) will insert Content-Length for you automatically. Unless you have a specific need and know what you are doing, don't change it, as it may cause you trouble.
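The effect described above can be reproduced without any network code. In this sketch the receiver reads exactly as many bytes as Content-Length declares and then parses them as form parameters. The parameter names and the helper `read_declared_body` are invented for the example.

```python
import io
from urllib.parse import parse_qs


def read_declared_body(stream: io.BytesIO, content_length: int) -> bytes:
    """Read exactly as many bytes as Content-Length declares, like a typical server would."""
    return stream.read(content_length)


actual_body = b"user=alice&token=abc123"  # 23 bytes actually sent

# Correct header: the whole body is read and both parameters are seen.
full = parse_qs(read_declared_body(io.BytesIO(actual_body), len(actual_body)).decode())

# Header understates the length: the body is cut after 16 bytes ("user=alice&token").
truncated = parse_qs(read_declared_body(io.BytesIO(actual_body), 16).decode())
```

With the understated length, the `token` parameter loses its value and is dropped by the parser, which is the kind of situation that surfaces as an "Invalid parameter" error.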

HTTP Transfer-Encoding and requests

The HTTP specification states that the Transfer-Encoding header is allowed in requests, but what error code should a server respond with if it doesn't understand the given Transfer-Encoding?
As far as I know, the HTTP standard doesn't cover this possibility, but maybe I have just overlooked it.

An unknown transfer-encoding should result in an HTTP 501 "NOT IMPLEMENTED" error.
That's what Apache does, at least.
Also see http://argray.com/unixfaq/httpd_error_codes.shtml
Edit: pointer to the corresponding RFC section: http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html#sec10.5.2

I agree that the answer to this is non-obvious, and have followed up on the HTTP WG's mailing list.
UPDATE: Björn H. rightfully points out:
Section 3.6 of RFC 2616:
A server which receives an entity-body with a transfer-coding it does not understand SHOULD return 501 (Unimplemented), and close the connection.
So it does address this already.
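A server-side check following the quoted rule might look like the sketch below. The function name and the default set of supported codings are assumptions for the example. A real server would also have to close the connection after sending the error, as the RFC text says.

```python
from typing import Optional


def status_for_transfer_encoding(
    header_value: Optional[str],
    supported: frozenset = frozenset({"chunked"}),
) -> Optional[int]:
    """Return 501 if the request names a transfer-coding we do not implement, else None."""
    if header_value is None:
        return None  # no Transfer-Encoding header at all
    for coding in header_value.split(","):
        # Drop any parameters after ';' and normalise case.
        name = coding.split(";")[0].strip().lower()
        if name and name not in supported:
            return 501  # Not Implemented
    return None
```

For example, `status_for_transfer_encoding("gzip, chunked")` returns 501 with the default set, while `status_for_transfer_encoding("chunked")` returns `None`.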

Mostly a personal opinion.
I always thought 5xx errors were actual programming errors, like something fell over. If a server doesn't understand the request, I would say a 4xx error is a better response, as the problem is with the request rather than with a failed process on the server. I'm not sure which 4xx, but there are a few, so selecting one should not be hard.

A request with an invalid Transfer-Encoding for the HTTP version is malformed. Therefore, the server should respond with 400 Bad Request.

Arguably failing to understand chunked encoding ought to be a 500 Internal Server Error rather than 501, because RFC-2616 says the server MUST understand it.
However, if a server chooses not to accept requests with chunked bodies, and it wants to blame the client for this, one way to do so legally would be 411 Length Required -- since one must not use Content-Length and Transfer-Encoding at the same time, and it is not practical to send a request without either anyway.

The RFC is a bit unclear, but IMHO it should be 406 Not Acceptable.
