Avoiding Content-Length in HEAD response - http

The Content-Length response header is often expensive to generate for just a HEAD request (e.g. when dealing with dynamically generated resources), but may be essentially "free" after doing the work needed to generate a GET response body.
When it is reasonable to provide a Content-Length (instead of a chunked response) when responding to a GET request, yet unreasonable or slow to calculate the Content-Length for the corresponding HEAD request, is it permissible for the HEAD response to:
Omit the Content-Length header altogether?
Respond with Transfer-Encoding: chunked even though GET response would have a Content-Length?
The relevant W3C specification indicates that a HEAD request "SHOULD" (not "MUST") respond with identical headers; is the cleanliness and, admittedly often minor, total transfer size savings of using Content-Length in the GET response worth violating the aforementioned "SHOULD" in the case of HEAD, or is the only reasonable choice to have both responses send a Transfer-Encoding: chunked header?

Thanks to a tip from Julian Reschke, RFC 7231 indicates that:
The server SHOULD send the same header fields in response to a HEAD request as it would have sent if the request had been a GET, except that the payload header fields (Section 3.3) MAY be omitted.
Per section 3.3 of that same document, the payload header fields consist of:
Content-Length
Content-Range
Trailer
Transfer-Encoding
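As a minimal sketch of what that allowance means in practice, the function below takes the header list a GET response would have carried and drops the payload header fields for the HEAD response. The function name and the list-of-tuples representation are assumptions made for illustration, not part of any particular server framework.

```python
# Payload header fields listed in RFC 7231, Section 3.3 (compared case-insensitively).
PAYLOAD_HEADER_FIELDS = {
    "content-length",
    "content-range",
    "trailer",
    "transfer-encoding",
}


def head_response_headers(get_headers):
    """Return the headers for a HEAD response.

    get_headers is a list of (name, value) tuples that the GET response
    would have sent (hypothetical representation). Payload header fields
    are omitted; everything else is passed through unchanged.
    """
    return [
        (name, value)
        for name, value in get_headers
        if name.lower() not in PAYLOAD_HEADER_FIELDS
    ]
```

With this approach the GET response can keep its Content-Length while the HEAD response simply leaves it out, so neither response has to fall back to chunked encoding just to stay consistent.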

Related

last-modified vs expires precedence

I have read through a lot of discussions, and most of them claim that Expires takes precedence over Last-Modified, meaning that if a response has already expired, the client will not even send If-Modified-Since to the server, and of course the response code will not be 304.
But my situation is odd: I return Last-Modified in the response, and somewhere on the CDN/proxy side an Expires header is added whose value is the same as the Date response header. I would expect identical Expires and Date values to make the response stale immediately, but in fact my client browser still sends the request with an If-Modified-Since header, and the server returns a 304 response code.
I read through RFC 2616, but it doesn't say much about this either. So what happens in this case?
Almost 2 years on and I have found no answer...
I did manage to find a reference:
The freshness lifetime is calculated based on several headers. If a
"Cache-control: max-age=N" header is specified, then the freshness
lifetime is equal to N. If this header is not present, which is very
often the case, it is checked if an Expires header is present. If an
Expires header exists, then its value minus the value of the Date
header determines the freshness lifetime. Finally, if neither header
is present, look for a Last-Modified header. If this header is
present, then the cache's freshness lifetime is equal to the value of
the Date header minus the value of the Last-modified header divided by
10.
Although I can't find a confirmed precedence in the RFC, I think this MDN quote is reliable enough.
It would not be unusual for some browsers to implement this differently... So to avoid any issues, it is best not to return these two headers in the same response.
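The precedence described in the MDN quote can be sketched roughly as follows. This is only an illustration of the quoted rule, not how any specific browser or CDN computes freshness; the function name and the plain-dict header representation are assumptions, and malformed date values are not handled.

```python
import re
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers):
    """Return the freshness lifetime in seconds, or None if it cannot be derived.

    headers is a dict of response header names to values (hypothetical
    representation). The order of checks follows the MDN quote:
    max-age, then Expires minus Date, then (Date minus Last-Modified) / 10.
    """
    h = {name.lower(): value for name, value in headers.items()}

    match = re.search(r"max-age\s*=\s*(\d+)", h.get("cache-control", ""), re.IGNORECASE)
    if match:
        return int(match.group(1))

    if "date" not in h:
        return None
    date = parsedate_to_datetime(h["date"])

    if "expires" in h:
        return (parsedate_to_datetime(h["expires"]) - date).total_seconds()

    if "last-modified" in h:
        return (date - parsedate_to_datetime(h["last-modified"])).total_seconds() / 10

    return None
```

When Expires equals Date, this yields a lifetime of 0, so the response is stale straight away. A stale response that carries Last-Modified is normally revalidated with If-Modified-Since rather than fetched from scratch, which is consistent with the 304 described in the question.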

Examining the HTTP request Content-type header

I need to do a spot of server-side parsing of raw HTTP headers - in particular the Content-type header. Whilst what I see for this header in different browsers appears to conform to the same rules of capitalization and space usage, I need to be sure. For the mainstream browsers, is it safe to assume that this header string will bear the form (for multipart form data)
Content-type:...; boundary=...
or is it necessary to check for redundant spaces, e.g. boundary = etc?
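Rather than relying on one exact spelling, a tolerant parser sidesteps the question. The sketch below leans on Python's standard `email.message.Message` class to split the media type from its parameters; the helper name `parse_content_type` is an assumption for illustration, and the code is not a statement about what any particular browser actually sends.

```python
from email.message import Message


def parse_content_type(value):
    """Split a raw Content-Type header value into (media_type, boundary).

    The media type is lower-cased, and whitespace around ';' and '=' as
    well as quotes around the boundary are tolerated. boundary is None
    when the parameter is absent.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_param("boundary")
```

For example, `parse_content_type('Multipart/Form-Data ; boundary = "xyz"')` and `parse_content_type('multipart/form-data; boundary=xyz')` both give the same result, so the server does not depend on the client's capitalization or spacing.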

How to determine valid Range Header w.r.t HTTP Entity?

As per HTTP/1.1 spec for Range header (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.35), it is stated that
Byte range specifications in HTTP apply to the sequence of bytes in the entity-body (not necessarily the same as the message-body).
My question is this: suppose I am downloading a binary file of size 1GB that consists of multiple encrypted blocks of 128MB each. Since an HTTP byte range applies to the HTTP entity rather than to the file itself, how do I download these blocks in parallel from the server without breaking the block boundaries? Please note that I don't want to reassemble the file; I want to process these blocks separately in order to decrypt them. Which Range header would be most suitable, and how do I derive the correct value to send in that Range header?
The Range header applies not to the full HTTP entity but only to the entity-body of that entity. The HTTP Message RFC (http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html) says
The message-body (if any) of an HTTP message is used to carry the entity-body associated with the request or response. The message-body differs from the entity-body only when a transfer-coding has been applied, as indicated by the Transfer-Encoding header field (section 14.41).
Another good reference to read is http://www.ietf.org/rfc/rfc3229.txt (section 4 - The HTTP message-generation sequence), which explains how the HTTP response is generated. Conceptually, when both a Range header and a transfer encoding are requested, the Range is applied first when generating the response and the transfer encoding is applied afterwards. I think most HTTP servers should conform to this, so we can apply the Range header with respect to the message content length.
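Under that reading, the Range values for the blocks can be derived directly from the file size and the block size. The sketch below assumes that the blocks start at offset 0, are laid out back to back, and that the entity-body is the file's bytes unchanged (no content coding applied); the function name is hypothetical.

```python
def block_range_headers(total_size, block_size):
    """Return one Range header value per block.

    Byte positions in a Range header are zero-based and inclusive, so a
    block starting at offset s with length n is requested as
    "bytes=s-(s+n-1)". The last block is clipped to the end of the entity.
    """
    headers = []
    for start in range(0, total_size, block_size):
        end = min(start + block_size, total_size) - 1
        headers.append("bytes=%d-%d" % (start, end))
    return headers
```

For a 1GB file split into 128MB blocks this produces eight values, the first being `bytes=0-134217727`. Each value can be sent in its own request, and each response body can then be decrypted independently.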

Http response with no http header

I have written a mini-minimalist http server prototype (heavily inspired by the boost asio examples), and for the moment I haven't put any http header in the server response, only the html string content. Surprisingly, it works just fine.
In that question the OP wonders about necessary fields in the http response, and one of the comments states that they may not be really important from the server side.
I have not yet tried to respond with binary image files or gzip compressed files, in which cases I suppose it is mandatory to have an http header.
But for text only responses (html, css, and xml outputs), would it be ok never to include the http header in my server responses? What are the possible risks / errors?
At a minimum, you must provide a header with a status line and a date.
As someone who has written many protocol parsers, I am begging you, on my digital metaphoric knees, please oh please oh please don't just totally ignore the specification just because your favorite browser lets you get away with it.
It is perfectly fine to create a program that is minimally functional, as long as the data it produces is correct. This should not be a major burden, since all you have to do is add three lines to the start of your response. And one of those lines is blank! Please take a few minutes to write the two glorious lines of code that will bring your response data into line with the spec.
The headers you really should supply are:
the status line (required)
a date header (required)
content-type (highly recommended)
content-length (highly recommended), unless you're using chunked encoding
if you're returning HTTP/1.1 status lines, and you're not providing a valid content-length or using chunked encoding, then add Connection: close to your headers
the blank line to separate header from body (required)
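The list above can be sketched as a small helper that wraps a body in a complete response. This is only an illustration in Python (the question's server is built on boost asio); the function name, the fixed 200 status, and the choice to always send Connection: close are assumptions of the sketch.

```python
from email.utils import formatdate


def build_response(body, content_type="text/html; charset=utf-8"):
    """Return a complete HTTP/1.1 response as bytes for the given body bytes."""
    header_lines = [
        "HTTP/1.1 200 OK",                    # status line
        "Date: " + formatdate(usegmt=True),   # date header in HTTP date format
        "Content-Type: " + content_type,
        "Content-Length: " + str(len(body)),  # length of the body in bytes
        "Connection: close",                  # assumption: one response per connection
    ]
    # Header lines end with CRLF; an empty line separates header from body.
    head = "\r\n".join(header_lines) + "\r\n\r\n"
    return head.encode("ascii") + body
```

Calling `build_response(b"<p>hi</p>")` gives bytes that can be written to the socket as-is.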
You can choose not to send a content-type with the response, but you have to understand that the client might not know what to do with the data. The client has to guess what kind of data it is. A browser might decide to treat it as a downloaded file instead of displaying it. An automated process (someone's bash/curl script) might reasonably decide that the data isn't of the expected type so it should be thrown away.
From the HTTP/1.1 specification (RFC 7231), Section 3.1.1.5, Content-Type:
A sender that generates a message containing a payload body SHOULD
generate a Content-Type header field in that message unless the
intended media type of the enclosed representation is unknown to the
sender. If a Content-Type header field is not present, the recipient
MAY either assume a media type of "application/octet-stream"
([RFC2046], Section 4.5.1) or examine the data to determine its type.

Pre flush head tag with gzip support

http://developer.yahoo.com/performance/rules.html
It says there that it is good to pre-flush the head tag.
But my question is: will it help when using gzip? (I am using apache2.)
I think the full document gets gzipped in one shot and then sent to the client.
Or is it possible to use gzip and also pre-flush the head tag?
EDITED
The original version of this question suggested we were dealing with HTTP headers rather than the <head> section on an HTML document. I will leave my original answer below, but it actually has no relevance to this specific question.
To answer the question about pre-flushing the <head> section of a document - while it would be possible to do this in combination with gzip, it is probably not possible without more granular control over the gzip process than Apache affords. It is possible to break a gzipped stream into chunks that can be decompressed on their own (see this) but if there is a way to control Apache's gzip implementation to such a degree then I am not aware of it.
Doing so would likely decrease the efficacy of the gzip, making the compressed size larger, and would only be worth doing when the <head> of a document was particularly large, say, greater than 10KB (this is a somewhat arbitrary value I arrived at by reading about how gzip works under the bonnet, and should definitely not be taken as gospel).
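To illustrate the idea of flushing a gzipped stream part-way through, here is a minimal sketch using Python's zlib module. It is not a description of how Apache compresses responses; the function name is hypothetical, and the point is only that a sync flush after the <head> lets the client decompress that part before the rest arrives.

```python
import zlib


def gzip_with_early_flush(head_html, rest_html):
    """Compress two byte strings as one gzip stream, returned as two chunks.

    The first chunk ends with a sync flush, so a decompressor can recover
    all of head_html from it without waiting for the second chunk.
    """
    # MAX_WBITS | 16 selects the gzip container rather than raw zlib.
    compressor = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, zlib.MAX_WBITS | 16
    )
    first = compressor.compress(head_html) + compressor.flush(zlib.Z_SYNC_FLUSH)
    second = compressor.compress(rest_html) + compressor.flush()
    return [first, second]
```

A client-side check would feed the first chunk to `zlib.decompressobj(zlib.MAX_WBITS | 16)` and get the <head> bytes back immediately. The sync flush adds a few bytes of overhead and resets some of the compressor's opportunities, which is the loss of efficacy mentioned above.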
Original answer, relating to the HTTP headers:
Purely from the viewpoint of the HTTP protocol, rather than exactly how you would implement it on an Apache based server, I can't see any reason why you can't preflush the headers and also use gzip to compress the body. Keeping in mind the fact that the headers are never gzipped (if they were, how would the client know they had been?), the transfer encoding of the content should have no effect on when you send the headers.
There are, however a couple of things to keep in mind:
Once the headers have been sent, you can't change your mind about the transfer encoding. So if you send headers which state that the body will be gzipped, then realise that your body is only 4 bytes, you would still have to gzip it anyway, which would actually increase the size of the body. This probably wouldn't be a problem unless you were omitting the Content-Length: header, which, while possible, is bad practice as it means you cannot use persistent connections (unless the response uses chunked transfer encoding). This leads on to the next point...
You cannot send a Content-Length: header in this scenario. This is because you don't know what the size of the body is until you have compressed it, by which time it is ready to send, so you are not really (from the server's point of view) preflushing the headers, even if you do send them separately before you start to send the body.
If it takes you a long time to compress the body of the message (slow/heavily loaded server, very large body etc.), and you don't start the compression until after you have sent the headers, there is a risk the client may time out waiting for the rest of the response. This depends entirely on the client, but there are so many HTTP client implementations out there that this possibility cannot be totally discounted.
In short, yes, it is possible to do it, but there is no catch-all "Yes, do it" or "No, don't do it" answer - whether you would do it depends on each request and the nature of its response.

Resources