My reading of RFC 2616 hasn't answered my question:
How should a server interpret multiple Accept, Accept-Encoding, Accept-Language, etc, headers?
Granted, this should generally be a rare occurrence, but far be it from me to assume every HTTP client actually does what it should.
Imagine an HTTP request includes the following:
Accept-Language: en
Accept-Language: pt
Should the server:
1. Combine the results, to an effective Accept-Language: en, pt?
2. Honor only the first one (en)?
3. Honor only the last one (pt)?
4. Throw a hissy fit (return a 400 status, perhaps?)
Option #1 seems the most natural to me: it's the most likely to be what the client means, and the least likely to completely break expectations even if it isn't.
But is there any actual rule (ideally specified by an RFC) for how to handle these situations?
1) You are looking at an outdated RFC. RFC 2616 was obsoleted two years ago.
2) That said, the answer is 1); see https://greenbytes.de/tech/webdav/rfc7230.html#rfc.section.3.2.2.p.3: "A recipient MAY combine multiple header fields with the same field name into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field value to the combined field value in order, separated by a comma. (...)"
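To illustrate, here is a minimal Python sketch of that combining rule (the function name and the list-of-pairs input shape are illustrative, not from any particular library):

def combine_fields(fields):
    # Merge repeated header fields in order, comma-separated,
    # per the RFC 7230 rule quoted above.
    combined = {}
    order = []
    for name, value in fields:
        key = name.lower()          # field names are case-insensitive
        if key in combined:
            combined[key] += ", " + value
        else:
            combined[key] = value
            order.append(key)
    return [(k, combined[k]) for k in order]

# The two Accept-Language fields from the question collapse into one (option #1):
print(combine_fields([("Accept-Language", "en"), ("Accept-Language", "pt")]))
# [('accept-language', 'en, pt')]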
There is a confusing bit of terminology in the RFC for HTTP/2 that I wish was clearer.
Per the RFC https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2
Just as in HTTP/1.x, header field names are strings of ASCII characters that are compared in a case-insensitive fashion. However, header field names MUST be converted to lowercase prior to their encoding in HTTP/2. A request or response containing uppercase header field names MUST be treated as malformed
This seems to outline two conflicting ideas:
Header field names are case-insensitive in HTTP/2
If you receive or send fields that are not lowercase, the request/response is malformed.
If a request or response that contains non-lowercase headers is invalid, how can it be considered case-insensitive?
There are two levels of "HTTP": a more abstract, upper, layer with the HTTP semantic (e.g. PUT resource r1), and a lower layer where that semantic is encoded. Think of these two as, respectively, the application layer of HTTP, and the network layer of HTTP.
The application layer can be completely unaware of whether the semantic HTTP request PUT r1 has arrived in HTTP/1.1 or HTTP/2 format.
On the other hand, the same semantic, PUT r1, is encoded differently in HTTP/1.1 (textual) vs HTTP/2 (binary) by the network layer.
The first sentence of the referenced section should be interpreted as referring to the application layer: as in HTTP/1.1, header names should be compared case-insensitively.
This means that if an application is asked "is header ACCEPT present?", the application should look at the header names in a case insensitive fashion (or be sure that the implementation provides such feature), and return true if Accept or accept is present.
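For instance, a trivial Python illustration of that application-layer lookup (the names and values here are made up):

# Delivered lowercase on the wire by HTTP/2:
headers = {"accept": "text/html"}

def has_header(headers, name):
    # The application compares names case-insensitively, so "ACCEPT",
    # "Accept" and "accept" all refer to the same field.
    return name.lower() in headers

print(has_header(headers, "ACCEPT"))  # True
print(has_header(headers, "Accept"))  # True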
The second sentence should be interpreted as referring to the network layer: a compliant HTTP/2 implementation MUST send the headers over the network lowercase, because that is how HTTP/2 encodes header names to be sent over the wire.
Nothing forbids a compliant HTTP/2 implementation from receiving content-length: 128 (lowercase) and then converting this header into Content-Length: 128 when it makes it available to the application - for example, for maximum compatibility with HTTP/1.1, where the header conventionally has uppercase first letters (say, to be printed on screen).
It's a mistake, or at least an unnecessarily confusing specification.
Header names must be compared case-insensitively, but that's irrelevant in practice since they are only transmitted and received lowercase.
I.e. in the RFC, "Content-Length" refers to the header "content-length."
RFC 7540 has been obsoleted by RFC 9113, which states simply:
Field names MUST be converted to lowercase when constructing an HTTP/2 message.
https://www.rfc-editor.org/rfc/rfc9113#section-8.2
And the prose now uses lowercase header names throughout.
I've just read an article about the differences between HTTP/1 and HTTP/2. The main question I have is about the claim that HTTP/2 is a binary protocol while HTTP/1 is a textual protocol.
Maybe I'm wrong, but I know that any data, text or whatever format it may be, has a binary representation in memory, and even when transferred through a TCP/IP network the data is split into a format according to the layers of the OSI or TCP/IP model, which means that, technically, a textual format doesn't exist in the context of data transfer through a network.
I can't really understand this difference between HTTP/2 and HTTP/1 - can you help me with a better explanation, please?
Binary is probably a confusing term - everything is ultimately binary at some point in computers!
HTTP/2 has a highly structured format where HTTP messages are formatted into packets (called frames) and where each frame is assigned to a stream. HTTP/2 frames have a specific format, including a length which is declared at the beginning of each frame and various other fields in the frame header. In many ways it’s like a TCP packet. Reading an HTTP/2 frame can follow a defined process (the first 24 bits are the length of this packet, followed by 8 bits which define the frame type... etc.). After the frame header comes the payload (e.g. HTTP Headers, or the Body payload) and these will also be in a specific format that is known in advance. An HTTP/2 message can be sent in one or more frames.
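As a rough illustration, here is how that fixed 9-byte frame header could be decoded in Python (layout per RFC 7540 Section 4.1; this is a sketch, not a full HTTP/2 parser):

import struct

def parse_frame_header(data: bytes):
    # The HTTP/2 frame header is a fixed 9 bytes:
    # 24-bit payload length, 8-bit type, 8-bit flags,
    # 1 reserved bit + 31-bit stream identifier.
    assert len(data) >= 9
    length = int.from_bytes(data[0:3], "big")
    frame_type = data[3]
    flags = data[4]
    stream_id = struct.unpack(">I", data[5:9])[0] & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

# A HEADERS frame (type 0x1) with a 13-byte payload on stream 1:
print(parse_frame_header(b"\x00\x00\x0d\x01\x04\x00\x00\x00\x01"))
# (13, 1, 4, 1)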
By contrast, HTTP/1.1 is an unstructured format made up of lines of ASCII-encoded text - so yes, this is ultimately transmitted as binary, but it's basically a stream of characters rather than being specifically broken into separate pieces/frames (other than lines). HTTP/1.1 messages (or at least the first HTTP request/response line and the HTTP headers) are parsed by reading in characters one at a time until a newline character is reached. This is kind of messy, as you don't know in advance how long each line is, so you must process it character by character. In HTTP/1.1 the HTTP body's length is handled slightly differently, as it is typically known in advance: a Content-Length HTTP header will define it. An HTTP/1.1 message must be sent in its entirety as one continuous stream of data, and the connection cannot be used for anything else until that message is completed.
The advantage that HTTP/2 brings is that, by packaging messages into specific frames we can intermingle the messages: here’s a bit of request 1, here’s a bit of request 2, here’s some more of request 1... etc. In HTTP/1.1 this is not possible as the HTTP message is not wrapped into packets/frames tagged with an id as to which request this belongs to.
I’ve a diagram here and an animated version here that help conceptualise this better.
HTTP basically encodes all relevant instructions as ASCII code points, e.g.:
GET /foo HTTP/1.1
Yes, this is represented as bytes on the actual transport layer, but the commands are based on ASCII bytes, and are hence readable as text.
HTTP/2 uses actual binary commands, i.e. individual bits and bytes which have no representation other than the bits and bytes that they are, and hence no readable representation. (Note that HTTP/2 essentially wraps HTTP/1 in such a binary protocol; there's still "GET /foo" to be found somewhere in there.)
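A quick Python illustration of the point - the HTTP/1.1 request line really is just ASCII bytes:

line = b"GET /foo HTTP/1.1\r\n"
print(list(line[:3]))                # [71, 69, 84] - the ASCII codes for 'G', 'E', 'T'
print(line.decode("ascii").strip())  # readable as text: GET /foo HTTP/1.1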
Another example: take the header
Connection: keep-alive
In HTTP/1.1 it is encoded as the literal text (ASCII bytes):
Connection: keep-alive
Just the text.
In HTTP/2, the client and the server have agreed beforehand on some shared value tables, conceptually like:
headerField: ['user-agent', 'cookie', 'connection', ...]
connection-values: ['keep-alive', 'close', ...]
Then Connection: keep-alive can be encoded as a pair of table indices, e.g.:
2:0
Here is a protocol similar to the HTTP/2 binary protocol: the Thrift binary protocol.
I believe the primary reason HTTP/2 uses binary encoding is to pack the payload into fixed-size frames.
Plain text cannot fit exactly into a frame, so binary-encoding the data and splitting it across multiple frames makes a lot more sense.
I was trying to create a tool to grab frames from a mjpeg stream that is transmitted over http. I did not find any specification so I looked at what wikipedia says here:
In response to a GET request for a MJPEG file or stream, the server streams the sequence of JPEG frames over HTTP. A special mime-type content type multipart/x-mixed-replace;boundary=<boundary-name> informs the client to expect several parts (frames) as an answer delimited by <boundary-name>. This boundary name is expressly disclosed within the MIME-type declaration itself.
But this doesn't seem to be very accurate in practice. I dumped some streams to find out how they behave. Most streams have the following format (where CRLF is a carriage return line feed, and a partial header are some header fields without a status line):
1. Status line (e.g. HTTP/1.0 200 OK) CRLF
2. Header fields (e.g. Cache-Control: no-cache) CRLF
3. Content-Type header field (e.g. Content-Type: multipart/x-mixed-replace; boundary=--myboundary) CRLF
4. CRLF (denotes that the header is over)
5. Boundary (denotes that the first frame is over) CRLF
6. Partial header fields (mostly: Content-Type: image/jpeg) CRLF
7. CRLF (denotes that this "partial header" is over)
8. Actual frame data CRLF
9. (Sometimes here is an optional CRLF)
10. Boundary
11. Starting again at the partial header (line 6)
The first frame never contained actual image data.
All of the analyzed streams had the Content-Type header, with the type set to multipart/x-mixed-replace.
But some of the streams get things wrong here:
Two servers claimed boundary="MOBOTIX_Fast_Serverpush" but then used --MOBOTIX_Fast_Serverpush as the frame delimiter.
This irritated me quite a bit, so I thought of another approach to get the frames.
Since each JPEG starts with 0xFF 0xD8 as its Start of Image marker and ends with 0xFF 0xD9, I could just look for these. This seems to be a very dirty approach and I don't really like it, but it might be the most robust one.
Before I start implementing this, are there some points I missed about MJPEG over HTTP? Is there any real specification of transmitting MJPEG over HTTP?
What are the caveats when just watching for the Start and End markers of a JPEG instead of using the boundary to delimit frames?
this doesn't seem to be very accurate in practice.
It is very accurate in practice. You are just not handling it correctly.
The first frame never contained actual image data.
Yes, it does. There is always a starting boundary before the first MIME entity (as MIME can contain prologue data before the first entity). You are thinking that MIME boundaries exist only after each MIME entity, but that is simply not true.
I suggest you read the MIME specification, particularly RFC 2045 and RFC 2046. MIME works fine in this situation, you are just not interpreting the results correctly.
Actual frame data CRLF
(Sometimes here is an optional CRLF)
Boundary
Actually, that last CRLF is NOT optional - it is part of the next boundary that follows a MIME entity's data (see RFC 2046 Section 5). MIME boundaries must appear on their own lines, so a CRLF is artificially inserted after the entity data, which is especially important for data types (like images) that are not naturally terminated by their own CRLF.
Two servers claimed boundary="MOBOTIX_Fast_Serverpush" but then used --MOBOTIX_Fast_Serverpush as the frame delimiter
That is how MIME is supposed to work. The boundary specified in the Content-Type header is always prefixed with -- in the actual entity stream, and the terminating boundary after the last entity is also suffixed with -- as well.
For example:
Content-Type: multipart/x-mixed-replace; boundary="MOBOTIX_Fast_Serverpush"
--MOBOTIX_Fast_Serverpush
Content-Type: image/jpeg
<jpeg bytes>
--MOBOTIX_Fast_Serverpush
Content-Type: image/jpeg
<jpeg bytes>
--MOBOTIX_Fast_Serverpush
... and so on ...
--MOBOTIX_Fast_Serverpush--
This irritated me quite a bit, so I thought of another approach to get the frames.
What you are thinking of will not work, and is not as robust as you are thinking. You really need to process the MIME stream correctly instead.
When processing multipart/x-mixed-replace, what you are supposed to do is:
1. Read and discard the HTTP response body until you reach the first MIME boundary, as specified by the Content-Type response header.
2. Then read a MIME entity's headers and data until you reach the next matching MIME boundary.
3. Then process the entity's data as needed, according to its headers (for instance, displaying an image/jpeg entity onscreen).
4. If the connection has not been closed, and the last boundary read is not the termination boundary, go back to step 2; otherwise stop processing the HTTP response.
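For illustration, here is a minimal Python sketch of that loop using only the standard library - the URL and the handle_frame callback are hypothetical placeholders, and the boundary extraction is deliberately naive:

import urllib.request

def handle_frame(jpeg_bytes):
    # Hypothetical handler - a real client would decode/display the image.
    print(f"got JPEG frame, {len(jpeg_bytes)} bytes")

URL = "http://camera.example/stream"   # hypothetical camera URL

resp = urllib.request.urlopen(URL)
ctype = resp.headers.get("Content-Type", "")
# Naive boundary extraction; real code should parse the parameter properly.
boundary = ctype.split("boundary=")[-1].strip('"')
delim = b"--" + boundary.encode("ascii")   # boundaries are prefixed with -- on the wire

buf = b""
while True:
    chunk = resp.read(4096)
    if not chunk:
        break                      # connection closed by the server
    buf += chunk
    while True:
        idx = buf.find(delim)
        if idx < 0:
            break                  # no complete part buffered yet
        part, buf = buf[:idx], buf[idx + len(delim):]
        # Split the part's own headers from its body at the blank line.
        head, sep, body = part.partition(b"\r\n\r\n")
        if sep and b"image/jpeg" in head.lower():
            # The trailing CRLF belongs to the boundary, not the image data.
            handle_frame(body.rstrip(b"\r\n"))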
I'm working on a crawler and I want to know if a page accepts multiple languages.
My request is as follows:
GET / HTTP/1.1
Host: www.stackoverflow.com
Accept-Language: en
How do I know from the response if they accept more than one language? Is it in a header?
Does Content-Language specify just one?
(this is an example response, not the actual one from Stack Overflow)
HTTP/1.1 200 OK
Date: Sat, 06 Sep 2014 15:52:50 GMT
Server: Apache/2
Content-Location: qa-http-and-lang.en.php
Vary: negotiate,accept-language,Accept-Encoding
TCN: choice
P3P: policyref="http://www.w3.org/2001/05/P3P/p3p.xml"
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=utf-8
Content-Language: en
In the first place, you don't have to set the Accept-Language header. You only have to parse the HTTP response and read Content-Language. It should list all the languages the content is intended for. If no Content-Language is specified, the default is that the content is intended for all language audiences. This might mean that the sender does not consider it to be specific to any natural language, or that the sender does not know for which language it is intended.
So, if Content-Language is specified and has more than one value, the page supports multiple languages; if no Content-Language is specified, you have to decide for yourself whether to treat it as multilingual or not.
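A small Python sketch of that check (the URL is just an example):

import urllib.request

resp = urllib.request.urlopen("http://example.com/")   # example URL
value = resp.headers.get("Content-Language", "")
# Content-Language is a comma-separated list, e.g. "en, pt".
langs = [lang.strip() for lang in value.split(",") if lang.strip()]
print(langs)
print("multiple languages" if len(langs) > 1 else "one language or unspecified")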
Reference: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.12
Hope it helps
This question is regarding the order of precedence for the content-codings in the HTTP header "Accept-Encoding" when all are of equal weight, and has been prompted by this comment on my blog.
Background:
The Accept-Encoding header takes a comma-separated list of content-codings the browser can accept, e.g. gzip,deflate.
A quality factor can also be specified to give preference to particular codings; e.g. in the case of "gzip;q=.8,deflate", deflate is preferred - but that is not relevant to this question. NB: a coding with "q=0" means "not acceptable".
RFC 2616 also states that the "most specific reference" should be weighted first, i.e. "text/html;level=1" should be used over "text/html" (an Accept-header example) - this is also not relevant to the question.
Question:
In the following case, which media-type has precedence?
Accept-Encoding: gzip,deflate
Both types have an equivalent quality factor of 1, and both types are "acceptable" to the browser - so either one could be used. I'd always assumed that the first type entered should be "preferred", but there doesn't seem to be a specific example or preference for this particular case in the RFC.
I believe somewhere in the RFC, or in a related RFC, it states that the first is preferred for all fields of this format.
However, in the special case of gzip vs. deflate, you should probably use deflate if you can, due to its lower overhead: fewer headers and trailers, and although it still has an Adler-32 checksum, it doesn't have a CRC-32 on top. Other than that they are exactly the same; the actual data is compressed in the same way for both. This means deflate is both faster and produces smaller output, and both of those become far more important on a page under heavy load. Most of the extra header fields in gzip are things like the file's modification time and originating OS, which are useless in this context anyway.
Really, clients should want to be served gzip due to reliability and servers should want to serve deflate due to performance. The extra overhead is much more important when it happens thousands of times a second than when it happens once for every page you load.
On my own sites I check for deflate first and use that if I can, then I check for gzip. If I can't use either, I just send plain text. I don't know what language you are using, but it's about 5 lines of ASP.NET to do that.
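For illustration, the same server-side preference sketched in Python (quality factors are ignored here; a production server should honor them):

import gzip
import zlib

def negotiate_encoding(accept_encoding, body):
    # Check for deflate first (lower overhead), then gzip, else plain,
    # mirroring the preference described above.
    codings = [c.split(";")[0].strip().lower()
               for c in accept_encoding.split(",")]
    if "deflate" in codings:
        return "deflate", zlib.compress(body)   # zlib wrapper (RFC 1950)
    if "gzip" in codings:
        return "gzip", gzip.compress(body)
    return None, body

encoding, payload = negotiate_encoding("gzip,deflate", b"<html>...</html>")
print(encoding, len(payload))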
There is no client-side preference here. Just pick whichever one you (the server side) prefer.