There is a confusing bit of terminology in the RFC for HTTP/2 that I wish was clearer.
Per the RFC https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2
Just as in HTTP/1.x, header field names are strings of ASCII
characters that are compared in a case-insensitive fashion. However,
header field names MUST be converted to lowercase prior to their
encoding in HTTP/2. A request or response containing uppercase header
field names MUST be treated as malformed
This seems to outline two conflicting ideas:
Header field names are case-insensitive in HTTP/2
If you receive or send fields that are not lowercase, the request/response is malformed.
If a request or response that contains non-lowercase headers is invalid, how can it be considered case-insensitive?
There are two levels of "HTTP": a more abstract, upper, layer with the HTTP semantic (e.g. PUT resource r1), and a lower layer where that semantic is encoded. Think of these two as, respectively, the application layer of HTTP, and the network layer of HTTP.
The application layer can be completely unaware of whether the semantic HTTP request PUT r1 has arrived in HTTP/1.1 or HTTP/2 format.
On the other hand, the same semantic, PUT r1, is encoded differently in HTTP/1.1 (textual) vs HTTP/2 (binary) by the network layer.
The first sentence of the referenced section should be interpreted as referring to the application layer: "as in HTTP/1.x, header names should be compared case-insensitively".
This means that if an application is asked "is header ACCEPT present?", the application should look at the header names in a case-insensitive fashion (or be sure that the implementation provides such a feature), and return true if either Accept or accept is present.
The second sentence should be interpreted as referring to the network layer: a compliant HTTP/2 implementation MUST send header names over the network in lowercase, because that is how HTTP/2 encodes header names to be sent over the wire.
Nothing forbids a compliant HTTP/2 implementation from receiving content-length: 128 (lowercase) and then converting this header into Content-Length: 128 when it makes it available to the application - for example, for maximum compatibility with HTTP/1.1, where the header name conventionally capitalizes the first letter of each word (say, when it is printed on screen).
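To make the two layers concrete, here is a hedged sketch in plain Python (not tied to any particular HTTP library): the application compares names case-insensitively, while the sender lowercases them before they are encoded for the wire.

def has_header(headers, name):
    # Application-layer view: ACCEPT, Accept and accept are the same header.
    return any(k.lower() == name.lower() for k in headers)

def to_wire_headers(headers):
    # Network-layer view: HTTP/2 requires lowercase names on the wire.
    return {k.lower(): v for k, v in headers.items()}

headers = {"Content-Length": "128", "Accept": "text/html"}
print(has_header(headers, "ACCEPT"))   # True
print(to_wire_headers(headers))        # {'content-length': '128', 'accept': 'text/html'}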
It's a mistake, or at least an unnecessarily confusing specification.
Header names must be compared case-insensitively, but that's irrelevant in practice since they are only ever transmitted or received in lowercase.
I.e., in the RFC, "Content-Length" refers to the header "content-length".
RFC 7540 has been obsoleted by RFC 9113, which states simply:
Field names MUST be converted to lowercase when constructing an HTTP/2 message.
https://www.rfc-editor.org/rfc/rfc9113#section-8.2
And the prose now uses lowercase header names.
Related
I've just read an article about the differences between HTTP/1 and HTTP/2. The main question I have is about its claim that HTTP/2 is a binary protocol whereas HTTP/1 is a textual protocol.
Maybe I'm wrong, but as I understand it any data, text or whatever format it may be in, has a binary representation in memory, and even when transferred through a TCP/IP network the data is split into a format according to the layers of the OSI model or the TCP/IP model, which means that, technically, a textual format doesn't exist in the context of data transfer over a network.
I cannot really understand this difference between HTTP/2 and HTTP/1; can you please help me with a better explanation?
Binary is probably a confusing term - everything is ultimately binary at some point in computers!
HTTP/2 has a highly structured format where HTTP messages are formatted into packets (called frames) and where each frame is assigned to a stream. HTTP/2 frames have a specific format, including a length which is declared at the beginning of each frame and various other fields in the frame header. In many ways it’s like a TCP packet. Reading an HTTP/2 frame can follow a defined process (the first 24 bits are the length of this packet, followed by 8 bits which define the frame type... etc.). After the frame header comes the payload (e.g. HTTP Headers, or the Body payload) and these will also be in a specific format that is known in advance. An HTTP/2 message can be sent in one or more frames.
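As an illustration of that defined process, here is a small Python sketch that reads the fixed 9-octet HTTP/2 frame header (24-bit length, 8-bit type, 8-bit flags, then a reserved bit plus a 31-bit stream identifier, per RFC 9113, Section 4.1):

def parse_frame_header(data):
    # data holds the first 9 octets of an HTTP/2 frame.
    length = int.from_bytes(data[0:3], "big")                   # payload length
    frame_type = data[3]                                        # e.g. 0x0 DATA, 0x1 HEADERS
    flags = data[4]
    stream_id = int.from_bytes(data[5:9], "big") & 0x7FFFFFFF   # clear the reserved bit
    return length, frame_type, flags, stream_id

# Example: a HEADERS frame (type 0x1, flags END_HEADERS) on stream 1 with a 16-byte payload.
header = bytes([0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x01])
print(parse_frame_header(header))   # (16, 1, 4, 1)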
By contrast, HTTP/1.1 is an unstructured format made up of lines of text in ASCII encoding - so yes, this is ultimately transmitted as binary, but it's basically a stream of characters rather than being specifically broken into separate pieces/frames (other than lines). HTTP/1.1 messages (or at least the first HTTP request/response line and the HTTP headers) are parsed by reading in characters one at a time, until a new line character is reached. This is kind of messy, as you don't know in advance how long each line is, so you must process it character by character. In HTTP/1.1 the HTTP body's length is handled slightly differently, as it is typically known in advance because a Content-Length HTTP header defines it. An HTTP/1.1 message must be sent in its entirety as one continuous stream of data, and the connection cannot be used for anything else but transmitting that message until it is completed.
The advantage that HTTP/2 brings is that, by packaging messages into specific frames, we can intermingle the messages: here's a bit of request 1, here's a bit of request 2, here's some more of request 1... etc. In HTTP/1.1 this is not possible, as the HTTP message is not wrapped into packets/frames tagged with an identifier saying which request it belongs to.
I’ve a diagram here and an animated version here that help conceptualise this better.
HTTP/1.x basically encodes all relevant instructions as ASCII code points, e.g.:
GET /foo HTTP/1.1
Yes, this is represented as bytes on the actual transport layer, but the commands are based on ASCII bytes, and are hence readable as text.
HTTP/2 uses actual binary commands, i.e. individual bits and bytes which have no representation other than the bits and bytes that they are, and hence have no readable textual form. (Note that HTTP/2 essentially wraps HTTP/1 in such a binary protocol; there's still a "GET /foo" to be found somewhere in there.)
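A quick Python illustration of the difference; the hex dump below is nothing more than the ASCII encoding of the request line:

request_line = b"GET /foo HTTP/1.1\r\n"
print(request_line.hex(" "))
# 47 45 54 20 2f 66 6f 6f 20 48 54 54 50 2f 31 2e 31 0d 0a
print(request_line.decode("ascii"))   # GET /foo HTTP/1.1

# In HTTP/2 the same request travels as binary frames whose header fields
# (:method: GET, :path: /foo) are HPACK-encoded, so there is no readable
# request line to print.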
For example:
Connection: keep-alive
In HTTP/1.1
In HTTP/1.1 it will be encoded into (often in UTF-8):
Connection: keep-alive
Just the text.
In HTTP/2
Beforehand the client and the server have agreed on some value collections like:
headerField: ['user-agent','cookie', 'connection',...]
connection-values: ['keep-alive', 'close'...]
Then Connection: keep-alive will be encoded into:
2:0
end
Here is a protocol similar to the HTTP/2 binary protocol: the Thrift binary protocol.
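To make the "2:0" idea above concrete, here is a hedged Python sketch of index-based encoding. This is only conceptual and is NOT the actual HPACK format that HTTP/2 uses; the point is simply that strings both sides already know can be replaced by small numbers.

HEADER_NAMES = ["user-agent", "cookie", "connection"]   # agreed in advance
CONNECTION_VALUES = ["keep-alive", "close"]              # agreed in advance

def encode(name, value):
    # "Connection: keep-alive" becomes two small integers instead of 22 characters.
    return bytes([HEADER_NAMES.index(name), CONNECTION_VALUES.index(value)])

print(encode("connection", "keep-alive"))   # b'\x02\x00'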
I believe the primary reason HTTP/2 uses binary encoding is to pack the payload into fixed-size frames.
Plain text cannot fit exactly into a frame, so binary-encoding the data and splitting it into multiple frames makes a lot more sense.
I was trying to create a tool to grab frames from an MJPEG stream that is transmitted over HTTP. I did not find any specification, so I looked at what Wikipedia says here:
In response to a GET request for a MJPEG file or stream, the server
streams the sequence of JPEG frames over HTTP. A special mime-type
content type multipart/x-mixed-replace;boundary=<boundary-name>
informs the client to expect several parts (frames) as an answer
delimited by <boundary-name>. This boundary name is expressly
disclosed within the MIME-type declaration itself.
But this doesn't seem to be very accurate in practice. I dumped some streams to find out how they behave. Most streams have the following format (where CRLF is a carriage return line feed, and a partial header is a set of header fields without a status line):
1. Status line (e.g. HTTP/1.0 200 OK) CRLF
2. Header fields (e.g. Cache-Control: no-cache) CRLF
3. Content-Type header field (e.g. Content-Type: multipart/x-mixed-replace; boundary=--myboundary) CRLF
4. CRLF (Denotes that the header is over)
5. Boundary (Denotes that the first frame is over) CRLF
6. Partial header fields (mostly: Content-type: image/jpeg) CRLF
7. CRLF (Denotes that this "partial header" is over)
8. Actual frame data CRLF
9. (Sometimes here is an optional CRLF)
10. Boundary
Starting again at the partial header (line 6)
The first frame never contained actual image data.
All of the analyzed streams had the Content-Type header, with the type set to multipart/x-mixed-replace.
But some of the streams get things wrong here:
Two servers claimed boundary="MOBOTIX_Fast_Serverpush" but then used --MOBOTIX_Fast_Serverpush as the frame delimiter.
This irritated me quite a bit, so I thought of another approach to get the frames.
Since each JPEG starts with 0xFF 0xD8 as its Start of Image marker and ends with 0xFF 0xD9, I could just start looking for these. This seems to be a very dirty approach and I don't really like it, but it might be the most robust one.
Before I start implementing this, are there some points I missed about MJPEG over HTTP? Is there any real specification of transmitting MJPEG over HTTP?
What are the caveats when just watching for the Start and End markers of a JPEG instead of using the boundary to delimit frames?
this doesn't seem to be very accurate in practice.
It is very accurate in practice. You are just not handling it correctly.
The first frame never contained actual image data.
Yes, it does. There is always a starting boundary before the first MIME entity (as MIME can contain prologue data before the first entity). You are thinking that MIME boundaries exist only after each MIME entity, but that is simply not true.
I suggest you read the MIME specification, particularly RFC 2045 and RFC 2046. MIME works fine in this situation, you are just not interpreting the results correctly.
Actual frame data CRLF
(Sometimes here is an optional CRLF)
Boundary
Actually, that last CRLF is NOT optional; it is part of the next boundary that follows a MIME entity's data (see RFC 2046, Section 5). MIME boundaries must appear on their own lines, so a CRLF is artificially inserted after the entity data, which is especially important for data types (like images) that are not naturally terminated by their own CRLF.
Two servers claimed boundary="MOBOTIX_Fast_Serverpush" but then used --MOBOTIX_Fast_Serverpush as the frame delimiter
That is how MIME is supposed to work. The boundary specified in the Content-Type header is always prefixed with -- in the actual entity stream, and the terminating boundary after the last entity is also suffixed with -- as well.
For example:
Content-Type: multipart/x-mixed-replace; boundary="MOBOTIX_Fast_Serverpush"
--MOBOTIX_Fast_Serverpush
Content-Type: image/jpeg
<jpeg bytes>
--MOBOTIX_Fast_Serverpush
Content-Type: image/jpeg
<jpeg bytes>
--MOBOTIX_Fast_Serverpush
... and so on ...
--MOBOTIX_Fast_Serverpush--
This irritated me quite a bit, so I thought of another approach to get the frames.
What you are thinking of will not work, and it is not as robust as you think. You really need to process the MIME stream correctly instead.
When processing multipart/x-mixed-replace, what you are supposed to do is the following (a rough code sketch follows this list):
1. Read and discard the HTTP response body until you reach the first MIME boundary, as specified by the Content-Type response header.
2. Read a MIME entity's headers and data until you reach the next matching MIME boundary.
3. Process the entity's data as needed, according to its headers (for instance, displaying an image/jpeg entity onscreen).
4. If the connection has not been closed, and the last boundary read is not the terminating boundary, go back to step 2; otherwise stop processing the HTTP response.
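A rough, hedged Python sketch of those four steps, assuming the response body is available as a binary file-like object named resp and that boundary is the parameter value taken from the Content-Type header:

def read_mjpeg(resp, boundary):
    delimiter = b"--" + boundary.encode("ascii")   # boundaries are prefixed with --
    # Step 1: discard everything up to and including the first boundary line.
    while True:
        line = resp.readline()
        if not line or line.strip() == delimiter:
            break
    while True:
        # Step 2a: read one entity's headers (a blank line terminates them).
        headers = {}
        while True:
            line = resp.readline()
            if not line or line.strip() == b"":
                break
            name, _, value = line.decode("latin-1").partition(":")
            headers[name.strip().lower()] = value.strip()
        if not headers:
            return
        # Step 2b: read the entity's data up to the next boundary line.
        data = bytearray()
        while True:
            line = resp.readline()
            if not line:
                return
            stripped = line.strip()
            if stripped == delimiter or stripped == delimiter + b"--":
                break
            data += line
        if data.endswith(b"\r\n"):        # this CRLF belongs to the boundary, not the JPEG
            data = data[:-2]
        # Step 3: hand the entity to the caller (e.g. to display an image/jpeg).
        yield headers, bytes(data)
        # Step 4: stop at the terminating boundary, otherwise loop back to step 2.
        if stripped == delimiter + b"--":
            return

Reading line by line is good enough here because, as explained above, a boundary always appears on its own line.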
I need to do a spot of server-side parsing of raw HTTP headers - in particular the Content-type header. Whilst what I see for this header in different browsers appears to conform to the same rules of capitalization and space usage, I need to be sure. For the mainstream browsers, is it safe to assume that this header string will take the form (for multipart form data)
Content-type:...; boundary=...
or is it necessary to check for redundant spaces, e.g. boundary = etc?
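One hedged way to avoid depending on exact capitalization and spacing is to parse the parameters with Python's standard-library email machinery rather than splitting the string by hand; a minimal sketch:

from email.message import Message

def parse_content_type(raw_value):
    # Let the stdlib handle case, stray whitespace and optional quoting
    # around the boundary parameter.
    msg = Message()
    msg["Content-Type"] = raw_value
    return msg.get_content_type(), msg.get_param("boundary")

# Both of these normalize to ('multipart/form-data', 'xYzZY'):
print(parse_content_type('multipart/form-data; boundary=xYzZY'))
print(parse_content_type('Multipart/Form-Data ; boundary = "xYzZY"'))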
I am building a web application that will send a set of flag states to its server by converting the flags into binary, then converting the binary into ASCII characters. The ASCII characters will be sent using a POST command, then a response (encoded the same way) will be sent back. I would like to know if there are ASCII characters that can cause the HTTP requests and data transfer to break down or get misdirected. Are there standard ASCII characters (0-127) that need to be avoided?
Despite its name, HTTP is agnostic to the format and semantics of the entity-body content. It doesn't need to be text. Describing it as text and giving a character encoding is metadata for the sending and receiving applications. Your actual entity-data isn't text, but you've added a layer of re-interpretation so you can provide that metadata.
HTTP bodies are of two types: counted or chunked. For counted, the message-body is the same as the entity-body. Counted is used unless you want to start streaming data before knowing its entire length. Just send the Content-Length header with the number of octets in the entity-body and copy it into the output stream.
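A minimal sketch of such a counted message in Python; the /flags path and the payload octets are made-up examples, not anything from the question:

body = bytes([0x00, 0x2A, 0x7F, 0x41])   # any octets are fine; HTTP does not care
message = (
    b"POST /flags HTTP/1.1\r\n"           # /flags is a hypothetical example path
    b"Host: example.com\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"\r\n"
    + body
)
print(message)   # Content-Length: 4, and the body octets are copied verbatim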
I see various spellings of the non-RFC "XFF" HTTP header used for identifying the IP addresses the request has been forwarded through. Is there a difference between these different header names: X-FORWARDED-FOR, X_FORWARDED_FOR, and HTTP_X_FORWARDED_FOR? Do I need to look for all three?
PS - Yes, I know this header can be easily spoofed :)
The HTTP_ prefix is used by some languages like PHP simply to distinguish HTTP headers from other server variables:
$_SERVER['HTTP_X_FORWARDED_FOR']
The HTTP header name is actually
X-Forwarded-For
The header name itself is case-insensitive. However, when you want to query a request header, programming languages are largely case-sensitive about it (again, PHP is one of them).
The X- indicates that the Forwarded-For header is non-standard. I don't think there's a difference whether a language uses dashes or underscores to refer to header names.
Essentially, they're all the same header, just referred to differently by various implementations.
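For example, a small Python/WSGI sketch of that mapping; the HTTP_ prefix and the underscores come from the CGI convention that WSGI and PHP's $_SERVER both inherit, not from anything that appears on the wire:

def cgi_variable_name(header_name):
    # CGI-style environments uppercase the name, swap dashes for underscores,
    # and prefix it with HTTP_.
    return "HTTP_" + header_name.upper().replace("-", "_")

print(cgi_variable_name("X-Forwarded-For"))   # HTTP_X_FORWARDED_FOR

def app(environ, start_response):
    # A WSGI application therefore reads the header like this.
    client_chain = environ.get("HTTP_X_FORWARDED_FOR", "")
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [client_chain.encode("ascii", "replace")]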