Why is it said that HTTP/2 is a binary protocol?

I've just read an article about the differences between HTTP/1 and HTTP/2. The main question I have is about the claim that HTTP/2 is a binary protocol while HTTP/1 is a textual protocol.
Maybe I'm wrong, but I know that any data, whether text or any other format, has a binary representation in memory, and even when transferred over a TCP/IP network the data is split into units according to the layers of the OSI or TCP/IP model. So technically a "textual format" doesn't exist in the context of data transfer over a network.
I can't really understand this difference between HTTP/2 and HTTP/1. Can you help me with a better explanation?

Binary is probably a confusing term - everything is ultimately binary at some point in computers!
HTTP/2 has a highly structured format where HTTP messages are formatted into packets (called frames) and where each frame is assigned to a stream. HTTP/2 frames have a specific format, including a length which is declared at the beginning of each frame and various other fields in the frame header. In many ways it’s like a TCP packet. Reading an HTTP/2 frame can follow a defined process (the first 24 bits are the length of this packet, followed by 8 bits which define the frame type... etc.). After the frame header comes the payload (e.g. HTTP Headers, or the Body payload) and these will also be in a specific format that is known in advance. An HTTP/2 message can be sent in one or more frames.
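To make that concrete, here is a minimal sketch in Python (not the code of any real HTTP/2 library) of how the fixed 9-octet frame header could be decoded once the raw bytes have been read from the connection:

import struct

def parse_frame_header(data: bytes):
    # An HTTP/2 frame header is always 9 octets:
    # 24-bit payload length, 8-bit type, 8-bit flags, 1 reserved bit + 31-bit stream id.
    length = (data[0] << 16) | (data[1] << 8) | data[2]
    frame_type, flags = data[3], data[4]
    stream_id = struct.unpack(">I", data[5:9])[0] & 0x7FFFFFFF
    return length, frame_type, flags, stream_id

# Example: a header announcing a 16-byte HEADERS frame (type 0x1) on stream 1.
header = bytes([0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x01])
print(parse_frame_header(header))   # (16, 1, 4, 1)

The receiver knows it must read exactly 9 octets up front, and the length field then says exactly how many payload octets follow - no scanning for delimiters.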
By contrast, HTTP/1.1 is an unstructured format made up of lines of ASCII text - so yes, it is ultimately transmitted as binary, but it's basically a stream of characters rather than being specifically broken into separate pieces/frames (other than lines). HTTP/1.1 messages (or at least the first HTTP request/response line and the HTTP headers) are parsed by reading characters one at a time until a newline character is reached. This is kind of messy, as you don't know in advance how long each line is, so you must process it character by character. The HTTP body's length is handled slightly differently, as it is typically known in advance: a Content-Length header declares it. An HTTP/1.1 message must be sent in its entirety as one continuous stream of data, and the connection cannot be used for anything else until that message is completed.
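A rough sketch of that character-by-character parsing style (a toy helper over an already-buffered request, not how production parsers are written):

def read_line(data: bytes, start: int = 0):
    # Scan for CRLF; the parser cannot know the line length up front.
    end = data.index(b"\r\n", start)
    return data[start:end].decode("ascii"), end + 2

request = b"GET /foo HTTP/1.1\r\nHost: example.com\r\nConnection: keep-alive\r\n\r\n"
line, pos = read_line(request)
print(line)                       # GET /foo HTTP/1.1
while True:
    line, pos = read_line(request, pos)
    if line == "":
        break                     # a blank line ends the headers
    name, value = line.split(":", 1)
    print(name, value.strip())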
The advantage that HTTP/2 brings is that, by packaging messages into specific frames, we can intermingle the messages: here's a bit of request 1, here's a bit of request 2, here's some more of request 1... etc. In HTTP/1.1 this is not possible because the HTTP message is not wrapped into packets/frames tagged with an id saying which request they belong to.
I’ve a diagram here and an animated version here that help conceptualise this better.

HTTP basically encodes all relevant instructions as ASCII code points, e.g.:
GET /foo HTTP/1.1
Yes, this is represented as bytes on the actual transport layer, but the commands are based on ASCII bytes, and are hence readable as text.
HTTP/2 uses actual binary commands, i.e. individual bits and bytes which have no representation other than the bits and bytes that they are, and hence have no readable representation. (Note that HTTP/2 essentially wraps HTTP/1 in such a binary protocol, there's still "GET /foo" to be found somewhere in there.)
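To make the difference visible, compare the raw bytes (illustrative only):

# HTTP/1.1: the request line is ASCII text on the wire.
print(b"GET /foo HTTP/1.1\r\n".hex(" "))
# 47 45 54 20 2f 66 6f 6f ... - every byte maps to a printable character

# HTTP/2: a frame header is raw fields, not text.
print(bytes([0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x01]).hex(" "))
# 00 00 10 01 04 00 00 00 01 - meaningful only to a parser that knows the format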

For example, take the header:
Connection: keep-alive
In HTTP/1.1 it is sent as plain ASCII text, exactly as written:
Connection: keep-alive
Just the text.
In HTTP/2, the client and the server have agreed beforehand on some value tables, roughly like:
headerField: ['user-agent', 'cookie', 'connection', ...]
connection-values: ['keep-alive', 'close', ...]
Then Connection: keep-alive can be encoded as a pair of small indexes, something like:
2:0
and that's it.
A protocol similar to the HTTP/2 binary protocol is the Thrift binary protocol.
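A toy sketch of that indexing idea in Python (this is a big simplification - real HTTP/2 uses the HPACK static and dynamic tables plus Huffman coding - and the tables below are made up):

# Hypothetical pre-agreed tables, just for illustration.
HEADER_FIELDS = ["user-agent", "cookie", "connection"]
CONNECTION_VALUES = ["keep-alive", "close"]

def encode(name: str, value: str) -> bytes:
    # Send two small indexes instead of the full header text.
    return bytes([HEADER_FIELDS.index(name), CONNECTION_VALUES.index(value)])

def decode(data: bytes) -> str:
    return f"{HEADER_FIELDS[data[0]]}: {CONNECTION_VALUES[data[1]]}"

wire = encode("connection", "keep-alive")
print(wire.hex())    # 0200 - two bytes instead of 22 characters
print(decode(wire))  # connection: keep-alive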

I believe the primary reason HTTP/2 uses binary encoding is to pack the payload into length-prefixed frames.
Plain text doesn't map cleanly onto frames: a character sequence can be cut at an arbitrary point, so binary-encoding the data and splitting it across multiple frames makes a lot more sense.

Difference between binary and text protocols in the context of http

What I read everywhere is that HTTP is a text-based protocol and HTTP/2 is a binary protocol. Lots of articles online suggest that HTTP/2's binary protocol is more compact and more efficient to process.
Where exactly in the HTTP workflow does the text-based protocol add overhead? At the application layer we would always need to serialize the data (text) into binary anyway to transfer it on the wire. So, essentially, aren't we transferring the data in binary with both HTTP and HTTP/2?
Where exactly in the HTTP/2 workflow does the binary protocol bring that compactness and processing efficiency?
At the application layer, we would always need to serialize the data (text) into binary anyway to transfer on the wire.
True, but up to HTTP/1.1 the data was written to the underlying layer (e.g. the TCP layer) as text. In HTTP/2 the data is binary-encoded, packed into frames and handed to the underlying layer.
A text format doesn't map cleanly onto frames: a character sequence may get split across multiple frames. Binary-encoding the data and splitting it into frames is preferable.

What size does the server have to give to each response chunk?

If I wanted to configure my personal server so that the response for a certain request is sent according to the chunking rules: what size should each of the server's response chunks have?
For example, let's say that the chunked response is a long HTML page or a file.
How would you behave in these two cases?
From the RFC:
This allows dynamically produced content to be transferred...
In other words: Transfer-Encoding: chunked is needed when the length of content is unknown.
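For reference, this is roughly what a chunked body looks like on the wire: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk terminates the body. A small illustration in Python (the content is made up):

def make_chunk(data: bytes) -> bytes:
    # hex size, CRLF, data, CRLF
    return f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"

body = make_chunk(b"<html>") + make_chunk(b"<body>Hello</body></html>") + b"0\r\n\r\n"
print(body.decode("ascii"))
# 6
# <html>
# 19
# <body>Hello</body></html>
# 0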
The length of your content may be as big as 10 TB... but it can also be as small as 10 bytes. It doesn't matter. The chunks' sizes depend solely on the algorithms you are using to generate them and to read them.
Let's say you generate a stream of messages of different lengths, one character per second. In this case you can decide to send one-byte chunks to the client. That way the client can use the data as soon as it arrives. But if your client has no use for partial messages, then you should probably save the bandwidth and send a chunk the moment you've finished generating the next message. And again, it doesn't matter how big or small the message is. It can be 2 characters or it can be 1000.
On second thought, there are some use cases for Transfer-Encoding: chunked with data of a known size. But then your question becomes too broad to answer. It depends on your client code, server code, network conditions, data properties, desired user experience, etc.
And if by any chance you are asking about the optimal size from the network perspective, then just send the whole file - that's the best bet. And support Content-Range on your server instead of Transfer-Encoding: chunked.

Understanding tshark output

I am trying to understand the output of network data captured by tshark using the following command
sudo tshark -i any 'tcp port 80' -V -c 800 -R 'http contains <filter_argument>' > <desired_file_location>
Accordingly, I get some packets in output each starting with a line something like this:
Frame 5: 1843 bytes on wire (14744 bits), 1843 bytes captured (14744 bits) on interface 0
I have some basic questions regarding a packet:
Is a frame and a packet the same thing (used interchangeably)?
Does a packet logically represent 1 request (in my case HTTP request)? If not, can a request span across multiple packets, or can a packet contain multiple requests? A more basic question will be what does a packet represent?
I see a lot of information being captured in the request. Is there a way, using tshark, to capture just the HTTP headers and the HTTP request body? Basically, my motive for this whole exercise is to capture all these requests so I can replay them later.
Any pointers to answer these questions would be really helpful.
You've asked several questions. Here are some answers.
Are frames and packets the same things?
No. Technically, when you are looking at network data and that data includes the Layer 2 frame header, you are looking at a frame. The IP packet inside of that frame is just data from Layer 2's point of view. When you look at the IP datagram (or strip off the frame header), you are now looking at a packet.
Ultimately, I tell people that you should know the difference and try to use the terms properly, but in practice it's not an extremely important distinction.
Does a packet represent a single request?
This really depends. With HTTP 1.0 and 1.1 you could look at it this way, although if the client has a significant amount of POST data to send, there's no reason the request can't span multiple packets. It is better to think of a single "connection" or "session" as a single request/response. (This is not necessarily strictly true with HTTP 1.1, but it is generally true.)
With HTTP 2.0, this is by design not true. A single connection or session is used to handle multiple data streams (requests/responses).
How can I get at the request headers?
This is far too lengthy for me to answer here. The simplest thing to do, most likely, is to simply fire up Wireshark, go into the filter bar and type "http." As soon as you hit the dot, you will see a list of all of the different sub-elements that you can look at. You can use these in tshark using the '-Y' option, and you can additionally specify the columns that you would like to display (so you can add and remove columns, effectively).
An alternative way to see this information is to use the filter expression button to bring up the protocols selector. If you scroll down to HTTP, you can select it and then see all of the fields that are available.
When looking through these, realize that some of the fields are in the top-level rather than within request or response. For example, content-length appears as a field under http rather than http.request.content_length. This is because content-length is a field common to all requests and responses.
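As an illustration, a capture command along these lines pulls out just the request fields (double-check the exact field names against your Wireshark version's field reference):
tshark -i any -Y 'http.request' -T fields -e http.request.method -e http.request.uri -e http.host -e http.file_data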

How is encoded data sent over a network?

I'm a bit confused about how people represent binary data and how it is sent over networks. I will explain through Wikipedia's example (shown here: https://imgur.com/a/POELH). So I have my binary data encoded as Base64, and I am sending the text TWFu. So I am sending T, then W, then F, and finally u. But to send T, a char, I will need one byte, like I've always been told: one character sent over a network is one byte.
Because now I've come to think that if I encode 24 bits (3 bytes), I will be sending 4 characters, but to send 4 characters I need the same number of bytes as there are characters??
So when sending "Man" (unencoded, requiring 3 bytes) vs "TWFu" (encoded, requiring 4 bytes) over the network, as in the example above, is the same sequence of bits sent? The last time I used a socket to send data, it just asked for a string input, never a text + encoding input.
Synopsis: "How" is an agreement. "Raw" is common.
Data is sent in whichever way the sender and receiver agree. There are many protocols that are standard agreements. Protocols operate at many levels. A very common pair that covers two levels is TCP/IP. Many higher-level protocols are layered on top of them. (A higher-level protocol may or may not depend on specific underlying protocols.) HTTP and SMTP are very common higher-level protocols, often with SSL sandwiched in between.
Sometimes the layers or the software that implements them is called a stack. There is also the reference (or conceptual) OSI Model. The key point about it is that it provides a language to talk about different layers. The layers it defines may or may not map to any specific stack.
Your question is too vague to answer directly. With HTTP, "raw" binary data is transferred all the time. The HTTP headers can give the length of the body in octets and the body follows the header. As part of the agreement between the sender and receiver, the header might give meta-data about the binary data using MIME headers. For example: Your gravatar
is sent with headers including:
content-length:871
content-type:image/png
That's enough for the receiver to know that the sender claims that it is a PNG graphic of 871 bytes. The receiver will read the header and then read 871 bytes for the body and then assume that what follows is another HTTP header.
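A rough sketch of that read discipline in Python, assuming for simplicity that the whole response is already in one buffer (a real client reads from the socket incrementally):

response = (b"HTTP/1.1 200 OK\r\n"
            b"content-type: image/png\r\n"
            b"content-length: 871\r\n"
            b"\r\n" + b"\x89PNG" + bytes(867))    # 871 bytes of (fake) image data

head, _, rest = response.partition(b"\r\n\r\n")
headers = dict(line.decode("ascii").split(": ", 1) for line in head.split(b"\r\n")[1:])
body = rest[:int(headers["content-length"])]      # read exactly 871 octets
print(len(body))                                  # 871; anything after belongs to the next message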
Some protocols use synchronization methods other than bodies with pre-declared sizes. They might be entirely text-based and use a syntax that allows only certain characters. They can be extended by a nesting agreement to use something like Base64 to represent binary data as text.
Some layers might provide data compression of sufficient density that expansion by higher layers, such as Base64, is not a great concern. See HTTP Compression, for example.
If you want to see HTTP in action, hit F12 and go to the Network tab. If you want to see other protocols active on your computer, try Wireshark, Microsoft Message Analyzer, Fiddler or similar.
Base64 is a method for encoding arbitrary 8-bit data in a purely 7-bit channel. As much as the internet is based on the principle of 8-bit bytes, for text mode it's presumed to be 7-bit ASCII unless otherwise specified.
If you're sending that data Base64 encoded then you'll literally send TWFu. Many text-based protocols use Base64 out of convenience: it's an established standard and it's efficient enough for most applications.
The foundation of the internet, IP, is a protocol based on 8-bit bytes. When sending binary data you can make full use of all 8 bits, but if you're working with a text-mode protocol, of which there are many, you're generally stuck using 7-bit ASCII unless the protocol has a way of specifying which character set or encoding you're using.
If you have the option to switch to a "binary" transfer then you can side-step the need for Base64. If you're working with a 7-bit ASCII protocol then you're probably going to need Base64.
Note this isn't the only method for encoding arbitrary binary data as characters. There's also quoted-printable, as used in email, and URI (percent) encoding for URLs. These are more efficient in cases where escaping is exceptional, but far less efficient if it's required for every character.
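A quick way to see the trade-off in Python (illustrative only; the exact escaped output depends on the encoder's settings):

import base64, quopri, urllib.parse

text = b"Hello, world."          # mostly safe ASCII
png_sig = b"\x89PNG\r\n\x1a\n"   # mostly bytes that need escaping

print(base64.b64encode(png_sig))               # always ~33% overhead, text or binary
print(quopri.encodestring(text))               # cheap here: almost nothing to escape
print(urllib.parse.quote_from_bytes(png_sig))  # %89PNG%0D%0A%1A%0A - 3 chars per escaped byte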
If you know you're dealing with 7-bit text only there's no need for base-64 encoding.
However, if you'd need to send
Man
Boy
over a purely 7-bit text channel you couldn't send it as a literal with the line breaks. Instead, you'd send it encoded in Base64:
TWFuDQpCb3kNCg==
which has encoded line breaks but doesn't use incompatible characters. Of course, the receiver needs to know that you're sending encoded text - either implied by the protocol or explicitly marked in some way.
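You can check that round trip yourself, e.g. in Python:

import base64
print(base64.b64decode("TWFuDQpCb3kNCg=="))                 # b'Man\r\nBoy\r\n' - the CRLFs survive
print(base64.b64encode(b"Man\r\nBoy\r\n").decode("ascii"))  # TWFuDQpCb3kNCg==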

Sending raw bytes over network. Bad?

This post to the question "What is base 64 encoding used for?" says:
When you have some binary data that you want to ship across a network, you generally don't do it by just streaming the bits and bytes over the wire in a raw format. Why? because some media are made for streaming text. You never know -- some protocols may interpret your binary data as control characters (like a modem), or your binary data could be screwed up because the underlying protocol might think that you've entered a special character combination (like how FTP translates line endings).
I've used sockets in Java a hundred times to send binary data over networks. And as far as I know it is very common to send binary data over networks, especially if you have big data. I don't see why some devices could interpret binary data incorrectly, since it travels with TCP headers etc.
SOAP MTOM also sends binary data over networks.
Am I misunderstanding something? I'm puzzled, because this post has many upvotes and is accepted.
The answer you link to isn't incorrect, it just fails to explicitly mention some examples. The answer is in the quote as well:
because some media are made for streaming text
Sockets deal in bytes, they don't care what they transport. It is the higher-level protocols, or the message formats they transport, that do.
It's when this binary data is wrapped in envelopes of such protocols or formats that they can wreak havoc. A less than (<) character in image bytes is perfectly valid, but when used in an XML message, it will break the XML. Other characters, like control characters, can have an influence on how further data is to be interpreted by a protocol handler.
So Base64 is used to wrap binary data in a way that is safe for transport where the raw bytes would otherwise not be.
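A small sketch of why the wrapping matters, assuming (hypothetically) you were embedding image bytes in an XML message:

import base64

image_bytes = b"\x89PNG<tag>\x00\xff"    # raw bytes that happen to contain '<'
# Dropping them into XML as-is would corrupt the markup:
#   <image>\x89PNG<tag>\x00\xff</image>
safe = base64.b64encode(image_bytes).decode("ascii")
print(f"<image>{safe}</image>")          # <image>iVBORzx0YWc+AP8=</image>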
