Http Compression on Binary Data - http

I am serving binary data through http.
For the time being I use Content-Disposition: attachment.
Can I use the built in http compression (having the client request data with the header Accept-Encoding) to compress the attachment?
Or should I compress the attachment manually?
What is the proper way of serving compressed byte arrays through http?

Content-disposition is merely a header instructing the browser to either render the response, or to offer it as a download to the user. It doesn't change any HTTP semantics, and it doesn't change how the response body is transferred or interpreted.
So just use the built-in compression that compresses the response body according to the request header Accept-encoding.

Related

Chunked Transfer Encoding, Content-Length, and Byte Serving

In https://gist.github.com/CMCDragonkai/6bfade6431e9ffb7fe88, it says
Do note that byte serving is compatible with chunked encoding, this
would be applicable where you know the total content length, want to
allow partial or resumable downloads, but you want to stream each
partial response to the client.
I thought that if you want to allow partial and resumable downloads, you need to use the Content-Length HTTP header which is not allowed with chunked encoding. Is my understanding incorrect?

How does browser download binary files?

When I click on a binary file (image files, PDF files, videos files etc) in a browser to download, does the server return these files in an HTTP response body? Does HTTP protocol support binary HTTP response body in the first place? Or does the browser uses some other protocol internally to transfer these files?
Any reference (books, links) on how browser works would be appreciated!
Does HTTP protocol support binary HTTP response body in the first place?
Yes. I believe the browser knows it is binary because of the Content-Type header in the response.
In the image below, captured from Wireshark, the data highlighted in blue is binary.
You can see this data is in the body of the response of an HTTP request for an image/x-icon.

Content-Encoding vs Transfer Encoding in HTTP

I have a question on usage of Content-Encoding and Transfer-Encoding:
Please let me know if my below understanding is right:
Client in its request can specify which encoding types it is willing to accept using accept-encoding header. So, if Server wishes to encode the message before transmission, eg. gzip, it can zip the entity (content) and add content-encoding: gzip and send across the HTTP response. On reception, client can receive and decompress and parse the entity.
In case of Transfer Encoding, Client may specify what kind of encoding it is willing to accept and perform its action on fly. i.e. if Client sends a TE: gzip; q=1, it means that if Server wishes, it can send a 200 OK with Transfer-Encoding: gzip and as it tries sending the stream, it can compress and send across, and client upon receiving the content, can decompress on fly and perform its parsing.
Is my understanding right here? Please comment.
Also, what is the basic advantage of compressing the entity on fly vs compressing the entity first and then transmitting it across? Is transfer-encoding valid only for chunked responses as we do not know the size of the entity before transmission?
The difference really is not about on-the-fly or not -- Content-Encoding can be both pre-computed and on the fly.
The differences are:
Transfer Encoding is hop-by-hop, not end-to-end
Transfer Encodings other than "chunked" (sadly) aren't implemented in practice
Transfer Encoding is on the message layer, Content Encoding on the payload layer
Using Content Encoding affects entity tags etc.
See http://greenbytes.de/tech/webdav/rfc7230.html#transfer.codings and http://greenbytes.de/tech/webdav/rfc7231.html#data.encoding.

Difference between multipart and chunked protocol?

Can some experts explain the differences between the two? Is it true that chunked is a streaming protocol and multipart is not? What is the benefit of using multipart?
More intuitively,
Chunking is a way to send a single message from server to client, where the server doesn't have to wait for the entire response to be generated but can send pieces (chunks) as and when it is available. Now this happens at data transfer level and is oblivious to the client. Appropriately it is a 'Transfer-Encoding' type.
While Multi-part happens at the application level and is interpreted at the application logic level. Here the server is telling client that the content , even if it is one response body it has different logical parts and can be parsed accordingly. Again appropriately, this is a setting at 'Content-Type' as the clients ought to know it.
Given that transfer can be chunked independent of the content types, a multi-part http message can be transferred using chunked encoding by the server if need be.
Neither is a protocol. HTTP is the protocol. In fact, the P in HTTP stands for Protocol.
You can read more on chunked and multipart under Hypertext Transfer Protocol 1.1
Chunked is a transfer coding found in section 3.6 Transfer Codings.
Multipart is a media type found in section 3.7.2 Multipart Types a subsection of 3.7 Media Types.
Chunked also affects other aspects of the protocol such as the content-length as specified under 4.4 as chunked must be used when message length cannot be predetermined (mainly when delivering dynamic content).
From 14.41 (Transfer-Encoding header field)
The Transfer-Encoding general-header field indicates what (if any)
type of transformation has been applied to the message body in order
to safely transfer it between the sender and the recipient. This
differs from the content-coding in that the transfer-coding is a
property of the message, not of the entity.
Put more simply, chunking is how you transfer a block of data, while multipart is the shape of the data.

Is it possible to have chunked Http GET requests?

I am trying to write a simple proxy. I just want to know whether it is possible to have chunked Http GET requests?
The answer is no and yes simultaneously. GET requests don't have any content, so they obviously cannot use chunked transfer encoding (there is nothing to transfer). However the response to a GET request can contain a body that is encoded using chunked transfer encoding. So whenever there is a body, chunked transfer encoding may be used. The wikipedia page has more information and also links to the corresponding RFC.

Resources