Does progressive JPEG require chunked transfer encoding? - http

I've been looking at how to implement an API over HTTP that allows for online processing of the resource it returns. This resource could be a progressive JPEG, for example. In reading up on progressive JPEGs and how they are rendered in browsers, I never see any mention of this requiring chunked transfer encoding to work. If I'm understanding things correctly, I don't see how progressive JPEGs could be rendered before they are fully downloaded without the use of chunked transfer encoding. Is this correct?
Edit: To clarify why I think chunked encoding is needed: if you don't use chunked encoding to GET a progressive JPEG, then the browser or other application that sent the GET request for the JPEG wouldn't be passed the JPEG resource until it was fully received. With chunked encoding, on the other hand, as each chunk of the JPEG came in, the application (browser or otherwise) could render or otherwise process the portion of the JPEG received so far, instead of having nothing to work with until the full JPEG was downloaded.

the browser or other application that sent the GET request for the JPEG wouldn't be passed the JPEG resource until it was fully received
That's not true. Browsers can access resources they are downloading before the download completes.
In the end it's all received over a socket, and a proper abstraction layer lets the application code "stream" bytes from that socket as they arrive, packet by packet.
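For illustration, here is a minimal sketch (Python, using the third-party requests library; the URL is a placeholder) of reading a response body incrementally as the bytes arrive, regardless of whether the server used Content-Length or chunked encoding:

    import requests

    url = "https://example.com/progressive.jpg"  # placeholder URL

    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        received = 0
        for chunk in resp.iter_content(chunk_size=8192):
            received += len(chunk)
            # A browser would hand these bytes to its JPEG decoder right away;
            # here we only show that data is usable before the download finishes.
            print(f"got {len(chunk)} bytes ({received} total so far)")

Each successive scan of a progressive JPEG can be decoded as soon as its bytes have arrived; the transfer coding underneath makes no difference.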

Related

How is a browser able to resume downloads?

Do downloads use HTTP? How can they resume downloads after they have been suspended for several minutes? Can they request a certain part of the file?
Downloads are done over either HTTP or FTP.
For a single, small file, FTP is slightly faster (though you'll barely notice a difference). For downloading large files, HTTP is faster due to automatic compression. For multiple files, HTTP is always faster due to reusing existing connections and pipelining.
Parts of a file can indeed be requested independent of the whole file, and this is actually how downloads work. This is a process known as 'Chunked Encoding'. A browser requests individual parts of a file, downloads them independently, and assembles them in the correct order once all parts have been downloaded:
In chunked transfer encoding, the data stream is divided into a series of non-overlapping "chunks". The chunks are sent out and received independently of one another. No knowledge of the data stream outside the currently-being-processed chunk is necessary for both the sender and the receiver at any given time.
And according to FTP vs HTTP:
During a "chunked encoding" transfer, the sending party sends a stream of [size-of-data][data] blocks over the wire until there is no more data to send and then it sends a zero-size chunk to signal the end of it.
This is combined with a process called 'Byte Serving' to allow for resuming of downloads:
Byte serving begins when an HTTP server advertises its willingness to serve partial requests using the Accept-Ranges response header. A client then requests a specific part of a file from the server using the Range request header. If the range is valid, the server sends it to the client with a 206 Partial Content status code and a Content-Range header listing the range sent.
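A minimal range-request sketch in Python (third-party requests library; the URL is a placeholder and the server must support byte ranges):

    import requests

    url = "https://example.com/large-file.bin"  # placeholder URL

    resp = requests.get(url, headers={"Range": "bytes=0-1023"})
    if resp.status_code == 206:  # 206 Partial Content
        print(resp.headers.get("Content-Range"))   # e.g. "bytes 0-1023/1048576"
        print(len(resp.content), "bytes received")
    else:
        # A server without range support typically replies 200 with the full body.
        print("Range ignored; status", resp.status_code)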
Do downloads use HTTP?
Yes, especially since major browsers have deprecated and removed their FTP support.
How can they resume downloads after they have been suspended for several minutes?
Not all downloads can resume after this long. If the (TCP or SSL/TLS) connection has been closed, another one has to be established to resume the download. (If it's HTTP/3 over QUIC, then it's another story.)
Can they request a certain part of the file?
Yes. This can be done with Range Requests. But it requires server-side support (especially when the requested resource is provided by a dynamic script).
That other answer mentioning chunked transfer has mistaken it for the underlying mechanism of TCP. Chunked transfer is not designed for the purpose of resuming partial downloads. It's designed to delimit the message boundary when the Content-Length header is not present and the communicating parties wish to reuse the connection. It is also used when the protocol version is HTTP/1.1 and there is a trailer section (which is similar to the header section, but comes after the message body). HTTP/2 and HTTP/3 have their own ways to convey trailers.
Even if multiple non-overlapping ranges ("chunks") of the resource are requested, they come back encapsulated in a single multipart/* (multipart/byteranges) message, not as separate chunked transfers.
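For example, a single request carrying Range: bytes=0-49,100-149 would (if honoured) be answered with 206 Partial Content and a multipart/byteranges body roughly like this sketch (the boundary string and sizes are made up):

    # Hypothetical multipart/byteranges body for two requested ranges; the
    # boundary string is chosen by the server and advertised in Content-Type.
    multipart_body = (
        b"--BOUNDARY\r\n"
        b"Content-Type: application/octet-stream\r\n"
        b"Content-Range: bytes 0-49/1000\r\n"
        b"\r\n"
        b"...first 50 bytes of the resource...\r\n"
        b"--BOUNDARY\r\n"
        b"Content-Type: application/octet-stream\r\n"
        b"Content-Range: bytes 100-149/1000\r\n"
        b"\r\n"
        b"...bytes 100 through 149...\r\n"
        b"--BOUNDARY--\r\n"
    )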

How does browser download binary files?

When I click on a binary file (image files, PDF files, video files, etc.) in a browser to download, does the server return these files in an HTTP response body? Does the HTTP protocol support a binary HTTP response body in the first place? Or does the browser use some other protocol internally to transfer these files?
Any reference (books, links) on how browser works would be appreciated!
Does the HTTP protocol support a binary HTTP response body in the first place?
Yes. I believe the browser knows it is binary because of the Content-Type header in the response.
For example, in a Wireshark capture of such an exchange, the raw binary data of the image sits directly in the body of the HTTP response to a request for an image/x-icon, right after the text headers.
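A quick way to see the same thing without Wireshark (Python, third-party requests library; the URL is a placeholder) is to fetch an icon and print the Content-Type header next to the first raw bytes of the body:

    import requests

    resp = requests.get("https://example.com/favicon.ico")  # placeholder URL
    print(resp.headers.get("Content-Type"))  # e.g. "image/x-icon"
    print(resp.content[:8])                  # raw binary bytes of the body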

How does browser download image and other binary files?

I want to know the exact mechanism behind the transfer of binary files using a browser. If the browser uses purely HTTP, does that mean only text is allowed, so the image is encoded using base64 and decoded later in the browser? Or does the browser download it using some other mechanism where this encoding/decoding is not needed?
Just in case someone wants to know the answer: while you could send binary data over HTTP using base64 encoding, that is neither necessary nor efficient, since encoding and decoding add work and inflate the payload. HTTP is not a text-only protocol; only the request/status line and header fields are text, while the message body is an arbitrary sequence of bytes. So when you request an image file over HTTP, the server sends metadata such as the MIME type and Content-Length in the headers, followed by the raw image bytes in the response body, all over the same HTTP connection (which itself runs on top of TCP). No base64 encoding or decoding is involved.
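A rough illustration with a raw socket (Python standard library only; example.com and the path are placeholders, and the response parsing is deliberately naive):

    import socket

    HOST = "example.com"   # placeholder host
    PATH = "/favicon.ico"  # placeholder path to a binary resource

    # HTTP/1.0 keeps the example simple: no chunked encoding, and the server
    # closes the connection when the body ends. Only this header block is text.
    request = (
        f"GET {PATH} HTTP/1.0\r\n"
        f"Host: {HOST}\r\n"
        f"\r\n"
    ).encode("ascii")

    with socket.create_connection((HOST, 80)) as sock:
        sock.sendall(request)
        response = b""
        while chunk := sock.recv(4096):
            response += chunk

    headers, _, body = response.partition(b"\r\n\r\n")
    print(headers.decode("iso-8859-1"))  # human-readable header block
    print(body[:8])                      # raw binary bytes, no base64 anywhere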

Sonos speakers need content-length header?

I am currently writing a DLNA server which serves streams for DMRs such as the Sonos Play 1 & 3.
According to the HTTP 1.1 specification, when you do not know the actual length of a track, you should not specify Content-Length and should instead just specify Connection: Close.
This should make the clients read the stream until the server closes the connection.
This is working fine for wav and flac streams. But for mp3 and ogg streams I need to specify a Content-Length to make them play.
Otherwise the Sonos clients just close the connection instantly by themselves.
In my case it's a live stream of the computer's current playback. Therefore it is impossible to know the length; as long as the computer runs there is content to play.
My current solution is to fake the content length and set it to an absurd value (100 GB) to make the stream play forever.
I am wondering about that behaviour, because it is working fine for wav and flac, but not for mp3 and ogg.
What am I doing wrong? Or is this just a deviation from the HTTP 1.1 specification?
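Roughly, the workaround described above comes down to a header block along these lines (a sketch only, with illustrative values):

    # Hypothetical header block for the live MP3 stream, using the faked
    # Content-Length workaround described in the question (100 GB).
    response_headers = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: audio/mpeg\r\n"
        "Content-Length: 107374182400\r\n"  # faked: 100 GB, so the stream never runs out
        "Connection: close\r\n"
        "\r\n"
    )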

Compress file before upload via http

Is it possible to compress data being sent from the client's browser (a file upload) to the server?
Flash, Silverlight and other technologies are OK!
Browsers never compress uploaded data because they have no way of knowing whether the server supports it.
Downloaded content can be compressed because the Accept-Encoding request header allows the browser to indicate to the server that it supports compressed content. Unfortunately, there's no equivalent protocol that works the other way and allows the server to indicate to the browser that it supports compression.
If you have control over the server and client (e.g. using silverlight, flash) then you could make use of compressed request bodies.
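If you do go that route, here is a sketch of the client side (Python, third-party requests library; the endpoint and file name are hypothetical, and the server must be set up to decompress gzip request bodies):

    import gzip
    import requests

    with open("upload.bin", "rb") as f:       # hypothetical local file
        compressed = gzip.compress(f.read())  # compress the body before sending

    resp = requests.post(
        "https://example.com/upload",         # hypothetical endpoint
        data=compressed,
        headers={
            # Only meaningful if the server has agreed, out of band, to accept it.
            "Content-Encoding": "gzip",
            "Content-Type": "application/octet-stream",
        },
    )
    print(resp.status_code)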
For Silverlight there is a library called Xceed which, amongst other things, "lets you compress data as it is being uploaded", though it is not free. I believe that this can only be done via a technology such as Flash or Silverlight and not natively in the browser.
I disagree with the above poster about browsers doing this automatically; I believe this only happens with standard HTML/CSS/text files, and only if the server and browser both have compression enabled (gzip, deflate).
