Can I do something similar to pipelining on HTTP/2?

Pipelining is a technique in HTTP/1.1 where multiple requests are sent at once without waiting for a response, on a keepalive connection. The responses are then returned in order by the server, without waiting for a round-trip-time between a response being sent and the next request being received.
HTTP/2 adds a feature called multiplexing, which similarly allows the client to send off multiple requests at once. In this case, however, the server can send the responses concurrently, interleaved in any order.
Without control of the server, can I achieve something similar to pipelining (i.e. receiving responses in order one-at-a-time without latency between responses) when using HTTP/2?
This would be useful when downloading many large files, without much available memory to buffer several partially-completed responses.

Without control of the server, can I achieve something similar to pipelining (i.e. receiving responses in order one-at-a-time without latency between responses) when using HTTP/2?
No, you cannot, unless the server cooperates (for example, the server could be configured to handle requests sequentially).
As a side note, while request pipelining was allowed in HTTP/1.1, it has always been considered a bad idea and as such made irrelevant by all major implementations (i.e. browsers don't do it, servers don't really support it, etc.).
The main problems are error handling and buggy proxy servers.
HTTP/2 allows a client to set priorities on requests so that requests are processed in priority order.
However, this feature is optional and servers may not implement it, so again you need to carefully choose/configure the server in order to get the behavior you want.
If you have some control over the server side, a better solution for both HTTP/1.1 and HTTP/2 would be to ask the server for all the files in a single request and have the server reply with a multipart response.
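The multipart idea can be sketched with Python's standard email package. This is only an illustration under assumptions: the function names and file names are hypothetical, the MIME headers that would normally travel as HTTP headers are kept inside the serialized bytes for simplicity, and the default base64 encoding inflates the payload by roughly a third.

```python
from email import message_from_bytes
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_multipart(files: dict[str, bytes]) -> bytes:
    """Server side (hypothetical helper): pack several files into one multipart/mixed message."""
    container = MIMEMultipart("mixed")
    for name, payload in files.items():
        part = MIMEApplication(payload)  # base64-encodes the payload by default
        part.add_header("Content-Disposition", "attachment", filename=name)
        container.attach(part)
    # Includes the Content-Type line with the boundary, followed by the parts.
    return container.as_bytes()


def split_multipart(raw: bytes) -> dict[str, bytes]:
    """Client side (hypothetical helper): recover each file, in the order the server packed them."""
    message = message_from_bytes(raw)
    return {
        part.get_filename(): part.get_payload(decode=True)
        for part in message.get_payload()
    }


files = {"1.jpg": b"\x00\x01first", "2.jpg": b"second"}
recovered = split_multipart(build_multipart(files))
```

Note that this sketch parses the whole body in memory. For many large files on a memory-constrained client, a streaming parser that writes each part to disk as it arrives would be needed instead.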

Related

Are HTTP responses to the same resource returned in sequence?

Say I want to request some resource on a given domain, e.g. example.com/image.jpg.
If I make two requests for this particular resource, can I expect the first response I get back to map to the first request I sent, and so on (or does this depend on the client/server implementation)?
I'm asking because if I want to debug request-response pairs, I need to know exactly which response belongs to which request (for timing purposes, etc.). So, are there any sure-fire ways to achieve this mapping?
I'm assuming that you are talking about HTTP/1.x using persistent connections (i.e. HTTP keep-alive) when sending a new request without having the response for the previous one yet, i.e. using HTTP pipelining. In this case the order of the responses matches the order of the requests.
If you are instead talking about HTTP/2, the situation is different: multiple requests can receive their responses in parallel inside the same TCP connection, which also means that a later request might receive its response before earlier requests do. The order in which requests for the same resource are handled depends entirely on the server implementation.
The same is true when the requests are done within independent TCP connections. The order might be even more unpredictable than with HTTP/2 because the requests might be handled in different threads or processes and thus the order also depends on the scheduling of these in the operating system.
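For the independent-connections case, one way to keep the mapping on the client side is to attach your own identifier to each request and look it up when the result arrives, whatever the completion order. The sketch below assumes a hypothetical `fake_fetch` function that stands in for a real HTTP GET on its own connection; the request IDs and delays are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_fetch(url: str, delay: float) -> str:
    """Hypothetical stand-in for an HTTP GET performed on its own connection."""
    time.sleep(delay)
    return f"body of {url}"


def fetch_all(jobs: dict[str, tuple[str, float]]) -> dict[str, tuple[str, float]]:
    """Run all jobs concurrently and return {request_id: (body, elapsed_seconds)}."""
    results = {}
    started = {}
    future_to_id = {}
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        for request_id, (url, delay) in jobs.items():
            started[request_id] = time.monotonic()
            future_to_id[pool.submit(fake_fetch, url, delay)] = request_id
        # Futures complete in an unpredictable order; the dict restores the mapping.
        for future in as_completed(future_to_id):
            request_id = future_to_id[future]
            elapsed = time.monotonic() - started[request_id]
            results[request_id] = (future.result(), elapsed)
    return results


jobs = {
    "req-1": ("example.com/image.jpg", 0.05),
    "req-2": ("example.com/image.jpg", 0.01),
}
results = fetch_all(jobs)
```

Because each future carries exactly one request, both responses for the same URL stay distinguishable even though the second one is likely to finish first.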

How to tell a proxy a connection is still used using HTTP communication?

I have a client side GUI app for human usage that consumes some SOAP web services and uses cURL as the underlying HTTP communication lib. Depending on the input, processing a request can take some large amount of time, even one hour. Neither the client nor server time out for that reason on their own and that's tested and works. Most of the requests get processed in some minutes anyway, so this is an edge case.
One of my users is forced to use a proxy between my client app and my server and, for various reasons, has no control over it. That proxy has a timeout configured and closes the connection to my client after 4 minutes without data transfer. So the user can (and did) upload data for, e.g., 30 minutes; afterwards the server starts to process the data, and after 4 minutes the proxy closes the connection. The server silently continues to process the request, but the user is left with an error message AND won't get the processing result. My app already uses TCP keep-alive, so that shouldn't be the problem; instead, the timeout seems to apply to higher-level data. It works the same way as Squid's read_timeout option, which I used to reproduce the behaviour in our internal setup.
What I would like to do now is start a background thread in my web service which simply outputs some garbage data to my client for as long as the request is being processed. The client ignores this data, and it tells the proxy that the connection is still active. I can recognize my client by its user agent and can configure server-side whether to output that data, so other clients consuming the web service wouldn't be affected.
What I'm asking for is, if there's any HTTP compliant method to output such garbage data before the actual HTTP response? So e.g. would it be enough to simply output \r\n without any additional content over and over again to be HTTP compliant with all requesting libs? Or maybe even binary 0? Or some full fledged HTTP headers stating something like "real answer about to come, please be patient"? From my investigation this pretty much sounds like chunked HTTP encoding, but I'm not sure yet if this is applicable.
I would like to have the following, where all those "Wait" stuff is simply ignored in the end and the real HTTP response at the end contains Content-Length and such.
Wait...\r\n
Wait...\r\n
Wait...\r\n
[...]
HTTP/1.1 200 OK\r\n
Server: Apache/2.4.23 (Win64) mod_jk/1.2.41\r\n
[...]
<?xml version="1.0" encoding="UTF-8"?><soap:Envelope[...]
Is that possible in some standard HTTP way and if so, what's the approach I need to take? Thanks!
HTTP Status 102
Isn't HTTP Status 102 exactly what I need? As I understand the spec, I can simply print that response line over and over again until the final response is available?
HTTP Status 102 was a dead end. Two things might work, depending on the proxy used. The first is an NPH script, which can be used to regularly print headers directly to the client. The important thing is that NPH scripts normally bypass the web server's header buffers, so their output can be sent over the wire as needed. The output "only" needs to consist of valid HTTP headers, and depending on the web server and proxy it might be a good idea to make the headers unique, simply by adding an incrementing counter to the header name.
The second thing is chunked transfer-encoding, in which case small chunks of dummy data can be printed to the client in the response body. The good thing is that such small amount of data can be transferred over the wire as needed using server side flush and such, the bad thing is that the client receives this data and by default behaves as if it was part of the expected response body. That might break the application of course, but most HTTP libs provide callbacks for processing received data and if you print some unique one, the client should be able to filter the garbage out.
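The chunked approach can be sketched as follows. This is a minimal illustration, not the author's actual implementation: the `FILLER` marker and all function names are assumptions, chunk extensions and trailers are not handled, and real HTTP libraries do not necessarily hand chunk boundaries to the application, so filtering by whole chunk is a simplification.

```python
FILLER = b"<!--keepalive-->"  # hypothetical marker, assumed never to appear in real payloads


def encode_chunk(data: bytes) -> bytes:
    """Frame data as one HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return f"{len(data):X}".encode("ascii") + b"\r\n" + data + b"\r\n"


def chunked_body(filler_count: int, payload: bytes) -> bytes:
    """Server side: emit filler chunks while working, then the real payload."""
    out = b"".join(encode_chunk(FILLER) for _ in range(filler_count))
    out += encode_chunk(payload)
    out += b"0\r\n\r\n"  # zero-length chunk terminates the body
    return out


def decode_chunks(body: bytes) -> list[bytes]:
    """Client side: split a chunked body back into its individual chunks."""
    chunks = []
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size = int(body[pos:line_end], 16)
        if size == 0:
            return chunks
        start = line_end + 2
        chunks.append(body[start:start + size])
        pos = start + size + 2  # skip the data and its trailing CRLF


def strip_filler(chunks: list[bytes]) -> bytes:
    """Drop the keep-alive chunks and join what remains."""
    return b"".join(chunk for chunk in chunks if chunk != FILLER)


body = chunked_body(3, b"<soap:Envelope/>")
payload = strip_filler(decode_chunks(body))
```

On a real server each filler chunk would be written and flushed on a timer shorter than the proxy's timeout, rather than built up front as it is here.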
In my case the web service spawns a background thread and, depending on the entry point of the service requested, it prints either headers using NPH or chunks of data. In both cases the data can be the same, so an NPH header can be used for chunked transfer-encoding as well.
My NPH solution doesn't work with Squid, but the chunked one does. The problem with Squid is that its read_timeout setting is not low level for the connection to receive data at all, but instead some logical HTTP thing. This means that Squid does receive my headers, but it expects a complete HTTP header within the period of time defined using read_timeout. With my NPH approach this isn't the case, simply because by design I only want to send some garbage headers to ignore until the real headers arrive.
Additionally, one has to be careful about NPH in Apache httpd, but in my use case it works. I can see the individual headers in Squid's log and without any garbage after the response body or such. Avoid the Action directive.
Apache2 sends two HTTP headers with a mapped "nph-" CGI

How does HTTP/2 provide faster browsing speed compared to HTTP/1.1?

I was reading an article on the launch of HTTP/2. It said that HTTP/2 is based on the SPDY (speedy) protocol and can provide faster browsing than HTTP/1.1 by using "header field compression" and "multiplexing". How exactly do these work?
Am I supposed to believe that in HTTP/1.1 requests are processed in a 'one after the other' manner?
Multiplexing
With HTTP/1.1 a lot of time is spent just waiting. A browser sends a request, waits for the response to come back, and then sends another GET, etc. This is an inefficient use of the bandwidth. At times it would use pipelining, but that too suffers because requests sometimes need to wait for the requests sent before them: the head-of-line blocking problem.
With multiplexing, there's virtually no waiting but the browsers can just ask for hundreds of things at once and they will be delivered in whatever order they can be delivered and without individual streams or objects having to wait for each other. (With prioritization and flow control to help control them properly.)
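A toy simulation can make the interleaving concrete. The sketch below assumes a simple round-robin scheduler and a tiny frame size purely for illustration; real servers choose their own scheduling, and the tuple layout is a made-up stand-in for HTTP/2 DATA frames. It shows a short response finishing before a longer one that was requested earlier.

```python
from collections import deque


def interleave(streams: dict[int, bytes], frame_size: int) -> list[tuple[int, bytes, bool]]:
    """Cut each stream into frames and emit them round-robin as (stream_id, data, end_of_stream)."""
    pending = deque(streams.items())
    frames = []
    while pending:
        stream_id, data = pending.popleft()
        piece, rest = data[:frame_size], data[frame_size:]
        frames.append((stream_id, piece, not rest))
        if rest:
            pending.append((stream_id, rest))  # go to the back of the queue
    return frames


def reassemble(frames: list[tuple[int, bytes, bool]]) -> tuple[dict[int, bytes], list[int]]:
    """Rebuild each body from its frames and record the order in which streams finished."""
    bodies: dict[int, bytes] = {}
    finished = []
    for stream_id, piece, end_of_stream in frames:
        bodies[stream_id] = bodies.get(stream_id, b"") + piece
        if end_of_stream:
            finished.append(stream_id)
    return bodies, finished


# Client-initiated HTTP/2 streams use odd identifiers.
streams = {1: b"A" * 10, 3: b"B" * 2, 5: b"C" * 5}
bodies, finished = reassemble(interleave(streams, frame_size=4))
```

With pipelining, stream 3 would have had to wait for all ten bytes of stream 1; here it completes after the first round.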
This will be most notable on high-latency connections. For a visible and clear demo what it can do, see the golang's gophertiles demo at https://http2.golang.org/gophertiles?latency=1000 (requires a HTTP/2 enabled browser)
Header compression
Additionally, HTTP/2 offers header compression that makes a client able to squeeze in more requests earlier in a TCP connection lifetime. In the early slow-start period of a new TCP connection it can be valuable to cram in more requests so that the responses come back earlier. HTTP headers are extremely repetitive in their nature.
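How repetitive headers are can be seen by compressing a few similar requests with one shared compression context. Note the assumptions: HTTP/2 actually uses HPACK rather than deflate, so the zlib stream below is only a rough stand-in (closer to what SPDY did), and the header values are invented for the example.

```python
import zlib


def request_headers(path: str) -> bytes:
    """Hypothetical, fairly typical browser request headers for one resource."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        "Host: example.com\r\n"
        "User-Agent: Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0\r\n"
        "Accept: image/avif,image/webp,*/*;q=0.8\r\n"
        "Accept-Language: en-US,en;q=0.5\r\n"
        "Accept-Encoding: gzip, deflate\r\n"
        "Cookie: session=0123456789abcdef0123456789abcdef\r\n\r\n"
    ).encode("ascii")


def compressed_sizes(paths: list[str]) -> list[tuple[int, int]]:
    """Return (raw_size, compressed_size) per request, sharing one compression context."""
    compressor = zlib.compressobj()
    sizes = []
    for path in paths:
        raw = request_headers(path)
        # Z_SYNC_FLUSH emits everything so far but keeps the history for the next request.
        packed = compressor.compress(raw) + compressor.flush(zlib.Z_SYNC_FLUSH)
        sizes.append((len(raw), len(packed)))
    return sizes


sizes = compressed_sizes(["/tile/1", "/tile/2", "/tile/3"])
```

The second and third entries should come out far smaller than the first, because almost every byte repeats what the compressor has already seen.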
Server push
An HTTP/2 server can send data to the client as if the client had asked for it, before the client asks for it! The server does this if it thinks the client is likely to want or need that resource too, and thus half an RTT can be saved.
Starting with HTTP/2, both headers and HTTP response content can be compressed. With HTTP/1.1, headers are never compressed, unlike the content (whose compression is indicated by the Content-Encoding header).
Multiplexing is related to server-push. Actually, when the server sends an HTML page, it can use the same connection to push additional resources like css and javascript files. If the HTML page needs to load those additional scripts, no more request will be sent to the server as they were already sent previously.

Non-serial pipelined HTTP possible?

RFC 2616 section 8.1.2.2 states:
A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MUST send its responses to those requests in the same order that the requests were received.
Serial responses often do more harm than good, since they require the server to do more processing and negate the performance benefits gained by pipelining.
For example, if an HTTP client requests the files 1.jpg, 2.jpg, 3.jpg, 4.jpg, and 5.jpg, it doesn't matter if 3.jpg is returned before 1.jpg, or if 4.jpg is returned before 3.jpg. The client simply wants the responses as soon as they are available, in any order.
How can an HTTP client gain the benefits of pipelining, and at the same time not pay for the disadvantages of response queueing?
A client can't circumvent HOL-queueing as it's part of RFC 2616. The only benefit of pipelining (in my opinion) is in extremely specific and narrow cases. Consider:
R1cost = Request A processing cost.
R2cost = Request B processing cost.
TCPcost = Cost of negotiating new TCP connection.
Using pipelining would, therefore, be viable in specific cases where:
R1cost ≥ R2cost ≤ TCPcost
How often is a request more expensive than a previous request and less expensive than negotiating a new TCP connection? Not often. I would add that Websockets are (by far) a more interesting and appropriate solution (as far as parallel back-end processing is concerned).
It can't (in HTTP/1.1). It might be in a future version of HTTP.
There is no default mechanism in the HTTP headers to identify which response would match which request. A response is known to be that to a specific request because of the order in which it's received. If you requested 1.jpg, 2.jpg, 3.jpg, 4.jpg, and 5.jpg and sent the responses in any order, you wouldn't know which one is which.
(You could implement your own markers in client and server headers, but you'd certainly not be compliant with the protocol and most implementations would not know how to deal with that. You would have to do some processing to map, which may negate the anticipated benefits of this parallel implementation too.)
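The order-based matching described above can be sketched as a small parser over the bytes read from a pipelined connection. This is a simplified illustration under assumptions: every response is assumed to carry a Content-Length header (no chunked encoding), the whole stream is assumed to be already buffered, and the function names are hypothetical.

```python
def split_responses(stream: bytes) -> list[tuple[int, bytes]]:
    """Parse back-to-back HTTP/1.1 responses into (status_code, body) tuples."""
    responses = []
    pos = 0
    while pos < len(stream):
        head_end = stream.index(b"\r\n\r\n", pos)
        head_lines = stream[pos:head_end].decode("iso-8859-1").split("\r\n")
        status = int(head_lines[0].split(" ", 2)[1])
        headers = {}
        for line in head_lines[1:]:
            name, _, value = line.partition(":")
            headers[name.strip().lower()] = value.strip()
        length = int(headers["content-length"])
        body_start = head_end + 4
        responses.append((status, stream[body_start:body_start + length]))
        pos = body_start + length
    return responses


def pair_in_order(requested: list[str], stream: bytes) -> list[tuple[str, tuple[int, bytes]]]:
    """The n-th response belongs to the n-th request; nothing in the response says so itself."""
    return list(zip(requested, split_responses(stream)))


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 3\r\n\r\none"
    b"HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nthree"
)
pairs = pair_in_order(["/1.jpg", "/2.jpg", "/3.jpg"], stream)
```

If the server were to reorder these three responses, the parser would still succeed and silently attach the wrong body to each path, which is why the in-order rule is mandatory.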
The main benefits you get from the existing HTTP pipeline mechanism are:
Possible reduced communication latency. This may matter depending on your connection.
For requests that require some longer server-side computation, the server could start this computation in the background upon receiving the request, while it's sending a previous response, so as to be able to start sending the second result earlier. (This is also a form of latency, but in terms of response preparation.)
Some of these benefits can also be gained by more modern web-browser techniques, where multiple requests can be sent separately and parts of the page may be updated progressively (via AJAX).

Is SPDY any different than http multiplexing over keep alive connections

HTTP/1.1 supports keep-alive connections; connections are not closed until "Connection: close" is sent.
So, if the browser (in this case Firefox) has network.http.pipelining enabled and network.http.pipelining.maxrequests increased, isn't the effect the same in the end?
I know that these settings are disabled by default because for some websites this could increase load, but I think a simple HTTP header flag could tell the browser that it is OK to use multiplexing, and this problem could be solved more easily.
Wouldn't it be easier to change the default settings in browsers than to invent a new protocol that increases complexity, especially in HTTP servers?
SPDY has a number of advantages that go beyond what HTTP pipelining can offer, which are described in the SPDY whitepaper:
With pipelining, the server still has to return the responses one at a time in the order they were requested. This can be a problem if the client requests a resource that's dynamically generated before one that is static: the server cannot send any of the "easy" static responses until the dynamically generated one has been generated and sent. With SPDY, responses can be returned out of order or in parallel as they are generated, lowering the total time to receive all resources.
As you noted in your question, not all servers are able to deal with pipelining: it's not just load, some servers actually behave incorrectly when the client requests pipelining. Using a header to indicate that it's okay to do pipelining is too late to get the maximum benefit: you are already receiving the first response at that point, so while you can use it on future connections it's already too late for this one.
SPDY compresses headers using an algorithm which is specific to that task (stateful and with knowledge of what is normally in HTTP headers); while yes, SSL already includes compression, just compressing them with deflate is not as efficient. Most HTTP requests have no bodies and only a short GET line, so the headers make up virtually the entire request: any compression you can get is an improvement. Many responses are also small compared to their headers.
SPDY allows servers to send back additional responses without the client asking for them. For example, a server might start sending back the CSS for a page along with the original HTML, before the client has had a chance to receive and parse the HTML to determine the stylesheet URL. This can speed up page loads even further by eliminating the need for the client to actually parse the HTML before requesting other resources needed to render the page. It also supports a less bandwidth-heavy version of this feature where it can "hint" about which resources might be needed, and allow the client to decide: this allows, for example, clients that don't care about images to not bother to request them, but clients that want to display images can still request the images using the given URLs without needing to wait for the HTML.
Other things too: see William Chan's answer for even more.
HTTP pipelining is susceptible to head of line blocking (http://en.wikipedia.org/wiki/Head-of-line_blocking) at the HTTP transaction level whereas SPDY only has head of line blocking at the transport level, due to its use of multiplexing.
HTTP pipelining has deployability issues. See https://datatracker.ietf.org/doc/html/draft-nottingham-http-pipeline-01 which describes a number of different workarounds and heuristics to mitigate this. SPDY as deployed in the wild does not have this problem since it is generally deployed over SSL (port 443) using NPN (http://technotes.googlecode.com/git/nextprotoneg.html) to negotiate SPDY support. SSL is key, since it prevents intermediaries from interfering.
SPDY has header compression. See http://dev.chromium.org/spdy/spdy-whitepaper which discusses some benchmark results of the benefits of header compression. Now, it's useful to note that bandwidth is less and less of an issue (see http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/), but it's also useful to remember that bandwidth is still key for mobile. Check out https://developers.google.com/speed/articles/spdy-for-mobile which shows how beneficial SPDY is for mobile.
SPDY supports features like server push. See http://dev.chromium.org/spdy/spdy-best-practices for ways to use server push to improve cacheability of content and still reduce roundtrips.
HTTP pipelining has ill-defined failure semantics. When the server closes the connection, how do you know which requests have been successfully processed? This is a major reason why POST and other non-idempotent requests are not allowed over pipelined connections. SPDY provides semantics to cancel individual streams on the same connection, and also has a GOAWAY frame which indicates the last stream to be successfully processed.
HTTP pipelining has difficulty, often due to intermediaries, in allowing deep pipelines. This (in addition to many other reasons like HoL blocking) means that you still need to utilize multiple TCP connections to achieve maximal parallelization. Using multiple TCP connections means that congestion control information cannot be shared, that compression contexts cannot be shared (as SPDY does with headers), and that the load on the internet is higher (more costly for intermediaries and servers).
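The GOAWAY point above can be illustrated with a tiny helper. It is a sketch that assumes the HTTP/2 reading of the frame, where streams with an identifier above the reported last stream ID were not processed and may be retried, while lower or equal ones might have been processed. The function name and the sample IDs are hypothetical.

```python
def split_after_goaway(in_flight: list[int], last_stream_id: int) -> tuple[list[int], list[int]]:
    """Return (maybe_processed, safe_to_retry) stream IDs after a GOAWAY frame."""
    maybe_processed = [sid for sid in in_flight if sid <= last_stream_id]
    safe_to_retry = [sid for sid in in_flight if sid > last_stream_id]
    return maybe_processed, safe_to_retry


# Four requests were in flight when the server announced last_stream_id = 3.
maybe_processed, safe_to_retry = split_after_goaway([1, 3, 5, 7], last_stream_id=3)
```

A pipelined HTTP/1.1 client has no equivalent signal when the connection closes, which is the failure-semantics gap described above.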
I could go on and on about HTTP pipelining vs SPDY. But I'd recommend just reading up on SPDY. Check out http://dev.chromium.org/spdy and our tech talk on SPDY at http://www.youtube.com/watch?v=TNBkxA313kk&list=PLE0E03DF19D90B5F4&index=2&feature=plpp_video.
See Difference between HTTP pipelining and HTTP multiplexing with SPDY
