The number of HTTP requests need to be made for downing a webpage? - http

Are all assets (html files, js files, css files, images) in one webpage transmitted through a single HTTP request/response, or through multiple HTTP requests/responses, one for each asset?
Assumed no XHR in that webpage.

All the digital assets on a web document are transmitted on separate HTTP requests. However modern web servers and browsers are able to use the same TCP connection with HTTP keep-alive.

Conceptually, each asset is a separate request. In practise, most servers allow the browser to re-use the same physical socket connection for multiple requests (but they are still issued one after the other) and this can significantly improve performance (because you need extra round-trips to establish a connection, and subsequent requests can piggy-back on the ACKs for the previous request: you cut down on a lot of round-trips).
But yes, there is always one request/response per asset on the page.
On connections with high latency (e.g. Australia -> U.S.) the number of round-trips can be a significant bottleneck, and that's why things like CSS sprites are widely used.

It's one request per asset, but you can use multiple TCP connections to send multiple HTTP requests in parallel. In fact all browser do exactly that.

I'd recommend downloading Firebug for Firefox, then watching its 'Net' tab while you browser some sites. It would answer this question and so many more.

Related

HTTP2 and NGINX - when would I use a keepalive directive?

I am experimenting with migration to http/2. I have set up a Wordpress website and configured NGINX to serve it using http/2 (using SSL/TLS).
My understanding of http/2 is that by default it uses one connection to transfer a number of files - rather than opening new connections and repeating SSL handshakes for each file.
To achieve a similar effect for http/1.x, NGINX offered the keepalive directive.
So does using the keepalive directive make sense when using http/2 exclusively? Checking my config files with "nginx -t" reports no errors when I include it. But does it have any effect? Benchmarking showed no difference.
You're misunderstanding how this works.
Keep Alive's work between requests.
When you download a webpage it downloads the HTML page and discovers it needs another 20 resources say (CSS files, javascript files, images, fonts... etc.).
Under HTTP/1.1 you can only request one of these resources at once so typically the web browser fires up another 5 connections (giving 6 in total) and requests 6 of those 20 resources. Then it requests the remaining 14 resources as those connections free up. Yes keep-alives help in between those requests but that's not its only use as we'll discuss below. The overhead of setting up those connections is small but noticeable and there is a delay in only being able to request 6 resources of those 20 at a time. This is why HTTP/1.1 is inefficient for today's usage of the web where a typical web page is made up of 100 resources.
Under HTTP/2 we can fire off all 20 requests at once on the same connection so some good gains there. And yes technically you don't really benefit from keep-alives in between those as connection is still in use until they all arrive - though still benefit from small delay between first HTML request and the other 20.
However after that initial load there are likely to be more requests. Either because you are browsing around the site or because you interact with the page and it makes addition XHR api calls. Those will benefit from keep-alives whether on HTTP/1.1 or HTTP/2.
So HTTP/2 doesn't negate need for keep-alives. It negates need for multiple connections (amongst other things).
So the answer is to always use keep-alives unless you've a very good reason not to. And what type of benchmarking are you doing to say it makes no difference?

How does HTTP/2 provide faster browsing speed compared to HTTP/1.1?

I was reading reading an article on launching of HTTP/2. It was said that HTTP/2 is based on SPDY (speedy) protocol and it can provide faster browsing speed compared to HTTP/1.1 by using "header field compression" and "multiplexing". How does these terms exactly work?
Am I supposed to believe that in HTTP/1.1 requests are processed in a 'one after the other' manner?
Multiplexing
With HTTP 1.1 a lot of time is spent just waiting. A browser sends requests and waits for the response to come back and then sends another GET etc. An inefficient use of the bandwidth. At times it would use Pipelining but that too suffers that sometimes requests need to wait for the requests done prior. The head of line blocking problem.
With multiplexing, there's virtually no waiting but the browsers can just ask for hundreds of things at once and they will be delivered in whatever order they can be delivered and without individual streams or objects having to wait for each other. (With prioritization and flow control to help control them properly.)
This will be most notable on high-latency connections. For a visible and clear demo what it can do, see the golang's gophertiles demo at https://http2.golang.org/gophertiles?latency=1000 (requires a HTTP/2 enabled browser)
Header compression
Additionally, HTTP/2 offers header compression that makes a client able to squeeze in more requests earlier in a TCP connection lifetime. In the early slow-start period of a new TCP connection it can be valuable to cram in more requests so that the responses come back earlier. HTTP headers are extremely repetitive in their nature.
Server push
A HTTP/2 server can send data to the client as if the client asked for it, before the client asks for it! If the server thinks the client is likely to want/need that too, and thus a half RTT can be saved.
Starting with HTTP/2, both headers and HTTP response content could be compressed. With HTTP/1.1, headers are never compressed contrary to the content (specified with Content-Encoding header).
Multiplexing is related to server-push. Actually, when the server sends an HTML page, it can use the same connection to push additional resources like css and javascript files. If the HTML page needs to load those additional scripts, no more request will be sent to the server as they were already sent previously.

What is the use of http non persistent connection mode

It may seem to be a trivial question but still.. I have a confusion over it.
Almost at every site I have read that HTTP persistent or keep-alive connections are better than the non-persistent one.
Ques: So, why do non-persistent even exists?
Some says that persistent has disadvantage if server is serving many clients as users are deprived of connection.
Ques: All the popular websites server millions of clients, does that mean they don't use persistent mode?
As per my understanding I can think search engines may not be using persistent connections.
Can someone please enlighten me on this topic.
Another doubt I have is regarding the HTTP requests. I have read that if a page contains link to several objects then web browser makes that many request to fetch all those (this is why persistent connections are used). My doubt is why all the objects are not embedded in the page and sent as one object? If argument is that it makes page heavy and not bandwidth friendly then anyways the browser open parallel connections to fetch multiple objects which again putting the same load on the network.
OK, I understand that this cannot be done for like image search but if a page contains few objects then can we embed them into the page and send.
These may seem foolish questions but I can't help. I have a doubt and I need to clear that and you can help.
Thanks
The original HTTP specification always uses non-persistent connections; HTTP/1.1 added persistence because it is more efficient for web pages that embed a lot of external objects (which were rare when HTTP/1.0 was written.)
However, even though HTTP/1.1 allows persistent connections there are implementations that don't support them, or which still only support HTTP/1.0. For this reason, HTTP/1.1 requires that the Connection: keep-alive header be sent in order to enable this feature, and Connection: close be sent to disable it.
It is possible to include media directly in the HTML by base64 encoding the data and including it in a data: URL. This is not usually done because it slows down your web browser. With a standard HTML page, the browser can start rendering the structure of the page without waiting for the (rather large) inline data: links to download.
As you say most of the webpages hosted over the internet will not only handle fewer data, and nobody can estimate that. The HTTP server should be generic and it should have a mechanism to avoid multiple requests in the name of dependencies. You say that the non-persistent method avoids the blocking of ports by a single client for a long time where as the server may have to serve more clients and it would give a lot of stress, that is not true. Persistent connections actually reduce the load for a server by limiting the number of queries it has to serve.
Hope this HTTP Persistent connection will help you understand.

Does SPDY/HTTP2 concatenates responses?

I have a question about SPDY/HTTP2:
Normally you concatenate multiple CSS and JS files into one file to save requests and to get a better performance. I heard that SPDY/HTTP2 combines multiple requests into a single response. Would that mean that I don't need to pre-concatenate CSS and JS files anymore, because this is handled by the protocol?
To say it in other words:
Can I use <script source="moduleA.js"></script> and <script source="moduleB.js"></script> with SPDY/HTTP2 in the same way as I would use <script source="allScripts.js"></script> with HTTP1? Is this the same from a response performance point of view, but with the benefit of caching each file on its own, so that I can change moduleB.js and keep moduleA.js cached?
HTTP/2.0 does not (AFAIK) exist yet - it's still a proposed standard. But it seems likely that it will use similar connection handling to SPDY.
SPDY doesn't concatenate them it multiplexes the requests across the same connection - from the network's point of view the effect is the same.
Yes, you don't need to merge the content files by hand, yes they will be cached independently.
SPDY3 and HTTP2 are multiplexing requests on the same physical connection.
But even multiplexed, requests may be sent sequentially for each resource, causing major slowdowns due to roundtrip time waits.
Both SPDY3 and HTTP2 have a feature called "Resource Push" (also known as "SPDY Push", not to be confused with "Server Push") that allows related resources to be pushed without the client requesting them, and the Jetty project - I am a committer - is the only one to my knowledge that implements that feature.
You can watch Resource Push in action in this video: http://webtide.intalio.com/2012/10/spdy-push-demo-from-javaone-2012/.
With Resource Push, you save additional roundtrips to get all the different JS files and still benefit of the browser cache per single file.
The whole point of resource concatenation is exactly to reduce the number of roundtrips necessary to get all the resources needed, and Resource Push helps to solve that problem.
HTTP/2.0 allows for multiplexing, where multiple request/response streams exchange data over the same TCP connection.
Because creating and starting TCP connections is expensive, HTTP/2.0's multiplexing will usually be faster than the semi-parallel downloading of HTTP/1.1, where a limited amount of TCP connections is (re)used by the browser to perform a given amount of requests for resources.
But your mileage may vary. Measure it.
As a sidenote, you might want to reference all your libraries separately when developing and debugging, but bundle and minify the JS/CSS into one file upon a deploy.

Is SPDY any different than http multiplexing over keep alive connections

HTTP 1.1 supports keep alive connections, connections are not closed until "Connection: close" is sent.
So, if the browser, in this case firefox has network.http.pipelining enabled and network.http.pipelining.maxrequests increased isn't the same effect in the end?
I know that these settings are disabled because for some websites this could increase load but I think a simple http header flag could tell the browser that is ok tu use multiplexing and this problem can be solved easier.
Wouldn't be easier to change default settings in browsers than invent a new protocol that increases complexity especially in the http servers?
SPDY has a number of advantages that go beyond what HTTP pipelining can offer, which are described in the SPDY whitepaper:
With pipelining, the server still has to return the responses one at a time in the order they were requested. This can be a problem if the client requests a resource that's dynamically generated before one that is static: the server cannot send any of the "easy" static responses until the dynamically generated one has been generated and sent. With SPDY, responses can be returned out of order or in parallel as they are generated, lowering the total time to receive all resources.
As you noted in your question, not all servers are able to deal with pipelining: it's not just load, some servers actually behave incorrectly when the client requests pipelining. Using a header to indicate that it's okay to do pipelining is too late to get the maximum benefit: you are already receiving the first response at that point, so while you can use it on future connections it's already too late for this one.
SPDY compresses headers using an algorithm which is specific to that task (stateful and with knowledge of what is normally in HTTP headers); while yes, SSL already includes compression, just compressing them with deflate is not as efficient. Most HTTP requests have no bodies and only a short GET line, so the headers make up virtually the entire request: any compression you can get is an improvement. Many responses are also small compared to their headers.
SPDY allows servers to send back additional responses without the client asking for them. For example, a server might start sending back the CSS for a page along with the original HTML, before the client has had a chance to receive and parse the HTML to determine the stylesheet URL. This can speed up page loads even further by eliminating the need for the client to actually parse the HTML before requesting other resources needed to render the page. It also supports a less bandwidth-heavy version of this feature where it can "hint" about which resources might be needed, and allow the client to decide: this allows, for example, clients that don't care about images to not bother to request them, but clients that want to display images can still request the images using the given URLs without needing to wait for the HTML.
Other things too: see William Chan's answer for even more.
HTTP pipelining is susceptible to head of line blocking (http://en.wikipedia.org/wiki/Head-of-line_blocking) at the HTTP transaction level whereas SPDY only has head of line blocking at the transport level, due to its use of multiplexing.
HTTP pipelining has deployability issues. See https://datatracker.ietf.org/doc/html/draft-nottingham-http-pipeline-01 which describes a number of different workarounds and heuristics to mitigate this. SPDY as deployed in the wild does not have this problem since it is generally deployed over SSL (port 443) using NPN (http://technotes.googlecode.com/git/nextprotoneg.html) to negotiate SPDY support. SSL is key, since it prevents intermediaries from interfering.
SPDY has header compression. See http://dev.chromium.org/spdy/spdy-whitepaper which discusses some benchmark results of the benefits of header compression. Now, it's useful to note that bandwidth is less and less of an issue (see http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/), but it's also useful to remember that bandwidth is still key for mobile. Check out https://developers.google.com/speed/articles/spdy-for-mobile which shows how beneficial SPDY is for mobile.
SPDY supports features like server push. See http://dev.chromium.org/spdy/spdy-best-practices for ways to use server push to improve cacheability of content and still reduce roundtrips.
HTTP pipelining has ill-defined failure semantics. When the server closes the connection, how do you know which requests have been successfully processed? This is a major reason why POST and other non-idempotent requests are not allowed over pipelined connections. SPDY provides semantics to cancel individual streams on the same connection, and also has a GOAWAY frame which indicates the last stream to be successfully processed.
HTTP pipelining has difficulty, often due to intermediaries, in allowing deep pipelines. This (in addition to many other reasons like HoL blocking) means that you still need to utilize multiple TCP connections to achieve maximal parallelization. Using multiple TCP connections means that congestion control information cannot be shared, that compression contexts cannot be shared (like SPDY does with headers), is worse for the internet (more costly for intermediaries and servers).
I could go on and on about HTTP pipelining vs SPDY. But I'd recommend just reading up on SPDY. Check out http://dev.chromium.org/spdy and our tech talk on SPDY at http://www.youtube.com/watch?v=TNBkxA313kk&list=PLE0E03DF19D90B5F4&index=2&feature=plpp_video.
See Difference between HTTP pipeling and HTTP multiplexing with SPDY

Resources