I've been put in charge of building some tools to help end users test why their browser might not work with a website.
Among the reasons I was given for why it might not work, there was this "requires HTTP/1.1" line. I've looked through most browsers' options, and only IE (version 6 and up, even 9) lets you disable HTTP/1.1.
Is there any use in being able to restrict yourself to HTTP/1.0?
Generally speaking, no, you don't ever want the client to only offer HTTP/1.0, as this will slow things down.
Some servers will intentionally use HTTP/1.0 with a Keep-Alive header on some responses because certain browsers (e.g. IE6/IE7) will allow more parallel connections for HTTP/1.0 (four) vs. HTTP/1.1 (two).
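If you're building a diagnostic tool, one simple check is to send the same request with both protocol versions and compare what comes back. A rough sketch in Python over a plain socket (the host name is a placeholder and plain port 80 is assumed):

import socket

HOST = "example.com"   # placeholder target

def probe(version):
    # Host: is mandatory for HTTP/1.1 and harmless for HTTP/1.0
    request = ("GET / " + version + "\r\n"
               "Host: " + HOST + "\r\n"
               "Connection: close\r\n\r\n")
    with socket.create_connection((HOST, 80), timeout=5) as sock:
        sock.sendall(request.encode("ascii"))
        status_line = sock.makefile("rb").readline()
    return status_line.decode("ascii", "replace").strip()

for version in ("HTTP/1.0", "HTTP/1.1"):
    print(version, "->", probe(version))

The status line tells you which version the server answers with for each request; it's a quick way to spot a server that genuinely refuses HTTP/1.0.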
Related
I've been googling around and I cannot seem to find a straight answer to this question, and some people offer contradictory answers.
Most browsers have a limit of 6 connections per domain. So, for example, if your website is example.com and it opens a persistent Server-Sent Events connection on page load, then the end user can open that tab five more times, but the sixth tab won't load at all, because the limit of 6 persistent TCP connections to that domain has been reached.
Now, I see some people saying that this is just a perennial problem with SSE, and that the only alternatives are hacky workarounds that detect the connection limit and then either close the connections in hidden tabs, or close them and fall back to long polling.
However, some people claim that HTTP/2 solves this with multiplexing, such that you can have as many open tabs of that website as you'd like, since all the tabs multiplex onto the same TCP connection. I cannot find a primary source for this claim, nor anyone with significant authority making it.
So, is it true? Does HTTP/2 multiplexing solve the issue of the common 6-connection limit per domain? Or is one basically required to use WebSockets instead in order to support many open tabs to a site?
I have implemented HTTP/2 in Jetty.
As explained in this answer, with HTTP/2 the maximum number of concurrent requests that a browser can make to the server is greatly increased - not infinite, but raised from 6-8 to about 100.
So yes, multiplexing solves this issue in practice (unless you open more than 100 or so tabs).
Note that this value is configured by servers, so it's possible that a server sends to the client a configuration with max number of concurrent requests set to a small number, but in practice servers have settled on a number around 100.
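If you're curious what a particular server advertises, you can read its SETTINGS frame directly. Here is a sketch using the third-party h2 package (the host name is a placeholder; it assumes h2 is installed and that the server negotiates HTTP/2 over TLS via ALPN):

import socket
import ssl

import h2.config
import h2.connection
import h2.events

HOST = "example.com"   # placeholder target

ctx = ssl.create_default_context()
ctx.set_alpn_protocols(["h2"])

with socket.create_connection((HOST, 443), timeout=5) as sock:
    with ctx.wrap_socket(sock, server_hostname=HOST) as tls:
        assert tls.selected_alpn_protocol() == "h2", "server did not negotiate HTTP/2"

        conn = h2.connection.H2Connection(config=h2.config.H2Configuration(client_side=True))
        conn.initiate_connection()
        tls.sendall(conn.data_to_send())

        # Read frames until the server's SETTINGS frame has been processed.
        while True:
            data = tls.recv(65535)
            if not data:
                break
            events = conn.receive_data(data)
            tls.sendall(conn.data_to_send())   # acknowledges the SETTINGS frame
            if any(isinstance(e, h2.events.RemoteSettingsChanged) for e in events):
                break

        print("advertised SETTINGS_MAX_CONCURRENT_STREAMS:",
              conn.remote_settings.max_concurrent_streams)

In practice, as noted above, most servers advertise a value around 100.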
Having said that, you want to also read this other answer for a discussion about SSE vs WebSocket.
The limit is applied in the browser (not server-side), and varies per browser.
At least it did as of HTTP/1.1. I've been unable to find any configuration specific to HTTP/2, so I think we have to assume the limits are still there. I would assume they've been kept to prevent abuse, or accidental DoS attacks.
https://developer.mozilla.org/en-US/docs/Mozilla/Preferences/Mozilla_networking_preferences
See network.http.max-persistent-connections-per-server. (You can also see it in about:config.)
There are several other related settings in Firefox if you filter for "connections" in about:config.
I wonder if SSE is governed by http.max-connections or by websocket.max-connections? (Ignoring any per-domain limit, for the moment.)
You could also just try it, of course (there were example scripts in my above-linked question). But unless you control the configuration of all clients, you have to assume some people will be running with a limit of 6.
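If you want to try it yourself, a tiny SSE test server is enough. This is only a sketch using Python's standard library (localhost and port 8000 are arbitrary): open the page in several tabs of the same browser and watch when a tab stalls.

import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

PAGE = b"""<!doctype html>
<title>SSE limit test</title>
<pre id="log">connecting...</pre>
<script>
  const es = new EventSource('/events');
  es.onmessage = e => document.getElementById('log').textContent = e.data;
</script>
"""

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/events':
            # One long-lived connection per tab, like a real SSE endpoint.
            self.send_response(200)
            self.send_header('Content-Type', 'text/event-stream')
            self.send_header('Cache-Control', 'no-cache')
            self.end_headers()
            try:
                while True:
                    self.wfile.write(f"data: tick {int(time.time())}\n\n".encode())
                    self.wfile.flush()
                    time.sleep(1)
            except (BrokenPipeError, ConnectionResetError):
                pass                     # the tab was closed
        else:
            self.send_response(200)
            self.send_header('Content-Type', 'text/html')
            self.end_headers()
            self.wfile.write(PAGE)

if __name__ == '__main__':
    ThreadingHTTPServer(('localhost', 8000), Handler).serve_forever()

Note that this little server only speaks HTTP/1.x, so the browser's per-host connection limit applies; behind an HTTP/2-capable reverse proxy the tabs would typically share one multiplexed connection instead.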
It may seem a trivial question, but still... I am confused about it.
Almost every site I have read says that HTTP persistent (keep-alive) connections are better than non-persistent ones.
Question: So why do non-persistent connections even exist?
Some say that persistent connections have a disadvantage when the server is serving many clients, because other users are deprived of connections.
Question: All the popular websites serve millions of clients; does that mean they don't use persistent connections?
As per my understanding, search engines, for example, may not be using persistent connections.
Can someone please enlighten me on this topic?
Another doubt I have is regarding HTTP requests. I have read that if a page contains links to several objects, then the web browser makes that many requests to fetch all of them (which is why persistent connections are used). My doubt is: why aren't all the objects embedded in the page and sent as one object? If the argument is that this makes the page heavy and isn't bandwidth friendly, then the browser opens parallel connections to fetch multiple objects anyway, which puts the same load on the network.
OK, I understand that this cannot be done for something like image search, but if a page contains only a few objects, couldn't we embed them into the page and send that?
These may seem like foolish questions, but I can't help it; I have doubts I need to clear up, and you can help.
Thanks
The original HTTP specification always used non-persistent connections; persistence was added (first as the optional Connection: keep-alive extension to HTTP/1.0, then as the default behaviour in HTTP/1.1) because it is more efficient for web pages that embed a lot of external objects, which were rare when HTTP/1.0 was written.
However, even though HTTP/1.1 defaults to persistent connections, there are implementations that don't support them, or which still only support HTTP/1.0. For this reason, an HTTP/1.0 client has to send Connection: keep-alive to opt in to the feature, while an HTTP/1.1 client or server sends Connection: close to opt out of it.
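To make the difference concrete, here is a small sketch with Python's http.client: two requests sent over one TCP connection, which only works because the connection stays open between them (the host and paths are placeholders):

import http.client

conn = http.client.HTTPConnection("example.com")

conn.request("GET", "/")
resp1 = conn.getresponse()
resp1.read()                        # the body must be consumed before reusing the connection
print("first response: ", resp1.status, resp1.getheader("Connection"))

conn.request("GET", "/index.html")  # reuses the same socket if the server kept it open
resp2 = conn.getresponse()
resp2.read()
print("second response:", resp2.status, resp2.getheader("Connection"))

conn.close()

With a non-persistent connection the second request would need a brand-new TCP (and possibly TLS) handshake.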
It is possible to include media directly in the HTML by base64 encoding the data and including it in a data: URL. This is not usually done because it slows down your web browser. With a standard HTML page, the browser can start rendering the structure of the page without waiting for the (rather large) inline data: links to download.
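As a sketch of what that looks like, this snippet inlines an image as a base64 data: URL (the file name is a placeholder):

import base64

with open("logo.png", "rb") as image:            # hypothetical file
    encoded = base64.b64encode(image.read()).decode("ascii")

html = f'<img src="data:image/png;base64,{encoded}" alt="logo">'
print(html)

The base64 text is roughly a third larger than the original image bytes, and the image can no longer be cached separately, which is part of why this is usually reserved for small assets.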
As you say, most of the web pages hosted on the internet will not involve only a small amount of data, and nobody can estimate that in advance; an HTTP server has to be generic, and it needs a mechanism to avoid the cost of the many requests a page's dependencies generate. As for the idea that the non-persistent method prevents a single client from blocking a port for a long time while the server has to serve many other clients and would be under a lot of stress: that is not really true. Persistent connections actually reduce the load on a server, because it no longer has to set up and tear down a TCP connection for every single request.
I hope this article on HTTP persistent connections helps you understand.
I'm with a webhost for my company's website, which has 47 pages (ASP.NET 4/VB). All the speed tests (Google, YSlow, GTmetrix, etc.) want me to enable keep-alive. I have already set HTTP expiration headers to 30 minutes via IIS 7.
I submitted a ticket to my webhost, and they say they don't offer keep-alive. Does that just mean I'm stuck if I stay with this webhost, or is there something I can do on my own, regardless of my webhost? Thanks for any guidance!
If the webhost does not support keep-alive, I don't think there is anything you can do to enable it yourself. They are most likely not allowing it to be overridden.
I think keep-alive is nice to have, but it has less of an impact than some of the key practices like gzip compression, proper caching, and reducing the number of HTTP requests.
In your case, if keep-alive turned out to be significantly important, you could consider switching web hosts.
HTTP/1.1 supports keep-alive connections: connections are not closed until "Connection: close" is sent.
So, if the browser (in this case Firefox) has network.http.pipelining enabled and network.http.pipelining.maxrequests increased, isn't the end effect the same?
I know that these settings are disabled by default because for some websites this could increase load, but I think a simple HTTP header flag could tell the browser that it is OK to use multiplexing, and this problem could be solved more easily.
Wouldn't it be easier to change the default settings in browsers than to invent a new protocol that increases complexity, especially in HTTP servers?
SPDY has a number of advantages that go beyond what HTTP pipelining can offer, which are described in the SPDY whitepaper:
With pipelining, the server still has to return the responses one at a time in the order they were requested. This can be a problem if the client requests a resource that's dynamically generated before one that is static: the server cannot send any of the "easy" static responses until the dynamically generated one has been generated and sent. With SPDY, responses can be returned out of order or in parallel as they are generated, lowering the total time to receive all resources.
As you noted in your question, not all servers are able to deal with pipelining: it's not just load, some servers actually behave incorrectly when the client requests pipelining. Using a header to indicate that it's okay to do pipelining is too late to get the maximum benefit: you are already receiving the first response at that point, so while you can use it on future connections it's already too late for this one.
SPDY compresses headers using an algorithm which is specific to that task (stateful and with knowledge of what is normally in HTTP headers); while yes, SSL already includes compression, just compressing them with deflate is not as efficient. Most HTTP requests have no bodies and only a short GET line, so the headers make up virtually the entire request: any compression you can get is an improvement. Many responses are also small compared to their headers.
SPDY allows servers to send back additional responses without the client asking for them. For example, a server might start sending back the CSS for a page along with the original HTML, before the client has had a chance to receive and parse the HTML to determine the stylesheet URL. This can speed up page loads even further by eliminating the need for the client to actually parse the HTML before requesting other resources needed to render the page. It also supports a less bandwidth-heavy version of this feature where it can "hint" about which resources might be needed, and allow the client to decide: this allows, for example, clients that don't care about images to not bother to request them, but clients that want to display images can still request the images using the given URLs without needing to wait for the HTML.
Other things too: see William Chan's answer for even more.
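For reference, pipelining itself is nothing exotic: the client simply writes several requests before reading any responses. A rough sketch over a raw socket (the host is a placeholder, and the response count is only approximate since it just counts status lines):

import socket

HOST = "example.com"   # placeholder target
PATHS = ["/", "/", "/"]

requests = b""
for i, path in enumerate(PATHS):
    # The last request asks the server to close, so we can read until EOF
    # instead of parsing message lengths.
    close = "Connection: close\r\n" if i == len(PATHS) - 1 else ""
    requests += ("GET %s HTTP/1.1\r\nHost: %s\r\n%s\r\n" % (path, HOST, close)).encode()

with socket.create_connection((HOST, 80), timeout=5) as sock:
    sock.sendall(requests)          # all requests leave before any response is read
    data = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk

print("status lines seen:", data.count(b"HTTP/1.1 "))

Whether you actually get all the responses back on one connection - and the fact that they must come back strictly in order - is exactly what the discussion above is about.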
HTTP pipelining is susceptible to head of line blocking (http://en.wikipedia.org/wiki/Head-of-line_blocking) at the HTTP transaction level whereas SPDY only has head of line blocking at the transport level, due to its use of multiplexing.
HTTP pipelining has deployability issues. See https://datatracker.ietf.org/doc/html/draft-nottingham-http-pipeline-01 which describes a number of different workarounds and heuristics to mitigate this. SPDY as deployed in the wild does not have this problem since it is generally deployed over SSL (port 443) using NPN (http://technotes.googlecode.com/git/nextprotoneg.html) to negotiate SPDY support. SSL is key, since it prevents intermediaries from interfering.
SPDY has header compression. See http://dev.chromium.org/spdy/spdy-whitepaper which discusses some benchmark results of the benefits of header compression. Now, it's useful to note that bandwidth is less and less of an issue (see http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/), but it's also useful to remember that bandwidth is still key for mobile. Check out https://developers.google.com/speed/articles/spdy-for-mobile which shows how beneficial SPDY is for mobile.
SPDY supports features like server push. See http://dev.chromium.org/spdy/spdy-best-practices for ways to use server push to improve cacheability of content and still reduce roundtrips.
HTTP pipelining has ill-defined failure semantics. When the server closes the connection, how do you know which requests have been successfully processed? This is a major reason why POST and other non-idempotent requests are not allowed over pipelined connections. SPDY provides semantics to cancel individual streams on the same connection, and also has a GOAWAY frame which indicates the last stream to be successfully processed.
HTTP pipelining has difficulty, often due to intermediaries, in allowing deep pipelines. This (in addition to many other reasons like HoL blocking) means that you still need to use multiple TCP connections to achieve maximal parallelization. Using multiple TCP connections means that congestion control information cannot be shared and that compression contexts cannot be shared (as SPDY does with headers), and it is worse for the internet as a whole (more costly for intermediaries and servers).
I could go on and on about HTTP pipelining vs SPDY. But I'd recommend just reading up on SPDY. Check out http://dev.chromium.org/spdy and our tech talk on SPDY at http://www.youtube.com/watch?v=TNBkxA313kk&list=PLE0E03DF19D90B5F4&index=2&feature=plpp_video.
See Difference between HTTP pipelining and HTTP multiplexing with SPDY
Say one were to write an HTTP server or client: how important is it to support HTTP/1.0? Is it still used anywhere nowadays?
Edit: I'm less concerned with the usefulness/importance of HTTP/1.0 than with the amount of software that actually uses it for non-internal purposes in the real world (unit testing being internal use, for example): browsers, robots, smartphones/stupidphones, etc.
As of 2016, you would think that its prominence would have declined even further, since HTTP/1.1 was introduced in 1999, about 17 years ago.
I checked 7,727,198 lines of logs to see what percent I get of HTTP/1.0 and HTTP/1.1:
Protocol      Counts      Percent
---------------------------------
HTTP/0.9              0     0.00%
HTTP/1.0      1,636,187    21.17%  (all)
HTTP/1.0         15,415     0.20%  (without the obvious robots)
HTTP/1.1      6,091,011    78.83%
HTTP/2                0     0.00%
From what I can see, most of the HTTP/1.0 requests are from robots. So I tried to remove entries that were obviously from such (i.e. the user agent includes a word like robot, bot, slurp, etc.).
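For reference, here is a sketch of the kind of tally I mean (it assumes an Apache/nginx "combined" log format, and the robot patterns are only illustrative):

import re
from collections import Counter

LINE = re.compile(r'"(?:GET|POST|HEAD)[^"]* (HTTP/[0-9.]+)".*?"([^"]*)"$')
ROBOT = re.compile(r'bot|crawl|slurp|spider', re.IGNORECASE)

totals, without_robots = Counter(), Counter()
with open("access.log") as log:                  # hypothetical log file
    for line in log:
        match = LINE.search(line)
        if not match:
            continue
        version, agent = match.groups()
        totals[version] += 1
        if not ROBOT.search(agent):
            without_robots[version] += 1

print("all:", dict(totals))
print("without obvious robots:", dict(without_robots))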
So it looks like the number of end users still stuck with HTTP/1.0 is very limited today (0.2%). However, if you want to let robots check out your websites, you may need/want to keep HTTP/1.0 operational. Most of them will include the Host: ... header anyway, even though they advertise their requests as HTTP/1.0.
Also, the difference between HTTP/1.0 and HTTP/1.1 is very blurry in terms of implementation. Most people happily mix both. I would not worry too much about still accepting/handling HTTP/1.0 requests.
On another server I am starting to see HTTP/2.0 requests that look like this - apparently the HTTP/2 connection preface as logged by an HTTP/1.x parser (I got 2,427 of them against 34,161,268 HTTP/1.0 and HTTP/1.1 requests, so about 0.007%):
PRI * HTTP/2.0
wget uses HTTP/1.0, and it is still relatively popular (though it does support a few HTTP/1.1 features like the Host: header, which is necessary to access any virtual hosts).
A fair number of servers will deliberately return HTTP/1.0 responses because some (older) browsers will afford an HTTP/1.0 server a higher connection limit than the 2-connection limit imposed for HTTP/1.1's persistent connections.
But in general, most "HTTP/1.0" implementations are really just slightly limited versions of the HTTP/1.1 implementations, and many HTTP/1.1 implementations don't really support some features of that version (e.g. pipelining in particular).
I use it all the time when I'm telnet-ing to a server to verify connectivity or figure out why it's not working:
$ telnet 192.168.1.1 80
GET / HTTP/1.0\r\n
\r\n
...
(Because making a 1.0 request doesn't require that I provide any extra headers).
HTTP/1.0 is very useful for writing very basic clients that don't need the overhead of HTTP/1.1 features like persistent connections, chunked transfer encoding and pipelining. "Post a request, get a response, disconnect" is very easy to code for. This can be useful for writing test cases for your server that just want to exercise the application functionality and NOT the HTTP protocol implementation.
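As a sketch, the whole "send one request, read until the server closes" flow fits in a few lines of Python (host and path are placeholders):

import socket

def http10_get(host, path="/", port=80):
    with socket.create_connection((host, port), timeout=5) as sock:
        # A Host: header is optional in 1.0, though virtual hosts may want one.
        sock.sendall(("GET %s HTTP/1.0\r\n\r\n" % path).encode("ascii"))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:            # HTTP/1.0: the server closes when it's done
                break
            chunks.append(chunk)
    return b"".join(chunks)

print(http10_get("example.com")[:200])

There is no chunked decoding and no connection reuse to worry about, which is the point.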
There are lots of mobile browsers and applications that use 1.0 because they don't have the space or the need for more sophisticated 1.1 implementations, and the latency of non-3G connections on non-smart phones completely negates any benefit of the 1.1 features.
There are also lots of proxies that degrade everything to 1.0 regardless of what the client asks for, and then there are the IE issues.
So the short answer is, for a general purpose HTTP server, 1.0 is very relevant.
Looking into this myself for other purposes:
"HTTP/1.0 is in use by proxies, some mobile clients, and IE when
configured to use a proxy. So 1.0 appears to still account for a non-
trivial % of traffic on the web overall.
...
Yes, there are many 1.0 clients still out there."
Source (July 2009): http://groups.google.com/group/erlang-programming/msg/08f6b72d5156ef74
:-(
Update (March 2011):
If you are going to build a client/server thingy, make the client use HTTP/1.1, and make the server accept both 1.1 and 1.0.
Doing web development, it is a PITA when clients try to load a page without the Host header, because I have no way of knowing which site I am supposed to serve :-S
So better not to build a client like that ;-)
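To make that server-side dilemma concrete, here is a sketch: without a Host header there is no way to pick the right virtual host, so all the server can do is fall back to a default (the host names and port are made up):

from http.server import BaseHTTPRequestHandler, HTTPServer

SITES = {                      # hypothetical virtual hosts
    "site-a.example": b"Site A",
    "site-b.example": b"Site B",
}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        host = (self.headers.get("Host") or "").split(":")[0]
        body = SITES.get(host, b"Default site (no usable Host header)")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

HTTPServer(("", 8080), Handler).serve_forever()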
IME it's been a very long time since I've seen a true HTTP/1.0 request (including from mobile devices, fuzzylollipop).
I say a true request because MSIE still pretends to downgrade to HTTP/1.0 by default when you connect via a proxy (unless you change it in the config): all the outgoing requests are flagged as HTTP/1.0, yet it still includes HTTP/1.1-specific request headers and respects all the HTTP/1.1 responses.
Curiously, IIS, in a mirror image, happily ignores the HTTP version (although I've not experimented much with this to see whether it only does so for MSIE user agents).
So by curious coincidence, MSIE and IIS work much better with proxies than with standards-compliant tools.
C.