Why is HTTPS faster than HTTP?

It makes me crazy.
I know that HTTP connections must be faster than HTTPS, since HTTPS needs extra time for the SSL handshake and for encrypting/decrypting data.
But I have checked two images, from deviantart and from flickr, and got the same results.
I also checked the results in the Firefox network tab and in HTTP Debugger Pro and got the same timings (I don't know why FF shows different sizes for the same image).
Here is the test image with and without HTTPS:
http://fc05.deviantart.net/fs70/f/2014/082/a/0/flying_jellyfish_wallpaper_by_andrework-d7bcloj.jpg
https://fc05.deviantart.net/fs70/f/2014/082/a/0/flying_jellyfish_wallpaper_by_andrework-d7bcloj.jpg

As a protocol, HTTPS is not faster than HTTP³. I get the same speed for both resources, which is to be expected¹.
Holding that claim, then: HTTPS might be benefiting from QoS circumvention on your path², i.e. traffic shaping that throttles recognizable HTTP but leaves encrypted traffic alone.
Alternatively, it could be another artifact on your path, such as an HTTP proxy, or pretty much anything that slows down only the HTTP traffic.
I suspect your path is the issue because the same symptom (a significant time difference!) is seen when connecting to different servers.
¹ Any handshake overhead is dominated by the transfer time on a low-latency connection. Likewise, any encryption overhead is dominated by the network transfer speed.
² The particular network path taken from your browser to the server, whatever that is.
³ This is a very weak claim (that is, I did not claim that HTTP was faster) and is not a difficult proposition to back up: if HTTPS traffic were fundamentally faster (much less twice as fast!), nobody would still be using HTTP.
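If you want to verify this yourself rather than rely on browser dev tools, here is a minimal sketch using the question's two URLs and the third-party requests library (an assumption; any HTTP client will do). Run it several times and average, since a single fetch is dominated by network noise:

    # Rough timing comparison of the same resource over HTTP and HTTPS.
    # Requires the third-party "requests" package (pip install requests).
    import time
    import requests

    URLS = [
        "http://fc05.deviantart.net/fs70/f/2014/082/a/0/flying_jellyfish_wallpaper_by_andrework-d7bcloj.jpg",
        "https://fc05.deviantart.net/fs70/f/2014/082/a/0/flying_jellyfish_wallpaper_by_andrework-d7bcloj.jpg",
    ]

    for url in URLS:
        start = time.perf_counter()
        body = requests.get(url).content  # full download; new connection each run
        elapsed = time.perf_counter() - start
        print(f"{url.split(':')[0]:<5}  {len(body):>9} bytes  {elapsed:.3f}s")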

Related

Deciding whether to use ws or wss?

I have an account with a trading exchange, and they have a WebSocket API which supports ws://... and wss://...
For the non-authenticated channels such as the current state of the orderbook, is it an easy decision to just use ws, mostly for the (however minimal) time savings? Obviously I would like to have my data as recent as possible.
I just want to check there's not some other factor which is more important than a few TLS encryption CPU cycles and ms in latency saved.
I see absolutely no reason not to use wss for all your connections, particularly with a WebSocket. The normal WebSocket usage is to make a connection and then keep that connection open for a long time and use it. While there is a bit of overhead on every transmission because of the encryption, the main wss overhead is when you first make the connection, and that only happens once per connection.
I just want to check there's not some other factor which is more important than a few TLS encryption CPU cycles and ms in latency saved.
No, there is not some other factor. In fact, the opposite: there are more and more reasons these days to use TLS whenever possible to protect your privacy.
For the non-authenticated channels such as the current state of the orderbook, is it an easy decision to just use ws, mostly for the (however minimal) time savings?
Why? If wss is available, I'd be using it for everything. If you actually run into a CPU problem down the road, you could revisit whether using wss has anything to do with it, but that is unlikely to happen and, in my opinion, you have nothing to lose by starting with wss. While one wants to design code intelligently, you don't want to try to micro-optimize performance-related things before you even have a documented, measured performance issue to worry about.
General reasons to use TLS:
Privacy (nobody in the middle can snoop on what you're doing)
Security of data (nobody can read your data, not even proxies)
Security of endpoint (the endpoint you're connecting to can't be hijacked without you knowing about it)
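As a concrete illustration, here is a minimal sketch of a wss client in Python using the third-party websockets package; the URL and subscribe message are hypothetical placeholders, not any real exchange's API. The TLS handshake happens once, at connect time; after that, the per-message encryption cost is negligible for a typical feed:

    # Minimal wss subscriber sketch (hypothetical endpoint and message format).
    # Requires the third-party "websockets" package (pip install websockets).
    import asyncio
    import websockets

    async def watch_orderbook() -> None:
        # The TLS handshake happens once here; the connection is then reused.
        async with websockets.connect("wss://exchange.example.com/ws") as ws:
            await ws.send('{"op": "subscribe", "channel": "orderbook"}')
            while True:
                print(await ws.recv())

    asyncio.run(watch_orderbook())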
I just want to check there's not some other factor which is more important than a few TLS encryption CPU cycles and ms in latency saved.
Actually, there is.
This is also stated in the WebSocket RFC (RFC 6455):
At the time of writing of this specification, it should be noted that connections on ports 80 and 443 have significantly different success rates, with connections on port 443 being significantly more likely to succeed, though this may change with time.
Some network intermediaries (especially with some mobile providers) will fail on ws connections but work properly when using wss.
The reason seems to be that these intermediaries (proxies/routers) attempt to read the WebSocket message as if it were HTTP and "fix" HTTP errors or apply caching (which actually corrupts the WebSocket data).
The encrypted wss protocol will trigger a pass-through mode, since these intermediaries won't be able to read the data or "fix" any HTTP errors.
The WebSocket protocol uses client-side frame masking for the same purpose, but sometimes with limited results; using wss improves connectivity on some networks.
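For reference, the client-side masking mentioned above is just a 4-byte XOR defined by RFC 6455, which is why it deters only naive intermediaries; a sketch:

    # RFC 6455 client-to-server frame masking: each payload byte is XORed
    # with one byte of a 4-byte key, cycling through the key. Unlike TLS,
    # this provides no confidentiality: the key travels with the frame.
    import os

    def mask_payload(payload: bytes, key: bytes = b"") -> tuple[bytes, bytes]:
        key = key or os.urandom(4)  # clients must use a fresh random key per frame
        return key, bytes(b ^ key[i % 4] for i, b in enumerate(payload))

    key, masked = mask_payload(b'{"op": "ping"}')
    # Masking is its own inverse, so applying it twice round-trips:
    assert mask_payload(masked, key)[1] == b'{"op": "ping"}'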

HTTP/2 and NGINX - when would I use a keepalive directive?

I am experimenting with migration to HTTP/2. I have set up a WordPress website and configured NGINX to serve it using HTTP/2 (over SSL/TLS).
My understanding of http/2 is that by default it uses one connection to transfer a number of files - rather than opening new connections and repeating SSL handshakes for each file.
To achieve a similar effect for http/1.x, NGINX offered the keepalive directive.
So does using the keepalive directive make sense when using http/2 exclusively? Checking my config files with "nginx -t" reports no errors when I include it. But does it have any effect? Benchmarking showed no difference.
You're misunderstanding how this works.
Keep-alives work between requests.
When you download a webpage, the browser downloads the HTML page and discovers it needs, say, another 20 resources (CSS files, JavaScript files, images, fonts, etc.).
Under HTTP/1.1 you can only request one of these resources at a time on each connection, so typically the web browser fires up another 5 connections (giving 6 in total) and requests 6 of those 20 resources. Then it requests the remaining 14 resources as those connections free up. Yes, keep-alives help in between those requests, but that's not their only use, as we'll discuss below. The overhead of setting up those connections is small but noticeable, and there is a delay in only being able to request 6 of those 20 resources at a time. This is why HTTP/1.1 is inefficient for today's usage of the web, where a typical web page is made up of 100 resources.
Under HTTP/2 we can fire off all 20 requests at once on the same connection, so there are some good gains there. And yes, technically you don't really benefit from keep-alives in between those, as the connection is still in use until they all arrive, though keeping it open does cover the small delay between the first HTML request and the other 20.
However, after that initial load there are likely to be more requests, either because you are browsing around the site or because you interact with the page and it makes additional XHR API calls. Those will benefit from keep-alives, whether on HTTP/1.1 or HTTP/2.
So HTTP/2 doesn't negate the need for keep-alives. It negates the need for multiple connections (amongst other things).
So the answer is to always use keep-alives unless you have a very good reason not to. And what type of benchmarking are you doing to say it makes no difference?
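For reference, a minimal sketch of the relevant nginx directives; the values are illustrative defaults, not tuned recommendations:

    # Sketch of an HTTP/2 server block with client keep-alive enabled.
    server {
        listen 443 ssl http2;          # one multiplexed connection per client
        server_name example.com;

        # Keep idle client connections open between page loads / XHR calls.
        keepalive_timeout  65s;
        keepalive_requests 1000;       # requests allowed per kept-alive connection
    }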

What is the recommended HTTP POST content length?

I have several clients that constantly POST data to a REST service. The REST service sits behind a network load balancer. Each client sends 100-500 MB a day, and I need to support 500+ clients.
I can either POST very large requests, which will reduce the overhead of TCP/IP session setup and HTTP headers. This will, however, firmly tie one client to a particular server and limit my scalability options. Alternatively, I can send small HTTP requests, which I can load-balance well, but I will incur more overhead for TCP/IP session setup and HTTP headers.
What is the recommended request size for an HTTP POST? Or how can I calculate one for my environment?
There is no recommended size.
While HTTP POST size is not constrained by the RFCs, HTTP is a commodity protocol implementing request/response messaging, and most of the infrastructure is configured around the idea that TCP connections are not particularly long-lived and do not carry significant amounts of data. That is, there will be factors outside your control which may impact the service: although HTTP supports range requests for responses, there is no equivalent for requests.
You can get around a lot of these (although not all) by using HTTPS. However, you still need to think about how you detect/manage outages: are you happy to wait for a TCP timeout?
With 500+ clients presumably using the system quite heavily, the congestion avoidance limits shouldn't be a problem - whether TCP window scaling is likely to be an issue depends on how the system is used. HTTP handshakes should not be an issue unless you restrict the request size to something silly.
If the service is highly dependent on clients pushing lots of data to your server, then I'd encourage you to look at parsing the data on the client (given the volume, presumably it's coming from files, implying a signed Java applet or JavaScript with UniversalBrowserRead privilege) and then sending it over a bi-directional communication channel (e.g. a websocket).
Leaving that aside for now, the only way you can find out what the route between your clients and your server will support is to measure it, and monitor it. I would expect that a 2 MB upload size would work pretty much anywhere, while a 10 MB size would work most of the time within the US or Europe, and you could probably increase this to 50 MB as long as there are no mobile clients.
But if you want to maintain the effectiveness of the service you'll need to monitor bandwidth, packet loss and lost connections.
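If you do go with many smaller POSTs, here is a sketch of one workable shape in Python: split the file into fixed-size parts and retry each part independently. The endpoint, header name, and retry policy are hypothetical; the point is that small parts load-balance and recover well:

    # Upload a large file as a series of smaller POSTs that a load balancer
    # can spread across servers. Endpoint and header names are hypothetical.
    # Requires the third-party "requests" package.
    import time
    import requests

    CHUNK_SIZE = 2 * 1024 * 1024  # 2 MB parts: safe pretty much anywhere (see above)

    def upload(path: str, url: str = "https://api.example.com/ingest") -> None:
        with open(path, "rb") as f:
            part = 0
            while chunk := f.read(CHUNK_SIZE):
                for attempt in range(3):
                    try:
                        resp = requests.post(url, data=chunk, timeout=30,
                                             headers={"X-Part-Number": str(part)})
                        resp.raise_for_status()
                        break
                    except requests.RequestException:
                        time.sleep(2 ** attempt)  # back off, then retry this part
                else:
                    raise RuntimeError(f"part {part} failed after 3 attempts")
                part += 1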

Is SPDY any different from HTTP multiplexing over keep-alive connections?

HTTP/1.1 supports keep-alive connections: connections are not closed until "Connection: close" is sent.
So if the browser (in this case Firefox) has network.http.pipelining enabled and network.http.pipelining.maxrequests increased, isn't the effect the same in the end?
I know these settings are disabled by default because pipelining could increase load for some websites, but I think a simple HTTP header flag could tell the browser that it is OK to use multiplexing, and this problem could be solved more easily.
Wouldn't it be easier to change the default settings in browsers than to invent a new protocol that increases complexity, especially in HTTP servers?
SPDY has a number of advantages that go beyond what HTTP pipelining can offer, which are described in the SPDY whitepaper:
With pipelining, the server still has to return the responses one at a time in the order they were requested. This can be a problem if the client requests a resource that's dynamically generated before one that is static: the server cannot send any of the "easy" static responses until the dynamically generated one has been generated and sent. With SPDY, responses can be returned out of order or in parallel as they are generated, lowering the total time to receive all resources.
As you noted in your question, not all servers are able to deal with pipelining: it's not just load, some servers actually behave incorrectly when the client requests pipelining. Using a header to indicate that it's okay to do pipelining is too late to get the maximum benefit: you are already receiving the first response at that point, so while you can use it on future connections it's already too late for this one.
SPDY compresses headers using an algorithm which is specific to that task (stateful and with knowledge of what is normally in HTTP headers); while yes, SSL already includes compression, just compressing them with deflate is not as efficient. Most HTTP requests have no bodies and only a short GET line, so the headers make up virtually the entire request: any compression you can get is an improvement. Many responses are also small compared to their headers.
SPDY allows servers to send back additional responses without the client asking for them. For example, a server might start sending back the CSS for a page along with the original HTML, before the client has had a chance to receive and parse the HTML to determine the stylesheet URL. This can speed up page loads even further by eliminating the need for the client to actually parse the HTML before requesting other resources needed to render the page. It also supports a less bandwidth-heavy version of this feature where it can "hint" about which resources might be needed, and allow the client to decide: this allows, for example, clients that don't care about images to not bother to request them, but clients that want to display images can still request the images using the given URLs without needing to wait for the HTML.
Other things too: see William Chan's answer for even more.
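SPDY itself was eventually retired in favor of HTTP/2, which kept this multiplexing model, so the out-of-order, single-connection behavior described above is easy to observe today. Here is a sketch using the third-party httpx package with its optional HTTP/2 support (pip install 'httpx[http2]'); the URL is a placeholder:

    # Twenty concurrent requests multiplexed over one HTTP/2 connection.
    import asyncio
    import httpx

    async def main() -> None:
        async with httpx.AsyncClient(http2=True) as client:
            urls = ["https://example.com/" for _ in range(20)]
            # All requests go out at once; responses may complete out of
            # order, unlike HTTP/1.1 pipelining's strict FIFO ordering.
            responses = await asyncio.gather(*(client.get(u) for u in urls))
            for r in responses:
                print(r.http_version, r.status_code)

    asyncio.run(main())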
HTTP pipelining is susceptible to head of line blocking (http://en.wikipedia.org/wiki/Head-of-line_blocking) at the HTTP transaction level whereas SPDY only has head of line blocking at the transport level, due to its use of multiplexing.
HTTP pipelining has deployability issues. See https://datatracker.ietf.org/doc/html/draft-nottingham-http-pipeline-01 which describes a number of different workarounds and heuristics to mitigate this. SPDY as deployed in the wild does not have this problem since it is generally deployed over SSL (port 443) using NPN (http://technotes.googlecode.com/git/nextprotoneg.html) to negotiate SPDY support. SSL is key, since it prevents intermediaries from interfering.
SPDY has header compression. See http://dev.chromium.org/spdy/spdy-whitepaper which discusses some benchmark results of the benefits of header compression. Now, it's useful to note that bandwidth is less and less of an issue (see http://www.belshe.com/2010/05/24/more-bandwidth-doesnt-matter-much/), but it's also useful to remember that bandwidth is still key for mobile. Check out https://developers.google.com/speed/articles/spdy-for-mobile which shows how beneficial SPDY is for mobile.
SPDY supports features like server push. See http://dev.chromium.org/spdy/spdy-best-practices for ways to use server push to improve cacheability of content and still reduce roundtrips.
HTTP pipelining has ill-defined failure semantics. When the server closes the connection, how do you know which requests have been successfully processed? This is a major reason why POST and other non-idempotent requests are not allowed over pipelined connections. SPDY provides semantics to cancel individual streams on the same connection, and also has a GOAWAY frame which indicates the last stream to be successfully processed.
HTTP pipelining has difficulty, often due to intermediaries, in allowing deep pipelines. This (in addition to many other reasons, like HoL blocking) means that you still need to use multiple TCP connections to achieve maximal parallelization. Using multiple TCP connections means that congestion control information cannot be shared and compression contexts cannot be shared (as SPDY does with headers), and it is worse for the internet (more costly for intermediaries and servers).
I could go on and on about HTTP pipelining vs SPDY. But I'd recommend just reading up on SPDY. Check out http://dev.chromium.org/spdy and our tech talk on SPDY at http://www.youtube.com/watch?v=TNBkxA313kk&list=PLE0E03DF19D90B5F4&index=2&feature=plpp_video.
See also: Difference between HTTP pipelining and HTTP multiplexing with SPDY

Are socket connections faster than http on Blackberry?

I'm writing an app for BlackBerry that was originally implemented in standard J2ME. The network connection was done using Connector.open("socket://...:80/...") instead of "http://...".
Now I've implemented the connection using both methods, and it seems like sometimes the socket method is more responsive, and sometimes it doesn't work at all. Is there a significant difference between the two? Mostly what I'm trying to achieve is responsiveness from the connection, to get a smooth progress bar.
BlackBerry's implementations of http and https provide more options for connecting to the target server than socket, and, of course, implement all the HTTP protocol handling for you. I've not benchmarked them, but it makes a certain amount of sense that direct TCP via socket would be quicker in some cases, especially if whatever is listening on port 80 isn't an HTTP server (no protocol overhead).
I've had difficulty in the past with different network providers, some requiring deviceside=true, others deviceside=false, and no real way to know until the first support call for that network came in.
Mostly what I'm trying to achieve is responsiveness from the connection to get a smooth progress bar.
Pardon my saying so, but a "smooth progress bar" is "gilding the lily" - nice to have and look at, but not critical to the application's function, reliability or robustness. Go with what is more robust and reduces code size - likely http in this case.
Since both operate over a network I don't think you can guarantee a smooth progress bar. You might have more chance of that if you remind the person to stay in one place so you have a chance of a consistent connection ;)
There is less overhead with a socket connection than an HTTP one. In fact, HTTP connections run over the socket connection. You can take advantage of the reduced overhead of the socket connection to appear more responsive, but you will likely have more work to do than you would with HTTP. The API is more low-level so coding is more complex.
One difference between a socket and an HTTP connection on the BlackBerry is that HTTP connections may be transparently routed via an HTTP proxy in the case of BES and BIS connections.
In theory sockets will be faster, but then you're responsible for managing the overhead of rolling your own protocol (depending on complexity). Though sockets are more lightweight, I've found that HTTP and all that comes with it greatly reduces the headache.
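None of this is BlackBerry-specific: the point made in the answers above, that HTTP is just a protocol spoken over a socket, is easy to demonstrate in any environment. A desktop Python sketch (standard library only) of the same trade-off, raw socket versus HTTP client:

    # HTTP is just bytes over a TCP socket. The raw version must speak the
    # protocol itself; the library version gets redirects, chunked decoding,
    # and proxy handling "for free" - the trade-off discussed above.
    import socket
    from urllib.request import urlopen

    HOST = "example.com"

    with socket.create_connection((HOST, 80)) as sock:
        sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
        raw = b""
        while chunk := sock.recv(4096):
            raw += chunk
    print(raw.split(b"\r\n")[0])   # status line, e.g. b'HTTP/1.1 200 OK'

    print(urlopen("http://example.com/").status)  # same exchange, one call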
