HTTP/2 protocol impact on web development? - http

I would like to bring your attention to something I have been rethinking for days: the new features of the HTTP/2 protocol and their impact on web development. I would also like to ask some related questions, because my annual planning is becoming less accurate because of HTTP/2.
Since HTTP/2 uses a single, multiplexed connection instead of the multiple connections used by HTTP/1.x, domain sharding techniques will no longer be needed.
With HTTP/1.x you may have already put files on different domains to increase parallelism in file transfer to the web browser; content delivery networks (CDNs) do this automatically. But it doesn't help – and can hurt – performance under HTTP/2.
Q1: Will HTTP/2 minimize the need for CDNs?
Concatenating code files: code chunks that would normally be maintained and transferred as separate files are combined into one, and the browser then finds and runs the needed code within the concatenated file as needed.
Q2. Will HTTP/2 eliminate the need to concatenate files with similar extensions (css, javascript) and the use of great tools like Grunt and Gulp to do so?
Q3. Also, to simplify and keep the question compact, I would ask quite generally: what other impacts of HTTP/2 on web development can you foresee?

Q1: Will HTTP/2 minimize the need for CDNs?
It will certainly shift the balance a bit, provided that you use the right software. I talk about balance because CDNs cost money and management time.
If you are using CDNs to offload traffic, you will still need them for that.
If you are a smallish website (and most websites are, in numerical terms), you will have less of a reason to use a CDN, as latency can be hidden quite effectively with HTTP/2 (provided that you deploy it correctly). HTTP/2 is even better than SPDY; see this article for a use case regarding SPDY.
Also, most of the third-party content that we incorporate into our sites already uses CDNs.
Q2. Will HTTP/2 eliminate the need to concatenate files with similar extensions (css, javascript) and the use of great tools like Grunt and Gulp to do so?
Unfortunately not. Concatenation won't be needed any more (unless the files you are delivering are extremely small, say a few hundred bytes each), but everything else is still relevant, including minification and adding those ugly query strings for cache busting.
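To make that concrete, here is a minimal gulp sketch of the kind of pipeline being discussed. It assumes gulp 4 plus the gulp-concat, gulp-clean-css and gulp-rev plugins, and the globs and paths are only illustrative; under HTTP/2 you would typically drop the concat step and keep the rest:

    // gulpfile.ts (esModuleInterop assumed); plugin names are common ones, not prescriptions
    import gulp from 'gulp';
    import concat from 'gulp-concat';
    import cleanCss from 'gulp-clean-css';
    import rev from 'gulp-rev';

    export function styles() {
      return gulp
        .src('src/css/*.css')        // illustrative glob
        .pipe(concat('bundle.css'))  // HTTP/1.x habit: drop this step under HTTP/2
        .pipe(cleanCss())            // minification is still worth doing
        .pipe(rev())                 // cache busting via content hashes (an alternative to ?v= query strings)
        .pipe(gulp.dest('dist/css'));
    }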
Q3. Also, to simplify and keep the question compact, I would ask quite generally: what other impacts of HTTP/2 on web development can you foresee?
This is a tricky question. On the one hand, HTTP/2 arrives at a moment when the web is mature and developers have whole stacks of things to take care of. HTTP/2 can be seen as a tiny piece that can be changed without the entire stack crumbling. Indeed, I can imagine many teams selling HTTP/2 to management this way ("It won't be a problem, we promise!").
But from a technical standpoint, HTTP/2 allows for better development workflows. For example, the multiplexing nature of HTTP/2 means that most of the contents of a site can be served over a single connection, allowing some servers to learn about interactions between assets by just observing browser behaviors. The information can be used together with other features of HTTP/2 and the modern web (specifically, HTTP/2 PUSH and the pre-open headers) to hide a lot of latency. Think about how much work that can save developers interested in performance.
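To make PUSH concrete, here is a minimal sketch using Node.js's built-in http2 module (an illustration only, not the specific server behaviour described above; the key/cert paths and the pushed stylesheet are placeholders):

    import * as http2 from 'http2';
    import * as fs from 'fs';

    // Browsers only speak HTTP/2 over TLS, so a secure server is required.
    const server = http2.createSecureServer({
      key: fs.readFileSync('server.key'),   // placeholder paths
      cert: fs.readFileSync('server.crt'),
    });

    server.on('stream', (stream, headers) => {
      if (headers[':path'] === '/') {
        // Push the stylesheet before the browser has even parsed the HTML.
        stream.pushStream({ ':path': '/style.css' }, (err, pushStream) => {
          if (err) return;
          pushStream.respond({ ':status': 200, 'content-type': 'text/css' });
          pushStream.end('body { margin: 0 }');
        });
        stream.respond({ ':status': 200, 'content-type': 'text/html' });
        stream.end('<link rel="stylesheet" href="/style.css"><h1>Hello</h1>');
      }
    });

    server.listen(8443);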

Q1: Will HTTP/2 minimize the need for CDNs?
No. CDNs exist primarily to locate content close to the user geographically. The closer you are to the server, the faster you will get the content.
Q2. Will HTTP/2 eliminate the need to concatenate files with similar extensions (css, javascript) and the use of great tools like Grunt and Gulp to do so?
Concatenation is only part of what a tool like Grunt or Gulp does. Linting, conversions, and running tests are other things you would still need a tool for, so they will stay. In terms of concatenation, you would ideally move away from creating a single large concatenated file per type and towards creating smaller concatenated files per module.
Q3. Also, to simplify and keep the question compact, I would ask quite generally: what other impacts of HTTP/2 on web development can you foresee?
The general idea is that HTTP/2 will not make a huge change to the way we develop things, since it is a protocol-level change. Developers would ideally remove optimizations (like concatenation and sharding) that are no longer optimization techniques with HTTP/2.

Related

http2 domain sharding without hurting performance

Most articles consider domain sharding as hurting performance, but that's actually not entirely true. A single connection can be reused for different domains under certain conditions:
they resolve to the same IP
in the case of a secure connection, the same certificate should cover both domains
https://www.rfc-editor.org/rfc/rfc7540#section-9.1.1
Is that correct? Is anyone using it?
And what about CDNs? Do I have any guarantee that they direct a user to the same server (IP)?
Yup, that’s one of the benefits of HTTP/2: in theory it allows you to keep sharding for HTTP/1.1 users and automatically unshard for HTTP/2 users.
The reality is a little more complicated as always - due mostly to implementation issues and servers resolving to different IP addresses, as you state. This blog post is a few years old now but describes some of the issues: https://daniel.haxx.se/blog/2016/08/18/http2-connection-coalescing/. Maybe it’s improved since then, but I would imagine issues still exist. Also, new features like the ORIGIN frame should help but are not widely supported yet.
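If you want to check whether your own shards are even candidates for coalescing, a rough Node.js sketch (the hostnames are placeholders) is to compare the DNS answers and look at which names the certificate covers:

    import { promises as dns } from 'dns';
    import * as tls from 'tls';

    const hosts = ['www.example.com', 'static.example.com']; // placeholder shards

    async function resolveAll(): Promise<void> {
      for (const host of hosts) {
        console.log(host, await dns.resolve4(host)); // coalescing needs the same IP(s)
      }
    }

    function checkCert(host: string): void {
      // Which names does this host's certificate cover (its subjectAltName)?
      const socket = tls.connect(443, host, { servername: host }, () => {
        console.log(host, 'certificate covers:', socket.getPeerCertificate().subjectaltname);
        socket.end();
      });
    }

    resolveAll().then(() => hosts.forEach(checkCert));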
I think, however, it’s worth revisiting the assumption that sharding is actually good for HTTP/1.1. The costs of setting up new connections (DNS lookup, TCP setup, TLS handshake and then actually sending HTTP messages) are not immaterial, and studies have shown the 6-connection browser limit is rarely fully used, never mind adding more connections by sharding. Concatenation, spriting and inlining are usually much better options, and these can still be used with HTTP/2. Trying it on your site and measuring is the best way of being sure of this!
Incidentally, it is for these reasons (and security) that I’m less keen on loading common libraries (e.g. jQuery, Bootstrap, etc.) from their CDNs instead of hosting them locally. In my opinion, the performance benefit of a user already having the exact version your site uses in their cache is overstated.
As with all these things, HTTP/1.1 will still work without sharded domains. It may (arguably) be slower, but it won’t break. Most users are likely on HTTP/2, so is it really worth adding the complexity for the minority of users? Is this not a way of progressively enhancing your site for people on modern browsers (and encouraging those who aren’t to upgrade)? For larger sites (e.g. Google, Facebook, etc.) the minority may still represent a large number of users, and the complexity is worth it (and they have the resources and expertise to deal with it). For the rest of us, my recommendation is not to shard, to upgrade to new protocols like HTTP/2 when they become common (like now!), but otherwise to keep complexity down.

What makes HTTP/2 faster than HTTP/1 beyond multiplexing and server push?

I can understand why multiplexing and server push help speed up web page loading and reduce the workload on the server side. But I have also learned that the binary protocol, header compression, and prioritization of requests also contribute to the performance improvements of HTTP/2 over HTTP/1. How do these three features actually contribute to the improvements?
Binary protocol
This actually doesn’t help that much IMHO, other than enabling multiplexing (which DOES help a lot with performance). Yes, it’s easier for a program to parse binary frames than text, but I don’t think that alone makes a massive performance boost. The main reasons to go binary, as I say, are the other benefits it enables (multiplexing and header compression) and easier parsing, rather than raw performance.
Header compression
This can have a big potential impact. Most requests (and responses) repeat a LOT of data. So compressing headers (which works by replacing repeated headers with references across requests, rather than compressing within a request the way HTTP body compression does) can significantly reduce the size of requests (less so for responses, where the headers are often not a significant portion of the total response).
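To make the "references across requests" idea concrete, here is a toy sketch of the dynamic-table idea behind HPACK. It illustrates the principle only; it is not the real HPACK encoding:

    // Toy model: once a header has been sent on a connection, later requests can
    // refer to it by a small index instead of repeating the full name/value pair.
    type Header = [name: string, value: string];

    class ToyHeaderTable {
      private table: Header[] = [];

      encode(headers: Header[]): (number | Header)[] {
        return headers.map((h) => {
          const idx = this.table.findIndex(([n, v]) => n === h[0] && v === h[1]);
          if (idx !== -1) return idx; // repeated header: only an index goes on the wire
          this.table.push(h);         // first occurrence: sent literally (and remembered)
          return h;
        });
      }
    }

    const enc = new ToyHeaderTable();
    const request: Header[] = [
      [':method', 'GET'],
      ['user-agent', 'a-very-long-user-agent-string/1.0'],
      ['cookie', 'session=abcdef0123456789'],
    ];
    console.log(enc.encode(request)); // first request: mostly literals
    console.log(enc.encode(request)); // repeat request: mostly indexes, i.e. far fewer bytes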
Prioritisation of requests
This is one of the more interesting parts of HTTP/2, which has huge potential but has not been optimised for yet. Think of it like this: imagine you have 3 critical CSS files and 3 huge images to download. Under HTTP/1.1, 6 connections would be opened and all 6 items would download in parallel. This may seem fine, but it means the less critical image files are using up bandwidth that would be better spent on the critical CSS files. With HTTP/2 you can say “download the critical CSS first with high priority, and only when it is done look at those 3 image files”. Unfortunately, despite the fact that HTTP/2 has a prioritisation model that allows prioritisation as complex as you want (too complex, some argue!), browsers and servers don’t currently use it well (and website owners and web developers currently have very little way to influence it at all). In fact, bad prioritisation decisions can actually make HTTP/2 slower than HTTP/1.1, as the 6-connection limit is lifted and hundreds of resources can all download in parallel, all fighting over the same bandwidth. I suspect there will be a lot more research and change here in implementations, but there shouldn’t need to be much change in the spec, as it already allows for very complex prioritisation, as I mentioned.
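For what it’s worth, Node’s http2 client exposes these priority knobs directly; a minimal sketch (the origin, paths and weights are illustrative, and as noted above the server may not honour them):

    import * as http2 from 'http2';

    const session = http2.connect('https://example.com'); // placeholder origin

    // Critical CSS: high weight, so the server should fill the pipe with it first.
    const css = session.request({ ':path': '/critical.css' }, { weight: 256 });
    // Large image: low weight, so it should only use leftover bandwidth.
    const img = session.request({ ':path': '/hero.jpg' }, { weight: 16 });

    let done = 0;
    const finish = () => { if (++done === 2) session.close(); };
    [css, img].forEach((stream) => {
      stream.resume();          // discard the bodies in this sketch
      stream.on('end', finish); // close the session once both streams finish
    });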
We’ve been optimising for HTTP/1.1 for decades and have squeezed a lot out of it. I suspect we’ve a lot more to get out of HTTP/2 (and HTTP/3 when it comes along too). Check out my upcoming book if interested in finding out more on this topic.

Since HTTP 2.0 is rolling out, are tricks like asset bundling still necessary?

How can we know how many browsers support HTTP 2.0?
How can we know how many browsers support HTTP 2.0?
A simple Wikipedia search will tell you. The browsers that support it cover at least 60% of the market, and probably more once you pick apart the browsers with less than 10% share each. That's pretty good for something that's only been a standard for a month.
This is a standard people have been waiting for for a long time. It's based on an existing protocol, SPDY, that's had some real world vetting. It gives some immediate performance boosts, and performance in browsers is king. Rapid adoption by browsers and servers is likely. Everyone wants this. Nobody wants to allow their competitors such a significant performance edge.
Since HTTP 2.0 is rolling out, are tricks like asset bundling still necessary?
HTTP/2 is designed to solve many of the existing performance problems of HTTP/1.1. There should be less need for tricks to bundle multiple assets together into one HTTP request.
With HTTP/2, multiple requests can be performed over a single connection. An HTTP/2 server can also push extra content to the client before the client requests it, allowing it to pre-load page assets with a single request, even before the HTML is downloaded and parsed.
This article has more details.
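As a rough sketch of how a server might hint at this, one common approach is to emit a preload Link header; whether a push-aware front end or CDN turns that hint into an actual HTTP/2 PUSH depends on what sits in front of your app, and the paths here are illustrative:

    import * as http from 'http';

    http.createServer((req, res) => {
      if (req.url === '/') {
        // A push-aware proxy/CDN may turn this into an HTTP/2 PUSH; even without
        // that, the browser's preload scanner fetches the asset early.
        res.setHeader('Link', '</css/site.css>; rel=preload; as=style');
        res.setHeader('Content-Type', 'text/html');
        res.end('<link rel="stylesheet" href="/css/site.css"><p>Hello</p>');
      } else {
        res.statusCode = 404;
        res.end();
      }
    }).listen(8080);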
When can we move on to future technologies and stop using those dirty optimizations designed mainly for HTTP/1?
Three things have to happen.
Chrome has to turn on its support by default.
This will happen quickly. Then give a little time for the upgrade to trickle out to your users.
You have to use HTTPS everywhere.
Most browsers right now only support HTTP/2 over TLS. The final spec does not actually require encryption, but I think everyone was expecting HTTP/2 to only work encrypted, to force everyone to secure their web sites. Sort of a carrot/stick: "you want better performance? Turn on basic security." I think the browser makers are going to stick with the "encrypted only" plan anyway. It's in their best interest to promote a secure web.
You have to decide what percentage of your users get degraded performance.
Unlike something like CSS support, HTTP/2 support does not affect your content. Its benefits are mostly about performance. You don't need the HTTP/1.1 hacks: your site will still look and act the same over HTTP/1.1 if you get rid of them. It's up to you when you want to stop putting in the extra work to maintain them.
Like any other hack, hopefully your web framework is doing it for you. If you're manually stitching together icons into a single image, you're doing it wrong. There are all sorts of frameworks which should make this all transparent to you.
It doesn't have to be an all-or-nothing thing either. As the percentage of HTTP/1.1 connections to your site drops, you can do a cost/benefit analysis and start removing the HTTP/1.1 optimizations which are the most hassle and the least benefit. The ones that are basically free, leave them in.
Like any other web protocol, the question is how fast will people upgrade? These days, most browsers update automatically. Mobile users, and desktop Firefox and Chrome users, will upgrade quickly. That's 60-80% of the market.
As always, IE is the problem. While the newest version of IE already supports HTTP/2, it's only available in Windows 10 which isn't even out yet. All those existing Windows users will likely never upgrade. It's not in Microsoft's best interest to backport support into old versions of Windows or IE. In fact, they just announced they're replacing IE. So that's probably 20% of the web population permanently left behind. The statistics for your site will vary.
Large institutional installations like governments, universities and corporations will also be slow to upgrade. Regardless of what browser they have standardized on, they often disable automatic updates in order to more tightly control their environment. If this is a large chunk of your users, you may not be willing to drop the HTTP/1.1 hacks for years.
It will be up to you to monitor how people are connecting to your web site, and how much effort you want to put into optimizing it for an increasingly shrinking portion of your users. The answer is "it depends on who your users are" and "whenever you decide you're ready".
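One low-effort way to do that monitoring is from the page itself: the Resource Timing API reports the protocol that was actually negotiated, and you can beacon it back to your own analytics (a browser-side sketch; the endpoint is a placeholder):

    // Runs in the browser: nextHopProtocol is 'h2' for HTTP/2, 'http/1.1' otherwise.
    const nav = performance.getEntriesByType('navigation')[0] as PerformanceNavigationTiming;
    if (nav) {
      navigator.sendBeacon('/analytics/protocol', nav.nextHopProtocol); // placeholder endpoint
    }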

Overhead of serving pages - JSPs vs. PHP vs. ASPXs vs. C

I am interested in writing my own internet ad server.
I want to serve billions of impressions with as little hardware possible.
Which server-side technologies are best suited for this task? I am asking about the relative overhead of serving my ad pages either as pages rendered by PHP, Java, or .NET, or by coding HTTP responses directly in C and writing some multi-socket I/O monster to serve requests (I assume this one wins, but if my assumption is wrong, that would actually be most interesting).
Obviously all the most effective optimizations are done at the algorithm level, but I figure there have got to be some speed differences at the end of the day that make one method of serving ads better than another. How much overhead does something like Apache or IIS introduce? There's got to be a ton of extra junk in there I don't need.
At some point I guess this is more a question of which platform/language combination is best suited. Please excuse the awkwardly posed question; hopefully you understand what I am trying to get at.
You're going to have a very difficult time finding an objective answer to a question like this. There are simply too many variables:
Does your app talk to a database? If so, which one? How is the data modeled? Which strategy is used to fetch the data?
Does your app talk across a network to serve a request (web service, caching server, etc)? If so, what does that machine look like? What does the network look like?
Are any of your machines load balanced? If so, how?
Is there caching? What kind? Where does it live? How is cached data persisted?
How is your app designed? Are you sure it's performance-optimal? If so, how are you sure?
When does the cost of development outweigh the cost of adding a new server? Programmers are expensive. If reduced cost is your goal with reducing hardware, you'll likely save more money by using a language in which your programmers feel productive.
Are you using 3rd party tools? Should you be? Are they fast? Won't some 3rd party tools reduce your cost?
If you want some kind of benchmark, Trustleap publishes challenge results comparing their G-Wan server using ANSI C scripts, IIS using C#, Apache with PHP, and Glassfish with Java. I include it only because it attempts to measure the exact technologies you mention. I would never settle on a technology without considering the variables above, and more.
Errata:
G-Wan uses ANSI C scripts (rather than "compiled ANSI C" as explained above)
And it transparently turns synchronous (connect/recv/send/close) system calls into asynchronous calls (this works even with shared libraries).
This can help a great deal to scale with database server requests, posts, etc.

Mochiweb's Scalability Features

From all the articles I've read so far about Mochiweb, I've heard this over and over again that Mochiweb provides very good scalability. My question is, how exactly does Mochiweb get its scalability property? Is it from Erlang's inherent scalability properties or does Mochiweb have any additional code that explicitly enables it to scale well? Put another way, if I were to write a simple HTTP server in Erlang myself, with a simple 'loop' (recursive function) to handle requests, would it have the same level of scalability as a simple web server built using the Mochiweb framework?
UPDATE: I'm not planning to implement a full-blown web server supporting every feature possible. My requirements are very specific - to handle POST data from an HTML form with fixed controls.
Probably. :-)
If you were to write a web server that handles each request in a separate process (a lightweight thread in Erlang), you could reach the same kind of "scalability" easily. Of course the feature set would be different, unless you implemented everything Mochiweb has.
Erlang also has great built-in support for distribution among many machines, and it might be possible to use this to gain even more scalability.
MochiWeb isn't scalable itself, as far as I understand it. It's a fast, tiny server library that can handle thousands of requests per second. The way in which it does that has nothing to do with "scalability" (aside from adjusting the number of mochiweb_acceptors that are listening at any given time).
What you get with MochiWeb is a solid web server library, and Erlang's scalability features. If you want to run a single MochiWeb server, when a request comes in, you can still offload the work of processing that request to any machine you want, thanks to Erlang's distributed node infrastructure and cheap message passing. If you want to run multiple MochiWeb servers, you can put them behind a load balancer and use mnesia's distributed features to sync session data between machines.
The point is, MochiWeb is small and fast (enough). Erlang is the scalability power tool.
If you roll your own server solution, you could probably meet or beat MochiWeb's efficiency and "scalability" out of the box. But then you'd have to rethink everything they've already thought of, and you'd have to battle test it yourself.
