An architect at my work recently read Yahoo!'s Exceptional Performance Best Practices guide where it says to use a far-future Expires header for resources used by a page such as JavaScript, CSS, and images. The idea is you set a Expires header for these resources years into the future so they're always cached by the browser, and whenever we change the file and therefore need the browser to request the resource again instead of using its cache, change the filename by adding a version number.
Instead of incorporating this into our build process though, he has another idea. Instead of changing file names in source and on the server disk for each build (granted, that would be tedious), we're going to fake it. His plan is to set far-future expires on said resources, then implement two HttpModules.
One module will intercept all the Response streams of our ASPX and HTML pages before they go out, look for resource links and tack on a version parameter that is the file's last modified date. The other HttpModule will handle all requests for resources and simply ignore the version portion of the address. That way, the browser always requests a new resource file each time it has changed on disk, without ever actually having to change the name of the file on disk.
Make sense?
My concern relates to the module that rewrites the ASPX/HTML page Response stream. He's simply going to apply a bunch of Regex.Replace() on "src" attributes of <script> and <img> tags, and "href" attribute of <link> tags. This is going to happen for every single request on the server whose content type is "text/html." Potentially hundreds or thousands a minute.
I understand that HttpModules are hooked into the IIS pipeline, but this has got to add a prohibitive delay in the time it takes IIS to send out HTTP responses. No? What do you think?
A few things to be aware of:
If the idea is to add a query string to the static file names to indicate their version, unfortunately that will also prevent caching by the kernel-mode HTTP driver (http.sys)
Scanning each entire response based on a bunch of regular expressions will be slow, slow, slow. It's also likely to be unreliable, with hard-to-predict corner cases.
A few alternatives:
Use control adapters to explicitly replace certain URLs or paths with the current version. That allows you to focus specifically on images, CSS, etc.
Change folder names instead of file names when you version static files
Consider using ASP.NET skins to help centralize file names. That will help simplify maintenance.
In case it's helpful, I cover this subject in my book (Ultra-Fast ASP.NET), including code examples.
He's worried about stuff not being cached on the client - obviously this depends somewhat on how the user population has their browsers configured; if it's the default config then I doubt you'd need to worry about trying to second guess the client caching, it's too hard and the results aren't guaranteed, also it's not going to help new users.
As far as the HTTP Modules go - in principle I would say they are fine, but you'll want them to be blindingly fast and efficient if you take that track; it's probably worth trying out. I can't speak on the appropriateness of use RegEx to do what you want done inside, though.
If you're looking for high performance, I suggest you (or your architect) do some reading (and I don't mean that in a nasty way). I learnt something recently which I think will help -let me explain (and maybe you guys know this already).
Browsers only hold a limited number of simultaneous connections open to a specific hostname at any one time. e.g, IE6 will only do 6 connections to say www.foo.net.
If you call your images from say images.foo.net you get 6 new connections straight away.
The idea is to seperate out different content types into different hostnames (css.foo.net, scripts.foo.net, ajaxcalls.foo.net) that way you'll be making sure the browser is really working on your behalf.
http://code.google.com/p/talifun-web
StaticFileHandler - Serve Static Files in a cachable, resumable way.
CrusherModule - Serve compressed versioned JS and CSS in a cachable way.
You don't quite get kernel mode caching speed but serving from HttpRuntime.Cache has its advantages. Kernel Mode cache can't cache partial responses and you don't have fine grained control of the cache. The most important thing to implement is a consistent etag header and expires header. This will improve your site performance more than anything else.
Reducing the number of files served is probably one of the best ways to improve the speed of your website. The CrusherModule combines all the css on your site into one file and all the js into another file.
Memory is cheap, hard drives are slow, so use it!
Related
For a long time I've been updating ASP.NET pages on the server and never find the correct way to make changes visible on files like CSS and images.
I know if a append something in the URL the browser will think the file is another one:
<img src="/images/myLogo.png?v=1"/>
or perhaps changing its name:
<img src="/images/myLogo.v1.png"/>
Unfortunately it does not look the correct way. In a case were I'm using App_Themes the files in this folder are automatically injected in the page in a way I can't easily change the URL.
So my question is:
When I'm publishing de ASP.NET Application on the server what is the correct way to signal to IIS (and it notify browser after that) that a file was changed? It is not automatic? Should I change some configuration in IIS or perhaps make some "decoration" in the code?
I've already tried many questions here in SO like "ASP.NET - Invalidate browser cache", "How to refresh the browser cache of an image?", "Handle cached images? How to get the browser to show the new version?", and even "What is an elegant way to force browsers to reload cached CSS/JS files?" but none of them actually take another aproach else in a way you must handle it manually in the code instead of IIS or ASP.NET configuration.
The closer I could find is "Asking browsers to cache our images (ASP.NET/IIS)" where they set expiration but not based on the fact the files were update. Instead they used days or hour to cache those file so they would updated even when no changes were made.
I'm want to know if IIS or ASP.NET offers something related to this, automatically send to the browser that the files was changed. Is it possible/built in?
The options you have to update the browser side, cached item are:
Change the file name
Add url parameter
Place it on cache for a limited time (eg for couple of hours)
Compare the date-time of creation.
Signaling with eTag.
With the three two you avoiding one server call for each item, but the third option load it again after some time.
With the others you have to make one call to the server to see if needs to be load it again.
So you can not have all here, there is not correct way, and you need to chose what is the best for you, and what you can do. The faster from client perspective is the (1) and (2) options.
The direct answer to your question is to use eTag, or date-time compare of the file creation, but you loose that way, a call to the server, you only win the size of what is travel back.
Some more links:
http eTag
How do I support ETags in ASP.NET MVC?
Configuring ETags with Http module in asp.net
How to control web page caching, across all browsers?
Jquery getScript caching
and you can find even more.
This discussion around gzip, minified, and combining both for maximum compression effects made me wonder if I can get a net performance effect by placing jQuery libraries directly inside my HTML file?
I already do this for icon sprites, which I base64 encode and then load via CSS datasource, in an effort to minimize the number of HTTP requests.
Now I ask: do you know if loading the jQuery library in its raw min+gzip form, as available on the official production download link, would be more efficient compared to loading it via the Google AJAX libraries that are hosted on their CDN?
TL;DR:
You should always keep common items (like a JS library) in external files, because that allows for caching on your server, in the browser, and at many nodes along the way. Caching times dramatically outweigh compression or request times when looking at the overall speed of a page.
Longer Version:
There are a number of things to consider for this question. Here are the ones that came off the top of my head.
Minifying scripts and styles is always good for speed. I don't think there's much to discuss here.
Compression (and extraction in the browser) has an overhead. Firing up the gzip module takes time away from sending the request, so for small files or binary files (images, etc.), it's usually better to skip the gzip. Something large and consisting mostly of ascii (like a JS library), however, is a good thing to compress, so that checks out.
Reducing HTTP requests is generally a good thing to do. Note, though, that ever since HTTP 1.1 (supported in all major browsers), a new HTTP request does not mean a new socket connection. With keep-alive, the same socket connection can serve the webpage and all the affiliated files in one go. The only extra overhead on a request is the request and response headers which can be negligible unless you have a lot of small images/scripts. Just make sure these files are served from a cookie-free domain.
Related to point 3, embedding content in the page does have the benefit of reducing HTTP requests, but it also has the drawback of adding to the size of the page itself. For something large like a JS library, the size of library vastly outweighs the size of the HTTP overhead needed for an extra file.
Now here's the kicker: if the library is embedded in the page, it will almost never be cached. For starters, most (all?) major browsers will NOT cache the main HTML page by default. You can add a CACHE-CONTROL meta tag to the page header if you want, but sometimes that's not an option. Furthermore, even if that page is cached, the rest of your pages (that are probably also using the library) will now need to be cached as well. Suddenly you have your whole site cached on the user's machine including many copies of the JS library, and that's probably not what you want to do to your users. On the other hand, if you have the JS library as an external file, it can be cached by itself just once, especially (as #Barmar says) if the library is loaded from a common CDN like Google.
Finally, caching doesn't only happen in the client's browser. If you're running an enterprise-level service, you may have multiple server pools with shared and individual caches, CDN caches that have blinding speed, proxy caches on the way to your domain, and even LAN caches on some routers and other devices. So now I direct you back to the TL;DR at the top, to say that the speed-up you can gain by caching (especially in the browser) trumps any other consideration you are likely to encounter.
Hope that helps.
I am building a AJAX intensive web application (using ASP.NET, JQuery, and WCF web services) and am looking into building an HTTP Handler that handles script combining and compression for my JavaScript files and my CSS files. I know not combining the scripts is generally a less preferred approach, and I'm sure it's probably the way I will end up going, but my question is this...
Since so many of my JS files are due to the controls I use don't they get cached by the browser after the first load anyway? Since so many of these controls can be found on many of the pages of my web application is it actually faster to combine all of my scripts and serve that one file (which will vary for every page) or to serve the individual files which will get cached? I guess what I'm getting at is, by enabling script combining am I now losing part of the caching ability of the browser? I know I can cache the combined script, but the combined script will be different for every page whereas with the individual control scripts each one will be cached and the number of new scripts will be minimal for each page call.
Does this make any sense? Thoughts?
The fewer number of JS files you serve, the faster your pages will be, due a smaller number of round trips to the server. I would manually put all the common js code into one file (or as few files as possible), all the css code into one file etc., and not worry about using a handler to combine the files. The handler is going to take processing time to combine the files, so you are going to pay that penalty also. You can turn on gzip compression on IIS and have it handle that for you. I would run something like YUI Compressor on the Javascript files used in production.
If the handler changes the file contents from page to page, browsers won't be able to cache it. If you are using SSL this point will be moot though as the browser won't cache the files anyway.
EDIT
I've been corrected some browsers (like FF) can cache SSL content but not all.
As other mentioned: minify, gzip and turn on caching (set expire time and see to that you support etags) on the one static JS file you have and the one static CSS file you have. On top of this it's recommended to load your CSS files as early as possible and your JS files as late as possible (JS file loading is blocking other downloads and it's faster for the browser to render the page if it got the CSS as soon as possible). Sprites also help if you have many small images/icons. Loading static content from sub domains will help the browser to have more simultaneous downloads and you could drop all you cookies for those sub domains to lower the http header size.
You could consult YSlow for performance analysis, it's a great tool!
generally a less preferred
is it for live js? I thinking of JavaScriptMVC which compresses all the code into one file when complied for production, not development... It's a heavy weight I believe.
Usually it's better to combine all scripts. In this case you'll reduce http overhead. Minified controls scripts usually are quite small. In rare case when you are using quite large control you could not combine it to main js.
I've recently started embedding JavaScript and CSS files into our common library DLLs to make deployment and versioning a lot simpler. I was just wondering if there is any reason one might want to do the same thing with a web application, or if it's always best to just leave them as regular files in the web application, and only use embedded resources for shared components?
Would there be any advantage to embedding them?
I had to make this same decision once. The reason I chose to embed my JavaScript/CSS resources into my DLL was to prevent tampering of these files (by curious end users who've purchased my web application) once the application's deployed.
I doubting and questioning the validity of Easement's comment about how browsers download JavaScript files. I'm pretty sure that the embedded JavaScript/CSS files are recreated temporarily by ASP.NET before the page is sent to the browser in order for the browser to be able to download and use them. I'm curious about this and I'm going to run my own tests. I'll let you know how it goes....
-Frinny
Of course if anyone who knew what they were doing could use the assembly Reflector and extract the JS or CSS. But that would be a heck of a lot more work than just using something like FireBug to get at this information. A regular end user is unlikely to have the desire to go to all of this trouble just to mess with the resources. Anyone who's interested in this type of thing is likely to be a malicious user, not the end user. You have probably got a lot of other problems with regards to security if a user is able to use a tool like the assembly reflector on your DLL because by that point your server's already been compromised. Security was not the factor in my decision for embedding the resources.
The point was to keep users from doing something silly with these resources, like delete them thinking they aren't needed or otherwise tamper with them.
It's also a lot easier to package the application for deployment purposes because there are less files involved.
It's true that the DLL (class library) used by the pages is bigger, but this does not make the pages any bigger. ASP.NET generates the content that needs to be sent down to the client (the browser). There is no more content being sent to the client than what is needed for the page to work. I do not see how the class library helping to serve these pages will have any effect on the size of data being sent between the client and server.
However, Rjlopes has a point, it might be true that the browser is not able to cache embedded JavaScript/CSS resources. I'll have to check it out but I suspect that Rjlopes is correct: the JavaScript/CSS files will have to be downloaded each time a full-page postback is made to the server. If this proves to be true, this performance hit should be a factor in your decision.
I still haven't been able to test the performance differences between using embedded resources, resex, and single files because I've been busy with my on endeavors. Hopefully I'll get to it later today because I am very curious about this and the browser caching point Rjlopes has raised.
Reason for embedding: Browsers don't download JavaScript files in parallel. You have a locking condition until the file is downloaded.
Reason against embedding: You may not need all of the JavaScript code. So you could be increasing the bandwidth/processing unnecessarily.
Regarding the browser cache, as far as I've noticed, response on WebRecource.axd says "304 not modified". So, I guess, they've been taken from cache.
I had to make this same decision once. The reason I chose to embed my JavaScript/CSS resources into my DLL was to prevent tampering of these files (by curious end users who've purchased my web application) once the application's deployed.
Reason against embedding: You may not need all of the JavaScript code. So you could be increasing the bandwidth/processing unnecessarily.
You know that if somebody wants to tamper your JS or CSS they just have to open the assembly with Reflector, go to the Resources and edit what they want (probably takes a lot more work if the assemblies are signed).
If you embed the js and css on the page you make the page bigger (more KB to download on each request) and the browser can't cache the JS and CSS for next requests. The good news is that you have fewer requests (at least 2 if you are like me and combine multiple js and css and one), plus javascripts have the problem of beeing downloaded serially.
Any suggestions on how to do browser caching within a asp.net application. I've found some different methods online but wasn't sure what would be the best. Specifically, I would like to cache my CSS and JS files. They do change, however, it is usually once a month at the most.
Another technique is to stores you static images, css and js on another server (such as a CDN) which has the Expires header set properly. The advantage of this is two-fold:
The expires header will encourage browsers and proxies to cache these static files
The CDN will offload from your server serving up static files.
By using another domain name for your static content, browsers will download faster. This is because serving resources from four or five different hostnames increases parallelization of downloads.
If the CDN is configured properly and uses cookieless domain then you don't have unnecessary cookies going back and forth.
Its worth bearing in mind that even without Cache-Control or Expires headers most browsers will cache content such as JS and CSS. What should happen though is the browser should request the resource every time its needed but will typically get a "304 Unmodified" response and the browser then uses the cached item. This is can still be quite costly since its a round trip to the server but the resource itself isn't sent so the bytes transfered is limited.
IE left with no specific instructions regarding caching will by default use its own heuristics to determine if it should even bother to re-request an item its cached. This despite not being explicitly told that it can cache a resource. Its hueristics are based on the Last-Modified date of the resource, the older its the less likely it'll have changed now is its typical reasoning. Very wooly.
Frankly if you want to make a site perfomant you need to have control over such cache settings. If you don't have access to these settings then don't wouldn't worry about performance. Just inform the sponsor that it may not perform well because they haven't facilitated a platform that lets you deliver that.
You best bet to do this is to set an Expires header in IIS on the folders you want the content cached. This will tell most modern browsers and proxies to cache this static content. In IIS 6:
Right click on the folder (example CSS or JS) you want to be cached by the browser.
Click properties
Go to the HTTP Headers Tab
Check "Enabled content expiration"
Set some long period for expiration, like "Expires after 90 days"
Yahoo Developer's Blog talks about this technique.
Unless you configure IIS to give asp.net control of js/css/image requests it won't see them by default, hence your best plan (for long term maintainability) is to deliberately tweak the response headers at your firewall/trafficmanager/server or (better and what most of the world does at this point) to version your files in path, i.e:
Instead of writing this in your mark-up:
http://www.foo.com/cachingmakesmesad.css
Use this:
http://www.foo.com/cachingmakesmesad.css?v1
..and change the version number when you need to effectively clear cache. If that's every time then you could even append a GUID or datestamp instead, but I can't think of any occasion where I would want to do that really.
I thought your question was anti-cache but re-reading it I see I wasted a good answer there :P
Long story short, browsers are normally very aggressively pro-caching "simple" resources so you shouldn't have to worry about this, but if you did want to do something about it you would have to have access to a firewall/trafficmanager/IIS for the reasons above (ASP.NET won't be given a chance by default).
However... there is no way you can absolutely force caching, and nor should you. What is and isn't cached is rightfully the decision of the end-user, all you can do is strongly request.
In .net you can set up your JavaScript, CSS and Images as embedded resources.
.Net will then handle the file expiration for you.
The downside to this approach is that you have to do a new build for each set of changes (this might be an upside, depending on your deployment and workflow).
You could also use ETags, but from what I understand in some cases it doesn’t work well if you have mix of IIS and apache Webservers hosting your images, (or if you plan to switch in the future).
You can just make sure the file date is newer, and let the server handle it for you, but you’ve got to make sure the server is configured correctly.
You can cache static content by adding following code in web.config
<system.webServer>
<staticContent>
<clientCache httpExpires="Tue, 12 Apr 2016 00:00:00 GMT" cacheControlMode="UseExpires" />
</staticContent>
</system.webServer>
See the clientCache documentation for more details.