Custom webserver caching - http

I'm working with a custom webserver on an embedded system and having some problems correctly setting my HTTP Headers for caching.
Our webserver is generating all dynamic content as XML and we're using semi-static XSL files to display it with some dynamic JSON requests thrown in for good measure along with semi-static images. I say "semi-static" because the problems occur when we need to do a firmware update which might change the XSL and image files.
Here's what needs to be done: cache the XSL and image files and do not cache the XML and JSON responses. I have full control over the HTTP response and am currently:
Using ETags with the XSL and image files, using the modified time and size to generate the ETag
Setting Cache-Control: no-cache on the XML and JSON responses
As I said, everything works dandy until a firmware update when the XSL and image files are sometimes cached. I've seen it work fine with the latest versions of Firefox and Safari but have had some problems with IE.
I know one solution to this problem would be to simply rename the XSL and image files for each version (e.g. logo-v1.1.png, logo-v1.2.png) and set the Expires header to a date in the future, but this would be difficult with the XSL files and I'd like to avoid it.
Note: There is a clock on the unit, but it requires the user to set it and might not be 100% reliable, which might be what is causing my caching issues when using ETags.
What's the best practice that I should employ? I'd like to avoid as many webserver requests as possible but invalidating old XSL and image files after a software update is the #1 priority.

Are we working on the same project? I went down a lot of dead ends figuring out the best way to handle this.
I set my .html and my .shtml files (dynamic JSON data) to expire immediately. ("Cache-Control: no-cache\r\nExpires: -1\r\n")
Everything else is set to expire in roughly 9 years (3360 days). ("Cache-Control: max-age=290304000\r\n")
My makefile runs a Perl script over all the .html files and identifies what you call "semi-static" content (images, JavaScript, CSS). The script then computes an MD5 checksum of each of those files and appends it to the file's URL as a query string:
<script type="text/javascript" src="js/all.js?7f26be24ed2d05e7d0b844351e3a49b1"></script>
The server ignores everything after the question mark, but a browser will only reuse its cached copy if everything between the quotes matches.
I use all.js and all.css because everything's combined and minified using the same script.
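The checksum step described above can be sketched in Python rather than Perl. This is only an illustration, not the script from this answer: the function names, the regular expression, and the list of extensions are all assumptions.

```python
import hashlib
import re
from pathlib import Path

# Hypothetical pattern: src="..." or href="..." pointing at a static asset,
# with any existing query string captured so it can be replaced.
ASSET_ATTR = re.compile(
    r'(?P<attr>src|href)="(?P<url>[^"?#]+\.(?:js|css|png|gif|jpg))(?:\?[^"]*)?"'
)


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's contents."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def add_checksums(html: str, root: Path) -> str:
    """Append ?<md5> to every local asset URL found in the HTML text."""

    def repl(match: "re.Match[str]") -> str:
        url = match.group("url")
        asset = root / url
        if not asset.is_file():
            # External or missing files are left untouched.
            return match.group(0)
        return f'{match.group("attr")}="{url}?{md5_of(asset)}"'

    return ASSET_ATTR.sub(repl, html)
```

Because the checksum only changes when the file content changes, a firmware update that leaves an asset untouched keeps its URL, and the browser's cached copy stays valid.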
Out of curiosity, what embedded webserver are you using?

Try Cache-Control: no-store. no-cache tells the client that the response may be stored; it just has to be revalidated with the origin server before it is reused, and in practice a stored copy may still be shown when the cache can't contact the origin server.
BTW, setting an ETag alone won't make the response cacheable; you should also set Cache-Control: max-age=nnn.
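A minimal sketch of a handler that sends both an ETag and Cache-Control: max-age, and answers a matching If-None-Match with 304. The function names are hypothetical, and hashing the body is an assumption on my part (the question uses modification time and size); a content hash has the side benefit of not depending on the unit's clock.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Return a quoted ETag derived from the content, not from timestamps."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def etag_matches(if_none_match: Optional[str], etag: str) -> bool:
    """Check an If-None-Match header value against our ETag (weak comparison)."""
    if not if_none_match:
        return False
    for candidate in if_none_match.split(","):
        candidate = candidate.strip()
        if candidate.startswith("W/"):
            candidate = candidate[2:]  # ignore the weak-validator prefix
        if candidate == "*" or candidate == etag:
            return True
    return False


def static_response(
    body: bytes, if_none_match: Optional[str] = None, max_age: int = 3600
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a cacheable static file."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    if etag_matches(if_none_match, etag):
        return 304, headers, b""  # client copy is current; send no body
    return 200, headers, body
```

On an embedded target the hash would more likely be computed once at build time and stored next to the file than recomputed per request.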
You can check how your responses will be treated with http://redbot.org/

Related

Putting dynamic CSS URLs in HTTP headers with Fastly CDN

I'm generating dynamic CSS URLs for cache-busting. I.e. they're in the format styles-thisisthecontenthash123.css.
I also want to use HTTP Link headers to load the files slightly faster. I.e. have the header Link: <styles-thisisthecontenthash123.css>; rel=stylesheet
I'm pretty sure it's possible to do this in Fastly using VCL, but I'm not familiar enough with the ecosystem to figure it out. The CSS URL is in index.html, which is cached. I'm thinking I can open index.html and maybe use regex to parse out the CSS URL. How would I do this?
If I'm understanding your question correctly, you want to include a link header for all requests for index.html. You can do that with Fastly, but if the URL for the CSS file is changing you're not going to be able to pull that info out with VCL (you can't inspect the response body).
You could use edge dictionaries and whenever your CSS filename changes, update the reference via the API.
Thing is, if you're going to make an API call whenever the file changes, you might as well keep the filename consistent (styles.css) and send a cache invalidation (purge) whenever you publish a new version. Fastly will clear the cache in ~150ms, so then all you have to do is add the header, which can be done in the Fastly web portal with a condition.

How long until a cached CSS file gets updated in the browser?

How long will it take for a cached CSS file to get updated in the browser if I don't do anything specific?
I googled this but haven't found a clear answer. I know I can use file.css?v=1 to force the browser to load the updated version, or I can use the browser's hard reload feature. But what if I do neither? So far the browser always loads the old cached version.
Without a hard reload or any other server setup, how long will it take a browser to update the cached CSS file? Will the cached version stay there forever (unless the cache fills up and it is evicted to make space)?
Browsers generally follow the IETF spec for HTTP caching, which was introduced with HTTP/1.1. Where they vary is in how they treat content served without a Cache-Control (or Expires) header: each browser then falls back on its own heuristics to decide how long the cached copy stays fresh. Ultimately you can't rely on the hope that your updated file will be loaded by the client unless you either use a URL cache-buster, as you mentioned, or serve your content with proper Cache-Control headers.
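When no Cache-Control or Expires header is present, a cache may estimate a freshness lifetime on its own. The HTTP caching spec mentions 10% of the time since Last-Modified as a typical choice; the sketch below assumes that fraction, and real browsers may use something else. The function name and parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def heuristic_freshness(
    date: datetime, last_modified: datetime, percent: int = 10
) -> timedelta:
    """Estimate how long a response without explicit expiry is treated as fresh.

    `date` is the response's Date header and `last_modified` its
    Last-Modified header; the result is `percent` of the gap between them.
    """
    age_at_response = date - last_modified
    if age_at_response < timedelta(0):
        # A Last-Modified in the future gives no usable estimate.
        return timedelta(0)
    return age_at_response * percent / 100


# A stylesheet last modified 10 days before it was fetched.
fetched = datetime(2020, 1, 11, tzinfo=timezone.utc)
modified = datetime(2020, 1, 1, tzinfo=timezone.utc)
lifetime = heuristic_freshness(fetched, modified)
```

Under this assumption a file that had not changed for 10 days would be reused for about a day before the browser checks again, which is why an old stylesheet can appear to "stick" for a while.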

Loading src files once per session in asp.net

I have way too many pages in the application that basically load the same set of xml and js files for client side interaction and validation. So, I have about dozen lines like this one <script type="text/javascript" src="JS/CreateMR.js"></script> or like this one <xml id="DefaultDataIslands" src="../XMLData/DataIslands.xml">.
These same files are included in every page, so the browser sends requests for them every time. It takes about 900ms just to load these files.
I am trying to find a way to load them on just the login page, and then use that temp file as source. Is it possible to do so? If yes, how and where should I start?
P.S. A link to a tutorial will work too, as I have currently no knowledge about that.
Edit:
I can't cache the whole page, because the pages are generated at runtime based on the different possible view modes. I can only cache the js and xml file. Caching everything might be a problem.
Anyway, I am reading through the articles suggested to figure out how to do it. So, I may not be able to accept any answer right away, while I finish reading and try to implement it in one page.
Edit:
Turns out caching is already enabled, it is just that my server is acting crazy. Check the screenshot below.
[Screenshots: network request timings with cache and without cache]
As you see, with cache, it is actually taking more time to process some of the requests. I have no idea what that problem is, but I guess I should go to the server stack exchange to figure this out.
As for the actual problem, it turns out I don't have to do anything to enable caching of XML and JS files. I had no idea browsers automatically cache JS files without any specific tag.
Totally possible and in fact recommended.
Browsers cache content that has been sent down with appropriate HTTP caching headers and will not request it again until the cache has expired. This will make your pages faster and more responsive and your server's load much lighter.
Here is a good read to get you started.
Here is ASP.NET MVC caching guide. It focuses on caching content returned from controllers.
Here is a read about caching static content on IIS with ASP.NET MVC.
Basically, you want to use browser caching mechanism to cache the src files after the first request.
If you're using the F12 tools in your browser to debug network requests, make sure the "Disable cache" option is unchecked. Otherwise, it forces the browser to ignore cached files.
Make sure your server sends and respects cache headers: it should return HTTP status 304 Not Modified when the browser revalidates a static file it has already fetched.
Take a look at Asp.Net Bundling and minification - if you have for example multiple js source files, you could bundle them into one file that will be cached on the first request.
Additionally, if you use external js libraries, you could download them from a CDN instead of your server - this will both offload your server and enable user browser to use cached script version (meaning - if some other page that user has visited also used the same script, browser should already have it cached).
One approach is caching static files via IIS by adding a <clientCache> element to the web.config file. The <clientCache> element of the <staticContent> element specifies cache-related HTTP headers that IIS 7 and later sends to Web clients, which control how Web clients and proxy servers will cache the content that IIS 7 and later returns.
How to configure static content cache per folder and extension in IIS7?
Client Cache
For more info on client-side caching, read this part of the Ultra-Fast ASP.NET 4.5 book:
Browser Cache and Caching Static Content
Another approach is caching portions of a page.
If you are using Web Forms:
Caching Portions of an ASP.NET Page
and if you are using MVC, use Donut Hole Caching
ASP.NET MVC Extensible Donut Caching
Donut Caching and Donut Hole Caching with Asp.Net MVC
The browser has to ask the server whether the file has been modified since it was cached; that is why you see HTTP status code 304 (Not Modified). Read more at https://httpstatuses.com/304.
As this is asp.net please make sure you are first running it with
<compilation debug="false"/>
as enabling debugging has some side effects, including:
"All client-javascript libraries and static images that are deployed via WebResources.axd will be continually downloaded by clients on each page view request and not cached locally within the browser."
More read from https://blogs.msdn.microsoft.com/prashant_upadhyay/2011/07/14/why-debugfalse-in-asp-net-applications-in-production-environment/

Making a content generator Apache module that only works for text/html requests

The title of this question may be a bit misleading. I couldn't quite think of anything better.
Here is my problem. I am developing an Apache module that needs to manipulate a bit of content in the requested HTML document (this document can be a file on the disk or may be dynamically generated by CGI or PHP) and so I am using libxml2 with it.
I developed something working, but the problem is that when the browser requests for a page, let's say
http://localhost/a.html
the module does its job. But if that page has references to a JavaScript file, a.js, or a stylesheet file, a.css, they are not getting served.
The reason as I perceived by examining the logs is that, as the browser sends requests for a javascript file let's say
[http://localhost/a.js] //putting [] because of limit of 1 url per post.
Apache again runs my module, the module uses a HTML parser, so when the content is not HTML, it gives error and exits, the request is abandoned.
How can I make my module work only for text/html requests?
I don't know how to make your module decline the request, but if you can't figure that out you can configure it in httpd.conf so that the handler only runs for .html files:
AddHandler my-handler .html

Browser Caching in ASP.NET application

Any suggestions on how to do browser caching within a asp.net application. I've found some different methods online but wasn't sure what would be the best. Specifically, I would like to cache my CSS and JS files. They do change, however, it is usually once a month at the most.
Another technique is to store your static images, CSS and JS on another server (such as a CDN) which has the Expires header set properly. The advantage of this is two-fold:
The expires header will encourage browsers and proxies to cache these static files
The CDN will offload from your server serving up static files.
By using another domain name for your static content, browsers will download faster. This is because serving resources from four or five different hostnames increases parallelization of downloads.
If the CDN is configured properly and uses cookieless domain then you don't have unnecessary cookies going back and forth.
It's worth bearing in mind that even without Cache-Control or Expires headers most browsers will cache content such as JS and CSS. What should happen, though, is that the browser requests the resource every time it's needed but typically gets a "304 Not Modified" response and then uses the cached item. This can still be quite costly since it's a round trip to the server, but the resource itself isn't sent, so the number of bytes transferred is limited.
IE, left with no specific instructions regarding caching, will by default use its own heuristics to determine whether it should even bother to re-request an item it has cached. This is despite not being explicitly told that it can cache a resource. Its heuristics are based on the Last-Modified date of the resource: the older it is, the less likely it is to have changed now, is its typical reasoning. Very woolly.
Frankly, if you want to make a site performant you need to have control over such cache settings. If you don't have access to these settings then I wouldn't worry about performance. Just inform the sponsor that it may not perform well because they haven't facilitated a platform that lets you deliver that.
Your best bet is to set an Expires header in IIS on the folders whose content you want cached. This will tell most modern browsers and proxies to cache this static content. In IIS 6:
Right click on the folder (example CSS or JS) you want to be cached by the browser.
Click properties
Go to the HTTP Headers tab
Check "Enable content expiration"
Set some long period for expiration, like "Expires after 90 days"
Yahoo Developer's Blog talks about this technique.
Unless you configure IIS to give ASP.NET control of JS/CSS/image requests, it won't see them by default. Your best plan (for long-term maintainability) is therefore either to deliberately tweak the response headers at your firewall, traffic manager, or server, or (better, and what most of the world does at this point) to version your file URLs, i.e.:
Instead of writing this in your mark-up:
http://www.foo.com/cachingmakesmesad.css
Use this:
http://www.foo.com/cachingmakesmesad.css?v1
..and change the version number when you need to effectively clear cache. If that's every time then you could even append a GUID or datestamp instead, but I can't think of any occasion where I would want to do that really.
I thought your question was anti-cache but re-reading it I see I wasted a good answer there :P
Long story short, browsers are normally very aggressively pro-caching "simple" resources so you shouldn't have to worry about this, but if you did want to do something about it you would have to have access to a firewall/trafficmanager/IIS for the reasons above (ASP.NET won't be given a chance by default).
However... there is no way you can absolutely force caching, and nor should you. What is and isn't cached is rightfully the decision of the end-user, all you can do is strongly request.
In .net you can set up your JavaScript, CSS and Images as embedded resources.
.Net will then handle the file expiration for you.
The downside to this approach is that you have to do a new build for each set of changes (this might be an upside, depending on your deployment and workflow).
You could also use ETags, but from what I understand in some cases they don't work well if you have a mix of IIS and Apache web servers hosting your images (or if you plan to switch in the future).
You can just make sure the file date is newer, and let the server handle it for you, but you’ve got to make sure the server is configured correctly.
You can cache static content by adding the following to web.config:
<system.webServer>
  <staticContent>
    <clientCache httpExpires="Tue, 12 Apr 2016 00:00:00 GMT" cacheControlMode="UseExpires" />
  </staticContent>
</system.webServer>
See the clientCache documentation for more details.
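The httpExpires value is an HTTP-date, and a fixed date like the one above eventually passes. A minimal sketch of generating such a value at deploy time, assuming Python is available in the build; the helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from typing import Optional


def http_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as an HTTP-date in GMT."""
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def expires_after(days: int, now: Optional[datetime] = None) -> str:
    """Return an HTTP-date `days` from `now` (defaults to the current time)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return http_date(now + timedelta(days=days))


# Reproduces the date format used in the web.config snippet above.
example = http_date(datetime(2016, 4, 12, tzinfo=timezone.utc))
```

The resulting string can be written into the httpExpires attribute by whatever step produces the deployed web.config.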

Resources