Prevent bundle from responding if cache buster doesn't match content - asp.net

I'm using Bundling and Minification across a server farm where there is a cross-over period of old and new servers.
The issue I have is that the old servers are caching the content of the new bundle cache buster URL.
For example the new HTML is cached with the new bundle URL:
<script src="/bundle.css?v=RBgbF6A6cUEuJSPaiaHhywGqT7eH1aP8JvAYFgKh"></script>
This then makes a request to an old server which hasn't yet been updated with the new CSS code and this then cached.
Any subsequent calls to the new bundle URL will then return the old code.
Therefore is there a way of checking that the content of the bundle matches the hashed cache buster? And if it doesn't throw a 404 for example.
Using my example above when the request goes back to the old server for the bundle it would check the contents of the bundle, generate a content hash and compare this to the querystring.
In this instance the cache-buster wouldn't match the actual content hash and a 404 would be returned.
Eventually a user would hit a new server with the bundle request and the correct content would be cached.

We're bound to encounter the same issue ourselves soon, but we've been sticking to only 2 update domains (split the servers in half so that there's not more than one version running at a time).
As far as I see it there are 4 possible alternatives:
Have your static content always point to an up to date server. This could be done (depending on your configuration) by IP address or by using a URL that you know has already been updated (if you have a server that gets updated first).
Configure your load balancer so that requests from the same IP address end up on the same system (if your static content is served from your app servers). If this can't be done at the load balancer level then you can do it further up by configuring different IP addresses for different environments and then swapping their DNS records around.
Implement a handler in ASP.NET that listens for CSS files, and checks whether the hash of the bundle is what it expected to be. You will probably need a singleton object to be storing these when they're generated when your app loads. This can then either return a 404, 301 (to get them to retry) or return the old version but instruct it to not cache the results. Note that with HttpPipelining you're not likely to hit a different server so a redirect might not help.
Have a config flag that is set while you're doing a deploy that changes all of your cache-busting URLs to be the current date. This will effectively disable all caching until your deploy is completed, meaning that any "wrong" assets delivered will not be kept.
Is this a problem you're actually seeing, or is it hypothetical? Unless your site is very high traffic and your deploys take a good few minutes it's not something you're likely to see. You'll want to be wary of returning 404s as sometimes the wrong stylesheet is better than no stylesheet.

Related

Force nginx cache

We are stuck with a non-trivial question with nginx.
Our web server (Nginx) has a cache enabled for all links.
When we do newsletter campaigns, each customer accesses our website using a unique URL and this means that the nginx cache is not used.
This results in a high load on the website every time we send a newsletter, so we need to deal with this in some way.
An example of link
https://www.example.com/deals/calendarraa?utm_content=calendars_link&xnpe_tifc=hFzDOFnDbuPsOfnDx9pZVjQsVuU_O.bpx.US4FzXxDxZ4.VdxDPuxuYj4nTT&utm_source=newsletter&utm_campaign=deals%2520offer%3A%2520calendar%2520and%2520greeting%2520cards&utm_medium=email
Is it possible to configure nginx so that in case we have params starting with utm_ &/or xnpe_tifc, e.g. we ignore these parameters and nginx serves website with URL without them?
That means: https://www.example.com/deals/calendarra? or even better just https://www.example.com/deals/calendarra

CDN: Forward to a different resource instead of redirect

I need to send different resources (specially images) for same urls depending on a complex logic based on different factors (cookie, IP, time, random). I want to take advantage of CDNs (cache, availability, proximity). So, I want this CDN to make a call to my server in order to decide which resource serve to any request. It is very important to not use redirects, so the user will never see a 30X status code.
For clarification:
User makes a request to http://resources.mydomain.com/img/a.jpg, which domain is under CDN
CDN makes a call to my server, sending url requested, cookies and user IP
My server returns the name of the real resource to serve (http://hidden.mydomain.com/img/a-version3.jpg)
CDN requests that image if not in cache
CDN responds to user request sending a-version3.jpg data, but without any redirect
Is it possible using any current commercial solution?
Yes, I think it is already supported by CDNetworks long time ago.
It is called "Origin Logic Control" now. You can check the description from http://www.cdnetworks.com/wp-content/uploads/2013/08/CDNetworks-ContentAccel-DS-EN2.pdf:
Allows a customer’s domain to require checking with the origin on every request.
You can return a special HTTP header (or special HTTP body, I am not sure now) to tell CDNetworks to return resources directly (and using cached version if available), not 30x status code.
You can enable Redirect Chasing to get what you are looking for. Alternatively, look at the Akamai blog post on Edge Redirect for a faster option.

Do any CDNs allow rewriting request URI's so that client-side routing plays nicely with browser refreshes?

I have an HTML5 app written in static html/js/css (it's actually written in Dart, but compiles down to javascript). I'm serving the application files via CDN, with the REST api hosted on a separate domain. The app uses client-side routing, so as the user goes about using the app, the url might change to be something like http://www.myapp.com/categories. The problem is, if the user refreshes the page, it results in a 404.
Are there any CDN's that would allow me to create a rule that, if the user requests a page that is a valid client-side route, it would just return the (in my case) client.html page?
More detailed explanation/example
The static files for my web app are stored on S3 and served via Amazon's CloudFront CDN. There is a single HTML file that bootstraps the application, client.html. This is the default file served when visiting the domain root, so if you go to www.mysite.com the browser is actually served www.mysite.com/client.html.
The web app uses client-side routing. Once the app loads and the user starts navigating, the URL is updated. These urls don't actually exist on the CDN. For example, if the user wanted to browse widgets, she would click a button, client-side routing would display the "widgets" view, and the browser's url would update to www.mysite.com/widgets/browse. On the CDN, /widgets/browse doesn't actually exist, so if the user hits the refresh button on the browser, they get a 404.
My question is whether or not any CDNs support looking at the request URI and rewriting it. So, I could see a request for /widgets/browse and rewrite it to /client.html. That way, the application would be served instead of returning a 404.
I realize there are other solutions to this problem, namely placing a server in front of the CDN, but it's less ideal.
I do this using CloudFront, but I use my own server running Apache to accomplish this. I realize you're using a server with Amazon, but since you didn't specify that you're restricted to that, I figured I'd answer with how to accomplish what you're looking to do anyway.
It's pretty simple. Any time you query something that isn't already in the cache on CloudFront, or exists in the Cache but is expired, CloudFront goes back to your web server asking it to serve up the content. At this point, you have total control over the request. I use the mod_rewrite in Apache to capture the request, then determine what content I'm going to serve depending on the request. In fact, there isn't a single file (spare one php script) on my server, yet cloudfront believes there are thousands. Pretty sure url rewriting is standard on most web servers, I can only confirm on lighttp and apache from my own experience though.
More Info
All you're doing here is just telling your server to rewrite incoming requests in order to satisfy them. This would not be considered a proxy or anything of the sort.
The flow of content between your app and your server, with cloudfront in between is like this:
appRequest->cloudFront
if cloudFront has file, return data to user without asking your server
for the file.
If cloudFront DOESN'T have the file (or it has expired), go back to
the origin server and ask it for a new copy to cache.
So basically, what is happening in your situation is this:
A)app->ask cloudfront for url cloud front doesn't have
B)cloudfront
then asks your source server for the file
C)file doesn't exist there,
so the server tells cloudFront to go fly a kite
D)cloudFront comes back empty handed and makes your app 404
E)app crashes and
burns, users run away and use something else.
So, all you're doing with mod_rewrite is telling your server how it can re-interpret certain formatted requests and act accordingly. You could point all .jpg requests to point to singleImage.jpg, then have your app ask for:
www.mydomain.com/image3.jpg
www.mydomain.com/naughtystuff.jpg
Neither of those images even have to exist on your server. Apache would just honor the request by sending back singleImage.jpg. But as far as cloudfront or your app is concerned, those are two different files residing at two different unique places on the server.
Hope this clears it up.
http://httpd.apache.org/docs/current/mod/mod_rewrite.html
I think you are using the URL structure in a wrong way. the path which is defined by forward slashes is supposed to bring you to a specific resource, in your example client.html. However, for routing beyond that point (within that resource) you should make use of the # - as is done in many javascript frameworks. This should tell your router what the state of the resource (your html page or app) is. if there are other resources referenced, e.g. images, then you should provide different paths for them which would go through the CDN.

Rewrite URL for all requests to a folder in ASP.NET 3.5 - IIS 7

We have a number of existing clients that point to urls like:
http://sub1.site.com/images/image1.jpg
/images is a virutal directory that points to a directory that actually contains image1.jpg on that server.
We're moving all of the files out of this directory and onto a separate server that will not run this same application.
The file will now only be available at:
http://sub2.site.com/image1.jpg
What is the best way to make it so clients requesting
http://sub1.site.com/images/image1.jpg will get the content that now resides at http://sub2.site.com/image1.jpg?
A few requirements:
We need the actual content to be returned through that url - not a 302 response.
We cannot modify the IIS server configuration - only the web.config for the site
Again, we're running asp.net 3.5
Thanks.
Not totally sure this would work, but you could setup URL Routing on the old site so all requests are sent to a handler and within that handler you could do a web request to get the file from it's new location.
I use a variation of the process to map image URLs to different locations and my handler does some database queries to get the mapped relationship and provide the correct image. I don't see why you could do a web request to get the image.
Since you are using IIS7, you can use the built in URL rewrite module.
You would want an inbound and an outbound rule to change \images\image1.jpg to \image1.jpg
It can get pretty involved, but this should be rather simple.
Assuming you can add handlers to your site (as in add a DLL to your /bin directory in the site) and with the restriction that you can't send 302 responses for better performance, then alternatively you could write a custom handler to grab all requests that match that URL pattern, do the web request for the sub2.site image from the original site via web client code, then serve it back out of the original site, sub1.site.com.
See How To Create an ASP.NET HTTP Handler by Using Visual C# .NET for the very basics of creating and setting up a custom handler. Then use the HttpWebRequest to make the request of sub2.site.com, as in the guide A Deeper Look at Performing HTTP Requests in an ASP.NET Page. Plus a little other code to handle errors, timeouts, passing the image through with as little processing and memory usage as possible, etc.
Depending on the response time/lag between the two servers, this may be slow, but it would fit all your requirements. But if the point of moving the images to site 2 was for performance (CPU or memory) or bandwidth limitations, then this solution would nullify any gains — and would actually make things worse. But if they were moved for other business or technical reasons though, then this solution might be helpful still.
If you have other control over the server or anything upstream from the server, you could use mod_proxy (or similar Windows/IIS tool) to intercept those URLs and forward them to another server and respond back with the real request. Depending on your network configuration and available servers, this could be the simplest, best performing solution.
Can IIS be configure to forward request to another web server? on serverfault has a quick process and link for an IIS 7.5 solution.

Is it safe to redirect to the same URL?

I have URLs of the form http://domain/image/⟨uuid⟩/42x42/some_name.png. The Web server (nginx) is configured to look for a file /some/path/image/⟨uuid⟩/thumbnail_42x42.png, and if it does not exist, it sends the URL to the backend (Django via mod_wsgi) which then generates the thumbnail. Then the backend emits a 302 redirect to exactly the same URL that was requested by the client, with the idea that upon this second request the server will notice the thumbnail file and send it directly.
The question is, will this work with all the browsers? So far testing has shown no problems, but can I be sure all the user agents will interpret this as intended?
Update: Let me clarify the intent. Currently this works as follows:
The client requests a thumbnail of an image.
The server sees the file does not exist, so it forwards the request to the backend.
The backend creates the thumbnail and returns 302.
The backend releases all the resources, letting the server share the newly generated file to current and subsequent clients.
Having the backend serve the newly created image is worse for two reasons:
Two ways of serving the same data must be created;
The server is much better at serving static content. What if the client has an extremely slow link? The backend is not particularly fast nor memory-efficient, and keeping it in memory while spoon-feeding the client can be wasteful.
So I keep the backend working for the minimum amount of time.
Update²: I’d really appreciate some RFC references or opinions of someone with experience with lots of browsers. All those affirmative answers are pleasant but they look somewhat groundless.
If it doesn't, the client's broken. Most clients will follow redirect loops until a maximum value. So yes, it should be fine until your backend doesn't generate the thumbnail for any reason.
You could instead change URLs to be http://domain/djangoapp/generate_thumbnail and that'll return the thumbnail and the proper content-type and so on
Yes, it's fine to re-direct to the same URI as you were at previously.

Resources