We are experiencing an issue where a previous version of our home page is being displayed. Even though there have been changes since then, the web page always shows the old version.
This issue stems from a WordPress plugin we use that added a
Last-Modified: Tue, 19 Apr 2016 15:18:40 GMT
header to the response.
The only way we have found to fix this issue is to force a refresh in the browser. Is there a way to invalidate that cache remotely for all clients?
If you mean stylesheets or JavaScript, for example, you can update the version of the stylesheet; see below for an example:
<link rel="stylesheet" type="text/css" href="mystyle.css">
You can change to
<link rel="stylesheet" type="text/css" href="mystyle.css?v=1.0">
Notice the ?v=1.0 parameter at the end of the source; this works for JavaScript as well.
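Since this is WordPress, here is a hedged sketch of how you might generate that version parameter automatically instead of bumping it by hand. The handle, file location, and use of the file's modification time below are assumptions about your theme, not something the plugin in question provides.
<?php
// In the theme's functions.php (assumed layout). The stylesheet's modification
// time becomes the ?ver=... value, so the query string changes whenever the
// file is edited and browsers fetch the new copy.
function mytheme_enqueue_styles() {
    $relative = '/css/mystyle.css';                      // assumed location in the theme
    $path     = get_stylesheet_directory() . $relative;  // filesystem path
    $version  = file_exists( $path ) ? filemtime( $path ) : '1.0';

    wp_enqueue_style(
        'mytheme-style',                                 // assumed handle
        get_stylesheet_directory_uri() . $relative,
        array(),
        $version
    );
}
add_action( 'wp_enqueue_scripts', 'mytheme_enqueue_styles' );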
If you need images and other assets to update, you can find lots about cache busting here:
Refresh image with a new one at the same url
You can also try adding
<META HTTP-EQUIV="CACHE-CONTROL" CONTENT="NO-CACHE">
<META HTTP-EQUIV="EXPIRES" CONTENT="Mon, 20 Feb 2012 00:00:01 GMT">
to the <head> of the HTML page.
Browsers are going to honor the cache settings that were originally provided to them; you should be able to look in the browser's developer tools to see what the cached headers are.
For example, if the content sent something like:
Cache-Control: public, max-age=86400
Then it will have no reason to request an updated version of the content from your server for a day.
If the server is able to handle the load of receiving requests for the content, you can ensure that there are both an ETag and a Last-Modified header, and then use a short expiration time, such as:
Cache-Control: public, max-age=600
ETag: abcdefg
Last-Modified: Tue, 19 Apr 2016 15:18:40 GMT
Then, after 10 minutes the browser will issue a request asking the server whether the content has changed. If not, the server should issue an empty 304 Not Modified response to indicate there is no difference. This saves your bandwidth, and the only cost is however "expensive", resource-wise, it is to determine which headers to send.
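For dynamic content you generate yourself, a minimal PHP sketch of that handshake, assuming you can compute a timestamp and a fingerprint for the content (the values and the render_page() helper below are placeholders):
<?php
// Send validators with a short max-age, then answer conditional requests
// with 304 Not Modified when nothing has changed.
$lastModified = filemtime( __FILE__ );              // placeholder: use your content's real timestamp
$etag         = '"' . md5( $lastModified ) . '"';   // placeholder: any stable fingerprint of the content

header( 'Cache-Control: public, max-age=600' );
header( 'Last-Modified: ' . gmdate( 'D, d M Y H:i:s', $lastModified ) . ' GMT' );
header( 'ETag: ' . $etag );

$ifNoneMatch     = $_SERVER['HTTP_IF_NONE_MATCH'] ?? '';
$ifModifiedSince = $_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? '';

if ( $ifNoneMatch === $etag
    || ( $ifModifiedSince !== '' && strtotime( $ifModifiedSince ) >= $lastModified ) ) {
    http_response_code( 304 );   // empty body: "your copy is still good"
    exit;
}

echo render_page();              // hypothetical function that builds the full response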
I would absolutely suggest using small cache times for your primary HTML (or any dynamic content) if you know it will change. The entire purpose of those caching headers is to let browsers serve the version they already have as quickly as possible, while also saving you CPU and bandwidth.
Side note: if you were able to "reach out" to clients' caches in that way, it would actually be somewhat terrifying.
Based on all of the information provided, you're missing the Varnish HTTP Purge plugin and/or have not configured the VCL for it.
If you're seeing an old cached version of the homepage, it means the page's cache was not purged after its contents were updated in the WordPress admin.
In a typical WordPress scenario, you set a maximum cache lifetime and use a plugin like the one mentioned to invalidate the cache on the relevant WordPress hooks.
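As a rough illustration of what such a plugin does under the hood, here is a hedged PHP sketch. It assumes Varnish sits in front of the site and that your VCL accepts PURGE requests; a real plugin covers many more hooks and edge cases.
<?php
// Hypothetical sketch: ask Varnish to drop its cached copies whenever a post is saved.
function mytheme_purge_varnish( $post_id ) {
    $permalink = get_permalink( $post_id );
    if ( ! $permalink ) {
        return;
    }

    // Assumes Varnish fronts this site and its VCL is configured to accept PURGE.
    wp_remote_request( $permalink, array( 'method' => 'PURGE' ) );

    // Also purge the homepage, since it usually lists recent content.
    wp_remote_request( home_url( '/' ), array( 'method' => 'PURGE' ) );
}
add_action( 'save_post', 'mytheme_purge_varnish' );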
Related
I wanted to cache CSS/JS files in WordPress in the browser as much as possible (for faster loading). However, Chrome (at least) apparently won't cache if there is a query string. But putting the version in the query string is very useful, and besides, there are CSS files in WordPress core and other modules such as TinyMCE where I don't have control (they all use version query strings). Is there a way the files can still be cached?
There are lots of places your file may get cached between the source and the user. The browser is just one of these.
The recommendation to avoid query string arguments is based on many popular cache servers not caching when there is a query string. We're talking about servers that sit in front of your web server. (Some, like Cloudflare, let you choose how to handle query strings.)
On the whole, browsers still cache a resource that has a query string if you send the appropriate headers.
The common way to get the best of both worlds is to version the actual file name:
script-3.4.2.js
This gives you versioning without a query string and makes you agnostic to the caching technology.
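A hedged PHP sketch of generating such names, assuming you also add a server rewrite rule (not shown here) that maps script-<version>.js back to script.js on disk:
<?php
// Hypothetical helper: emit "script-<mtime>.js" in the HTML while the file on
// disk keeps its plain name (a rewrite rule must map the versioned name back).
function versioned_asset( $file ) {
    $path = $_SERVER['DOCUMENT_ROOT'] . '/' . ltrim( $file, '/' );
    if ( ! file_exists( $path ) ) {
        return $file;                                  // fall back to the unversioned name
    }
    $version = filemtime( $path );
    return preg_replace( '/\.(js|css)$/', '-' . $version . '.$1', $file );
}

// Usage: echo '<script src="' . versioned_asset( '/js/script.js' ) . '"></script>';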
Chrome Does Cache
If you are a WordPress user, the chances are you have a script loaded from:
https://www.example.com/wp-includes/js/wp-embed.min.js?ver=4.8.2
As with all scripts, it has to be fetched on the first load. But it should come with some form of cache-control headers:
cache-control:public, max-age=172800
content-encoding:gzip
content-type:text/javascript
date:Thu, 19 Oct 2017 18:26:18 GMT
etag:W/"576-543356dcbb9ba"
expires:Sat, 21 Oct 2017 18:26:18 GMT
last-modified:Fri, 09 Dec 2016 08:20:37 GMT
status:200
vary:Accept-Encoding
When you navigate to another page that references the script (note that you can't test this by loading the file directly, as the browser behaves differently), it loads it from the cache.
If you find it isn't caching, double check the headers being sent with the file as they are more likely to be a problem than the query string.
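If you would rather check those headers from a script than from the browser's dev tools, a quick hedged sketch (using the example URL from above):
<?php
// Fetch only the response headers for the asset and print them, so you can see
// exactly which cache-control headers the server is sending.
$url = 'https://www.example.com/wp-includes/js/wp-embed.min.js?ver=4.8.2';

foreach ( get_headers( $url ) as $line ) {
    echo $line . PHP_EOL;   // e.g. "cache-control: public, max-age=172800"
}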
I have a web application that contains a few hundred small images, and is performing quite badly on load.
To combat this, I would like to cache static files in the browser.
Using a servlet filter on Tomcat 7, I now set the expires header correctly on static files, and can see that this is returned to Chrome:
Accept-Ranges:bytes
Cache-Control:max-age=3600
Content-Length:40284
Content-Type:text/css
Date:Sat, 14 Apr 2012 09:37:04 GMT
ETag:W/"40284-1333964814000"
Expires:Sat, 14 Apr 2012 10:37:05 GMT
Last-Modified:Mon, 09 Apr 2012 09:46:54 GMT
Server:Apache-Coyote/1.1
However, I notice that Chrome still does a round trip to the server for each static resource on reloads, sending an If-Modified-Since header and getting a correct 304 Not Modified response from Tomcat.
Is there any way to make Chrome avoid these 100+ requests to the server until the expiry has genuinely passed?
There are three ways of loading a page:
1. Putting the URL in the address bar and pressing Enter, which is equivalent to navigating from a hyperlink (default browsing behaviour). This honours the Expires headers: the browser first checks whether the cached static content is still valid, and if the Expires time is in the future it loads it straight from the cache, making no request to the server at all. If the cached content is invalid, it makes a request to the server.
2. Pressing F5 to refresh the page. This sends an If-Modified-Since header to the server to check whether the content has changed. If it has changed you get a 200 response; if not, a 304 response. In either case the image is not shown on the page until a response is received from the server.
3. Pressing Ctrl+F5, which forcefully clears the cache and reloads every static item on the page. It does not spend time verifying whether the images have changed using the headers.
I guess the behaviour you are expecting is the first kind. The only thing you should be looking at is the way you are loading the page. Normally people are not going to press F5 or Ctrl+F5, so your static content will not be re-validated every time.
In short, just remember: reload the page by pressing Enter in the address bar instead. The browser will honour the headers that you have provided. This is not specific to Chrome; it's a W3C standard.
Be careful when you are testing. I noticed that in Chrome version 20, if I hit F5 to reload the page, I see new requests in the network panel.
However, if I place the cursor in the address bar after the current page URL and hit Enter, resources whose headers were set to allow caching are served from the cache.
Also a good read:
http://betterexplained.com/articles/how-to-optimize-your-site-with-http-caching/
Assuming you have ruled out the various gotchas that have already been suggested, I found that Google Chrome can ignore the Cache-Control directive unless it includes public, and that public has to come first. For example:
Cache-Control: public, max-age=3600
In my experiments I also removed ETags from the server response, so that could be a factor, but I didn't go back and check.
I have various servers (dev, 2 x test, 2 x prod) running the same asp.net site.
The test and prod servers are in load-balanced pairs (prod1 with prod2, and test1 with test2).
The test server pair is exhibiting some kind of (super) slowdown or freezing on about one in ten page loads. Sometimes a line of text appears at the very top of the page which looks something like:
00 OK Date: Thu, 01 Apr 2010 01:50:09 GMT Server: Microsoft-IIS/6.0 X-Powered_By: ASP.NET X-AspNet-Version:2.0.50727 Cache-Control:private Content-Type:text/html; charset=ut
(the beginning and end are "cut off".)
Has anyone seen anything like this before? Any idea what it means or what's causing it?
Edit:
I often see this too when clicking something - it comes up as red text on a yellow page:
XML Parsing Error: not well-formed
Location: http://203.111.46.211/3DSS/CompanyCompliance.aspx?cid=14
Line Number 1, Column 24:2mMTehON9OUNKySVaJ3ROpN" />
-----------------------^
If I go back and click again, it works (I see the page I clicked on, not the above error message).
Update:
...And, instead of the page loading, I sometimes just get a white screen with text like this in black (looks a lot like the above text):
HTTP/1.1 302 Found Date: Wed, 21 Apr 2010 04:53:39 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET X-AspNet-Version: 2.0.50727 Location: /3DSS/EditSections.aspx?id=3&siteId=56&sectionId=46 Set-Cookie: .3DSS=A6CAC223D0F2517D77C7C68EEF069ABA85E9HFYV64F&FA4209E2621B8DCE38174AD699C9F0221D30D49E108CAB8A828408CF214549A949501DAFAF59F080375A50162361E4AA94E08874BF0945B2EF; path=/; HttpOnly Cache-Control: private Content-Type: text/html; charset=utf-8 Content-Length: 184
object moved here
Where "here" is a link that points to a URL just like the one I'm requesting, except with an extra folder in it, meaning something like:
http://123.1.2.3/MySite//MySite/Page.aspx?option=1
instead of:
http://123.1.2.3/MySite/Page.aspx?option=1
Update:
A colleague of mine found some info saying it might be because the test servers are running IIS in 64-bit mode (64-bit Windows 2003), while the prod servers are 32-bit Windows 2003.
So we tried telling IIS to use 32-bit:
cscript %SYSTEMDRIVE%\inetpub\adminscripts\adsutil.vbs SET W3SVC/AppPools/Enable32bitAppOnWin64 1
%SYSTEMROOT%\Microsoft.NET\Framework\v2.0.50727\aspnet_regiis.exe -i
(from this MS support page)
But IIS stopped working altogether (we got "Server Unavailable" on a white page instead of the web sites).
Reversing the above (see the link) didn't work at first either. The ASP.NET tab disappeared from our IIS web site properties and we had to mess around for an hour uninstalling (aspnet_regiis.exe -u) and reinstalling 32 bit ASP.NET and adding Default.aspx manually back into default documents.
We'll probably try again in a few days, if anyone has anything to add in the meantime, please do.
Update:
This seems at odds with everything we've found out so far, but our testing shows that this problem happens only in Firefox, not IE or Chrome (!!??).
Update: The Solution
For anyone finding this later:
On Aristos's suggestion (see accepted answer) we searched the code for the HTTP header "Content-Length". There was a page which set it: a page that pulls an image out of the DB to display a company logo (it spits it straight to the response, i.e. instead of linking to, say, "log56.gif", you link to "ThisImagePage.aspx?id=56" and it serves the specified GIF from the DB).
We commented out the line:
HttpContext.Current.Response.AddHeader("Content-Length", File.Length.ToString());
... and it worked. If anyone can see a bug in that, let us know; otherwise I guess it was some kind of IIS or load balancer configuration problem that only appears when manually setting the Content-Length on binary files, and only in Firefox (!?).
MGOwen, I will share my experience with a similar problem that I had.
Some time ago I had a similar problem: the pages worked well, except for some pages that, after being compressed with gzip, had problems and did not work correctly, something like yours.
I discovered that the problem was that I had set the Content-Length header, and for some reason, when the response was gzipped, the Content-Length did not change (or was calculated incorrectly), and the result was an error similar to yours.
So check whether you set the Content-Length anywhere on your pages and then use a gzip filter. If so, remove the Content-Length setting from your program.
Generally speaking, I would say the length is the problem on your page, and that length is the value of the Content-Length header.
And the Content-Length does exist in your headers!
Update
Also, one other thing I noticed: if your page is sent gzipped, where is that in your headers, assuming this is the full header of your page?
The text you are seeing is the HTTP response header. I'm guessing both it and the XML parsing error are being caused by the output to the browser being cut off or, even more bizarrely, only a chunk of the response being relayed.
I'd start with the load balancer and see if there are any logs available. After that, I'd try disabling the IIS compression that Aristos mentioned and see if that has an effect (in IIS, get properties on the "Web Sites" folder and then go to the Service tab, or find out if compression was enabled or changed for that particular site).
After that, you'll probably have to resort to some kind of packet sniffer to see what is actually being sent on either side of the balancer.
Are the machines sharing the same machine key?
This, along with accessing session state, could cause strange errors.
It seems like this issue is with IIS and .NET.
Click Start -> Run and type in the following command if you are using the .NET 2.0 framework:
%Windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_regiis -i
Detailed discussion here:
http://social.msdn.microsoft.com/Forums/en-US/xmlandnetfx/thread/47f6eb6a-b062-4f4d-8b7f-b4afb1b2725d
Update:
Looks like the request header information is the culprit. How would I change the max-age property of the request header? TIA.
Hi, I'm using @font-face on a website and I'm experiencing delayed loading of the text (presumably due to the font being loaded on every page). I understand the client has to download the font once to display it properly, but on every page?
Is there a way I can force the browser to cache that file? Or is there another alternative to speed up the font's loading time? (Is this a question more appropriate to post on Server Fault?)
Thanks in advance. Worst case, I'll live with the delay, so I don't need any "remove @font-face" answers... ;)
Additional Information:
I've tested this in both Safari (4) and Firefox (3.5RC1) on both Mac and Windows (XP and 7)
All the browsers I've tested on are currently set up to allow caching (it's on by default)
The URL is not dynamic, it's simply "/fonts/font.otf"
The font URL is correct, as the page loads the font and displays it correctly, albeit more slowly than normal.
Request Header :
Cache-Control:max-age=0
If-Modified-Since:Wed, 24 Jun 2009 03:46:28 GMT
If-None-Match:W/"484d9f2-a5ac-46d10ff2ebcc0"
Referer:http://testurl.com/
User-Agent:Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6; en-us) AppleWebKit/530.13 (KHTML, like Gecko) Version/4.0 Safari/530.15
Response headers:
Connection:Keep-Alive
Date:Thu, 25 Jun 2009 02:21:31 GMT
Etag:"484d9f2-a5ac-46d10ff2ebcc0"
Keep-Alive:timeout=10, max=29
Server:Apache/2.2.11 (Unix) mod_ssl/2.2.11 OpenSSL/0.9.8i DAV/2 mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635
You can never force a browser to cache something, only encourage it. I can think of no reason why a font file with the correct expires headers wouldn't be cached, which brings us to:
It's a browser bug (you don't say which browser)
Your cache control headers are missing or wrong
Your browser is configured to not cache anything (do images cache?)
Your font URL is dynamic so the browser thinks each request is for a different resource
The font face file is actually missing or the URL is misspelt.
The delay is NOT caused by the font download (you did say you presume this is the issue)
I think more information is in order.
EDIT: Setting cache control is a server- and language-specific thing. Look at mod_expires for information on caching in Apache.
Are you sure your font files are cachable? Just like other static content, they should have far-future expires dates, and their headers should be configured to allow them to be cached. If you are hosting your fonts on a server farm, you will want to make sure your etag header is normalized across all the servers in the farm...otherwise subsequent requests for the font may force it to be re-downloaded from an alternative server even though the same data was already downloaded from another server.
I have some website which requires a logon and shows sensitive information.
The person goes to the page, is prompted to log in, then gets to see the information.
The person logs out of the site, and is redirected back to the login page.
The person can then hit "back" and go right back to the page where the sensitive information is displayed. Since the browser just treats it as rendered HTML, it shows it to them, no problem.
Is there a way to prevent that information from being displayed when the person hits the "back" button from the logged out screen? I'm not trying to disable the back button itself, I'm just trying to keep the sensitive information from being displayed again because the person is not logged into the site anymore.
For the sake of argument, the above site/scenario is in ASP.NET with Forms Authentication (so when the user goes to the first page, which is the page they want, they're redirected to the logon page - in case that makes a difference).
The short answer is that it cannot be done securely.
There are, however, a lot of tricks that can be implemented to make it difficult for users to hit back and get sensitive data displayed.
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Cache.SetExpires(DateTime.Now.AddSeconds(-1));
Response.Cache.SetNoStore();
Response.AppendHeader("Pragma", "no-cache");
This will disable caching on the client side; however, it is not supported by all browsers.
If you have the option of using AJAX, then the sensitive data can be retrieved using an UpdatePanel that is updated from client code, and therefore it will not be displayed when hitting back unless the client is still logged in.
Cache and history are independent, and one shouldn't affect the other.
The only exception, made for banks, is that the combination of HTTPS and Cache-Control: must-revalidate forces a refresh when navigating through history.
In plain HTTP there's no way to do this except by exploiting browser bugs.
You could hack around it using JavaScript that checks document.cookie and redirects when a "killer" cookie is set, but I imagine this could go seriously wrong when the browser doesn't set/clear cookies exactly as expected.
From aspdev.org:
Add the following line at the top of the Page_Load event handler and your ASP.NET page will not be cached in the users' browsers:
Response.Cache.SetCacheability(HttpCacheability.NoCache)
Setting this property ensures that if the user hits the back button the content will be gone, and if he presses "refresh" he will be redirected to the login page.
DannySmurf, <meta> elements are extremely unreliable when it comes to controlling caching, and Pragma in particular even more so. Reference.
dannyp and others, no-cache does not stop caches from storing sensitive resources. It merely means that a cache cannot serve a resource it has stored without revalidating it first. If you wish to prevent sensitive resources from being cached, you need to use the no-store directive.
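In PHP terms, that means sending something like the following (a hedged sketch; the equivalent can be set from ASP.NET via Response.Cache, as shown elsewhere in this thread):
<?php
// Tell both browsers and intermediate caches not to keep a copy at all.
header( 'Cache-Control: no-store, no-cache, must-revalidate' );
header( 'Pragma: no-cache' );   // legacy fallback for HTTP/1.0 caches
header( 'Expires: 0' );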
You could have a JavaScript function do a quick server check (AJAX) and, if the user is not logged in, erase the current page and replace it with a message. This would obviously be vulnerable to a user whose JavaScript is off, but that is pretty rare. On the upside, it is agnostic to both browser and server technology (ASP, PHP, etc.).
You are looking for a no-cache directive:
<META HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">
If you've got a master page design going, this may be a little bit of a juggle, but I believe you can put this directive on a single page, without affecting the rest of your site (assuming that's what you want).
If you've got this directive set, the browser will dutifully head back to the server looking for a brand new copy of the page, which will cause your server to see that the user is not authenticated and bump him to the login page.
Have the logout operation be a POST. Then the browser will prompt for "Are you sure you want to re-post the form?" rather than show the page.
I don't know how to do it in ASP.NET but in PHP I would do something like:
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");
header("Cache-Control: no-cache");
header("Pragma: no-cache");
This forces the browser to recheck the item, so your authentication check should be triggered, denying the user access.
It's a bit of a stretch, but if you had a Java applet or a Flash application embedded, and authentication was done through that, you could make it so that they had to authenticate with the server in, erm, "real time" every time they wanted to view the information.
Using this you could also encrypt any information.
There's always the possibility that someone can simply save the page with the sensitive information on it; having no cache isn't going to get around that (but then a screenshot can always be taken of a Flash or Java application, too).
For completeness:
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Cache.SetNoStore();
Response.Cache.SetExpires(DateTime.Now.AddMinutes(-1));
The correct answer involves setting the HTTP Cache-Control header on the response. If you want to ensure that the output is never cached, you can use Cache-Control: no-cache. This is often used in combination with no-store as well.
Other options, if you want limited caching, include setting an expires time and must-revalidate, but these could all potentially cause a cached page to be displayed again.
See http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.4
Well, at a major Brazilian bank (Banco do Brasil), which is known for having some of the world's most secure and efficient home banking software, they simply put history.go(1) on every page. So, if you hit the back button, you will be sent forward again. Simple.
Please look into the HTTP response headers. Most of the ASP code that people are posting looks to be setting those. Be sure.
The chipmunk book from O'Reilly is the bible of HTTP, and Chris Shiflett's HTTP book is good as well.
You can have the web page containing the sensitive data returned as the response to an HTTP POST; then, in most cases, browsers will show a message asking whether you want to resubmit the data. (Unfortunately I cannot find a canonical source for this behavior.)
I just had the banking example in mind.
My bank's page has this in it:
<meta http-equiv="expires" content="0" />
That should cover it, I suppose.