There are lots of questions and answers about preventing the browser from caching, and I've tried many of them. It seems like Cache-Control: no-cache,no-store along with Pragma: no-cache and Expires: 0 is supposed to do the trick, but Firefox only makes a single request for my image.
Here are the response headers I'm sending. Is there anything obvious that's missing here?
Accept-Ranges bytes
Cache-Control no-cache,no-store,private,must-revalidate,max-stale=0,post-check=0,pre-check=0
Content-Length 4830
Content-Type image/jpeg
Date Thu, 23 Mar 2000 22:16:32 GMT
Etag "706331614"
Expires 0
Last-Modified Thu, 23 Mar 2000 22:16:31 GMT
Pragma no-cache
Server lighttpd/1.4.32
I get the following headers from a curl request.
Am I correct to assume that, because the Last-Modified date is that far back in the past, the response is not cached in the browser?
➜ curl -I https://dommain.com/assets/sounds/intro.mp3
HTTP/2 200
date: Sat, 12 Nov 2022 23:39:39 GMT
content-type: audio/mpeg
content-length: 33976
last-modified: Tue, 01 Jan 1980 00:00:01 GMT
etag: "12cea601-84b8"
cache-control: no-cache
accept-ranges: bytes
strict-transport-security: max-age=15724800; includeSubDomains
No. You might be thinking of the Expires header. In the absence of Cache-Control an Expires header with a date in the past would mean that the cached response couldn't be used without revalidation.
In this case, the cache policy is set by the Cache-Control: no-cache header. That means that the browser can store the response, but can't serve it without confirming that it's up to date by making a conditional request.
Since the response has an ETag header, that will be used to determine if the cached response is still valid or not. In the absence of ETag, Last-Modified would be used for that purpose.
In sum, what you have here is a perfectly standard approach to caching, especially for large files. The response can be stored indefinitely, but can only be served from the cache after the origin server has confirmed that the stored response is still valid.
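To make the revalidation step concrete, here is a minimal sketch of what a cache does with a stored no-cache response. The function names (revalidation_headers, body_to_use) are made up for illustration, and real browsers do considerably more than this; it only shows how the stored validators turn into a conditional request.

```python
def revalidation_headers(stored_headers):
    """Build conditional request headers from a stored response's validators."""
    # Header names are case-insensitive in HTTP, so normalize them first.
    lowered = {name.lower(): value for name, value in stored_headers.items()}
    conditional = {}
    if "etag" in lowered:
        # Echo the ETag back; the server answers 304 if it still matches.
        conditional["If-None-Match"] = lowered["etag"]
    if "last-modified" in lowered:
        # A server that receives both gives If-None-Match precedence.
        conditional["If-Modified-Since"] = lowered["last-modified"]
    return conditional


def body_to_use(stored_body, status, new_body):
    """After revalidation, 304 means the stored copy is still valid."""
    return stored_body if status == 304 else new_body


# Headers taken from the curl output in the question.
stored = {
    "etag": '"12cea601-84b8"',
    "last-modified": "Tue, 01 Jan 1980 00:00:01 GMT",
    "cache-control": "no-cache",
}
print(revalidation_headers(stored))
```

The 1980 Last-Modified date plays no role in whether the response is stored; it is only a validator that gets echoed back to the server.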
I have a resource that is served with the following headers:
Cache-Control: private, max-age=0
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/javascript;charset=utf-8
Date: Wed, 05 May 2021 12:22:58 GMT
etag: "1609459200000-gzip"
expires: Wed, 05 May 2021 12:22:58 GMT
last-modified: Fri, 01 Jan 2021 00:00:00 GMT
Vary: Accept-Encoding
Yesterday we updated this resource and we noted that about 10% of the traffic is, 24h later, still using the stale resource. In those cases, the browser uses the old cached resource without hitting the server (to revalidate or get a fresh version).
Why do some browsers not honor max-age and fail to revalidate (or get a fresh version of) this resource?
Is it mandatory to use, in this case, no-cache to force a revalidation?
If I want to use the cached resource and revalidate only after 10 seconds, will this be enough for any modern browser?
Cache-Control: private, must-revalidate, max-age=10
CloudFront ignores my cache header and my pictures have to be picked up from the server again after a while.
~$ curl -I http://d2573vy43ojbo7.cloudfront.net/attachments/store/limit/64/3720c5574063aebc90511061b99de858740ad764c6981d2bf30ff121ada0/image.jpg
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 1645
Connection: keep-alive
Server: nginx/1.4.1
Date: Thu, 12 Feb 2015 14:37:41 GMT
Status: 200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers:
Access-Control-Allow-Method:
Cache-Control: public, must-revalidate, max-age=31536000
Expires: Fri, 12 Feb 2016 14:37:41 GMT
Content-Disposition: inline; filename="image.jpg"
Last-Modified: Thu, 12 Feb 2015 14:37:41 GMT
X-Content-Type-Options: nosniff
X-Request-Id: 239b0fda-cae9-452f-9d1b-ccbf035bbf69
X-Runtime: 3.457939
X-Cache: Miss from cloudfront
Via: 1.1 6cde3c778df412041adc7610331b57bc.cloudfront.net (CloudFront)
X-Amz-Cf-Id: yicAkZYc5XpowKRFMOXDKSJKBMWZ4kq2B3vLK8Q-Py124D8lQq_1lg==
I requested the same file yesterday and saw the same thing: on the second request it was served by CloudFront, but today it isn't anymore. It's the same for all my images. They are cached but are removed from the cache after a couple of hours.
What's wrong? My cache behavior settings on CloudFront are set to the defaults, and it uses Origin Cache Headers.
Take a look here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that are more popular.
It means that the object is not popular enough to stay in the cache for longer. If enough viewers were hitting this object AND this particular CloudFront location, it would have stayed in the cache longer.
I read about Pragma header on Wikipedia which says:
"The Pragma: no-cache header field is an HTTP/1.0 header intended for
use in requests. It is a means for the browser to tell the server and
any intermediate caches that it wants a fresh version of the resource,
not for the server to tell the browser not to cache the resource. Some
user agents do pay attention to this header in responses, but the
HTTP/1.1 RFC specifically warns against relying on this behavior."
But I still don't understand what it does. What is the difference between the Cache-Control header whose value is no-cache and the Pragma header whose value is also no-cache?
Pragma is the HTTP/1.0 implementation and Cache-Control is the HTTP/1.1 implementation of the same concept. Both are meant to prevent the client from caching the response. Older clients may not support HTTP/1.1, which is why that header is still in use.
There is no difference, except that Pragma is only defined for requests made by the client, whereas Cache-Control may be used in both client requests and server replies.
So, as far as standards go, they can only be compared from the perspective of the client making a request and the server receiving a request from the client. RFC 2616, section 14.32 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32) defines the scenario as follows:
HTTP/1.1 caches SHOULD treat "Pragma: no-cache" as if the client had
sent "Cache-Control: no-cache". No new Pragma directives will be
defined in HTTP.
Note: because the meaning of "Pragma: no-cache as a response
header field is not actually specified, it does not provide a
reliable replacement for "Cache-Control: no-cache" in a response
The way I would read the above:
if you're writing a client and need no-cache:
    just use Pragma: no-cache in your requests, since you may not know if Cache-Control is supported by the server;
    but in replies, to decide on whether to cache, check for Cache-Control;
if you're writing a server:
    in parsing requests from the clients, check for Cache-Control; if not found, check for Pragma: no-cache, and execute the Cache-Control: no-cache logic;
    in replies, provide Cache-Control.
Of course, reality might be different from what's written or implied in the RFC!
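For the server side of that reading, a rough sketch might look like the following. The function name request_wants_no_cache is made up for illustration, and the headers are assumed to arrive as a plain dict; a real framework would give you its own header object.

```python
def request_wants_no_cache(headers):
    """Return True if the client asked intermediaries for a fresh copy."""
    # Header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def directives(value):
        # Split a comma-separated header value into lowercase directives.
        return [part.strip().lower() for part in value.split(",")]

    cache_control = lowered.get("cache-control")
    if cache_control is not None:
        # Cache-Control is present, so Pragma is not consulted at all.
        return "no-cache" in directives(cache_control)

    # HTTP/1.0 fallback: treat Pragma: no-cache like Cache-Control: no-cache.
    return "no-cache" in directives(lowered.get("pragma", ""))


print(request_wants_no_cache({"Pragma": "no-cache"}))
print(request_wants_no_cache({"Cache-Control": "max-age=0", "Pragma": "no-cache"}))
```

Note that this only covers request headers. As the RFC note says, Pragma in a response has no specified meaning, so there is nothing equivalent to write for replies.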
Stop using (HTTP 1.0) | Replaced with (HTTP 1.1 since 1999)
Expires: [date] | Cache-Control: max-age=[seconds]
Pragma: no-cache | Cache-Control: no-cache
If it's after 1999, and you're still using Expires or Pragma, you're doing it wrong.
I'm looking at you, Stack Overflow:
200 OK
Pragma: no-cache
Content-Type: application/json
X-Frame-Options: SAMEORIGIN
X-Request-Guid: a3433194-4a03-4206-91ea-6a40f9bfd824
Strict-Transport-Security: max-age=15552000
Content-Length: 54
Accept-Ranges: bytes
Date: Tue, 03 Apr 2018 19:03:12 GMT
Via: 1.1 varnish
Connection: keep-alive
X-Served-By: cache-yyz8333-YYZ
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1522782193.766958,VS0,VE30
Vary: Fastly-SSL
X-DNS-Prefetch-Control: off
Cache-Control: private
tl;dr: Pragma is a legacy of HTTP/1.0 and hasn't been needed since Internet Explorer 5 or Netscape 4.7. Unless you expect some of your users to be on IE5, it's safe to stop using it.
Expires: [date] (deprecated - HTTP 1.0)
Pragma: no-cache (deprecated - HTTP 1.0)
Cache-Control: max-age=[seconds]
Cache-Control: no-cache (must re-validate the cached copy every time)
And the conditional requests:
ETag (entity tag) based conditional requests
Server: ETag: W/"1d2e7-1648e509289"
Client: If-None-Match: W/"1d2e7-1648e509289"
Server: 304 Not Modified
Modified date based conditional requests
Server: last-modified: Thu, 09 May 2019 19:15:47 GMT
Client: If-Modified-Since: Thu, 09 May 2019 19:15:47 GMT
Server: 304 Not Modified
last-modified: Thu, 09 May 2019 19:15:47 GMT
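For what it's worth, the server's side of those two exchanges can be sketched roughly like this. The function names (strip_weak, is_not_modified) are made up for illustration, the request headers are assumed to be a dict keyed by the exact header names shown, and the comma splitting assumes no commas inside the quoted tags.

```python
from email.utils import parsedate_to_datetime


def strip_weak(tag):
    """Drop the W/ prefix; If-None-Match uses weak comparison."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def is_not_modified(request_headers, etag, last_modified):
    """Return True when the server may answer 304 Not Modified."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence; If-Modified-Since is then ignored.
        if if_none_match.strip() == "*":
            return True
        candidates = {strip_weak(t) for t in if_none_match.split(",")}
        return strip_weak(etag) in candidates

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
            modified = parsedate_to_datetime(last_modified)
            # Unchanged if the resource is no newer than the client's date.
            return modified <= since
        except (TypeError, ValueError):
            return False  # unparseable date: send the full response
    return False


etag = 'W/"1d2e7-1648e509289"'
modified = "Thu, 09 May 2019 19:15:47 GMT"
print(is_not_modified({"If-None-Match": etag}, etag, modified))
print(is_not_modified({"If-Modified-Since": modified}, etag, modified))
```

A request carrying neither header is unconditional, so the sketch falls through to a normal 200 response.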
Although I have set Expires to a date in the past, and Cache-Control to no-store, no-cache, I still get one of my web pages cached.
Here are the HTTP headers sent to the browser:
Date: Tue, 02 Nov 2010 09:13:23 GMT
Server: Apache/2.2.15 (el)
X-Powered-By: PHP/5.2.13
Set-Cookie: PHPSESSID=2luvb7b316lfc8ht570s1l1v84; path=/
Set-Cookie: Newsletter_Counter=17; expires=Wed, 02-Nov-2011 09:13:23 GMT; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 20
Connection: close
Content-Type: text/html; charset=UTF-8
Same behavior for FF 3.6, Safari and IE 8.
How do I get browsers to stop caching the page?
Browsers decide caching themselves. You can use a random GET parameter to force browsers not to cache, e.g.
http://www.foo.com/yourfile.zip?id=1234
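If you generate the links in code, something along these lines would do it. This is only a sketch: add_cache_buster is a made-up helper name, and the parameter name id just follows the example URL above; any parameter the server ignores would work.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="id"):
    """Return url with a fresh random query parameter appended."""
    parts = urlsplit(url)
    # Keep the existing parameters, but drop any previous cache buster.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != param
    ]
    # A new random value makes the URL look like a different resource.
    query.append((param, secrets.token_hex(8)))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_cache_buster("http://www.foo.com/yourfile.zip"))
```

Bear in mind that every distinct URL can still be stored by the browser and by intermediate caches; the trick only guarantees that the old copy is not reused.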
The following headers have always worked well for me (for HTTP/1.1). You should not need Pragma: no-cache.
Cache-Control: no-cache
Expires: <some date in the past>
Vary: *
Try changing your Vary value to the asterisk from my example.
Per http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44:
"A Vary field value of "*" implies that a cache cannot determine from the request headers of a subsequent request whether this response is the appropriate representation."
Using Cache-Control: no-store should forbid any storage:
no-store
[…] If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it. This directive applies to both non-shared and shared caches. […]
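As a point of comparison for testing, here is a minimal sketch of a server that sends nothing but no-store as its caching policy, using Python's standard http.server. The names NoStoreHandler and make_server are made up for illustration, and this is a test harness, not a production setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class NoStoreHandler(BaseHTTPRequestHandler):
    """Serve a tiny page that caches are told never to store."""

    def do_GET(self):
        body = b"<p>never stored</p>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(body)))
        # The only caching directive sent: forbid storing the response.
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def make_server(port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer(("127.0.0.1", port), NoStoreHandler)
```

Calling make_server(8000).serve_forever() and loading the page in the browser's network panel should show whether a reload really goes back to the server.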
You certainly seem to be doing the right things (although, like a lot of people, you seem to assume that sending a 'Pragma: no-cache' response header has some effect on browser-side caching; it should not).
What do you mean by "it's getting cached"? A page that was retrieved using a GET operation will (usually) not be fetched again from the server if the user clicks the 'back button'.