Ideal HTTP cache control headers for different types of resources - http

I want to find a minimal set of headers, that work with "all" caches and browsers (also when using HTTPS!)
On my web site, I'll have three kinds of resources:
(1) Forever cacheable (public / equal for all users)
Example: 0A470E87CC58EE133616F402B5DDFE1C.cache.html (auto generated by GWT)
These files are automatically assigned a new name, when they change content (based on the MD5).
They should get cached as much as possible, even when using HTTPS (so I assume, I should set Cache-Control: public, especially for Firefox?)
They shouldn't require the client to make a round-trip to the server to validate, if the content has changed.
(2) Changing occasionally (public / equal for all users)
Examples: index.html, mymodule.nocache.js
These files change their content without changing the URL, when a new version of the site is deployed.
They can be cached, but probably need a round-trip to be revalidated every time.
(3) Individual for each request (private / user specific)
Example: JSON responses
These resources should never be cached unencrypted to disk under no circumstances. (Except maybe I'll have a few specific requests that could be cached.)
I have a general idea on which headers I would probably use for each type, but there's always something I could be missing.

I would probably use these settings:
Cache-Control: max-age=31556926 – Representations may be cached by any cache. The cached representation is to be considered fresh for 1 year:
To mark a response as "never expires," an origin server sends an
Expires date approximately one year from the time the response is
sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one
year in the future.
Cache-Control: no-cache – Representations are allowed to be cached by any cache. But caches must submit the request to the origin server for validation before releasing a cached copy.
Cache-Control: no-store – Caches must not cache the representation under any condition.
See Mark Nottingham’s Caching Tutorial for further information.

Cases one and two are actually the same scenario.
You should set Cache-Control: public and then generate a URL with includes the build number / version of the site so that you have immutable resources that could potentially last forever.
You also want to set the Expires header a year or more in the future so that the client will not need to issue a freshness check.
For case 3, you could all of the following for maximum flexibility:
"Cache-Control", "no-cache, must-revalidate"
"Expires", 0
"Pragma", "no-cache"

Related

Caching strategy using ETag and Expires/Cache-control with no assets version/ID

After reading a lot about caching validators (more intensively after reading this answer on SO), I had a doubt that didn't find the answer anywhere.
My use-case is to serve a static asset (a javascript file, ie: https://example.com/myasset.js) to be used in other websites, so messing with their Gpagespeed/gmetrix score matters the most.
I also need their users to receive updated versions of my static asset every time I deploy new changes.
For this, I have the following response headers:
Cache-Control: max-age=10800
etag: W/"4efa5de1947fe4ce90cf10992fa"
In short, we can see the following flow in terms of how browser behaves using etag
For the first request, the browser has no value for the If-None-Match Request Header, so the Server will send back the status code 200 (Ok), the content itself, and a Response header with ETag value.
For the subsequent requests, the browser will add the previously received ETag value in a form of the If-None-Match Request Header. This way, the server can compare this value with the current value from ETag and, if both match, the server can return 304 (Not Modified) telling the browser to use the latest version of the file, or just 200 followed by the new content and the related ETag value instead.
However, I couldn't find any information in regards to using the Cache-Control: max-age header and how will this affect the above behavior, like:
Will the browser request for new updates before max-age has met? Meaning that I can define a higher max-age value (pagespeed/gmetrix will be happy about it) and force this refresh using only etag fingerprint.
If not, then what are the advantages of using etag and adding extra bits to the network?
No, the browser will not send any requests until max-age has passed.
The advantage of using ETag is that, if the file hasn't changed, you don't need to resend the entire file to the client. The response will be a small 304.
Note that you can achieve the best of both worlds by using the stale-while-revalidate directive, which allows stale responses to be served while the cache silently revalidates the resource in the background.

Checking if HTTP resource has changed after maximum cache time has expired

I'm trying to work out a new caching policy for the static resources on a website. A common problem is whenever javascript, CSS etc. is updated, many users hold onto stale versions because currently there are no caching specific HTTP headers included in the file responses.
This becomes a serious problem when, for example, the javascript updates are linked to server-side updates, and the stale javascript chokes on the new server responses.
Eliminating browser caching completely with a cache-control: max-age=0, no-cache seems like overkill, since I'd still like to take some pressure off the server by letting browsers cache temporarily. So, setting the cache policy to a maximum of one hour seems alright, like cache-control: max-age=3600, no-cache.
My understanding is that this will always fetch a new copy of the resource if the cached copy is older than one hour. I'd specifically like to know if it's possible to set a HTTP header or combination of headers that will instruct browsers to only fetch a new copy if the resource was last checked more than one hour ago AND if the resource has changed.
I'm just trying to avoid browsers blindly fetching new copies just because the cached resource is older than one hour, so I'd also like to add the condition that the resource has been changed.
Just to illustrate further what I'm asking:
New user arrives at site and gets fresh copy of script.js
User stays on site for 45 mins, browser uses cached copy of script.js all the time
User comes back to site 2 hours later, and browser asks the server if script.js has changed
If it has, then it gets a fresh copy and the process repeats
If it has not changed, then it uses the cached copy for the next hour, after which it will check again
Have I misunderstood things? Is what I'm asking how it actually works, or do I have to do something different?
Have I misunderstood things? Is what I'm asking how it actually works,
or do I have to do something different?
You have some serious misconceptions about what the various cache control directives do and why cache behaves as it does.
Eliminating browser caching completely with a cache-control:
max-age=0, no-cache seems like overkill, since I'd still like to take
some pressure off the server by letting browsers cache temporarily ...
The no-cache option is wrong too. Including it means the browser will always
check with the server for modifications to the file every time.
That isn't what the no-cache means or what it is intended for - it means that a client MUST NOT used a cached copy to satisfy a subsequent request without successful revalidation - it does not and has never meant "do not cache" - that is what the no-store directive is for
Also the max-age directive is just the primary means for caches to calculate the freshness lifetime and expiration time of cached entries. The Expires header (minus the value of the Date header can also be used) - as can a heuristic based on the current UTC time and any Last-Modified header value.
Really if your goal is to retain the cached copy of a resource for as long as it is meaningful - whilst minimising requests and responses you have a number of options.
The Etag (Entity Tag) header - this is supplied by the server in response to a request in either a "strong" or "weak" form. It is usually a hash based on the resource in question. When a client re-requests a resource it can pass the stored value of the Etag with the If-None-Match request header. If the resource has not changed then the server will respond with 304 Not Modified.
You can think Etags as fingerprints for resources. They can be used to massively reduce the amount of information sent over the wire - as only fresh data is served - but they do not have any bearing on the number of times or frequency of requests.
The last-modified header - this is supplied by the server in response to a request in HTTPdate format - it tells the client the last time the resource was modified.
When a client re-requests a resource it can pass the stored value of the last-modified header with the If-Modified-Since request header. If the resource has not changed since the time it was last modified then the server will respond with 304 Not Modified.
You can think of last modified as a weaker form of entity checking than Etags. It addresses the same problem (bandwidth/redundancy) it in a less robust way and again it has no bearing at all on the actual number of requests made.
Revving - a technique that use a combination of the Expires header and the name (URN) of a resource. (see stevesouders blog post)
Here one basically sets a far forward Expires header - say 5 years from now - to ensure the static resource is cached for a long time.
You then have have two options for updating - either by appending a versioning query string to the requests URL - e.g. "/mystyles.css?v=1.1" - and updating the version number as and when the resource changes. Or better - versioning the file name itself e.g. "/mystyles.v1.1.css" so that each version is cached for as long as possible.
This way not only do you reduce the amount of bandwidth - you will as eliminate all checks to see if the resource has changed until you rename it.
I suppose the main point here is none of the catch control directives you mention max-age, public, etc have any bearing at all on if a 304 response is generated or not. For that use either Etag / If-None-Match or last-modified / If-Modified-Since or a combination of them (with If-Modified-Since as a fallback mechanism to If-None-Match).
It seems that I have misunderstood how it works, because some testing in Chrome has revealed exactly the behavior that I was looking for in the 5 steps I mentioned.
It doesn't blindly grab a fresh copy from the server when the max-age has expired. It does a GET, and if the response is 304 (Not Modified), it continues using the cached copy until the next hour has expired, at which point it checks for changes again etc.
The no-cache option is wrong too. Including it means the browser will always check with the server for modifications to the file every time. So what I was really looking for is:
Cache-Control: public, max-age=3600

Is Cache-Control:must-revalidate obliging to validate all requests, or just the stale ones?

I have a mess with this header, I have read that Cache-Control:must-revalidate oblige to validate all requests with the source before serving a cached item, but just the stale ones? or all no matter if stale or fresh? I have read both things in different places.
What is the difference with Cache-Control:no-cache ? Because these headers look equivalent to me.
UPDATE 1: I have read this from a book:
The Cache-Control: must-revalidate response header tells the cache
to bypass the freshness calculation mechanisms and revalidate on every
access:
#Peter O. has pointed out what the RFC says. So that old book is wrong.
UPDATE 2: In this tutorial : http://www.mnot.net/cache_docs/
no-cache — forces caches to submit the request to the origin server
for validation before releasing a cached copy, every time. This is
useful to assure that authentication is respected (in combination with
public), or to maintain rigid freshness, without sacrificing all of
the benefits of caching.
must-revalidate — tells caches that they must
obey any freshness information you give them about a representation.
HTTP allows caches to serve stale representations under special
conditions; by specifying this header, you’re telling the cache that
you want it to strictly follow your rules.
Section 14.9.4 of HTTP/1.1:
When the must-revalidate directive is present in a response
received by
a cache, that cache MUST NOT use the entry after it becomes
stale
to respond to a subsequent request without first revalidating it
with the
origin server
Section 14.8 of HTTP/1.1:
If the response includes the "must-revalidate" cache-control
directive, the cache MAY use that response in replying to a
subsequent request. But if the response is stale, all caches
MUST first revalidate it with the origin server...
So it appears that only stale responses must be revalidated if
must-revalidate is received.
For no-cache, see section 14.9.1:
If the no-cache directive does not specify a field-name [which is
the case
here], then a cache MUST NOT use the response to satisfy a
subsequent
request without successful revalidation with the origin server...
Thus, no-cache applies both to fresh and stale responses.
EDIT:
This phrase may be relevant here (section 13.3):
When a cache has a stale entry that it would like to use as a response
to a client's request, it first has to check with the origin server
(or possibly an intermediate cache with a fresh response) to see if
its cached entry is still usable.
So, must-revalidate is probably relevant when the cache has intermediate
caches, since otherwise the cache can check the intermediate cache for a
fresh response rather than check the origin server directly.

HTTP Headers: Controlling Cache and History Mechanism

I'm trying to figure out the best HTTP headers to send for four use cases. I'm hoping to come up with headers that do not depend on user agent / protocol version sniffing but I'll accept that if nothing else fits. All URLs are fetched through fully custom handler so I can select all headers as I like, this is all about intermediate proxies and user agents. If possible, this should be compatible with both HTTP/1.0 and HTTP/1.1 clients. If multiple solutions exists, the best one will be the shortest one when sent over the wire.
Static public content
All "Static public content" is stuff that HTTP is really all about: if the URL is the same, the content is the same. I can do this easily: for example, I put user profile icon into http://domain.com/profiles/xyz/icon/1234abcd where "1234abcd" is the SHA-1 of the file contents of the icon. If I change to icon in the future, I'll create a new URL and and modify all existing referrers that should use the new icon. What are the best headers to declare that this may be cached forever and may be shared? I'm currently thinking something along the lines:
Date: <current time>
Expires: <current time + one year>
Is this enough to allow caching by user agents and proxies? Do I need Last-Modified or Pragma?
Static non-public content
All "Static non-public content" is stuff that is static but may not be available to everybody. In fact, this content will be available only to selected logged in users (session is kept with session cookie holding session UUID). If the URL is the same, the content is the same. However, the response is not public. An use case could be an image shared to selected friends in a social network service. I'm currently thinking something along the lines:
Date: <current time>
Expires: <current time>
Cache-Control: private, max-age=<huge number>, s-maxage=0
Is this enough to allow caching by user agents and and disable proxies? Do I need Pragma?
Volatile public content
All "Volatile public content" is stuff that is volatile and available to everybody. Something like frontpage of http://slashdot.org/ when not logged in. The intent is to allow rapidly updating content in a non-changing URL. Note that I do NOT want to break the user agent history mechanism (that is, clicking something from a volatile page and then hitting the back button should not result in fetching the volatile page from the server -- however, clicking a link that goes to front page should fetch the resource from the server). I'm currently thinking something along the lines:
Date: <current time>
Expires: <current time>
Cache-Control: public, max-age=0, s-maxage=0
Is this enough to prevent caching but to allow history mechanism (back button)? I know that if I send Cache-Control: no-store, must-revalidate I can force reloading but this is not what I want because that will break the back button, too. Do I need Last-Modified or Pragma?
Even though this is public, it probably does not make sense to allow intermediate proxies to cache this because it's volatile.
Volatile non-public content
All "Volatile non-public content" is stuff that is volatile and not available to everybody (private). Something like frontpage of http://slashdot.org/ when you are logged in. The intent is to allow rapidly updating content in a non-changing URL. Note that I do NOT want to break the user agent history mechanism (that is, clicking something from a volatile page and then hitting the back button should not result in fetching the volatile page from the server -- however, clicking a link that goes to front page should fetch the resource from the server). I'm currently thinking something along the lines:
Date: <current time>
Expires: <current time>
Cache-Control: private, max-age=0, s-maxage=0
Is this enough to prevent caching but to allow history mechanism (back button)? Do I need Pragma?
Things that still need testing with my suggested headers:
Verify that private content will not be leaked through HTTP/1.0 proxies.
Verify that caching works correctly in proxies.
Verify that caching works correctly in user agents.
Verify that user agent history mechanism works in user agents (all cases).
Verify that following a link to a volatile page fetches fresh content from the server.
Verify all the results when using HTTPS instead of HTTP.
I'll answer my own question:
Static public content
Date: <current time>
Expires: <current time + one year>
Rationale: This is compatible with the HTTP/1.0 proxies and RFC 2616 Section 14: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.21
The Last-Modified header is not needed for correct caching (because conforming user agents follow the Expires header) but may be included for the end user consumption. Including the Last-Modified header may also decrease the server data transfer in case user hits the Reload/Refresh button. If Last-Modified header is added, it should reflect real data instead of something invented up. If you want to decrease server data transfer (in case user hits Reload/Refresh button) and cannot include real Last-Modified header, you may add ETag header to allow conditional GET (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26). If you already include Last-Modified also adding ETag is just waste. Note that Last-Modified is clearly superior because it's supported by HTTP/1.0 clients and proxies, too. A suitable value for ETag in case of dynamic pages is SHA-1 of the contents of the page/resource. Note that using Last-Modified or ETag will not help with the server load, only with the server outgoing internet pipe / data transfer rate.
Static non-public content
Date: <current time>
Expires: <current time>
Cache-Control: private, max-age=31536000, s-maxage=0
Vary: Cookie
Rationale: The Date and Expires headers are for HTTP/1.0 compatibility and because there's no sensible way to specify that the response is private, these headers communicate that the response may not be cached. The Cache-Control header tells that this response may be cached by private cache but shared cache may not cache the response. The s-maxage=0 is added because private may not be supported by all proxies that support Cache-Control (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.3 - I have no idea which proxies are broken). The max-age is set to value of 60*60*24*365 (1 year) because the HTTP/1.1 specification does not define any upper limit for this parameter, I guess that this is implementation dependant. The Expires headers SHOULD be limited to one year in the future, so using the same logic here should be okay. The Vary: Cookie header is required because the session that is used to check if the visitor is allowed to see the content is transferred in a cookie; because the returned response depends on the cookie value the cache may not use cached response if cookie header is changed.
I might personally break the last part. By not including the Vary: Cookie header I can improve caching a lot. For example: I have a profile image at http://example.com/icon/12 which is returned only for selected authenticated users. I have a visitor X with session id 5f2 and I allow the image to that user. Visitor X logs out and then later logs in again. Now X has session id 2e8 stored in his session cookie. If I have Vary: cookie, the user agent of X cannot use the cached image and is forced to reload this to its cache. Because the content varies by Cookie, a conditional GET with last modification time cannot be used. I haven't tested if using ETag could help in this case because in that case, the server response would be the same (match the SHA-1 ETag computed from the contents of the response). Be warned that Internet Explorer (at least up to version 9) always forces conditional GET for resources that include Vary: Cookie even if suitable response were already in cache (source: http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx). This is because internal cache implementation of MSIE does not remember which Cookie it sent the first time so it cannot know if the current Cookie is the same one.
However, here's an example of a problem that is caused by dropping the Vary: Cookie header to show why this is indeed required for technically correct behavior: see the example above and imagine that after X has logged out, visitor Y logs in with the same user agent (the user agent may have been restarted between X and Y, it does not matter). If Y views a page that includes a link to http://example.com/icon/12 then Y will see the icon embedded inside the page even though Y wouldn't be able to see the icon if X had not been using the same user agent previously. In my case I don't consider this a big enough problem because Y would be able to access the icon manually by inspecting the user agent cache regardless of possibly added Vary: Cookie. However, this issue may prevent Y from noticing that he wouldn't technically have access to this content (this may be important e.g. if Y is co-authoring the content). If the content is considered sensitive, the server must send no-store regardless of the problems caused by this Cache-Control directive.
Here too, adding Last-Modified header will help with users hitting Reload/Refresh button (see discussion above).
Volatile public content
Date: <current time>
Expires: <current time>
Cache-Control: public, max-age=0, s-maxage=0
Last-Modified: <real-last-modification-time>
Rationale: Tell HTTP/1.0 clients and proxies that this response should be considered stale immediately. The Last-Modified time is included to allow skipping content data transmission when the resource is accessed again and client supports conditional GET. If the Last-Modified cannot be used, ETag may be used as a replacement (see discussion above). It's critical to use Last-Modified to allow conditional GET with HTTP/1.0 compatible clients.
If the content may be delayed even slightly, then Expires, max-age and s-maxage [sic] should be adjusted suitably. For example, adding 5 seconds to those might help a lot for highly popular site, as suggested by symcbean's answer. Note that unlike conditional GET, increasing the expiry time will decrease server load instead of just decreasing server outgoing data traffic (because the server will see less requests in total).
Volatile non-public content
Date: <current time>
Expires: <current time>
Cache-Control: private, max-age=0, s-maxage=0
Last-Modified: <real-last-modification-time>
Vary: Cookie
Rationale: Tell HTTP/1.0 clients and proxies that this response should be considered stale immediately. The Last-Modified time is included to allow skipping content data transmission when the resource is accessed again and client supports conditional GET. If the Last-Modified cannot be used, ETag may be used as a replacement (see discussion above). It's critical to use Last-Modified to allow conditional GET with HTTP/1.0 compatible clients. Also note that Cache-Control must not include no-cache, must-revalidate or no-store because using any of these directives will break the back button in at least one user agent. However, if the content the server is transferring contains sensitive material that should not be stored in permanent storage, the no-store flag MUST be used regardless of breaking the back button. Warning: note that the use of no-store cannot prevent sensitive material ending up on the hard disk without encryption if the operating system has swapping enabled and the swap is not encrypted! Also note that using no-store makes very little sense unless the connection is encrypted (HTTPS/SSL).
Mostly OK, however you do need to bear in mind that HTTP/1.0 proxies may cache content served up as
Cache-Control: private
So you should set an explicit Date-modified header as well as the expires header.
For your 'Static non-public content' you should add a 'Varies: Cookie' header.
For your 'Volatile public content': How fast is it changing? Setting an TTL of +5 seconds may offload a lot of effort from your servers.
For 'Volatile non-public content' you should probably add no-cache,must-revalidate to the Cache-control header.
Pragma headers issued from the server should have no effect on clients nor proxies.
Do test out what happens when your cache expires (IME you can end up with a system even slower than one accessed with no populated cache due to all the conditional requests / 304 responses)

Why both no-cache and no-store should be used in HTTP response?

I'm told to prevent user-info leaking, only "no-cache" in response is not enough. "no-store" is also necessary.
Cache-Control: no-cache, no-store
After reading this spec http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html, I'm still not quite sure why.
My current understanding is that it is just for intermediate cache server. Even if "no-cache" is in response, intermediate cache server can still save the content to non-volatile storage. The intermediate cache server will decide whether using the saved content for following request. However, if "no-store" is in the response, the intermediate cache sever is not supposed to store the content. So, it is safer.
Is there any other reason we need both "no-cache" and "no-store"?
I must clarify that no-cache does not mean do not cache. In fact, it means "revalidate with server" before using any cached response you may have, on every request.
must-revalidate, on the other hand, only needs to revalidate when the resource is considered stale.
If the server says that the resource is still valid then the cache can respond with its representation, thus alleviating the need for the server to resend the entire resource.
no-store is effectively the full do not cache directive and is intended to prevent storage of the representation in any form of cache whatsoever.
I say whatsoever, but note this in the RFC 2616 HTTP spec:
History buffers MAY store such responses as part of their normal operation
But this is omitted from the newer RFC 7234 HTTP spec in potentially an attempt to make no-store stronger, see:
https://www.rfc-editor.org/rfc/rfc7234#section-5.2.1.5
Under certain circumstances, IE6 will still cache files even when Cache-Control: no-cache is in the response headers.
The W3C states of no-cache:
If the no-cache directive does not
specify a field-name, then a cache
MUST NOT use the response to satisfy a
subsequent request without successful
revalidation with the origin server.
In my application, if you visited a page with the no-cache header, then logged out and then hit back in your browser, IE6 would still grab the page from the cache (without a new/validating request to the server). Adding in the no-store header stopped it doing so. But if you take the W3C at their word, there's actually no way to control this behavior:
History buffers MAY store such responses as part of their normal operation.
General differences between browser history and the normal HTTP caching are described in a specific sub-section of the spec.
From the HTTP 1.1 specification:
no-store:
The purpose of the no-store directive is to prevent the inadvertent release or retention of sensitive information (for example, on backup tapes). The no-store directive applies to the entire message, and MAY be sent either in a response or in a request. If sent in a request, a cache MUST NOT store any part of either this request or any response to it. If sent in a response, a cache MUST NOT store any part of either this response or the request that elicited it. This directive applies to both non- shared and shared caches. "MUST NOT store" in this context means that the cache MUST NOT intentionally store the information in non-volatile storage, and MUST make a best-effort attempt to remove the information from volatile storage as promptly as possible after forwarding it.
Even when this directive is associated with a response, users might explicitly store such a response outside of the caching system (e.g., with a "Save As" dialog). History buffers MAY store such responses as part of their normal operation.
The purpose of this directive is to meet the stated requirements of certain users and service authors who are concerned about accidental releases of information via unanticipated accesses to cache data structures. While the use of this directive might improve privacy in some cases, we caution that it is NOT in any way a reliable or sufficient mechanism for ensuring privacy. In particular, malicious or compromised caches might not recognize or obey this directive, and communications networks might be vulnerable to eavesdropping.
no-store should not be necessary in normal situations, and can harm both speed and usability. It is intended for use where the HTTP response contains information so sensitive it should never be written to a disk cache at all, regardless of the negative effects that creates for the user.
How it works:
Normally, even if a user agent such as a browser determines that a response shouldn't be cached, it may still store it to the disk cache for reasons internal to the user agent. This version may be utilised for features like "view source", "back", "page info", and so on, where the user hasn't necessarily requested the page again, but the browser doesn't consider it a new page view and it would make sense to serve the same version the user is currently viewing.
Using no-store will prevent that response being stored, but this may impact the browser's ability to give "view source", "back", "page info" and so on without making a new, separate request for the server, which is undesirable. In other words, the user may try viewing the source and if the browser didn't keep it in memory, they'll either be told this isn't possible, or it will cause a new request to the server. Therefore, no-store should only be used when the impeded user experience of these features not working properly or quickly is outweighed by the importance of ensuring content is not stored in the cache.
My current understanding is that it is just for intermediate cache server. Even if "no-cache" is in response, intermediate cache server can still save the content to non-volatile storage.
This is incorrect. Intermediate cache servers compatible with HTTP 1.1 will obey the no-cache and must-revalidate instructions, ensuring that content is not cached. Using these instructions will ensure that the response is not cached by any intermediate cache, and that all subsequent requests are sent back to the origin server.
If the intermediate cache server does not support HTTP 1.1, then you will need to use Pragma: no-cache and hope for the best. Note that if it doesn't support HTTP 1.1 then no-store is irrelevant anyway.
If you want to prevent all caching (e.g. force a reload when using the back button) you need:
no-cache for IE
no-store for Firefox
There's my information about this here:
http://blog.httpwatch.com/2008/10/15/two-important-differences-between-firefox-and-ie-caching/
For chrome, no-cache is used to reload the page on a re-visit, but it still caches it if you go back in history (back button). To reload the page for history-back as well, use no-store. IE needs must-revalidate to work in all occasions.
So just to be sure to avoid all bugs and misinterpretations I always use
Cache-Control: no-store, no-cache, must-revalidate
if I want to make sure it reloads.
If a caching system correctly implements no-store, then you wouldn't need no-cache. But not all do. Additionally, some browsers implement no-cache like it was no-store. Thus, while not strictly required, it's probably safest to include both.
Note that Internet Explorer from version 5 up to 8 will throw an error when trying to download a file served via https and the server sending Cache-Control: no-cache or Pragma: no-cache headers.
See http://support.microsoft.com/kb/812935/en-us
The use of Cache-Control: no-store and Pragma: private seems to be the closest thing which still works.
Originally we used no-cache many years ago and did run into some problems with stale content with certain browsers... Don't remember the specifics unfortunately.
We had since settled on JUST the use of no-store. Have never looked back or had a single issue with stale content by any browser or intermediaries since.
This space is certainly dominated by reality of implementations vs what happens to have been written in various RFCs. Many proxies in particular tend to think they do a better job of "improving performance" by replacing the policy they are supposed to be following with their own.
Just to make things even worse, in some situations, no-cache can't be used, but no-store can:
http://faindu.wordpress.com/2008/04/18/ie7-ssl-xml-flex-error-2032-stream-error/
To answer the question, there are two players here, the client (request) and the server (response).
Client:
The client can only request with ONE cache method. There are different methods and if not specified, will use default.
default: Inspect browser cache:
If cached and "fresh": Return from cache.
If cached, stale, but still "valid": Return from cache, and schedule a fetch to update cache (for next use).
If cached and stale: Fetch with conditions, cache, and return.
If not cached: Fetch, cache, and return.
no-store: Fetch and return.
reload: Fetch, cache, and return. (default-4)
no-cache: Inspect browser cache:
If cached: Fetch with conditions, cache, and return. (default-3)
If not cached: Fetch, cache, and return. (default-4)
force-cache: Inspect browser cache:
If cached: Return it regardless if stale.
If not cache: Fetch, cache, and return. (default-4)
only-if-cached: Inspect browser cache:
If cached: Return it regardless if stale.
If not cached: Throw network error.
Notes:
Still "valid" means the current age is within the stale-while-revalidate lifetime. It needs "revalidation", but is still acceptable to return.
"Fetch" here, for simplicity, is short for "non-conditional network
fetch".
"Fetch with conditions" means fetch using headers like
If-Modified-Since, or ETag so the server can respond with 304: (Not Modified).
https://fetch.spec.whatwg.org/#concept-request-cache-mode
Server::
Now that we understand what the client can do, the server responses make more sense.
Looking at the Cache-Control header, if the server returns:
no-store: Tells client to not use cache at all
no-cache: Tells client it should do conditional requests and ignore freshness
max-age: Tells client how long a cache is "fresh"
stale-while-revalidate: Tells client how long cache is "valid"
immutable: Cache forever
Now we can put it all together. That means the only possibilities are:
Non-conditional network fetch
Conditional network fetch
Return stale cache
Return stale but valid cache
Return fresh cache
Return any cache
Any combination of client, or server can dictate what method, or set of methods, to use. If the server returns no-store, it's not going to hit the cache, no matter what the client request type. If the client request was no-store, it doesn't matter what the server returns, it won't cache. If the client doesn't specify a request type, the server will dictate it with Cache-Control.
It makes no sense for a server to return both no-cache and no-store since no-store overrides everything. Yes, you've probably seen both together, and it's useless outside of broken browser implementations. Still, no-store has been part of spec since 1999: https://datatracker.ietf.org/doc/html/rfc2616#section-14.9.2
In real life usage, if your server supports 304: Not Modified, and you want to use client cache as a way to improve speed, but still want to force a network fetch, use no-cache. If don't support 304, and want to force a network fetch, use no-store. If you're okay with cache sometimes, use freshness and revalidation headers.
In reality, if you're mixing up no-cache and no-store on the client, very little would change. Then, just a couple of headers get sent and there will different internal responses handled by the browser. An issue can occur if you use no-cache and then forget to use it later. no-cache tells it to store the response in the cache, and a later request without it might trigger internal cache.
There are times when you may want to mix methods even on the same resource based on context. For example, you may want to use reload on a service worker and background sync, but use default for the web page itself. This is where you can manipulate the user agent (browser) cache to your liking. Just remember that the server generally has the final say as to how the cache should work.
To clarify some possible future confusion. The client can use the Cache-Control header on the request, to tell the server to not use its own cache system when responding. This is unrelated to the browser/server dynamic, and more about the server/database dynamic.
Also no-store technically means must not store to any non-volatile storage (disk) and release it from volatile storage (memory) ASAP. In practice, it means don't use a cache at all. The command actually goes both ways. A client request with no-store shouldn't write to disk or database and is meant to transient.
TL;DR: no-store overrides no-cache. Setting both is useless, unless we are talking out-of-spec or HTTP/1.0 browsers that don't support no-store (Maybe IE11?). Use no-cache for 304 support.
A pretty old topic but I'll share some recent ideas:
no-store: Must not attempt to store anything, and must also take action to delete any copy it might have.
no-cache: Never use a local copy without first validating with the origin server. It prevents all possibility of a cache hit, even with fresh resources.
So, answering the question, using only one of them is enough.
Also, some (not very) recent works prove that browsers are more Cache-Control compatible nowadays.
OWASP discusses this:
What's the difference between the cache-control directives: no-cache, and no-store?
The no-cache directive in a response indicates that the response must not be used to serve a subsequent request i.e. the cache must not display a response that has this directive set in the header but must let the server serve the request. The no-cache directive can include some field names; in which case the response can be shown from the cache except for the field names specified which should be served from the server. The no-store directive applies to the entire message and indicates that the cache must not store any part of the response or any request that asked for it.
Am I totally safe with these directives?
No. But generally, use both Cache-Control: no-cache, no-store and Pragma: no-cache, in addition to Expires: 0 (or a sufficiently backdated GMT date such as the UNIX epoch). Non-html content types like pdf, word documents, excel spreadsheets, etc often get cached even when the above cache control directives are set (although this varies by version and additional use of must-revalidate, pre-check=0, post-check=0, max-age=0, and s-maxage=0 in practice can sometimes result at least in file deletion upon browser closure in some cases due to browser quirks and HTTP implementations). Also, 'Autocomplete' feature allows a browser to cache whatever the user types in an input field of a form. To check this, the form tag or the individual input tags should include 'Autocomplete="Off" ' attribute. However, it should be noted that this attribute is non-standard (although it is supported by the major browsers) so it will break XHTML validation.
Source here.

Resources