Quote from Cache-Control:
no-cache
The no-cache request directive asks caches to validate the response with the origin server before reuse.
Cache-Control: no-cache
no-cache allows clients to request the most up-to-date response even if the cache has a fresh response.
My understanding is that when using no-cache, the caches validate the response's time with the system time, if it's not the same, the clients will then request the latest response, is that correct?
If not, then how do caches validate the response when using Cache-Control: no-cache?
Thanks to Joe's comment, quoting from Hypertext Transfer Protocol (HTTP/1.1): Caching:
Validation
When a cache has one or more stored responses for a requested URI,
but cannot serve any of them (e.g., because they are not fresh, or
one cannot be selected; see Section 4.1), it can use the conditional
request mechanism [RFC7232] in the forwarded request to give the next
inbound server an opportunity to select a valid stored response to
use, updating the stored metadata in the process, or to replace the
stored response(s) with a new response. This process is known as
"validating" or "revalidating" the stored response.
Related
I have a website that caches data, it uses a content-delivery-network called akamai, and this is the response header. 'cache-control': 'must-revalidate, max-age=600'. This means, re-validate after 600 seconds (stale). If i want the cdn to query the origin server each request, i can do this... cache-control: no-cache. When i send this request, i get the same response header... indicating that it isn't being re-validated? Is it actually not being re-validated, or is it being re-validated? Since the website is well-known, it is safe to say that the website is correctly responding to headers.
What you've observed is correct behavior.
Your Cache-Control request header applies to this request, while the Cache-Control response applies to future requests. Whether or not your client wants a fresh response to this request will not and should not change the server's general directions as to how its resources can be cached.
As long as you use no-cache in your requests you should not get a cached response.
See the screenshot above. The response header has a cache-control set to max-age, which means the maximum amount of time a resource is considered fresh. I believe if we make a request within the time frame, the browser will serve the local copies without bothering asking the server. and the request header has a cache-control set to no-cache, that means, according to MDN,
response may be stored by any cache, even if the request is normally
non-cacheable. However, the stored response MUST always go through
validation with the origin server first before using it,
So here we have a contradiction:
the cache-control directive in the request is no-cache, so the user agent has to consult the server first before using the cache to fulfill the request.
The cache-control in response has a max-age being 86400, suggesting that within that time frame user agents can just use the cache to fulfill the request.
If the time specified in response's max-age hasn't expired, does the browser bypass the cache and send a request to the server because of its no-cache or not?
If the time specified in response's max-age hasn't expired, does the browser bypass the cache and send a request to the server because of its no-cache or not?
Yes, a request will be sent to the origin server. From the specification:
The no-cache request directive indicates that a cache MUST NOT use
a stored response to satisfy the request without successful
validation on the origin server.
There's no contradiction. The max-age in the response indicates how long it can be considered to be fresh. It doesn't obligate anyone to use it. Indeed, caching is an entirely optional part of HTTP, so sending a full request to the origin every time would also be fully compliant with the specification.
Now imagine that the response uses no-cache and the request uses max-age=86400. Again, a request would be sent to the origin server, because "the no-cache response directive indicates that the response MUST NOT be used to satisfy a subsequent request without successful validation on the origin server."
So the real asymmetry here is not between requests and responses, but between caching (optional) and not caching (obligatory when specified).
If the time specified in response's max-age hasn't expired, does the browser bypass the cache and send a request to the server because of
its no-cache or not?
Yes, it will be bypassed and sent a request to the server.
If the client sets max-age and there is no max-stale present, there is no request until the max-age expires. On the other hand, If the client sets no-cache, it always means a request sent without any conditions.
In conclusion, the max-age value of the current request compare to the last value of the response, and if there is no value or equal to no-cache that means always must send a request because the client not spouse to cache anything about that resource
I have one question: suppose in each http request there is a cache-control: max-age=0 header, so each request will go all the way to the origin web server.
Does it mean CDN is not useful anymore if all requests are like this?
from other post:
When sent by the user agent
I believe shahkalpesh's answer applies to the user agent side. You can also look at 13.2.6 Disambiguating Multiple Responses.
If a user agent sends a request with Cache-Control: max-age=0 (aka. "end-to-end revalidation"), then each cache along the way will revalidate its cache entry (eg. with the If-Not-Modified header) all the way to the origin server. If the reply is then 304 (Not Modified), the cached entity can be used.
On the other hand, sending a request with Cache-Control: no-cache (aka. "end-to-end reload") doesn't revalidate and the server MUST NOT use a cached copy when responding.
It makes sense and match my result.
when cache is not expired in chrome,it will send request to CDN,CDN will query this with if-modified-since with origin ,then serve the end user.
By setting the max-age to 0, you effectively expire your page in your CDN edge cache immediately. Therefore, your CDN always hit your origin and render the CDN useless as you suggested.
Noticed from your other question that you are using Akamai. If so, then you can use the Edge-Control header to override your cache-control if you don't have direct control over that value, but still want to be able to leverage CDN functionality.
I'm trying to understand how the cache-control http header works.
The cache-control header can have the no-cache value. I have checked the definition in w3c and it said:
If the no-cache directive does not specify a field-name, then a cache
MUST NOT use the response to satisfy a subsequent request without
successful revalidation with the origin server.
It tells no-cache value will trigger validation for every request.
What I want to know is, what is cache validation and what it does in the http protocol?
thanks for your help guys. now i understand validation means check if cache contain latest content from server.
my further question would be what issues no-cache will fix. please provide some scenario, like after applied no-cache in http header, what security issue will be fixed.
thanks guys
The no-cache directive is not intended for a security purpose. Security gets covered in rules that define which data/resources a cdn/proxy server is not permitted to cache. So, if security is required, the no-store directive should be used by the client/server. Look under :
paragraph 2 under section 13.4 on https://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html
https://www.rfc-editor.org/rfc/rfc7234#section-3
The no-cache directive is used by the client when it is ready to accept a resource from a cache, provided there is a confirmation from the server that the cached resource is up to date (fresh). The proxy/cdn can use two methods to re-validate the resource's freshness :
If client sent an ETAG value, proxy/cdn can forward it to the server under an If-None-Match header. If server responds with '304 Not Modified', then the cached resource is fresh to serve.
Using an If-Modified-Since header with a date value that was received the last time the resource was downloaded from the server (was to be found under the Last-Modified header in server's last response).
I have a mess with this header, I have read that Cache-Control:must-revalidate oblige to validate all requests with the source before serving a cached item, but just the stale ones? or all no matter if stale or fresh? I have read both things in different places.
What is the difference with Cache-Control:no-cache ? Because these headers look equivalent to me.
UPDATE 1: I have read this from a book:
The Cache-Control: must-revalidate response header tells the cache
to bypass the freshness calculation mechanisms and revalidate on every
access:
#Peter O. has pointed out what the RFC says. So that old book is wrong.
UPDATE 2: In this tutorial : http://www.mnot.net/cache_docs/
no-cache — forces caches to submit the request to the origin server
for validation before releasing a cached copy, every time. This is
useful to assure that authentication is respected (in combination with
public), or to maintain rigid freshness, without sacrificing all of
the benefits of caching.
must-revalidate — tells caches that they must
obey any freshness information you give them about a representation.
HTTP allows caches to serve stale representations under special
conditions; by specifying this header, you’re telling the cache that
you want it to strictly follow your rules.
Section 14.9.4 of HTTP/1.1:
When the must-revalidate directive is present in a response
received by
a cache, that cache MUST NOT use the entry after it becomes
stale
to respond to a subsequent request without first revalidating it
with the
origin server
Section 14.8 of HTTP/1.1:
If the response includes the "must-revalidate" cache-control
directive, the cache MAY use that response in replying to a
subsequent request. But if the response is stale, all caches
MUST first revalidate it with the origin server...
So it appears that only stale responses must be revalidated if
must-revalidate is received.
For no-cache, see section 14.9.1:
If the no-cache directive does not specify a field-name [which is
the case
here], then a cache MUST NOT use the response to satisfy a
subsequent
request without successful revalidation with the origin server...
Thus, no-cache applies both to fresh and stale responses.
EDIT:
This phrase may be relevant here (section 13.3):
When a cache has a stale entry that it would like to use as a response
to a client's request, it first has to check with the origin server
(or possibly an intermediate cache with a fresh response) to see if
its cached entry is still usable.
So, must-revalidate is probably relevant when the cache has intermediate
caches, since otherwise the cache can check the intermediate cache for a
fresh response rather than check the origin server directly.