Resource cached with max-age=0 - http

I have a resource that is served with the following headers:
Cache-Control: private, max-age=0
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/javascript;charset=utf-8
Date: Wed, 05 May 2021 12:22:58 GMT
etag: "1609459200000-gzip"
expires: Wed, 05 May 2021 12:22:58 GMT
last-modified: Fri, 01 Jan 2021 00:00:00 GMT
Vary: Accept-Encoding
Yesterday we updated this resource and we noted that about 10% of the traffic is, 24h later, still using the stale resource. In those cases, the browser uses the old cached resource without hitting the server (to revalidate or get a fresh version).
Why do some browsers not honor max-age and not revalidate (or get a fresh version of) this resource?
Is it mandatory to use, in this case, no-cache to force a revalidation?
If I want to use the cached resource and revalidate only after 10 seconds, will this be enough for any modern browser?
Cache-Control: private, must-revalidate, max-age=10

Related

CloudFront ignores Cache-Control headers

CloudFront ignores my cache header and my pictures have to be picked up from the server again after a while.
~$ curl -I http://d2573vy43ojbo7.cloudfront.net/attachments/store/limit/64/3720c5574063aebc90511061b99de858740ad764c6981d2bf30ff121ada0/image.jpg
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 1645
Connection: keep-alive
Server: nginx/1.4.1
Date: Thu, 12 Feb 2015 14:37:41 GMT
Status: 200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers:
Access-Control-Allow-Method:
Cache-Control: public, must-revalidate, max-age=31536000
Expires: Fri, 12 Feb 2016 14:37:41 GMT
Content-Disposition: inline; filename="image.jpg"
Last-Modified: Thu, 12 Feb 2015 14:37:41 GMT
X-Content-Type-Options: nosniff
X-Request-Id: 239b0fda-cae9-452f-9d1b-ccbf035bbf69
X-Runtime: 3.457939
X-Cache: Miss from cloudfront
Via: 1.1 6cde3c778df412041adc7610331b57bc.cloudfront.net (CloudFront)
X-Amz-Cf-Id: yicAkZYc5XpowKRFMOXDKSJKBMWZ4kq2B3vLK8Q-Py124D8lQq_1lg==
I tried to get the same file yesterday and saw the same thing: after the second request it was cached and served by CloudFront, but not anymore. It's the same for all my images. They are cached but are removed from the cache after a couple of hours.
What's wrong? My cache behavior settings on CloudFront are set to default, and it uses Origin Cache Headers.
Take a look here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that are more popular.
It means that the object is not popular enough to stay in the cache for a longer time. If you had enough viewers hitting this object AND this particular CloudFront location, it would have stayed in the cache longer.

Weird HTTP Response Arduino

So, I wrote a program that is supposed to connect to a server, which returns the time. It works with my server, but when I tried it with another server, it responds oddly. Here is the response from my server:
HTTP/1.1 200 OK
Date: Tue, 07 Jan 2014 00:06:20 GMT
Server: Apache/2.2.22 (Debian)
X-Powered-By: PHP/5.4.4-14+deb7u5
Set-Cookie: PHPSESSID=jlscamqbddtqibf9j7m0fu27p5; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Vary: Accept-Encoding
Content-Length: 6
Connection: close
Content-Type: text/html
4:06pm
which works great. Now here is the response from the other server (doesn't work):
HTTP/1.1 200 OK
Date: Tue, 07 Jan 2014 00:06:34 GMT
Server: Apache
Set-Cookie: PHPSESSID=krlqmoqgpiqm9b9u27agup53c7; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Connection: close
Content-Type: text/html
6
4:06pm
0
As you can see, I'm getting some weird stuff before and after the expected response. The code on the server is exactly the same, and the code on the Arduino is the same except for a couple of strings.
Here is a pastebin of the code I am using: http://pastebin.com/TFF5h2Gw
Sorry there aren't a lot of comments and it's kinda jumbled together. I omitted a little bit of code that is used by other stuff that I haven't even gotten to test yet because I can't even get the time.
What you are seeing is a chunk-encoded response: the 6 is the length of the chunk in hexadecimal, and the 0 marks the final, empty chunk. That is okay, as all HTTP/1.1-capable clients are supposed to understand this transfer encoding. What is weird is that the server is not explicitly marking the response as chunk-encoded (this is usually done via the Transfer-Encoding: chunked header).
A quick way to get rid of this is to issue an HTTP/1.0 request, since chunked encoding does not exist in HTTP/1.0.
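If you would rather keep HTTP/1.1, the client has to undo the chunking itself. Here is a minimal sketch of that in Python (not Arduino code, just to show the logic). The function name decode_chunked is made up for illustration, and the sketch assumes the whole body is already in memory and ignores trailers:

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked message body into the raw payload."""
    payload = bytearray()
    pos = 0
    while True:
        # Each chunk starts with its size in hexadecimal, terminated by CRLF.
        line_end = body.index(b"\r\n", pos)
        # Anything after ";" on the size line is a chunk extension; drop it.
        size_field = body[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field, 16)
        pos = line_end + 2
        if size == 0:
            # A zero-sized chunk marks the end of the body.
            break
        payload += body[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(payload)


# The body of the second response above, as it appears on the wire.
raw = b"6\r\n4:06pm\r\n0\r\n\r\n"
print(decode_chunked(raw))  # b'4:06pm'
```

The same loop translates to the Arduino fairly directly: read a line, parse it as hex, read that many bytes, skip the CRLF, and stop at a size of 0.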

HTTP cache control: no expire date

I've found some HTTP headers related to caching:
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
I would like to set headers such that once the webpage is loaded it gets cached. It should expire after 24 hours, and if the browser wants to load it before it expires, it should load it from the cache (and not revalidate).
To set your content to expire after 24 hours, the HTTP headers should be:
Cache-Control: max-age=86400, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
max-age tells the client that the content is stale after 86400 seconds. must-revalidate tells the client to revalidate the content once it has expired.
You can omit Expires for HTTP/1.1 clients; for HTTP/1.0 clients the Expires header should still be used.
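As a rough illustration of how a cache applies max-age, here is a small Python sketch. The helper names max_age_seconds and is_fresh are invented for this example, and it deliberately ignores Expires, no-cache and heuristic freshness, so treat it as a simplification rather than what any particular browser does:

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value from a Cache-Control header, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """True while the cached copy may be reused without contacting the server."""
    max_age = max_age_seconds(cache_control)
    # The copy is fresh only while its age is below the freshness lifetime.
    return max_age is not None and age_seconds < max_age


header = "max-age=86400, must-revalidate"
print(is_fresh(header, 3600))   # True: one hour old, served from the cache
print(is_fresh(header, 90000))  # False: older than 24 hours, must be revalidated
```

With max-age=0 the copy is never fresh in this model, so every use should trigger a revalidation.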
Refer to the following links for more details:
must revalidate
max-age
expire vs max-age

Difference between Pragma and Cache-Control headers?

I read about Pragma header on Wikipedia which says:
"The Pragma: no-cache header field is an HTTP/1.0 header intended for
use in requests. It is a means for the browser to tell the server and
any intermediate caches that it wants a fresh version of the resource,
not for the server to tell the browser not to cache the resource. Some
user agents do pay attention to this header in responses, but the
HTTP/1.1 RFC specifically warns against relying on this behavior."
But I haven't understood what it does? What is the difference between the Cache-Control header whose value is no-cache and Pragma whose value is also no-cache?
Pragma is the HTTP/1.0 implementation and cache-control is the HTTP/1.1 implementation of the same concept. They both are meant to prevent the client from caching the response. Older clients may not support HTTP/1.1 which is why that header is still in use.
There is no difference, except that Pragma is only defined as applicable to the requests by the client, whereas Cache-Control may be used by both the requests of the clients and the replies of the servers.
So, as far as standards go, they can only be compared from the perspective of the client making a request and the server receiving that request. RFC 2616, section 14.32 (http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32) defines the scenario as follows:
HTTP/1.1 caches SHOULD treat "Pragma: no-cache" as if the client had
sent "Cache-Control: no-cache". No new Pragma directives will be
defined in HTTP.
Note: because the meaning of "Pragma: no-cache as a response
header field is not actually specified, it does not provide a
reliable replacement for "Cache-Control: no-cache" in a response
The way I would read the above:
if you're writing a client and need no-cache:
  just use Pragma: no-cache in your requests, since you may not know if Cache-Control is supported by the server;
  but in replies, to decide on whether to cache, check for Cache-Control
if you're writing a server:
  in parsing requests from the clients, check for Cache-Control; if not found, check for Pragma: no-cache, and execute the Cache-Control: no-cache logic;
  in replies, provide Cache-Control.
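The server-side rule can be sketched in Python roughly like this. The function name client_wants_no_cache is hypothetical, and the headers are assumed to arrive as a plain name-to-value mapping:

```python
from typing import Mapping, Set


def _directives(value: str) -> Set[str]:
    """Split a header value into lower-cased directive names, e.g. "max-age=0" -> "max-age"."""
    return {part.split("=", 1)[0].strip().lower() for part in value.split(",") if part.strip()}


def client_wants_no_cache(request_headers: Mapping[str, str]) -> bool:
    """Check Cache-Control first; fall back to Pragma only if it is missing."""
    # Header names are case-insensitive, so normalise them first.
    headers = {name.lower(): value for name, value in request_headers.items()}
    if "cache-control" in headers:
        return "no-cache" in _directives(headers["cache-control"])
    # No Cache-Control at all: treat "Pragma: no-cache" as "Cache-Control: no-cache".
    return "no-cache" in _directives(headers.get("pragma", ""))


print(client_wants_no_cache({"Pragma": "no-cache"}))                                # True
print(client_wants_no_cache({"Cache-Control": "max-age=0", "Pragma": "no-cache"}))  # False
```

In the second call Cache-Control is present, so Pragma is not consulted at all.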
Of course, reality might be different from what's written or implied in the RFC!
Stop using (HTTP 1.0)    Replaced with (HTTP 1.1 since 1999)
Expires: [date]          Cache-Control: max-age=[seconds]
Pragma: no-cache         Cache-Control: no-cache
If it's after 1999, and you're still using Expires or Pragma, you're doing it wrong.
I'm looking at you, Stack Overflow:
200 OK
Pragma: no-cache
Content-Type: application/json
X-Frame-Options: SAMEORIGIN
X-Request-Guid: a3433194-4a03-4206-91ea-6a40f9bfd824
Strict-Transport-Security: max-age=15552000
Content-Length: 54
Accept-Ranges: bytes
Date: Tue, 03 Apr 2018 19:03:12 GMT
Via: 1.1 varnish
Connection: keep-alive
X-Served-By: cache-yyz8333-YYZ
X-Cache: MISS
X-Cache-Hits: 0
X-Timer: S1522782193.766958,VS0,VE30
Vary: Fastly-SSL
X-DNS-Prefetch-Control: off
Cache-Control: private
tl;dr: Pragma is a legacy of HTTP/1.0 and hasn't been needed since Internet Explorer 5, or Netscape 4.7. Unless you expect some of your users to be using IE5: it's safe to stop using it.
Expires: [date] (deprecated - HTTP 1.0)
Pragma: no-cache (deprecated - HTTP 1.0)
Cache-Control: max-age=[seconds]
Cache-Control: no-cache (must re-validate the cached copy every time)
And the conditional requests:
ETag (entity tag) based conditional requests
Server: ETag: W/"1d2e7-1648e509289"
Client: If-None-Match: W/"1d2e7-1648e509289"
Server: 304 Not Modified
Modified date based conditional requests
Server: last-modified: Thu, 09 May 2019 19:15:47 GMT
Client: If-Modified-Since: Thu, 09 May 2019 19:15:47 GMT
Server: 304 Not Modified
last-modified: Thu, 09 May 2019 19:15:47 GMT
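For illustration, here is roughly how a server could decide between 200 and 304 for these two kinds of conditional request. This is a simplified Python sketch: conditional_status and _opaque are made-up names, header names are matched exactly as written, and entity tags are assumed not to contain commas:

```python
from email.utils import parsedate_to_datetime
from typing import Mapping, Optional


def _opaque(tag: str) -> str:
    """Drop the weak-validator prefix so that W/"x" and "x" compare equal."""
    return tag[2:] if tag.startswith("W/") else tag


def conditional_status(request_headers: Mapping[str, str],
                       etag: Optional[str] = None,
                       last_modified: Optional[str] = None) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence over If-Modified-Since when both are sent.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        if etag is not None and ("*" in candidates
                                 or _opaque(etag) in [_opaque(t) for t in candidates]):
            return 304
        return 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None and last_modified is not None:
        # Unchanged if the resource was not modified after the client's date.
        if parsedate_to_datetime(last_modified) <= parsedate_to_datetime(if_modified_since):
            return 304
    return 200


tag = 'W/"1d2e7-1648e509289"'
print(conditional_status({"If-None-Match": tag}, etag=tag))            # 304
print(conditional_status({"If-None-Match": '"other"'}, etag=tag))      # 200
```

A 304 carries no body, so the client keeps using its cached copy; a 200 means the full resource is sent again.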

IIS6 not doing gzip compression when including Via header in request

I have some static content going through a CDN. I am using IIS6's built in compression (gzip & deflate) for static content and this is working fine when I request it. However, when the CDN makes the initial request for the content, it is not being returned compressed. They therefore don't have compressed content to forward to people requesting it. (Yes this raises the issue of people requesting [the zipped] content from the CDN with a browser that can't handle the compression. - We'll put that to one side for now though)
Here's an example of requesting without the 'Via' header:
HEAD /flash/swfobject.js HTTP/1.1
User-Agent: curl/7.19.7 (i386-pc-win32)
Host: localhost:9120
Accept: */*
Connection: Keep-Alive
accept-encoding: gzip
And it returns a compressed response:
HTTP/1.1 200 OK
Content-Length: 4357
Content-Type: application/x-javascript
Content-Encoding: gzip
Expires: Wed, 01 Jan 2020 00:00:00 GMT
Last-Modified: Wed, 18 Nov 2009 15:36:52 GMT
Accept-Ranges: bytes
Vary: Accept-Encoding
Server: Microsoft-IIS/6.0
Date: Thu, 19 Nov 2009 10:27:50 GMT
However, if I include a 'Via' header in the request (as the CDN does) then the result comes back uncompressed:
Request:
HEAD /flash/swfobject.js HTTP/1.1
User-Agent: curl/7.19.7 (i386-pc-win32)
Host: localhost:9120
Accept: */*
Connection: Keep-Alive
Via: 1.1 204.160.105.17:80 (Footprint 4.5/FPMCP)
accept-encoding: gzip
Response:
HTTP/1.1 200 OK
Content-Length: 14602
Content-Type: application/x-javascript
Expires: Wed, 01 Jan 2020 00:00:00 GMT
Last-Modified: Wed, 18 Nov 2009 15:36:54 GMT
Accept-Ranges: bytes
Server: Microsoft-IIS/6.0
Date: Thu, 19 Nov 2009 10:29:52 GMT
Yes these demos use 'localhost' in the request. I get the same result using the actual domain name from various machines on various networks though.
Two questions then:
Could this be IIS not applying the compression due to the extra header? and if so what can I do about it?
How can I tell if the proxy is decompressing the content before returning it?
Bonus question 3 - What can I do to investigate this problem further?
I am aware of question 332049, but that has the header in the response, not the request.
I stumbled across your question while researching this myself. I uncovered an article on MSDN and the short answer is that the Via header is used for proxies and proxies typically mess up compression. You either have the option of removing the header or you can change the setting in the IIS metabase (HcNoCompressionForProxies="FALSE"). I had success with both options.
