HTTP cache control: no expire date - http

I've found some HTTP headers related to caching:
Cache-Control: max-age=3600, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
I would like to set headers such that once the webpage is loaded it gets cached. It should expire after 24 hours, and if the browser wants to load it before it expires, it should load it from the cache (and not revalidate).

To make your content expire after 24 hours, the HTTP headers should be:
Cache-Control: max-age=86400, must-revalidate
Expires: Fri, 30 Oct 1998 14:19:41 GMT
Last-Modified: Mon, 29 Jun 1998 02:28:12 GMT
The max-age directive tells the client that the content becomes stale after 86400 seconds. must-revalidate tells the client that it must revalidate the content once it has expired; until then, the cached copy is used without contacting the server.
You can omit Expires for HTTP/1.1 clients; for HTTP/1.0 clients the Expires header should be used.
Refer to the following links for more details:
must revalidate
max-age
expire vs max-age
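As a rough illustration of the answer above, here is a minimal Python sketch that builds the two headers for a 24-hour lifetime. The function name cache_headers and its parameters are made up for this example, and it assumes you set the headers yourself rather than through a web server config. The fixed timestamp is only there to make the output reproducible.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def cache_headers(max_age_seconds, now=None):
    """Build Cache-Control and Expires for a response that stays fresh
    for max_age_seconds (hypothetical helper, not from any framework)."""
    now = now or datetime.now(timezone.utc)
    expires = now + timedelta(seconds=max_age_seconds)
    return {
        # HTTP/1.1 caches use max-age and ignore Expires when both are present
        "Cache-Control": f"max-age={max_age_seconds}, must-revalidate",
        # HTTP/1.0 fallback; usegmt=True produces the "GMT" suffix HTTP dates use
        "Expires": format_datetime(expires, usegmt=True),
    }


# 24 hours = 86400 seconds, counted from an example timestamp
headers = cache_headers(
    86400, now=datetime(1998, 10, 29, 14, 19, 41, tzinfo=timezone.utc)
)
for name, value in headers.items():
    print(f"{name}: {value}")
```

In a real response you would call cache_headers(86400) without the now argument, so that Expires is computed from the current time.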

Related

Resource cached with max-age=0

I have a resource that is served with the following headers:
Cache-Control: private, max-age=0
Connection: keep-alive
Content-Encoding: gzip
Content-Type: application/javascript;charset=utf-8
Date: Wed, 05 May 2021 12:22:58 GMT
etag: "1609459200000-gzip"
expires: Wed, 05 May 2021 12:22:58 GMT
last-modified: Fri, 01 Jan 2021 00:00:00 GMT
Vary: Accept-Encoding
Yesterday we updated this resource and we noted that about 10% of the traffic is, 24h later, still using the stale resource. In those cases, the browser uses the old cached resource without hitting the server (to revalidate or get a fresh version).
Why do some browsers not honor max-age and not revalidate (or fetch a fresh version of) this resource?
Is it mandatory to use, in this case, no-cache to force a revalidation?
If I want to use the cached resource and revalidate only after 10 seconds, will this be enough for any modern browser?
Cache-Control: private, must-revalidate, max-age=10

How to tell Chrome not to use local cache prematurely

The page (just a static .html) is served with the following headers:
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 29 Jul 2015 02:59:37 GMT
Content-Type: text/html; charset=utf-8
Last-Modified: Wed, 29 Jul 2015 02:53:23 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Vary: Accept-Encoding
Content-Encoding: gzip
Then I close the tab, open the page again (by typing the URL manually), and get the following response:
HTTP/1.1 304 Not Modified
Server: nginx
Date: Wed, 29 Jul 2015 02:58:45 GMT
Last-Modified: Wed, 29 Jul 2015 02:53:23 GMT
Connection: keep-alive
ETag: "55b84023-1ad"
I repeat it several times and then Chrome stops requesting it from the server and serves directly from its cache.
Status Code:200 OK (from cache)
Content-Encoding:gzip
Content-Type:text/html; charset=utf-8
Date:Wed, 29 Jul 2015 02:59:56 GMT
Last-Modified:Wed, 29 Jul 2015 02:53:23 GMT
Server:nginx
Vary:Accept-Encoding
Is there a way (server side) to tell Chrome not to do that, but respect HTTP caching headers on every request?

CloudFront ignores Cache-Control headers

CloudFront ignores my cache header and my pictures have to be picked up from the server again after a while.
~$ curl -I http://d2573vy43ojbo7.cloudfront.net/attachments/store/limit/64/3720c5574063aebc90511061b99de858740ad764c6981d2bf30ff121ada0/image.jpg
HTTP/1.1 200 OK
Content-Type: image/jpeg
Content-Length: 1645
Connection: keep-alive
Server: nginx/1.4.1
Date: Thu, 12 Feb 2015 14:37:41 GMT
Status: 200 OK
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers:
Access-Control-Allow-Method:
Cache-Control: public, must-revalidate, max-age=31536000
Expires: Fri, 12 Feb 2016 14:37:41 GMT
Content-Disposition: inline; filename="image.jpg"
Last-Modified: Thu, 12 Feb 2015 14:37:41 GMT
X-Content-Type-Options: nosniff
X-Request-Id: 239b0fda-cae9-452f-9d1b-ccbf035bbf69
X-Runtime: 3.457939
X-Cache: Miss from cloudfront
Via: 1.1 6cde3c778df412041adc7610331b57bc.cloudfront.net (CloudFront)
X-Amz-Cf-Id: yicAkZYc5XpowKRFMOXDKSJKBMWZ4kq2B3vLK8Q-Py124D8lQq_1lg==
I requested the same file yesterday and saw the same behavior: on the second request it was cached and served by CloudFront, but not anymore. It's the same for all my images. They are cached but are removed from the cache after a couple of hours.
What's wrong? My cache behavior settings on CloudFront are set to the defaults, and it uses Origin Cache Headers.
Take a look here: http://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Expiration.html
If an object in an edge location isn't frequently requested, CloudFront might evict the object—remove the object before its expiration date—to make room for objects that are more popular.
This means the object is not popular enough to stay in the cache for longer. If enough viewers were requesting this object at this particular CloudFront edge location, it would have stayed in the cache longer.

Can't get Firefox to stop caching

There are lots of questions and answers about preventing the browser from caching, and I've tried many of them. It seems like Cache-Control: no-cache,no-store along with Pragma: no-cache and Expires: 0 is supposed to do the trick. But Firefox only makes a single request for my image.
Here are the response headers I'm sending. Is there anything obvious that's missing here?
Accept-Ranges bytes
Cache-Control no-cache,no-store,private,must-revalidate,max-stale=0,post-check=0,pre-check=0
Content-Length 4830
Content-Type image/jpeg
Date Thu, 23 Mar 2000 22:16:32 GMT
Etag "706331614"
Expires 0
Last-Modified Thu, 23 Mar 2000 22:16:31 GMT
Pragma no-cache
Server lighttpd/1.4.32

How to crawl a wordpress blog?

I wrote a C program to crawl blogs. It worked well until it hit this blog: www.ipujia.com. I send the HTTP request:
GET http://www.ipujia.com/ HTTP/1.0
to the website and get the response as below:
HTTP/1.1 301 Moved Permanently
Date: Sun, 27 Feb 2011 13:15:26 GMT
Server: Apache/2.2.16 (Unix) mod_ssl/2.2.16 OpenSSL/0.9.8e-fips-rhel5
mod_auth_passthrough/2.1 mod_bwlimited/1.4 FrontPage/5.0.2.2635 mod_perl/2.0.4
Perl/v5.8.8
X-Powered-By: PHP/5.2.14
Expires: Wed, 11 Jan 1984 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, max-age=0
Pragma: no-cache
Last-Modified: Sun, 27 Feb 2011 13:15:27 GMT
Location: http://http/www.ipujia.com/
Content-Length: 0
Connection: close
Content-Type: text/html; charset=UTF-8
This is strange because I cannot get the index page following the Location. Does anyone have any ideas?
The Location field in the response contains a malformed URI.
Location: http://http/www.ipujia.com/ (notice the duplicated "http" where the host should be)
Should be
Location: http://www.ipujia.com/
Unless you are in control of the server, there is little you can do here.
As a workaround, could you parse the Location header and attempt to extract a valid URI from it?
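One way to attempt that extraction is sketched below in Python (the same idea carries over to a C crawler). The helper name repair_location is made up for this example, and the heuristic is an assumption: it only handles the specific case shown here, where the host part of the URL is itself a scheme name and the real host has slipped into the path.

```python
from urllib.parse import urlsplit, urlunsplit


def repair_location(location):
    """Best-effort repair of a Location value whose host is a stray scheme
    name (hypothetical helper; covers only this one malformation)."""
    parts = urlsplit(location)
    # Assumption: a host of "http" or "https" means the scheme was duplicated
    if parts.netloc.lower() in ("http", "https"):
        # The real host is then the first segment of the path
        host, _, rest = parts.path.lstrip("/").partition("/")
        if host:
            return urlunsplit(
                (parts.scheme, host, "/" + rest, parts.query, parts.fragment)
            )
    # Anything else is returned unchanged
    return location


print(repair_location("http://http/www.ipujia.com/"))
```

A crawler using this should still cap the number of redirects it follows, since a repaired URL may redirect back to the same malformed Location.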

Resources