Does a web browser treat CSS/JS/images differently from text? - http

I have a webpage and in it some images etc. I clear my cache and hit the URL. The page request has the following response headers:
Cache-Control:public, max-age=600
Connection:keep-alive
Content-Encoding:gzip
Content-Language:en
Content-Type:text/html; charset=utf-8
Date:Wed, 21 Nov 2012 07:14:35 GMT
Etag:"1353481170-1"
Expires:Sun, 19 Nov 1978 05:00:00 GMT
Last-Modified:Wed, 21 Nov 2012 06:59:30 +0000
Server:Apache/2.2.3 (CentOS)
Vary:Cookie,Accept-Encoding
And one of the images has these response headers:
Accept-Ranges:none
Cache-Control:max-age=1209600
Connection:keep-alive
Content-Encoding:gzip
Content-Length:2206
Content-Type:text/css
Date:Wed, 21 Nov 2012 07:14:36 GMT
ETag:"6c4f9-89e-4cee5893ab000"
Expires:Wed, 05 Dec 2012 07:14:36 GMT
Last-Modified:Tue, 20 Nov 2012 04:19:12 GMT
Server:Apache/2.2.3 (CentOS)
Vary:Accept-Encoding
When I hit the same URL again and check the console in Chrome, I see that for the main request my browser sends an If-Modified-Since header and promptly gets a 304 from the server, whereas for the image the browser does not send a request at all and serves it from the cache.
My questions are the following:
Does the browser treat CSS/JS/images differently from text? The first responses for both resources had a max-age as well as a Last-Modified header, although the number of seconds differed. So why did the browser still send a request for one and not for the other?
When we have both max-age and a Last-Modified header, which takes precedence? The purpose of max-age (AFAIK) is to save the round trip to the server, yet as per HTTP, if a cache gets a Last-Modified header it will always send an If-Modified-Since in the subsequent request.

No, the browser shouldn't treat the files differently. It should just obey what the headers say. The browser doesn't send a new request for the image because the file it received hasn't expired yet. Both the max-age directive and the Expires header specify that the file doesn't expire for 2 weeks.
I don't think the Last-Modified header is relevant; age is calculated as the time since the object was sent (as specified in the Date header), not when it was modified.
A better question is which takes precedence between the max-age cache-control directive and the Expires header. RFC 2616 section 14.9.3 says:
If a response includes both an Expires header and a max-age directive, the max-age directive overrides the Expires header, even if the Expires header is more restrictive.
Once an object has expired, the Last-Modified time will be used in the If-Modified-Since header of the request.
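As a rough illustration of the precedence rule quoted above, here is a minimal Python sketch of how a cache could derive a freshness lifetime from the two responses in the question. The names `freshness_lifetime`, `is_fresh`, `PAGE` and `STYLESHEET` are made up for this example, and the age calculation is simplified to "now minus Date" (a real cache also takes the Age header and network delay into account).

```python
import re
from datetime import datetime, timedelta
from email.utils import parsedate_to_datetime


def freshness_lifetime(headers: dict) -> timedelta:
    """How long a response stays fresh, counted from its Date header."""
    # max-age overrides Expires whenever both are present.
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        return timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        expires = parsedate_to_datetime(headers["Expires"])
        date = parsedate_to_datetime(headers["Date"])
        return max(expires - date, timedelta(0))
    # No explicit lifetime; heuristic freshness is left out of this sketch.
    return timedelta(0)


def is_fresh(headers: dict, now: datetime) -> bool:
    """Simplified check: age is taken as the time elapsed since Date."""
    age = now - parsedate_to_datetime(headers["Date"])
    return age < freshness_lifetime(headers)


# Header values copied from the question.
PAGE = {
    "Cache-Control": "public, max-age=600",
    "Date": "Wed, 21 Nov 2012 07:14:35 GMT",
    "Expires": "Sun, 19 Nov 1978 05:00:00 GMT",
}
STYLESHEET = {
    "Cache-Control": "max-age=1209600",
    "Date": "Wed, 21 Nov 2012 07:14:36 GMT",
    "Expires": "Wed, 05 Dec 2012 07:14:36 GMT",
}
```

Under these assumptions the page gets a lifetime of 10 minutes (its 1978 Expires value is ignored because max-age is present) and the second resource gets 14 days, so an hour after the first load only the second one would still be served without contacting the server.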

Related

When is an "if-none-match"-request sent?

While optimizing the caching-behaviour of our website, I noticed that a whole lot of if-none-match-requests are sent to our site. As far as I understand caching, this should not be the case as long as the cache is still valid.
One particular request generates the following response-header:
HTTP/1.1 200 OK
Cache-Control: public, max-age=25920000
Transfer-Encoding: chunked
Content-Type: application/javascript; charset=utf-8
Content-Encoding: gzip
Expires: Thu, 04 Feb 2016 17:20:09 GMT
Last-Modified: Mon, 01 Jan 2001 23:00:00 GMT
ETag: W/"0"
Vary: Accept-Encoding
Server: Microsoft-IIS/8.5
X-AspNet-Version: 4.0.30319
Date: Fri, 10 Apr 2015 16:20:09 GMT
As you can see, the cache should be valid for 300 days. The way I understand it, the browser should use its cache directly during that period. Only after this period is over, it should issue a request with the header if-none-match.
But browsers seem to ignore that and send this if-none-match -request each and every time the page is loaded just to receive a 304-response ("Not Modified").
What do I need to change to keep browsers from sending these useless requests?
Yes, while the cached copy is fresh, browsers should use it without revalidation. However, this is not guaranteed. For example, when users hit the Refresh button, browsers make requests to the origin server anyway.
There is a Cache-Control: immutable, max-age=… extension that tells browsers you really really mean they should use the cached resource without contacting the server.
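A minimal sketch of how a server-side helper might assemble such a header value for static assets. The function name `static_asset_cache_control` is hypothetical, and `immutable` is only a sensible choice when the URL changes whenever the content changes (for example fingerprinted file names); browser support for the extension varies.

```python
def static_asset_cache_control(max_age_seconds: int, immutable: bool = True) -> str:
    """Build a Cache-Control value for a long-lived static asset."""
    directives = ["public", f"max-age={max_age_seconds}"]
    if immutable:
        # Extension directive: asks the browser to skip revalidation,
        # even on reload, while the response is still fresh.
        directives.append("immutable")
    return ", ".join(directives)


# 300 days, the lifetime used in the question.
header_value = static_asset_cache_control(300 * 24 * 60 * 60)
```

With the 300-day lifetime from the question this yields `public, max-age=25920000, immutable`.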

Is Expires header not needed now?

I see big players (e.g. Akamai) have started to drop the Expires header altogether and only use Cache-Control, e.g.
curl -I https://fbcdn-sphotos-e-a.akamaihd.net/hphotos-ak-snc7/395029_379645875452936_1719075242_n.jpg
HTTP/1.1 200 OK
Last-Modified: Fri, 01 Jan 2010 00:00:00 GMT
Date: Sun, 25 Nov 2012 16:46:43 GMT
Connection: keep-alive
Cache-Control: max-age=1209600
So is there still any reason to keep using Expires?
Cache-Control was introduced in HTTP/1.1 to replace Expires. If both headers are present, Cache-Control is preferred over Expires:
If a response includes both an Expires header and a max-age
directive, the max-age directive overrides the Expires header, even
if the Expires header is more restrictive. This rule allows an origin
server to provide, for a given response, a longer expiration time to
an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This might be
useful if certain HTTP/1.0 caches improperly calculate ages or
expiration times, perhaps due to desynchronized clocks.
But there are still clients out there that only speak HTTP/1.0. So for HTTP/1.0 requests/responses, you should still use Expires.
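If you want to serve both kinds of clients, one option is to emit both headers from the same lifetime so they never disagree. This is only a sketch; the helper name `caching_headers` is made up, and the date and max-age are the ones from the curl output in the question.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def caching_headers(now: datetime, max_age: int) -> dict:
    """Return Date, Cache-Control and a matching Expires for one lifetime."""
    return {
        "Date": format_datetime(now, usegmt=True),
        # Read by HTTP/1.1 caches.
        "Cache-Control": f"max-age={max_age}",
        # Fallback for HTTP/1.0 caches that ignore Cache-Control.
        "Expires": format_datetime(now + timedelta(seconds=max_age), usegmt=True),
    }


now = datetime(2012, 11, 25, 16, 46, 43, tzinfo=timezone.utc)
headers = caching_headers(now, 1209600)
```

Note that `format_datetime(..., usegmt=True)` expects a timezone-aware UTC datetime, which is why `now` is built with `timezone.utc`.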

Expires header and performance

OK, I'm having a play with Expires headers in IIS6 on our development server and I don't really get it!
So if I don't add an expires header to a file I get the following request/response when viewed with firebug:
Accept */*
Accept-Encoding gzip, deflate
Accept-Language en-gb,en;q=0.5
Cache-Control no-cache
Connection keep-alive
Cookie __utma=222382046.267771103.1330592028.1337002926.1340787333.122; __utmz=222382046.1330592028.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=76038230.629470783.1340728034.1340728034.1340786921.2; __utmz=76038230.1340728034.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); timeOutCookie=Wed%20Jun%2027%202012%2011%3A17%3A22%20GMT+0100%20%28GMT%20Daylight%20Time%29; __utmb=76038230.26.10.1340786921; __utmb=222382046.5.10.1340787333; ASP.NET_SessionId=yhib5kyxf1m5azuhoogrstt5; __utmc=76038230; Travel2=ECC62DC4F9C36A41F3BCF0C54F96D877FEA32D4867DB1A3A97D0C6A3BE79EE98517B9B1A4E24289C863D86A2A4A846EA1FF4BF3822E8B6CBF872E25DD1ADF306F724EE1500AA71E28CFCD02476748163929B73856C505E50D185C05E6322488F
Host site
Pragma no-cache
Referer http://site/Agents/Flights/FlightSearch.aspx?
Response:
Accept-Ranges bytes
Content-Length 17864
Content-Type application/x-javascript
Date Wed, 27 Jun 2012 09:21:07 GMT
Etag "0de7d7f192dcd1:a07d"
Last-Modified Tue, 08 May 2012 12:53:00 GMT
Server Microsoft-IIS/6.0
X-Powered-By ASP.NET
If I now press F5, the system retrieves the file from the client cache, cool!
Now if I add the Expires header and press Ctrl+F5, I get a slightly different request/response:
Accept */*
Accept-Encoding gzip, deflate
Accept-Language en-gb,en;q=0.5
Cache-Control no-cache
Connection keep-alive
Cookie __utma=222382046.267771103.1330592028.1337002926.1340787333.122; __utmz=222382046.1330592028.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __utma=76038230.629470783.1340728034.1340728034.1340786921.2; __utmz=76038230.1340728034.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); timeOutCookie=Wed%20Jun%2027%202012%2011%3A21%3A11%20GMT+0100%20%28GMT%20Daylight%20Time%29; __utmb=76038230.27.10.1340786921; __utmb=222382046.5.10.1340787333; ASP.NET_SessionId=yhib5kyxf1m5azuhoogrstt5; __utmc=76038230; Travel2=ECC62DC4F9C36A41F3BCF0C54F96D877FEA32D4867DB1A3A97D0C6A3BE79EE98517B9B1A4E24289C863D86A2A4A846EA1FF4BF3822E8B6CBF872E25DD1ADF306F724EE1500AA71E28CFCD02476748163929B73856C505E50D185C05E6322488F
Host site
Pragma no-cache
Referer http://site/Agents/Flights/FlightSearch.aspx?
User-Agent Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20100101 Firefox/13.0.1
Response:
Accept-Ranges bytes
Cache-Control max-age=86400
Content-Length 17864
Content-Type application/x-javascript
Date Wed, 27 Jun 2012 09:24:41 GMT
Etag "0de7d7f192dcd1:a082"
Last-Modified Tue, 08 May 2012 12:53:00 GMT
Server Microsoft-IIS/6.0
X-Powered-By ASP.NET
Brilliant, I've now got a max-age in the Cache-Control header. What's confusing me is that, as far as I can tell, this makes no practical difference to how the site performs in terms of downloads. If I press F5 it gets the file from the cache; if I press Ctrl+F5 it gets it from the server with an HTTP 200.
So how does this improve performance? How do you get an HTTP 304 instead of an HTTP 200? I just don't get what this achieves in practice.
Any help would be appreciated, thanks.
When you set Expires or max-age explicitly, you are telling the client that it will be safe to cache the response for that much time. The client will happily get it from cache, it will not touch your server, there will be no 304. Unless you do Ctrl+F5, which forces the browser to do a full request anew, resulting in 200.
Now what if you set neither Expires nor max-age? This just means that the client will pick an expiration time by itself, heuristically. Your response is still cached, only the browser has to guess for how long.
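To give a feel for what such a guess can look like: RFC 2616 section 13.2.4 mentions 10% of the time since Last-Modified as a typical heuristic. The sketch below applies that fraction to the first response in the question; the function name is made up, and real browsers are free to use a different rule.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date: str, last_modified: str) -> timedelta:
    """Guess a freshness lifetime as 10% of the interval since Last-Modified."""
    interval = parsedate_to_datetime(date) - parsedate_to_datetime(last_modified)
    return max(interval, timedelta(0)) / 10


# Date and Last-Modified from the response without an Expires header.
guess = heuristic_lifetime(
    "Wed, 27 Jun 2012 09:21:07 GMT",
    "Tue, 08 May 2012 12:53:00 GMT",
)
```

For that response the interval is just under 50 days, so this particular heuristic would treat the file as fresh for roughly 5 days.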
So, Expires/max-age are useful in two cases.
If you want to recommend caching for a specific period of time—longer than a browser would guess. This is often done with versioned static content, which never changes, so expiration time is set on the order of years.
If you want to prevent caching, in which case you set Cache-Control: no-cache and Expires in the past (some versions of IE will ignore no-cache).
Conditional requests, 304 and all that, only come into play after the content has already expired. To revalidate it, the client might do a conditional GET, which, depending on your server setup, may or may not result in 304.
The performance improvements come from making fewer HTTP requests. When a browser is parsing a page and sees it has to request a CSS file, if it has already got a copy of it in its cache with a max-age=31536000, it knows its cached copy of the file is good for 1 year and doesn't have to make the HTTP request to fetch the file.
Fewer round trips to the server should result in a faster loading page, and a better experience for users.
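To make the 200 versus 304 distinction concrete, here is a simplified sketch of the decision a server makes for a GET. The function name `conditional_status` is hypothetical, the validator values are taken from the second response in the question, and splitting the If-None-Match list on commas is a shortcut that a production parser should not rely on.

```python
from email.utils import parsedate_to_datetime


def conditional_status(request_headers: dict, etag: str, last_modified: str) -> int:
    """Return 304 if the client's cached copy is still current, else 200."""

    def opaque(tag: str) -> str:
        # Weak comparison: a leading W/ is ignored.
        return tag[2:] if tag.startswith("W/") else tag

    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When If-None-Match is present, If-Modified-Since is not consulted.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        matched = "*" in candidates or opaque(etag) in map(opaque, candidates)
        return 304 if matched else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        unchanged = parsedate_to_datetime(last_modified) <= since
        return 304 if unchanged else 200

    # No validators sent (e.g. Ctrl+F5): always a full response.
    return 200


ETAG = '"0de7d7f192dcd1:a082"'
LAST_MODIFIED = "Tue, 08 May 2012 12:53:00 GMT"
```

A request carrying the stored ETag or Last-Modified value gets a 304, while a request with no validators at all, like the Ctrl+F5 requests shown in the question, gets a 200.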

Ensure responses are not cached

I have a particular HTTP response which I don't want cached because it has private/sensitive data in it. I'm already setting Cache-Control to no-store, which should handle clients supporting HTTP/1.1.
How do I use the Expires header to do the same for HTTP/1.0? Should I just set it with an arbitrary timestamp from 1970 or something? Is there a special value to tell it never to cache?
The HTTP RFC says:
To mark a response as "already expired," an origin server sends an Expires date that is equal to the Date header value.
You should set the expires header to a date in the past. And you should also set the must-revalidate flag on the Cache-Control header.
Expires: Mon, 01 Jan 1990 00:00:00 GMT
Cache-Control: no-cache, must-revalidate
You can find a good article dealing with caching issues on the doctype wiki:
Setting an Expires header in the past ensures that HTTP/1.0 and
HTTP/1.1 proxies and browsers will not cache the content. The
Cache-control directive also tells HTTP/1.1 proxies not to cache the
content. Even if proxies may be configured to return stale content
when they should not, the must-revalidate re-affirms that they SHOULD
NOT do it.
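Putting the pieces from this thread together, a response that should not be cached could be given headers along these lines. This is a sketch under assumptions: the helper name `sensitive_response_headers` is invented, Expires is set equal to Date as the RFC quote above describes, and combining no-store from the question with the directives from the answer is my own choice rather than something either post prescribes.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def sensitive_response_headers(now: datetime) -> dict:
    """Headers meant to keep HTTP/1.0 and HTTP/1.1 caches from reusing a response."""
    stamp = format_datetime(now, usegmt=True)
    return {
        "Date": stamp,
        # "Already expired" for caches that only understand Expires.
        "Expires": stamp,
        # Directives for HTTP/1.1 caches.
        "Cache-Control": "no-store, no-cache, must-revalidate",
    }


headers = sensitive_response_headers(
    datetime(2012, 11, 21, 7, 14, 35, tzinfo=timezone.utc)
)
```

Any fixed date in the past would serve the same purpose for Expires; using the Date value just avoids hard-coding a timestamp.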

Why is Google's home page logo served with contradictory "Expires" and "Cache-Control" headers?

Here is the logo currently used on www.google.com:
http://www.google.com/images/logos/ps_logo2.png
Here's its HTTP response:
HTTP/1.1 200 OK
Content-Type: image/png
Last-Modified: Thu, 05 Aug 2010 22:54:44 GMT
Date: Fri, 25 Mar 2011 16:41:05 GMT
Expires: Fri, 25 Mar 2011 16:41:05 GMT
Cache-Control: private, max-age=31536000
X-Content-Type-Options: nosniff
Server: sffe
Content-Length: 26209
Age: 0
Via: 1.1 localhost.localdomain
The Cache-Control header says it's good for 1 year. But Expires is the same as Date, i.e. it's stale immediately.
Why the difference?
Cache-Control overrides Expires on any HTTP/1.1 cache or client.
So I assume Google wants to cache the image for HTTP/1.1 but not cache it at all for HTTP/1.0.
I don't know why Google cares. I would think they'd want to cache the logo even for older clients.
The reason is that Google wants the user's browser to cache the image, but not intermediate shared caches (hence the private directive).
Many intermediate caches are outdated and ignore newer HTTP features (such as the Cache-Control header), so this approach keeps them from caching the resource (via the Expires header). For all other agents, which understand both headers, Cache-Control overrides Expires.
This is a common practice described in RFC 2616, section 14.9.3:
An origin server might wish to use a relatively new HTTP cache control feature, such as the "private" directive, on a network including older caches that do not understand that feature. The origin server will need to combine the new feature with an Expires field whose value is less than or equal to the Date value. This will prevent older caches from improperly caching the response.
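The effect of that header combination can be sketched by modelling three kinds of cache side by side. Everything here is a simplification for illustration: the function `reusable_seconds` and the labels "legacy", "shared" and "private" are invented, and a "legacy" cache is assumed to look at Expires and Date only.

```python
import re
from email.utils import parsedate_to_datetime


def reusable_seconds(headers: dict, cache: str) -> int:
    """Seconds a cache may reuse the response; cache is 'legacy', 'shared' or 'private'."""
    date = parsedate_to_datetime(headers["Date"])
    expires = parsedate_to_datetime(headers["Expires"])
    expires_lifetime = max(int((expires - date).total_seconds()), 0)

    if cache == "legacy":
        # Old cache: does not understand Cache-Control at all.
        return expires_lifetime

    directives = [d.strip().lower() for d in headers.get("Cache-Control", "").split(",")]
    if cache == "shared" and "private" in directives:
        # A shared HTTP/1.1 cache must not store a private response.
        return 0
    for directive in directives:
        match = re.fullmatch(r"max-age=(\d+)", directive)
        if match:
            # max-age overrides Expires for HTTP/1.1 caches.
            return int(match.group(1))
    return expires_lifetime


# Header values from the logo response in the question.
GOOGLE_LOGO = {
    "Date": "Fri, 25 Mar 2011 16:41:05 GMT",
    "Expires": "Fri, 25 Mar 2011 16:41:05 GMT",
    "Cache-Control": "private, max-age=31536000",
}
```

In this model only the user's own HTTP/1.1 cache gets the one-year lifetime; both the legacy cache and the shared HTTP/1.1 cache end up with zero.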
