Recognize HTTP 304 in service worker / fetch() - http

I build a service worker which always responds with data from the cache and then, in the background, sends a request to the server. If the server responds with HTTP 304 - not modified everything is fine, if the server responds with HTTP 200, that means the data was changed and the new file is put into the cache, also the user is notified and asked for a page refresh.
I use the not-modified-since / last-modified headers to make sure the client gets the most up-to-date version. When a request is sent via fetch() the request passes the HTTP-cache on it's way to the network - also the response passes the HTTP cache when it arrives on the client. The problem is when the response has the status 304 - not modified the HTTP cache responds to the service worker with the cached version and replaces the status with 200 (as it is described in the fetch specification - HTTP-network-or-cache-fetch). In the service worker there is no possibility to find out whether the 200 response was initially sent by the server (the user needs to be updated) or it was sent by the cache and the server originally responded with 304 (most up-to-date version is already loaded).
There is the cache-mode flag which can be set to no-cache, but this bypasses the HTTP-cache also when the request is sent to the server, which means that the if-modified-since header is not set and the server has no chance to find out which version the client has. Also this flag is only supported by firefox nightly right now.
I think the best solution is to set a custom HTTP header like x-was-modified when the server responds with 200. This custom header can be accessed in the service worker and can be used to find out whether a resource was updated or not - even if the HTTP cache replaces the 304 status with 200.
Is this a legit solution/workaround? Are there any suggested approaches to solve this problem?
Should I even rely on HTTP headers which are supposed to handle the HTTP cache when implementing the service worker cache? Or should I rather use custom x-if-modified-since / x-last-modified headers and use indexedDB to store the information on the client and append it to each request?
Why does fetch() even replace the 304 code with 200 if there is a up-to-date version in the cache?

You can’t rely on the status code (304 vs. 200) to determine whether something has changed. What if some other part of your code requests the same resource, thus updating the browser’s cache?
Instead, simply compare the response’s Last-Modified header to what you sent in If-Modified-Since, or whatever you last saw in Last-Modified. If the values don’t match, something has changed.
For more precision (if the data can change several times in 1 second), consider using ETag instead of Last-Modified.
Why does fetch() even replace the 304 code with 200 if there is a up-to-date version in the cache?
Because usually people just want to get fresh content, regardless of where it comes from. A 304 response is only interesting to those who implement their own HTTP caches.

Related

Is server code executed when server returns http status code 304

Does/should server code run when the server return status code 304?
I understand that the server should not return anything (client should use the cache), but I cant find any info on whether the server will executed the code in an api endpoint for example.
RFC 2616 section 10.3.5 describes the 304 Not Modified response:
If the client has performed a conditional GET request and access is
allowed, but the document has not been modified, the server SHOULD
respond with this status code. The 304 response MUST NOT contain a message-body
The server will send Date, ETag and/or Content-Location (200 only), or Expires, Cache-Control, and/or Vary if the respond may vary.
What is a Conditional GET?
An cliet application, browser, or proxy with a retained Last-Modified or Etag value will issue a Conditional GET as an initial header only. This allows the client to determine if the resource has been updated.
How does the client know if the resource changed?
Well it depends on how the server is configured.
The Origin Server May:
Ignore caching, serve every request new.
Similar to ignoring, you may develop the application to change query strings. This prevents caching at Proxy Servers and invalidates the client cache.
If configured to do so, issue a Last-Modified or Etag value. Often done for static content. Proxy Servers and Client Caches use to invalidate their version.
A Web application could issue a Last-Modified far
into the future then change the URL to invalidate stale content. This requires the application be developed with this feature in mind.
Resources may also be issued version numbers. This allows them to preserve Proxy Caches but invalidate Client Caches.
Does server code run when the server return status code 304?
Almost Never. Disregarding that it is technically possible for an incorrectly configured application to respond with a 304 Not Modified code instead of a 200 code.
With a ETag and/or Content-Location value, a server (nginx for example) can confirm nothing has changed without issued a call to the application. This also neatly handles resources with version numbers the same way.
For query strings (image.jpg?version=12), the client cache will invalidate the content. A Proxy Server will also invalidate, and the query will be requested fresh.
I understand that the server should not return anything (client should use the cache), but I cant find any info on whether the server will executed the code in an api endpoint for example.
I'm a fan of nginx, here is a good resource on how caching applies to it.
In short, as much as you can do to support various caches between your client and the application the more requests you can support per day.

cache-control not working without etag

I am sending the following header in the reponse. "Cache-Control: public, max-age=300", but still every time I hit refresh I get a 200 response(the request is made to he server again). Same happens if I add the "Expires" header.
But if I add a ETag to the headers, then I get 304 on refresh(the request goes to the server, the server prepares the response, then matches the ETag and returns a 204 response).
What should I change so that "Cache-Control" header is used and the
content is served from local cache and no request is sent to the server until the age becomes more than "max-age"?
EDIT: Here is an image that doesn't get cached https://image-dev-dot-quizizz-dev.appspot.com/resource/gs/quizizz-image/rejected.jpeg
Your image is in proxy cache - notice that in a response you get Age header. Furthermore, every next request in 300 seconds takes much less time. Why is the status not 304? According to this article:
200 (from cache) vs 304
Now the other day while performing a site performance audit I noticed
that a lot of our assets were returning 304 statuses. While comparing
another site I noticed that it was returning a 200 (from cache) status
code. This intrigued me and I wanted to dig deeper.
It turns out that when a 200 (from cache) response is given it means
that a future expiration date on the content is set. In essence the
browser doesn’t even really communicate with the server to check on
the file. It knows not to do it until the expiration date has
expired.
By contrast a 304 goes to the server and receives a response back that
the data has not change. The server is telling the browser to use the
cache as a result.

304 Not Modified with 200 (from cache)

I'm trying to understand what exactly is the difference between "status 304 not modified" and "200 (from cache)"
I'm getting 304 on javascript files that I changed last. Why does this happen?
Thanks for the assistance.
https://sookocheff.com/post/api/effective-caching/ is an excellent source to form the required understanding around these 2 HTTP status codes.
After reading this thoroughly, I had this understanding
In typical usage, when a URL is retrieved, the web server will return the resource's current representation along with its corresponding ETag value, which is placed in an HTTP response header "ETag" field. The client may then decide to cache the representation, along with its ETag. Later, if the client wants to retrieve the same URL resource again, it will first determine whether the local cached version of the URL has expired (through the Cache-Control and the Expire headers). If the URL has not expired, it will retrieve the local cached resource. If it determined that the URL has expired (is stale), then the client will contact the server and send its previously saved copy of the ETag along with the request in a "If-None-Match" field. (Source: https://en.wikipedia.org/wiki/HTTP_ETag)
But even when expires time for an asset is set in future, browser can still reach the server for a conditional GET using ETag as per the 'Vary' header. Details on 'vary' header: https://www.fastly.com/blog/best-practices-using-vary-header/
"200 (from cache)" means the browser found a cached response for the request made. So instead of making a network call to fetch that resource from the remote server, it simply made use of the cached response.
Now these cached responses have a life time associated with them. With time, the responses can become stale. And if these are not allowed to be served stale (see section 4.2.4 - RFC7234), the browser needs to contact the remote server and verify whether these cached responses are still valid or not. The 304 response status code is the servers' way of letting the browser know that the response hasn't changed and is still valid.
If the browser receives a 304 response, it can "freshen" the relevant stale response(s). And subsequent requests to the resource can again be served from the browser cache without checking with the remote server (until the response becomes stale again).
304 Modified
A 304 Not Modified means that the files have not been modified since the specified version by the "If-Modified-Since", or "If-None-Match".
200 OK
This is the response you'll get if an HTTP request works. GET requests will have something relating to the files. A POST request will have something holding the result of the action.
Happy coding!Lyfe

How does the AIR cache control work, what is the maximum age of cached content?

I'm currently working on a project that needs to request a url multiple times. Having studied the the HTTP Proxy (Charles) it seems that AIR will cache the first response and then return the same response for each subsequent request.
Does anybody know how to know if the response has been cached other than setting the URLRequest to useCache, but this doesn't say if the response was a cached response or not. The digest isn't set on the URLRequest either, although it does mention this is for swz only, so how does it know if the content is the current content or not? Is the responseHeaders used to find out how long to hold the cache i.e.
Cache-Control: max-age=900
Also does anyone know how to flush/purge the cache or are we at the whim of the GC and in that case how does it know if to leave it in the cache or now?
This makes sense to me, but still I would like to know how to regulate this cache.
Further more: I've tested a set up where parallel URLLoaders (10) are made and created which open the same url to see what happens in that instance. It seems that each parallel request is made until a successful response is given, all subsequent calls are then cached. Calls which are sent out before the successful request is then completed. It looks like the items which are already in being processed do not use the cache and return with correct data.
Additional The AIR runtime doesn't even send a "If-Modified-Since" header, so the cache isn't even honoring HTTP protocol. So it seems as if Adobe has implemented it's own version of a cache which doesn't even use HTTP/1.1 Header Field Definitions. Perfect.
Thanks for any help.
Simon
From the documentation of URLRequest class, it seems that it uses Operating System's HTTP Cache. On a windows 7 OS, it seems that it is using IE's cache.
You can use a HTTP monitor tool like Fiddler, and verify this.
First request is 200, and subsequent requests are 304. After clearing the IE cache. and run the application again, You can see that it results in a HTTP 200 status.

Max-age and 304 Not Modified Processing

I've been looking at the standards - but was not entirely sure about the following:
If we have a variant (resource, image, page etc) that is served with a cache setting of max-age=259200 (3 days) and the server is also processing ETags and last modified dates - then what will happen when the max-age is reached - but the resource has not been modified?
What I'm hoping will happen is that after 3 days - the client will request the resource again - and if it has not changed will received a 304 Not Modified response. If the cache control response (during the 304 response) also still contains max-age=259200 - then I'm hoping the client will continue to use its local cached copy and not request again for another 3 days.
What I'm afraid will happen is that once the max-age is reached - the client will no longer cache the resource - making a fresh request each time the resource is loaded - followed by a 304 Not Modified response if the resource has not been modified. i.e. we're now getting http requests for every use as opposed to using the local cache for another 3 days.
Thoughts?
It will cache for 3 more days. RFC 2616 10.3.5:
If a cache uses a received 304 response to update a cache entry, the cache MUST update the entry to reflect any new field values given in the response.
Details about age calculation.

Resources