ETag with Accept-Language

I have a web service that puts an ETag on each response so that future calls can make use of the HTTP 304 (Not Modified) status. The ETag I generate is really just a Base64 encoding of the query type along with the timestamp.
The problem arises when the browser requests the same resource with a different Accept-Language. The browser currently sends the same If-None-Match header, so the response is a 304, even though the actual resource would have come back in a different language. So I thought the way to handle this was to add a Vary header to tell the client that the response varies with Accept and Accept-Language, as shown below.
Vary: Accept, Accept-Language
However, my browser (Chrome) uses the same ETag regardless of the Accept-Language. What is the correct convention to use here?
Thanks

An ETag identifies the response content.
So it is better to build the ETag from a hash of the response body.
At a minimum, you can hash the query concatenated with the language.
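
As a rough illustration of both options, here is a minimal Python sketch. The function names `make_etag` and `body_etag` and their parameters are hypothetical, and the choice of SHA-256 truncated to 32 hex characters is only an assumption; any stable hash would do.

```python
import hashlib


def make_etag(query: str, language: str, last_modified: str) -> str:
    """Build an ETag that differs per language variant (hypothetical helper)."""
    # Join the parts with a NUL byte so ("ab", "c") and ("a", "bc") cannot collide.
    raw = "\0".join((query, language, last_modified)).encode("utf-8")
    digest = hashlib.sha256(raw).hexdigest()[:32]
    # ETag values are quoted strings on the wire.
    return f'"{digest}"'


def body_etag(body: bytes) -> str:
    """Alternative: derive the ETag from the response body itself."""
    return '"' + hashlib.sha256(body).hexdigest()[:32] + '"'
```

With either approach, the Dutch and English representations get different ETags, so an If-None-Match sent for one language no longer matches the other.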

Related

Question about Etag and Cache-Control Header

I am trying to customize my caching strategy,
because the default settings do not suit my needs. The site is served by Vercel.
I am unfamiliar with configuring caching through HTTP headers, so I would like to ask for help.
Please check if my understanding is correct.
First of all, my response data doesn't change often.
But occasionally it changes.
Here is my plan.
Maybe I should revalidate every time, so I need to set my Cache-Control like this:
Cache-Control: public, max-age=0, must-revalidate
I believe that this way my data will be stored in the client's browser,
and the browser will always send a request to check whether its copy is still current.
On the server, I'll write code to handle the following steps.
First, set an ETag on the response header.
Second, if the request carries an If-None-Match header, I compare its value against my service logic to check whether it is up to date.
If the If-None-Match value is up to date, I just respond with 304.
Otherwise, I respond with the data as normal, along with the new ETag value.
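
The plan above could be sketched roughly as follows in Python. This is not tied to Vercel or any framework: `respond`, `etag_matches`, and the plain-dict headers are hypothetical stand-ins, and the Cache-Control value is written with comma-separated directives as HTTP syntax requires.

```python
from typing import Dict, Optional, Tuple


def etag_matches(if_none_match: Optional[str], current_etag: str) -> bool:
    """Weakly compare an If-None-Match header value against the current ETag."""
    if not if_none_match:
        return False
    if if_none_match.strip() == "*":
        return True

    def opaque(tag: str) -> str:
        # Weak comparison ignores the W/ prefix.
        tag = tag.strip()
        return tag[2:] if tag.startswith("W/") else tag

    # Simplification: assumes no commas inside the quoted tag values.
    candidates = [opaque(t) for t in if_none_match.split(",")]
    return opaque(current_etag) in candidates


def respond(request_headers: Dict[str, str], body: bytes,
            current_etag: str) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a conditional GET."""
    headers = {
        "ETag": current_etag,
        "Cache-Control": "public, max-age=0, must-revalidate",
    }
    if etag_matches(request_headers.get("If-None-Match"), current_etag):
        return 304, headers, b""  # client copy is current, send no body
    return 200, headers, body
```

The 304 response repeats the ETag and Cache-Control headers so the browser can refresh the metadata of its stored copy.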

How does http caching work in combination with the accept-language header

I am trying to understand the deeper inner workings of the caching that happens in the browser. Could you help me?
The page https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching states this: "The primary cache key consists of the request method and target URI (often, only the URI is used because only GET requests are caching targets)."
Do I understand it correctly if I restate it like this?
The response to a GET request to http://my.site/my/resource is stored in a key-value database of sorts, where the key is either GET http://my.site/my/resource or http://my.site/my/resource. I think of the response as the HTML or JSON, whatever the response body contains.
I suspect that this interpretation is flawed in some ways:
What if I do a GET request to the URL http://my.site/my/resource with an accept-language: nl-NL header, and immediately afterwards do another GET request with accept-language: en-US? Based on what I have understood so far, I assume the second request would return the same response as the first, because the cache keys are the same. Do I understand that correctly?
If that is correct, how can a request result in a different representation when the accept-language header is changed? Do you need to send a header with the request to bypass the cache when the accept-language header has changed?
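
One way to picture this is a toy cache in which the Vary response header extends the primary key (method and URL) with the values of the request headers it names. The Python sketch below is only an illustration of that idea: `TinyCache` is a hypothetical class, header dictionaries are assumed to use lowercase names, and freshness, validation, and eviction are all left out.

```python
from typing import Dict, Optional, Tuple


class TinyCache:
    """Toy cache keyed by method + URL, extended by headers named in Vary."""

    def __init__(self) -> None:
        self._store: Dict[tuple, bytes] = {}
        # Remembers which request headers the last response for a URL varied on.
        self._vary: Dict[Tuple[str, str], Tuple[str, ...]] = {}

    def _key(self, method: str, url: str, request_headers: Dict[str, str]) -> tuple:
        names = self._vary.get((method, url), ())
        secondary = tuple((n, request_headers.get(n, "")) for n in names)
        return (method, url, secondary)

    def put(self, method: str, url: str, request_headers: Dict[str, str],
            response_headers: Dict[str, str], body: bytes) -> None:
        vary = response_headers.get("vary", "")
        names = tuple(sorted(n.strip().lower() for n in vary.split(",") if n.strip()))
        self._vary[(method, url)] = names
        self._store[self._key(method, url, request_headers)] = body

    def get(self, method: str, url: str,
            request_headers: Dict[str, str]) -> Optional[bytes]:
        return self._store.get(self._key(method, url, request_headers))
```

In this model, a response stored without Vary is reused for any accept-language value, while a response stored with `vary: Accept-Language` is only reused when the request's accept-language matches the one it was stored under.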

Caching strategy using ETag and Expires/Cache-control with no assets version/ID

After reading a lot about caching validators (more intensively after reading this answer on SO), I still have a question that I couldn't find answered anywhere.
My use case is to serve a static asset (a JavaScript file, e.g. https://example.com/myasset.js) to be used on other websites, so its effect on their Gpagespeed/gmetrix score matters the most.
I also need their users to receive updated versions of my static asset every time I deploy new changes.
For this, I have the following response headers:
Cache-Control: max-age=10800
etag: W/"4efa5de1947fe4ce90cf10992fa"
In short, the browser behaves as follows when using ETag:
For the first request, the browser has no value for the If-None-Match request header, so the server will send back the status code 200 (OK), the content itself, and a response header with the ETag value.
For subsequent requests, the browser will add the previously received ETag value in the form of the If-None-Match request header. This way, the server can compare this value with the current ETag value and, if both match, return 304 (Not Modified), telling the browser to use its cached copy of the file. Otherwise it returns 200 with the new content and the new ETag value.
However, I couldn't find any information on how the Cache-Control: max-age header affects the above behavior. For example:
Will the browser request updates before max-age has elapsed? That would mean I can define a higher max-age value (pagespeed/gmetrix will be happy about it) and force the refresh using only the ETag fingerprint.
If not, then what are the advantages of using ETag and adding extra bits to the network?
No, the browser will not send any requests until max-age has passed.
The advantage of using ETag is that, if the file hasn't changed, you don't need to resend the entire file to the client. The response will be a small 304.
Note that you can achieve the best of both worlds by using the stale-while-revalidate directive, which allows stale responses to be served while the cache silently revalidates the resource in the background.
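
To make the timing concrete, here is a small Python sketch of the decision a cache makes for a stored response. The function name `cache_decision` and its return labels are made up for illustration; real browsers also account for things like user-forced reloads, which this ignores.

```python
def cache_decision(age_seconds: int, max_age: int,
                   stale_while_revalidate: int = 0) -> str:
    """Classify a stored response by its age (hypothetical helper)."""
    if age_seconds < max_age:
        # Still fresh: served from cache without contacting the server.
        return "fresh"
    if age_seconds < max_age + stale_while_revalidate:
        # Stale, but may be served while a background revalidation runs.
        return "stale-while-revalidate"
    # Stale: send a conditional request with If-None-Match before reuse.
    return "revalidate"
```

With the headers from the question (max-age=10800), the ETag only comes into play once the stored copy is three hours old. Adding, say, stale-while-revalidate=600 would let the browser keep serving the old copy for up to ten more minutes while it checks the ETag in the background.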

Which Cache-Control header values are used by client/server?

I'm trying to understand which values for Cache-Control are to be used in the request and which are to be used in the response.
This is a good answer, but it doesn't mention if you should use those values in the response.
For example, should no-store be used in the request or the response?
Is no-cache only to be used in the request?
What would happen if it was used in the response instead of the request?
Is there any point in it being used in both the request and response?
It is indeed confusing that the same header name and directive can mean different things depending on whether it appears in a request or a response. This was acknowledged by one of the standard's editors, who wrote: "If we were designing Cache-Control from scratch today, we’d probably use a different name for the field in requests to help avoid this kind of confusion."
Fortunately, RFC 7234 describes Cache-Control request directives and response directives separately, so you can probably find the answers to your questions there. For example:
no-store can be used in either the request or the response. The meaning is basically the same, it's just a question of whether it's the client or the server who wants nothing to be stored.
no-cache can also be used in either the request or the response, but the meaning is different. If used in a request it means that the response to this request should not come from the cache (without validation). If used on a response, it means that future requests should not be satisfied with this response (without validation).

Appropriate HTTP status code for request specifying invalid Content-Encoding header?

What status code should be returned if a client sends an HTTP request and specifies a Content-Encoding header which cannot be decoded by the server?
Example
A client POSTs JSON data to a REST resource and encodes the entity body using the gzip coding. However, the server can only decode DEFLATE codings because it failed the gzip class in server school.
What HTTP response code should be returned? I would say 415 Unsupported Media Type but it's not the entity's Content-Type that is the problem -- it's the encoding of the otherwise supported entity body.
Which is more appropriate: 415? 400? Perhaps a custom response code?
Addendum: I have, of course, thoroughly checked rfc2616. If the answer is there I may need some new corrective eyewear, but I don't believe that it is.
Update:
This has nothing to do with sending a response that might be unacceptable to a client. The problem is that the client is sending the server what may or may not be a valid media type in an encoding the server cannot understand (as per the Content-Encoding header the client packaged with the request message).
It's an edge-case and wouldn't be encountered when dealing with browser user-agents, but it could crop up in REST APIs accepting entity bodies to create/modify resources.
As I'm reading it, 415 Unsupported Media Type sounds like the most appropriate.
From RFC 2616:
10.4.16 415 Unsupported Media Type
The server is refusing to service the request because the entity of the request is in a format not supported by the requested resource for the requested method.
Yeah, the text part says "media type" rather than "encoding", but the actual description doesn't include any mention of that distinction.
The new hotness, RFC 7231, is even explicit about it:
6.5.13. 415 Unsupported Media Type
The 415 (Unsupported Media Type) status code indicates that the
origin server is refusing to service the request because the payload
is in a format not supported by this method on the target resource.
The format problem might be due to the request's indicated
Content-Type or Content-Encoding, or as a result of inspecting the
data directly.
They should make that the final question on Who Wants To Be a Millionaire!
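
As a rough sketch of that reading, a server that only understands DEFLATE might handle the request body like this in Python. The function `decode_request_body` and the plain-dict headers are hypothetical, the sketch assumes a single coding in Content-Encoding, and it treats the HTTP "deflate" coding as zlib-wrapped data. Advertising the supported codings in an Accept-Encoding response header follows RFC 7694 and is optional.

```python
import zlib
from typing import Dict, Tuple


def decode_request_body(headers: Dict[str, str],
                        body: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, response headers, decoded body) for a request payload."""
    encoding = headers.get("Content-Encoding", "identity").strip().lower()
    if encoding == "identity":
        return 200, {}, body
    if encoding == "deflate":
        try:
            return 200, {}, zlib.decompress(body)
        except zlib.error:
            # The coding is supported, but the payload itself is malformed.
            return 400, {}, b""
    # Unsupported coding: 415, with a hint about what the server can decode.
    return 415, {"Accept-Encoding": "deflate, identity"}, b""
```

A gzip-encoded POST would then get a 415, while a corrupt DEFLATE payload gets a 400, which keeps "I cannot decode this format" separate from "this data is broken".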
Well, the browser made a request that the server cannot service because the information the client provided is in a format that cannot be handled by the server. However, this isn't the server's fault for not supporting the data the client provided, it's the client's fault for not listening to the server's Accept-* headers and providing data in an inappropriate encoding. That would make it a Client Error (400 series error code).
My first instinct is 400 Bad Request is the appropriate response in this case.
405 Method Not Allowed isn't right because it refers to the HTTP verb being one that isn't allowed.
406 Not Acceptable looks like it might have promise, but it refers to the server being unable to provide data to the client that satisfies the Accept-* request headers that it sent. This doesn't seem like it would fit your case.
412 Precondition Failed is rather vaguely defined. It might be appropriate, but I wouldn't bet on it.
415 Unsupported Media Type isn't right because it's not the data type that's being rejected, it's the encoding format.
After that we get into the realm of non-standard response codes.
422 Unprocessable Entity describes a response that should be returned if the request was well-formed but if it was semantically incorrect in some way. This seems like a good fit, but it's a WebDAV extension to HTTP and not standard.
Given the above, I'd personally opt for 400 Bad Request. If any other HTTP experts have a better candidate though, I'd listen to them instead. ;)
UPDATE: I'd previously been referencing the HTTP statuses from their page on Wikipedia. Whilst the information there seems to be accurate, it's also less than thorough. Looking at the specs from W3C gives a lot more information on HTTP 406, and it's leading me to think that 406 might be the right code after all.
10.4.7 406 Not Acceptable
The resource identified by the request is only capable of generating
response entities which have content characteristics not acceptable
according to the accept headers sent in the request.
Unless it was a HEAD request, the response SHOULD include an entity
containing a list of available entity characteristics and location(s)
from which the user or user agent can choose the one most appropriate.
The entity format is specified by the media type given in the
Content-Type header field. Depending upon the format and the
capabilities of the user agent, selection of the most appropriate
choice MAY be performed automatically. However, this specification
does not define any standard for such automatic selection.
Note: HTTP/1.1 servers are allowed to return responses which are
not acceptable according to the accept headers sent in the
request. In some cases, this may even be preferable to sending a
406 response. User agents are encouraged to inspect the headers of
an incoming response to determine if it is acceptable.
If the response could be unacceptable, a user agent SHOULD temporarily
stop receipt of more data and query the user for a decision on further
actions.
While it does mention the Content-Type header explicitly, the wording mentions "entity characteristics", which you could read as covering stuff like GZIP versus DEFLATE compression.
One thing worth noting is that the spec says that it may be appropriate to just send the data as is, along with the headers to tell the client what format it's in and what encoding it uses, and just leave it for the client to sort out. So if the client sends a header indicating it accepts GZIP compression, but the server can only generate a response with DEFLATE, then sending that along with headers saying it's DEFLATE should be okay (depending on the context).
Client: Give me a GZIPPED page.
Server: Sorry, no can do. I can DEFLATE pack it for you. Here's the DEFLATE packed page. Is that okay for you?
Client: Welllll... I didn't really want DEFLATE, but I can decode it okay so I'll take it.
(or)
Client: I think I'll have to clear that with my user. Hold on.

Resources