HTTP Headers: Access-Control-Allow-Methods vs Allow

What is the main difference between these two headers?
Access-Control-Allow-Methods is located in the headers collection of the request, while Allow can be found inside the Content.Headers collection.
Which one should I care about when handling OPTIONS requests?

Allow is a basic HTTP header which describes which HTTP methods may be used to request a resource. This applies to HTTP in general, not specifically to JavaScript; the header predates the existence of JS.
Access-Control-Allow-Methods is a CORS extension to HTTP which describes which HTTP methods may be used by client-side code to make cross-origin requests to a resource.
You must include an Allow header if you are making a 405 response. You may always include it.
You need to include Access-Control-Allow-Methods if you are responding to a preflight OPTIONS request (unless you don't want to grant the follow-up request any permissions).
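To make the distinction concrete, here is a minimal Node/TypeScript sketch (the origin, methods and paths are placeholders, not from the question): the preflight branch answers with Access-Control-Allow-Methods, while the 405 branch answers with Allow.

import { createServer } from "node:http";

createServer((req, res) => {
  if (req.method === "OPTIONS") {
    // CORS preflight: tell the browser which methods the follow-up
    // cross-origin request may use.
    res.writeHead(204, {
      "Access-Control-Allow-Origin": "https://example.com",
      "Access-Control-Allow-Methods": "GET, HEAD, POST",
      "Access-Control-Allow-Headers": "Content-Type",
    });
    res.end();
  } else if (req.method !== "GET" && req.method !== "HEAD") {
    // Plain HTTP: a 405 response must say which methods the resource supports.
    res.writeHead(405, { Allow: "GET, HEAD" });
    res.end();
  } else {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("hello");
  }
}).listen(8080);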

Does the ETag header make the Cache-Control header obsolete? How to make sure Cache-Control is not harmful then?

Definition of ETag header (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag):
The ETag HTTP response header is an identifier for a specific version
of a resource. It allows caches to be more efficient, and saves
bandwidth, as a web server does not need to send a full response if
the content has not changed. On the other side, if the content has
changed, etags are useful to help prevent simultaneous updates of a
resource from overwriting each other ("mid-air collisions").
Definition of Cache-Control header (https://developer.mozilla.org/de/docs/Web/HTTP/Headers/Cache-Control):
The Cache-Control general-header field is used to specify directives
for caching mechanisms in both requests and responses.
So the ETag header tells the browser to send a single HTTP request for a resource and ask the server whether the file hash has changed; if it has, download a new one. Great. So if the ETag header is set, why should I need Cache-Control any more (besides the Expires header, which may help avoid even that single request)?
So if I have to set the Cache-Control header anyway, it can only be harmful, right? I think the most appropriate value would be:
Cache-Control: must-revalidate
But I am not sure if this triggers unnecessary additional actions.
After some research, I found a great tutorial on Medium by Alex Barashkov: "Best practices for cache control settings for your website".
Alex writes:
I recommend you apply Cache-Control: no-cache to html files. Applying
“no-cache” does not mean that there is no cache at all, it simply
tells the browser to validate resources on the server before use it
from the cache. That’s why we need to use it with Etag, so browsers
will send a simple request and load the extra 80 bytes to verify the
state of the file.
The presence of an ETag header does not tell the browser to do anything. The browser decides what to do based on the Cache-Control header it receives in the request and in the cached response. If it decides that the resource is stale or needs to be revalidated, it can use the ETag value to make a conditional request to the server and either get a new resource (status code 200) or a notification that things have not changed (status code 304).
Both headers are necessary for your cache to work optimally.
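A minimal Node/TypeScript sketch of how the two headers cooperate (the resource body and hash scheme are illustrative assumptions): Cache-Control: no-cache tells the cache to revalidate before reuse, and the ETag makes that revalidation a cheap 304.

import { createServer } from "node:http";
import { createHash } from "node:crypto";

const body = "<html><body>hello</body></html>";
const etag = `"${createHash("sha1").update(body).digest("hex")}"`;

createServer((req, res) => {
  if (req.headers["if-none-match"] === etag) {
    // Content unchanged: answer the conditional request with 304 and no body.
    res.writeHead(304, { ETag: etag, "Cache-Control": "no-cache" });
    res.end();
    return;
  }
  res.writeHead(200, {
    ETag: etag,
    "Cache-Control": "no-cache", // cache it, but revalidate before each reuse
    "Content-Type": "text/html",
  });
  res.end(body);
}).listen(8080);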

HTTP response headers that cause side effects for other resources on the same origin

I have a server which hosts resources for several users on the same hostname. For example:
http://example.com/alice/blog.html
http://example.com/bob/cat.jpg
http://example.com/carol/todo.txt
I would like to allow users to specify their own response headers for resources within their directories, similar to what is done on AWS S3. For example, Carol may want her TODO list readable from scripts on another domain, so she might want Access-Control-Allow-Origin: * set for todo.txt.
While I want this feature to be as flexible as possible, I cannot allow just any response headers to be specified, as some response headers have side effects for the entire origin or hostname. For example, Set-Cookie could be used for one person's directory, but the user agent could then make a request to someone else's directory with the cookie value. As another example, a user could set Strict-Transport-Security, potentially locking out other users from using normal HTTP.
What other HTTP response headers have the potential for side effects for the entire origin, rather than just the resource that was requested? My list so far:
Alt-Svc
Public-Key-Pins
Server
Set-Cookie
Strict-Transport-Security
Rather than blocking response headers that could affect the entire domain, I would recommend a slightly different approach: specify a whitelist of response headers that are definitely okay to use. There could be new, experimental or browser-specific headers that are non-standard but potentially affect the entire domain for a user with a specific browser.
I would suggest that the following headers are safe to use and should be everything your user needs to modify:
Access-Control-Allow-Origin
Access-Control-Allow-Credentials
Access-Control-Expose-Headers
Access-Control-Max-Age
Access-Control-Allow-Methods
Access-Control-Allow-Headers
Age
Allow
Cache-Control
Content-Disposition
Content-Encoding
Content-Language
Content-Length
Content-Location
Content-Range
Content-Type
Date
ETag
Expires
Last-Modified
Link
Location
Pragma
Retry-After
Transfer-Encoding
For static content such as files and HTML pages I would not set Content-Range or Content-Length manually. The server should set many of these headers automatically. Nevertheless, overriding them might make sense for some users. Transfer-Encoding can be used to add gzip or deflate during transfer if your server supports it, but it must not be used with HTTP/2.
Also, Location, Allow and Retry-After only make sense for certain status codes, so you might want to omit them.
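A hypothetical TypeScript sketch of that whitelist approach (the function and variable names are my own, not from the answer): header names are compared case-insensitively and anything not on the list is silently dropped.

const ALLOWED_USER_HEADERS = new Set([
  "access-control-allow-origin",
  "access-control-allow-credentials",
  "cache-control",
  "content-disposition",
  "content-type",
  "etag",
  "expires",
  // ...plus the rest of the whitelist above
]);

function filterUserHeaders(userHeaders: Record<string, string>): Record<string, string> {
  const safe: Record<string, string> = {};
  for (const [name, value] of Object.entries(userHeaders)) {
    if (ALLOWED_USER_HEADERS.has(name.toLowerCase())) {
      safe[name] = value;
    }
  }
  return safe;
}

// Carol's configuration for todo.txt: Set-Cookie is dropped, the CORS header passes.
console.log(filterUserHeaders({ "Access-Control-Allow-Origin": "*", "Set-Cookie": "x=1" }));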

Are both "Cache request directives" and "Cache response directives" needed?

If I already have "Cache request directives", what is the point of "Cache response directives"? Do they add anything? Will my application run the same without them?
I am looking for proof of whether "Cache response directives" are redundant. If they are redundant, I will not bother with them.
I assume you are asking as an application developer and if so, you should not bother with any Cache-Control header your application receives in a request.
Why?
Because that Cache-Control header is intended for caches before the request reaches your application.
It is not for your application.
This is explained in RFC7234 Section 5.2 (emphasis mine):
The "Cache-Control" header field is used to specify directives for caches along the request/response chain.
The purpose of the header is to tell caches what to do with the request.
Your application receives the header because it is attached to a request.
But just because you receive it, it doesn't mean it is for you.
Bottom line: ignore any Cache-Control header in a request.
Cache-Control in a response comes from your application and it is also intended for caches.
You use it to tell caches what to do with the response.
Basically, you use the header to specify whether the response is cacheable and if it is, for how long.
It is not merely a copy of the Cache-Control header received in a request.
Do they add anything?
Yes, they do.
Cache-Control in a response tells caches whether the response is cacheable and if it is,
it allows caches to serve an equivalent request immediately with a cached response.
This reduces your application's load and improves response times from a client's point of view.
RFC7234 Section 4.2 states:
When a response is "fresh" in the cache, it can be used to satisfy subsequent requests without contacting the origin server, thereby improving efficiency.
Your next question:
Will my application run the same without them?
It depends.
If your application doesn't add an appropriate Cache-Control header to responses that must not be cached, future requests may receive stale responses.
So, I recommend that at the very least, add Cache-Control: no-cache to responses that must not be cached.
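As a sketch (Node/TypeScript; the routes are hypothetical), the response directives are what separate a response that caches may reuse from one that must always be revalidated:

import { createServer } from "node:http";

createServer((req, res) => {
  if (req.url === "/account") {
    // Per-user data: any cache must revalidate before reusing this response.
    res.writeHead(200, { "Cache-Control": "no-cache", "Content-Type": "application/json" });
    res.end(JSON.stringify({ balance: 42 }));
  } else {
    // Static content: caches may serve this for up to 5 minutes without asking again.
    res.writeHead(200, { "Cache-Control": "max-age=300", "Content-Type": "text/plain" });
    res.end("static content");
  }
}).listen(8080);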
Additional explanation for your question in the comment section
The header should generally come from your backend, not your frontend.
This allows caches to accurately accelerate requests to your backend and keeps your frontend request code simple.
There is one exception: if the backend isn't yours and its response freshness policy doesn't match your requirement.
An example scenario may be in order:
Let's say, that in addition to sending requests to your own backend, your frontend also sends requests to someone else's backend.
This particular backend specifies that its responses are cacheable for at most 5 minutes, by sending either Cache-Control: max-age=300 or an appropriate Expires header.
Let's also say that you want the responses to be no more than 10 seconds stale, because 5 minutes is too stale for you.
Since the backend isn't yours, you can't change the 5-minute directive, but you can send your requests with Cache-Control: max-age=10, thereby forcing the caches to fetch a fresh response if a cached response is older than 10 seconds, despite the 5-minute directive from the backend.
That is the appropriate situation to send Cache-Control header from your frontend: the backend isn't yours and its response freshness policy doesn't match your requirement.
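In frontend code that could look roughly like this (the URL is a placeholder; it assumes a global fetch and top-level await, i.e. a browser module or Node 18+):

// Ask intermediaries not to serve a cached copy older than 10 seconds,
// even though the third-party backend declares max-age=300.
const res = await fetch("https://third-party.example/api/quotes", {
  headers: { "Cache-Control": "max-age=10" },
});
console.log(await res.json());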
Are both "Cache request directives" and "Cache response directives" needed?
Yes. Cache-Control in the request header and Cache-Control in the response header are both needed. Even if you already have Cache-Control in the request header, Cache-Control in the response is not redundant. They are two different things. According to RFC7234:
cache directives are unidirectional in that the presence of a directive in a request does not imply that the same directive is to be given in the response.
Generally speaking, Cache-Control in the response header controls the cache behaviour from the resource provider's point of view: should the resource be stored in cache? How long is it valid? When requested, does it need to be revalidated? etc. As response headers can be configured for all HTTP requests, "Cache response directives" provide a way to define a cache policy for all resources.
Cache-Control in the request header, however, controls the cache behaviour from the resource consumer's point of view. It's more like defining exceptional cases where the cache policy of a specific resource should be adjusted. If you check RFC7234, most of the "Request Cache-Control Directives" indicate that the client is willing to... or is unwilling to...
Also, as request headers can only be configured in some cases (e.g. Ajax), "Cache request directives" don't exist for many HTTP requests. For example, after an HTML file is parsed, many HTTP requests are created to fetch static resources (image files, CSS files, etc.), and there is no way to configure the Cache-Control header for these requests programmatically.
If I already have "Cache request directives", what is the point of "Cache response directives"?
If you only have "Cache request directives" and never get a Cache-Control response header, some problems will happen:
Without a Cache-Control response header, the cache behaviour of all resources is decided by the browser (e.g. it calculates a validity time through the LM-Factor algorithm). In the worst case, there would be no cache at all.
For static resources (e.g. image files, CSS files), as you can't configure Cache-Control in the request, you lose the ability to control caching.

Default value for Access-Control-Allow-Methods

I just learned about the Access-Control-Allow-Methods header, e.g.
Access-Control-Allow-Methods: OPTIONS, HEAD, GET
I have never used this header (just Access-Control-Allow-Origin), but I have gotten CORS to work in the past.
Is the default to allow all methods, or have I gotten lucky with undefined behavior?
The Access-Control-Allow-Methods header indicates which HTTP methods are allowed on a particular endpoint for cross-origin requests. If you allow all HTTP methods, then it's okay to set the value to something like Access-Control-Allow-Methods: GET, PUT, POST, DELETE, HEAD. However, if you want to limit the endpoint to only a few methods, you should only include those methods.
As to why you haven't been seeing this before, this header is only used on CORS preflight requests. Maybe your application didn't use CORS preflight, and then something changed to trigger a preflight. Does your application use any HTTP methods other than GET/POST, or any custom HTTP headers?
You can learn more about CORS preflight requests here: http://www.html5rocks.com/en/tutorials/cors/
The default, when no Access-Control-Allow-Methods header is sent, is to allow all simple methods through, even on preflighted requests. As the flow on https://www.w3.org/TR/cors/#preflight-request says (step 7 of a successful preflight request):
If request method is not a case-sensitive match for any method in methods and is not a simple method, apply the cache and network error steps.
And the definition of simple method is:
A method is said to be a simple method if it is a case-sensitive match for one of the following: GET HEAD POST
So if you have a preflighted POST request (due to a custom HTTP header, say) and do not send an Access-Control-Allow-Methods response header, the request will still go ahead okay.
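For illustration, a TypeScript sketch (the URL and custom header are placeholders) of a request that gets preflighted because it carries a non-simple header and content type; the preflight can still succeed without Access-Control-Allow-Methods because POST is a simple method:

// The custom X-Request-Id header (and the application/json content type)
// makes this cross-origin POST non-simple, so the browser preflights it:
//   OPTIONS /items
//   Access-Control-Request-Method: POST
//   Access-Control-Request-Headers: x-request-id, content-type
// The server must still list those headers in Access-Control-Allow-Headers,
// but it may omit Access-Control-Allow-Methods because POST is simple.
const res = await fetch("https://api.other-server.com/items", {
  method: "POST",
  headers: { "X-Request-Id": "abc123", "Content-Type": "application/json" },
  body: JSON.stringify({ name: "test" }),
});
console.log(res.status);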

CORS - why is the Access-Control-Allow-Origin header necessary?

Premise:
When a site (http://example.com) tries to make a cross-origin request, the browser will send an HTTP request to the cross-origin server (http://other-server.com) with the header Origin: http://example.com. If the server at http://other-server.com approves http://example.com as a valid origin, then it will 1) respond without error AND 2) set the response header Access-Control-Allow-Origin: http://example.com.
My question is - why is it necessary to set the Access-control-allow-origin header in the response? Doesn't responding without error already acknowledge that the server (http://other-server.com) is allowing the cross-origin request?
This extra layer of acknowledgement gives servers a lot of flexibility over how they support CORS. For example:
1) A server has a lot of choices when setting the Access-Control-Allow-Origin header. It can use the * value to allow all clients, or it can limit the scope of clients by using the actual value of the origin (e.g. http://example.com). If a server does support CORS, but not for all origins, it could respond without error, but the Access-Control-Allow-Origin could be set to http://notyourorigin.com.
2) CORS allows even more flexibility via the Access-Control-Allow-Methods and Access-Control-Allow-Headers preflight response headers. These headers go beyond the simple binary success/error HTTP status, and provide more nuanced information about what is and is not supported in the server.
As the examples above point out, an error response without any context can be very confusing to the user. If you make a CORS request and all you get back is an error response, you have no idea why that request failed. Are you doing the request wrong? Does the server support CORS at all? This can be very difficult to figure out without any accompanying information. The Access-Control-* headers give more context to the user so they can effectively debug their CORS requests.
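From the page's side, the practical effect is that the browser withholds the response unless the allow-origin check passes; a sketch (placeholder URL, assuming fetch in a browser):

try {
  const res = await fetch("http://other-server.com/data.json");
  // Readable only if the response carried
  // Access-Control-Allow-Origin: http://example.com (or *).
  console.log(await res.json());
} catch (err) {
  // Without a matching header the browser blocks access and fetch rejects,
  // even though the server may have returned a successful status.
  console.error("CORS failure:", err);
}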

Resources