Does the browser cache evict existing entries after a successful PUT/POST/DELETE? - http

If I send a GET xhr from the browser and the response comes with the correct headers, then the resource is stored in the cache. But I was wondering what would happen if I then send a PUT to the same url. Is the browser smart enough to know that this method will change the resource and so evict the entry from the cache?

I found this in the rfc:
A cache MUST invalidate the effective Request URI (Section 5.5 of
[RFC7230]) as well as the URI(s) in the Location and Content-Location
response header fields (if present) when a non-error status code is
received in response to an unsafe request method.
The unsafe methods include PUT/POST/DELETE, so browsers should evict the entries after requests with these methods are successful (status code 2xx or 3xx).
Though one should be aware that a resource can change even if no requests that modify it are made.

Related

If the browser can cache PATCH requests

If you fetch an image to display a second or n+1 times, or fetch some JSON likewise, and nothing has changed, then the browser shouldn't actually download/fetch the content. This is how GET requests work with caching.
But I'm wondering, hypothetically, if instead of using GET you use PATCH to fetch the image or JSON. Wondering if the browser can still use its cached version if nothing has changed, or what needs to be done to make PATCH work like a GET so that it doesn't fetch cached content.
It's important to understand that PATCH is not for fetching anything. You're making a change on the server and the response may have information about how the change was applied.
HTTP requests other than GET sometimes can be cacheable. To find out if PATCH is, you can read the RFC's. The RFC has this to say:
A response to this method is only cacheable if it contains explicit freshness information (such as an Expires header or
"Cache-Control: max-age" directive) as well as the Content-Location
header matching the Request-URI, indicating that the PATCH response
body is a resource representation. A cached PATCH response can only
be used to respond to subsequent GET and HEAD requests; it MUST NOT
be used to respond to other methods (in particular, PATCH).
This already suggest 'no', doing a PATCH request twice will not result in the second to be skipped.
A second thing to look out for with HTTP methods is if they are idempotent or safe. PATCH is neither.
RFC7231 has this to say about cacheable methods:
In general, safe methods that
do not depend on a current or authoritative response are defined as
cacheable; this specification defines GET, HEAD, and POST as
cacheable, although the overwhelming majority of cache
implementations only support GET and HEAD.
Both of these suggest that 'no', PATCH is not cacheable, and there's no set of HTTP headers that will make it so.

How can I tell the "current age" of a cached page?

I'm wondering how the browser determines whether a cached resource has expired or not.
Assume that I have set the max-age header to 300. I made a request at 14:00, 3 minutes later I made another request to the same resource. So how can the browser tell the resource haven't expired (the current age which is 180 is less than the max-age)? Does the browser hold a "expiry date" or "current age" for every requested resource? If so how can I inspect the "current age" at the time I made the request?
Check what browsers store in their cache
To have a better understanding on how the browser cache works, check what the browsers actually store in their cache:
Firefox: Navigate to about:cache.
Chrome: Navigate to chrome://cache.
Note that there's a key for each cache entry (requested URL). Associated with the key, you will find the whole response details (status codes, headers and content). With those details, the browser is able to determine the age of a requested resource and whether it's expired or not.
The reference for HTTP caching
The RFC 7234, the current reference for caching in HTTP/1.1, tells you a good part of the story about how cache is supposed to work:
2. Overview of Cache Operation
Proper cache operation preserves the semantics of HTTP transfers
while eliminating the transfer of information already
held in the cache. Although caching is an entirely OPTIONAL feature
of HTTP, it can be assumed that reusing a cached response is
desirable and that such reuse is the default behavior when no
requirement or local configuration prevents it. [...]
Each cache entry consists of a cache key and one or more HTTP
responses corresponding to prior requests that used the same key.
The most common form of cache entry is a successful result of a
retrieval request: i.e., a 200 (OK) response to a GET request, which
contains a representation of the resource identified by the request
target. However, it is also possible to
cache permanent redirects, negative results (e.g., 404 (Not Found)),
incomplete results (e.g., 206 (Partial Content)), and responses to
methods other than GET if the method's definition allows such caching
and defines something suitable for use as a cache key.
The primary cache key consists of the request method and target URI.
However, since HTTP caches in common use today are typically limited
to caching responses to GET, many caches simply decline other methods
and use only the URI as the primary cache key. [...]
Some rules are defined regarding storing responses in caches:
3. Storing Responses in Caches
A cache MUST NOT store a response to any request, unless:
The request method is understood by the cache and defined as being
cacheable, and
the response status code is understood by the cache, and
the no-store cache directive does not appear
in request or response header fields, and
the private response directive does not
appear in the response, if the cache is shared, and
the Authorization header field does
not appear in the request, if the cache is shared, unless the
response explicitly allows it, and
the response either:
contains an Expires header field, or
contains a max-age response directive, or
contains a s-maxage response directive
and the cache is shared, or
contains a Cache Control Extension that
allows it to be cached, or
has a status code that is defined as cacheable by default, or
contains a public response directive.
Note that any of the requirements listed above can be overridden by a cache-control extension. [...]
Usually (but not always) the server providing the resource will provide a Date header, indicating the time at which that resource was requested. Caching entities can use that Date and the current time to find the resource's age. If the Date response header does not appear, that the caching entity will probably mark the resource's request time in other metadata, and use that metadata for computing the age. Another possibly helpful response header to look for is the Last-Modified response header.
So first, you should check if the cached resource has the Date header for your own age calculation. If not present, it will then depend on which specific browser you are using, and how that browser handles caching for Date-less resources. More information on HTTP caching and the various factors involved, can be found in this caching tutorial.
Hope this helps!

HTTP: Can I trust 'Content-Length' request header value

I'm developing a file uploading service. I want my users to be restricted for total uploaded files sizes, i.e. they have quotas for uploaded files.
So I want to check for available quota as a user starts upload a new file.
The easiest way is to take 'Content-Length' header value of the POST request and check it against remaining user's quota.
But I'm anxious about whether I can trust 'Content-Length' value.
What if a bad guy specifies a small value in 'Content-Length' header and starts uploading a huge file.
Should I check additionally during reading from input stream (and saving a file on disk) or it's redundant (and such a situation should be detected by Web servers)?
Short answer: it's safe
Long answer: Generally, servers are required to read (at most) as many bytes as are specified in the Content-Length request header. Any bytes that comes after that are expected to indicate a completely new request (reusing the same connection).
I'd assume that this requirement is validated on the server by checking that the next few bytes can be parsed as a request-line.
request-line = method SP request-target SP HTTP-version CRLF
If your bad guy is not smart enough to inject request headers in the right locations of the message body, the server should (must?) automatically treat the entire chain of requests as invalid and abort the file upload.
If your guy does inject new request headers in the message body (a.k.a. request smuggling), each resulting request is technically still valid and you'll still be able to trust that the Content-Length is valid for each message body. You just have to watch out for a different kind of attacks. For example: you might have a proxy installed that filters incoming requests, but only does so by checking the headers of the first request. The smuggled requests get a free pass, which is an obvious security breach.

304 Not Modified with 200 (from cache)

I'm trying to understand what exactly is the difference between "status 304 not modified" and "200 (from cache)"
I'm getting 304 on javascript files that I changed last. Why does this happen?
Thanks for the assistance.
https://sookocheff.com/post/api/effective-caching/ is an excellent source to form the required understanding around these 2 HTTP status codes.
After reading this thoroughly, I had this understanding
In typical usage, when a URL is retrieved, the web server will return the resource's current representation along with its corresponding ETag value, which is placed in an HTTP response header "ETag" field. The client may then decide to cache the representation, along with its ETag. Later, if the client wants to retrieve the same URL resource again, it will first determine whether the local cached version of the URL has expired (through the Cache-Control and the Expire headers). If the URL has not expired, it will retrieve the local cached resource. If it determined that the URL has expired (is stale), then the client will contact the server and send its previously saved copy of the ETag along with the request in a "If-None-Match" field. (Source: https://en.wikipedia.org/wiki/HTTP_ETag)
But even when expires time for an asset is set in future, browser can still reach the server for a conditional GET using ETag as per the 'Vary' header. Details on 'vary' header: https://www.fastly.com/blog/best-practices-using-vary-header/
"200 (from cache)" means the browser found a cached response for the request made. So instead of making a network call to fetch that resource from the remote server, it simply made use of the cached response.
Now these cached responses have a life time associated with them. With time, the responses can become stale. And if these are not allowed to be served stale (see section 4.2.4 - RFC7234), the browser needs to contact the remote server and verify whether these cached responses are still valid or not. The 304 response status code is the servers' way of letting the browser know that the response hasn't changed and is still valid.
If the browser receives a 304 response, it can "freshen" the relevant stale response(s). And subsequent requests to the resource can again be served from the browser cache without checking with the remote server (until the response becomes stale again).
304 Modified
A 304 Not Modified means that the files have not been modified since the specified version by the "If-Modified-Since", or "If-None-Match".
200 OK
This is the response you'll get if an HTTP request works. GET requests will have something relating to the files. A POST request will have something holding the result of the action.
Happy coding!Lyfe

Is an entity body allowed for an HTTP DELETE request?

When issuing an HTTP DELETE request, the request URI should completely identify the resource to delete. However, is it allowable to add extra meta-data as part of the entity body of the request?
The spec does not explicitly forbid or discourage it, so I would tend to say it is allowed.
Microsoft sees it the same way (I can hear murmuring in the audience), they state in the MSDN article about the DELETE Method of ADO.NET Data Services Framework:
If a DELETE request includes an entity body, the body is ignored [...]
Additionally here is what RFC2616 (HTTP 1.1) has to say in regard to requests:
an entity-body is only present when a message-body is present (section 7.2)
the presence of a message-body is signaled by the inclusion of a Content-Length or Transfer-Encoding header (section 4.3)
a message-body must not be included when the specification of the request method does not allow sending an entity-body (section 4.3)
an entity-body is explicitly forbidden in TRACE requests only, all other request types are unrestricted (section 9, and 9.8 specifically)
For responses, this has been defined:
whether a message-body is included depends on both request method and response status (section 4.3)
a message-body is explicitly forbidden in responses to HEAD requests (section 9, and 9.4 specifically)
a message-body is explicitly forbidden in 1xx (informational), 204 (no content), and 304 (not modified) responses (section 4.3)
all other responses include a message-body, though it may be of zero length (section 4.3)
Update
And in RFC 9110 (June 2022), The fact that request bodies on GET, HEAD, and DELETE are not interoperable has been clarified.
section 9.3.5 Delete
Although request message framing is independent of the method used,
content received in a DELETE request has no generally defined
semantics, cannot alter the meaning or target of the request, and
might lead some implementations to reject the request and close the
connection because of its potential as a request smuggling attack
(Section 11.2 of [HTTP/1.1]). A client SHOULD NOT generate content in
a DELETE request unless it is made directly to an origin server that
has previously indicated, in or out of band, that such a request has a
purpose and will be adequately supported. An origin server SHOULD NOT
rely on private agreements to receive content, since participants in
HTTP communication are often unaware of intermediaries along the
request chain.
The 2014 update to the HTTP 1.1 specification (RFC 7231) explicitly permits an entity-body in a DELETE request:
A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request.
Some versions of Tomcat and Jetty seem to ignore a entity body if it is present. Which can be a nuisance if you intended to receive it.
One reason to use the body in a delete request is for optimistic concurrency control.
You read version 1 of a record.
GET /some-resource/1
200 OK { id:1, status:"unimportant", version:1 }
Your colleague reads version 1 of the record.
GET /some-resource/1
200 OK { id:1, status:"unimportant", version:1 }
Your colleague changes the record and updates the database, which updates the version to 2:
PUT /some-resource/1 { id:1, status:"important", version:1 }
200 OK { id:1, status:"important", version:2 }
You try to delete the record:
DELETE /some-resource/1 { id:1, version:1 }
409 Conflict
You should get an optimistic lock exception. Re-read the record, see that it's important, and maybe not delete it.
Another reason to use it is to delete multiple records at a time (for example, a grid with row-selection check-boxes).
DELETE /messages
[{id:1, version:2},
{id:99, version:3}]
204 No Content
Notice that each message has its own version. Maybe you can specify multiple versions using multiple headers, but by George, this is simpler and much more convenient.
This works in Tomcat (7.0.52) and Spring MVC (4.05), possibly w earlier versions too:
#RestController
public class TestController {
#RequestMapping(value="/echo-delete", method = RequestMethod.DELETE)
SomeBean echoDelete(#RequestBody SomeBean someBean) {
return someBean;
}
}
Just a heads up, if you supply a body in your DELETE request and are using a google cloud HTTPS load balancer, it will reject your request with a 400 error. I was banging my head against a wall and came to found out that Google, for whatever reason, thinks a DELETE request with a body is a malformed request.
Roy Fielding on the HTTP mailing list clarifies that on the http mailing list https://lists.w3.org/Archives/Public/ietf-http-wg/2020JanMar/0123.html and says:
GET/DELETE body are absolutely forbidden to have any impact whatsoever
on the processing or interpretation of the request
This means that the body must not modify the behavior of the server.
Then he adds:
aside from
the necessity to read and discard the bytes received in order to maintain
the message framing.
And finally the reason for not forbidding the body:
The only reason we didn't forbid sending a body is
because that would lead to lazy implementations assuming no body would
be sent.
So while clients can send the payload body, servers should drop it
and APIs should not define a semantic for the payload body on those requests.
It appears to me that RFC 2616 does not specify this.
From section 4.3:
The presence of a message-body in a request is signaled by the
inclusion of a Content-Length or Transfer-Encoding header field in
the request's message-headers. A message-body MUST NOT be included in
a request if the specification of the request method (section 5.1.1)
does not allow sending an entity-body in requests. A server SHOULD
read and forward a message-body on any request; if the request method
does not include defined semantics for an entity-body, then the
message-body SHOULD be ignored when handling the request.
And section 9.7:
The DELETE method requests that the origin server delete the resource
identified by the Request-URI. This method MAY be overridden by human
intervention (or other means) on the origin server. The client cannot
be guaranteed that the operation has been carried out, even if the
status code returned from the origin server indicates that the action
has been completed successfully. However, the server SHOULD NOT
indicate success unless, at the time the response is given, it
intends to delete the resource or move it to an inaccessible
location.
A successful response SHOULD be 200 (OK) if the response includes an
entity describing the status, 202 (Accepted) if the action has not
yet been enacted, or 204 (No Content) if the action has been enacted
but the response does not include an entity.
If the request passes through a cache and the Request-URI identifies
one or more currently cached entities, those entries SHOULD be
treated as stale. Responses to this method are not cacheable.c
So it's not explicitly allowed or disallowed, and there's a chance that a proxy along the way might remove the message body (although it SHOULD read and forward it).
I don't think a good answer to this has been posted, although there's been lots of great comments on existing answers. I'll lift the gist of those comments into a new answer:
This paragraph from RFC7231 has been quoted a few times, which does sum it up.
A payload within a DELETE request message has no defined semantics;
sending a payload body on a DELETE request might cause some existing
implementations to reject the request.
What I missed from the other answers was the implication. Yes it is allowed to include a body on DELETE requests, but it's semantically meaningless. What this really means is that issuing a DELETE request with a request body is semantically equivalent to not including a request body.
Including a request body should not have any effect on the request, so there is never a point in including it.
tl;dr: Techically a DELETE request with a request body is allowed, but it's never useful to do so.
Using DELETE with a Body is risky... I prefer this approach for List Operations over REST:
Regular Operations
GET /objects/ Gets all Objects
GET /object/ID Gets an Object with specified ID
POST /objects Adds a new Object
PUT /object/ID Adds an Object with specified ID, Updates an Object
DELETE /object/ID Deletes the object with specified ID
All Custom actions are POST
POST /objects/addList Adds a List or Array of Objects included in body
POST /objects/deleteList Deletes a List of Objects included in body
POST /objects/customQuery Creates a List based on custom query in body
If a client doesn't support your extended operations they can work in the regular way.
It is worth noting that the OpenAPI specification for version 3.0 dropped support for DELETE methods with a body:
see here and here for references
This may affect your implementation, documentation, or use of these APIs in the future.
It seems ElasticSearch uses this:
https://www.elastic.co/guide/en/elasticsearch/reference/5.x/search-request-scroll.html#_clear_scroll_api
Which means Netty support this.
Like mentionned in comments it may not be the case anymore
Several other answers mention RFC 7231 which had effectively said that a DELETE request is allowed to have a body but it is not recommended.
In 2022, RFC 7231 was superseded by RFC 9110: HTTP Semantics, which now says:
[...] content received in a DELETE request has no generally defined semantics, cannot alter the meaning or target of the request, and might lead some implementations to reject the request and close the connection [...]. A client SHOULD NOT generate content in a DELETE request unless it is made directly to an origin server that has previously indicated, in or out of band, that such a request has a purpose and will be adequately supported. An origin server SHOULD NOT rely on private agreements to receive content, since participants in HTTP communication are often unaware of intermediaries along the request chain.
This language has been strengthened from the previous language, to say that even though it is allowed, you really need to be very careful when using it because (for example) some users might be behind a proxy that would strip the body from the request in order to combat "request smuggling".
In case anyone is running into this issue testing, No, it is not universally supported.
I am currently testing with Sahi Pro and it is very apparent a http DELETE call strips any provided body data (a large list of IDs to delete in bulk, as per endpoint design).
I have been in contact with them several times, also sent three separate packages of scripts, images and logs for them to review and they still have not confirmed this. A failed patch, and a missed conference calls by their support later and I still haven't gotten a solid answer.
I am certain Sahi does not support this, and I would imagine many other tools follow suite.
This is not defined.
A payload within a DELETE request message has no defined semantics;
sending a payload body on a DELETE request might cause some existing
implementations to reject the request.
https://www.rfc-editor.org/rfc/rfc7231#page-29
Practical answer: NO
Some clients and servers ignore or even delete the body in DELETE request. In some rare cases they fail and return an error.
Might be the below GitHUb url will help you, to get the answer.
Actually, Application Server like Tomcat, Weblogic denying the HTTP.DELETE call with request payload. So keeping these all things in mind, I have added example in github,please have a look into that
https://github.com/ashish720/spring-examples

Resources