What determines request equivalence for HTTP caching? - http

I feel like this has to be easy to Google, but I can't find it: from the perspective of an HTTP cache, what determines if two requests are equivalent?
I imagine one ingredient is that their URLs need to be identical; for example, rearranging (but not changing) query string parameters seems to cause a cache miss. Presumably they also need to have the same Accept header. What else determines whether a request can be served from cache?

This is mostly described in this RFC: https://www.rfc-editor.org/rfc/rfc7234#section-4
Summary:
The method.
The full URI.
Caching-related headers in the response, which determine whether it was stored in the first place.
Any request headers listed in the Vary response header.
It also matters whether you are caching for a specific user (for example a browser) or for many users (for example a proxy).

I also struggled with this. Changing my Google search to use "http cache key" generated better results. Using the URL as the key seems to be the most common approach. Query strings are also generally included.
https://support.cloudflare.com/hc/en-us/articles/115004290387-Using-Custom-Cache-Keys describes what the default is for Cloudflare and discusses the impact of using different keys.
Another useful setting is the type of assets that you want to cache; alternatively, leave it open (no filtering).
The "Authorization" header is specifically mentioned in the HTTP spec (https://www.rfc-editor.org/rfc/rfc7234) and needs to be handled: a shared cache must not store a response to a request that carries it unless the response explicitly allows that.
Upon further reading, I noticed the section on "Secondary keys" in the standard (https://www.rfc-editor.org/rfc/rfc7234#section-4.1) and the use of the "Vary" header in a response. The request headers named in the "Vary" response header have to match between the original and the new request for the cache to declare a match.
As for the primary key, the standard says "The primary cache key consists of the request method and target URI." in https://www.rfc-editor.org/rfc/rfc7234#section-2
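The primary/secondary key matching described above can be sketched roughly as follows. This is a minimal illustration, not how any particular cache is implemented; the function names and the shape of the `stored` dictionary are assumptions made for the example.

```python
def primary_key(method, url):
    """Primary cache key: request method plus target URI, compared as-is."""
    return (method, url)


def secondary_key(request_headers, vary):
    """Secondary key: values of the request headers named by the Vary response header."""
    lowered = {name.lower(): value.strip() for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return tuple((name, lowered.get(name)) for name in names)


def can_reuse(stored, method, url, headers):
    """Decide whether a stored entry (hypothetical dict layout) matches a new request."""
    if primary_key(method, url) != stored["primary"]:
        return False
    vary = stored["vary"]
    if vary.strip() == "*":
        return False  # "Vary: *" never matches a later request
    return secondary_key(headers, vary) == secondary_key(stored["request_headers"], vary)


stored = {
    "primary": primary_key("GET", "http://example.com/a?x=1&y=2"),
    "vary": "Accept-Encoding",
    "request_headers": {"Accept-Encoding": "gzip", "Accept": "text/html"},
}
```

With this entry, a request for the same URL with the same Accept-Encoding matches even if its Accept header differs (Accept is not listed in Vary), while the same query parameters in a different order do not match, because the URI string differs.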

There are also the conditional request headers used for cache validation: If-Match, If-Unmodified-Since, If-None-Match and If-Modified-Since. For example, If-Modified-Since works this way: suppose you have already requested a page and now want to reload it. If the header is present, the server sends back the full page ONLY if it was modified after the date given as the value of If-Modified-Since; otherwise it returns a 304 (Not Modified) status.
Accept and the Accept-* headers, by contrast, drive content negotiation, for example the language in which the page should be returned.
More on conditional requests here: https://www.rfc-editor.org/rfc/rfc7232#page-13
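As a rough sketch of the If-Modified-Since behaviour described above, the server-side decision could look like this. The function name `respond` and its parameters are made up for the example, and only the status code is modelled.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def respond(last_modified: datetime, if_modified_since: Optional[str]) -> int:
    """Return 304 if the resource is unchanged since the given HTTP date, else 200."""
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # an unparseable date is ignored
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)  # treat a zone-less date as UTC
    # HTTP dates only have one-second resolution
    if last_modified.replace(microsecond=0) <= since:
        return 304
    return 200


last_modified = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
status = respond(last_modified, "Wed, 21 Oct 2015 07:28:00 GMT")
```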

Related

If the browser can cache PATCH requests

If you fetch an image to display a second or n+1 times, or fetch some JSON likewise, and nothing has changed, then the browser shouldn't actually download/fetch the content. This is how GET requests work with caching.
But I'm wondering, hypothetically, what happens if you use PATCH instead of GET to fetch the image or JSON. Can the browser still use its cached version if nothing has changed, and what would need to be done to make PATCH work like GET so that it doesn't re-fetch content that is already cached?
It's important to understand that PATCH is not for fetching anything. You're making a change on the server and the response may have information about how the change was applied.
HTTP requests other than GET can sometimes be cacheable. To find out whether PATCH is, you can read the RFCs. RFC 5789, which defines PATCH, has this to say:
A response to this method is only cacheable if it contains explicit freshness information (such as an Expires header or
"Cache-Control: max-age" directive) as well as the Content-Location
header matching the Request-URI, indicating that the PATCH response
body is a resource representation. A cached PATCH response can only
be used to respond to subsequent GET and HEAD requests; it MUST NOT
be used to respond to other methods (in particular, PATCH).
This already suggests 'no': sending a PATCH request twice will not cause the second one to be skipped.
A second thing to look out for with HTTP methods is if they are idempotent or safe. PATCH is neither.
RFC7231 has this to say about cacheable methods:
In general, safe methods that
do not depend on a current or authoritative response are defined as
cacheable; this specification defines GET, HEAD, and POST as
cacheable, although the overwhelming majority of cache
implementations only support GET and HEAD.
Both of these suggest that the answer is 'no': a PATCH request is never answered from cache, and there's no set of HTTP headers that will make it so.

How can I tell the "current age" of a cached page?

I'm wondering how the browser determines whether a cached resource has expired or not.
Assume that I have set max-age to 300. I made a request at 14:00, and 3 minutes later I made another request for the same resource. How can the browser tell that the resource hasn't expired (the current age, 180, is less than max-age)? Does the browser hold an "expiry date" or "current age" for every requested resource? If so, how can I inspect the "current age" at the time I make the request?
Check what browsers store in their cache
To get a better understanding of how the browser cache works, check what browsers actually store in their cache:
Firefox: Navigate to about:cache.
Chrome: Navigate to chrome://cache (available in older versions only; the page has since been removed).
Note that there's a key for each cache entry (requested URL). Associated with the key, you will find the whole response details (status codes, headers and content). With those details, the browser is able to determine the age of a requested resource and whether it's expired or not.
The reference for HTTP caching
RFC 7234, the current reference for caching in HTTP/1.1, tells you a good part of the story about how a cache is supposed to work:
2. Overview of Cache Operation
Proper cache operation preserves the semantics of HTTP transfers
while eliminating the transfer of information already
held in the cache. Although caching is an entirely OPTIONAL feature
of HTTP, it can be assumed that reusing a cached response is
desirable and that such reuse is the default behavior when no
requirement or local configuration prevents it. [...]
Each cache entry consists of a cache key and one or more HTTP
responses corresponding to prior requests that used the same key.
The most common form of cache entry is a successful result of a
retrieval request: i.e., a 200 (OK) response to a GET request, which
contains a representation of the resource identified by the request
target. However, it is also possible to
cache permanent redirects, negative results (e.g., 404 (Not Found)),
incomplete results (e.g., 206 (Partial Content)), and responses to
methods other than GET if the method's definition allows such caching
and defines something suitable for use as a cache key.
The primary cache key consists of the request method and target URI.
However, since HTTP caches in common use today are typically limited
to caching responses to GET, many caches simply decline other methods
and use only the URI as the primary cache key. [...]
Some rules are defined regarding storing responses in caches:
3. Storing Responses in Caches
A cache MUST NOT store a response to any request, unless:
The request method is understood by the cache and defined as being
cacheable, and
the response status code is understood by the cache, and
the no-store cache directive does not appear
in request or response header fields, and
the private response directive does not
appear in the response, if the cache is shared, and
the Authorization header field does
not appear in the request, if the cache is shared, unless the
response explicitly allows it, and
the response either:
contains an Expires header field, or
contains a max-age response directive, or
contains a s-maxage response directive
and the cache is shared, or
contains a Cache Control Extension that
allows it to be cached, or
has a status code that is defined as cacheable by default, or
contains a public response directive.
Note that any of the requirements listed above can be overridden by a cache-control extension. [...]
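The storing rules quoted above can be condensed into a simplified check such as the one below. It is only a sketch: header names are assumed to be lower-cased dictionary keys, only GET and HEAD are treated as cacheable methods, cache-control extensions are ignored, and the list of status codes that are cacheable by default is taken from RFC 7231. The names `may_store` and `parse_cache_control` are made up for the example.

```python
CACHEABLE_METHODS = {"GET", "HEAD"}  # assumption: what this sketch's cache supports
DEFAULT_CACHEABLE_STATUS = {200, 203, 204, 206, 300, 301, 404, 405, 410, 414, 501}


def parse_cache_control(value):
    """Split a Cache-Control value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def may_store(method, status, request_headers, response_headers, shared_cache):
    """Simplified version of the RFC 7234 section 3 storing rules."""
    req_cc = parse_cache_control(request_headers.get("cache-control", ""))
    res_cc = parse_cache_control(response_headers.get("cache-control", ""))
    if method not in CACHEABLE_METHODS:
        return False
    if "no-store" in req_cc or "no-store" in res_cc:
        return False
    if shared_cache and "private" in res_cc:
        return False
    if shared_cache and "authorization" in request_headers:
        # Only allowed if the response explicitly permits it
        if not set(res_cc) & {"must-revalidate", "public", "s-maxage"}:
            return False
    return (
        "expires" in response_headers
        or "max-age" in res_cc
        or (shared_cache and "s-maxage" in res_cc)
        or status in DEFAULT_CACHEABLE_STATUS
        or "public" in res_cc
    )
```

For example, `may_store("GET", 200, {}, {"cache-control": "private, max-age=60"}, shared_cache=True)` is false, while the same call with `shared_cache=False` is true.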
Usually (but not always) the server providing the resource will send a Date header, indicating the time at which the response was generated. Caching entities can use that Date and the current time to compute the resource's age. If the Date response header is missing, the caching entity will probably record the time it received the response in other metadata and use that for computing the age. Another possibly helpful response header to look for is Last-Modified.
So first, you should check if the cached resource has the Date header for your own age calculation. If not present, it will then depend on which specific browser you are using, and how that browser handles caching for Date-less resources. More information on HTTP caching and the various factors involved, can be found in this caching tutorial.
Hope this helps!
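For the max-age=300 scenario from the question, the age check can be approximated as below. This is a deliberately simplified sketch: it derives the age only from the Date header and ignores the Age header and the response-delay corrections of the full RFC 7234 calculation. The function names and the example date are assumptions.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def current_age(date_header: str, now: datetime) -> int:
    """Seconds elapsed since the response's Date header (never negative)."""
    generated = parsedate_to_datetime(date_header)
    if generated.tzinfo is None:
        generated = generated.replace(tzinfo=timezone.utc)
    return max(0, int((now - generated).total_seconds()))


def is_fresh(date_header: str, max_age: int, now: datetime) -> bool:
    """A response is fresh while its freshness lifetime exceeds its current age."""
    return max_age > current_age(date_header, now)


date_header = "Mon, 01 Jan 2024 14:00:00 GMT"           # response generated at 14:00
second_request = datetime(2024, 1, 1, 14, 3, tzinfo=timezone.utc)  # 3 minutes later
age = current_age(date_header, second_request)           # 180 seconds
fresh = is_fresh(date_header, 300, second_request)       # True: 180 < 300
```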

Why do we need HTTP GET? Is there anything that can't be achieved by HTTP POST?

As far as I know, whatever GET can do can also be achieved with POST. So why was GET required in the first place when the HTTP protocol was defined? If GET is only for fetching a resource, people can still update resources by sending parameter values in the URL. Why this loophole? Or has whoever wrote the server-side code that updates a resource on a GET request simply written bad code?
HTTP specifies different methods for different purposes. The GET method is intended to be used to “retrieve whatever information (in the form of an entity) is identified by the Request-URI”. In particular, it is intended to be a safe and idempotent method. That means a GET request should not have side effects (i.e. it should not change data):
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval.
And sending an identical request multiple times results in the same as sending it just once:
Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property.
Practically, no browser implements POSTing by clicking links (without intercepting the click event in JavaScript), nor bookmarking POST data. Furthermore, semantically POST and GET serve different purposes. One is for POSTing data to an application, the other is for GETting data from the application. These semantics have practical implications, but they also have theoretical design implications that speak to the quality of your application's design: an application that doesn't handle GET differently from POST probably has a great deal of security problems and workflow bugs.
From RFC 2616:
9.3 GET
The GET method means retrieve whatever
information (in the form of an entity)
is identified by the Request-URI. If
the Request-URI refers to a
data-producing process, it is the
produced data which shall be returned
as the entity in the response and not
the source text of the process, unless
that text happens to be the output of
the process.
The semantics of the GET method change
to a "conditional GET" if the request
message includes an If-Modified-Since,
If-Unmodified-Since, If-Match,
If-None-Match, or If-Range header
field. A conditional GET method
requests that the entity be
transferred only under the
circumstances described by the
conditional header field(s). The
conditional GET method is intended to
reduce unnecessary network usage by
allowing cached entities to be
refreshed without requiring multiple
requests or transferring data already
held by the client.
The semantics of the GET method change
to a "partial GET" if the request
message includes a Range header field.
A partial GET requests that only part
of the entity be transferred, as
described in section 14.35. The
partial GET method is intended to
reduce unnecessary network usage by
allowing partially-retrieved entities
to be completed without transferring
data already held by the client.
The response to a GET request is
cacheable if and only if it meets the
requirements for HTTP caching
described in section 13.
See section 15.1.3 for security
considerations when used for forms.
9.5 POST
The POST method is used to request
that the origin server accept the
entity enclosed in the request as a
new subordinate of the resource
identified by the Request-URI in the
Request-Line. POST is designed to
allow a uniform method to cover the
following functions:
- Annotation of existing resources;
- Posting a message to a bulletin board, newsgroup, mailing list,
or similar group of articles;
- Providing a block of data, such as the result of submitting a
form, to a data-handling process;
- Extending a database through an append operation.
The actual function performed by the POST method
is determined by the server and is
usually dependent on the Request-URI.
The posted entity is subordinate to
that URI in the same way that a file
is subordinate to a directory
containing it, a news article is
subordinate to a newsgroup to which it
is posted, or a record is subordinate
to a database.
The action performed by the POST
method might not result in a resource
that can be identified by a URI. In
this case, either 200 (OK) or 204 (No
Content) is the appropriate response
status, depending on whether or not
the response includes an entity that
describes the result.
If a resource has been created on the
origin server, the response SHOULD be
201 (Created) and contain an entity
which describes the status of the
request and refers to the new
resource, and a Location header (see
section 14.30).
Responses to this method are not
cacheable, unless the response
includes appropriate Cache-Control or
Expires header fields. However, the
303 (See Other) response can be used
to direct the user agent to retrieve a
cacheable resource.
POST requests MUST obey the message
transmission requirements set out in
section 8.2.
See section 15.1.3 for security
considerations.
As stated, the response may change with GET if the request message has conditionals based on certain criteria. The POST requires that the server accept the request, no matter what.
Anytime you do a web search and you want to link someone to it, you can easily do it through:
http://www.google.com/search?q=lol
Can you imagine telling someone to do a POST request instead? A POST request isn't really bookmarkable like that, which is why GET is useful.
They simply have different purposes, as stated in other answers. GET is for GETing, POST is for POSTing.
Everything can also be achieved using raw TCP connections. Yet we often use HTTP rather than raw TCP connections because HTTP offers a layer of abstraction and, therefore, convenience and conforming implementations. Likewise, we use HTTP correctly (GETs, POSTs, PUTs, DELETEs, etc) rather than dumbly (POSTs only) because these verbs offer an additional layer of abstraction and, therefore, convenience and conforming implementations.
Let's say I want to send a variable to a page via a link. Can I do that with POST? Nope, but with GET I can send something over by appending ?variableName=someValue
You're right, everything can be tunneled through an HTTP POST. In fact, SOAP web services do exactly that. Everything is a POST using SOAP web services.
In that case, you are tunneling through HTTP, and not using HTTP to its fullest. If that's all you want to do, then that's fine.
However, if you wish to leverage HTTP for the features and benefits that it provides beyond simple message transport, then you should read the RFC and learn the rest of the HTTP protocol including GET, PUT, POST, DELETE, and all of the headers, cache management and result codes.

How do I deal with different requests that map to the same response?

I'm designing a Web service. The request is idempotent, so I chose the GET method. The response is relatively expensive to calculate and not small, so I want to get caching (on the protocol level) right. (Don't worry about memoisation on my part, I have that already covered; my question here is really about the Web as a whole.)
There's only one mandatory parameter and a number of optional parameters with default values if missing. For example, the following two map to the same representation of the response. (If this is a dumb way to go about the interface, propose something better.)
GET /service?mandatory_parameter=some_data HTTP/1.1
GET /service?mandatory_parameter=some_data;optional_parameter=default1;another_optional_parameter=default2;yet_another_optional_parameter=default3 HTTP/1.1
However, I imagine clients do not know this and would treat them separately, and therefore waste cache storage. What should I do to avoid violating the golden rule of caching?
Make up a canonical form, document it (e.g. all parameters are required after all and need to be sorted in a specific order) and return a client error unless the required form is met?
Instead of an error, redirect permanently to the canonical form of a request?
Or is it enough not to mind what the request looks like, and just respond with the same ETag for identical responses?
First, don't use semicolons as a delimiter in a query string. You should be using ? to begin a query string and & to delimit variable/value pairs. RFC 3986 doesn't explicitly say you have to use &, but the vast majority of existing code uses this delimiter because of the application/x-www-form-urlencoded precedent.
Second, you're right, in that parameters in a query string result in a different URI, and thus, as far as caches are concerned, a different resource. Assuming you want optimal caching performance, if you know that an optional parameter has been specified, and its inclusion is unnecessary and does not affect the representation that will be transmitted, you should be making a redirect to a canonical representation that omits the parameter. (i.e., An optional parameter is given with a value that is set to the default value. For example, if you have http://example.com:80/, you can normalize to http://example.com/ because 80 is the default value for the port with HTTP. You can do the same for query parameters since you control the URI space.) If you have parameters included (optional or otherwise) that appear in an order other than the canonical order, you should redirect for that too. A 301 redirect would be preferred if you know that the relationship between URIs will be stable. Otherwise, do a 302/307 redirect as appropriate. I would recommend defining your canonical form the same way that OAuth does: Sort each parameter alphabetically, first by key, then by value. Other normalization operations will also help out here. RFC 3986 has an entire section on URI normalization that will be relevant to you. This technique will really only work for GET, and redirects on PUT/POST/DELETE are not generally recommended.
Third, ETags are great, and they provide a huge performance improvement if implemented well by both the client and server. However, it's unfortunately rare for both sides to do it right. Ditto for Last-Modified. You should pursue these, because the CPU and bandwidth savings are significant when it works, but they are not sufficient on their own. Other headers like Cache-Control are also frequently necessary. It's worth familiarizing yourself with Section 13 of RFC 2616 if you're planning on going into great detail on this stuff.
Finally, a word of warning — there is an issue with these redirects you need to be aware of: Clients trying to access your resources may frequently be redirected to other locations. This introduces overhead that only gives you an overall savings if the clients make subsequent requests against the same resource, maintaining state to avoid the subsequent redirect. Unless you've open-sourced a reference client implementation that takes advantage of your caching optimizations, you may never benefit from these tweaks.
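The redirect-to-canonical-form approach from this answer could be sketched as follows. The parameter names and default values are the ones from the question; the `DEFAULTS` table, the function names, and the choice to drop default-valued parameters and sort the rest by key and then value are assumptions for illustration. The query string is assumed to use & as the delimiter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical defaults, taken from the example request in the question
DEFAULTS = {
    "optional_parameter": "default1",
    "another_optional_parameter": "default2",
    "yet_another_optional_parameter": "default3",
}


def canonical_url(url):
    """Drop parameters that equal their default and sort the rest by key, then value."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((k, v) for k, v in pairs if DEFAULTS.get(k) != v)
    return urlunsplit(parts._replace(query=urlencode(kept)))


def route(url):
    """Return (status, location): 301 to the canonical URL, or 200 if already canonical."""
    target = canonical_url(url)
    if target == url:
        return 200, None
    return 301, target


result = route("/service?optional_parameter=default1&mandatory_parameter=some_data")
```

Here `result` is a 301 pointing at `/service?mandatory_parameter=some_data`, and requesting that URL directly yields a 200 without a further redirect.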
I would pick option (2) in your list - I would make the request RESTful, rather than RPC like.
I.e. in this case, if you make all of the parameters parts of the request path:
/service/mandatory_parameter/some_data/optional_parameter/default1/another_optional_parameter/default2/yet_another_optional_parameter/default3
In the case where not all of the optional parameters are specified, return a 301 (Permanent redirect) to the full resource name with the defaults filled in. This will (or should) be cached by clients and web caches appropriately, and even if it gets to your backend then making the 301 should be very cheap for you.
At which point, you have one canonical form for the URI, and caching will work as normal/expected.
This does mean that every combination of parameters will be cached separately (as a 301), however that's fine really as the non-canonical requests will have an independent cache policy to the full request and clients which are worried about the extra round trip can fill in all the parameters themselves.
Your option (3) won't work as you expect - each form will be cached independently as they're different URIs.
It should also be noted that a lot of downstream caches / software won't cache your response at all due to the query parameters, which is why I suggest turning it into a 'proper' resource.
First, it's a good thing you chose GET, since other methods don't have caching support that is as good. As far as I know, browsers cache URIs with respect to their parameters, so I don't think it's a good idea to use a canonical form.
One thing that you don't state here is how this service is going to be used. If those requests are made from a browser (and it looks to me as if they are probably issued from a script), requests will probably look the same even if they are made more than once. So make sure that whatever generates the URI ends up with the same URI for equal input data (remove default parameters or always include them).
When it comes to the ETag, I recommend having one, though I would like to clarify how it works: you get the request, you process all your "expensive calculations", and then, if there was an If-None-Match header with the same hash (ETag) as your processed response, you may return 304 Not Modified. So the ETag is used to avoid transmitting the response if the client already has it. (Sure, you may implement caching on the server side, but that is better done based on the input parameters.)
To further improve cache hits on the client side, you may want to set proper caching headers in your response.
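The ETag flow described in this answer might look roughly like this on the server. It is a sketch under assumptions: the ETag is a truncated SHA-256 of the response body, the If-None-Match value is split naively on commas, and `make_etag` and `handle_get` are hypothetical names.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body (one possible scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET, honouring If-None-Match."""
    etag = make_etag(body)
    if if_none_match is not None:
        candidates = [c.strip() for c in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored
        candidates = [c[2:] if c.startswith("W/") else c for c in candidates]
        if "*" in candidates or etag in candidates:
            return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


# First request: full response. Second request: the client echoes the ETag back.
status, headers, payload = handle_get(b"expensive result")
revalidated = handle_get(b"expensive result", headers["ETag"])
```

Note that the body still has to be computed before the comparison, which is exactly the point made above: the ETag saves the transfer, not the calculation.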
I asked myself almost the same question some months ago. I'll describe my answer using my own implementation as an example.
On the server side I have a WCF service which receives requests in one of the following forms
GET /Service/RequestedData?param1=data1&param2=data2…
GET /Service/RequestedData/IdOfData?param1=data1&param2=data2…
PUT /Service/RequestedData/IdOfData // with param1=data1&param2=data2… in body
POST /Service/RequestedData/IdOfData // with param1=data1&param2=data2… in body
DELETE /Service/RequestedData/IdOfData
So the requests are in REST form, but the GET requests have some optional parameters. This is the part that is of interest to you.
Because WCF supports URL templates, the prototypes of the functions that reply to a client request look like
[WebGet (UriTemplate = "RequestedData?param1={myParam1}&param2={myParam2}",
ResponseFormat = WebMessageFormat.Json)]
[OperationContract]
MyResult GetData (string myParam1, int myParam2);
All requests like
GET /Service/RequestedData?param1=&param2=data2
GET /Service/RequestedData?param2=data2&param1=
GET /Service/RequestedData?param2=data2
will be mapped to the same call on my WCF service. So I have one problem less.
Now, at the beginning of every method that responds to an HTTP GET request, I set the HTTP header "Cache-Control: max-age=0". This means that the client always revalidates its cached copy with the server, and no Ajax request is silently answered from the local cache, as Internet Explorer can otherwise do.
Next, I always calculate an ETag based on my data. The exact algorithm is a subject for a separate discussion; what matters is that every response to an HTTP GET request carries an ETag header.
So clients verify their local cache every time by sending a GET request to the server. They send the ETag from their local cache in the "If-None-Match" HTTP header. The server computes the ETag of the data it would send back for this GET request. If that ETag is the same as the one in the client request, the server sends back a response with an empty body and the code "304 Not Modified". In this case the browser serves the data from its local cache.
If, for some unknown reason, the same client creates a new variant of the request URL, which the web browser interprets as a new URL, then the browser will not find the old server response in its local cache and will send the request to the server once more. Is that a real problem? The server sends the data one more time. If you have server-side caching, you can optimize a little further. In most cases the URLs of GET requests are produced by client-side JavaScript, so you will hardly ever run into this situation.
Calculating the ETag, setting the "Cache-Control: max-age=0" and ETag headers, and returning the "304 Not Modified" code all have to be done by the WCF service, but this is very easy.
Most importantly, my ETag calculation is not as expensive as fetching all the data from the database server and calculating an MD5 hash of it. I consistently use the rowversion data type in every row of data in the SQL Server database. rowversion is nothing more than a counter of changes in the database: if a row of data changes, the rowversion value in that row is incremented. So if a SELECT statement returns the maximum rowversion value and it has not changed compared with the previous request, one can be sure that the data has not changed in the meantime. The ETag algorithm only needs extra care to be sensitive to rows being deleted from the table, but that is also a solved problem. You can read a little more about this in Concurrency handling of Sql transactrion.
I don't want to suggest my ETag calculation as the best choice; I only want to say that calculating an ETag can be much cheaper than calculating an MD5 hash of the whole data.
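To illustrate the idea of a cheap, version-based ETag, here is a small sketch. It is not the author's algorithm, which the answer does not spell out: combining the maximum rowversion with the row count to notice deletions is an assumption made for this example, and `cheap_etag` is a hypothetical helper. In practice both inputs would come from a single aggregate query over the table.

```python
def cheap_etag(max_rowversion: int, row_count: int) -> str:
    """Build an ETag from two cheap aggregates instead of hashing all the data.

    max_rowversion changes on every insert or update; row_count (an assumption
    of this sketch) additionally changes when a row is deleted.
    """
    return f'"{max_rowversion:x}-{row_count:x}"'


before = cheap_etag(2001, 42)
after_update = cheap_etag(2002, 42)   # some row was updated
after_delete = cheap_etag(2001, 41)   # a row other than the newest was deleted
```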
In case of errors, the server throws an exception that is mapped to an HTTP status code, which I define in the throw statement. As the body, WCF sends a standard JSON object {"description":"My error text"}. A custom error object is also possible (see Is WebProtocolException included in .net 4.0?). On the client side I use jQuery, and in the error event handler of the corresponding jQuery.ajax call the error message is decoded and displayed to the user.
So my recommendation: use an ETag together with "Cache-Control: max-age=0" for all HTTP GET requests. For all other requests I recommend implementing a RESTful service. For error handling, look for the most native way supported by the software used for the server and client implementations, and use that.
UPDATED: To clarify the URL structure, I should add the following. In my service the main part, like GET /Service/RequestedData/IdOfData, describes the data objects requested. The parameters param1=data1&param2=data2 mostly carry information about sorting, paging and filtering of data. I actively use the jqGrid plugin for jQuery, and if the end user scrolls to the next page in the grid, clicks on a column header (sorting the data) or sets a filter through the search feature, all of these result in different optional parameters appended to the main URL.

What's the difference between a POST and a PUT HTTP REQUEST?

They both seem to be sending data to the server inside the body, so what makes them different?
HTTP PUT:
PUT puts a file or resource at a specific URI, and exactly at that URI. If there's already a file or resource at that URI, PUT replaces that file or resource. If there is no file or resource there, PUT creates one. PUT is idempotent, but paradoxically PUT responses are not cacheable.
HTTP 1.1 RFC location for PUT
HTTP POST:
POST sends data to a specific URI and expects the resource at that URI to handle the request. The web server at this point can determine what to do with the data in the context of the specified resource. The POST method is not idempotent, however POST responses are cacheable so long as the server sets the appropriate Cache-Control and Expires headers.
The official HTTP RFC specifies POST to be:
Annotation of existing resources;
Posting a message to a bulletin board, newsgroup, mailing list,
or similar group of articles;
Providing a block of data, such as the result of submitting a
form, to a data-handling process;
Extending a database through an append operation.
HTTP 1.1 RFC location for POST
Difference between POST and PUT:
The RFC itself explains the core difference:
The fundamental difference between the
POST and PUT requests is reflected in
the different meaning of the
Request-URI. The URI in a POST request
identifies the resource that will
handle the enclosed entity. That
resource might be a data-accepting
process, a gateway to some other
protocol, or a separate entity that
accepts annotations. In contrast, the
URI in a PUT request identifies the
entity enclosed with the request --
the user agent knows what URI is
intended and the server MUST NOT
attempt to apply the request to some
other resource. If the server desires
that the request be applied to a
different URI, it MUST send a 301 (Moved Permanently) response; the user agent MAY then make
its own decision regarding whether or not to redirect the request.
Additionally, and a bit more concisely, RFC 7231 Section 4.3.4 PUT states:
4.3.4. PUT
The PUT method requests that the state of the target resource be
created or replaced with the state defined by the representation
enclosed in the request message payload.
Using the right method, unrelated aside:
One benefit of REST ROA vs SOAP is that when using HTTP REST ROA, it encourages the proper usage of the HTTP verbs/methods. So for example you would only use PUT when you want to create a resource at that exact location. And you would never use GET to create or modify a resource.
Only semantics.
An HTTP PUT is supposed to accept the body of the request, and then store that at the resource identified by the URI.
An HTTP POST is more general. It is supposed to initiate an action on the server. That action could be to store the request body at the resource identified by the URI, or it could be a different URI, or it could be a different action.
PUT is like a file upload. A put to a URI affects exactly that URI. A POST to a URI could have any effect at all.
To give examples of REST-style resources:
POST /books with a bunch of book information might create a new book, and respond with the new URL identifying that book: /books/5.
PUT /books/5 would have to either create a new book with the ID of 5, or replace the existing book with ID 5.
In non-resource style, POST can be used for just about anything that has a side effect. One other difference is that PUT should be idempotent: multiple PUTs of the same data to the same URL should be fine, whereas multiple POSTs might create multiple objects or whatever it is your POST action does.
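The /books example and the idempotence difference can be demonstrated with a toy in-memory store. This is only an illustration: the `BookStore` class and its status-code choices are assumptions, not part of any framework.

```python
class BookStore:
    """Toy in-memory model of a /books collection."""

    def __init__(self):
        self.books = {}
        self._next_id = 1

    def post(self, data):
        """POST /books: the server picks the URI, so repeating the call adds another book."""
        book_id = self._next_id
        self._next_id += 1
        self.books[book_id] = dict(data)
        return 201, f"/books/{book_id}"

    def put(self, book_id, data):
        """PUT /books/<id>: create or replace exactly that resource."""
        status = 200 if book_id in self.books else 201
        self.books[book_id] = dict(data)
        return status, f"/books/{book_id}"


store = BookStore()
first_post = store.post({"title": "A"})    # creates /books/1
second_post = store.post({"title": "A"})   # same data, but a second book: /books/2
first_put = store.put(5, {"title": "B"})   # creates /books/5
second_put = store.put(5, {"title": "B"})  # replaces /books/5, no new resource
```

After these four calls the store holds three books: repeating the POST changed the state again, whereas repeating the PUT left it as it was.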
GET: Retrieves data from the server. Should have no other effect.
PUT: Replaces target resource with the request payload. Can be used to update or create a new resource.
PATCH: Similar to PUT, but used to update only certain fields within an existing resource.
POST: Performs resource-specific processing on the payload. Can be used for different actions including creating a new resource, uploading a file, or submitting a web form.
DELETE: Removes data from the server.
TRACE: Provides a way to test what the server receives. It simply returns what was sent.
OPTIONS: Allows a client to get information about the request methods supported by a service. The relevant response header is Allow, which lists the supported methods. Also used in CORS as a preflight request to inform the server about the actual request method and ask about custom headers.
HEAD: Returns only the response headers.
CONNECT: Used by the browser when it knows it talks to a proxy and the final URI begins with https://. The intent of CONNECT is to allow end-to-end encrypted TLS sessions, so the data is unreadable to a proxy.
PUT is meant as a method for "uploading" stuff to a particular URI, or overwriting what is already in that URI.
POST, on the other hand, is a way of submitting data RELATED to a given URI.
Refer to the HTTP RFC
As far as I know, PUT is mostly used to update records.
POST - To create a document or any other resource
PUT - To update the created document or any other resource.
To be clear, though, PUT usually replaces the existing record if it is there and creates it if it is not.
Define operations in terms of HTTP methods
The HTTP protocol defines a number of methods that assign semantic meaning to a request. The common HTTP methods used by most RESTful web APIs are:
GET retrieves a representation of the resource at the specified URI. The body of the response message contains the details of the requested resource.
POST creates a new resource at the specified URI. The body of the request message provides the details of the new resource. Note that POST can also be used to trigger operations that don't actually create resources.
PUT either creates or replaces the resource at the specified URI. The body of the request message specifies the resource to be created or updated.
PATCH performs a partial update of a resource. The request body specifies the set of changes to apply to the resource.
DELETE removes the resource at the specified URI.
The effect of a specific request should depend on whether the resource is a collection or an individual item. The following table summarizes the common conventions adopted by most RESTful implementations using the e-commerce example. Not all of these requests might be implemented—it depends on the specific scenario.
| Resource | POST | GET | PUT | DELETE |
|---|---|---|---|---|
| /customers | Create a new customer | Retrieve all customers | Bulk update of customers | Remove all customers |
| /customers/1 | Error | Retrieve the details for customer 1 | Update the details of customer 1 if it exists | Remove customer 1 |
| /customers/1/orders | Create a new order for customer 1 | Retrieve all orders for customer 1 | Bulk update of orders for customer 1 | Remove all orders for customer 1 |
The differences between POST, PUT, and PATCH can be confusing.
A POST request creates a resource. The server assigns a URI for the new resource and returns that URI to the client. In the REST model, you frequently apply POST requests to collections. The new resource is added to the collection. A POST request can also be used to submit data for processing to an existing resource, without any new resource being created.
A PUT request creates a resource or updates an existing resource. The client specifies the URI for the resource. The request body contains a complete representation of the resource. If a resource with this URI already exists, it is replaced. Otherwise, a new resource is created, if the server supports doing so. PUT requests are most frequently applied to resources that are individual items, such as a specific customer, rather than collections. A server might support updates but not creation via PUT. Whether to support creation via PUT depends on whether the client can meaningfully assign a URI to a resource before it exists. If not, then use POST to create resources and PUT or PATCH to update.
A PATCH request performs a partial update to an existing resource. The client specifies the URI for the resource. The request body specifies a set of changes to apply to the resource. This can be more efficient than using PUT, because the client only sends the changes, not the entire representation of the resource. Technically PATCH can also create a new resource (by specifying a set of updates to a "null" resource), if the server supports this.
PUT requests must be idempotent. If a client submits the same PUT request multiple times, the results should always be the same (the same resource will be modified with the same values). POST and PATCH requests are not guaranteed to be idempotent.
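A minimal sketch of the PUT versus PATCH distinction described above, using plain dictionaries as resource representations. The function names are hypothetical, and the PATCH shown is a simple merge-style update; real PATCH formats (for example JSON Patch or JSON Merge Patch) define their own semantics:

```python
def apply_put(stored, representation):
    """PUT: the request body is the complete new state; old fields vanish."""
    return dict(representation)


def apply_patch(stored, changes):
    """PATCH (merge style, an assumption): only the listed fields change."""
    updated = dict(stored)
    updated.update(changes)
    return updated


customer = {"name": "Ada", "email": "ada@example.com", "tier": "gold"}

# PUT with an incomplete body drops the "tier" field entirely.
after_put = apply_put(customer, {"name": "Ada", "email": "new@example.com"})

# PATCH sends only the change and keeps everything else.
after_patch = apply_patch(customer, {"email": "new@example.com"})
```

This is why a client using PUT has to send the entire representation, even when only one field changed.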
Others have already posted excellent answers, I just wanted to add that with most languages, frameworks, and use cases you'll be dealing with POST much, much more often than PUT. To the point where PUT, DELETE, etc. are basically trivia questions.
Please see: http://zacharyvoase.com/2009/07/03/http-post-put-diff/
I’ve been getting pretty annoyed lately by a popular misconception by web developers that a POST is used to create a resource, and a PUT is used to update/change one.
If you take a look at page 55 of RFC 2616 (“Hypertext Transfer Protocol – HTTP/1.1”), Section 9.6 (“PUT”), you’ll see what PUT is actually for:
The PUT method requests that the enclosed entity be stored under the supplied Request-URI.
There’s also a handy paragraph to explain the difference between POST and PUT:
The fundamental difference between the POST and PUT requests is reflected in the different meaning of the Request-URI. The URI in a POST request identifies the resource that will handle the enclosed entity. That resource might be a data-accepting process, a gateway to some other protocol, or a separate entity that accepts annotations. In contrast, the URI in a PUT request identifies the entity enclosed with the request – the user agent knows what URI is intended and the server MUST NOT attempt to apply the request to some other resource.
It doesn’t mention anything about the difference between updating/creating, because that’s not what it’s about. It’s about the difference between this:
obj.set_attribute(value) # A POST request.
And this:
obj.attribute = value # A PUT request.
So please, stop the spread of this popular misconception. Read your RFCs.
A POST is considered something of a factory type method. You include data with it to create what you want and whatever is on the other end knows what to do with it. A PUT is used to update existing data at a given URL, or to create something new when you know what the URI is going to be and it doesn't already exist (as opposed to a POST which will create something and return a URL to it if necessary).
It should be pretty straightforward when to use one or the other, but complex wordings are a source of confusion for many of us.
When to use them:
Use PUT when you want to modify a single resource that is already part of a resource collection. PUT replaces the resource in its entirety. Example: PUT /resources/:resourceId
Sidenote: Use PATCH if you want to update a part of the resource.
Use POST when you want to add a child resource under a collection of resources.
Example: POST /resources
In general:
Generally, in practice, always use PUT for UPDATE operations.
Always use POST for CREATE operations.
Example:
GET /company/reports => Get all reports
GET /company/reports/{id} => Get the report information identified by "id"
POST /company/reports => Create a new report
PUT /company/reports/{id} => Update the report information identified by "id"
PATCH /company/reports/{id} => Update a part of the report information identified by "id"
DELETE /company/reports/{id} => Delete report by "id"
The difference between POST and PUT is that PUT is idempotent: calling the same PUT request multiple times always produces the same result (repeating it has no additional side effects), whereas calling a POST request repeatedly may have additional side effects, such as creating the same resource multiple times.
GET : Requests using GET only retrieve data , that is it requests a representation of the specified resource
POST : It sends data to the server to create a resource. The type of the body of the request is indicated by the Content-Type header. It often causes a change in state or side effects on the server
PUT : Creates a new resource or replaces a representation of the target resource with the request payload
PATCH : It is used to apply partial modifications to a resource
DELETE : It deletes the specified resource
TRACE : It performs a message loop-back test along the path to the target resource, providing a useful debugging mechanism
OPTIONS : It is used to describe the communication options for the target resource, the client can specify a URL for the OPTIONS method, or an asterisk (*) to refer to the entire server.
HEAD : It asks for a response identical to that of a GET request, but without the response body
CONNECT : It establishes a tunnel to the server identified by the target resource , can be used to access websites that use SSL (HTTPS)
In simple words you can say:
1. HTTP GET: It is used to get one or more items
2. HTTP POST: It is used to create an item
3. HTTP PUT: It is used to update an item
4. HTTP PATCH: It is used to partially update an item
5. HTTP DELETE: It is used to delete an item
It is worth mentioning that POST is subject to some common Cross-Site Request Forgery (CSRF) attacks while PUT isn't.
The CSRF attacks below are not possible with PUT when the victim visits attackersite.com.
The effect of the attack is that the victim unintentionally deletes a user, just because the victim was logged in as admin on target.site.com before visiting attackersite.com:
Malicious code on attackersite.com:
Case 1: Normal request. Saved target.site.com cookies will automatically be sent by the browser: (note: supporting only PUT at the endpoint is safer because PUT is not a supported value for the <form> method attribute)
<!--deletes user with id 5-->
<form id="myform" method="post" action="http://target.site.com/deleteUser" >
<input type="hidden" name="userId" value="5">
</form>
<script>document.createElement('form').submit.call(document.getElementById('myform'));</script>
Case 2: XHR request. saved target.site.com cookies will automatically be sent by the browser: (note: supporting PUT only, at the endpoint, is safer because an attempt to send PUT would trigger a preflight request, whose response would prevent the browser from requesting the deleteUser page)
//deletes user with id 5
var xhr = new XMLHttpRequest();
xhr.open("POST", "http://target.site.com/deleteUser");
xhr.withCredentials=true;
xhr.send(["userId=5"]);
MDN Ref : [..]Unlike “simple requests” (discussed above), --[[ Means: POST/GET/HEAD ]]--, for "preflighted" requests the browser first sends an HTTP request using the OPTIONS method[..]
cors in action : [..]Certain types of requests, such as DELETE or PUT, need to go a step further and ask for the server’s permission before making the actual request[..]what is called a preflight request[..]
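The preflight rule quoted above can be approximated as a small check. This is a simplified sketch, assuming only the most common conditions (method, header names, and Content-Type); the function name `needs_preflight` is hypothetical, and browsers apply further conditions that are omitted here:

```python
# Methods and headers that do not, on their own, trigger a preflight.
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method, headers=None):
    """Return True if a cross-origin request would trigger an OPTIONS preflight.

    Simplified: ignores the less common conditions browsers also check.
    """
    headers = headers or {}
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        if name.lower() not in SIMPLE_HEADERS:
            return True
        if name.lower() == "content-type":
            # Strip parameters such as "; charset=utf-8" before comparing.
            media_type = value.split(";")[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False


form_post = needs_preflight("POST", {"Content-Type": "application/x-www-form-urlencoded"})
put_request = needs_preflight("PUT")
json_post = needs_preflight("POST", {"Content-Type": "application/json"})
```

Under these assumptions the form-style POST from the attack above goes out without a preflight, while a PUT (or a POST carrying JSON) is checked first.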
REST-ful usage
POST is used to create a new resource and then returns the resource URI
EX
REQUEST : POST ..../books
{
"book":"booName",
"author":"authorName"
}
This call may create a new book and return that book's URI
Response ...THE-NEW-RESOURCE-URI/books/5
PUT is used to replace a resource: if the resource exists it is replaced, and if it doesn't exist it is created.
REQUEST : PUT ..../books/5
{
"book":"booName",
"author":"authorName"
}
With PUT we know the resource identifier, but POST will return the new resource identifier
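The two requests above could be built with Python's standard library roughly as follows. This is only a sketch: the host `api.example.com` and the helper `build_request` are assumptions for illustration, and nothing is actually sent:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # hypothetical host, for illustration only


def build_request(method, path, payload):
    """Build (but do not send) a JSON request with the given HTTP method."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method=method,
    )


book = {"book": "bookName", "author": "authorName"}

create = build_request("POST", "/books", book)    # server chooses the new URI
replace = build_request("PUT", "/books/5", book)  # client names the URI
```

Passing either object to `urllib.request.urlopen` would send it. For the POST, a server following the convention described here would typically report the new resource's URI in its response.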
Non REST-ful usage
POST is used to initiate an action on the server side. This action may or may not create a resource, but it will always have side effects: it will change something on the server.
PUT is used to place or replace literal content at a specific URL
Another difference in both REST-ful and non REST-ful styles
POST is a non-idempotent operation: executing the same request multiple times may cause additional changes each time.
PUT is an idempotent operation: executing the same request multiple times has the same effect as executing it once.
Actually there's no difference other than their title. There's actually a basic difference between GET and the others. With a "GET"-Request method, you send the data in the url-address-line, which are separated first by a question-mark, and then with a & sign.
But with a "POST"-request method, you can't pass data through the url, but you have to pass the data as an object in the so called "body" of the request. On the server side, you have then to read out the body of the received content in order to get the sent data.
On the other hand, a body sent with a "GET" request has no defined semantics, so in practice you cannot rely on sending content in the body that way.
The claim that "GET" is only for getting data and "POST" is for posting data is absolutely wrong. No one can prevent you from creating new content, deleting existing content, editing existing content or doing whatever you like in the backend, based on the data that is sent by the "GET" request or by the "POST" request. And nobody can prevent you from coding the backend in such a way that the client asks for some data with a "POST" request.
With a request, no matter which method you use, you call a URL and send or don't send some data to specify, which information you want to pass to the server to deal with your request, and then the client gets an answer from the server. The data can contain whatever you want to send, the backend is allowed to do whatever it wants with the data and the response can contain any information, that you want to put in there.
There are only these two BASIC METHODS. GET and POST. But it's their structure, which makes them different and not what you code in the backend. In the backend you can code whatever you want to, with the received data. But with the "POST"-request you have to send/retrieve the data in the body and not in the url-addressline, and with a "GET" request, you have to send/retrieve data in the url-addressline and not in the body. That's all.
All the other methods, like "PUT", "DELETE" and so on, they have the same structure as "POST".
The POST Method is mainly used, if you want to hide the content somewhat, because whatever you write in the url-addressline, this will be saved in the cache and a GET-Method is the same as writing a url-addressline with data. So if you want to send sensitive data, which is not always necessarily username and password, but for example some ids or hashes, which you don't want to be shown in the url-address-line, then you should use the POST method.
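Also, the length of the URL is limited in practice: HTTP itself sets no maximum, but browsers and servers impose their own limits (RFC 7230 only recommends supporting request lines of at least 8000 octets), whereas the body of a "POST" request is not restricted in this way. So if you have a larger amount of data, you might not be able to send it with a GET request and will need to use a POST request. This is another plus point for the POST request.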
But dealing with the GET-request is way easier, when you don't have complicated text to send.
Otherwise, and this is another plus point for the POST method, is, that with the GET-method you need to url-encode the text, in order to be able to send some symbols within the text or even spaces. But with a POST method you have no restrictions and your content doesn't need to be changed or manipulated in any way.
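The URL-encoding point can be illustrated with Python's standard library. The URL and parameter names below are made up for the example:

```python
import json
from urllib.parse import urlencode, parse_qs

params = {"q": "hello world & more", "page": "2"}

# GET: data travels in the URL, so reserved characters must be percent-encoded.
query = urlencode(params)
url = "https://example.com/search?" + query

# The server decodes the query string back into the original values.
decoded = parse_qs(query)

# POST with a JSON body: the same text is carried without URL encoding.
body = json.dumps(params)
```

Note that this only holds for body formats such as JSON. A POST sent as `application/x-www-form-urlencoded`, which is what an HTML form uses by default, encodes its body the same way as the query string.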
Summary
Use PUT to create or replace the state of the target resource with the state defined by the representation enclosed in the request. That standardized intended effect is idempotent so it informs intermediaries that they can repeat a request in case of communication failure.
Use POST otherwise (including to create or replace the state of a resource other than the target resource). Its intended effect is not standardized so intermediaries cannot rely on any universal property.
References
The latest authoritative description of the semantic difference between the POST and PUT request methods is given in RFC 7231 (Roy Fielding, Julian Reschke, 2014):
The fundamental difference between the POST and PUT methods is highlighted by the different intent for the enclosed representation. The target resource in a POST request is intended to handle the enclosed representation according to the resource's own semantics, whereas the enclosed representation in a PUT request is defined as replacing the state of the target resource. Hence, the intent of PUT is idempotent and visible to intermediaries, even though the exact effect is only known by the origin server.
In other words, the intended effect of PUT is standardized (create or replace the state of the target resource with the state defined by the representation enclosed in the request) and so is common to all target resources, while the intended effect of POST is not standardized and so is specific to each target resource. Thus POST can be used for anything, including for achieving the intended effects of PUT and other request methods (GET, HEAD, DELETE, CONNECT, OPTIONS, and TRACE).
But it is recommended to always use the more specialized request method rather than POST when applicable because it provides more information to intermediaries for automating information retrieval (since GET, HEAD, OPTIONS, and TRACE are defined as safe), handling communication failure (since GET, HEAD, PUT, DELETE, OPTIONS, and TRACE are defined as idempotent), and optimizing cache performance (since GET and HEAD are defined as cacheable), as explained in It Is Okay to Use POST (Roy Fielding, 2009):
POST only becomes an issue when it is used in a situation for which some other method is ideally suited: e.g., retrieval of information that should be a representation of some resource (GET), complete replacement of a representation (PUT), or any of the other standardized methods that tell intermediaries something more valuable than “this may change something.” The other methods are more valuable to intermediaries because they say something about how failures can be automatically handled and how intermediate caches can optimize their behavior. POST does not have those characteristics, but that doesn’t mean we can live without it. POST serves many useful purposes in HTTP, including the general purpose of “this action isn’t worth standardizing.”
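The method properties listed above (safe, idempotent, cacheable) can be written down as a small lookup, which is roughly the information an intermediary relies on. The helper name `can_retry_automatically` is hypothetical:

```python
# Method properties as listed in the text above (from RFC 7231).
SAFE = {"GET", "HEAD", "OPTIONS", "TRACE"}
IDEMPOTENT = SAFE | {"PUT", "DELETE"}
CACHEABLE = {"GET", "HEAD"}


def can_retry_automatically(method):
    """An intermediary may repeat a request after a failure only if idempotent."""
    return method.upper() in IDEMPOTENT


retry_put = can_retry_automatically("PUT")
retry_post = can_retry_automatically("POST")
```

POST appears in none of the three sets, which is the sense in which it tells intermediaries nothing more than "this may change something."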
Both PUT and POST are HTTP methods commonly used in REST APIs.
PUT: if we make the same request twice using PUT with the same parameters both times, the second request will not change anything further. This is why PUT is generally used for the update scenario: calling update more than once with the same parameters does nothing more than the initial call, hence PUT is idempotent.
POST is not idempotent: for instance, calling create twice will create two separate entries in the target, which is why POST is widely used for create.
Making the same call using POST with the same parameters each time will cause two different things to happen, which is why POST is commonly used for the create scenario.
POST and PUT are mainly used to post data and to update data, respectively. But you can do the same with POST requests alone.

Resources