I was listening to a video on YouTube about REST APIs; below is the link:
REST API
It is said that GET, PUT, DELETE, and HEAD are idempotent operations, i.e. you can invoke them multiple times and get the same state on the server.
Would anybody please elaborate this line?
Just what it says
No matter how many times that Resource is requested with the exact same URL, the state on the server will never change as a side effect because of the request.
idempotent:
Denoting an element of a set that is unchanged in value when multiplied or otherwise operated on by itself.
The point is multiple requests that are exactly the same:
So if you request an image from a server 1000 times with the same URL, nothing on the server is changed.
If you call DELETE multiple times on the same resource, the state on the server doesn't change: the call removes the resource, and nothing else, with no side effects. And if the resource is already gone, good, that is what we wanted, and nothing else on the server should be affected.
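As a minimal sketch of what that looks like server-side (assuming a Flask app and an in-memory store, both illustrative):

    from flask import Flask

    app = Flask(__name__)
    resources = {42: {"name": "example"}}  # stand-in for a real datastore

    @app.delete("/resources/<int:res_id>")
    def delete_resource(res_id):
        # Removing an absent key is a no-op, so the 1st and the 1000th
        # call leave the server in exactly the same state.
        resources.pop(res_id, None)
        return "", 204

Whether the second DELETE should return 404 instead of 204 is a separate debate; idempotency is about the state on the server, not the status code.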
Those Verbs should never have side effects.
Doing a GET should not cause side effects to alter the state of the server no matter how many times this exact URL is requested.
It is about repeated subsequent calls
Example:
Calling GET on a resource should NOT modify a database record, or cause any changes. If it does, it isn't following the rules.
If you call HEAD on a resource 1000 times in a row, the state on the server should not change. It might return different data because someone removed the resource in the meantime, but the repeated calls themselves should never change anything on the server.
What is NOT idempotent
Example:
Calling GET multiple times causes a counter tracking that resource to increment every time you make the request with the exact same URL. This is not idempotent: there is a side effect, and the state of the server is changing because of the request.
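To make the contrast concrete, here is a small Flask sketch (the route names and the in-memory counters are illustrative assumptions), with one GET that follows the rules and one that does not:

    from flask import Flask

    app = Flask(__name__)
    images = {"logo.png": b"..."}       # stand-in for stored image bytes
    view_counts = {"logo.png": 0}

    @app.get("/images/<name>")          # idempotent: a pure read
    def get_image(name):
        return images[name]

    @app.get("/tracked-images/<name>")  # NOT idempotent in effect: every
    def get_tracked_image(name):        # request mutates server state
        view_counts[name] += 1
        return images[name]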
Idempotent means no matter how many times you invoke that method (e.g. GET), you won't introduce side effects. For example, when you issue a GET request to a URL (navigating to http://www.google.com in a browser), you theoretically won't change the state of the web server no matter how many GET requests you issue to the server.
As a real-world example, you shouldn't make database DELETE/INSERT operations accessible via an HTTP GET. There have been numerous incidents of the Google crawler accidentally deleting entities from databases while crawling (i.e. GETting) websites.
Related
I have an endpoint like this:
/profiles/1
I want to get the profile whose id is 1, but at the same time increment its visited property by 1. The property comes as part of the object. Which HTTP verb should I use to fetch the profile object with the visited property incremented by 1?
Every time the profile with id 1 is fetched, the visited property will be incremented by 1.
Which HTTP verb should I use to fetch the profile object with the visited property incremented by 1?
The GET method is meant to be used for data retrieval, so it's a candidate. Quoting the RFC 7231, the document that currently defines the semantics and contents of the HTTP/1.1 protocol:
4.3.1. GET
The GET method requests transfer of a current selected representation for the target resource. GET is the primary mechanism of information retrieval and the focus of almost all performance optimizations. [...]
The GET method is also defined as both safe (read-only) and idempotent (the effect of multiple identical requests with that method is the same as the effect of a single such request).
But it's still on the table. Again, quoting RFC 7231:
4.2.1. Safe Methods
Request methods are considered "safe" if their defined semantics are essentially read-only; i.e., the client does not request, and does not expect, any state change on the origin server as a result of applying a safe method to a target resource. [...]
This definition of safe methods does not prevent an implementation from including behavior that is potentially harmful, that is not entirely read-only, or that causes side effects while invoking a safe method. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it. For example, most servers append request information to access log files at the completion of every response, regardless of the method, and that is considered safe even though the log storage might become full and crash the server. [...]
4.2.2. Idempotent Methods
A request method is considered "idempotent" if the intended effect on the server of multiple identical requests with that method is the same as the effect for a single such request. [...]
Like the definition of safe, the idempotent property only applies to what has been requested by the user; a server is free to log each request separately, retain a revision control history, or implement other non-idempotent side effects for each idempotent request.
Assuming you want to count the number of the times a profile is retrieved, then it makes sense to attach it to the GET operation. You may want, however, to avoid incrementing the counter when the endpoint is hit by a bot, for example.
So under certain conditions, GET requests are allowed to have side effects. What is important, however, is that the client did not request that additional behavior and cannot be held accountable for it: what a server does is the server's responsibility.
Ultimately, it's also important to say that HTTP doesn't care about the storage underneath, so you could store the view count in the same table as the profile data, in a different table, or in a completely different database. And then either retrieve everything together or use different endpoints for the view count.
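A minimal sketch of that design, assuming Flask, an in-memory store, and a naive User-Agent bot check (all illustrative):

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    profiles = {1: {"id": 1, "name": "Alice", "visited": 0}}

    @app.get("/profiles/<int:profile_id>")
    def get_profile(profile_id):
        profile = profiles[profile_id]
        # Unrequested side effect: fine under RFC 7231, because the client
        # did not ask for it and cannot be held accountable for it.
        agent = request.headers.get("User-Agent", "")
        if "bot" not in agent.lower():  # crude bot filter (assumption)
            profile["visited"] += 1
        return jsonify(profile)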
It is said everywhere (after reading many posts) that PUT is idempotent, meaning multiple requests with the same inputs will produce the same result as the very first request.
But if we send the same request with the same inputs using the POST method, then again, won't it behave just like PUT?
So, what is the difference between PUT and POST in terms of idempotency?
The idea is that there should be a difference between POST and PUT, not that there necessarily is one. To clarify: a POST request should ideally create a new resource, whereas a PUT request should update an existing one. So a client sending two POST requests would create two resources, whereas two PUT requests wouldn't (or rather shouldn't) cause any undesirable change.
To go into more detail, idempotency means that in an isolated environment multiple requests from the same client do not have any effect on the state of the resource. If a request from another client changes the state of the resource, then that does not break the idempotency principle. However, if you really want to ensure that a PUT request does not end up overriding the changes made by a simultaneous request from a different client, you should always use ETags. To elaborate, a PUT request should always supply the ETag (obtained from a GET request) of the last known resource state, and the resource should only be updated if that ETag is still current; otherwise a 412 (Precondition Failed) status code should be returned. In the case of a 412, the client is supposed to GET the resource again and then retry the update. According to REST, this is vital to prevent race conditions.
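A client-side sketch of that ETag flow, using the Python requests library (the URL and payload are illustrative assumptions):

    import requests

    url = "https://api.example.com/profiles/1"

    # 1. GET the resource and remember the ETag of the state we saw.
    resp = requests.get(url)
    etag = resp.headers["ETag"]
    profile = resp.json()
    profile["name"] = "New Name"

    # 2. PUT conditionally: apply only if nobody changed it in the meantime.
    update = requests.put(url, json=profile, headers={"If-Match": etag})

    if update.status_code == 412:  # Precondition Failed: we lost the race
        pass  # re-GET the current state and retry the update from there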
According to the W3C (http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html):
'Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request.'
I am wondering if my current approach makes sense or if there is a better way to do it.
I have multiple situations where I want to create new objects and let the server assign an ID to those objects. Sending a POST request appears to be the most appropriate way to do that.
However, since POST is not idempotent, the request may get lost, and sending it again may create a second object. Also, lost requests might be quite common, since the API is often accessed over mobile networks.
As a result I decided to split the whole thing into a two-step process:
First sending a POST request to create a new object which returns the URI of the new object in the Location header.
Secondly performing an idempotent PUT request to the supplied Location to populate the new object with data. If a new object is not populated within 24 hours the server may delete it through some kind of batch job.
Does that sound reasonable or is there a better approach?
The only advantage of POST-creation over PUT-creation is the server generation of IDs.
I don't think it is worth the lack of idempotency (and the resulting need to remove duplicates or empty objects).
Instead, I would use a PUT with a UUID in the URL. Thanks to UUID generators, you can be nearly certain that the ID you generate client-side will be unique server-side.
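A sketch of that PUT-creation with requests and a client-generated UUID (the endpoint is an illustrative assumption):

    import uuid
    import requests

    new_id = uuid.uuid4()  # collision odds are negligible in practice
    payload = {"title": "My new object"}

    # Safe to retry on a timeout: every attempt targets the same URI with
    # the same body, so at most one object is ever created.
    resp = requests.put(f"https://api.example.com/objects/{new_id}",
                        json=payload)
    assert resp.status_code in (200, 201, 204)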
Well, it all depends. To start with, you should talk more about URIs, resources and representations, and not be concerned about objects.
The POST method is designed for non-idempotent requests, or requests with side effects, but it can be used for idempotent requests.
on POST of form data to /some_collection/:
    normalize the natural key of your data (e.g. lowercase the Title field for a blog post)
    calculate a suitable hash value (e.g. in the simplest case, your normalized field value)
    look the resource up by its hash value
    if none is found:
        generate a server identity and create the resource
        respond => 201 Created, Location: /some_collection/<new_id>
    if found, but no updates should be carried out due to app logic:
        respond => 302 Found/Moved Temporarily or 303 See Other
        (the client will need to GET that resource, which might include fields required for updates, like version numbers)
    if found, and updates may occur:
        respond => 307 Temporary Redirect, Location: /some_collection/<id>
        (like a 302, but the client should reuse the original HTTP method, and might do so automatically)
A suitable hash function might be as simple as some concatenated fields; for large fields or values, a truncated MD5 could be used. See a reference on hash functions for more details. A runnable sketch of this flow appears after the assumptions below.
I've assumed you: need a different identity value than the hash value, and that the data fields used for identity can't be changed.
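Here is one way the flow above could look in Flask; the field names, the MD5 choice, and the in-memory index are all illustrative assumptions:

    import hashlib
    from flask import Flask, request, redirect

    app = Flask(__name__)
    by_hash = {}    # natural-key hash -> resource id
    resources = {}  # resource id -> data
    next_id = 0

    @app.post("/some_collection/")
    def create():
        global next_id
        data = request.get_json()
        key = data["title"].strip().lower()             # normalize the natural key
        digest = hashlib.md5(key.encode()).hexdigest()  # hash it
        if digest in by_hash:                           # repeat POST: redirect
            return redirect(f"/some_collection/{by_hash[digest]}", code=303)
        next_id += 1                                    # first POST: create
        resources[next_id] = data
        by_hash[digest] = next_id
        return "", 201, {"Location": f"/some_collection/{next_id}"}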
Your method of generating ids at the server, in the application, in a dedicated request-response, is a very good one! Uniqueness is very important, but clients, like suitors, are going to keep repeating the request until they succeed, or until they get a failure they're willing to accept (unlikely). So you need to get uniqueness from somewhere, and you only have two options: either the client, with a GUID as Aurélien suggests, or the server, as you suggest. I happen to like the server option. Seed columns in relational DBs are a readily available source of uniqueness with zero risk of collisions. Around 2000, I read an article advocating this solution, called something like "Simple Reliable Messaging with HTTP", so this is an established approach to a real problem.
Reading REST stuff, you could be forgiven for thinking a bunch of teenagers had just inherited Elvis's mansion. They're excitedly discussing how to rearrange the furniture, and they're hysterical at the idea they might need to bring something from home. The use of POST is recommended because it's there, without ever broaching the problems with non-idempotent requests.
In practice, you will likely want to make sure all unsafe requests to your api are idempotent, with the necessary exception of identity generation requests, which as you point out don't matter. Generating identities is cheap and unused ones are easily discarded. As a nod to REST, remember to get your new identity with a POST, so it's not cached and repeated all over the place.
Regarding the sterile debate about what idempotent means, I say it needs to be everything. Successive requests should generate no additional effects, and should receive the same response as the first processed request. To implement this, you will want to store all server responses so they can be replayed, and your ids will be identifying actions, not just resources. You'll be kicked out of Elvis's mansion, but you'll have a bombproof api.
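A sketch of that "store and replay" idea, where an id identifies the action rather than the resource (the header name and route are assumptions):

    from flask import Flask, jsonify, request

    app = Flask(__name__)
    responses = {}  # action id -> (status, body) of the first processed request

    @app.post("/transfers")
    def create_transfer():
        action_id = request.headers["X-Action-Id"]  # client-supplied action id
        if action_id in responses:      # a repeat: replay, no additional effects
            status, body = responses[action_id]
            return jsonify(body), status
        body = {"transfer": request.get_json(), "state": "accepted"}
        responses[action_id] = (201, body)  # record before answering
        return jsonify(body), 201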
But now you have two requests that can be lost? And the POST can still be repeated, creating another resource instance. Don't over-think stuff. Just have the batch process look for dupes. Possibly have some "access" count statistics on your resources to see which of the dupe candidates was the result of an abandoned post.
Another approach: screen incoming POSTs against some log to see whether a request is a repeat. It should be easy to detect: if the body content of a request is the same as that of a request received just x time ago, consider it a repeat. And you could check extra parameters like the originating IP, the same authentication, and so on.
No matter what HTTP method you use, it is theoretically impossible to make an idempotent request without generating the unique identifier client-side, temporarily (as part of some request checking system) or as the permanent server id. An HTTP request being lost will not create a duplicate, though there is a concern that the request could succeed getting to the server but the response does not make it back to the client.
If the end client can easily delete duplicates and they don't cause inherent data conflicts it is probably not a big enough deal to develop an ad-hoc duplication prevention system. Use POST for the request and send the client back a 201 status in the HTTP header and the server-generated unique id in the body of the response. If you have data that shows duplications are a frequent occurrence or any duplicate causes significant problems, I would use PUT and create the unique id client-side. Use the client created id as the database id - there is no advantage to creating an additional unique id on the server.
I think you could also collapse the creation and update requests into a single request (an upsert). To create a new resource, the client POSTs to a "factory" resource, located for example at /factory-url-name, and the server then returns the URI of the new resource.
Why don't you use a request id at your originating point? Your originating point should do two things: store the request id, and send a GET request for request_id=2 to see whether its request has been applied (e.g. a response saying the person was created as part of request_id=2).
This ensures your originating system knows which request was executed last, since the request id is stored in the db.
Second, if your originating point finds that the last applied request is still 1 and not yet 2, it may try again before moving on to 3, to cover the case where just the GET response was lost but request 2 was in fact created in the db.
You can also introduce a number of tries for your GET request, a time to wait before firing it again, and that kind of mechanism.
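A client-side sketch of that scheme (the endpoints and payload shape are illustrative assumptions): POST with a request id, and on a lost response, GET the request's status before retrying:

    import requests

    BASE = "https://api.example.com"

    def create_person(request_id, person):
        try:
            requests.post(f"{BASE}/people",
                          json={"request_id": request_id, **person}, timeout=5)
        except requests.Timeout:
            pass  # the response was lost; the server may still have applied it
        # Ask the server whether this request id was applied before retrying.
        status = requests.get(f"{BASE}/requests/{request_id}").json()
        return status.get("applied", False)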
OK, I already know all the on-paper reasons why I should not use an HTTP GET to update the state of something on the server (thus possibly returning different data each time). And I know this is wrong for the following 'on paper' reasons:
HTTP GET calls should be idempotent
N > 0 calls should always GET the same data back
Violates HTTP spec
HTTP GET call is typically read-only
And I am sure there are more reasons. But I need a concrete simple example for justification other than "Well, that violates the HTTP Spec!". ...or at least I am hoping for one. I have also already read the following which are more along the lines of the list above: Does it violate the RESTful when I write stuff to the server on a GET call? &
HTTP POST with URL query parameters -- good idea or not?
For example, can someone justify the above, i.e. why it is wrong/bad practice/incorrect to use an HTTP GET, say, with the following RESTful call:
"MyRESTService/GetCurrentRecords?UpdateRecordID=5&AddToTotalAmount=10"
I know it's wrong, but hopefully it will help provide an example to answer my original question. So the above would update recordID = 5 with AddToTotalAmount = 10 and then return the updated records. I know a POST should be used, but let's say I did use a GET.
How exactly, to answer my question, does or can this cause an actual problem? Other than all the violations from the bullet list above, how can using an HTTP GET to do the above cause some real issue? Too many times I come into a scenario where I can justify things with "Because the doc said so", but I need justification and a better understanding of this one.
Thanks!
The practical case where you will have a problem is that an HTTP GET is often retried in the event of a failure by the HTTP implementation. So in real life you can get situations where the same GET is received multiple times by the server. If your update were idempotent, there would be no problem, but if it's not (like adding some value to an amount, which is what yours does), then you could get multiple (undesired) updates.
HTTP POST is never retried, so you would never have this problem.
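A tiny sketch of why that matters; the dict stands in for a database row (an illustrative assumption):

    record = {"id": 5, "total": 100}

    def set_total(record, new_total):
        record["total"] = new_total  # idempotent: retries converge on one state

    def add_to_total(record, amount):
        record["total"] += amount    # not idempotent: every retry adds again

    for _ in range(3):               # simulate the HTTP layer retrying a GET
        add_to_total(record, 10)
    print(record["total"])           # 130, not the intended 110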
If some form of search engine spiders your site, it could change your data unintentionally.
This happened in the past with Google's Desktop Search, which caused people to lose data because they had implemented delete operations as GETs.
Here is an important reason that GETs should be idempotent and not be used for updating state on the server: Cross-Site Request Forgery attacks. From the book Professional ASP.NET MVC 3:
Idempotent GETs
Big word, for sure — but it's a simple concept. If an operation is idempotent, it can be executed multiple times without changing the result. In general, a good rule of thumb is that you can prevent a whole class of CSRF attacks by only changing things in your DB or on your site by using POST. This means Registration, Logout, Login, and so forth. At the very least, this limits the confused deputy attacks somewhat.
There is one more problem. If the GET method is used, data is sent in the URL itself, and in the web server's logs this data gets saved along with the request path. Now suppose someone has access to or reads those log files: your data (user ids, passwords, keywords, tokens, etc.) gets revealed. This is dangerous and has to be taken care of.
In the server's log file, headers and body are not logged, but the request path is. So with the POST method, where data is sent in the body and not in the request path, your data remains safe.
I think that reading this resource could help you understand the difference between a message API and a resource API: http://www.servicedesignpatterns.com/WebServiceAPIStyles
This might be a bit of an ethical question, but I'm having quite a discussion in the office about the following issue:
Is it okay to set a cookie with an HTTP GET request? Because whenever an HTTP request changes something in the application, you should use a POST request; HTTP GET should only be used to retrieve data identified by the Request-URI.
In this case the application doesn't change, but because the cookie is altered, the user might get a different experience when the page loads again, meaning that the HTTP GET request changed the application's behaviour (nothing changed server-side, though).
Get request reference
The discussion started because we want to use a normal anchor element to set a cookie.
The problem with GETs, especially if they are on an anchor tag, is when they get spidered by the likes of Google.
In your case, you'd needlessly be creating cookies that will, more than likely, never get used.
I'd also argue that the GET rule isn't really about changing the application so much as about changing data. I appreciate the subtle distinction with the cookie (i.e. you are not changing data on YOUR system), but generally it's a good rule to have, and irrespective of where the data is stored, GET shouldn't really be used to change it.
The user can always have a different experience when he issues another GET request; you do not expect an (imagined) time service, "GET /time/current", to always return the same set of data.
Also, it is not said you are not allowed to change server-side state in response for GET requests - it's perfectly 'legal' to increase a page hit counter, for example, even if you store it in the database.
Consider section 9.1.1, Safe Methods, of RFC 2616:
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
I would also say it is perfectly acceptable to change or set a cookie in the response to a GET request, because you are just returning some data.
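For instance, a minimal Flask sketch (the route and cookie name are illustrative assumptions):

    from flask import Flask, make_response

    app = Flask(__name__)

    @app.get("/welcome")
    def welcome():
        resp = make_response("Hello!")
        # The server adds the cookie while returning data; the client did
        # not request a state change, so the GET remains "safe".
        resp.set_cookie("seen_welcome", "1", max_age=60 * 60 * 24)
        return resp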