Why is using an HTTP GET to update state on the server in a RESTful call incorrect?

OK, I already know all the on-paper reasons why I should not use an HTTP GET when making a RESTful call to update the state of something on the server (and thus possibly return different data each time). I know this is wrong for the following 'on paper' reasons:
HTTP GET calls should be idempotent
N > 0 calls should always GET the same data back
Violates HTTP spec
HTTP GET call is typically read-only
And I am sure there are more reasons. But I need a concrete, simple example as justification, other than "Well, that violates the HTTP spec!" ...or at least I am hoping for one. I have also already read the following, which are more along the lines of the list above: "Does it violate the RESTful when I write stuff to the server on a GET call?" and
"HTTP POST with URL query parameters -- good idea or not?"
For example, can someone justify the above and explain why it is wrong/bad practice/incorrect to use an HTTP GET with, say, the following RESTful call:
"MyRESTService/GetCurrentRecords?UpdateRecordID=5&AddToTotalAmount=10"
I know it's wrong, but hopefully it will help provide an example to answer my original question. So the above would update recordID = 5 with AddToTotalAmount = 10 and then return the updated records. I know a POST should be used, but let's say I did use a GET.
How exactly, to answer my question, does or can this cause an actual problem? Other than all the violations from the bullet list above, how can using an HTTP GET to do the above cause some real issue? Too many times I come into a scenario where I can justify things with "Because the doc said so", but I need justification and a better understanding on this one.
Thanks!

The practical case where you will have a problem is that an HTTP GET is often retried by the HTTP implementation in the event of a failure. So in real life you can get situations where the same GET is received multiple times by the server. If your update were idempotent there would be no problem, but yours is not (it adds some value to an amount), so you could get multiple (undesired) updates.
HTTP POST is never retried, so you would never have this problem.
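A rough sketch of the difference (the in-memory "records" dict below is a stand-in for a real database, purely for illustration):

records = {5: {"total": 100}}   # pretend database table

def get_current_records(record_id, add_to_total_amount):
    # Non-idempotent: each replay of the GET adds the amount again
    records[record_id]["total"] += add_to_total_amount
    return records

def set_total_amount(record_id, new_total):
    # Idempotent alternative: setting an absolute value is safe to replay
    records[record_id]["total"] = new_total
    return records

get_current_records(5, 10)   # total becomes 110
get_current_records(5, 10)   # retried request: total becomes 120, not what the client intended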

If some form of search engine spiders your site, it could change your data unintentionally.
This happened in the past with Google's Desktop Search, which caused people to lose data because they had implemented delete operations as GETs.

Here is an important reason why GETs should be idempotent and should not be used for updating state on the server: Cross-Site Request Forgery attacks. From the book Professional ASP.NET MVC 3:
Idempotent GETs
Big word, for sure — but it's a simple concept. If an operation is idempotent, it can be executed multiple times without changing the result. In general, a good rule of thumb is that you can prevent a whole class of CSRF attacks by only changing things in your DB or on your site by using POST. This means Registration, Logout, Login, and so forth. At the very least, this limits the confused deputy attacks somewhat.
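To make that concrete with the URL from the question (the attack page is hypothetical): if the update is reachable via GET, any page the victim visits can trigger it simply by embedding something like

<img src="http://example.com/MyRESTService/GetCurrentRecords?UpdateRecordID=5&AddToTotalAmount=10" width="1" height="1">

The victim's browser fetches the image URL with the victim's cookies attached, so the state change happens without the user doing anything. An image tag can only issue a GET, which is part of why state changes belong behind POST (ideally combined with anti-forgery tokens).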

One more problem: if the GET method is used, the data is sent in the URL itself. In the web server's logs, this data gets saved on the server along with the request path. Now suppose someone has access to, or reads, those log files; your data (which can be user IDs, passwords, keywords, tokens, etc.) gets revealed. This is dangerous and has to be taken care of.
In the server's log file, headers and body are not logged, but the request path is. So with the POST method, where data is sent in the body rather than in the request path, your data remains safe.
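For example, a typical access-log entry (format and values made up here) records the full request path including the query string:

192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /login?user=alice&password=s3cret HTTP/1.1" 200 512

Anyone who can read that log file now has the password, whereas a POST body would not appear in the log.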

I think that reading this resource could be helpful to you in understanding the difference between a message API and a resource API: http://www.servicedesignpatterns.com/WebServiceAPIStyles

Related

What is the best practice for a REST api to manage maximum characters for a field?

I have a Web API RESTful service that has a few POST endpoints to insert a new object into the database. We want to limit the maximum number of characters accepted for the object name to 20. Should the DB, API, or UI handle this?
Obviously, we could prevent more than 20 characters on any of those layers. However, if it gets past the UI then the form has been submitted. At that point, we would want the Service layer or the DB layer to return an informative explanation as to why it was not accepted. What would be the best practice to handle this?
Should the DB, API, or UI handle this?
At the very least, your API must handle data validation. Everyone has their own opinions on how REST should work, but one good approach would be to return HTTP 400 (Bad Request), with some content that includes information about why the request was bad.
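A minimal sketch of that API-layer check, written here in Python/Flask purely for illustration (the question's service is ASP.NET Web API, and the endpoint and field names below are hypothetical):

from flask import Flask, request, jsonify

app = Flask(__name__)
MAX_NAME_LENGTH = 20  # the business rule from the question

@app.route("/objects", methods=["POST"])
def create_object():
    payload = request.get_json(silent=True) or {}
    name = payload.get("name", "")
    if not name or len(name) > MAX_NAME_LENGTH:
        # 400 Bad Request with a machine-readable reason the client can show the user
        return jsonify(error="name is required and must be at most 20 characters"), 400
    # ...persist the object here...
    return jsonify(name=name), 201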
Client-side checking is a definite bonus on top of this. Ideally you'll never even see this code get hit at your API layer. The client should also be capable of handling the potential "Bad Request" response in a graceful way. If there's ever a mismatch between the rules applied by the API and the client, you'll want the client to recognize that its action didn't succeed, and display an appropriate error to help the user respond to the issue.
Database-level checks are also a good idea if you ever allow data to bypass your API layer, through bulk imports or something. If not, the database might just be one more place you'll have to remember to change if you ever change your validation requirements.

HTTP/REST method for starting a service

I want to design a REST API to start a database. I can't find a suitable http method (aka verb).
I currently consider:
START /databases/mysampledatabase
I've browsed through a few RFCs, but then I thought someone here might point me to a de-facto standard verb.
Methods I've discarded (before I got tired of looking):
RFC 2616
OPTIONS
GET
HEAD
POST
PUT
DELETE
TRACE
CONNECT
RFC 2518
PROPFIND
PROPPATCH
MKCOL
COPY
MOVE
LOCK
UNLOCK
RFC 3253
REPORT
CHECKOUT
CHECKIN
UNCHECKOUT
MKWORKSPACE
UPDATE
LABEL
MERGE
BASELINE-CONTROL
MKACTIVITY
There are a couple of flaws in the thinking here. First off, the additional HTTP verbs (beyond the usual CRUD-style ones) should be considered not RESTful.
So there's two ways I can interpret this question, and I have an answer for both:
1. What's the most appropriate HTTP method for starting a service
There's nothing quite like what you need, and I would advise simply using POST.
2. What's a good RESTful way to start a service
First, you should not see 'starting the service' as the action. It's easier to think of the 'status' (being started or stopped) as the resource you are changing, and PUT to update the resource.
So in this case, each service should have a unique URI. A GET on that URI could return something like:
{ "status" : "stopped" }
You just change 'stopped' to 'started', PUT the new resource, and then the service could automatically begin running.
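For example, using the URI from the question (representation and behavior hypothetical):

GET /databases/mysampledatabase
=> 200 OK
{ "status" : "stopped" }

PUT /databases/mysampledatabase
{ "status" : "started" }
=> 200 OK  (the server starts the database as a side effect of the state change)

Retrying the PUT is harmless: the desired state is already 'started'.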
I wonder how useful this is, though. I'm not a REST zealot, and I think a simple POST is the best way to go.
Edit: I can't delete accepted answers, but since 2013 my thoughts on what is and isn't RESTful have become quite a bit more nuanced. I still think my example of representing the changeable state of each service as a property holds.

Consequences of POST not being idempotent (RESTful API)

I am wondering if my current approach makes sense or if there is a better way to do it.
I have multiple situations where I want to create new objects and let the server assign an ID to those objects. Sending a POST request appears to be the most appropriate way to do that.
However since POST is not idempotent the request may get lost and sending it again may create a second object. Also requests being lost might be quite common since the API is often accessed through mobile networks.
As a result I decided to split the whole thing into a two-step process:
First, sending a POST request to create a new object, which returns the URI of the new object in the Location header.
Second, performing an idempotent PUT request to the supplied Location to populate the new object with data. If a new object is not populated within 24 hours, the server may delete it through some kind of batch job.
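On the wire, the two steps described above might look like this (URIs and body hypothetical):

POST /objects
=> 201 Created
Location: /objects/42

PUT /objects/42
{ "name": "example", "amount": 10 }
=> 200 OK

Repeating the POST can at worst create another empty object, which the 24-hour cleanup job will eventually remove; repeating the PUT is safe because it always produces the same populated object.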
Does that sound reasonable or is there a better approach?
The only advantage of POST-creation over PUT-creation is the server generation of IDs.
I don't think it is worth the loss of idempotency (and then the need to remove duplicates or empty objects).
Instead, I would use a PUT with a UUID in the URL. Thanks to UUID generators, you are nearly sure that the ID you generate client-side will be unique server-side.
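A sketch of that idea in Python, using the requests library (the endpoint URL is made up):

import uuid
import requests

object_id = str(uuid.uuid4())  # identity generated client-side, effectively unique
response = requests.put(
    "https://api.example.com/objects/" + object_id,  # hypothetical endpoint
    json={"name": "example", "amount": 10},
)
# Retrying this exact request cannot create a second object:
# the URI, and therefore the identity, is the same on every attempt.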
Well, it all depends. To start with, you should talk more about URIs, resources and representations, and not be concerned about objects.
The POST method is designed for non-idempotent requests, or requests with side effects, but it can be used for idempotent requests.
On POST of form data to /some_collection/:
normalize the natural key of your data (e.g. lowercase the Title field for a blog post)
calculate a suitable hash value (e.g. in the simplest case, your normalized field value)
look up the resource by that hash value
if nothing is found:
generate a server identity, create the resource
respond => 201 Created, Location: /some_collection/<new_id>
if found, but no updates should be carried out due to app logic:
respond => 302 Found or 303 See Other
(the client will need to GET that resource, which might include fields required for updates, like version numbers)
if found, and updates may occur:
respond => 307 Temporary Redirect, Location: /some_collection/<id>
(like a 302, but the client should reuse the original HTTP method, and may do so automatically)
A suitable hash function might be as simple as some concatenated fields; for large fields or values, a truncated MD5 could be used. See hash functions for more details.
I've assumed that you:
need a different identity value than a hash value
can't change the data fields used for identity
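Under those assumptions, a minimal sketch of the flow above (the store object and its methods are hypothetical stand-ins for your persistence layer):

import hashlib

def handle_post(data, store):
    # Normalize the natural key, e.g. lowercase the title of a blog post
    normalized = data["title"].strip().lower()
    key_hash = hashlib.md5(normalized.encode("utf-8")).hexdigest()

    existing = store.find_by_hash(key_hash)       # hypothetical lookup
    if existing is None:
        new_id = store.create(data, key_hash)     # server generates the identity
        return 201, {"Location": "/some_collection/%s" % new_id}

    # Resource already exists: redirect to it instead of creating a duplicate
    return 303, {"Location": "/some_collection/%s" % existing.id}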
Your method of generating ids at the server, in the application, in a dedicated request-response, is a very good one! Uniqueness is very important, but clients, like suitors, are going to keep repeating the request until they succeed, or until they get a failure they're willing to accept (unlikely). So you need to get uniqueness from somewhere, and you only have two options: either the client, with a GUID as Aurélien suggests, or the server, as you suggest. I happen to like the server option. Seed columns in relational DBs are a readily available source of uniqueness with zero risk of collisions. Around 2000, I read an article advocating this solution, called something like "Simple Reliable Messaging with HTTP", so this is an established approach to a real problem.
Reading REST stuff, you could be forgiven for thinking a bunch of teenagers had just inherited Elvis's mansion. They're excitedly discussing how to rearrange the furniture, and they're hysterical at the idea they might need to bring something from home. The use of POST is recommended because it's there, without ever broaching the problems with non-idempotent requests.
In practice, you will likely want to make sure all unsafe requests to your API are idempotent, with the necessary exception of identity-generation requests, which, as you point out, don't matter. Generating identities is cheap and unused ones are easily discarded. As a nod to REST, remember to get your new identity with a POST, so it's not cached and repeated all over the place.
Regarding the sterile debate about what idempotent means, I say it needs to cover everything. Successive requests should generate no additional effects, and should receive the same response as the first processed request. To implement this, you will want to store all server responses so they can be replayed, and your IDs will be identifying actions, not just resources. You'll be kicked out of Elvis's mansion, but you'll have a bombproof API.
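A rough sketch of that idea (an in-memory dict stands in for the response store; all names are hypothetical):

# request_id identifies the action, not just the resource
response_store = {}

def handle(request_id, do_action):
    if request_id in response_store:
        # Replay: successive requests get the same response, with no additional effects
        return response_store[request_id]
    result = do_action()                 # perform the side effect exactly once
    response_store[request_id] = result
    return result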
But now you have two requests that can be lost? And the POST can still be repeated, creating another resource instance. Don't over-think stuff. Just have the batch process look for dupes. Possibly have some "access" count statistics on your resources to see which of the dupe candidates was the result of an abandoned post.
Another approach: screen incoming POSTs against some log to see whether a request is a repeat. It should be easy to find: if the body content of a request is the same as that of a request received just x time ago, consider it a repeat. And you could check extra parameters like the originating IP, the same authentication, ...
No matter what HTTP method you use, it is theoretically impossible to make an idempotent request without generating the unique identifier client-side, temporarily (as part of some request checking system) or as the permanent server id. An HTTP request being lost will not create a duplicate, though there is a concern that the request could succeed getting to the server but the response does not make it back to the client.
If the end client can easily delete duplicates and they don't cause inherent data conflicts it is probably not a big enough deal to develop an ad-hoc duplication prevention system. Use POST for the request and send the client back a 201 status in the HTTP header and the server-generated unique id in the body of the response. If you have data that shows duplications are a frequent occurrence or any duplicate causes significant problems, I would use PUT and create the unique id client-side. Use the client created id as the database id - there is no advantage to creating an additional unique id on the server.
I think you could also collapse the creation and update requests into a single request (an upsert). To create a new resource, the client POSTs to a "factory" resource, located for example at /factory-url-name, and the server then returns the URI of the new resource.
Why don't you use a request ID at your originating point? Your originating point should do two things: attach a request ID to the create request (say request_id=2), and then send a GET for request_id=2 to see whether its request has been applied (e.g. a response saying the person was created as part of request_id=2).
This ensures your originating system knows which request was executed last, since the request ID is stored in the DB.
Second, if your originating point finds that the last applied request is still 1 and not yet 2, it may try again (with request_id=3), covering the case where just the GET response was lost but request 2 was in fact created in the DB.
You can introduce a number of tries for your GET request, a time to wait before firing the GET again, and so on.

Is it dangerous if a web resource POSTs to itself?

While reading some articles about writing web servers using Twisted, I came across this page that includes the following statement:
While it's convenient for this example, it's often not a good idea to make a resource that POSTs to itself; this isn't about Twisted Web, but the nature of HTTP in general; if you do this, make sure you understand the possible negative consequences.
In the example discussed in the article, the resource is a web resource retrieved using a GET request.
My question is, what are the possible negative consequences that can arise from having a resource POST to itself? I am only concerned about the aspects related to the HTTP protocol, so please ignore the fact that I mentioned Twisted.
The POST verb is used for making a new resource in a collection.
This means that POSTing to a resource has no direct meaning (POST endpoints should always be collections, not resources).
If you want to update your resource, you should PUT to it.
Sometimes, you do not know if you want to update or create the resource (maybe you've created it locally and want to create-or-update it). I think that in that case, the PUT verb is more appropriate because POST really means "I want to create something new".
There's nothing inherently wrong about a page POSTing back to itself; in fact, many widely-used frameworks (ASP.NET, etc.) use that method to handle various events that happen on the client: some data is posted back to the same page, where the server processes it and sends a new response.

Is PUT/DELETE idempotent with REST automatic?

I am learning about REST and PUT/DELETE. I have read that both of those (along with GET) are idempotent, meaning that multiple requests put the server into the same state.
Does a duplicate PUT/DELETE request ever leave the web browser (when using XMLHttpRequest)? In other words, will the server be updating the same database record for each PUT request, or will duplicate requests be ignored automatically?
If yes, how is using PUT or DELETE different from just using POST?
I read an article which suggested that RESTful web services were the way forward. Is there any particular reason why HTML5 forms do not support PUT/DELETE methods?
REST is just a design structure for data access and manipulation. There are no set-in-stone rules for how a server must react to data requests.
That being said, typically a REST request of PUT or DELETE would be as follows:
DELETE /item/10293
or
PUT /item/23848
foo=bar
fizz=buzz
herp=derp
The requests given are associated with a specific ID. Because of this, telling the server to delete the same ID 15 times will end up with pretty much the same result as calling it once, unless there's some sort of re-numbering going on.
With the PUT request, telling the server to update a specific item to specific values will also lead to the same result.
A case where a command would be non-idempotent would typically involve some sort of relative value:
DELETE /item/last
Calling that 15 times would likely remove 15 items, rather than the same last item. An alternative using HTTP properly might look like:
POST /item/last?action=delete
Again, REST isn't an official spec, it's just a structure with some common qualities. There are many ways to implement a RESTful structure.
As for HTML5 forms supporting PUT & DELETE, it's really up to the browsers to start supporting different methods rather than the spec itself. If all the browsers started implementing different methods for form submission, I'm sure they'd be added to the spec.
With the web going the way it is, a good RESTful implementation is liable to also incorporate some form of AJAX anyway, so to me it seems largely unnecessary.
Does a duplicate PUT/DELETE request ever leave the web browser (when using XMLHttpRequest)?
Yeah, sure. Idempotence is only a convention and it's not enforced. If you make a request, duplicate or not, it will run through.
In other words, will the server be updating the same database record for each PUT request, or will duplicate requests be ignored automatically?
If it conforms to REST, it should update the same database record twice, for example running UPDATE user SET name = 'John' twice. There is no guarantee of what it will or will not do, though; that depends on how it's implemented.
If yes, how is using PUT or DELETE different from just using POST?
It's just a convention. PUT and DELETE requests may or may not be handled differently from POST in the site's code.
I read an article which suggested that RESTful web services were the way forward. Is there any particular reason why HTML5 forms do not support PUT/DELETE methods?
I'm not really sure, to be honest. You can work around this by using a hidden <input> field called _method or similar, setting it to DELETE or PUT, and then handling that server-side.
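For example (the _method field name is a framework convention, used by Rails and Laravel among others, not part of the HTML spec):

<form action="/item/10293" method="POST">
  <input type="hidden" name="_method" value="DELETE">
  <button type="submit">Delete item</button>
</form>

The form is sent as an ordinary POST; the server-side code reads the _method field and dispatches the request as if it were a DELETE.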
PUT operations are idempotent but not safe operations. On success, if a PUT operation is repeated it will not insert duplicate records. Repeat a PUT operation in case of network-failure errors, after verifying conditional headers like If-Unmodified-Since and/or If-Match. Don't repeat it in case of 4XX or 5XX error codes.
REST aims to establish a syntax convention regarding which HTTP method to use; each back end is free to implement anything it wants. Devs could break the convention, but that would cause unnecessary confusion for others not involved in the development.
For DELETE: if you delete some item by ID, the server should respond that it's deleted; if you delete it again, it's no longer there, so the server responds "already removed", which is also fine because your purpose is fulfilled. Same for PUT: because you provide the new desired state of your resource, if it's already updated, mission accomplished, and it's the same for the client.
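In wire terms (status codes illustrative):

DELETE /item/10293
=> 204 No Content        (first call: the item is removed)

DELETE /item/10293
=> 404 Not Found         (repeat: nothing left to remove, but the server's state is unchanged)

Either way the end state is the same, which is what idempotency is about; only the response differs.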
