What is the expected logic of a batched POST request to a subresource

What is the expected logic of a batched POST request to a subresource - api-design

I like the idea of vectorized/batched requests similar to what the StackExchange API offers and would like to implement something for my own API, i.e. GET /users/1;2;3;4;5 would return the selected user resources with id 1 to 5.
I think this is fairly simple when reading data, but what would be the expected behavior for i.e. a POST request to a subresource?
POST /1;2;3;4;5/subresource
Would this mean:
Creation of five new subresources, assigned to each id (1:1)
Creation of a single new subresource, but assigned to each resource id (1:n)

I have a couple of concerns regarding this approach. First, resources should be uniquely addressable via certain resource locators (URIs). Using your approach however bypasses this requirement in some way IMO. This approach may also lead to other issues later on, i.e. plenty of frameworks do not allow URIs that exceed a certain character size.
Furthermore, instead of consecutive resource IDs the resource should use UUIDs instead. This will first and foremost prevent guessing attacks and also prevent logical issues on moving resources or inserting some in between.
The POST method requests that the target resource process the
representation enclosed in the request according to the resource's
own specific semantics.
In regards to HTTP POST operations, the specification clearly states that the semantics of any body received via POST is up to the service developer. So you are basically allowed to do anything within a POST request. As the semantics is totally up to you, you have to document the behavior explicitely. Not documenting the applied logic will leave a large grey-zone for service users.

Related

How should Transient Resources be retrieved in a RESTful API

For a while I was (wrongly) thinking that a RESTful API just exposed CRUD operation to persisted entities for a web application. When you code something up in "the real world" you soon find out that this is not enough. For example, a bank account transfer doesn't have to be a persisted entity. It could be a transient resource where you POST to /transfers/ and in the payload you specify the details:
{"accountToCredit":1234, "accountToDebit":5678, "amount":10}
Using POST here makes sense because it changes the state on the server ($10 moves from one account to another every time this POST occurs).
What should happen in the case where it doesn't affect the server? The simple first answer would be to use GET. For example, you want to get a list of savings and checking accounts that have less than $100. You would then call something like GET to /accounts/searchResults?minBalance=0&maxBalance=100. What happens though if your search parameter need to use complex objects that wouldn't fit in the maximum length of a GET request.
My first thought was to use POST, but after thinking about it some more it should probably be a PUT since it isn't changing the state of the server, but from my (limited) understanding I always though of PUT as updating a resource and POST as creating a resource (like creating this search results). So which should be used in this case?
I found the following links which provide some information but it wasn't clear to me what should be used in the different cases:
Transient REST Representations
How to design RESTful search/filtering?
RESTful URL design for search

I would agree with your approach, it seems reasonable to me to use GET when searching for resources, and as said in one of your provided links, the whole point of query strings is for doing things like search. I also agree that PUT fits better when you want to update some resource in an idempotent way (no matter how many times you hit the request, the result will be the same).
So generally, I would do it as you propose. Now, if you are limited by the maximum length of GET request, then you could use POST or PUT, passing your parameters in a JSON, in a URI like:
PUT /api/search
You could see this as a "search resource" where you send new parameters. I know it seems like a workaround and you may be worried that REST is about avoiding verbs in the URIs. Well, there are few cases that it's still acceptable and RESTful to use verbs, e.g. in cases where calculation or conversion is involved to generate the result (for more about this, check this reference).
PS. I think this workaround is still RESTful, but even if it wasn't, REST isn't an obsession and an ultimate goal. Being pragmatic and keeping a clean API design might be a better approach, even if in few cases you are not RESTful.

RESTful Alternatives to DELETE Request Body

While the HTTP 1.1 spec seems to allow message bodies on DELETE requests, it seems to indicate that servers should ignore it since there are no defined semantics for it.
4.3 Message Body
A server SHOULD read and forward a message-body on any request; if the
request method does not include defined semantics for an entity-body,
then the message-body SHOULD be ignored when handling the request.
I've already reviewed several related discussions on this topic on SO and beyond, such as:
Is an entity body allowed for an HTTP DELETE request?
Payloads of HTTP Request Methods
HTTP GET with request body
Most discussions seem to concur that providing a message body on a DELETE may be allowed, but is generally not recommended.
Further, I've noticed a trend in various HTTP client libraries where more and more enhancements seem to be getting logged for these libraries to support request bodies on DELETE. Most libraries seem to oblige, although occasionally with a little bit of initial resistance.
My use case calls for the addition of some required metadata on a DELETE (e.g. the "reason" for deletion, along with some other metadata required for deletion). I've considered the following options, none of which seem completely appropriate and inline with HTTP specs and/or REST best practices:
Message Body - The spec indicates that message bodies on DELETE have no semantic value; not fully supported by HTTP clients; not standard practice
Custom HTTP Headers - Requiring custom headers is generally against standard practices; using them is inconsistent with the rest of my API, none of which require custom headers; further, no good HTTP response available to indicate bad custom header values (probably a separate question altogether)
Standard HTTP Headers - No standard headers are appropriate
Query Parameters - Adding query params actually changes the Request-URI being deleted; against standard practices
POST Method - (e.g. POST /resourceToDelete { deletemetadata }) POST is not a semantic option for deleting; POST actually represents the opposite action desired (i.e. POST creates resource subordinates; but I need to delete the resource)
Multiple Methods - Splitting the DELETE request into two operations (e.g. PUT delete metadata, then DELETE) splits an atomic operation into two, potentially leaving an inconsistent state. The delete reason (and other related metadata) are not part of the resource representation itself.
My first preference would probably be to use the message body, second to custom HTTP headers; however, as indicated, there are some downsides to these approaches.
Are there any recommendations or best practices inline with REST/HTTP standards for including such required metadata on DELETE requests? Are there any other alternatives that I haven't considered?

Despite some recommendations not to use the message body for DELETE requests, this approach may be appropriate in certain use cases. This is the approach we ended up using after evaluating the other options mentioned in the question/answers, and after collaborating with consumers of the service.
While the use of the message body is not ideal, none of the other options were perfectly fitting either. The request body DELETE allowed us to easily and clearly add semantics around additional data/metadata that was needed to accompany the DELETE operation.
I'd still be open to other thoughts and discussions, but wanted to close the loop on this question. I appreciate everyone's thoughts and discussions on this topic!

Given the situation you have, I would take one of the following approaches:
Send a PUT or PATCH: I am deducing that the delete operation is virtual, by the nature of needing a delete reason. Therefore, I believe updating the record via a PUT/PATCH operation is a valid approach, even though it is not a DELETE operation per se.
Use the query parameters: The resource uri is not being changed. I actually think this is also a valid approach. The question you linked was talking about not allowing the delete if the query parameter was missing. In your case, I would just have a default reason if the reason is not specified in the query string. The resource will still be resource/:id. You can make it discoverable with Link headers on the resource for each reason (with a rel tag on each to identify the reason).
Use a separate endpoint per reason: Using a url like resource/:id/canceled. This does actually change the Request-URI and is definitely not RESTful. Again, link headers can make this discoverable.
Remember that REST is not law or dogma. Think of it more as guidance. So, when it makes sense to not follow the guidance for your problem domain, don't. Just make sure your API consumers are informed of the variance.

What you seem to want is one of two things, neither of which are a pure DELETE:
You have two operations, a PUT of the delete reason followed by a DELETE of the resource. Once deleted, the contents of the resource are no longer accessible to anyone. The 'reason' cannot contain a hyperlink to the deleted resource. Or,
You are trying to alter a resource from state=active to state=deleted by using the DELETE method. Resources with state=deleted are ignored by your main API but might still be readable to an admin or someone with database access. This is permitted - DELETE doesn't have to erase the backing data for a resource, only to remove the resource exposed at that URI.
Any operation which requires a message body on a DELETE request can be broken down into at it's most general, a POST to do all the necessary tasks with the message body, and a DELETE. I see no reason to break the semantics of HTTP.

I suggest you include the required metadata as part of the URI hierarchy itself. An example (Naive):
If you need to delete entries based on a date range, instead of passing the start date and end date in body or as query parameters, structure the URI such a way that you pass the required information as part of the URI.
e.g.
DELETE /entries/range/01012012/31122012 -- Delete all entries between 01 January 2012 to 31st December 2012
Hope this helps.

I would say that query parameters are part of the resource definition, thus you can use them to define the scope of your operation, then "apply" the operation.
My conclusion is that Query Parameters as you defined it is the best approach.

RESTful collections & controlling member details

I have come across this issue a few times now, and each time I make a fruitless search to come up with a satisfying answer.
We have a collection resource which returns a representation of the member URIs, as well as a Link header field with the same URIs (and a custom relation type). Often we find that we need specific data from each member in the collection.
At one extreme, we can have the collection return nothing but the member URIs; the client must then query each URI in turn to determine the required data from each member.
At the other extreme, we return all of the details we might want on the collection. Neither of these is perfect; the first can result in a large number of API calls, and the second may return a lot of potentially unneeded information.
Of the two extremes I favour the second in our case, since we rarely use this for more than one sutiation. However, for a more general approach, I wondered if anyone had a nice way of dynamically specifying which details should be included for each member of the collection? I guess a query string parameter would be most appropriate, but I don't want to break the self-descriptiveness of the resource.

I prefer your first option..
At one extreme, we can have the
collection return nothing but the
member URIs; the client must then
query each URI in turn to determine
the required data from each member.
If you are wanting to reduce the number of HTTP calls over the wire, for example calling a service from a handset app (iOS/Android). You can include an additional header to include the child resources:
X-Aggregate-Resources-Depth: 2
Your server side code will have to aggregate the resources to the desired depth.

Sounds like you're trying to reinvent PROPFIND (RFC 4918, Section 9.1).

I regularly contain a subset of elements in each item within a collection resource. How you define the different subsets is really up to you. Whether you do,
/mycollectionwithjustlinks
/mycollectionwithsubsetA
/mycollectionwithsubsetB
or you use query strings
/mycollection?itemfields=foo,bar,baz
either way they are all different resources. I'm not sure why you believe this is affecting the self-descriptive constraint.

What's the justification behind disallowing partial PUT?

Why does an HTTP PUT request have to contain a representation of a 'whole' state and can't just be a partial?
I understand that this is the existing definition of PUT - this question is about the reason(s) why it would be defined that way.
i.e:
What is gained by preventing partial PUTs?
Why was preventing idempotent partial updates considered an acceptable loss?

PUT means what the HTTP spec defines it to mean. Clients and servers cannot change that meaning. If clients or servers use PUT in a way that contradicts its definition, at least the following thing might happen:
Put is by definition idempotent. That means a client (or intermediary!) can repeat a PUT any number of times and be sure that the effect will be the same. Suppose an intermediary receives a PUT request from a client. When it forwards the request to the server, there is a network problem. The intermediary knows by definition that it can retry the PUT until it succeeds. If the server uses PUT in a non idempotent way these potential multiple calls will have an undesired effect.
If you want to do a partial update, use PATCH or use POST on a sub-resource and return 303 See Other to the 'main' resource, e.g.
POST /account/445/owner/address
Content-Type: application/x-www-form-urlencoded
street=MyWay&zip=22222&city=Manchaster
303 See Other
Location: /account/445
EDIT: On the general question why partial updates cannot be idempotent:
A partial update cannot be idempotent in general because the idempotency depends on the media type semantics. IOW, you might be able to specify a format that allows for idempotent patches, but PATCH cannot be guaranteed to be idempotent for every case. Since the semantics of a method cannot be a function of the media type (for orthogonality reasons) PATCH needs to be defined as non-idempotent. And PUT (being defined as idempotent) cannot be used for partial updates.

Because, I guess, this would have translated in inconsistent "views" when multiple concurrent clients access the state. There isn't a "partial document" semantics in REST as far as I can tell and probably the benefits of adding this in face of the complexity of dealing with that semantics in the context of concurrency wasn't worth the effort.
If the document is big, there is nothing preventing you from building multiple independent documents and have an overarching document that ties them together. Furthermore, once all the bits and pieces are collected, a new document can be collated on the server I guess.
So, considering one can "workaround" this "limitations", I can understand why this feature didn't make the cut.

Short answer: ACIDity of the PUT operation and the state of the updated entity.
Long answer:
RFC 2616 : Paragraph 2.5, "POST method requests the enclosed entity to be accepted as a new subordinate of the requested URL". Paragraph 2.6, "PUT method requests the enclosed entity to be stored at the specified URL".
Since every time you execute POST, the semantic is to create a new entity instance on the server, POST constitutes an ACID operation. But repeating the same POST twice with the same entity in the body still might result in different outcome, if for example the server has run out of storage to store the new instance that needs to be created - thus, POST is not idempotent.
PUT on the other hand has a semantic of updating an existing entity. There's no guarantee that even if a partial update is idempotent, it is also ACID and results in consistent and valid entity state. Thus, to ensure ACIDity, PUT semantic requires the full entity to be sent. Even if it was not a goal for the HTTP protocol authors, the idempotency of the PUT request would happen as a side effect of the attempt to enforce ACID.
Of course, if the HTTP server has close knowledge of the semantic of the entities, it can allow partial PUTs, since it can ensure through server-side logic the consistency of the entity. This however requires tight coupling between the data and the server.

With a full document update, it's obvious, without knowing any details of the particular API or what its limitations on the document structure are, what the resulting document will be after the update.
If a certain method was known to never be a partial content update, and an API someone provided only supported that method, then it would always be clear what someone using the API would have to do to change a document to have a given set of valid contents.

Incrementing resource counter in a RESTful way: PUT vs POST

I have a resource that has a counter. For the sake of example, let's call the resource profile, and the counter is the number of views for that profile.
Per the REST wiki, PUT requests should be used for resource creation or modification, and should be idempotent. That combination is fine if I'm updating, say, the profile's name, because I can issue a PUT request which sets the name to something 1000 times and the result does not change.
For these standard PUT requests, I have browsers do something like:
PUT /profiles/123?property=value&property2=value2
For incrementing a counter, one calls the url like so:
PUT /profiles/123/?counter=views
Each call will result in the counter being incremented. Technically it's an update operation but it violates idempotency.
I'm looking for guidance/best practice. Are you just doing this as a POST?

I think the right answer is to use PATCH. I didn't see anyone else recommending it should be used to atomically increment a counter, but I believe RFC 2068 says it all very well:
The PATCH method is similar to PUT except that the entity contains a
list of differences between the original version of the resource
identified by the Request-URI and the desired content of the resource
after the PATCH action has been applied. The list of differences is
in a format defined by the media type of the entity (e.g.,
"application/diff") and MUST include sufficient information to allow
the server to recreate the changes necessary to convert the original
version of the resource to the desired version.
So, to update profile 123's view count, I would:
PATCH /profiles/123 HTTP/1.1
Host: www.example.com
Content-Type: application/x-counters
views + 1
Where the x-counters media type (which I just made up) is made of multiple lines of field operator scalar tuples. views = 500 or views - 1 or views + 3 are all valid syntactically (but may be forbidden semantically).
I can understand some frowning-upon making up yet another media type, but I humbly suggest it's more correct than the POST / PUT alternative. Making up a resource for a field, complete with its own URI and especially its own details (which I don't really keep, all I have is an integer) sounds wrong and cumbersome to me. What if I have 23 different counters to maintain?

An alternative might be to add another resource to the system to track the viewings of a profile. You might call it "Viewing".
To see all Viewings of a profile:
GET /profiles/123/viewings
To add a viewing to a profile:
POST /profiles/123/viewings #here, you'd submit the details using a custom media type in the request body.
To update an existing Viewing:
PUT /viewings/815 # submit revised attributes of the Viewing in the request body using the custom media type you created.
To drill down into the details of a viewing:
GET /viewings/815
To delete a Viewing:
DELETE /viewings/815
Also, because you're asking for best-practice, be sure your RESTful system is hypertext-driven.
For the most part, there's nothing wrong with using query parameters in URIs - just don't give your clients the idea that they can manipulate them.
Instead, create a media type that embodies the concepts the parameters are trying to model. Give this media type a concise, unambiguous, and descriptive name. Then document this media type. The real problem of exposing query parameters in REST is that the practice often leads out-of-band communication, and therefore increased coupling between client and server.
Then give your system a uniform interface. For example, adding a new resource is always a POST. Updating a resource is always a PUT. Deleting is DELETE, and getiing is GET.
The hardest part about REST is understanding how media types figure into system design (it's also the part that Fielding left out of his dissertation because he ran out of time). If you want a specific example of a hypertext-driven system that uses and doucuments media types, see the Sun Cloud API.

After evaluating the previous answers I decided PATCH was inappropriate and, for my purposes, fiddling around with Content-Type for a trivial task was a violation of the KISS principle. I only needed to increment n+1 so I just did this:
PUT /profiles/123$views
++
Where ++ is the message body and is interpreted by the controller as an instruction to increment the resource by one.
I chose $ to deliminate the field/property of the resource as it is a legal sub-delimiter and, for my purposes, seemed more intuitive than / which, in my opinion, has the vibe of traversability.

I think both approaches of Yanic and Rich are interresting. A PATCH does not need to be safe or indempotent but can be in order to be more robust against concurrency. Rich's solution is certainly easier to use in a "standard" REST API.
See RFC5789:
PATCH is neither safe nor idempotent as defined by [RFC2616], Section
9.1.
A PATCH request can be issued in such a way as to be idempotent,
which also helps prevent bad outcomes from collisions between two
PATCH requests on the same resource in a similar time frame.
Collisions from multiple PATCH requests may be more dangerous than
PUT collisions because some patch formats need to operate from a
known base-point or else they will corrupt the resource.

Develop Reference

r css asp.net wordpress firebase qt symfony nginx http apache-flex