RESTful collections & controlling member details - http

I have come across this issue a few times now, and each time I make a fruitless search to come up with a satisfying answer.
We have a collection resource which returns a representation of the member URIs, as well as a Link header field with the same URIs (and a custom relation type). Often we find that we need specific data from each member in the collection.
At one extreme, we can have the collection return nothing but the member URIs; the client must then query each URI in turn to determine the required data from each member.
At the other extreme, we return all of the details we might want on the collection. Neither of these is perfect; the first can result in a large number of API calls, and the second may return a lot of potentially unneeded information.
Of the two extremes I favour the second in our case, since we rarely use this for more than one sutiation. However, for a more general approach, I wondered if anyone had a nice way of dynamically specifying which details should be included for each member of the collection? I guess a query string parameter would be most appropriate, but I don't want to break the self-descriptiveness of the resource.

I prefer your first option..
At one extreme, we can have the
collection return nothing but the
member URIs; the client must then
query each URI in turn to determine
the required data from each member.
If you are wanting to reduce the number of HTTP calls over the wire, for example calling a service from a handset app (iOS/Android). You can include an additional header to include the child resources:
X-Aggregate-Resources-Depth: 2
Your server side code will have to aggregate the resources to the desired depth.

Sounds like you're trying to reinvent PROPFIND (RFC 4918, Section 9.1).

I regularly contain a subset of elements in each item within a collection resource. How you define the different subsets is really up to you. Whether you do,
/mycollectionwithjustlinks
/mycollectionwithsubsetA
/mycollectionwithsubsetB
or you use query strings
/mycollection?itemfields=foo,bar,baz
either way they are all different resources. I'm not sure why you believe this is affecting the self-descriptive constraint.

Related

What is the expected logic of a batched POST request to a subresource

I like the idea of vectorized/batched requests similar to what the StackExchange API offers and would like to implement something for my own API, i.e. GET /users/1;2;3;4;5 would return the selected user resources with id 1 to 5.
I think this is fairly simple when reading data, but what would be the expected behavior for i.e. a POST request to a subresource?
POST /1;2;3;4;5/subresource
Would this mean:
Creation of five new subresources, assigned to each id (1:1)
Creation of a single new subresource, but assigned to each resource id (1:n)
I have a couple of concerns regarding this approach. First, resources should be uniquely addressable via certain resource locators (URIs). Using your approach however bypasses this requirement in some way IMO. This approach may also lead to other issues later on, i.e. plenty of frameworks do not allow URIs that exceed a certain character size.
Furthermore, instead of consecutive resource IDs the resource should use UUIDs instead. This will first and foremost prevent guessing attacks and also prevent logical issues on moving resources or inserting some in between.
The POST method requests that the target resource process the
representation enclosed in the request according to the resource's
own specific semantics.
In regards to HTTP POST operations, the specification clearly states that the semantics of any body received via POST is up to the service developer. So you are basically allowed to do anything within a POST request. As the semantics is totally up to you, you have to document the behavior explicitely. Not documenting the applied logic will leave a large grey-zone for service users.

REST resources with a triple as a parameter

When needing to create a URL that takes a finite set of parameters, where all of said parameters are semantically the same "level", what is the current consensus around the use of delimiters within URLs? Here's an example:
/myresource/thing1,thing2,thing3
/myresource/thing2,thing1
/myresource/thing1;thing2;thing3
/myresource/thing1;thing3
That is to say, the parameter here could be a single, a pair or a triple. They can be specified in any order because they are not a logical tree, and thing2 is not a subordinate resource of thing1, so doing something like this seems "wrong":
/myresources/thing1/thing2/thing3
This bothers me because it implies a tree-like relationship between the elements of the triple, and that is not the case (despite many HTTP frameworks seemingly pushing this, wrongly in my view). In addition, using a query string doesn't feel right as this is not a search operation, it is a known triple in a very finite space - there's nothing to query or search, so to speak.
I suppose the other option would be to make it a POST request and supply a body that details the parts of the triple being supplied. This doesn't give me warm fuzzies though, for some reason.
How have others handled this? Delimiters seem clean to me, and communicate the intended semantics of the resource, but i know there are folks would would take a different view, and I was looking to understand the experiences of others who've had similar use cases.
Since any value can be missing and values can appear in any order, How would you know which value is for which parameter (if that matters).
I would have used query string for GET, or in the payload for POST.
Use query parameters
/path/to/the/resource?key1=value1&key2=value2&key3=value3
or matrix parameters
/path/to/the/resource;key1=value1;key2=value2;key3=value3
Without a proper example, I'm not sure exactly about your needs.
However, a little known fact is that any HTTP parameter can have multiple values. It is the way to go when you have a set of objects (see GoogleMaps static API for an example).
/path/to/the/resource?things=thing1&things=thing2&things=thing3
Then you can use the same API for single, pairs, triples (and more).

How should Transient Resources be retrieved in a RESTful API

For a while I was (wrongly) thinking that a RESTful API just exposed CRUD operation to persisted entities for a web application. When you code something up in "the real world" you soon find out that this is not enough. For example, a bank account transfer doesn't have to be a persisted entity. It could be a transient resource where you POST to /transfers/ and in the payload you specify the details:
{"accountToCredit":1234, "accountToDebit":5678, "amount":10}
Using POST here makes sense because it changes the state on the server ($10 moves from one account to another every time this POST occurs).
What should happen in the case where it doesn't affect the server? The simple first answer would be to use GET. For example, you want to get a list of savings and checking accounts that have less than $100. You would then call something like GET to /accounts/searchResults?minBalance=0&maxBalance=100. What happens though if your search parameter need to use complex objects that wouldn't fit in the maximum length of a GET request.
My first thought was to use POST, but after thinking about it some more it should probably be a PUT since it isn't changing the state of the server, but from my (limited) understanding I always though of PUT as updating a resource and POST as creating a resource (like creating this search results). So which should be used in this case?
I found the following links which provide some information but it wasn't clear to me what should be used in the different cases:
Transient REST Representations
How to design RESTful search/filtering?
RESTful URL design for search
I would agree with your approach, it seems reasonable to me to use GET when searching for resources, and as said in one of your provided links, the whole point of query strings is for doing things like search. I also agree that PUT fits better when you want to update some resource in an idempotent way (no matter how many times you hit the request, the result will be the same).
So generally, I would do it as you propose. Now, if you are limited by the maximum length of GET request, then you could use POST or PUT, passing your parameters in a JSON, in a URI like:
PUT /api/search
You could see this as a "search resource" where you send new parameters. I know it seems like a workaround and you may be worried that REST is about avoiding verbs in the URIs. Well, there are few cases that it's still acceptable and RESTful to use verbs, e.g. in cases where calculation or conversion is involved to generate the result (for more about this, check this reference).
PS. I think this workaround is still RESTful, but even if it wasn't, REST isn't an obsession and an ultimate goal. Being pragmatic and keeping a clean API design might be a better approach, even if in few cases you are not RESTful.

Different representations of one resource

When i have a resource, let's say customers/3 which returns the customer object and i want to return this object with different fields, or some other changes (for example let's say i need to have include in customer object also his latest purchase (for the sake of speed i dont want to do 2 different queries)).
As i see it my options are:
customers/3/with-latest-purchase
customers/3?display=with-latest-purchase
In the first option there is distinct URI for the new representation, but is this REALLY needed? Also how do i tell the client that this URI exist?
In the second option there is GET parameter telling the server what kind of representation to return. The URI parameters can be explained through OPTIONS method and it is easier to tell client where to look for the data as all the representations are all in one place.
So my question is which of these is better (more RESTful) and/or is there some better way to do this that i do not know about?
I think what is best is to define atomic, indivisible service objects, e.g. customer and customer-latest-purchase, nice, clean, simple. Then if the client wants a customer with his latest purchases, they invoke both service calls, instead of jamming it all in one with funky parameters.
Different representations of an object is OK in Java through interfaces but I think it is a bad idea for REST because it compromises its simplicity.
There is a misconception that making query parameters look like file paths is more RESTful. The query portion of the address is included when determining a distinct URI so the second option is fine.
Is there much of a performance hit in including the latest purchase data in all customer GET requests? If not, the simplest thing would be to do that so there would neither be weird URL params or double requests. If getting the latest order is a significant hardship (which it probably shouldn't be) there is nothing wrong with adding a flag in the query string to include it.

Bulk Collection Manipulation through a REST (RESTful) API

I'd like some advice on designing a REST API which will allow clients to add/remove large numbers of objects to a collection efficiently.
Via the API, clients need to be able to add items to the collection and remove items from it, as well as manipulating existing items. In many cases the client will want to make bulk updates to the collection, e.g. adding 1000 items and deleting 500 different items. It feels like the client should be able to do this in a single transaction with the server, rather than requiring 1000 separate POST requests and 500 DELETEs.
Does anyone have any info on the best practices or conventions for achieving this?
My current thinking is that one should be able to PUT an object representing the change to the collection URI, but this seems at odds with the HTTP 1.1 RFC, which seems to suggest that the data sent in a PUT request should be interpreted independently from the data already present at the URI. This implies that the client would have to send a complete description of the new state of the collection in one go, which may well be very much larger than the change, or even be more than the client would know when they make the request.
Obviously, I'd be happy to deviate from the RFC if necessary but would prefer to do this in a conventional way if such a convention exists.
You might want to think of the change task as a resource in itself. So you're really PUT-ing a single object, which is a Bulk Data Update object. Maybe it's got a name, owner, and big blob of CSV, XML, etc. that needs to be parsed and executed. In the case of CSV you might want to also identify what type of objects are represented in the CSV data.
List jobs, add a job, view the status of a job, update a job (probably in order to start/stop it), delete a job (stopping it if it's running) etc. Those operations map easily onto a REST API design.
Once you have this in place, you can easily add different data types that your bulk data updater can handle, maybe even mixed together in the same task. There's no need to have this same API duplicated all over your app for each type of thing you want to import, in other words.
This also lends itself very easily to a background-task implementation. In that case you probably want to add fields to the individual task objects that allow the API client to specify how they want to be notified (a URL they want you to GET when it's done, or send them an e-mail, etc.).
Yes, PUT creates/overwrites, but does not partially update.
If you need partial update semantics, use PATCH. See http://greenbytes.de/tech/webdav/draft-dusseault-http-patch-14.html.
You should use AtomPub. It is specifically designed for managing collections via HTTP. There might even be an implementation for your language of choice.
For the POSTs, at least, it seems like you should be able to POST to a list URL and have the body of the request contain a list of new resources instead of a single new resource.
As far as I understand it, REST means REpresentational State Transfer, so you should transfer the state from client to server.
If that means too much data going back and forth, perhaps you need to change your representation. A collectionChange structure would work, with a series of deletions (by id) and additions (with embedded full xml Representations), POSTed to a handling interface URL. The interface implementation can choose its own method for deletions and additions server-side.
The purest version would probably be to define the items by URL, and the collection contain a series of URLs. The new collection can be PUT after changes by the client, followed by a series of PUTs of the items being added, and perhaps a series of deletions if you want to actually remove the items from the server rather than just remove them from that list.
You could introduce meta-representation of existing collection elements that don't need their entire state transfered, so in some abstract code your update could look like this:
{existing elements 1-100}
{new element foo with values "bar", "baz"}
{existing element 105}
{new element foobar with values "bar", "foo"}
{existing elements 110-200}
Adding (and modifying) elements is done by defining their values, deleting elements is done by not mentioning it the new collection and reordering elements is done by specifying the new order (if order is stored at all).
This way you can easily represent the entire new collection without having to re-transmit the entire content. Using a If-Unmodified-Since header makes sure that your idea of the content indeed matches the servers idea (so that you don't accidentally remove elements that you simply didn't know about when the request was submitted).
Best way is :
Pass Only Id Array of Deletable Objects from Front End Application To Web API
2. Then You have Two Options:
2.1 Web API Way : Find All Collections/Entities using Id arrays and Delete in API , but you need to take care of Dependant entities like Foreign Key Relational Table Data too
2.2. Database Way : Pass Ids to your database side, find all records in Foreign Key Tables and Primary Key Tables and Delete in same order i.e. F-Key Table records then P-Key Table records

Resources