Bulk operation on GET - api-design

What would be the best way to design an API to accept a request for a bulk GET. I currently have a scenario where I have about 100 id's. I don't want to call the API 100 times to get each resource but sending 100 GUIDs in a Query String doesn't seem right either.
What is the proper way to handle this scenario?

If you have too much specific parameters, it is common way to use POST with JSON body, where you specify, what you want.
But having big query string is not bad either (just remember there is limitation for maximum length). You can have even array in query string, it is sended (and consumed) as this : http://something.com?ids[0]=7&ids[1]=33&ids[2]=5
Frameworks (like Spring) are able to automatically convert these parameters into arrays or lists.

Related

Is there a canonical/RESTful way to send query details to a server during a GET?

I'm designing a (more or less) RESTful internal web service running on ASP.NET and IIS. I want clients to be able to pass query details to the server when accessing large collections of entries, using JSON to describe the query in a known manner. The issue is that the queries sent to the server will be complex; they may include aggregation, filtering, mapping—essentially anything that is supported by the LINQ query operators. This will result in relatively large JSON objects representing the queries.
The conflict I'm facing is that, while a query is semantically a GET in the world of REST, there's no standardized way to pass a large block of data to a web server during a GET. I've come up with a few options to get around this issue.
Option 1: Send the query object in the body of the GET request.
GET /namespace/collection/ HTTP/1.1
Content-Length: 22
{ /* query object */ }
Obviously, this is non-standard, and some software may choke on a GET request that has a body. (Or worse, simply strip the body and handle the request without it, which would cause the server to return an incorrect result set.)
Option 2: Use a non-standard HTTP verb (perhaps QUERY) instead of GET.
QUERY /namespace/collection/ HTTP/1.1
Content-Length: 22
{ /* query object */ }
While this doesn't fit exactly with the REST pattern, it seems (to me) like a safe alternative because other software (such as anything that uses WebDAV) seems to use non-standard HTTP verbs with sufficient success.
Option 3: Put the query object in a non-standard HTTP header.
GET /namespace/collection/ HTTP/1.1
ProjectName-Query: { /* query object */ }
This option keeps the request as a GET, but requires stuffing what could potentially be a very large object in an HTTP header. I understand some software places arbitrary length limits on HTTP headers, so this may cause issues if the object gets too big.
Option 4: Use the POST verb and provide an alternate endpoint for querying.
POST /namespace/collection/query HTTP/1.1
Content-Length: 22
{ /* query object */ }
Because this uses a standard verb and no standard headers, this method is guaranteed to work in all scenarios. The only issue is that it strays from RESTful architecture, which I'm trying to stay aligned with as best I can.
None of these options are quite right. What I want to know is which way makes the most sense for the service I'm writing; it's an internal web service (it will never exposed to the public) but it may be accessed through a variety of network security applications (firewalls, content filters, etc..) and I want to stick to known development styles, standards, and architecture as best I can.
I would think about "RESTful querying" as having two resources: Query and QueryResult.
You POST your Query to one end-point (e.g. "POST /queries/") and receive a CREATED Status back with the URI of your specific query (/queries/123) and a nice and RESTful hypertext body telling you the URL of your query result (e.g. /result/123 ). Then you access your query result with a GET /result/123. (Bonus points if you use hypertext to link back to /queries/123 so that the consumer of the query result can check and modify the query.
To elaborate the point I'm trying to make:
If RESTful is basically reduced to "map business entities to URIs" than the obvious question arises: "how can I query a subset of my entities"? Often the solution is "adding a query string to the 'all entities of this type'-URL" - Why else would it be called "query string"?. But it starts to feel "wrong" - as stated in the OP - if you want to have a full fledged query interface.
The reason is that with this requirement the Query becomes a full business object itself and is no longer an addendum to an resource address. It's no longer secondary but primary. It becomes important enough to become a resource in its own right - with it's own address (e.g. URL) and representation.
I would use Option 4. It is difficult to put the query representation in json for a large search request into an url, especially against a search server. I agree, in that case it does not fit into a Restful style since the resources cannot be identified by the URI. REST is a guideline. If the scenario cannot be realized by REST then i guess do something that solves the problem. Here using POST is not restful but it seems to be the correct solution.
I'm not sure how much it would look "canonical" to you, but you could have a serious look at OData (open data protocol):
OData is a standardized protocol for creating and consuming data APIs.
OData builds on core protocols like HTTP and commonly accepted
methodologies like REST. The result is a uniform way to expose
full-featured data APIs.
Even if you don't implement it as is, there are ideas that could be reused.
Specifically, OData defines batch processing. It's used for executing multiple operations sent in a single HTTP request. So, with OData, you have two choices:
use the GET + query string operation for queries that are not too long
use a POST + multipart body operation for bigger things.
More on maximum uri length in an OData context: OData Url Length Limitations
Also, many security devices (routers, firewall, etc.) will simply not let your option 1, 2 and 3 go through. GET + Body is unusual, GET + a big form value may get killed, and a custom HTTP verb is also very unusual.
Overall, I think the POST + body seems the best choice (whether it's strictly multipart - like in OData - or not is up to you)
After thinking more about this, I am going to give another answer.
What do you mean, in estimated number of characters, when you state the JSON representations will be "relatively large"? IE can handle URLs over 2,000 characters. Will the queries ever get bigger than that? Because I think the querystring is the way to go. Right now I am working on a system that uses JSONP so we have no other option than to pass all data as a JSON package in the querystring and it works fine. Not only will using the GET verb be semantically correct, this will also include the feature of being able to bookmark URLs to the results. The users could easily share links to the data results through email or other electronic communication systems you use internally.
I'm not sure if this helps but even for all Quickbooks APIs, queries which return large resultsets like Read All, or a LINQ extender query which returns large JSON resultsets, we use GET with the relevant content type and encoding like ASCII. The request uses compressionFormat as None and response uses a GZIP compressionFormat.
https://developer.intuit.com/apiexplorer?apiName=V3QBO
The best way would be to serialize the search JSON object and pass it as a query parameter. Are you sure it will be too long for modern browsers and servers? Modern browsers and servers can handle pretty hefty GET query parameter lengths, thousands of characters.
Perhaps an extension header like X-Custom-Query-Parameters-JSON if objects are going to be more on the order of 8k characters.
How many characters would a serialized JSON object be in your particular case?
Some related questions about character limits:
What is the limit on QueryString / GET / URL parameters
Is there a practical HTTP Header length limit?
An interesting problem. I don't have the specifics on what you are trying to do, but I wonder if it is too much to gracefully handle with one resource. You may want to break it up into several different types depending on the main characteristics of the request. If you are just trying to expose what should be a SQL query through an HTTP request, then I don't think there is any way it can can be implemented without a mess. Just pass the SQL query in the query string and stop trying to find a proper way to do it - it doesn't exist.
Use POST, and pass the queries/parameters as key-value pairs in the body as json. It also becomes easier in your asp.net code to translate the payload into a dictionary object.
Dictionary<string,object>

Different representations of one resource

When i have a resource, let's say customers/3 which returns the customer object and i want to return this object with different fields, or some other changes (for example let's say i need to have include in customer object also his latest purchase (for the sake of speed i dont want to do 2 different queries)).
As i see it my options are:
customers/3/with-latest-purchase
customers/3?display=with-latest-purchase
In the first option there is distinct URI for the new representation, but is this REALLY needed? Also how do i tell the client that this URI exist?
In the second option there is GET parameter telling the server what kind of representation to return. The URI parameters can be explained through OPTIONS method and it is easier to tell client where to look for the data as all the representations are all in one place.
So my question is which of these is better (more RESTful) and/or is there some better way to do this that i do not know about?
I think what is best is to define atomic, indivisible service objects, e.g. customer and customer-latest-purchase, nice, clean, simple. Then if the client wants a customer with his latest purchases, they invoke both service calls, instead of jamming it all in one with funky parameters.
Different representations of an object is OK in Java through interfaces but I think it is a bad idea for REST because it compromises its simplicity.
There is a misconception that making query parameters look like file paths is more RESTful. The query portion of the address is included when determining a distinct URI so the second option is fine.
Is there much of a performance hit in including the latest purchase data in all customer GET requests? If not, the simplest thing would be to do that so there would neither be weird URL params or double requests. If getting the latest order is a significant hardship (which it probably shouldn't be) there is nothing wrong with adding a flag in the query string to include it.

RESTful collections & controlling member details

I have come across this issue a few times now, and each time I make a fruitless search to come up with a satisfying answer.
We have a collection resource which returns a representation of the member URIs, as well as a Link header field with the same URIs (and a custom relation type). Often we find that we need specific data from each member in the collection.
At one extreme, we can have the collection return nothing but the member URIs; the client must then query each URI in turn to determine the required data from each member.
At the other extreme, we return all of the details we might want on the collection. Neither of these is perfect; the first can result in a large number of API calls, and the second may return a lot of potentially unneeded information.
Of the two extremes I favour the second in our case, since we rarely use this for more than one sutiation. However, for a more general approach, I wondered if anyone had a nice way of dynamically specifying which details should be included for each member of the collection? I guess a query string parameter would be most appropriate, but I don't want to break the self-descriptiveness of the resource.
I prefer your first option..
At one extreme, we can have the
collection return nothing but the
member URIs; the client must then
query each URI in turn to determine
the required data from each member.
If you are wanting to reduce the number of HTTP calls over the wire, for example calling a service from a handset app (iOS/Android). You can include an additional header to include the child resources:
X-Aggregate-Resources-Depth: 2
Your server side code will have to aggregate the resources to the desired depth.
Sounds like you're trying to reinvent PROPFIND (RFC 4918, Section 9.1).
I regularly contain a subset of elements in each item within a collection resource. How you define the different subsets is really up to you. Whether you do,
/mycollectionwithjustlinks
/mycollectionwithsubsetA
/mycollectionwithsubsetB
or you use query strings
/mycollection?itemfields=foo,bar,baz
either way they are all different resources. I'm not sure why you believe this is affecting the self-descriptive constraint.

How can I deal with HTTP GET query string length limitations and still want to be RESTful?

As stated in http://www.boutell.com/newfaq/misc/urllength.html, HTTP query string have limited length. It can be limited by the client (Firefox, IE, ...), the server (Apache, IIS, ...) or the network equipment (applicative firewall, ...).
Today I face this problem with a search form. We developed a search form with a lot of fields, and this form is sent to the server as a GET request, so I can bookmark the resulting page.
We have so many fields that our query string is 1100 bytes long, and we have a firewall that drops HTTP GET requests with more than 1024 bytes. Our system administrator recommends us to use POST instead so there will be no limitation.
Sure, POST will work, but I really feel a search as a GET and not a POST. So I think I will review our field names to ensure the query string is not too long, and if I can't I will be pragmatic and use POST.
But is there a flaw in the design of RESTful services? If we have limited length in GET request, how can I do to send large objects to a RESTful webservice? For example, if I have a program that makes calculations based on a file, and I want to provide a RESTful webservice like this: http://compute.com?content=<base64 file>. This won't work because the query string has not unlimited length.
I'm a little puzzled...
HTTP specification actually advises to use POST when sending data to a resource for computation.
Your search looks like a computation, not a resource itself. What you could do if you still want your search results to be a resource is create a token to identify that specific search result and redirect the user agent to that resource.
You could then delete search results tokens after some amount of time.
Example
POST /search
query=something&category=c1&category=c2&...
201 Created
Location: /search/01543164876
then
GET /search/01543164876
200 Ok
... your results here...
This way, browsers and proxies can still cache search results but you are submitting your query parameters using POST.
EDIT
For clarification, 01543164876 here represents a unique ID for the resource representing your search. Those 2 requests basically mean: create a new search object with these criteria, then retrieve the results associated with the created search object.
This ID can be a unique ID generated for each new request. This would mean that your server will leak "search" objects and you will have to clean them regularly with a caching strategy.
Or it can be a hash of all the search criteria actually representing the search asked by the user. This allows you to reuse IDs since recreating a search will return an existing ID that may (or may not) be already cached.
Based on your description, IMHO you should use a POST. POST is for putting data on the server and, in some cases, obtain an answer. In your case, you do a search (send a query to the server) and get the result of that search (retrieve the query result).
The definition of GET says that it must be used to retrieve an already existing resource. By definition, POST is to create a new resource. This is exactly what you are doing: creating a resource on the server and retrieving it! Even if you don't store the search result, you created an object on the server and retrieved it. As PeterMmm previsouly said, you could do this with a POST (create and store the query result) and then use a GET to retrive the query, but it's more pratical do only a POST and retrieve the result.
Hope this helps! :)
REST is a manner to do things, not a protocol. Even if you dislike to POST when it is really a GET, it will work.
If you will/must stay with the "standard" definition of GET, POST, etc. than maybe consider to POST a query, that query will be stored on the server with a query id and request the query later with GET by id.
Regarding your example:http://compute.com?content={base64file}, I would use POST because you are uploading "something" to be computed. For me this "something" feels more like a resource as a simple parameter.
In contrast to this in usual search I would start to stick with GET and parameters. You make it so much easier for api-clients to test and play around with your api. Make the read-only access (which in most cases is the majority of traffic) as simple as possible!
But the dilemma of large query strings is a valid limitation of GET. Here I would go pragmatic, as long as you don't hit this limit go with GET and url-params. This will work in 98% of search-cases. Only act if you hit this limit and then also introduce POST with payload (with mime-type Content-Type: application/x-www-form-urlencoded).
Have you got more real-world examples?
The confusion around GET is a browser limitation. If you are creating a RESTful interface for an A2A or P2P application then there is no limitation to the length of your GET.
Now, if you happen to want to use a browser to view your RESTful interface (aka during development/debugging) then you will run into this limit, but there are tools out there to get around this.
This is an easy one. Use POST. HTTP doesn't impose a limit on the URL length for GET but servers do. Be pragmatic and work around that with a POST.
You could also use a GET body (that is allowed) but that's a double-whammy in that it is not correct usage and probably going to have server problems.
I think if u develop the biz system, encounter this issue, u must think whether the api design reasonable, if u GET api param design a biz_ids, and it too long.
u should think about with UI or Usecase, whether use other_biz_id to find biz_ids and build target response instead of biz_ids directly or not.
if u old api be depended on, u can add a new api for this usecase, if u module design well u add this api may fast.
I think should use protocols in a standard way as developer.
hope help u.

Best way of send multiple parameters via querystring Asp .Net

Which is the best way (in performance and security) to send multiple parameters to a web page (on a different server), considering that the length of the parameters may vary because I'm sending a list of products, and the customer may have selected more than one product, so we need to send each product on the querystring to the other page.
For example (I'm on C#); I want to call a web page like this:
Simple Querystring: thepage.asp?Product=1&Name=Coffee&Value=1.99
Json: thepage.asp?{"Product":"1","Name":"Coffee","Value":"1.99"}
XML: thepage.aps?<xml><Products><product>1</product><name>Coffee</name><Value>1.99</Value></Products>
(Obviouly considering we can't send special characters via querystring, but I put them here for better understanding)
Which will be the better way (performance, security)?
Thanks in advance.
Based on your comment, you're limited to what the third-party site will accept - if all it will handle is query-strings, that's how you'll have to send it. If it will handle form posts, then you could look at submitting the information in the headers of a post, but that is going to take more work (you also haven't specified if you're building a WebRequest on the server side, or doing this through JavaScript on the client side).
All things considered, here are some general points:
There are various limits on the length of a query string (IE limits them to about 2083 characters, some servers or proxies may ignore parts over 1024 characters etc), while POST requests can be much larger.
If you are doing this client side, the user can see the query string parameters (which has the benefit that they can book mark them), while they can't (easily) see POST requests.
For greater security, if the third party server supports it, submit the request over SSL.
Special characters can easily be sent via the query string if you UrlEncode them first.
As to performance, it depends on the amount of processing you have to do to create the query strings over creating XML or JSON strings.
I would use the simple querystring approach, which you could write a utility to convert the request.querystring collection into a format that works better for you (XML, JSON, Dictionary, etc.), IMHO.
HTH.
You need to keep in mind that there is a limit to how long your query string can be, depending on which browser your users use. IE6 has a limit of 2053 characters for example. I would suggest you come up with a method to keep your query string as short as possible to avoid hitting this limit.
As far as security goes, there really isn't any security if you are passing around information in a query string. Anyone can modify that information and then send it. If security is a major concern, you should look into encrypting the information before adding it to the query string, or find a different method for sending it altogether.
Come on what is the question asked ? which is the better way . no one answer proper here. all are telling about limitations. but not about the remedy to solve it . let say i want to pass 100 parameters generates dynamically all are in huge length , can i use here POST() then? I don't thinks so, just consider, what should the remedy then?? may be pass collection object as parameter.

Resources