URL matrix parameters vs. query parameters - http

I'm wondering whether to use matrix or query parameters in my URLs. I found an older discussion on that topic, but it was not satisfying.
Examples
URL with query params: http://some.where/thing?paramA=1&paramB=6542
URL with matrix params: http://some.where/thing;paramA=1;paramB=6542
At first sight matrix params seem to have only advantages:
more readable
no encoding and decoding of "&" in XML documents is required
URLs with "?" are not cached in many cases; URLs with matrix params are cached
matrix parameters can appear everywhere in the path and are not limited to its end
matrix parameters can have more than one value: paramA=val1,val2
But there are also disadvantages:
only a few frameworks like JAX-RS support matrix parameters
When a browser submits a form via GET, the params become query params, so you end up with two kinds of parameters for the same task. To avoid confusing users of the REST services and to limit the effort for the developers of the services, it would be easier to always use query params in this area.
Since the developer of the service can choose a framework with matrix param support, the only remaining disadvantage would be that browsers create by default query parameters.
Are there any other disadvantages? What would you do?

The important difference is that matrix parameters apply to a particular path element while query parameters apply to the request as a whole. This comes into play when making a complex REST-style query to multiple levels of resources and sub-resources:
http://example.com/res/categories;name=foo/objects;name=green/?page=1
It really comes down to namespacing.
Note: The 'levels' of resources here are categories and objects.
If only query parameters were used for a multi-level URL, you would end up with
http://example.com/res?categories_name=foo&objects_name=green&page=1
This way you would also lose the clarity added by the locality of the parameters within the request. In addition, when using a framework like JAX-RS, all the query parameters would show up within each resource handler, leading to potential conflicts and confusion.
If your query has only one "level", then the difference is not really important and the two types of parameters are effectively interchangeable. However, query parameters are generally better supported and more widely recognized. In general, I would recommend that you stick with query parameters for things like HTML forms and simple, single-level HTTP APIs.
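To make the "namespacing" point concrete, here is a minimal sketch in plain Java (no framework) that splits the example URL into path segments and attaches each matrix parameter to the segment it belongs to. The class and method names are made up for illustration, and the sketch assumes segment names are unique and skips percent-decoding.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

final class MatrixParamSketch {

    /** Maps each path segment name to the matrix parameters attached to it. */
    static Map<String, Map<String, String>> matrixParamsBySegment(URI uri) {
        Map<String, Map<String, String>> result = new LinkedHashMap<>();
        for (String segment : uri.getRawPath().split("/")) {
            if (segment.isEmpty()) {
                continue; // empty piece before the leading slash or between doubled slashes
            }
            String[] parts = segment.split(";");
            Map<String, String> params = new LinkedHashMap<>();
            for (int i = 1; i < parts.length; i++) {
                String[] pair = parts[i].split("=", 2);
                params.put(pair[0], pair.length > 1 ? pair[1] : "");
            }
            // Assumption: segment names are unique; a repeated name would overwrite the earlier one.
            result.put(parts[0], params);
        }
        return result;
    }

    public static void main(String[] args) {
        URI uri = URI.create("http://example.com/res/categories;name=foo/objects;name=green/?page=1");
        // Both segments carry a "name" parameter without clashing.
        System.out.println(matrixParamsBySegment(uri));
        // The query string belongs to the request as a whole.
        System.out.println(uri.getRawQuery());
    }
}
```

Both `categories` and `objects` can carry a parameter called `name` because each one is scoped to its own segment, while `page=1` stays in the query string for the whole request.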

In addition to Tim Sylvester's answer I would like to provide an example of how matrix parameters can be handled with JAX-RS.
Matrix parameters at the last resource element
http://localhost:8080/res/categories/objects;name=green
You can access them using the @MatrixParam annotation
// Annotations come from javax.ws.rs (jakarta.ws.rs in Jakarta EE 9+)
@GET
@Path("categories/objects")
public String objects(@MatrixParam("name") String objectName) {
    return objectName;
}
Response
green
But like the Javadoc states
Note that the @MatrixParam annotation value refers to a name of a matrix parameter that resides in the last matched path segment of the Path-annotated Java structure that injects the value of the matrix parameter.
... which brings us to point 2
Matrix parameters in the middle of a URL
http://localhost:8080/res/categories;name=foo/objects;name=green
You can access matrix parameters anywhere using path variables and a @PathParam of type PathSegment.
@GET
@Path("{categoryVar:categories}/objects")
public String objectsByCategory(@PathParam("categoryVar") PathSegment categorySegment,
                                @MatrixParam("name") String objectName) {
    // Matrix parameters attached to the "categories" segment
    MultivaluedMap<String, String> matrixParameters = categorySegment.getMatrixParameters();
    String categorySegmentPath = categorySegment.getPath();
    return String.format("object %s, path:%s, matrixParams:%s%n", objectName,
            categorySegmentPath, matrixParameters);
}
Response
object green, path:categories, matrixParams:[name=foo]
Since the matrix parameters are provided as a MultivaluedMap you can access each by
List<String> names = matrixParameters.get("name");
or if you only need the first one
String name = matrixParameters.getFirst("name");
Get all matrix parameters as one method parameter
http://localhost:8080/res/all/categories;name=foo/objects;name=green/attributes;name=size
Use a List<PathSegment> to get them all
@GET
@Path("all/{var:.+}")
public String allSegments(@PathParam("var") List<PathSegment> pathSegments) {
    StringBuilder sb = new StringBuilder();
    // One PathSegment per path element matched by {var}
    for (PathSegment pathSegment : pathSegments) {
        sb.append("path: ");
        sb.append(pathSegment.getPath());
        sb.append(", matrix parameters ");
        sb.append(pathSegment.getMatrixParameters());
        sb.append("<br/>");
    }
    return sb.toString();
}
Response
path: categories, matrix parameters [name=foo]
path: objects, matrix parameters [name=green]
path: attributes, matrix parameters [name=size]

--Too important to be relegated to comment section.--
I'm not sure what the big deal is with matrix URLs. According to the w3c design article that TBL wrote, it was just a design idea and explicitly states that it's not a feature of the web. Things like relative URLs aren't implemented when using it. If you want to use it, that's fine; there's just no standard way to use it because it's not a standard.
— Steve Pomeroy.
So the short answer is: if you need RS for business purposes, you are better off using request parameters.

Related

How to list all parameters available to query via API?

As an end-point user of an API, how can I list all parameters available to pass to the query? In my case (stats about Age of Empires 2 matches), the website describing the API has a list with some of them, but it seems there are more available.
To provide more context, I'm extracting the following information:
GET("https://aoe2.net/api/matches?game=aoe2de&count=1000&since=1632744000&map_type=12")
but for some reason the last condition, map_type=12 does nothing (output is the same as without it). I'm after the list of parameters available, so I can extract what I want.
P.S.: this post is closely related but does not focus on APIs. Perhaps this makes a difference, as the second answer there seems to suggest.
It is not possible to find out all available (undocumented) query parameters for a query, unless the API explicitly provides such a method or you can find out how the API server processes the query.
For instance, if the API server code is open source, you could find out from the code how the query is processed, provided that you can find the code.
The answers in the post you linked are similarly valid for an API site as well as for one that provides content for a web browser (a web server can be both).
Under the hood, there is not necessarily any difference between an API server or a server that provides web content (html) in terms of how queries are handled.
As for the parameters seemingly without an effect, it seems that the API in question does not validate the query parameters, i.e., you can put arbitrary parameters in the query and the server will simply ignore parameters that it is not specifically programmed to use.
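As a rough illustration of that behaviour, here is a plain-Java sketch of a handler that only picks out the parameters it was written for and silently drops the rest. How the real server is implemented is unknown; the set of known parameters below is a guess based on the request in the question, and the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class LenientQuerySketch {

    // Hypothetical: the only parameters this imaginary handler was programmed to use.
    static final Set<String> KNOWN = Set.of("game", "count", "since");

    /** Returns the parameters that actually influence the result. */
    static Map<String, String> effectiveParams(String rawQuery) {
        Map<String, String> used = new LinkedHashMap<>();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return used;
        }
        for (String pair : rawQuery.split("&")) {
            String[] kv = pair.split("=", 2);
            if (KNOWN.contains(kv[0])) {
                used.put(kv[0], kv.length > 1 ? kv[1] : "");
            }
            // Unknown names are neither rejected nor reported; they are simply skipped.
        }
        return used;
    }

    public static void main(String[] args) {
        System.out.println(effectiveParams("game=aoe2de&count=1000&since=1632744000&map_type=12"));
        System.out.println(effectiveParams("game=aoe2de&count=1000&since=1632744000"));
    }
}
```

Both calls yield the same map, which would explain why adding an unsupported parameter leaves the output unchanged.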
The documentation on their website is all any of us have to go by: https://aoe2.net/#api
You can't just add your own parameters to the URL and expect them to have an effect; the server has to have been coded to handle them.
Your best bet is to just extract as much data as you can by increasing the count parameter, then loop through the JSON response and extract the map_type from there.
JavaScript example:
<script>
// Sample of the JSON array returned by the API (trimmed to three fields per match).
const matches = [
  {"match_id":"1953364","lobby_id":null,"game_type":0},
  {"match_id":"1961217","lobby_id":null,"game_type":0},
  {"match_id":"1962068","lobby_id":null,"game_type":1},
  {"match_id":"1962821","lobby_id":null,"game_type":0},
  {"match_id":"1963814","lobby_id":null,"game_type":0},
  {"match_id":"1963807","lobby_id":null,"game_type":0},
  {"match_id":"1963908","lobby_id":null,"game_type":0},
  {"match_id":"1963716","lobby_id":null,"game_type":0},
  {"match_id":"1964491","lobby_id":null,"game_type":0},
  {"match_id":"1964535","lobby_id":null,"game_type":12}
];
// Filter on the client side. The same check works for any other field
// in the response, such as map_type.
for (const match of matches) {
  if (match.game_type === 12) {
    // Do something with the game_type 12 match.
    console.log(match);
  }
}
</script>

HTTPServletRequest getting requested URL

I am using servlets to let clients perform CRUD operations on a list. I have only one servlet, but multiple URLs can reach it because its URL pattern contains a wildcard.
http://localhost:8080/WebServiceDesignStyles3ProjectServer/SpyListCollection
This is the generic way to send a request to the servlet. However, for certain operations
http://localhost:8080/WebServiceDesignStyles3ProjectServer/SpyListCollection/{name}
is also a valid way to send a request to the servlet. I need to be able to get the last portion of that URL. I was told that I should use getHeader("Accept") to retrieve it. I've had success using getRequestURI(), but I was hoping someone could provide an example using getHeader(), or at least an explanation of the differences between the two.
Thank you for your time,
Kirie
You could split the request path by the separator (/) and check the last part.
String reqURI = req.getRequestURI();
String[] parts = reqURI.split("/");
String lastPart = parts[parts.length - 1];
if (lastPart.equals("SpyListCollection")) {
    // Generic operation
} else {
    // The last part is the {name} from the URL
    String operation = lastPart;
}
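On the getHeader("Accept") part of the question: the Accept header carries the media types the client is willing to receive (for example application/json), not any part of the URL, so it cannot give you the last path segment. For the URI-based approach, here is a small standalone sketch of the same idea that also tolerates a trailing slash. The helper name and the sample name "Bond" are made up for illustration; in a servlet you would pass in req.getRequestURI().

```java
final class LastSegmentSketch {

    /** Returns the last non-empty path segment of a request URI. */
    static String lastSegment(String requestUri) {
        String path = requestUri;
        // Drop trailing slashes so ".../SpyListCollection/" behaves like ".../SpyListCollection".
        while (path.length() > 1 && path.endsWith("/")) {
            path = path.substring(0, path.length() - 1);
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String base = "/WebServiceDesignStyles3ProjectServer/SpyListCollection";
        System.out.println(lastSegment(base));           // generic request
        System.out.println(lastSegment(base + "/Bond")); // request for one (hypothetical) name
    }
}
```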

How to map several segments of URL to one PathVariable in spring-mvc?

I'm working on a webapp, one function of which is to list all the files under a given path. I tried to map several segments of the URL to one PathVariable like this:
@RequestMapping("/list/{path}")
public String listFilesUnderPath(@PathVariable String path, Model model) {
    //.... add the file list to the model
    return "list"; //the model name
}
It didn't work. When the request URL was like /list/folder_a/folder_aa, RequestMappingHandlerMapping complained: "Did not find handler method for ..."
Since the given path could contain any number of segments, it's not practical to write a method for every possible situation.
In REST each URL is a separate resource, so I don't think you can have a generic solution. I can think of two options:
One option is to change the mapping to @RequestMapping("/list/**") (the path parameter is no longer needed) and extract the whole path from the request.
The second option is to create several methods, with mappings like @RequestMapping("/list/{level1}"), @RequestMapping("/list/{level1}/{level2}"), @RequestMapping("/list/{level1}/{level2}/{level3}")... concatenate the path in the method bodies and call one method that does the job. This, of course, has the downside that you can only support a limited folder depth (you can make a dozen methods with these mappings if it's not too ugly for you).
You can capture zero or more path segments by appending an asterisk to the path pattern.
From the Spring documentation on PathPattern:
{*spring} matches zero or more path segments until the end of the path and captures it as a variable named "spring"
Note that the leading slash is part of the captured path as mentioned in the example on the same page:
/resources/{*path} — matches all files underneath the /resources/, as well as /resources, and captures their relative path in a variable named "path"; /resources/image.png will match with "path" → "/image.png", and /resources/css/spring.css will match with "path" → "/css/spring.css"
For your particular problem the solution would be:
@RequestMapping("/list/{*path}") // Use *path instead of path
public String listFilesUnderPath(@PathVariable String path, Model model) {
    //.... add the file list to the model
    return "list"; //the model name
}
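Because the captured value starts with a slash, it is worth being careful when turning it into a file-system location: resolving a string with a leading slash against a base directory yields that absolute path rather than a child of the base. The following is only a sketch in plain Java, assuming a made-up base directory /srv/files and a hypothetical helper name; it also rejects values that would climb out of the base with "..".

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class CapturedPathSketch {

    /** Resolves a captured value such as "/folder_a/folder_aa" below baseDir. */
    static Path resolveUnder(Path baseDir, String captured) {
        // Drop leading slashes so the value is treated as relative to baseDir.
        String relative = captured.replaceFirst("^/+", "");
        Path resolved = baseDir.resolve(relative).normalize();
        // Refuse anything that ends up outside the base directory.
        if (!resolved.startsWith(baseDir.normalize())) {
            throw new IllegalArgumentException("Path escapes base directory: " + captured);
        }
        return resolved;
    }

    public static void main(String[] args) {
        Path base = Paths.get("/srv/files"); // hypothetical base directory
        System.out.println(resolveUnder(base, "/folder_a/folder_aa"));
    }
}
```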

Different RESTful representations of the same resource

My application has a resource at /foo. Normally, it is represented by an HTTP response payload like this:
{"a": "some text", "b": "some text", "c": "some text", "d": "some text"}
The client doesn't always need all four members of this object. What is the RESTfully semantic way for the client to tell the server what it needs in the representation? e.g. if it wants:
{"a": "some text", "b": "some text", "d": "some text"}
How should it GET it? Some possibilities (I'm looking for correction if I misunderstand REST):
GET /foo?sections=a,b,d.
The query string (called a query string after all) seems to mean "find resources matching this condition and tell me about them", not "represent this resource to me according to this customization".
GET /foo/a+b+d My favorite if REST semantics doesn't cover this issue, because of its simplicity.
Breaks URI opacity, violating HATEOAS.
Seems to break the distinction between resource (the sole meaning of a URI is to identify one resource) and representation. But that's debatable because it's consistent with /widgets representing a presentable list of /widget/<id> resources, which I've never had a problem with.
Loosen my constraints, respond to GET /foo/a, etc, and have the client make a request per component of /foo it wants.
Multiplies overhead, which can become a nightmare if /foo has hundreds of components and the client needs 100 of those.
If I want to support an HTML representation of /foo, I have to use Ajax, which is problematic if I just want a single HTML page that can be crawled, rendered by minimalist browsers, etc.
To maintain HATEOAS, it also requires links to those "sub-resources" to exist within other representations, probably in /foo: {"a": {"url": "/foo/a", "content": "some text"}, ...}
GET /foo, Content-Type: application/json and {"sections": ["a","b","d"]} in the request body.
Unbookmarkable and uncacheable.
HTTP does not define body semantics for GET. It's legal HTTP but how can I guarantee some user's proxy doesn't strip the body from a GET request?
My REST client won't let me put a body on a GET request so I can't use that for testing.
A custom HTTP header: Sections-Needed: a,b,d
I'd rather avoid custom headers if possible.
Unbookmarkable and uncacheable.
POST /foo/requests, Content-Type: application/json and {"sections": ["a","b","d"]} in the request body. Receive a 201 with Location: /foo/requests/1. Then GET /foo/requests/1 to receive the desired representation of /foo
Clunky; requires back-and-forth and some weird-looking code.
Unbookmarkable and uncacheable since /foo/requests/1 is just an alias that would only be used once and only kept until it is requested.
I would suggest the querystring solution (your first). Your arguments against the other alternatives are good arguments (and ones that I've run into in practise when trying to solve the same problem). In particular, the "loosen the constraints/respond to foo/a" solution can work in limited cases, but introduces a lot of complexity into an API from both implementation and consumption and hasn't, in my experience, been worth the effort.
I'll weakly counter your "seems to mean" argument with a common example: consider the resource that is a large list of objects (GET /Customers). It's perfectly reasonable to page these objects, and it's commonplace to use the querystring to do that: GET /Customers?offset=100&take=50 as an example. In this case, the querystring isn't filtering on any property of the listed object, it's providing parameters for a sub-view of the object.
More concretely, I'd say that you can maintain consistency and HATEOAS through these criteria for use of the querystring:
the object returned should be the same entity as that returned from the Url without the querystring.
the Uri without the querystring should return the complete object - a superset of any view available with a querystring at the same Uri. So, if you cache the result of the undecorated Uri, you know you have the full entity.
the result returned for a given querystring should be deterministic, so that Uris with querystrings are easily cacheable
However, what to return for these Uris can sometimes pose more complex questions:
returning a different entity type for Uris differing only by querystring could be undesirable (/foo is an entity but foo/a is a string); the alternative is to return a partially-populated entity
if you do use different entity types for sub-queries then, if your /foo doesn't have an a, a 404 status is misleading (/foo does exist!), but an empty response may be equally confusing
returning a partially-populated entity may be undesirable, but returning part of an entity may not be possible, or may be more confusing
returning a partially populated entity may not be possible if you have a strong schema (if a is mandatory but the client requests only b, you are forced to return either a junk value for a, or an invalid object)
In the past, I have tried to resolve this by defining specific named "views" of required entities, and allowing a querystring like ?view=summary or ?view=totalsOnly - limiting the number of permutations. This also allows for definition of a subset of the entity that "makes sense" to the consumer of the service, and can be documented.
Ultimately, I think that this comes down to an issue of consistency more than anything: you can meet HATEOAS guidance using the querystring relatively easily, but the choices you make need to be consistent across your API and, I'd say, well documented.
I've decided on the following:
Supporting few member combinations: I'll come up with a name for each combination. e.g. if an article has members for author, date, and body, /article/some-slug will return all of it and /article/some-slug/meta will just return the author and date.
Supporting many combinations: I'll separate member names by hyphens: /foo/a-b-c.
Either way, I'll return a 404 if the combination is unsupported.
Architectural constraint
REST
Identifying resources
From the definition of REST:
a resource R is a temporally varying membership function MR(t), which for time t maps to a set of entities, or values, which are equivalent. The values in the set may be resource representations and/or resource identifiers.
A representation being an HTTP body and an identifier being a URL.
This is crucial. An identifier is just a value associated with other identifiers and representations. That's distinct from the identifier→representation mapping. The server can map whatever identifier it wants to any representation, as long as both are associated by the same resource.
It's up to the developer to come up with resource definitions that reasonably describe the business by thinking of categories of things like "users" and "posts".
HATEOAS
If I really care about perfect HATEOAS, I could put a hyperlink somewhere in the /foo representation to /foo/members, and that representation would just contain a hyperlink to every supported combination of members.
HTTP
From the definition of a URL:
The query component contains non-hierarchical data that, along with data in the path component, serves to identify a resource within the scope of the URI's scheme and naming authority (if any).
So /foo?sections=a,b,d and /foo?sections=b are distinct identifiers. But they can be associated within the same resource while being mapped to different representations.
HTTP's 404 code means that the server couldn't find anything to map the URL to, not that the URL is not associated with any resource.
Functionality
No browser or cache will ever have trouble with slashes or hyphens.
Actually it depends on the functionality of the resource.
If for example the resource represents an entity:
/customers/5
Here the '5' represents an id of the customer
Response:
{
"id": 5,
"name": "John",
"surename": "Doe",
"marital_status": "single",
"sex": "male",
...
}
If we examine it closely, each JSON property represents a field of the record behind the customer resource instance.
Let's assume the consumer would like to get a partial response, meaning only some of the fields. In other words, the consumer wants to be able to select, via the request, the fields of interest and nothing more (to save traffic, or to improve performance if some of the fields are expensive to compute).
I think in this situation the most readable and correct API would be (for example, to get only name and surename):
/customers/5?fields=name,surename
Response:
{
"name": "John",
"surename": "Doe"
}
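A minimal server-side sketch of such a fields parameter in plain Java could look like this. The entity is modelled as a simple map, the class and method names are made up, and an unknown field is reported with an exception so that the caller can decide which status code to send.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FieldProjectionSketch {

    /** Returns only the requested fields of the entity, in the requested order. */
    static Map<String, Object> project(Map<String, Object> entity, String fieldsParam) {
        Map<String, Object> partial = new LinkedHashMap<>();
        for (String field : fieldsParam.split(",")) {
            String name = field.trim();
            if (!entity.containsKey(name)) {
                // Left to the caller to translate into an HTTP error response.
                throw new IllegalArgumentException("Unknown field: " + name);
            }
            partial.put(name, entity.get(name));
        }
        return partial;
    }

    public static void main(String[] args) {
        Map<String, Object> customer = new LinkedHashMap<>();
        customer.put("id", 5);
        customer.put("name", "John");
        customer.put("surename", "Doe");
        customer.put("marital_status", "single");
        System.out.println(project(customer, "name,surename"));
    }
}
```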
HTTP/1.1
if an illegal field name is requested, 404 (Not Found) is returned
if different field names are requested, different responses are generated, which also fits well with caching
Cons: if the same fields are requested but in a different order (say: fields=id,name or fields=name,id), the responses will be cached separately even though they are the same.
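One possible way to soften that drawback, if you control the server or a caching layer in front of it, is to normalize the fields value before using it as a cache key (or to redirect to the normalized URL). This is only a sketch with hypothetical names, not something HTTP caches do on their own.

```java
import java.util.TreeSet;

final class FieldsKeySketch {

    /** Sorts and de-duplicates a comma-separated fields value. */
    static String canonicalFields(String fieldsParam) {
        TreeSet<String> sorted = new TreeSet<>();
        for (String field : fieldsParam.split(",")) {
            String name = field.trim();
            if (!name.isEmpty()) {
                sorted.add(name);
            }
        }
        return String.join(",", sorted);
    }

    public static void main(String[] args) {
        // Both orderings collapse to the same key.
        System.out.println(canonicalFields("id,name"));
        System.out.println(canonicalFields("name,id"));
    }
}
```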
HATEOAS
In my opinion, pure HATEOAS is not suitable for solving this particular problem, because you would need a separate resource for every permutation of field combinations. That is overkill, as it bloats the API extensively (with, say, 8 fields in a resource, you would need one resource per permutation!).
If you model resources only for the individual fields but not for all the permutations, it has performance implications, since you want to keep the number of round trips to a minimum.
If a, b and c are properties of the resource (like admin for a role property), the right way is the first one you suggested, GET /foo?sections=a,b,d, because in this case you would apply a filter to the foo collection. Otherwise, if a, b and c are individual resources of the foo collection, the way to go is a series of GET requests: /foo/a, /foo/b, /foo/c. This approach, as you said, has a high cost per request, but it is the correct way to follow the RESTful approach. I would not use your second proposal, because the plus character has a special meaning in a URL.
Another proposal is to abandon GET, use POST, and create an action for the foo collection, such as /foo/filter or /foo/selection or any verb that represents an action on the collection. This way, with a POST request body, you can pass a JSON list of the resources you want.
You could use a second vendor media type in the request header, such as application/vnd.com.mycompany.resource.rep2. You can't bookmark this, however. Query parameters are not cacheable (/foo?sections=a,b,c); you could take a look at matrix parameters instead, which should be cacheable. On that point, see the question "URL matrix parameters vs. request parameters".

Is it valid to combine a form POST with a query string?

I know that in most MVC frameworks, for example, both query string params and form params will be made available to the processing code, and usually merged into one set of params (often with POST taking precedence). However, is it a valid thing to do according to the HTTP specification? Say you were to POST to:
http://1.2.3.4/MyApplication/Books?bookCode=1234
... and submit some update like a change to the book name whose book code is 1234, you'd be wanting the processing code to take both the bookCode query string param into account, and the POSTed form params with the updated book information. Is this valid, and is it a good idea?
Is it valid according to the HTTP specification?
Yes.
Here is the general syntax of an HTTP URL as defined in that specification:
http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]]
There are no additional constraints on the form of the http_URL. In particular, the HTTP method used (i.e. POST, GET, PUT, HEAD, ...) does not add any restriction on the HTTP URL format.
When using the GET method: the server can consider that the request body is empty.
When using the POST method: the server must handle the request body.
Is it a good idea?
It depends on what you need to do. I suggest this link explaining the ideas behind GET and POST.
I can imagine that in some situations it can be handy to always have some parameters, like the user's language, in the query part of the URL.
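To illustrate the merging behaviour mentioned in the question, here is a framework-free sketch in Java that parses the query string and a form-encoded POST body and lets the body win on conflicts. Real frameworks differ in the details (for example, many keep multiple values per name); the class name, method names and the bookName field are assumptions made for this example.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class MergedParamsSketch {

    /** Parses an application/x-www-form-urlencoded string into single-valued parameters. */
    static Map<String, String> parse(String encoded) {
        Map<String, String> params = new LinkedHashMap<>();
        if (encoded == null || encoded.isEmpty()) {
            return params;
        }
        for (String pair : encoded.split("&")) {
            String[] kv = pair.split("=", 2);
            String key = URLDecoder.decode(kv[0], StandardCharsets.UTF_8);
            String value = kv.length > 1 ? URLDecoder.decode(kv[1], StandardCharsets.UTF_8) : "";
            params.put(key, value);
        }
        return params;
    }

    /** Merges query-string and form-body parameters; body values take precedence. */
    static Map<String, String> merge(String query, String formBody) {
        Map<String, String> merged = parse(query);
        merged.putAll(parse(formBody)); // overwrites query values that share a name
        return merged;
    }

    public static void main(String[] args) {
        // POST /MyApplication/Books?bookCode=1234 with a form body carrying the new name
        System.out.println(merge("bookCode=1234", "bookName=New+Title"));
    }
}
```

Here bookCode identifies the book via the URL while the updated data travels in the body, which is the combination described in the question.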
I know that in most MVC frameworks, for example, both query string params and form params will be made available to the processing code, and usually merged into one set of params (often with POST taking precedence).
Any competent framework should support this.
Is this valid
Yes. The POST method in HTTP does not impose any restrictions on the URI used.
is it a good idea?
Obviously not, if the framework you are going to use is still clue-challenged. Otherwise, it depends on what you want to accomplish. The major use case (redirection of a data subset to a new POST target) has been irretrievably broken by browser implementations (all mechanically following the broken lead of Mosaic/Netscape), so the considerations here are mostly theoretical.

Resources