Using duplicate parameters in a URL - standards

We are building an API in-house and often are passing a parameter with multiple values.
They use: mysite.com?id=1&id=2&id=3
Instead of: mysite.com?id=1,2,3
I favor the second approach but I was curious if it was actually incorrect to do the first?

I'm not an HTTP guru, but from what I understand there's not a definitive standard on the query part of the URL regarding multiple values, it's typically up to the CGI that handles the request to parse the query string.
RFC 1738 section 3.3 mentions a searchpart and that it should go after the ? but doesn't seem to elaborate on its format.
http://<host>:<port>/<path>?<searchpart>

I did not check which RFC defines it. (If anyone knows, please leave a reference in the comments.) But in practice, mysite.com?id=1&id=2&id=3 is already what a browser produces when a form contains duplicated fields, typically checkboxes. See it in action in this w3schools example page. So there is a good chance that whatever programming language you are using already provides helper functions to parse input like that, probably returning a list.
You could, of course, go with your own approach such as mysite.com?id=1,2,3, which is not bad at all in this particular case. But you will need to implement your own logic to produce and consume that format. You may also need to handle some corner cases yourself, such as: what if the input is not well-formed, like mysite.com?id=1,2,? Do you need to invent yet another separator if the comma itself can be valid input, like mysite.com?name=Doe,John|Doe,Jane? Would you reach the point of using a JSON string as the value, like mysite.com?name=["John Doe", "Jane Doe"]? And so on. Your mileage may vary.
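As a rough illustration of the difference, here is a minimal sketch using Python's standard urllib.parse module (Python is just my choice for the example; the variable names are made up). Repeated keys come back as a list for free, while the comma-separated form arrives as one string that you have to split and clean up yourself:

```python
from urllib.parse import parse_qs, urlencode

# Repeated keys: the standard parser collects them into a list.
repeated = parse_qs("id=1&id=2&id=3")

# Comma-separated: the parser sees a single value, so splitting
# (and dropping empty parts from input like "1,2,") is our job.
joined = parse_qs("id=1,2,3")
ids_from_joined = [part for part in joined["id"][0].split(",") if part]

# Producing the repeated form from a list is also built in.
query = urlencode({"id": [1, 2, 3]}, doseq=True)

print(repeated, joined, ids_from_joined, query)
```

Other languages and frameworks have similar helpers, but whether they return a list, the first value or the last value varies, so check before relying on it.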

Worth adding that inconsistent handling of duplicate URL parameters on the server may lead to vulnerabilities, specifically server-side HTTP parameter pollution. For a practical example, see Client side Http Parameter Pollution - Yahoo! Classic Mail Video Poc.
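To show how the inconsistency can bite, here is a small sketch in Python. The two helper functions and the "action" parameter are hypothetical; they stand in for two components (say, a filter and a backend) that disagree on which duplicate wins:

```python
from urllib.parse import parse_qsl

def first_value(query, key):
    """Return the first value for key, as some components do."""
    for name, value in parse_qsl(query):
        if name == key:
            return value
    return None

def last_value(query, key):
    """Return the last value for key, as other components do."""
    found = None
    for name, value in parse_qsl(query):
        if name == key:
            found = value
    return found

query = "action=view&action=delete"
checked_by_filter = first_value(query, "action")    # what gets validated
executed_by_backend = last_value(query, "action")   # what actually runs

print(checked_by_filter, executed_by_backend)
```

If the validating layer and the executing layer read different duplicates, the value that was checked is not the value that is used.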

In your first approach you will get an array of query string values, but in the second approach you will get a single string of query string values.

I guess it depends on the technology you use and what is more convenient there. I am currently facing the same question, choosing between currency=USD,CHF and currency=USD&currency=CHF.
I am using Thymeleaf, and the second option makes it easy to work with: I can then write something like ${param.currency.contains(currency.value)}. When I try the first option, the "array" seems to be treated as a string, so I need to split first and then call contains, which leads to messier code.
Just my 50 cents :-)

Related

Pact matching non JSON body

Is there any way of matching non JSON bodies (either XML, byte or whatever). Looking for the Python solution, however will appreciate any ideas behind that (even monkeypatching).
It's possible, but not directly supported.
Currently there's only the ability to match JSON. You can fake non-JSON matching by expecting a string body, but then you won't be able to use Pact's built-in matchers, which might mean your tests will be data dependent unless you do a bit of legwork.
There is a stub for XML support, but it's not currently implemented.
If you're willing to get your hands dirty in Ruby (not that different to Python!) you can write your own matcher. I can show you how to configure the pact-provider-verifier to use the custom matching code. Currently, if you use a content type that is not JSON, as J_A_X says, it will do an exact string diff.

When would you use 'Real' translation messages in Symfony2?

The Symfony documentation says:
Using Real or Keyword Messages
This example illustrates the two different philosophies when creating messages to be translated:
$translated = $translator->trans('Symfony2 is great');
$translated = $translator->trans('symfony2.great');
< snip >
The choice of which method to use is entirely up to you, but the "keyword" format is often recommended.
http://symfony.com/doc/current/book/translation.html
So when would you use 'Real' messages?
You really have to decide for yourself. It's partly a matter of taste and partly a matter of your translation workflow.
Real messages are good when you don't want the overhead of maintaining an additional translation file (for the source language). Furthermore, if you forget to translate some of the messages, you'd still see a valid message in the source language. It's also somewhat easier to translate from an original message than from a keyword.
Keywords are better when messages are changing often, especially with long texts. You abstract away the purpose of a message from the actual text.
EDIT: there's one more scenario where you could argue that real messages are better than keys: when your website supports only one language but with multiple variations, like en_GB and en_US. Most of the messages will be the same, and only a few will vary. So most of the messages could be left as they are, and only the ones that actually differ between GB and US need to go into a translation file. That requires much less work than an approach using keys (assuming, of course, your messages don't change very often).
One use case for the real format I could come up with is when messages are created by users via the UI: it would be silly to force them to come up with keywords for each phrase they want to translate.
I haven't had such a need yet, so I always use the keyword format.
For the most part I agree with #Jakub Zalas' answer, however, the last line is a bit off.
Keywords are better when messages may ever change - not just when changing often. This is outlined as well in the docs themselves:
The second method is handy because the message key won't need to be changed in every translation file if you decide that the message should actually read "Symfony2 is really great" in the default locale.
If the message changes and you have used the message itself as the key rather than a keyword, you have to change every piece of code that uses this message to reflect the change. More places to change mean more potential bugs. Message keys give us leverage here.
Real messages offer little benefit. IMO you can use them if you are sure your application will always be monolingual and you want to save a few minutes in development.
The keyword format has the advantage that, if you have to translate your website, you will see immediately when a translation is missing.
To facilitate translations, I personally use JMSTranslationBundle.
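The trade-off discussed in these answers can be sketched with a toy lookup. This is Python rather than PHP and is not Symfony's API: trans here is a hypothetical function that, like most translators, returns the message id unchanged when no translation exists, and the catalog is made up:

```python
def trans(message_id, catalog):
    """Return the translation, or the id itself when none exists."""
    return catalog.get(message_id, message_id)

# Hypothetical French catalog with a single entry.
french = {"symfony2.great": "Symfony2 est génial"}

# Keyword id with an entry: translated as expected.
translated = trans("symfony2.great", french)

# Real message without an entry: falls back to readable source text.
fallback_real = trans("Symfony2 is great", french)

# Keyword without an entry: the raw key shows up, so the gap is obvious.
fallback_key = trans("symfony2.missing", french)

print(translated, fallback_real, fallback_key)
```

So a missing translation is silent with real messages (the user still sees sensible text) and loud with keywords (the raw key appears on the page).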

ASP.NET: Using Request["param"] versus using Request.QueryString["param"] or Request.Form["param"]

When accessing a form or query string value from code-behind in ASP.NET, what are the pros and cons of using, say:
// short way
string p = Request["param"];
instead of:
// long way
string p = Request.QueryString["param"]; // if it's in the query string or
string p = Request.Form["param"]; // for posted form values
I've thought about this many times, and come up with:
Short way:
Shorter (more readable, easier for newbies to remember, etc.)
Long way:
No problems if there is a form value and a query string value with the same name (though that's not usually an issue)
Someone reading the code later knows whether to look in URLs or form elements to find the source of the data (probably the most important point)
So what other advantages/disadvantages are there to each approach?
The combined collection includes all four collections:
Query-string parameters
Form fields
Cookies
Server variables
You could argue that searching the combined collection is slower than looking in a specific one, but the difference is negligible.
The long way is better because:
It makes it easier (when reading the code later) to find where the value is coming from (improving readability)
It's marginally faster (though this usually isn't significant, and only applies to first access)
In ASP.NET (as well as the equivalent concept in PHP), I always use what you are calling the "long form." I do so out of the principle that I want to know exactly from where my input values are coming, so that I am ensuring that they get to my application the way I expect. So, it's for input validation and security that I prefer the longer way. Plus, as you suggest, I think the maintainability is worth a few extra keystrokes.
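The name-collision concern can be illustrated with a small model. This is a Python sketch, not ASP.NET code: the four dictionaries are hypothetical stand-ins for the request collections, and it assumes the combined lookup searches them in the order listed above (query string, form, cookies, server variables):

```python
from collections import ChainMap

# Hypothetical request data; "param" exists in both the URL and the form.
query_string = {"param": "from-url"}
form = {"param": "from-form", "comment": "hello"}
cookies = {"session": "abc"}
server_variables = {"REMOTE_ADDR": "127.0.0.1"}

# ChainMap returns the value from the first mapping that has the key,
# which models a combined lookup across all four collections.
combined = ChainMap(query_string, form, cookies, server_variables)

short_way = combined["param"]   # whichever collection wins the search
long_way = form["param"]        # explicitly the posted form value

print(short_way, long_way)
```

With the short way, a value appended to the URL shadows the posted form field of the same name; with the long way, the source is fixed by the code.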

REST - Modify Part of Resource - PUT or POST

I'm seeing a good bit of hand-waving on the subject of how to update only part of a resource (eg. status indicator) using REST.
The options seem to be:
1. Complain that HTTP doesn't have a PATCH or MODIFY command. However, the accepted answer on HTTP MODIFY verb for REST? does a good job of showing why that's not as good an idea as it might seem.
2. Use POST with parameters and identify a method (e.g. a parameter named "action"). Some suggestions are to specify an X-HTTP-Method-Override header with a self-defined method name. That seems to lead to the ugliness of switching within the implementation based on what you're trying to do, and to be open to the criticism of not being a particularly RESTful way to use POST. In fact, taking this approach starts to feel like an RPC-type interface.
3. Use PUT to over-write a sub-resource of the resource which represents the specific attribute(s) to update. In fact, this is effectively an over-write of the sub-resource, which seems in line with the spirit of PUT.
At this point, I see #3 as the most reasonable option.
Is this a best practice or an anti-pattern? Are there other options?
There are two ways to view a status update.
An update to a thing. That's a PUT. Option 3.
Adding an additional log entry to the history of the thing. The last item in this sequence of log entries is the current status. That's a POST. Option 2.
If you're a data warehousing or functional programming type, you tend to be mistrustful of status changes, and like to POST a new piece of historical fact to a static, immutable thing. This does require distinguishing the thing from the history of the thing; leading to two tables.
Otherwise, you don't mind an "update" to alter the status of a thing and you're happy with a PUT. This does not distinguish between the thing and its history, and keeps everything in one table.
Personally, I'm finding that I'm less and less trustful of mutable objects and PUTs (except for "error correction"). (And even then, I think the old thing can be left in place and the new thing added with a reference to the previous version of itself.)
If there's a status change, I think there should be a status log or history and there should be a POST to add a new entry to that history. There may be some optimization to reflect the "current" status in the object to which this applies, but that's just behind-the-scenes optimization.
Option 3 (PUT to some separated sub-resource) is your best bet right now, and it wouldn't necessarily be "wrong" to just use POST on the main resource itself - although you could disagree with that depending on how pedantic you want to be about it.
Stick with 3 and use more granular sub-resources, and if you really do have a need for PATCH-like behavior - use POST. Personally, I will still use this approach even if PATCH does actually end up as a viable option.
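For what option 3 might look like from the client side, here is a minimal Python sketch using only the standard library. The host, the /orders/{id}/status path and the JSON body are all assumptions for illustration; the request is only built here, not sent:

```python
import json
from urllib.request import Request

def build_status_update(base_url, order_id, status):
    """Build a PUT that overwrites only the status sub-resource."""
    body = json.dumps({"status": status}).encode("utf-8")
    return Request(
        f"{base_url}/orders/{order_id}/status",  # hypothetical sub-resource
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )

request = build_status_update("https://api.example.com", 42, "shipped")
print(request.get_method(), request.full_url, request.data)
```

The whole representation of the sub-resource is replaced, so the call stays idempotent in the way PUT is meant to be, while the parent resource is left alone.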
HTTP does have a PATCH command. It was defined in Section 19.6.1.1 of RFC 2068 and updated in draft-dusseault-http-patch-16, which has since been published as RFC 5789.
It's OK to POST, and to emulate PATCH where it is not available
Before explaining this, it's probably worth mentioning that there's nothing wrong with using POST to do general updates (see here). In particular:
POST only becomes an issue when it is used in a situation for which some other method is ideally suited: e.g., retrieval of information that should be a representation of some resource (GET), complete replacement of a representation (PUT)
Really we should be using PATCH to make small updates to complex resources, but it isn't as widely available as we'd like. We can emulate PATCH by using an additional attribute as part of a POST.
Our service needs to be open to third-party products such as SAP, Flex, Silverlight, Excel etc. That means that we have to use the lowest common denominator technology - for a while we weren't able to use PUT because only GET and POST were supported across all the client technologies.
The approach that I've gone with is to have a "_method=patch" as part of a POST request. The benefits are:
(a) It's easy to deal with on the server side - we're basically pretending that PATCH is available
(b) It indicates to third-parties that we are not violating REST but working around a limitation with the browser. It's also consistent with how PUT was handled a few years back by the Rails community so should be comprehensible by many
(c) It's easy to replace when PATCH becomes more widely available
(d) It's a pragmatic response to an awkward problem.
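On the server side, the "_method" trick can be handled in one small step before routing. This is a framework-free Python sketch; the function name and the allow-list are my own assumptions, not part of any particular library:

```python
# Methods a POST is allowed to stand in for (an assumed allow-list).
ALLOWED_OVERRIDES = {"PUT", "PATCH", "DELETE"}

def effective_method(http_method, params):
    """Return the method to route on, honouring _method only for POST."""
    override = params.get("_method", "").upper()
    if http_method.upper() == "POST" and override in ALLOWED_OVERRIDES:
        return override
    return http_method.upper()

print(effective_method("POST", {"_method": "patch"}))
print(effective_method("GET", {"_method": "patch"}))
```

Restricting the override to POST and to a fixed allow-list keeps a stray "_method" on a GET from turning a safe request into a modifying one.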
PATCH is fine for patch or diff formats. Until then it's not very useful at all.
As for your solution 2 with a custom method, be it in the request or in the headers, no no no no and no, it's awful :)
The only valid ways are to PUT the whole resource with the sub-data modified, to POST to that resource, or to PUT to a sub-resource.
It all depends on the granularity of your resources and the intended consequences on caching.
A bit late with an answer but I would consider using JSON Patch for scenarios like this.
At the core of it, it requires two copies of the resource (the original and the modified) and performs a diff on them. The outcome of the diff is an array of patch operations describing the difference.
An example of this:
[
{ "op": "replace", "path": "/baz", "value": "boo" },
{ "op": "add", "path": "/hello", "value": ["world"] },
{ "op": "remove", "path": "/foo" }
]
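To make the operations above concrete, here is a deliberately tiny Python sketch that applies them to a document. It only handles add, replace and remove on top-level keys, and it does not perform the existence checks a real JSON Patch (RFC 6902) implementation would; the original document is an assumed example:

```python
import copy

def apply_patch(document, operations):
    """Apply add/replace/remove operations with single-segment paths."""
    result = copy.deepcopy(document)  # leave the caller's copy untouched
    for operation in operations:
        key = operation["path"].lstrip("/")
        if operation["op"] in ("add", "replace"):
            result[key] = operation["value"]
        elif operation["op"] == "remove":
            del result[key]
        else:
            raise ValueError(f"unsupported op: {operation['op']}")
    return result

original = {"baz": "qux", "foo": "bar"}  # assumed starting document
patch = [
    {"op": "replace", "path": "/baz", "value": "boo"},
    {"op": "add", "path": "/hello", "value": ["world"]},
    {"op": "remove", "path": "/foo"},
]
patched = apply_patch(original, patch)
print(patched)
```

For anything beyond a demo (nested paths, array indices, move/copy/test operations), a maintained library is the safer choice.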
There are many client libraries that can do the hard lifting in generat

Should I use Request.Params instead of explicitly doing Request.Form?

I have been using Request.Form for all my code. And if I need querystring I hit that explicitly too. It came up in a code review that I should probably use the Params collection instead.
I thought it was a best practice, to hit the appropriate collection directly. I am looking for some reinforcement to one side or the other of the argument.
It is more secure to use Request.Form. This will prevent users from "experimenting" with posted form parameters simply by changing the URL. Using Request.Form doesn't make this secure for "real hackers", but IMHO it's better to use the Form collection.
By using the properties under the request you are narrowing down your retrieval to the proper collection (which is a good thing for readability and performance). I consider your approach to be a best practice and follow it myself.
I have always used
Request.Form("Param")
or
Request.QueryString("Param")
This is purely down to a syntax which is easier to read. I seriously doubt there is a performance impact.
The only time I use Request.Params instead of Form or QueryString is when I don't know the method by which the parameters will be passed in.
To put that in context, in 10 years I have used Request.Params in anger only once :)
Kindness,
D
I think it's better to use the Form and QueryString collections explicitly unless you're explicitly trying to define flexible behavior in your application like in a search form where you might want to have the search parameters definable in a URL or saved in cookies such as pagination preferences.
I would use Request.Form and Request.QueryString explicitly. The reason is that the two are not interchangeable. The query string is used for HTTP GET requests, and form variables for HTTP POST requests.
GET requests are typically applicable where you are requesting data, e.g. when you do a Google search, the search words are in the query string. POST requests are for when you are sending data to the web server for processing or storing. So when I say that the two are not interchangeable, I mean that you cannot change the page from using a GET to a POST without breaking functionality.
So IMHO, the implementation of the page can quite clearly reflect the fact that you intend it to be called by a GET or a POST request.
/Pete

Resources