I've created a Web API project in ASP.Net, and am having some trouble getting the authentication working.
The API is expecting a token to be submitted in the Authorization header in each request. The code that checks to see if the header is set checks if the
HttpRequestMessage.Headers.Authorization
property is null. The first few times I tested this, I discovered that this property was always null, but the strange part is that if you checked the HttpRequestMessage.Headers enumerable, the Authorization header WAS set correctly (also if you did HttpRequestMessage.Headers.ToString(), it would appear there too).
Stranger still, I found that if I removed some of the attributes that are sent in the token, I could get it to work as expected. So it was as though the Authorization property wasn't being set if the header value's character length was too long. Unfortunately, even when manually removing some of the text from the token, it would then proceed to fail on a digest check, as it should!
I can't find any documentation that mentions this, so I was wondering if anyone else has come across it? I don't think the header is too long for IIS, because the header value appears in HttpRequestMessage.Headers.ToString(), so it IS being received, but for some reason it's not being assigned to the Authorization property.
Unfortunately I can't rewrite the code that checks this property (this seems the easy solution) because it's a part of the Thinktecture library (i.e. not written by ourselves).
If you are passing the parameters on a GET, you will be limited to about 2100 characters. The RFC itself does not set a limit, so the actual limit differs between implementations. Most browsers limit you to 2083 characters. You can definitely get away with 1000 characters.
Microsoft
Pretty much everybody else
If you are passing the parameters on a POST, you should have virtually unlimited lengths.
For a while I was (wrongly) thinking that a RESTful API just exposed CRUD operations on persisted entities for a web application. When you code something up in "the real world" you soon find out that this is not enough. For example, a bank account transfer doesn't have to be a persisted entity. It could be a transient resource where you POST to /transfers/ and in the payload you specify the details:
{"accountToCredit":1234, "accountToDebit":5678, "amount":10}
Using POST here makes sense because it changes the state on the server ($10 moves from one account to another every time this POST occurs).
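As a rough illustration of the transfer request described above, here is a minimal Python sketch that builds (but does not send) such a POST. The host api.example.com and the helper name build_transfer_request are assumptions made for the example; only the /transfers/ path and the payload fields come from the question.

```python
import json
import urllib.request


def build_transfer_request(account_to_credit, account_to_debit, amount):
    """Build a POST request for a transient 'transfer' resource (not sent)."""
    payload = {
        "accountToCredit": account_to_credit,
        "accountToDebit": account_to_debit,
        "amount": amount,
    }
    return urllib.request.Request(
        "https://api.example.com/transfers/",  # hypothetical host
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_transfer_request(1234, 5678, 10)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Every time this request is sent, the server state changes, which is why POST rather than GET is the natural fit.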
What should happen in the case where it doesn't affect the server? The simple first answer would be to use GET. For example, you want to get a list of savings and checking accounts that have less than $100. You would then call something like GET to /accounts/searchResults?minBalance=0&maxBalance=100. What happens, though, if your search parameters need to use complex objects that wouldn't fit in the maximum length of a GET request?
My first thought was to use POST, but after thinking about it some more it should probably be a PUT since it isn't changing the state of the server, but from my (limited) understanding I always thought of PUT as updating a resource and POST as creating a resource (like creating these search results). So which should be used in this case?
I found the following links which provide some information but it wasn't clear to me what should be used in the different cases:
Transient REST Representations
How to design RESTful search/filtering?
RESTful URL design for search
I would agree with your approach, it seems reasonable to me to use GET when searching for resources, and as said in one of your provided links, the whole point of query strings is for doing things like search. I also agree that PUT fits better when you want to update some resource in an idempotent way (no matter how many times you hit the request, the result will be the same).
So generally, I would do it as you propose. Now, if you are limited by the maximum length of GET request, then you could use POST or PUT, passing your parameters in a JSON, in a URI like:
PUT /api/search
You could see this as a "search resource" where you send new parameters. I know it seems like a workaround and you may be worried that REST is about avoiding verbs in the URIs. Well, there are a few cases where it's still acceptable and RESTful to use verbs, e.g. in cases where calculation or conversion is involved to generate the result (for more about this, check this reference).
PS. I think this workaround is still RESTful, but even if it wasn't, REST isn't an obsession and an ultimate goal. Being pragmatic and keeping a clean API design might be a better approach, even if in a few cases you are not RESTful.
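The fallback described in this answer could be sketched roughly as follows in Python: use GET with a query string while the URL stays short, and switch to a JSON body once it does not. The 2000-character budget, the host name, and the function name are assumptions for illustration, and POST is used here although the answer allows PUT as well.

```python
import json
from urllib.parse import urlencode

MAX_URL_LENGTH = 2000  # assumed conservative budget, not a value from any spec


def build_search_request(base_url, params):
    """Return (method, url, body): GET when the URL fits, else a JSON body."""
    url = base_url + "?" + urlencode(params)
    if len(url) <= MAX_URL_LENGTH:
        return "GET", url, None
    # Too long for a query string: send the same parameters as a JSON body.
    return "POST", base_url, json.dumps(params).encode("utf-8")


small = build_search_request(
    "https://api.example.com/api/search",  # hypothetical host
    {"minBalance": 0, "maxBalance": 100},
)
large = build_search_request(
    "https://api.example.com/api/search",
    {"accountIds": ",".join(str(n) for n in range(1000))},
)
print(small[0], small[1])
print(large[0], len(large[2]), "bytes in body")
```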
I've got a site that uses an order entry form and sends a rather decently sized POST request when the form is submitted.
However, when a particular value is passed in one of our form variables (OrderDetail), every time without fail, it gets an error page in the browser and a 504 error via Fiddler.
Here are a couple examples of tests I ran last night sending POST requests through Fiddler. When the "OrderDetail=" value is changed to the below it will either submit successfully or return a 504 error after a few seconds:
These ones FAIL:
&OrderDetail=Deliver+Writ+of+Execution%3B+and+Application+for+Earnings+Withholding+Order+to+Los+Angeles+County+Sheriff+DASH+Court+Services+Division+per+instructions
&OrderDetail=Deliver+Execution+Earnings+Withholding+Order+to+Los+Angeles+County+Sheriff+DASH+Court+Services+Division+per+instructions
&OrderDetail=Deliver+Writ+of+Execution%3B+and+Application+for+Earnings+Withholding+Order+to+Los+Angeles+County+Sheriff
&OrderDetail=Deliver+Writ+of+Execution%3B+Application+for+Earnings+Withholding+Order+to+Los+Angeles+County+Sheriff
&OrderDetail=Writ+of+Withholding+Execution+Order+Los+Angeles+County+Sheriff
&OrderDetail=writ+Execution+adsfsdfsdfsd+Order+County
&OrderDetail=wd+Execution+adsfsdfsdfsd+Order+Count
This got me thinking that perhaps it has to do with the words "Exec" ('Exec' and 'Execution' throw errors, 'Exe' does not) and "Count" ('County' and 'Count' throw errors, 'Cont' does not)
However, I haven't seen anything this specific mentioned in Google searches regarding the 504 error.
Regarding the ColdFusion code around this, there is nothing fancy for this page. Just a standard form post. I added a cfmail test in the Application file and on these failures it is never run, so this seems to be between the browser and IIS. We're on a shared server, so I can't see too much there, though.
Oddly enough, when the &OrderDetail= param is changed to one of these values (very similar to the above), the result is success:
&OrderDetail=wd+Execution+adsfsdfsdfsd+Order+Coun
&OrderDetail=wd+Execution+adsfsdfsdfsd+Order+Conty
&OrderDetail=Writ+of+Withholding+Order+Execution+Los+Angeles+County+Sheriff
&OrderDetail=Writ+of+Withholding+ExecutionOrder+Los+Angeles+County+Sheriff
In the 3rd one, I put 'Order' BEFORE 'Execution' and it works..
The total length of this POST request is about 4720 characters. I've increased the length of this one field to 5-6 times its length and they passed, so it almost seems tied to the value of the "&OrderDetail" param in the POST.
Any ideas on why this specific data could be an issue for a web server? I've never seen this before, and it isn't a problem for nearly any other request going through.
One interesting note as well: In the POST request, this variable is pretty close to the start of the param list. If I delete everything after it, it goes with no problem. Although I haven't been able to nail down what in the subsequent lines could be causing it. I can post the entire request if it will help.
More importantly though, I just want to know what could qualify as "reserved" or "illegal" for FORM data. Everything appears to be escaped properly so I'm not sure what else can be done here except for some pre-processing javascript to further escape any such words.
Thanks!
Given that EXEC and COUNT are causing the error, whilst putting ORDER before EXEC is preventing the error, this sounds like something is making a flawed attempt at protecting from SQL injection attacks.
If you have any software in place that claims to do that, I would see if (temporarily) disabling it stops the problem from occurring.
(This software might be at the firewall level, so you may need to talk to your sys admins.)
Importantly, I would also check your codebase for where OrderDetail is used, and make sure that it is using cfqueryparam whenever it is used inside a query - and the same goes for all other user-supplied data.
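cfqueryparam is ColdFusion's way of binding a value to a query instead of concatenating it into the SQL text. The same idea can be sketched in Python with sqlite3 placeholders; the table and column names below are hypothetical, and this is only an illustration of bound parameters, not the site's actual code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_detail TEXT)")

order_detail = (
    "Deliver Writ of Execution; and Application for Earnings "
    "Withholding Order to Los Angeles County Sheriff"
)

# The value is passed separately from the SQL text, so words such as
# "Execution" or "Count" are stored as plain data and never parsed as SQL.
conn.execute("INSERT INTO orders (order_detail) VALUES (?)", (order_detail,))

stored = conn.execute(
    "SELECT order_detail FROM orders WHERE order_detail = ?", (order_detail,)
).fetchone()[0]
print(stored == order_detail)
```

With every user-supplied value bound this way, a keyword filter in front of the server adds little, which makes it easier to argue for relaxing it.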
I am writing a score submission system for games where I need to ensure that reports back to the server are not falsified (aka, hacked).
I know that I can store a password or private passkey in the program to authenticate or encrypt the request but if the program is decompiled, a crafty hacker can extract the password/passkey and use it to falsify reports.
Does a perfect solution exist?
Thanks in advance.
No. All you can do is make it difficult for cheaters.
You don't say what environment you're running on, but it sounds like you're trying to solve a code authentication problem*: knowing that the code that is executing is actually what you think it is. This is a problem that has plagued online games forever and does not have a good solution.
Common ways in which such systems are broken:
Capture, modification and replay of submissions to the server
Modifying the binary to allow cheating
Using a debugger to modify the submission in-memory before the program applies signatures/encryption/whatever
Punkbuster is an example of a system which attempts to solve some of these problems: http://en.wikipedia.org/wiki/PunkBuster
Also consider http://en.wikipedia.org/wiki/Cheating_in_online_games
Chances are, this is too hard for your game. Hiding a private key in your binary and signing everything that leaves it will probably put you well ahead of the pack, security-wise.
* Apologies, I don't actually remember what the formal name for this is. I keep thinking "running code authentication", but Google comes up with nothing for the term.
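The "sign everything that leaves the binary" idea can be sketched in Python as below. Note the swap: this uses a symmetric HMAC with a shared secret rather than an asymmetric signature, purely to keep the example short and dependency-free. The key value and function names are assumptions, and the limitation the question worries about still applies, since the key ships with the client.

```python
import hashlib
import hmac
import json

# Assumption: a shared secret embedded in the client. Anyone who decompiles
# the client can recover it, so this only raises the bar for casual cheaters.
SECRET_KEY = b"hypothetical-embedded-secret"


def sign_submission(player, score):
    """Client side: serialize the report and attach an HMAC-SHA256 tag."""
    body = json.dumps({"player": player, "score": score}, sort_keys=True)
    tag = hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return body, tag


def verify_submission(body, tag):
    """Server side: recompute the tag and compare in constant time."""
    expected = hmac.new(SECRET_KEY, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)


body, tag = sign_submission("alice", 4200)
tampered = body.replace("4200", "999999")
print(verify_submission(body, tag))
print(verify_submission(tampered, tag))
```

This stops a report from being edited in transit, but not a replayed report or a modified client that signs a fake score itself.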
There is one thing you can do - record all of the user inputs and send those to the server as part of the submission. The server can then replay the inputs through a local copy of the game engine to determine the score. Obviously this isn't appropriate for every type of game, though. Depending on the game, you may need to include replay protection.
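A minimal sketch of the input-replay idea, assuming a deterministic toy game invented for the example (the scoring rule and function names are not from the answer): the client submits its inputs along with the claimed score, and the server accepts the score only if its own replay reproduces it.

```python
def replay_score(inputs):
    """Toy engine: each 'hit' scores 10 x the current streak; 'miss' resets it."""
    score, streak = 0, 0
    for event in inputs:
        if event == "hit":
            streak += 1
            score += 10 * streak
        elif event == "miss":
            streak = 0
        else:
            raise ValueError(f"unknown input: {event!r}")
    return score


def accept_submission(claimed_score, inputs):
    """Server side: trust only the score the server's own replay produces."""
    return replay_score(inputs) == claimed_score


inputs = ["hit", "hit", "miss", "hit"]
honest = accept_submission(40, inputs)
forged = accept_submission(9000, inputs)
print(replay_score(inputs), honest, forged)
```

A cheater can still fabricate a plausible input sequence (for example with a bot), so this verifies consistency rather than that a human played.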
Another method that may be appropriate for some types of games is to include a video recording of the high-scoring play within the submission. Provide links to the videos from the high score table, along with a link to report suspicious entries. This will let you "crowd-source" cheat detection - if a cheater's score hits the table at number 1, then the players behind scores 2 through 10 have a pretty big incentive to validate the video for you. If a score is reported enough times, you can check the video yourself and decide if it should be removed (and the user banned).
Ok, I know the difference in purpose. GET is to get some data. Make a request and get data back. POST should be used for CRUD operations other than read I believe. But when it comes down to it, does the server really care if it's receiving a GET vs. POST in the end?
According to the HTTP RFC, GET should not have any side-effects, while POST may have side-effects.
The most basic example of this is that GET is not appropriate for anything like a purchase-transaction or posting an article to a blog, while POST is appropriate for actions-that-have-consequences.
By the RFC, you can hold a user responsible for actions done by POST (such as a purchase), but not for GET actions. 'Bots always use GET for this reason.
From the RFC 2616, 9.1.1:
9.1.1 Safe Methods
Implementors should be aware that the software represents the user in their interactions over the Internet, and should be careful to allow the user to be aware of any actions they might take which may have an unexpected significance to themselves or others.
In particular, the convention has been established that the GET and HEAD methods SHOULD NOT have the significance of taking an action other than retrieval. These methods ought to be considered "safe". This allows user agents to represent other methods, such as POST, PUT and DELETE, in a special way, so that the user is made aware of the fact that a possibly unsafe action is being requested.
Naturally, it is not possible to ensure that the server does not generate side-effects as a result of performing a GET request; in fact, some dynamic resources consider that a feature. The important distinction here is that the user did not request the side-effects, so therefore cannot be held accountable for them.
It does if a search engine is crawling the page, since they will be making GET requests but not POST. Say you have a link on your page:
http://www.example.com/items.aspx?id=5&mode=delete
Without some sort of authorization check performed before the delete, it's possible that Googlebot could come in and delete items from your page.
Since you're the one writing the server software (presumably), then it cares if you tell it to care. If you handle POST and GET data identically, then no, it doesn't.
However, the browser definitely cares. Refreshing or clicking back to a page you got as a response to a POST pops up the little "Are you sure you want to submit data again" prompt, for example.
GET has data limit restrictions based on the sending browser:
The spec for URL length does not dictate a minimum or maximum URL length, but implementation varies by browser. On Windows: Opera supports ~4050 characters, IE 4.0+ supports exactly 2083 characters, Netscape 3 -> 4.78 support up to 8192 characters before causing errors on shut-down, and Netscape 6 supports ~2000 before causing errors on start-up
If you use a GET request to alter back-end state, you run the risk of bad things happening if a webcrawler of some kind traverses your site. Back when wikis first became popular, there were horror stories of whole sites being deleted because the "delete page" function was implemented as a GET request, with disastrous results when the Googlebot came knocking...
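One way to guard against the crawler scenario is to refuse destructive actions on safe methods. The Python sketch below is a toy dispatcher, not a real web framework; the in-memory items store and the handle function are hypothetical.

```python
SAFE_METHODS = {"GET", "HEAD"}

items = {5: "widget", 6: "gadget"}  # hypothetical in-memory data store


def handle(method, item_id, mode="view"):
    """Return (status, body); destructive modes are refused on safe methods."""
    if item_id not in items:
        return 404, "not found"
    if mode == "delete":
        if method in SAFE_METHODS:
            # A crawler following a link lands here and changes nothing.
            return 405, "delete requires POST or DELETE"
        del items[item_id]
        return 200, "deleted"
    return 200, items[item_id]


crawler_result = handle("GET", 5, mode="delete")
user_result = handle("POST", 5, mode="delete")
after_result = handle("GET", 5)
print(crawler_result, user_result, after_result)
```

A real application would still need an authorization check on the POST; restricting the method only keeps link-following clients from triggering the action.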
"Use GET if: The interaction is more like a question (i.e., it is a safe operation such as a query, read operation, or lookup)."
"Use POST if: The interaction is more like an order, or the interaction changes the state of the resource in a way that the user would perceive (e.g., a subscription to a service), or the user be held accountable for the results of the interaction."
source
You should be aware of a few subtle security differences. See my question
GET versus POST in terms of security?
Essentially the important thing to remember is that GET will go into the browser history and will be transmitted through proxies in plain text, so you don't want any sensitive information, like a password in a GET.
Obvious maybe, but worth mentioning.
By HTTP specifications, GET is safe and idempotent and POST is neither. What this means is that a GET request can be repeated multiple times without causing side effects.
Even if your server doesn't care (and this is unlikely), there may be intermediate agents between your client and the server, all of which have this expectation: for example, proxies at your ISP or other providers that cache data for improved performance. The same expectation holds for accelerators, such as a prefetching plugin for your browser.
Thus a GET request can be cached (based on certain parameters), and if it fails, it can be automatically repeated without any expectation of harmful effects. So, really, your server should strive to fulfill this contract.
On the other hand, POST is not safe, not idempotent and every agent knows not to cache the results of a POST request, or retry a POST request automatically. So, for example, a credit card transaction would never, ever be a GET request (you don't want accounts being debited multiple times because of network errors, etc).
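The retry behaviour described here can be sketched as a small Python client wrapper that retries only idempotent methods. The flaky transport is a stand-in invented for the example, and real agents apply more nuanced rules than this.

```python
IDEMPOTENT_METHODS = {"GET", "HEAD", "PUT", "DELETE", "OPTIONS"}


def send_with_retry(method, send, max_attempts=3):
    """Call send() and retry on ConnectionError only for idempotent methods."""
    attempts = max_attempts if method in IDEMPOTENT_METHODS else 1
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == attempts:
                raise


def make_flaky_transport(failures):
    """Hypothetical transport that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    def send():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ConnectionError("simulated network error")
        return "200 OK"

    return send, state


get_send, get_state = make_flaky_transport(failures=2)
get_result = send_with_retry("GET", get_send)

post_send, post_state = make_flaky_transport(failures=2)
try:
    post_result = send_with_retry("POST", post_send)
except ConnectionError:
    post_result = "not retried"

print(get_result, get_state["calls"])
print(post_result, post_state["calls"])
```

The GET is attempted three times before succeeding, while the POST is attempted once and the error is surfaced to the caller, who can decide whether repeating it is safe.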
That's a very basic take on this. For more information, you might consider the "RESTful Web Services" book by Ruby and Richardson (O'Reilly press).
For a quick take on the topic of REST, consider this post:
http://www.25hoursaday.com/weblog/2008/08/17/ExplainingRESTToDamienKatz.aspx
The funny thing is that most people debate the merits of PUT v POST. The GET v POST issue is, and always has been, very well settled. Ignore it at your own peril.
GET has limitations on the browser side. For instance, some browsers limit the length of GET requests.
I think a more appropriate answer is that you can pretty much do the same things with both. It is not so much a matter of preference, however, but a matter of correct usage. I would recommend you use your GETs and POSTs how they were intended to be used.
Technically, no. All GET does is post the stuff in the first line of the HTTP request, and POST posts stuff in the body.
However, how the "web infrastructure" treats the differences makes a world of difference. We could write a whole book about it. However, I'll give you some "best practises":
Use "POST" for when your HTTP request would change something "concrete" inside the web server. Ie, you're editing a page, making a new record, and so on. POSTs are less likely to be cached, or treated as something that's "repeatable without side-effects".
Use "GET" for when you want to "look at an object". Now, such a look might change something "behind the scenes" in terms of caching or record keeping, but it shouldn't change anything "substantial". Ie, I could repeat my GET over and over and nothing bad would happen, except for inflated hit counts. GETs should be easily bookmarkable, so a user can go back to that same object later on.
The parameters to the GET (the stuff after the ?, traditionally) should be considered "attributes to the view" or "what to view" and so on. Again, it shouldn't actually change anything: use POST for that.
And, a final word, when you POST something (for example, you're creating a new comment), have the processing for the post issue a 302 to "redirect" the user to a new URL that views that object. Ie, a POST processes the information, then redirects the browser to a GET statement to view the new state. Displaying information as a result of a POST can also cause problems. Doing the redirection is often used, and makes things work better.
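The redirect-after-POST pattern from the last paragraph can be sketched as a toy Python handler. The comments store, the paths, and the handle function are assumptions for illustration; the 302 status follows the answer, although HTTP/1.1 also defines 303 See Other for this purpose.

```python
comments = []  # hypothetical in-memory store


def handle(method, path, form=None):
    """Return (status, headers, body) for a tiny comment resource."""
    if method == "POST" and path == "/comments":
        comments.append(form["text"])
        new_url = f"/comments/{len(comments) - 1}"
        # Redirect instead of rendering, so a refresh repeats a GET, not the POST.
        return 302, {"Location": new_url}, ""
    if method == "GET" and path.startswith("/comments/"):
        index = int(path.rsplit("/", 1)[1])
        return 200, {}, comments[index]
    return 404, {}, "not found"


post_status, post_headers, post_body = handle("POST", "/comments", {"text": "first!"})
view_status, view_headers, view_body = handle("GET", post_headers["Location"])
print(post_status, post_headers["Location"])
print(view_status, view_body)
```

The page the user ends up on is the result of a GET, so it can be bookmarked and refreshed without resubmitting the form.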
Should the user be able to bookmark the resulting page? Another thing to think about is some browsers/servers incorrectly limit the GET URI length.
Edit: corrected char length restriction note - thanks ars!
It depends on the software at the server end. Some libraries, like CGI.pm in Perl, handle both by default. But there are situations where you more or less have to use POST instead of GET, at least for pushing data to the server. Large amounts of data (where the corresponding GET url would become too long), binary data (to avoid lots of encoding/decoding trouble), multipart files, non-parsed headers (for continuous updates pre-AJAX style...) and similar.
The server technically couldn't care one way or the other about what kind of request it receives. It will blindly execute any request coming across the wire.
Which is the problem. If you have an action that destroys or modifies data in a GET action, Google will tear your site up as it crawls through indexing.
The server usually doesn't care. But it's mostly for following good practices, as you mentioned. The client side also matters: as mentioned, you usually cannot bookmark a POST'd page, and some browsers have limits on the length of the URL for really long GET queries.
Since GET is intended for specifying resource you wanna get, depending on exact software on the server side, the web server (or the load balancer in front of it) may have a size limit on GET requests to prevent Denial Of Service attacks...
Be aware that browsers may cache GET requests but will generally not cache POST requests.
Yes, it does matter. GET and POST are quite different, really.
You are right in that normally, GET is for "getting" data from the server and displaying a page, while POST is for "posting" data back to the server. Internally, your scripts get the same data whether it's GET or POST, so no, the server doesn't really care.
The main difference is GET parameters are specified in URLs, while POST is not. This is why POST is used for signup and login forms - you don't want your password in a URL. Similarly, if you're viewing different pages or displaying a specific view of some data, you normally want a unique URL.
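To make the URL-versus-body difference concrete, here is a small Python sketch that renders the text of both kinds of request for the same form data. The host, path, and field values are made up for the example, and real browsers send more headers than shown.

```python
from urllib.parse import urlencode


def raw_request(method, path, params, host="www.example.com"):
    """Render the text of an HTTP/1.1 request carrying `params`."""
    encoded = urlencode(params)
    if method == "GET":
        # Parameters travel in the request line, as part of the URL.
        return f"GET {path}?{encoded} HTTP/1.1\r\nHost: {host}\r\n\r\n"
    # Parameters travel in the body, after the headers.
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(encoded)}\r\n"
        "\r\n"
        f"{encoded}"
    )


params = {"user": "alice", "password": "s3cret"}
get_text = raw_request("GET", "/login", params)
post_text = raw_request("POST", "/login", params)
print(get_text.splitlines()[0])
print(post_text.splitlines()[0])
```

The password shows up in the first line of the GET, which is the part that ends up in browser history and server logs, while the POST keeps it out of the URL. Neither form is encrypted without HTTPS.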
It really does matter. I have gathered 11 things you should know about them.
11 things you should know about GET vs POST
No, it shouldn't, except for the points in @jbruce2112's answer and the fact that uploading files requires POST.