How should I handle unsupported verbs on a resource?

I am developing a RESTful framework and am deciding how to handle an unsupported verb being called against a resource. For example, someone trying to PUT to a read-only resource.
My initial thought was a 404 error, but the error is not that the resource cannot be found, it exists, just the user is trying to use the resource incorrectly. Is there a more appropriate error code? What is the most common way in which this situation is handled?

Is it that you simply don't support a certain verb, e.g. DELETE? In that case I'd return the following HTTP response code when someone uses a verb you don't support.
405 Method Not Allowed
A request was made of a resource using a request method not supported by that resource; for example, using GET on a form which requires data to be presented via POST, or using PUT on a read-only resource.
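To make the 404 versus 405 distinction concrete, here is a minimal sketch in Python. The routing table, the paths in it, and the function name check_method are all made up for illustration; a real framework would derive the allowed methods from its own route registrations.

```python
from http import HTTPStatus

# Hypothetical routing table: resource path -> methods it supports.
ALLOWED_METHODS = {
    "/reports/1": ("GET", "HEAD"),        # a read-only resource
    "/reports": ("GET", "HEAD", "POST"),  # a collection that accepts new items
}


def check_method(path, method):
    """Return (status, headers) for a request before any handler runs."""
    allowed = ALLOWED_METHODS.get(path)
    if allowed is None:
        # The resource itself is unknown: this is the 404 case.
        return HTTPStatus.NOT_FOUND, {}
    if method.upper() not in allowed:
        # The resource exists but does not support this verb: 405 plus Allow.
        return HTTPStatus.METHOD_NOT_ALLOWED, {"Allow": ", ".join(allowed)}
    return HTTPStatus.OK, {}
```

With this table, a PUT to /reports/1 yields 405 with an Allow header listing GET and HEAD, while any verb against an unregistered path yields 404.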

I don't think you would receive a request to your app at all if the incorrect verb were used (but that probably depends on which specific technologies you're using on the server side).
To be more helpful to potentially confused clients, I suppose you could create a stub endpoint/action for each commonly mistaken verb and resource combination and send back a friendly "use {verbname} instead for this request" text response, but I'd personally just invest a bit of time in better developer documentation : )
You could also seamlessly redirect to the correct action in those cases...

Related

HTTP 405 -- web server compliance

The RFC states:
10.4.6 405 Method Not Allowed
The method specified in the Request-Line is not allowed for the
resource identified by the Request-URI. The response MUST include an
Allow header containing a list of valid methods for the requested
resource.
However, I've been unable to identify a single server which complies with that MUST.
I can see that that requirement would be very hard to fulfill with modern web servers, given the variety of proxying, dynamic applications, etc that exist.
Why, historically, did that requirement make sense?
Does anything depend on that behavior, or did it ever? What would a use case for it be?
Do any web servers "properly" implement this aspect of http? IIS (at least when using ASP.NET) and even some "RESTful" APIs return 404 rather than 405 when giving a bogus method, as far as I've been able to tell.
Additionally, why do servers return 405 for methods such as BOGUS that clearly are not implemented by the server, even when serving documents and not proxying out or calling some code (cgi/etc), when they should return 501?
Should these parts of HTTP be considered "vestigial", seeing as few if any servers conform to the spec?
Actually, it isn't that hard for most frameworks to properly return 'Allow'. All of the frameworks I know of require specification of which methods a specific controller is going to be called for (usually defaulting to GET), and code could easily register extension methods with the framework for it to return.
So far the evidence seems to point to either a) nobody reads the spec and nobody knows about this requirement, b) nobody cares about this feature.
Trying to directly answer the questions:
The requirement still makes sense, especially, as Meryn's comment says, for HATEOAS APIs.
Since a server is "An application program that accepts connections in order to service requests by sending back responses" it's easy to say yes - there are applications on the net that depend on it. ;) One such use case is to respond 405 to a POST /resource/1/ with Allow: GET, HEAD, PUT, DELETE to indicate the resource is not a "factory resource".
Since the methods allowed on a resource could vary by application logic, we should also consider application servers, as you point out in your question. In that case, yes: Django, for example, returns a proper Allow header with 405 responses.
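As a rough sketch of the 405 versus 501 split discussed above, the following Python function treats a method the server does not recognise at all as 501, and a recognised method that a particular resource does not support as 405 with an Allow header. The set KNOWN_METHODS and the name status_for_method are assumptions for this example, not part of any framework.

```python
# Methods this hypothetical server implements somewhere; anything else is 501.
KNOWN_METHODS = frozenset(
    {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS", "PATCH"}
)


def status_for_method(method, resource_methods):
    """Pick a status code and headers for `method` on one resource.

    `resource_methods` is the ordered list of methods that resource supports.
    """
    if method not in KNOWN_METHODS:
        # The server does not implement this method for any resource.
        return 501, {}
    if method not in resource_methods:
        # The method is known, but this resource does not allow it.
        return 405, {"Allow": ", ".join(resource_methods)}
    return 200, {}
```

For the "factory resource" example, status_for_method("POST", ["GET", "HEAD", "PUT", "DELETE"]) gives 405 with Allow: GET, HEAD, PUT, DELETE, whereas a BOGUS method gives 501.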

Should I deny unused request methods?

In my Zend Framework MVC application I am using only two request methods: GET and POST. I am wondering whether I should put a check in my base controller to throw an exception if the other request types are received (e.g. PUT or DELETE).
As far as I can see there are two areas for consideration:
Would it improve security at all? Am I giving potential hackers a head start if I allow the framework to respond to PUT, DELETE, et al?
Would it interfere with correct operation of the site? For example, do search engine bots rely on requests other than GET and POST?
Your ideas are much appreciated!
The correct response code would be 405 Method Not Allowed, including an Allow: GET, POST header.
10.4.6 405 Method Not Allowed
The method specified in the Request-Line is not allowed for the resource identified by the Request-URI. The response MUST include an Allow header containing a list of valid methods for the requested resource.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
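The question is about a Zend Framework (PHP) base controller, but the idea is framework-independent. As an illustration only, here is the same check written as a small Python WSGI middleware; the name restrict_methods and the default tuple are hypothetical.

```python
def restrict_methods(app, allowed=("GET", "POST")):
    """Wrap a WSGI app so that any method outside `allowed` gets a 405."""

    def middleware(environ, start_response):
        method = environ.get("REQUEST_METHOD", "GET")
        if method not in allowed:
            body = b"Method Not Allowed\n"
            start_response(
                "405 Method Not Allowed",
                [
                    # Tell the client which methods would have worked.
                    ("Allow", ", ".join(allowed)),
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Supported method: hand the request to the wrapped application.
        return app(environ, start_response)

    return middleware
```

Doing the check in one place, before any controller runs, means individual actions never see PUT or DELETE at all.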
People violate the API of your app/framework/site etc. either by mistake or on purpose, to probe your site for weaknesses. (Whether your site is internal-only or on the public net only changes how often this happens.)
If your site supports developers, then that'd be a possible reason to reply with a 405 code of method not allowed. Perhaps only if the session (assuming sessions) is marked as being in developer mode.
If you don't expect valid developers, then I recommend silently swallowing any bad input to make it harder for the bad guys.
Another reason not to give error messages in the normal case: the lack of an error message in a particular case can then be interpreted that the bad data made it further into your stack than other data--outlining a possible attack route.
And finally, error returns (type, delay before responding, and more) can be used to characterize a particular version of an app/framework etc. This can then be used to quickly locate other vulnerable installations once an attack vector is found.
Yes, the above is pessimistic, and I fondly remember the 80's when everybody responded to ping, echo and other diagnostic requests. But the bad guys are here and it is our responsibility to harden our systems. See this TED video for more.

Should a webserver ignore extra query params or return an error?

I'm implementing the logic for a RESTful web server which supports searching with a SolR like syntax. Here are some common valid requests:
"https://www.somewhere.com/fooResource/123"
"https://www.somewhere.com/fooResource/456"
"https://www.somewhere.com/fooResource?q=title:hi"
"https://www.somewhere.com/fooResource?q=title:hello&sort=foo"
My question is very generic; what should I do if I receive a request like this?
"https://www.somewhere.com/fooResource?q=title:hi&something=foo"
I received a query parameter "something" which has no meaning to me, and our search engine will ignore it. Should I
return a 4xx status code immediately
ignore it and return a 200 with results
either may be "right" depending on my use case
Many web pages just ignore stuff that they aren't expecting.
Usually the URL and parameters are a result of clicking something or running some code on a browser or web service client. These would seldom submit anything unexpected.
If there is some reason you expect someone to be fooling with your web site and submitting requests that are "hackish" in some fashion, you might want to lock them out by recognizing illegal parameters and returning some error. 4xx would be reasonable for REST service.
Read the HTTP status definitions. I would make it a practice not to return results for a request containing bad info. The definition of 400 is "The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications." That seems appropriate here, but your use case may dictate otherwise.
If you ignore the parameter, you are not giving the client any information. They may never know something is wrong.
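If you go the strict route, the check itself is small. Here is a minimal Python sketch; the parameter set KNOWN_PARAMS (taken from the q and sort parameters in the question) and the function names are assumptions for this example.

```python
from urllib.parse import parse_qs, urlsplit

# Query parameters this hypothetical search endpoint understands.
KNOWN_PARAMS = frozenset({"q", "sort"})


def unknown_params(url, known=KNOWN_PARAMS):
    """Return the sorted names of query parameters that are not in `known`."""
    query = urlsplit(url).query
    # keep_blank_values=True so that "?something=" is still reported.
    params = parse_qs(query, keep_blank_values=True)
    return sorted(set(params) - set(known))


def validate_query(url, strict=True):
    """Return (status, message): 400 in strict mode if anything is unknown."""
    extra = unknown_params(url)
    if extra and strict:
        return 400, "Unknown query parameter(s): " + ", ".join(extra)
    return 200, ""
```

Passing strict=False gives the lenient behaviour (ignore and return 200), so the choice between the two policies stays a single switch.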

ASP.NET MVC Head Verb and Selenium RC

Selenium (RC) is being used to test an ASP.NET 1.1 site.
When we make a request via Selenium RC (which in turn automates the request via a configured browser, in this case Firefox) the HTTP verb is "HEAD". We have several form action methods that have separate GET and POST methods decorated with AcceptVerbs(HttpVerbs.Get) or AcceptVerbs(HttpVerbs.Post) respectively. These methods are returning a 404 and logging an "a public action method could not be found" error message.
Questions:
When writing separate Get/Post action methods what is the best practice for handling the Head verb? Should we always decorate with an AcceptVerbs(HttpVerbs.Get | HttpVerbs.Head)?
Why is the HEAD verb being generated when Selenium RC is automating the browser in lieu of an If-Modified-Since header?
We've also seen log entries from (non-mainstream) crawlers that are using the HEAD verb. We created robots.txt entries to stop these crawlers from indexing the site, but now we're wondering what the best practice from an SEO perspective is as well. Is it important to respond to HEAD for crawlers? Are there mainstream crawlers that use it? Does it impact SEO rank?
Yes, I think that whenever you are restricting your requests to be GET only, you should always allow HEAD on them as well - in fact, I do think it should be built into the MVC framework (next thing on my todo list: raise the issue in MVC bug tracker that [HttpGet] attribute should somehow support HEAD verb)
I would like to know an answer to this too. In the meantime, there is a suggested workaround - pass 'true' as a second param to Selenium's open().
I don't think it impacts SERP ranking per se, however I can see how crawlers would not request the full page if HEAD gives a 404. According to the HTTP spec (RFC2616), "The HEAD method is identical to GET except that the server MUST NOT return a message-body in the response", so, if you are doing things right - it shouldn't be a problem to allow the method and avoid getting unlisted.
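The question concerns ASP.NET MVC, but the "HEAD is GET without a body" rule quoted above can be shown in a framework-neutral way. This is only a sketch, written as a Python WSGI middleware with the hypothetical name head_as_get; it is not how MVC handles the verb.

```python
def head_as_get(app):
    """Wrap a WSGI app so HEAD is served by the GET handler minus the body."""

    def middleware(environ, start_response):
        if environ.get("REQUEST_METHOD") != "HEAD":
            return app(environ, start_response)

        # Run the application as if the request were a GET, so the status
        # line and headers match what a GET would have produced.
        get_environ = dict(environ, REQUEST_METHOD="GET")
        body = app(get_environ, start_response)
        try:
            # Drain the body so apps that call start_response lazily still do.
            for _chunk in body:
                pass
        finally:
            close = getattr(body, "close", None)
            if callable(close):
                close()
        # HEAD responses must not carry a message body.
        return []

    return middleware
```

The cost is that the GET handler still does its full work for every HEAD request; a handler that can compute headers cheaply would want its own HEAD path.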

Is there any way to check if a POST url exists?

Is there any way to determine if a POST endpoint exists without actually sending a POST request?
For GET endpoints, it's no problem to check for 404s, but I'd like to check POST endpoints without triggering whatever action resides on the remote URL.
Sending an OPTIONS request may work
It may not be implemented widely but the standard way to do this is via the OPTIONS verb.
WARNING: OPTIONS is supposed to be safe and idempotent, but a non-compliant server may do very bad things.
OPTIONS
Returns the HTTP methods that the server supports for specified URL. This can be used to check the functionality of a web server by requesting '*' instead of a specific resource.
More information here
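A rough sketch of such a probe using Python's standard http.client is below. The function names, host and path arguments are placeholders, and the result is only as trustworthy as the server: many servers omit the Allow header on OPTIONS or report it inaccurately, so treat a missing POST as "unknown" rather than "absent".

```python
import http.client


def parse_allow(value):
    """Turn an Allow header value such as 'GET, POST' into a set of methods."""
    if not value:
        return set()
    return {m.strip().upper() for m in value.split(",") if m.strip()}


def allowed_methods(host, path, use_tls=True, timeout=5.0):
    """Send OPTIONS to host/path and return (status, methods from Allow)."""
    conn_cls = (
        http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    )
    conn = conn_cls(host, timeout=timeout)
    try:
        conn.request("OPTIONS", path)
        response = conn.getresponse()
        response.read()  # drain any body before closing the connection
        return response.status, parse_allow(response.getheader("Allow"))
    finally:
        conn.close()
```

A caller would then check something like "POST" in methods on the returned set before deciding whether to send the real request.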
This is not possible by definition.
The URL that you're posting to could be run by anything, and there is no requirement that the server behave consistently.
The best you could do is to send a GET and see what happens; however, this will result in both false positives and false negatives.
You could send a HEAD request, if the server you are calling supports it; the response will typically be much smaller than for a GET.
Does endpoint = script? It was a little confusing.
I would first ask: why would you be POSTing somewhere that might not exist? It seems a little silly.
Anyway, if there really is some uncertainty about your POST URL, you can use cURL with its header option enabled, so that the response headers come back with the response. If you do this, I would suggest saving every POST URL you have validated if it's likely to be used again.
You can also send your entire POST with cURL and then check whether it has errored out.
I think you probably answered this question yourself in your tags of your question with cURL.
