HTTP 405 -- web server compliance

The RFC states:
10.4.6 405 Method Not Allowed
The method specified in the Request-Line is not allowed for the
resource identified by the Request-URI. The response MUST include an
Allow header containing a list of valid methods for the requested
resource.
However, I've been unable to identify a single server which complies with that MUST.
I can see that that requirement would be very hard to fulfill with modern web servers, given the variety of proxying, dynamic applications, etc. that exist.
Why, historically, did that requirement make sense?
Does anything depend on that behavior, or did it ever? What would a use case for it be?
Do any web servers "properly" implement this aspect of http? IIS (at least when using ASP.NET) and even some "RESTful" APIs return 404 rather than 405 when giving a bogus method, as far as I've been able to tell.
Additionally, why do servers return 405 for methods such as BOGUS that clearly are not implemented by the server, even when serving documents and not proxying out or calling some code (cgi/etc), when they should return 501?
Should these parts of HTTP be considered "vestigial", seeing as few if any servers conform to the spec?
Actually, it isn't that hard for most frameworks to properly return 'Allow'. All of the frameworks I know of require specification of which methods a specific controller is going to be called for (usually defaulting to GET), and code could easily register extension methods with the framework for it to return.
So far the evidence seems to point to either a) nobody reads the spec and nobody knows about this requirement, or b) nobody cares about this feature.

Trying to directly answer the questions:
The requirement still makes sense, especially, as Meryn's comment says, for HATEOAS APIs.
Since a server is "An application program that accepts connections in order to service requests by sending back responses" it's easy to say yes - there are applications on the net that depend on it. ;) One such use case is to respond 405 to a POST /resource/1/ with Allow: GET, HEAD, PUT, DELETE to indicate the resource is not a "factory resource".
Since the methods allowed on a resource could vary by application logic, we should also consider application servers, as you point out in your question. In which case, yes: e.g., Django returns a proper Allow header with 405 responses.
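To make the distinction between 404, 405 and 501 concrete, here is a minimal WSGI sketch in Python. It is only an illustration: the ROUTES table, the KNOWN_METHODS set and the paths in them are hypothetical, and it is not how Django or any particular server implements this.

```python
from http import HTTPStatus

# Hypothetical routing table: path -> methods that resource supports.
ROUTES = {
    "/resource/": {"GET", "HEAD", "POST"},
    "/resource/1/": {"GET", "HEAD", "PUT", "DELETE"},
}

# Methods this hypothetical server implements at all.
KNOWN_METHODS = {"GET", "HEAD", "POST", "PUT", "DELETE", "OPTIONS"}


def app(environ, start_response):
    """Minimal WSGI app that distinguishes 501, 404 and 405."""
    method = environ["REQUEST_METHOD"]
    path = environ.get("PATH_INFO", "/")

    if method not in KNOWN_METHODS:
        # The server does not implement this method for any resource.
        status, headers = HTTPStatus.NOT_IMPLEMENTED, []
    elif path not in ROUTES:
        status, headers = HTTPStatus.NOT_FOUND, []
    elif method not in ROUTES[path]:
        # Known method, existing resource, but not allowed here:
        # 405 plus the list of methods that are allowed.
        allow = ", ".join(sorted(ROUTES[path]))
        status, headers = HTTPStatus.METHOD_NOT_ALLOWED, [("Allow", allow)]
    else:
        status, headers = HTTPStatus.OK, []

    body = status.phrase.encode("ascii")
    headers += [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response(f"{status.value} {status.phrase}", headers)
    # A response to HEAD carries headers only.
    return [] if method == "HEAD" else [body]
```

With this table, POST /resource/1/ gets a 405 with Allow: DELETE, GET, HEAD, PUT, while a BOGUS request gets a 501. For local experiments the app can be served with wsgiref.simple_server.make_server from the standard library.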

Related

When serving a single-page application that uses the History API, should HTTP content negotiation be used?

When serving a single-page application (SPA) that uses the History API, it's common practice to serve the application's HTML instead of a 404 response for requests to any unknown resource. If the application experiences some kind of full-page reload, this allows it to continue presenting the same content to the user.
This can have some negative consequences. For example, an <img> element with a typo in its src attribute will probably get served HTML content instead of a 404. It's possible to avoid this particular case by utilizing HTTP content negotiation, and only serving the HTML when the request headers indicate that the client can accept it.
However, I'm concerned that this is actually a misuse of content negotiation. As I understand it, the main purpose of content negotiation is to provide different representations of the same resource, not to determine whether a resource exists at all.
If you did implement content negotiation for SPA serving it's probably more appropriate to use a 406 Not Acceptable response, rather than a 404 Not Found. But MDN at least seems to indicate that a 406 response is generally a bad idea:
In practice, this error is very rarely used. Instead of responding using this error code, which would be cryptic for the end user and difficult to fix, servers ignore the relevant header and serve an actual page to the user. It is assumed that even if the user won't be completely happy, they will prefer this to an error code.
If a server returns such an error status, the body of the message should contain the list of the available representations of the resources, allowing the user to choose among them.
The most popular implementation of this pattern for JavaScript servers seems to be connect-history-api-fallback. At the time of writing, it uses content negotiation to determine whether to serve the SPA HTML. It doesn't seem to use 406 responses though, instead opting for 404s.
So with all of the above in mind, my questions are:
What is the correct way to serve the HTML for a single-page app?
Should HTTP content negotiation be involved at all?
Additionally, if content negotiation is a desirable solution here, then in the case that an unknown resource is requested, but the client has indicated that HTML is not acceptable:
Should a 406 response be favoured over a 404 response?
What should the body of the response contain?
What additional header content is required to ensure that the system is well-behaved? (For example, I expect that I would probably at least need to set a Vary header to ensure that HTTP caching works correctly)
I feel the only right way to solve this is to not do a catch-all for every possible route, and instead correctly use 404s and serve HTML only when there's actually a page to be served.
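As a rough sketch of that approach in Python, assuming the server can be given a copy of the application's client-side routes: the route patterns and file names below are hypothetical, and a real setup would generate them from the SPA's router instead of hard-coding them.

```python
import re
from typing import Optional, Tuple

# Hypothetical client-side routes, mirrored on the server.
SPA_ROUTES = [
    re.compile(r"/"),
    re.compile(r"/about"),
    re.compile(r"/users/\d+"),
]

# Hypothetical static assets that exist on disk.
STATIC_FILES = {"/index.html", "/app.js", "/logo.png"}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, file to serve) for a GET request."""
    if path in STATIC_FILES:
        return 200, path
    # Serve the application shell only for routes the SPA really has.
    if any(pattern.fullmatch(path) for pattern in SPA_ROUTES):
        return 200, "/index.html"
    # Everything else, including a mistyped image URL, is a real 404.
    return 404, None
```

Because the decision depends only on the path, no content negotiation and no Vary header are needed in this variant. The cost is that the route list has to be kept in sync with the client.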

Is there any reason to use the HTTP 410 GONE status code?

When permanently removing a page from your website, are there any practical benefits to setting up a "410 GONE" HTTP response for the URL (vs. letting it 404)?
Yes, the 410 Gone HTTP status code conveys that the resource requested was once available in the past, but it has now been retired or made obsolete.
The 404 Not Found HTTP status code could imply that the website has been incorrectly updated so as to be missing a file that would normally be defined there. It could also mean that the requesting client referenced a resource that never did exist and probably never will.
The 410 Gone status can have more immediate SEO implications because it tells search engines that the missing resource was intentionally removed. That should hasten the reduction of future search references to that page more so than the 404 Not Found status.
I could imagine that if you have a public API and finally disable your long-deprecated v1 after publishing, say, v4, you could use this status code to make it obvious to consumers of that API. But then again, one could argue that a 301 is also valid for this type of situation. It depends on how different the new version is, and whether there is an actual replacement or the resource is simply gone.
From RFC 9110:
The 410 response is primarily intended to assist the task of web maintenance by notifying the recipient that the resource is intentionally unavailable and that the server owners desire that remote links to that resource be removed. Such an event is common for limited-time, promotional services and for resources belonging to individuals no longer working at the server's site. It is not necessary to mark all permanently unavailable resources as "gone" or to keep the mark for any length of time -- that is left to the discretion of the server owner.
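A minimal Python sketch of the 404 versus 410 decision, assuming the site keeps an explicit list of intentionally retired URLs. Both path sets are made up for illustration.

```python
from http import HTTPStatus

# Hypothetical paths that are currently served.
LIVE_PATHS = {"/", "/products"}

# Hypothetical paths that were removed on purpose.
RETIRED_PATHS = {"/promo/summer-sale", "/api/v1/users"}


def status_for(path: str) -> HTTPStatus:
    """Pick a status code for a GET request to `path`."""
    if path in LIVE_PATHS:
        return HTTPStatus.OK
    if path in RETIRED_PATHS:
        # Existed once, removed deliberately: tell clients to drop links.
        return HTTPStatus.GONE
    # Never existed, or we simply do not know.
    return HTTPStatus.NOT_FOUND
```

As the RFC text says, how long an entry stays in the retired list is up to the server owner.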

Should I deny unused request methods?

In my Zend Framework MVC application I am using only two request methods: GET and POST. I am wondering whether I should put a check in my base controller to throw an exception if the other request types are received (e.g. PUT or DELETE).
As far as I can see there are two areas for consideration:
Would it improve security at all? Am I giving potential hackers a head start if I allow the framework to respond to PUT, DELETE, et al?
Would it interfere with correct operation of the site? For example, do search engine bots rely on requests other than GET and POST?
Your ideas are much appreciated!
The correct response code would be 405 Method Not Allowed, including an Allow: GET, POST header.
10.4.6 405 Method Not Allowed
The method specified in the Request-Line is not allowed for the resource identified by the Request-URI. The response MUST include an Allow header containing a list of valid methods for the requested resource.
http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
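The question is about a Zend Framework base controller, but the same check can be sketched in Python as a WSGI wrapper. This is an analogous illustration, not Zend code; allow_methods and hello_app are hypothetical names.

```python
def allow_methods(app, allowed=("GET", "POST")):
    """Wrap a WSGI app so that other request methods get a 405."""
    allow_header = ", ".join(allowed)

    def middleware(environ, start_response):
        if environ["REQUEST_METHOD"] not in allowed:
            body = b"Method Not Allowed"
            start_response(
                "405 Method Not Allowed",
                [
                    ("Allow", allow_header),  # required with a 405
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Allowed method: hand the request to the wrapped application.
        return app(environ, start_response)

    return middleware


def hello_app(environ, start_response):
    """Stand-in for the real application."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"ok"]


guarded_app = allow_methods(hello_app)
```

Doing the check in one wrapper keeps it out of the individual controllers, which is the same idea as putting it in a base controller.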
People violate the API of your app, framework, or site either by mistake or on purpose, to probe your site for weaknesses. (Whether your site is internal-only or on the public net only affects how often this happens.)
If your site supports developers, then that'd be a possible reason to reply with a 405 code of method not allowed. Perhaps only if the session (assuming sessions) is marked as being in developer mode.
If you don't expect valid developers, then I recommend silently swallowing any bad input to make it harder for the bad guys.
Another reason not to give error messages in the normal case: the lack of an error message in a particular case can then be interpreted that the bad data made it further into your stack than other data--outlining a possible attack route.
And finally, error returns (type, delay before responding, and more) can be used to characterize a particular version of an app/framework etc. This can then be used to quickly locate other vulnerable installations once an attack vector is found.
Yes, the above is pessimistic, and I fondly remember the 80's when everybody responded to ping, echo and other diagnostic requests. But the bad guys are here and it is our responsibility to harden our systems. See this TED video for more.

HTTP Response before Request

My question might sound stupid, but I just wanted to be sure:
Is it possible to send an HTTP response before having the request for that resource?
Say for example you have an HTML page index.html that only shows a picture called img.jpg.
Now, if your server knows that a visitor will request the HTML file and then the jpg image every time:
Would it be possible for the server to send the image just after the HTML file to save time?
I know that HTTP is a synchronous protocol, so in theory it should not work, but I just wanted someone to confirm it (or not).
A recent post by Jacques Mattheij, referencing your very question, claims that although HTTP was designed as a synchronous protocol, the implementation was not. In practice the browser (he doesn't specify which exactly) accepts answers to requests that have not been sent yet.
On the other hand, if you are looking for something less hacky, you could have a look at:
push techniques that allow the server to send content to the browser. The modern replacement for long-polling/Comet "hacks" is WebSockets. You may also want to have a look at socket.io.
Alternatively you may want to have a look at client-side routing. Some implementations combine this with caching techniques (like in derby.js I believe).
If someone requests /index.html and you send two responses (one for /index.html and the other for /img.jpg), how do you know the recipient will get the two responses and know what to do with them before the second request goes in?
The problem is not really with the sending. The problem is with the receiver possibly getting unexpected data.
One other issue is that you're denying the client the ability to use HTTP caching tools like If-Modified-Since and If-None-Match (i.e. the client might not want /img.jpg to be sent because it already has a cached copy).
That said, you can approximate the server-push benefits by using Comet techniques. But that is much more involved than simply anticipating incoming HTTP requests.
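The caching point in this answer can be illustrated with a small Python sketch of a conditional GET using If-None-Match. The helper names are hypothetical and the ETag is just a truncated SHA-256 of the body, which is one possible choice among many.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(
    body: bytes, if_none_match: Optional[str] = None
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET request."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            # If-None-Match uses weak comparison, so ignore a W/ prefix.
            if tag.startswith("W/"):
                tag = tag[2:]
            candidates.append(tag)
        if "*" in candidates or etag in candidates:
            # The client's cached copy is current: send no body.
            return 304, headers, b""
    return 200, headers, body
```

A client that already holds the image sends the ETag back and gets an empty 304. A server that pushed the image unasked would have sent the full body before the client had any chance to say so.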
You'll get a better result by caching resources effectively, i.e. setting proper cache headers and configuring your web server for caching. You can also inline images using base 64 encoding, if that's a specific concern.
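For the inlining option, a short Python sketch of building a data: URI from image bytes. The function name and the example markup are illustrative only.

```python
import base64


def data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a data: URI usable in an <img> src."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_img_tag(data: bytes, mime_type: str = "image/jpeg") -> str:
    """Build an <img> element that carries the image inline."""
    return f'<img src="{data_uri(data, mime_type)}">'
```

The image then arrives with the HTML in a single response, at the cost of roughly a third more bytes from the base64 encoding and no separate caching for the image.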
You can also look at long polling javascript solutions.
You're looking for server push: it isn't available in HTTP/1.1. Protocols like SPDY have it, but you're out of luck if you're restricted to HTTP/1.1.
I don't think it is possible to mix HTML and an image in the same HTTP response. As for sending image data 'immediately', right after the first request, there is a concept of 'static resources' which could be of help (but it will require the client to create a new request for a specific resource).
There are a couple of interesting things mentioned in the article.
No it is not possible.
The first line of the request holds the resource being requested so you wouldn't know what to respond with unless you examined the bytes (at least one line's worth) of the request first.
No. HTTP is defined as a request/response protocol. One request: one response. Anything else is not HTTP, it is something else, and you would have to specify it properly and implement it completely at both ends.

How should I handle unsupported verbs on a resource?

I am developing a RESTful framework and am deciding how to handle an unsupported verb being called against a resource. For example, someone trying to PUT to a read-only resource.
My initial thought was a 404 error, but the error is not that the resource cannot be found, it exists, just the user is trying to use the resource incorrectly. Is there a more appropriate error code? What is the most common way in which this situation is handled?
Is it that you simply don't support a certain verb, e.g. DELETE? In that case I'd use the following HTTP response code if someone uses a verb you don't support.
405 Method Not Allowed
A request was made of a resource using a request method not supported by that resource; for example, using GET on a form which requires data to be presented via POST, or using PUT on a read-only resource.
I don't think you would receive a request to your app at all if the incorrect verb were used (but that probably depends on which specific technologies you're using on the server side).
To be more helpful to potentially confused clients, I suppose you could create a stub endpoint/action for each commonly mistaken verb and action combination and then send back a friendly "use {verbname} instead for this request" text response, but I'd personally just invest a bit of time in better developer documentation : )
You could also seamlessly redirect to the correct action in those cases...

Resources