Implementing an HTTP Server - do I have to respond to all requests? - http

If I am making an HTTP server, can I choose to ignore requests I don't want to respond to and let them time out?
I'm just wondering whether I am in any sense better off not responding to requests from potentially malicious sources than responding to them with data I'd rather not serve up, or responding with some 403 Forbidden or similar response that lets them know I exist.

A 403 should suffice. But I wouldn't let it just time out. If someone is trying to be cheeky, a time out will be more informative than a Service Unavailable 503.
I answered a related question a while back. It covers a specific use case, but it does mention cases where you don't want to return an HTTP status code because it gives away too much information:
RFC - 404 or 400 for relation of entity not found in PUT request
Also have a look at this list of HTTP Status codes, you can always use something like Too Many Requests 429, a Not Acceptable 406 or even something like I'm a teapot 418 ;)
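To make the "answer with a 403 instead of going silent" idea concrete, here is a minimal sketch using Python's standard http.server module. The blocklist, its address, and the helper name status_for_client are assumptions for illustration only; how you decide that a client is unwanted is up to you.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical blocklist; the address is from a documentation-only range.
BLOCKED_CLIENTS = {"203.0.113.7"}


def status_for_client(client_ip):
    """Return 403 for blocked clients and 200 for everyone else."""
    return 403 if client_ip in BLOCKED_CLIENTS else 200


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Always answer, even when refusing, rather than letting the
        # connection hang until the client times out.
        status = status_for_client(self.client_address[0])
        body = b"Forbidden\n" if status == 403 else b"Hello\n"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
```

To try it, pass Handler to http.server.HTTPServer and call serve_forever() on the result.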

Related

What is the most appropriate HTTP response from a backend service when attempting to remove an entry that no longer exists in the Database?

My team is developing a simple backend service that provides the operations ADD, GET and REMOVE a very simple item. All are triggered by an http request and they do not much besides adding, getting and removing the item from a database.
Regarding the specific scenario in which a REMOVE operation is triggered on an item that is not present in the DB (e.g. was removed before), our question is what should be the response of the service? We have been debating options like 200 + some specific message, 410 - resource gone, amongst other 2XX and 4XX possibilities, but we haven't reached a consensus.
I hope this is not Bikeshedding.
Thank you for your help.
What should be the response of the service?
It's important to highlight that status codes are meant to indicate the result of the server's attempt to understand and satisfy the client request. Having said that, 2xx status codes are unsuitable for this situation and should be avoided:
The 2xx (Successful) class of status code indicates that the client's request was successfully received, understood, and accepted.
The most suitable status code would be in the 4xx range:
The 4xx (Client Error) class of status code indicates that the client seems to have erred. Except when responding to a HEAD request, the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
The 404 status code seems to be what you are looking for, as it indicates that the server can't find the requested resource:
6.5.4. 404 Not Found
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource or is not willing to disclose that one exists. A 404 status code does not indicate whether this lack of representation is temporary or permanent; [...]
If you are concerned about how the client will understand the 404 response, you could provide them with a payload stating that the resource is no longer available.
And just bear in mind that ADD and REMOVE are not standard HTTP methods. Hopefully that was a typo and you are using POST (or PUT) and DELETE to express operations over your resources.
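As a rough sketch of the behavior suggested in this answer, the function below handles a DELETE against an in-memory store: it returns 404 with an explanatory payload when the item is missing, and 204 when the removal succeeds. The store, the function name delete_item, the payload wording, and the choice of 204 for the success case are assumptions, not part of the original question.

```python
def delete_item(store, item_id):
    """Remove item_id from store and return (status_code, payload)."""
    if item_id not in store:
        # The item is absent (possibly removed earlier): report 404 and
        # explain the situation in the response body.
        return 404, {"error": f"Item {item_id} does not exist or was already removed."}
    del store[item_id]
    # Successful removal with nothing further to return.
    return 204, {}
```

For example, calling delete_item twice with the same ID on a store that contains it yields 204 the first time and 404 the second time.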

Http status code to send if server state invalid?

I am writing a REST service and some of the push requests will only work during a certain window of time. For example during active work hours. Outside of those times the server will send an error.
I have looked at the available HTTP status codes and I am not sure which one best to apply for an 'invalid server state' or equivalent situation. I am considering a 400 (Bad Request) or a 422 (Unprocessable Entity)?
For the 422, the definition I have is "The request was well-formed but was unable to be followed due to semantic errors." and wondering whether this is really the most applicable case?
400 Bad Request looks like the right response to me. It's definitely a client error, but there's nothing wrong with the request itself; just the timing of the request. If the response body contains some additional information to make that clear (along the lines of "Our offices are closed. Please make your request between the hours of 9 AM and 5 PM GMT, Monday to Friday.") then you've successfully used a simple and common response type in the appropriate manner. Which makes for a good API.
As an additional note, the reason I'd say that a 422 would be less correct is that the meaning of the request is clear. It's just a timing issue; there's no semantic error.
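A minimal sketch of the time-window check described above, assuming the window is 9 AM to 5 PM GMT, Monday to Friday, as in the example message. The function name check_window and the constants are hypothetical, and the caller is assumed to pass a timezone-aware UTC datetime.

```python
from datetime import datetime, time, timezone

OPEN_AT = time(9, 0)    # assumed opening time (GMT)
CLOSE_AT = time(17, 0)  # assumed closing time (GMT)

CLOSED_MESSAGE = (
    "Our offices are closed. Please make your request between the hours "
    "of 9 AM and 5 PM GMT, Monday to Friday."
)


def check_window(now):
    """Return (status_code, message) for a request arriving at `now` (UTC)."""
    is_weekday = now.weekday() < 5                 # Monday is 0, Friday is 4
    in_hours = OPEN_AT <= now.time() < CLOSE_AT
    if is_weekday and in_hours:
        return 200, "OK"
    # The request itself is well-formed; only its timing is wrong.
    return 400, CLOSED_MESSAGE
```

In a real handler you would call check_window(datetime.now(timezone.utc)) before processing the push request.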

Which HTTP status code should I use for a health-check failure?

I'm implementing a /_status/ endpoint which does some sanity checks on data in our database.
For example, we are collecting measurements and the status should go "bad" if the latest measurement is over an hour old.
I would like to point Pingdom at this URL to leverage their alerting infrastructure and tell us when something's wrong.
On a "good" status I will serve an HTML page with an HTTP 200 OK status. But what would an appropriate HTTP status code be for "bad"? Or would it be more correct not to convey this information via status code, but via HTML content instead?
Thanks!
Well... this is an old question, but I ended up here, so I thought I'd give my two cents here:
It seems pretty clear that a 2xx should be returned if all is OK
If health is not OK, I think it should return a 5xx result (4xx talks about the client being at fault in the request; 2xx and 3xx are all successful to some degree).
I think that a 5xx is correct because this is a special request that is answering about the state of the whole service. Also, because most load balancers offer liveness checks based on response codes and not all offer a way to parse a more complex payload (other than perhaps a regular expression match, which can make the check brittle).
I agree with @Julien that a 500 (specifically) doesn't seem appropriate, and we've decided on 503 Service Unavailable.
503 seems to fit for a couple of reasons:
It's a 5xx family result code which indicates that something is going on on the server side.
It has a temporary nature to it indicating that it may recover.
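Here is a small sketch of that approach applied to the question's example, where the status goes "bad" if the latest measurement is over an hour old. The function name health_status and the way the timestamps are obtained are assumptions; in practice the latest timestamp would come from your database.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(hours=1)  # threshold taken from the question


def health_status(latest_measurement, now):
    """Return 200 when the latest measurement is fresh, 503 when it is stale."""
    if now - latest_measurement > MAX_AGE:
        return 503  # Service Unavailable: temporary, may recover
    return 200
```

A monitor such as Pingdom or a load balancer can then act on the status code alone, without parsing the body.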
We just had a similar discussion in our group. We decided for our purposes that the HTTP response codes should be reporting on your server's success or failure to honor the request. For a GET, this would mean whether or not you can respond with the requested resource. In this case, the requested resource is a health report, so as long as you're returning that successfully, it should be a 200 response.
We're returning JSON for our health check, with a top-level "isHealthy" field set to true or false. Our load balancer and other monitors will parse the JSON and use this field to determine if the system is healthy or not.
If you don't want to parse JSON in your monitors, you could try putting a custom response header to indicate binary health of the system, e.g., System-Health: true or System-Health: false. You might have better luck getting monitors which can check that.
If you really want to use a response code, I would recommend an additional endpoint called something like "health" which returns a "204 No Content" when healthy, and a "404 Not Found" when not healthy. In this case, the resource defined by the URL is, symbolically, the health of your system, and so if it's healthy, you can return a successful response. If it's unhealthy, then its health can't be found, hence the 404.
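The "always 200, health in the payload" variant from this answer could be sketched roughly as follows. The isHealthy field and the System-Health header name come from the answer itself; the function name health_report and the tuple it returns are assumptions for illustration.

```python
import json


def health_report(is_healthy):
    """Build a 200 response whose body and header both carry the health flag."""
    headers = {
        "Content-Type": "application/json",
        # Custom header for monitors that cannot parse JSON bodies.
        "System-Health": "true" if is_healthy else "false",
    }
    body = json.dumps({"isHealthy": is_healthy})
    # The health report itself was served successfully, hence 200 either way.
    return 200, headers, body
```

Monitors that understand JSON read the isHealthy field; simpler ones can match on the header value.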
If your data is 'bad' because there is a service failure (even if that is a backend job failing) then an HTTP 500 seems like a valid response. It indicates that something, somewhere is broken.
It isn't very specific; you're shrugging your shoulders and saying:
The 500 (Internal Server Error) status code indicates that the server encountered an unexpected condition that prevented it from fulfilling the request.
IETF RFC 7231
If you ask for health and the server state is not healthy, I'm partial to 409 Conflict which "Indicates that the request could not be processed because of conflict in the current state of the resource".
Some people might object that if you can respond then the request can be processed, but I disagree. Every error message is a response. The server defines resource semantics. If you ask for the good news resource and the server responds "here is bad news", it didn't give you what it defines to have offered at that resource.
In practice, it's much easier to say 2**="up", 4**="down" and pipe request counts into an availability metric and have a load balancer remove the server from its pool based on the response code. Coming up with ways to argue that "hey, we told you something, so 200 OK" just seems like missing the forest for the trees to me.

What http considers the same resource?

The specification talks about requests for "the same resource", but I failed to find any explanation of what exactly that is. Is it the URL? Or are requests with the same URL and different headers considered different resources? I'm using custom headers as a way to influence what's returned by the server, and I seem to be experiencing some issues because of that.
A URL identifies a resource, and a resource is just some chunk of information. This article succinctly describes the two's relation:
EX:
If I were to make an HTTP GET request as such - GET path/to/res/file - I would either get a 200 response with the file resource in the message body, or if something went wrong, I might get something like a 404 or a 500, depending on the server implementation.
http://www.jmarshall.com/easy/http/#resources
http://en.wikipedia.org/wiki/Uniform_resource_locator
http://en.wikipedia.org/wiki/Web_resource
I hope that clears it up a little for you.

HTTP status code when single request asks for too large resource or too many of them

Does somebody know which HTTP status code is the right one for the following situation?
An anonymous client can request a range of items from a collection in a RESTful API with GET /collection/?range_start=100&range_end=200. The example query returns a list with 100 items (in JSON). There is also a limit, let's say 300, on how many items the client can request. What should the response status code be if the client asks for, say, 1000 items in the range [100, 1100], which is 700 items over the limit?
Should it be 400 Bad Request, 403 Forbidden, 409 Conflict, 416 Requested Range Not Satisfiable(?) or 422 Unprocessable Entity? What would you recommend?
A related question and answer propose 409 but the situation is slightly different:
https://stackoverflow.com/a/13463815/638546
403 sounds like the most appropriate choice. It basically says "nu-uh. You don't get to see that.", which is pretty much the case here.
10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it.
Authorization will not help and the request SHOULD NOT be repeated. [...]
Of course, it'd be a good idea for the response body to include the reason you're refusing the request.
All the other codes seem to me to have specific meanings that disqualify their use here.
400 is not quite appropriate because the request is valid, and you understand it just fine; it's just asking for more than you're willing to send at once.
409 is not appropriate because it's specifically related to the "state" of the resource. (It is appropriate for the question you linked, because in that case the error was adding to a collection that was already "full". In your case, though, it's not the resource that has a problem; it's the request.) Also,
This code is only allowed in situations where it is expected that the user might be able to resolve the conflict and resubmit the request.
where by "resubmit" the standard means "repeat". In this case, no matter what the client does, that request will be invalid.
416 specifically refers to the "Range" header, so it's out altogether.
417 likewise refers to a header field (in this case "Expect"), so it's likewise out.
422 is not appropriate because it specifically means you sent an entity that is syntactically correct, but is still broken. Since GETs traditionally have no request body (no entity), there's nothing to be unprocessable. If the client were POSTing a request, you might almost have a case...but then, you'd also have to make a good case for why a RESTful API requires a POST that doesn't update anything.
(I'm about 47% sure that the code also doesn't make much sense outside of WebDAV...but it does seem there are conceivable use cases. Just not this one.)
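For what it's worth, the check itself is tiny. Below is a minimal sketch assuming the limit of 300 from the question and counting items as range_end minus range_start, which matches the question's example of 100 to 200 returning 100 items. The function name check_range and the message wording are hypothetical.

```python
MAX_ITEMS = 300  # per-request limit from the question


def check_range(range_start, range_end):
    """Return (status_code, message) for a requested item range."""
    requested = range_end - range_start
    if requested > MAX_ITEMS:
        # Refuse, and say why in the response body.
        return 403, (
            f"Requested {requested} items; "
            f"the limit is {MAX_ITEMS} per request."
        )
    return 200, "OK"
```

With these assumptions, the range [100, 200] passes and [100, 1100] is refused with the reason in the body.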
This should always produce a 400-series Client Error. Exactly which error is the choice of the API/CGI developer. I'd expect either a 405, 406, 416 or the 'catch-all' 417. The API developer has control over the text (body) of these error messages and can include more useful information.

Resources