HTTP status code for resource that is not available yet - http

I have a DB table with a report_url column. As soon as a backend done with filling and storing a report it fills that column with S3 link. If the report was not yet stored, the column value is NULL by default. I also have Pyramid API where an endpoint is declared returning Response with body of report content. So, whenever the user makes request, according controller will be fired to get the report link and download the file and return it to user. However, if report is not done yet (report_url is NULL), I need to inform the user somehow. In this case front-end should receive HTTP status 400, but I have not figured out if this fits best. Or maybe 503 fits better here?

Have a look at available http status codes.
What you probably want is 404, specifically because of this line:
In an API, this can also mean that the endpoint is valid but the
resource itself does not exist.:
Full description:
404 Not Found
The server cannot find the requested resource. In the browser, this
means the URL is not recognized. In an API, this can also mean that
the endpoint is valid but the resource itself does not exist. Servers
may also send this response instead of 403 Forbidden to hide the
existence of a resource from an unauthorized client. This response
code is probably the most well known due to its frequent occurrence on
the web.
If the server is working on getting the report, 102 gets an honorable mention:
102 Processing (WebDAV)
This code indicates that the server has received and is processing the request, but no response is available yet.
it's not part of the standard, it's an extension, WebDAV.
400 status codes are used to let the user know something they did is not working. 500 status codes are used when something is going on with the server. That's how I understand it anyway.
In that way, if this is a "normal" execution of the API/program, perhaps a 200 status code would do just fine. E.g. just define the endpoint to return {"report_url": null} if it isn't ready, otherwise {"report_url": "an actual url"} and then give 200 in each case. And the receiving party handles it depending on if it is null or not. The pro of this method is, now the user can know that it is definitely a proper endpoint (and not an url typo, which would also give 404). However, you could make your own 404 page saying "report is not ready" or "report does not exist" for example. The con of this 200 method is some speed penalty since you have to send an unnecessary response body.
Disclaimer: I am not a web/http expert at all.

The correct HTTP status code is 202 - Accepted. The documentation says:
The 202 (Accepted) status code indicates that the request has been accepted for processing, but the processing has not been completed.
..
The representation sent with this response ought to describe the request's current status and point to (or embed) a status monitor that can provide the user with an estimate of when the request will be fulfilled.

Related

What is the most appropriate HTTP response from a backend service when attempting to remove an entry that no longer exists in the Database?

My team is developing a simple backend service that provides the operations ADD, GET and REMOVE a very simple item. All are triggered by an http request and they do not much besides adding, getting and removing the item from a database.
Regarding the specific scenario in which a REMOVE operation is triggered on a item that is not present in the DB (e.g. was removed before), our question is what should be the response of the service? We having been debating options like 200 + some specific message, 410 - resource gone, amongst other 2XX and 4XX possibilities, but we haven't reached a consensus.
I hope this is not Bikeshedding.
Thank you for your help.
What should be the response of the service?
It's important to highlight that status codes are meant to indicate the result of the server's attempt to understand and satisfy the client request. Having said that, 2xx status codes are unsuitable for this situation and should be avoided:
The 2xx (Successful) class of status code indicates that the client's request was successfully received, understood, and accepted.
The most suitable status code would be in the 4xx range:
The 4xx (Client Error) class of status code indicates that the client seems to have erred. Except when responding to a HEAD request, the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
The 404 status code seems to be what you are looking for, as it indicates that the server can't find the requested resource:
6.5.4. 404 Not Found
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource or is not willing to disclose that one exists. A 404 status code does not indicate whether this lack of representation is temporary or permanent; [...]
If you are concerned on how the client will understand the 404 reponse, you could provide them with a payload stating that such resource is no longer available.
And just bear in mind that ADD and REMOVE are not standard HTTP methods. Hopefully that was a typo and you are using POST (or PUT) and DELETE to express operations over your resources.

Which HTTP status code should I use for a health-check failure?

I'm implementing a /_status/ endpoint which does some sanity checks on data in our database.
For example, we are collecting measurements and the status should go "bad" if the latest measurement is over an hour old.
I would like to point Pingdom at this URL to leverage their alerting infrastructure and tell us when something's wrong.
On a "good" status I will serve an HTML page with an HTTP 200 OK status. But what would an appropriate HTTP status code be for "bad"? Or would it be more correct not to convey this information via status code, but via HTML content instead?
Thanks!
Well... this is an old question, but I ended up here, so I thought I'd give my two cents here:
It seems pretty clear that a 2xx should be returned if all is OK
If health is not OK, I think it should return a 5xx result (4xx talks about the client being at fault in the request; 2xx and 3xx are all successful to some degree).
I think that a 5xx is correct because this is a special request that is answering about the state of the whole service. Also, because most Load Balancers offer liveliness checks based on response codes and not all offer a way to parse a more complex payload (other than perhaps a RegExp Match which can make the check brittle).
I agree with #Julien that a 500 (specifically) doesn't seem appropriate, and we've decided on 503 Service Unavailable.
503 seems to fit for a couple of reasons:
It's a 5xx family result code which indicates that something is going on on the server side.
It has a temporary nature to it indicating that it may recover.
We just had a similar discussion in our group. We decided for our purposes that the HTTP response codes should be reporting on your server's success or failure to honor the request. For a GET, this would mean whether or not you can respond with the requested resource. In this case, the requested resource is a health report, so as long as you're returning that successfully, it should be a 200 response.
We're returning JSON for our health check, with a top-level "isHealthy" field set to true or false. Our load balancer and other monitors will parse the JSON and use this field to determine if the system is healthy or not.
If you don't want to parse JSON in your monitors, you could try putting a custom response header to indicate binary health of the system, e.g., System-Health: true or System-Health: false. You might have better luck getting monitors which can check that.
If you really want to use a response code, I would recommend an additional endpoint called something like "health" which returns a "204 No Content" when healthy, and a "404 Not Found" when not healthy. In this case, the resource defined by the URL is, symbolically, the health of your system, and so if it's healthy, you can return a successful response. If it's unhealthy, then it's health can't be found, hence the 404.
If your data is 'bad' because there is a service failure (even if that is a backend job failing) then a HTTP 500 seems like a valid response. It indicates that something, somewhere is broken.
It isn't very specific, you're shrugging your shoulders and saying:
The 500 (Internal Server Error) status code indicates that the server
encountered an unexpected condition that prevented it from fulfilling
the request.
ietf rfc7231
If you ask for health and the server state is not healthy, I'm partial to 409 Conflict which "Indicates that the request could not be processed because of conflict in the current state of the resource" .
Some people might object that if you can respond then the request can be processed, but I disagree. Every error message is a response. The server defines resource semantics. If you ask for the good news resource and the server responds "here is bad news", it didn't give you what it defines to have offered at that resource.
In practice, it's much easier to say 2**="up" 4**="down" and pipe request counts into an availability metric and have a load balancer remove the server from its pool based on the response code. Coming up with ways to argue that "hey, we told you something, so 200 OK" just seems like missing the forrest for the trees to me.

Should I use HTTP 4xx to indicate HTML form errors?

I just spent 20 minutes debugging some (django) unit tests. I was testing a view POST, and I was expecting a 302 return code, after which I asserted a bunch database entities were as expected. Turns out a recently merged commit had added a new form field, and my tests were failing because I wasn't including the correct form data.
The problem is that the tests were failing because the HTTP return code was 200, not 302, and I could only work out the problem by printing out the response HTTP and looking through it. Aside from the irritation of having to look through HTML to work out the problem, a 200 seems like the wrong code for a POST that doesn't get processed. A 4xx (client error) seems more appropriate. In addition, it would have made debugging the test a cinch, as the response code would have pointed me straight at the problem.
I've read about using 422 (Unprocessable Entity) as a possible return code within REST APIs, but can't find any evidence of using it within HTML views / handlers.
My question is - is anyone else doing this, and if not, why not?
[UPDATE 1]
Just to clarify, this question relates to HTML forms, and not an API.
It is also a question about HTTP response codes per se - not Django. That just happens to be what I'm using. I have removed the django tag.
[UPDATE 2]
Some further clarification, with W3C references (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html):
10.2 Successful 2xx
This class of status code indicates that the client's request was successfully received, understood, and accepted.
10.4 Client Error 4xx
The 4xx class of status code is intended for cases in which the client seems to have erred.
10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax.
And from https://www.rfc-editor.org/rfc/rfc4918#page-78
11.2. 422 Unprocessable Entity
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions. For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
[UPDATE 3]
Digging in to it, 422 is a WebDAV extension[1], which may explain its obscurity. That said, since Twitter use 420 for their own purposes, I think I'll just whatever I want. But it will begin with a 4.
[UPDATE 4]
Notes on the use of custom response codes, and how they should be treated (if unrecognised), from HTTP 1.1 specification (https://www.rfc-editor.org/rfc/rfc2616#section-6.1.1):
HTTP status codes are extensible. HTTP applications are not required
to understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, applications MUST
understand the class of any status code, as indicated by the first
digit, and treat any unrecognized response as being equivalent to the
x00 status code of that class, with the exception that an
unrecognized response MUST NOT be cached. For example, if an
unrecognized status code of 431 is received by the client, it can
safely assume that there was something wrong with its request and
treat the response as if it had received a 400 status code. In such
cases, user agents SHOULD present to the user the entity returned
with the response, since that entity is likely to include human-
readable information which will explain the unusual status.
[1] https://www.rfc-editor.org/rfc/rfc4918
You are right that 200 is wrong if the outcome is not success.
I'd also argue that a success-with-redirect-to-result-page should be 303, not 302.
4xx is correct for client error. 422 seems right to me. In any case, don't invent new 4xx codes without registering them through IANA.
It's obvious that some form POST requests should result in a 4xx HTTP error (e.g. wrong URL, lacking an expected field, failing to send an auth cookie), but mistyping passwords or accidentally omitting required fields are extremely common and expected occurrences in an application.
It doesn't seem clear from any spec that every form invalidation problem must constitute an HTTP error.
I guess my intuition is that, if a server sends a client a form, and the client promptly replies with a correctly-formed POST request to that form with all expected fields, a common business logic violation shouldn't be an HTTP error.
The situation seems even less defined if a client-side script is using HTTP as a transport mechanism. E.g. if a JSON-RPC requests sends form details, the server-side function is successfully called and the response returned to the caller, seems like a 200 success.
Anecdotally: Logging in with bad credentials yields a 200 from Facebook, Google, and Wikipedia, and a 204 from Amazon.
Ideally the IETF would clear this up with an RFC, maybe adding an HTTP error code for "the operation was not performed due to a form invalidation failure" or expanding the definition of 422 to cover this.
There doesn't appear to be an accepted answer, which to be honest, is a bit surprising. Form validation is such a cornerstone of web development that the fact that there is no response code to illustrate a validation failure seems like a missed opportunity. Particularly given the proliferation of automated testing. It doesn't seem practical to test the response by examining the HTML content for an error message rather than just testing the response code.
I stick by my assertion in the question that 200 is the wrong response code for a request that fails business rules - and that 302 is also inappropriate. (If a form fails validation, then it should not have updated any state on the server, is therefore idempotent, and there is no need to use the PRG pattern to prevent users from resubmitting the form. Let them.)
So, given that there isn't an 'approved' method, I'm currently testing (literally) with my own - 421. I will report back if we run into any issues with using non-standard HTTP status codes.
If there are no updates to this answer, then we're using it in production, it works, and you could do the same.
The POST returns 200 if you do not redirect.
The 302 is not sent automatically in headers after POST request, so you have to send the header (https://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpResponse) manually and the code does not relay on data of the form.
The reason of the redirection back to the form (or whatever) with code 302 is to disallow browser to send the data repeatedly on refresh or history browsing.

HTTP status code for "no data available" from an external datasource

Scenario:
A POST request is sent to process an order that will result in data retrieval from an external datasource.
There are three possible results:
The datasource returned data for the request
No data was available for the request (this is viewed as an error)
The datasource couldn't be accessed (may be down for maintenance)
An obvious response for 1 is 200: OK or 201: Created (an entity is created from this request).
What status codes would be appropriate for 2 and 3?
Status codes I have considered:
503: Service Unavailable when datasource is down
500: Internal Server Error when datasource is down
502: Bad Gateway when "no data available"
404: Not Found when "no data available"
403: Forbidden when "no data available"
412: Precondition Failed when "no data available"
2) Looking back at this, I agree it should probably be either a 204 No Content or maybe a 200 with a body indicating no records or resources could be found depending on the structure returned.
404's are generally used when the resource URI doesn't exist or a resource in the URI is not found in the case of a restful service.
3) 503 Service Unavailable
The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.
Note: The existence of the 503 status code does not imply that a
server must use it when becoming overloaded. Some servers may wish
to simply refuse the connection.
3) I agree with 503 for this
2) Frankly I think a good argument could be made for using 204 in case 2 You can include metainfo in the header to indicate specifically what 'went wrong'. It really depends on how much you consider this case to be 'an error' at the API level.
If the API itself is functioning as intended, and the request was to a valid endpoint, by an authenticated and authorized user and did not cause the server to malfunction, then very few of the 400 or 500 series errors would really seem to apply.
for example, 404 usually means the URI you called does not exist, if it does exist, then using that code is misleading at least IMHO
**10.2.5 204 No Content**
The server has fulfilled the request but does not need to return an
entity-body, and might want to return updated metainformation. The
response MAY include new or updated metainformation in the form of
entity-headers, which if present SHOULD be associated with the
requested variant.
If the client is a user agent, it SHOULD NOT change its document view
from that which caused the request to be sent. This response is
primarily intended to allow input for actions to take place without
causing a change to the user agent's active document view, although
any new or updated metainformation SHOULD be applied to the document
currently in the user agent's active view.
The 204 response MUST NOT include a message-body, and thus is always
terminated by the first empty line after the header fields.
HTTP 404 - With your own error message like "No data found".
Twitter uses 404.
Reference: https://developer.twitter.com/en/docs/basics/response-codes.html
The datasource returned data for the request
200: OK/201: CREATED
Because everything is working as expected
No data was available for the request (this is viewed as an error)
400: BAD REQUEST
The request was invalid or cannot be otherwise served. An accompanying error message will explain further inside the body.like:
HTTP 400
{
response: null,
code: "USER_101", //should be used customized error codes here
error: "User details not found"
}
The datasource couldn't be accessed (may be down for maintenance)
404: Resource/URI NOT FOUND
The URI requested or resource is invalid
Like: https://www.lipsum.com/list-page
**/list-page** is not defined/found
Find here most frequently used status codes:
200 – OK
Everything is working, The resource has been fetched and is transmitted in the message body.
201 – CREATED
A new resource has been created
204 – NO CONTENT
The resource was successfully deleted, no response body
304 – NOT MODIFIED
This is used for caching purposes. It tells the client that the response has not been modified, so the client can continue to use the same cached version of the response.
400 – BAD REQUEST
The request was invalid or cannot be served. The exact error should be explained in the error payload.
401 – UNAUTHORIZED
The request requires user authentication.
403 – FORBIDDEN
The server understood the request but is refusing it or the access is not allowed.
404 – NOT FOUND
There is no resource behind the URI.
500 – INTERNAL SERVER ERROR API
If an error occurs in the global catch blog, the stack trace should be logged and not returned as a response.
In my opinion the best way to handle this is with a 200 no result object.
Why?
You have a response that you can do something with without a lot of trouble. I searched, everything worked correctly but there wasn't anything in the database to give a result. Therefore, result = null and a message explaining as much. If something found this in the network calls it is not a security risk.
If you are concerned with a security risk then a 204 is probably the best approach.
res.status(200).send({
result: null,
message: 'No result'
});

Which HTTP status code to use for required parameters not provided?

I have several pages designed to be called with AJAX - I have them return an abnormal status code if they can't be displayed, and my javascript will show an error box accordingly.
For example, if the user is not authenticated or their session has timed out and they try to call one of the AJAX pages, it will return 401 Unathorized.
I also have some return 500 Internal Server Error if something really odd happens server-side.
What status code should I return if one of these pages was called without required parameters? (and therefore can't return any content).
I had a look at the wikipedia article on HTTP status codes, but the closest one I could find to the code I'm looking for was this:
422 Unprocessable Entity
The request was well-formed but was unable to be followed due to semantic errors.
Edit: The above code is WebDAV specific and therefore unlikely to be appropriate in this case
Can anyone think of an appropriate code to return?
What status code should I return if one of these pages was called without required parameters? (and therefore can't return any content).
You could pick 404 Not Found:
The server has not found anything matching the Request-URI [assuming your required parameters are part of the URI, i.e. $_GET]. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
(highlight by me)
404 Not Found is a subset of 400 Bad Request which could be taken as well because it's very clear about what this is:
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
This is normally more common with missing/wrong-named post fields, less with get requests.
As Luca Fagioli comments, strictly speaking 404, etc. are not a subset of the 400 code, and correctly speaking is that they fall into the 4xx class that denotes the server things this is a client error.
In that 4xx class, a server should signal whether the error situation is permanent or temporary, which includes to not signal any of it when this makes sense, e.g. it can't be said or would not be of benefit to share. 404 is useful in that case, 400 is useful to signal the client to not repeat the request unchanged. In the 400 case, it is important then for any request method but a HEAD request, to communicate back all the information so that a consumer can verify the request message was received complete by the server and the specifics of "bad" in the request are visible from the response message body (to reduce guesswork).
I can't actually suggest that you pick a WEBDAV response code that does not exist for HTTP clients using hypertext, but you could, it's totally valid, you're the server coder, you can actually take any HTTP response status code you see fit for your HTTP client of which you are the designer as well:
11.2. 422 Unprocessable Entity
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions. For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
IIRC request entity is the request body. So if you're operating with request bodies, it might be appropriate as Julian wrote.
You commented:
IMHO, the text for 400 speaks of malformed syntax. I would assume the syntax here relates to the syntax of HTTP string that the client sends across to the server.
That could be, but it can be anything syntactically expressed, the whole request, only some request headers, or a specific request header, the request URI etc.. 400 Is not specifically about "HTTP string syntax", it's infact the general answer to a client error:
The 4xx class of status code is intended for cases in which the client seems to have erred. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method. User agents SHOULD display any included entity to the user.
The important part is here that you must tell the client what went wrong. The status code is just telling that something went wrong (in the 4xx class), but HTTP has not been specifically designed to make a missing query-info part parameter noteable as error condition. By fact, URI only knows that there is a query-info part and not what it means.
If you think 400 is too broad I suggest you pick 404 if the problem is URI related, e.g. $_GET variables.
I don't know about the RFC writers' intentions, but the status code I have seen used in the wild for that case is 400 Bad Request.
422 is a regular HTTP status code; and it is used outside WebDAV. Contrary to what others say, there's no problem with that; HTTP has a status code registry for a reason.
See http://www.iana.org/assignments/http-status-codes
Read this carefully:
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
422 is a WebDAV-specific thing, and I haven't seen it used for anything else.
400, even though not intended for this particular purpose, seems to be a common choice.
404 is also a viable choice if your API is RESTful or similar (using the path part of the URI to indicate search parameters)
Description as quoted against 400
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
(Emphasis mine)
That speaks of malformed syntax, which is not the case when the browser sends a request to the server. Its just the case of missing parameters (while there's no malformed syntax).
I would suggest stick with 404 :)
(Experts correct me if I am wrong anywhere :) )
I need to answer this old question because the most upvoted and accepted answer is plain wrong.
From RFC 9110 - HTTP Semantics:
The 400 (Bad Request) status code indicates that the server cannot or
will not process the request due to something that is perceived to be
a client error (e.g., malformed request syntax, invalid request
message framing, or deceptive request routing).
So 400 is what you have to use.
Do not use 404, because you will completely mislead the API consumer. 404 means that the resource was not found on the server:
The 404 (Not Found) status code indicates that the origin server did
not find a current representation for the target resource or is not
willing to disclose that one exists.

Resources