HTTP health check - GET or HEAD and 200 or 204 response? - http

I’m wondering if there is a general convention for this: When implementing a HTTP health check for any given application where you are not interested in any response body but just the status code, what would the default/expected endpoint look like?
Using a HEAD request - and returning 200 or 204 status code (which one of those?)
Using a GET with 204
something else?

As of my experience, people use mostly GET and 200. A health check wouldn't respond too much content, so no use of making a HEAD request. But this is mostly the case with a dedicated health check URL.
Today's cloud systems often use Kubernetes or OpenShift. They appear to use a GET request. I think they'll probably want to get a 200ish response code, so 200-299:
https://docs.openshift.com/enterprise/3.0/dev_guide/application_health.html
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Another example, Drupal defines the HTTP response code to be 200:
https://www.drupal.org/project/health_check_url
In Oracle's Infrastructure-as-a-Service docs you can choose between GET and HEAD requests, but the default is HEAD:
https://docs.oracle.com/en-us/iaas/api/#/en/healthchecks/20180501/HttpMonitor/

Use a GET with 204 possibly supporting also HEAD with same status code
A HEAD should give the same response as GET but without response body, so you should first know/define what the GET response gives out in terms of headers (and status code), then, if you want, you can support also HEAD on the same endpoint, returning the same status, in this case 204.
Note that if GET employee/34 anwswers with 404 also HEAD must anwser with same code. That means one must do the same work as for GET: check if employee esists, set status etc. but must not write any response. Tomcat supports this automatically as it uses for HEAD request a response object that never writes to the "real" response, so one can use same code handling GET
For a check one may consider also TRACE but it produces a response body / output mirroring what you send to it, is different, I haven't seen implemented anywhere.
TRACE allows the client to see what is being received at the other
end of the request chain and use that data for testing or diagnostic
information.

Related

How to name my custom HTTP codes?

We're building a backend with a number of APIs. What should be the ideal range of HTTP codes that I should be using?
I've gone through https://en.wikipedia.org/wiki/List_of_HTTP_status_codes and they give a list such as:
1xx Informational
2xx Success
3xx Redirection
4xx Client Error
5xx Server Error
But since I want to implement my own status codes for various purposes such as missing email, I want to name the response accordingly.
So, according to me, a missing email response should trigger a 4xx response as it's a client error. What I'm trying to understand is that should I look for the first open slot such as #419 or should I begin to number the HTTP codes after #451?
You shouldn't use any custom codes at all - they might conflict with future standardization.
If you think you have a use case for a new code that is of general use, propose it in the right place (the HTTP Working Group).
If you just need something specific to your application, use a 400 (in this case), and provide additional information in the response body.

REST HTTP status code for "success, and thus you no longer have access"

My RESTful service includes a resource representing an item ACL. To update this ACL, a client does a PUT request with the new ACL as its entity. On success, the PUT response entity contains the sanitized, canonical version of the new ACL.
In most cases, the HTTP response status code is fairly obvious. 200 on success, 403 if the user isn't permitted to edit the ACL, 400 if the new ACL is malformed, 404 if they try to set an ACL on a nonexistent item, 412 if the If-Match header doesn't match, and the like.
There is one case, however, where the correct HTTP status code isn't obvious. What if the authenticated user uses PUT to remove themselves from the ACL? We need to indicate that the request has succeeded but that they no longer have access to the resource.
I've considered returning 200 with the new ACL in the PUT entity, but this lacks any indication that they no longer have the ability to GET the resource. I've considered directly returning 403, but this doesn't indicate that the PUT was successful. I've considered returning 303 with the Location pointing back to the same resource (where a subsequent GET will give a 403), but this seems like a misuse of 303 given that the resource hasn't moved.
So what's the right REST HTTP status code for "success, and thus you no longer have access"?
200 is the appropriate response, because it indicates success (as any 2xx code implies). You may distinguish the user's lack of permission in the response (or, if you don't wish to, 204 is fine). Status codes make no contract that future requests will return the same code: a 200 response to the PUT does not mean a subsequent GET can't return 403. In general, servers should never try to tell clients what will happen if they issue a particular request. HTTP clients should almost always leap before they look and be prepared to handle almost any response code.
You should read the updated description of the PUT method in httpbis; it discusses not only the use of 200/204 but indicates on a careful reading that returning a transformed representation in immediate response to the PUT is not appropriate; instead, use an ETag or Last-Modified header to indicate whether the entity the client sent was transformed or not. If it was, the client should issue a subsequent GET rather than expecting the new representation to be sent in response to the PUT, if for no other reason than to update any caches along the way (because the response to a PUT is not cacheable). Section 6.3.1 agrees: the response to a PUT should represent the status of the action, not the resource itself. Note also that, for a new ACL, you MUST return 201, not 200.
You're confusing two semantic ideas, and trying to combine them into a single response code.
The first: That you successfully created an ACL at the location that you were attempting to. The correct semantic response (in either a RESTful or non-RESTful scenario) is a 201 Created. From the RFC: "The request has been fulfilled and resulted in a new resource being created."
The second: That the user who executed the PUT does not have access to this resource any more. This is a transient idea - what if the ACL is updated, or something changes before the next request? The idea that a user does not have access to a resource of any kind (and this includes an ACL resource) only matters for the scope of that request. Before the next request is executed, something could change. On a single request where a user does not have access to something you should return a 403 Forbidden.
Your PUT method should return a 201. If the client is worried about whether it has access any more, it should make a subsequent request to determine it's status.
You might want to take a look at HTTP response code "204 No Content" (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html), indicating that the "server has fulfilled the request [to be removed from the ACL] but does not need to return an entity-body, and might want to return updated metainformation" (here, as a result of the successful removal). Although you're not allowed to return a message body with 204, you can return entity headers indicating changes to the user's access to the resource. I got the idea from Amazon S3 - they return a 204 on a successful DELETE request (http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectDELETE.html), which seems to resemble your situation since by removing yourself from an ACL, you've blocked access to that resource in the future.
Very interesting question :-) This is why I love REST, sometimes it might get you crazy. Reading w3 http status code definitions I would choose (this of course is just my humble opinion) one of those:
202 Accepted - since this mean "well yes I got your request, I will process it but come back later and see what happens" - and when the user comes back later she'll get a 403(which should be expected behavior)
205 Reset Content - "Yep, I understood you want to remove yourself please make a new request, when you come back you'll get 403"
On the other hand (just popped-up in my mind), why should you introduce a separate logic and differentiate that case and not using 200 ? Is this rest going to be used from some client application that has an UI? And the user of the rest should show a pop-up to the end-user "Are you sure you want to remove yourself from the ACL?" well here the case can be handled if your rest returns 200 and just show a pop-up "Are you sure you want to remove user with name from the ACL?", no need to differentiate the two cases. If this rest will be used for some service-to-service communication(i.e. invoked only from another program) again why should you differentiate the cases here the program wouldn't care which user will be removed from the ACL.
Hope that helps.

Should I use HTTP 4xx to indicate HTML form errors?

I just spent 20 minutes debugging some (django) unit tests. I was testing a view POST, and I was expecting a 302 return code, after which I asserted a bunch database entities were as expected. Turns out a recently merged commit had added a new form field, and my tests were failing because I wasn't including the correct form data.
The problem is that the tests were failing because the HTTP return code was 200, not 302, and I could only work out the problem by printing out the response HTTP and looking through it. Aside from the irritation of having to look through HTML to work out the problem, a 200 seems like the wrong code for a POST that doesn't get processed. A 4xx (client error) seems more appropriate. In addition, it would have made debugging the test a cinch, as the response code would have pointed me straight at the problem.
I've read about using 422 (Unprocessable Entity) as a possible return code within REST APIs, but can't find any evidence of using it within HTML views / handlers.
My question is - is anyone else doing this, and if not, why not?
[UPDATE 1]
Just to clarify, this question relates to HTML forms, and not an API.
It is also a question about HTTP response codes per se - not Django. That just happens to be what I'm using. I have removed the django tag.
[UPDATE 2]
Some further clarification, with W3C references (http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html):
10.2 Successful 2xx
This class of status code indicates that the client's request was successfully received, understood, and accepted.
10.4 Client Error 4xx
The 4xx class of status code is intended for cases in which the client seems to have erred.
10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax.
And from https://www.rfc-editor.org/rfc/rfc4918#page-78
11.2. 422 Unprocessable Entity
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions. For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
[UPDATE 3]
Digging in to it, 422 is a WebDAV extension[1], which may explain its obscurity. That said, since Twitter use 420 for their own purposes, I think I'll just whatever I want. But it will begin with a 4.
[UPDATE 4]
Notes on the use of custom response codes, and how they should be treated (if unrecognised), from HTTP 1.1 specification (https://www.rfc-editor.org/rfc/rfc2616#section-6.1.1):
HTTP status codes are extensible. HTTP applications are not required
to understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, applications MUST
understand the class of any status code, as indicated by the first
digit, and treat any unrecognized response as being equivalent to the
x00 status code of that class, with the exception that an
unrecognized response MUST NOT be cached. For example, if an
unrecognized status code of 431 is received by the client, it can
safely assume that there was something wrong with its request and
treat the response as if it had received a 400 status code. In such
cases, user agents SHOULD present to the user the entity returned
with the response, since that entity is likely to include human-
readable information which will explain the unusual status.
[1] https://www.rfc-editor.org/rfc/rfc4918
You are right that 200 is wrong if the outcome is not success.
I'd also argue that a success-with-redirect-to-result-page should be 303, not 302.
4xx is correct for client error. 422 seems right to me. In any case, don't invent new 4xx codes without registering them through IANA.
It's obvious that some form POST requests should result in a 4xx HTTP error (e.g. wrong URL, lacking an expected field, failing to send an auth cookie), but mistyping passwords or accidentally omitting required fields are extremely common and expected occurrences in an application.
It doesn't seem clear from any spec that every form invalidation problem must constitute an HTTP error.
I guess my intuition is that, if a server sends a client a form, and the client promptly replies with a correctly-formed POST request to that form with all expected fields, a common business logic violation shouldn't be an HTTP error.
The situation seems even less defined if a client-side script is using HTTP as a transport mechanism. E.g. if a JSON-RPC requests sends form details, the server-side function is successfully called and the response returned to the caller, seems like a 200 success.
Anecdotally: Logging in with bad credentials yields a 200 from Facebook, Google, and Wikipedia, and a 204 from Amazon.
Ideally the IETF would clear this up with an RFC, maybe adding an HTTP error code for "the operation was not performed due to a form invalidation failure" or expanding the definition of 422 to cover this.
There doesn't appear to be an accepted answer, which to be honest, is a bit surprising. Form validation is such a cornerstone of web development that the fact that there is no response code to illustrate a validation failure seems like a missed opportunity. Particularly given the proliferation of automated testing. It doesn't seem practical to test the response by examining the HTML content for an error message rather than just testing the response code.
I stick by my assertion in the question that 200 is the wrong response code for a request that fails business rules - and that 302 is also inappropriate. (If a form fails validation, then it should not have updated any state on the server, is therefore idempotent, and there is no need to use the PRG pattern to prevent users from resubmitting the form. Let them.)
So, given that there isn't an 'approved' method, I'm currently testing (literally) with my own - 421. I will report back if we run into any issues with using non-standard HTTP status codes.
If there are no updates to this answer, then we're using it in production, it works, and you could do the same.
The POST returns 200 if you do not redirect.
The 302 is not sent automatically in headers after POST request, so you have to send the header (https://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpResponse) manually and the code does not relay on data of the form.
The reason of the redirection back to the form (or whatever) with code 302 is to disallow browser to send the data repeatedly on refresh or history browsing.

Which HTTP status code to use for required parameters not provided?

I have several pages designed to be called with AJAX - I have them return an abnormal status code if they can't be displayed, and my javascript will show an error box accordingly.
For example, if the user is not authenticated or their session has timed out and they try to call one of the AJAX pages, it will return 401 Unathorized.
I also have some return 500 Internal Server Error if something really odd happens server-side.
What status code should I return if one of these pages was called without required parameters? (and therefore can't return any content).
I had a look at the wikipedia article on HTTP status codes, but the closest one I could find to the code I'm looking for was this:
422 Unprocessable Entity
The request was well-formed but was unable to be followed due to semantic errors.
Edit: The above code is WebDAV specific and therefore unlikely to be appropriate in this case
Can anyone think of an appropriate code to return?
What status code should I return if one of these pages was called without required parameters? (and therefore can't return any content).
You could pick 404 Not Found:
The server has not found anything matching the Request-URI [assuming your required parameters are part of the URI, i.e. $_GET]. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
(highlight by me)
404 Not Found is a subset of 400 Bad Request which could be taken as well because it's very clear about what this is:
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
This is normally more common with missing/wrong-named post fields, less with get requests.
As Luca Fagioli comments, strictly speaking 404, etc. are not a subset of the 400 code, and correctly speaking is that they fall into the 4xx class that denotes the server things this is a client error.
In that 4xx class, a server should signal whether the error situation is permanent or temporary, which includes to not signal any of it when this makes sense, e.g. it can't be said or would not be of benefit to share. 404 is useful in that case, 400 is useful to signal the client to not repeat the request unchanged. In the 400 case, it is important then for any request method but a HEAD request, to communicate back all the information so that a consumer can verify the request message was received complete by the server and the specifics of "bad" in the request are visible from the response message body (to reduce guesswork).
I can't actually suggest that you pick a WEBDAV response code that does not exist for HTTP clients using hypertext, but you could, it's totally valid, you're the server coder, you can actually take any HTTP response status code you see fit for your HTTP client of which you are the designer as well:
11.2. 422 Unprocessable Entity
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions. For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
IIRC request entity is the request body. So if you're operating with request bodies, it might be appropriate as Julian wrote.
You commented:
IMHO, the text for 400 speaks of malformed syntax. I would assume the syntax here relates to the syntax of HTTP string that the client sends across to the server.
That could be, but it can be anything syntactically expressed, the whole request, only some request headers, or a specific request header, the request URI etc.. 400 Is not specifically about "HTTP string syntax", it's infact the general answer to a client error:
The 4xx class of status code is intended for cases in which the client seems to have erred. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method. User agents SHOULD display any included entity to the user.
The important part is here that you must tell the client what went wrong. The status code is just telling that something went wrong (in the 4xx class), but HTTP has not been specifically designed to make a missing query-info part parameter noteable as error condition. By fact, URI only knows that there is a query-info part and not what it means.
If you think 400 is too broad I suggest you pick 404 if the problem is URI related, e.g. $_GET variables.
I don't know about the RFC writers' intentions, but the status code I have seen used in the wild for that case is 400 Bad Request.
422 is a regular HTTP status code; and it is used outside WebDAV. Contrary to what others say, there's no problem with that; HTTP has a status code registry for a reason.
See http://www.iana.org/assignments/http-status-codes
Read this carefully:
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
422 is a WebDAV-specific thing, and I haven't seen it used for anything else.
400, even though not intended for this particular purpose, seems to be a common choice.
404 is also a viable choice if your API is RESTful or similar (using the path part of the URI to indicate search parameters)
Description as quoted against 400
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
(Emphasis mine)
That speaks of malformed syntax, which is not the case when the browser sends a request to the server. Its just the case of missing parameters (while there's no malformed syntax).
I would suggest stick with 404 :)
(Experts correct me if I am wrong anywhere :) )
I need to answer this old question because the most upvoted and accepted answer is plain wrong.
From RFC 9110 - HTTP Semantics:
The 400 (Bad Request) status code indicates that the server cannot or
will not process the request due to something that is perceived to be
a client error (e.g., malformed request syntax, invalid request
message framing, or deceptive request routing).
So 400 is what you have to use.
Do not use 404, because you will completely mislead the API consumer. 404 means that the resource was not found on the server:
The 404 (Not Found) status code indicates that the origin server did
not find a current representation for the target resource or is not
willing to disclose that one exists.

HTTP 400 (bad request) for logical error, not malformed request syntax

The HTTP/1.1 specification (RFC 2616) has the following to say on the meaning of status code 400, Bad Request (§10.4.1):
The request could not be understood by
the server due to malformed syntax.
The client SHOULD NOT repeat the
request without modifications.
There seems to be a general practice among a few HTTP-based APIs these days to use 400 to mean a logical rather than a syntax error with a request. My guess is that APIs are doing this to distinguish between 400 (client-induced) and 500 (server-induced). Is it acceptable or incorrect to use 400 to indicate non-syntactic errors? If it is acceptable, is there an annotated reference on RFC 2616 that provides more insight into the intended use of 400?
Examples:
Google Data Protocol, Protocol Reference, HTTP Status Codes
Status 422 (RFC 4918, Section 11.2) comes to mind:
The 422 (Unprocessable Entity) status code means the server understands the content type of the request entity (hence a 415(Unsupported Media Type) status code is inappropriate), and the syntax of the request entity is correct (thus a 400 (Bad Request) status code is inappropriate) but was unable to process the contained instructions. For example, this error condition may occur if an XML request body contains well-formed (i.e., syntactically correct), but semantically erroneous, XML instructions.
As of this time, the latest draft of the HTTPbis specification, which is intended to replace and make RFC 2616 obsolete, states:
The 400 (Bad Request) status code indicates that the server cannot or
will not process the request because the received syntax is invalid,
nonsensical, or exceeds some limitation on what the server is willing
to process.
This definition, while of course still subject to change, ratifies the widely used practice of responding to logical errors with a 400.
HTTPbis will address the phrasing of 400 Bad Request so that it covers logical errors as well. So 400 will incorporate 422.
From https://datatracker.ietf.org/doc/html/draft-ietf-httpbis-p2-semantics-18#section-7.4.1
"The server cannot or will not process the request, due to a client error (e.g., malformed syntax)"
Even though, I have been using 400 to represent logical errors also, I have to say that returning 400 is wrong in this case because of the way the spec reads. Here is why i think so, the logical error could be that a relationship with another entity was failing or not satisfied and making changes to the other entity could cause the same exact to pass later. Like trying to (completely hypothetical) add an employee as a member of a department when that employee does not exist (logical error). Adding employee as member request could fail because employee does not exist. But the same exact request could pass after the employee has been added to the system.
Just my 2 cents ... We need lawyers & judges to interpret the language in the RFC these days :)
Thank You,
Vish
It could be argued that having incorrect data in your request is a syntax error, even if your actual request at the HTTP level (request line, headers etc) is syntactically valid.
For example, if a Restful web service is documented as accepting POSTs with a custom XML Content Type of application/vnd.example.com.widget+xml, and you instead send some gibberish plain text or a binary file, it seems resasonable to treat that as a syntax error - your request body is not in the expected form.
I don't know of any official references to back this up though, as usual it seems to be down to interpreting RFC 2616.
Update: Note the revised wording in RFC 7231 §6.5.1:
The 400 (Bad Request) status code indicates that the server cannot or will not process the request due to something that is perceived to be a client error e.g., malformed request syntax, invalid request message framing, or deceptive request routing).
seems to support this argument more than the now obsoleted RFC 2616 §10.4.1 which said just:
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
On Java EE servers a 400 is returned if your URL refers to a non-existent "web -application". Is that a "syntax error"? Depends on what you mean by syntax error. I would say yes.
In English syntax rules prescribe certain relationships between parts of speech. For instance "Bob marries Mary" is syntactically correct, because it follows the pattern {Noun + Verb + Noun}. Whereas
"Bob marriage Mary" would be syntactically incorrect, {Noun + Noun + Noun}.
The syntax of a simple URLis { protocol + : + // + server + : + port }. According to this "http://www.google.com:80" is syntactically correct.
But what about "abc://www.google.com:80"? It seems to follow the exact same pattern. But really
it is a syntax error. Why? Because 'abc' is not a DEFINED protocol.
The point is that determining whether or not we have a 400 situation requires more than parsing the characters and spaces and delimiters. It must also recognize what are the valid "parts of speech".
This is difficult.
I think we should;
Return 4xx errors only when the client has the power to make a change to the request, headers or body, that will result in the request succeeding with the same intent.
Return error range codes when the expected mutation has not occured, i.e. a DELETE didn't happen or a PUT didn't change anything. However, a POST is more interesting because the spec says it should be used to either create resources at a new location, or just process a payload.
Using the example in Vish's answer, if the request intends to add employee Priya to a department Marketing but Priya wasn't found or her account is archived, then this is an application error.
The request worked fine, it got to your application rules, the client did everything properly, the ETags matched etc. etc.
Because we're using HTTP we must respond based on the effect of the request on the state of the resource. And that depends on your API design.
Perhaps you designed this.
PUT { updated members list } /marketing/members
Returning a success code would indicate that the "replacement" of the resource worked; a GET on the resource would reflect your changes, but it wouldn't.
So now you have to choose a suitable negative HTTP code, and that's the tricky part, since the codes are strongly intended for the HTTP protocol, not your application.
When I read the official HTTP codes, these two look suitable.
The 409 (Conflict) status code indicates that the request could not be completed due to a conflict with the current state of the target resource. This code is used in situations where the user might be able to resolve the conflict and resubmit the request. The server SHOULD generate a payload that includes enough information for a user to recognize the source of the conflict.
And
The 500 (Internal Server Error) status code indicates that the server encountered an unexpected condition that prevented it from fulfilling the request.
Though we've traditionally considered the 500 to be like an unhandled exception :-/
I don't think its unreasonable to invent your own status code so long as its consistently applied and designed.
This design is easier to deal with.
PUT { membership add command } /accounts/groups/memberships/instructions/1739119
Then you could design your API to always succeed in creating the instruction, it returns 201 Created and a Location header and any problems with the instruction are held within that new resource.
A POST is more like that last PUT to a new location. A POST allows for any kind of server processing of a message, which opens up designs that say something like "The action successfully failed."
Probably you already wrote an API that does this, a website. You POST the payment form and it was successfully rejected because the credit card number was wrong.
With a POST, whether you return 200 or 201 along with your rejection message depends on whether a new resource was created and is available to GET at another location, or not.
With that all said, I'd be inclined to design APIs that need fewer PUTs, perhaps just updating data fields, and actions and stuff that invokes rules and processing or just have a higher chance of expected failures, can be designed to POST an instruction form.
In my case:
I am getting 400 bad request because I set content-type wrongly. I changed content type then able to get response successfully.
Before (Issue):
ClientResponse response = Client.create().resource(requestUrl).queryParam("noOfDates", String.valueOf(limit))
.header(SecurityConstants.AUTHORIZATION, formatedToken).
header("Content-Type", "\"application/json\"").get(ClientResponse.class);
After (Fixed):
ClientResponse response = Client.create().resource(requestUrl).queryParam("noOfDates", String.valueOf(limit))
.header(SecurityConstants.AUTHORIZATION, formatedToken).
header("Content-Type", "\"application/x-www-form-urlencoded\"").get(ClientResponse.class);

Resources