What is the correct status code if request URL does not match regex - http

I'm trying to develop error handling in my REST API and I'm currently working on the response codes. I haven't found a proper answer to what would be the appropriate response code for my problem.
I have routes which are managed like this in the back-end code:
$site->route([
'route' => '/{id:^[a-z]+$}'
]);
Now lets say the user inputs www.example.com/page1 which does not match the regex pattern. What would be the correct response? I am thinking either a 404, page not found, but I also think that a 400 response would be correct, because it describes that there was an error with the request.
I already use 404 if static URLs are invalid, so this is a question of matching dynamic parts of the URLs.
What are the field of use of both of these response codes in the case of REST APIs?

I think 404 is more appropriate.
According to the specification, 400 means:
The 400 (Bad Request) status code indicates that the server cannot or
will not process the request due to something that is perceived to be
a client error.
While 404 means:
The 404 (Not Found) status code indicates that the origin server did
not find a current representation for the target resource or is not
willing to disclose that one exists.
From client point of view, request to /page1 will result in error, because the resource page1 does not exist (no route is matched in server side).
Normally, 400 means the server has already targeted the resource, but cannot return that resource due to client error. In this scenario, server cannot target the resource if no route is matched. Anyway, if the request is sent to /123456 and the status code is 400, what should be the error message? "Your requested URL is incorrect" sounds like 404, and "Your requested URL should follow regular expression: ...^[a-z]+$..." sounds very weird.

Related

HTTP status code for resource that is not available yet

I have a DB table with a report_url column. As soon as a backend done with filling and storing a report it fills that column with S3 link. If the report was not yet stored, the column value is NULL by default. I also have Pyramid API where an endpoint is declared returning Response with body of report content. So, whenever the user makes request, according controller will be fired to get the report link and download the file and return it to user. However, if report is not done yet (report_url is NULL), I need to inform the user somehow. In this case front-end should receive HTTP status 400, but I have not figured out if this fits best. Or maybe 503 fits better here?
Have a look at available http status codes.
What you probably want is 404, specifically because of this line:
In an API, this can also mean that the endpoint is valid but the
resource itself does not exist.:
Full description:
404 Not Found
The server cannot find the requested resource. In the browser, this
means the URL is not recognized. In an API, this can also mean that
the endpoint is valid but the resource itself does not exist. Servers
may also send this response instead of 403 Forbidden to hide the
existence of a resource from an unauthorized client. This response
code is probably the most well known due to its frequent occurrence on
the web.
If the server is working on getting the report, 102 gets an honorable mention:
102 Processing (WebDAV)
This code indicates that the server has received and is processing the request, but no response is available yet.
it's not part of the standard, it's an extension, WebDAV.
400 status codes are used to let the user know something they did is not working. 500 status codes are used when something is going on with the server. That's how I understand it anyway.
In that way, if this is a "normal" execution of the API/program, perhaps a 200 status code would do just fine. E.g. just define the endpoint to return {"report_url": null} if it isn't ready, otherwise {"report_url": "an actual url"} and then give 200 in each case. And the receiving party handles it depending on if it is null or not. The pro of this method is, now the user can know that it is definitely a proper endpoint (and not an url typo, which would also give 404). However, you could make your own 404 page saying "report is not ready" or "report does not exist" for example. The con of this 200 method is some speed penalty since you have to send an unnecessary response body.
Disclaimer: I am not a web/http expert at all.
The correct HTTP status code is 202 - Accepted. The documentation says:
The 202 (Accepted) status code indicates that the request has been accepted for processing, but the processing has not been completed.
..
The representation sent with this response ought to describe the request's current status and point to (or embed) a status monitor that can provide the user with an estimate of when the request will be fulfilled.

What is the most appropriate HTTP response from a backend service when attempting to remove an entry that no longer exists in the Database?

My team is developing a simple backend service that provides the operations ADD, GET and REMOVE a very simple item. All are triggered by an http request and they do not much besides adding, getting and removing the item from a database.
Regarding the specific scenario in which a REMOVE operation is triggered on a item that is not present in the DB (e.g. was removed before), our question is what should be the response of the service? We having been debating options like 200 + some specific message, 410 - resource gone, amongst other 2XX and 4XX possibilities, but we haven't reached a consensus.
I hope this is not Bikeshedding.
Thank you for your help.
What should be the response of the service?
It's important to highlight that status codes are meant to indicate the result of the server's attempt to understand and satisfy the client request. Having said that, 2xx status codes are unsuitable for this situation and should be avoided:
The 2xx (Successful) class of status code indicates that the client's request was successfully received, understood, and accepted.
The most suitable status code would be in the 4xx range:
The 4xx (Client Error) class of status code indicates that the client seems to have erred. Except when responding to a HEAD request, the server SHOULD send a representation containing an explanation of the error situation, and whether it is a temporary or permanent condition.
The 404 status code seems to be what you are looking for, as it indicates that the server can't find the requested resource:
6.5.4. 404 Not Found
The 404 (Not Found) status code indicates that the origin server did not find a current representation for the target resource or is not willing to disclose that one exists. A 404 status code does not indicate whether this lack of representation is temporary or permanent; [...]
If you are concerned on how the client will understand the 404 reponse, you could provide them with a payload stating that such resource is no longer available.
And just bear in mind that ADD and REMOVE are not standard HTTP methods. Hopefully that was a typo and you are using POST (or PUT) and DELETE to express operations over your resources.

Is it suitable to use 200 HTTP status code for a forbidden web page?

What is the difference when we use 200 response status code for a forbidden page with an error message saying 'Access Denied' instead of using 403 response status code?
Are there any security implications?
The HTTP Response codes convey information about how the server has processed your request. So, if the server responds with 200, it means: "OK, I have received your request and processed it successfully". If it returns 403, it would mean: "I received your request successfully, but you don't have access to this resource".
However, technically they are both returned in the same format, in the same way in the response HTTP header like this:
HTTP/1.1 200 OK
HTTP/1.1 403 Forbidden
The difference is in the meaning. And the meanings are defined in the standard.
So, when you are responding with code 200, you are telling the client that it is all good and dandy. If you are responding to client with 403, you are saying that the client doesn't have permission to this resource. Remember, there can be different clients: web browsers, crawlers, ajax requests from javascript, etc.
So, if you are sending a login form with 200 code:
Users who are using a web browser would understand that they need to login.
Google crawler will index your members/quality-content URL with the login form and will not understand that actually, the original content is different and it should not index this page with the login form.
Javascript with ajax callback will run success callback, when it should be running error callback function.
So, basically, make us all a favour and follow the standards! :)
Answering your second question, no it does not make your application any less secure.
The reason for this decision might be that error message was not visiable using Internet explorer like described here: How do I suppress "friendly error messages" in Internet Explorer?
Actually the correct way is to use the right HTTP error code and make the error message longer than 512 bytes as described here:
https://support.microsoft.com/en-us/kb/294807
Response status codes are intended to help the client to understand the result of the request. Whenever possible, you should use the proper status codes.
The semantics of the status codes are defined in the RFC 7231, the current reference for HTTP/1.1.
While the 200 status code indicates that the request has succeeded, the 403 status code indicates that the server understood the request but refuses to authorize it:
6.3.1. 200 OK
The 200 (OK) status code indicates that the request has succeeded. The payload sent in a 200 response depends on the request method. [...]
6.5.3. 403 Forbidden
The 403 (Forbidden) status code indicates that the server understood the request but refuses to authorize it. A server that wishes to make public why the request has been forbidden can describe that reason in the response payload (if any). [...]
Returning 200 will work, for sure. But why would you return 200 if you can return a much more meaningful status code? If is there any good reason, this should be added to your question.

HTTP status code for "no data available" from an external datasource

Scenario:
A POST request is sent to process an order that will result in data retrieval from an external datasource.
There are three possible results:
The datasource returned data for the request
No data was available for the request (this is viewed as an error)
The datasource couldn't be accessed (may be down for maintenance)
An obvious response for 1 is 200: OK or 201: Created (an entity is created from this request).
What status codes would be appropriate for 2 and 3?
Status codes I have considered:
503: Service Unavailable when datasource is down
500: Internal Server Error when datasource is down
502: Bad Gateway when "no data available"
404: Not Found when "no data available"
403: Forbidden when "no data available"
412: Precondition Failed when "no data available"
2) Looking back at this, I agree it should probably be either a 204 No Content or maybe a 200 with a body indicating no records or resources could be found depending on the structure returned.
404's are generally used when the resource URI doesn't exist or a resource in the URI is not found in the case of a restful service.
3) 503 Service Unavailable
The server is currently unable to handle the request due to a temporary overloading or maintenance of the server. The implication is that this is a temporary condition which will be alleviated after some delay. If known, the length of the delay MAY be indicated in a Retry-After header. If no Retry-After is given, the client SHOULD handle the response as it would for a 500 response.
Note: The existence of the 503 status code does not imply that a
server must use it when becoming overloaded. Some servers may wish
to simply refuse the connection.
3) I agree with 503 for this
2) Frankly I think a good argument could be made for using 204 in case 2 You can include metainfo in the header to indicate specifically what 'went wrong'. It really depends on how much you consider this case to be 'an error' at the API level.
If the API itself is functioning as intended, and the request was to a valid endpoint, by an authenticated and authorized user and did not cause the server to malfunction, then very few of the 400 or 500 series errors would really seem to apply.
for example, 404 usually means the URI you called does not exist, if it does exist, then using that code is misleading at least IMHO
**10.2.5 204 No Content**
The server has fulfilled the request but does not need to return an
entity-body, and might want to return updated metainformation. The
response MAY include new or updated metainformation in the form of
entity-headers, which if present SHOULD be associated with the
requested variant.
If the client is a user agent, it SHOULD NOT change its document view
from that which caused the request to be sent. This response is
primarily intended to allow input for actions to take place without
causing a change to the user agent's active document view, although
any new or updated metainformation SHOULD be applied to the document
currently in the user agent's active view.
The 204 response MUST NOT include a message-body, and thus is always
terminated by the first empty line after the header fields.
HTTP 404 - With your own error message like "No data found".
Twitter uses 404.
Reference: https://developer.twitter.com/en/docs/basics/response-codes.html
The datasource returned data for the request
200: OK/201: CREATED
Because everything is working as expected
No data was available for the request (this is viewed as an error)
400: BAD REQUEST
The request was invalid or cannot be otherwise served. An accompanying error message will explain further inside the body.like:
HTTP 400
{
response: null,
code: "USER_101", //should be used customized error codes here
error: "User details not found"
}
The datasource couldn't be accessed (may be down for maintenance)
404: Resource/URI NOT FOUND
The URI requested or resource is invalid
Like: https://www.lipsum.com/list-page
**/list-page** is not defined/found
Find here most frequently used status codes:
200 – OK
Everything is working, The resource has been fetched and is transmitted in the message body.
201 – CREATED
A new resource has been created
204 – NO CONTENT
The resource was successfully deleted, no response body
304 – NOT MODIFIED
This is used for caching purposes. It tells the client that the response has not been modified, so the client can continue to use the same cached version of the response.
400 – BAD REQUEST
The request was invalid or cannot be served. The exact error should be explained in the error payload.
401 – UNAUTHORIZED
The request requires user authentication.
403 – FORBIDDEN
The server understood the request but is refusing it or the access is not allowed.
404 – NOT FOUND
There is no resource behind the URI.
500 – INTERNAL SERVER ERROR API
If an error occurs in the global catch blog, the stack trace should be logged and not returned as a response.
In my opinion the best way to handle this is with a 200 no result object.
Why?
You have a response that you can do something with without a lot of trouble. I searched, everything worked correctly but there wasn't anything in the database to give a result. Therefore, result = null and a message explaining as much. If something found this in the network calls it is not a security risk.
If you are concerned with a security risk then a 204 is probably the best approach.
res.status(200).send({
result: null,
message: 'No result'
});

Which HTTP status code to use for required parameters not provided?

I have several pages designed to be called with AJAX - I have them return an abnormal status code if they can't be displayed, and my javascript will show an error box accordingly.
For example, if the user is not authenticated or their session has timed out and they try to call one of the AJAX pages, it will return 401 Unathorized.
I also have some return 500 Internal Server Error if something really odd happens server-side.
What status code should I return if one of these pages was called without required parameters? (and therefore can't return any content).
I had a look at the wikipedia article on HTTP status codes, but the closest one I could find to the code I'm looking for was this:
422 Unprocessable Entity
The request was well-formed but was unable to be followed due to semantic errors.
Edit: The above code is WebDAV specific and therefore unlikely to be appropriate in this case
Can anyone think of an appropriate code to return?
What status code should I return if one of these pages was called without required parameters? (and therefore can't return any content).
You could pick 404 Not Found:
The server has not found anything matching the Request-URI [assuming your required parameters are part of the URI, i.e. $_GET]. No indication is given of whether the condition is temporary or permanent. The 410 (Gone) status code SHOULD be used if the server knows, through some internally configurable mechanism, that an old resource is permanently unavailable and has no forwarding address. This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable.
(highlight by me)
404 Not Found is a subset of 400 Bad Request which could be taken as well because it's very clear about what this is:
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
This is normally more common with missing/wrong-named post fields, less with get requests.
As Luca Fagioli comments, strictly speaking 404, etc. are not a subset of the 400 code, and correctly speaking is that they fall into the 4xx class that denotes the server things this is a client error.
In that 4xx class, a server should signal whether the error situation is permanent or temporary, which includes to not signal any of it when this makes sense, e.g. it can't be said or would not be of benefit to share. 404 is useful in that case, 400 is useful to signal the client to not repeat the request unchanged. In the 400 case, it is important then for any request method but a HEAD request, to communicate back all the information so that a consumer can verify the request message was received complete by the server and the specifics of "bad" in the request are visible from the response message body (to reduce guesswork).
I can't actually suggest that you pick a WEBDAV response code that does not exist for HTTP clients using hypertext, but you could, it's totally valid, you're the server coder, you can actually take any HTTP response status code you see fit for your HTTP client of which you are the designer as well:
11.2. 422 Unprocessable Entity
The 422 (Unprocessable Entity) status code means the server
understands the content type of the request entity (hence a
415(Unsupported Media Type) status code is inappropriate), and the
syntax of the request entity is correct (thus a 400 (Bad Request)
status code is inappropriate) but was unable to process the contained
instructions. For example, this error condition may occur if an XML
request body contains well-formed (i.e., syntactically correct), but
semantically erroneous, XML instructions.
IIRC request entity is the request body. So if you're operating with request bodies, it might be appropriate as Julian wrote.
You commented:
IMHO, the text for 400 speaks of malformed syntax. I would assume the syntax here relates to the syntax of HTTP string that the client sends across to the server.
That could be, but it can be anything syntactically expressed, the whole request, only some request headers, or a specific request header, the request URI etc.. 400 Is not specifically about "HTTP string syntax", it's infact the general answer to a client error:
The 4xx class of status code is intended for cases in which the client seems to have erred. Except when responding to a HEAD request, the server SHOULD include an entity containing an explanation of the error situation, and whether it is a temporary or permanent condition. These status codes are applicable to any request method. User agents SHOULD display any included entity to the user.
The important part is here that you must tell the client what went wrong. The status code is just telling that something went wrong (in the 4xx class), but HTTP has not been specifically designed to make a missing query-info part parameter noteable as error condition. By fact, URI only knows that there is a query-info part and not what it means.
If you think 400 is too broad I suggest you pick 404 if the problem is URI related, e.g. $_GET variables.
I don't know about the RFC writers' intentions, but the status code I have seen used in the wild for that case is 400 Bad Request.
422 is a regular HTTP status code; and it is used outside WebDAV. Contrary to what others say, there's no problem with that; HTTP has a status code registry for a reason.
See http://www.iana.org/assignments/http-status-codes
Read this carefully:
https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
422 is a WebDAV-specific thing, and I haven't seen it used for anything else.
400, even though not intended for this particular purpose, seems to be a common choice.
404 is also a viable choice if your API is RESTful or similar (using the path part of the URI to indicate search parameters)
Description as quoted against 400
The request could not be understood by the server due to malformed syntax. The client SHOULD NOT repeat the request without modifications.
(Emphasis mine)
That speaks of malformed syntax, which is not the case when the browser sends a request to the server. Its just the case of missing parameters (while there's no malformed syntax).
I would suggest stick with 404 :)
(Experts correct me if I am wrong anywhere :) )
I need to answer this old question because the most upvoted and accepted answer is plain wrong.
From RFC 9110 - HTTP Semantics:
The 400 (Bad Request) status code indicates that the server cannot or
will not process the request due to something that is perceived to be
a client error (e.g., malformed request syntax, invalid request
message framing, or deceptive request routing).
So 400 is what you have to use.
Do not use 404, because you will completely mislead the API consumer. 404 means that the resource was not found on the server:
The 404 (Not Found) status code indicates that the origin server did
not find a current representation for the target resource or is not
willing to disclose that one exists.

Resources