*Concrete* risks of using HTTP GET for requests that have side-effects? (I know the theory.) - http

I know that, in principle, HTTP GET requests shouldn't have any side-effects (i.e. idempotent).
But I'm experimenting and, so far, my test HTTP GET requests that change data on a test database work just fine: it doesn't look like the HTTP GET requests ever get re-requested.
Now my tests are not representative of the real world. In the real world, any middle-man (e.g. Cloudflare) could take in consideration the fact that the HTTP request is GET and re-request the HTTP request, for example upon flaky networking. But I wonder if this actually ever happens?
Note that the HTTP requests in question are all browser-side fetch() JSON requests using the Fetch API. (I think that's relevant because while Cloudflare does re-request HTML resources, I don't think Cloudflare would ever re-request a fetch() JSON request.)
My gut feeling says that, while in theory such HTTP GET requests are allowed to be re-requested, it doesn't happen in practice. Am I right or is there a situation showing that I'm wrong?
Context
I'm the author of Telefunc which is a JavaScript/TypeScript RPC implementation and I'd like to make all Telefunc's HTTP request be GET, even when the user makes database changes. I'd like to do this because this would enable Telefunc to support ETag caching for all requests and without the user having to configure anything.

Related

Is there an real advantage on using the correct HTTP methods?

Most people use GET and POST for all of their requisitions, are there any major problems that should be considered a real reason to respect the correct semanthics?
What are the disadvantages on the lack of using "HEAD","PUT","DELETE","TRACE",etc.?
The semantics of every HTTP verb is known by clients (like browsers) and intermediate equipments as well. They treat requests accordingly (in terms of logging, caching, replaying, etc...)
Some examples:
A GET/HEAD request is considered "safe" and can be replayed a number of times without breaking anything. On the other hand, a POST request is not considered "safe" because it implies a remote state modification. That's why when you navigate by posting a form, and you click the "back" button of your browser, your browser asks if you want to POST the form again
A GET request should be used to retrieve a remote resource. That's why the resource can be cached in intermediate HTTP-aware equipments in order to reduce the number of requests on the backend systems. The response of a GET request is considered cacheable (it's implemented in Service Workers, web servers, HTTP proxies, etc...).
If you use a GET request in place of a POST, that means you're sending the payload in the URL and that means potential sensitive data logged by the backend, and all of the intermediate equipments (an URL is considered as a loggable non sensitive information)
About PUT, PATCH, DELETE, there is no major concern about using a POST instead of one of these, they're here to help you build self-documented RESTful APIs, define fine-grained authorizations on the endpoints and are pretty similar to POST when we forget the semantics (non cacheable, non replayable, hold the data in the body, if concerned)
GET requests are generally for requesting data and POST is for sending data. The biggest difference is that GET parameters are visible in the URL after a "?" while POST is sent in the header of the request. POST requests are more secure in this way and should be used to send sensitive data such as passwords. GET requests can be used to send parameters through just the URL. For example if your website dynamically loads pages based on parameters, it is useful to use get. Youtube uses get to pass the video id.
https://www.youtube.com/watch?v=dQw4w9WgXcQ

What is the actual difference between the different HTTP request methods besides semantics?

I have read many discussions on this, such as the fact the PUT is idempotent and POST is not, etc. However, doesn't this ultimately depend on how the server is implemented? A developer can always build the backend server such that the PUT request is not idempotent and creates multiple records for multiple requests. A developer can also build an endpoint for a PUT request such that it acts like a DELETE request and deletes a record in the database.
So my question is, considering that we don't take into account any server side code, is there any real difference between the HTTP methods? For example, GET and POST have real differences in that you can't send a body using a GET request, but you can send a body using a POST request. Also, from my understanding, GET requests are usually cached by default in most browsers.
Are HTTP request methods anything more than just a logical structure (semantics) so that as developers we can "expect" a certain behavior based on the type of HTTP request we send?
You are right that most of the differences are on the semantic level, and if your components decide to assign other semantics, this will work as well. Unless there are components involved that you do not control (libraries, proxies, load balancers, etc).
For instance, some component might take advantage of the fact that PUT it idempotent and thus can re retried, while POST is not.
The Hypertext Transfer Protocol (HTTP) is designed to enable communications between clients and servers.
HTTP works as a request-response protocol between a client and server.
A web browser may be the client, and an application on a computer that hosts a web site may be the server.
Example: A client (browser) submits an HTTP request to the server; then the server returns a response to the client. The response contains status information about the request and may also contain the requested content.
HTTP Methods
GET
POST
PUT
HEAD
DELETE
PATCH
OPTIONS
The GET Method
GET is used to request data from a specified resource.
GET is one of the most common HTTP methods.
Note that the query string (name/value pairs) is sent in the URL of a GET request.
The POST Method
POST is used to send data to a server to create/update a resource.
The data sent to the server with POST is stored in the request body of the HTTP request.
POST is one of the most common HTTP methods.
The PUT Method
PUT is used to send data to a server to create/update a resource.
The difference between POST and PUT is that PUT requests are idempotent. That is, calling the same PUT request multiple times will always produce the same result. In contrast, calling a POST request repeatedly have side effects of creating the same resource multiple times.
The HEAD Method
HEAD is almost identical to GET, but without the response body.
In other words, if GET /users returns a list of users, then HEAD /users will make the same request but will not return the list of users.
HEAD requests are useful for checking what a GET request will return before actually making a GET request - like before downloading a large file or response body.
The DELETE Method
The DELETE method deletes the specified resource.
The OPTIONS Method
The OPTIONS method describes the communication options for the target resource.
src. w3schools

HTTP Status 404 or 400 if no such API endpoint exists?

Status code 400 Bad Request is used when the client has sent a request that cannot be processed due to being malformed.
Status code 404 Not Found is used when the requested resource does not exist / cannot be found.
My question is, when a client sends a request to an endpoint my API does not serve, which of these status codes is more appropriate?
Should an endpoint be considered a "resource", and thus a 404 be returned? My issue with this is that if the client only checks the status code, they cannot tell the difference between a 404 indicating that they got to the correct endpoint, but there was no result matching their query, versus a 404 indicating that they queried a non-existing endpoint.
Alternatively, should we expect that a client has prior knowledge of all available API endpoints, and thus treat their request as malformed and return a 400 if the endpoint they are trying to reach does not exist?
Maybe this depends on whether the endpoints are REST or not. If they are REST endpoints, the client should not need prior API knowledge, but be able to learn about all relevant API endpoints by navigating the API from a single root endpoint. In such a case, I guess 404 would be more appropriate.
In my specific case right now, this is an internal (non-REST) HTTP API, where I expect the client to have prior knowledge of all API endpoints, so I am leaning towards 400, to avoid issues where 404 from accessing the wrong endpoint could be misconstrued as a 404 indicating that what they sought from the correct endpoint could not be found.
Thoughts?
As a convenience, many modern APIs provide human-readable endpoints for developer convenience. The intent of REST, however, is that URLs are treated as opaque - they may happen to contain semantic content, but can't be relied upon to do so. There's no such thing as a "malformed" URL. There's only a URL that points to something and a URL that doesn't.
Now, that's the REST dogma (and arguably also the HTTP 1.1 spec). That doesn't mean it's what you should do. If you have a single internal client for your API, and that's not going to change, you have a lot of flexibility in designing your own standards. Just make sure to document them, especially those that might confuse the guy straight out of college that they hire to replace you when you move on.

Do any modern browsers ever issue an HTTP HEAD request?

I understand what a HEAD request is, and what it could be used for. Will any standard, modern browser ever send a HEAD request? If so, in what context?
A browser will send a HEAD request if that is explicitly requested in an XMLHttpRequest, but I'm fairly certain that the browser will never send a HEAD request of its own accord. My evidence is that the Tornado web server defaults to returning an error for HEAD requests and I've never heard of anyone running into problems related to this (or even being aware of it).
HEAD is mostly obsolete IMHO: on a dynamic web site it is unlikely to be significantly more efficient than a GET, and it can usually be replaced by one of the following:
Conditional GET with If-Modified-Since or If-None-Match (used for caching)
Partial GET with Range header (used for e.g. streaming video)
OPTIONS (used for CORS preflight requests)

Post in REST API design

I've been under the impression that Post in Rest means "Create".
But after reading up on the spec http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.5
It seems like it can be more than just Create?
That was also stated by Stormpath in their screencasts on rest api design.
According to Stormpath, Post means "Process" , which can be pretty much anything.
Is that the correct way to see it?
I can trigger custom actions for my resources using Post?
In theory, a POST request should attempt to create or modify some resource on the server. As #Tichodroma pointed out, an idempotent request will affect this change only the first time it is sent, but otherwise what's important is that some state on the server will be changed by the request.
More practically. POST requests are often used when the request payload is too large to fit into a GET URI (e.g. a large file upload). This is usually an intentional breach of HTTP standards to avoid a 414 Request-URI Too Long response.
In terms of verbiage, I don't know if I like "process", because even a GET request will usually be "processed" to determine the resource to return. The main difference in my mind is the change of some state on the server.

Resources