Why is it useful to convert HEAD requests to GET requests? - http

In a new Phoenix app the Plug.Head plug is present by default and I was intrigued about its significance.
I know that "the HEAD method is identical to GET except that the server MUST NOT send a message body in the response".
I think the official Phoenix guides are top-notch but this threw me off in the Routing guide:
Plug.Head - converts HEAD requests to GET requests and strips the
response body
If HEAD requests are without a body then why is this needed? I thought maybe to rein in malformed requests but looking at the Plug.Head implementation, it just switches the HEAD method to GET.
def call(%Conn{method: "HEAD"} = conn, []), do: %{conn | method: "GET"}
def call(conn, []), do: conn
end
The closest thing I was able to find on this topic is a question on ServerFault but it was related to NGINX and a flawed application logic where HEAD requests needed to be converted to GET and the respective GET responses back to HEAD.

Since Phoenix is largely inspired by Rails, you can safely bet Plug.Head is inspired by Rack::Head.
A HEAD request returns the same response as GET but with headers only. So to produce the correct headers, they're routed to GET actions in your Phoenix app.
However to produce the correct (empty) body, the response's body must be stripped. Because Rack::Head is middleware, it gets to do so after it gets the response from controllers.
In contrast, Plug's architecture works more like a pipeline, Plug.Head modifies the method and passes conn along, but never sees it again.
If you see cdegroot's answer, the responsibility to strip the response's body is passed to the Plug.Conn.Adapter to implement (i.e. the webserver).

AFAICT, the idea is that Plug.Head just ensures that the request is processed as a GET; the second part of implementing HEAD, which is not sending a body, is done by the Plug connection adapters. The documentation for most callbacks, like send_resp, specifies that "If the request has method "HEAD", the adapter should not send the response to the client."

Related

How can I handle arbitrary incoming `application/json` HTTP requests in Odoo?

I'd like to accept and respond to JSON requests in Odoo from sources that may be out of my control. The reason this is not straightforward is because Odoo is forcing me to use JSON-RPC, which is not suitable for the source I'm interacting with.
For example, if I set the route type to http in the #http.route decorator, Odoo rejects the request if the mimetype is application/json but the body has no content. This isn't going to work in my case because I may not be able to choose what the other source sends to me. Additionally, I am unable to send back a custom JSON response unless the incoming request doesn't have the application/json mimetype, which again is not in my control.
I have done a lot of searching on the internet and read much of Odoo's HTTP source code. The "solution" I keep seeing everywhere is to monkey patch the JsonRequest class one way or another. This allows me to indeed respond with whatever I want, however it doesn't allow me to accept whatever the service may send me.
One specific case I need to be able to handle is incoming application/json GET requests with no body. How can I achieve this despite Odoo's heavy handed JSON-RPC handing?
There is no correct way to accomplish this, I'd call the described method acceptable. It applies to versions of Odoo 10 through 15.
In my opinion, it would be better to leave JsonRequest class alone and let it do its JSON-RPC related job. There is odoo.http.Root.get_request method which constructs the json-rpc or http request object, depending on the content type:
class Root(object):
"""Root WSGI application for the OpenERP Web Client.
"""
# ...
def get_request(self, httprequest):
# deduce type of request
if httprequest.mimetype in ("application/json", "application/json-rpc"):
return JsonRequest(httprequest)
else:
return HttpRequest(httprequest)
This point seems to be the most relevant one to be patched, with returning the custom request class object from this method. There is an issue, though - this method is called prior to any route detection. You have to invent a suitable method to tell, which request class object to return.
To have an idea about a possible implementation, please, see OCA base_rest module.

HTTP GET for a large string payload

I have a requirement where I need to make a HTTP request to a Flask server where the payload is a question(string) and a paragraph(string). The server uses machine learning to find the answer to the question within the paragraph and return it.
Now, the paragraph can be huge, as in thousands of words. So will a GET request with a JSON payload be appropriate? or should I be using POST?
will a GET request with a JSON payload be appropriate?
No - the problem here is that the payload of a GET request has no defined semantics; you have no guarantees that intermediate components will do the right thing with your request.
For example: caches are going to assume that the payload of the request is irrelevant, so your GET request might get a response for a completely different document.
should I be using POST?
Today, you should be using POST.
Eventually, you'll probably end up using the safe-method-with-body, once the HTTP-WG figures out the semantics of the new method and adoption has taken hold.

HTTP health check - GET or HEAD and 200 or 204 response?

I’m wondering if there is a general convention for this: When implementing a HTTP health check for any given application where you are not interested in any response body but just the status code, what would the default/expected endpoint look like?
Using a HEAD request - and returning 200 or 204 status code (which one of those?)
Using a GET with 204
something else?
As of my experience, people use mostly GET and 200. A health check wouldn't respond too much content, so no use of making a HEAD request. But this is mostly the case with a dedicated health check URL.
Today's cloud systems often use Kubernetes or OpenShift. They appear to use a GET request. I think they'll probably want to get a 200ish response code, so 200-299:
https://docs.openshift.com/enterprise/3.0/dev_guide/application_health.html
https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/
Another example, Drupal defines the HTTP response code to be 200:
https://www.drupal.org/project/health_check_url
In Oracle's Infrastructure-as-a-Service docs you can choose between GET and HEAD requests, but the default is HEAD:
https://docs.oracle.com/en-us/iaas/api/#/en/healthchecks/20180501/HttpMonitor/
Use a GET with 204 possibly supporting also HEAD with same status code
A HEAD should give the same response as GET but without response body, so you should first know/define what the GET response gives out in terms of headers (and status code), then, if you want, you can support also HEAD on the same endpoint, returning the same status, in this case 204.
Note that if GET employee/34 anwswers with 404 also HEAD must anwser with same code. That means one must do the same work as for GET: check if employee esists, set status etc. but must not write any response. Tomcat supports this automatically as it uses for HEAD request a response object that never writes to the "real" response, so one can use same code handling GET
For a check one may consider also TRACE but it produces a response body / output mirroring what you send to it, is different, I haven't seen implemented anywhere.
TRACE allows the client to see what is being received at the other
end of the request chain and use that data for testing or diagnostic
information.

Payloads of HTTP Request Methods

The Wikipedia entry on HTTP lists the following HTTP request methods:
HEAD: Asks for the response identical to the one that would correspond to a GET request, but without the response body.
GET: Requests a representation of the specified resource.
POST: Submits data to be processed (e.g., from an HTML form) to the identified resource. The data is included in the body of the request.
PUT: Uploads a representation of the specified resource.
DELETE: Deletes the specified resource.
TRACE: Echoes back the received request, so that a client can see what (if any) changes or additions have been made by intermediate servers.
OPTIONS: Returns the HTTP methods that the server supports for specified URL. This can be used to check the functionality of a web server by requesting '*' instead of a specific resource.
CONNECT: Converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.
PATCH: Is used to apply partial modifications to a resource.
I'm interested in knowing (specifically regarding the first five methods):
which of these methods are able (supposed to?) receive payloads
of the methods that can receive payloads, how do they receive it?
via query string in URL?
via URL-encoded body?
via raw / chunked body?
via a combination of ([all / some] of) the above?
I appreciate all input, if you could share some (preferably light) reading that would be great too!
Here is the summary from RFC 7231, an updated version of the link #Darrel posted:
HEAD - No defined body semantics.
GET - No defined body semantics.
PUT - Body supported.
POST - Body supported.
DELETE - No defined body semantics.
TRACE - Body not supported.
OPTIONS - Body supported but no semantics on usage (maybe in the future).
CONNECT - No defined body semantics
As #John also mentioned, all request methods support query strings in the URL (one notable exception might be OPTIONS which only seems to be useful [in my tests] if the URL is HOST/*).
I haven't tested the CONNECT and PATCH methods since I have no interest in them ATM.
RFC 7231, HTTP 1.1 Semantics and Content, is the most up-to-date and authoritative source on the semantics of the HTTP methods. This spec says that there are no defined meaning for a payload that may be included in a GET, HEAD, OPTIONS, or CONNECT message. Section 4.3.8 says that the client must not send a body for a TRACE request. So, only TRACE cannot have a payload, but GET, HEAD, OPTIONS, and CONNECT probably won't and the server isn't expected to know how to handle it if the client sends one (meaning it can ignore it).
If you believe anything is ambiguous, then there is a mailing list where you can voice your concerns.
I'm pretty sure it's not clear whether or not GET requests can have payloads. GET requests generally post form data through the query string, same for HEAD requests. HEAD is essentially GET - except it doesn't want a response body.
(Side note: I say it's not clear because a GET request could technically upgrade to another protocol; in fact, a version of websockets did just this, and while some proxy software worked fine with it, others chocked upon the handshake.)
POST generally has a body. Nothing is stopping you from using a query string, but the POST body will generally contain form data in a POST.
For more (and more detailed) information, I'd hit the actual HTTP/1.1 specs.

Passing params in the URL when using HTTP POST

Is it allowable to pass parameters to a web page through the URL (after the question mark) when using the POST method? I know that it works (most of the time, anyways) because my company's webapp does it often, but I don't know if it's actually supported in the standard or if I can rely on this behavior. I'm considering implementing a SOAP request handler that uses a parameter after the question mark to indicate that it is a SOAP request and not a normal HTTP request. The reason for this that the webapp is an IIS extension, so everything is accessed via the same URL (ex: example.com/myisapi.dll?command), so to get the SOAP request to be processed, I need to specify that "command" parameter. There would be one generic command for SOAP, not a specific command for each SOAP action -- those would be specified in the SOAP request itself.
Basically, I'm trying to integrate the Apache Axis2/C library into my webapp by letting the webapp handle the HTTP request and then pass off the incoming SOAP XML to Axis2 for handling if it's a SOAP request. Intuitively, I can't see any reason why this wouldn't work, since the URL you're posting to is just an arbitrary URL, as far as all the various components are concerned... it's the server that gives special meaning to the parts after the question mark.
Thanks for any help/insight you can provide.
Lets start with the simple stuff. HTTP GET request variables come from the URI. The URI is a requested resource, and so any webserver should (and apache does) have the entire URI stored in some variable available to the modules or appserver components running within the webserver.
An http POST which is different from an http GET is a separate logical call to the webserver, but it still defines a URI that should process the post. A good webserver (apache being one) will again make the URI available to whatever module or appserver is running within it, then will additionally make available the variables which were sent in the POST headers.
At the point where your application takes control from apache during a POST you should have access to both the GET and POST variables and be able to do whatever control logic you wish, including replying with a SOAP protocol instead of HTML.
If you are asking whether it is possible to send parameters via both GET and POST in a single HTTP request, then the answer is "YES". This is standard functionality that can be used reliably AFAIK.
One such example is sending authentication credentials in two pieces, one over GET and the other through POST so that any attempt to hijack a session would require hijacking both the GET and POST variables.
So in your case, you can use POST to contain the actual SOAP request but test for whether it is a SOAP request based on the parameter passed in GET (or in other words through the URL).
I believe that no standard actually defines the concept of "HTTP parameters" or "request variables". RFC 1738 defines that an URL may have a "search part", which is the substring after the question mark. HTML specifies in the form submission protocol how a browser processing a FORM element should submit it. In either case, how the server-side processes both the search part and the HTTP body is entirely up to the server - discarding both would be conforming to these two specs (but fairly useless).
In order to determine whether you can post a search part to a specific service, you need to study this service's protocol specification. If the service is practically defined by means of a HTML form, then you cannot use a mix - you can't even use POST if the FORM specifies GET (and vice versa). If you post to a web service, you need to look at the web service's WSDL - which will typically mandate POST; with all data in a SOAP message. Etc.
Specific web frameworks may have the notion of "request variables" - whether they will draw these variables both from a search part and a request body, you need to find out in the product documentation.
I deployed a web application with 3 (a mobile network operator) in the UK. It originally used POST parameters, but the 3 gateway stripped them (and X-headers as well!). So beware...
allowable? sure, it's doable, but i'm leaning towards the spec suggesting dual methods isn't necessarily supposed to happen, or be supported. RFC2616 defines HTTP/1.1, and i would argue suggests only one method per request. if you think about your typical HTTP transaction from the client side, you can see the limitation as well:
$ telnet localhost 80
POST /page.html?id=5 HTTP/1.1
host: localhost
as you can see, you can only use one method (POST/GET, etc...), however due to the nature of how various languages operate, they may pick up the query string, and assign it to the GET variable. ultimately though, this is a POST request, and not a GET.
so basically, yes this functionality exists, is it intended? i would say no.

Resources