What does HTTP look like?


I've been working a lot with HTTP related things - HTTP requests, HTTP responses, HTTP methods etc., but I'm not really sure I understand what the protocol itself looks like. Is it a document like a specification?

Hypertext Transfer Protocol (HTTP) provides a pattern for interacting with resources (e.g. webpages on a web server). Essentially it boils down to a request (typically from a browser) and a response (typically from a web server).
The request identifies an action verb such as GET, POST, DELETE, or PUT (there are other verbs too) and a resource (URI/URL) to perform the action on. The example this answer was illustrated with was a browser request to view the Wikipedia main page.
The server then responds to the request with two parts: the response header and the response body. The response header contains a lot of optional information about the server, but the important fields are the status code (200 OK), the content length (54218), and the content type (text/html).
Since the content type is HTML, the browser will try to render the HTML inside the response body. If the content type were something else, such as a Word document, the browser would probably open a save dialog box. The body can represent a plethora of content types, but not every browser supports every content type.
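To make the shape of the messages concrete, here is a minimal sketch in Python that splits a raw HTTP/1.1 request and response into start line, headers, and body. The message text is made up for illustration (the header values and the tiny body are assumptions, not the real Wikipedia exchange), and the parser ignores details such as chunked encoding and repeated headers.

```python
# Illustrative raw HTTP/1.1 messages; the values are invented for this sketch.
RAW_REQUEST = (
    "GET /wiki/Main_Page HTTP/1.1\r\n"
    "Host: en.wikipedia.org\r\n"
    "Accept: text/html\r\n"
    "\r\n"
)

RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<html></html>"
)


def parse_message(raw):
    """Split a raw HTTP/1.x message into (start line, headers dict, body)."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


request_line, request_headers, request_body = parse_message(RAW_REQUEST)
method, target, version = request_line.split(" ")

status_line, response_headers, response_body = parse_message(RAW_RESPONSE)
_, status_code, reason = status_line.split(" ", 2)

print(method, target, version)
print(status_code, reason, response_headers["content-type"], len(response_body))
```

The request line carries the verb and the resource, while the status line and header fields of the response carry the status code, content type, and content length described above.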

Is it a document like a specification?
Yes, HTTP is a protocol over TCP/IP defined in the following specification: http://www.w3.org/Protocols/rfc2616/rfc2616.html
The protocol is implemented by, for example, web servers and browsers.

Related

HTTP Requests, body vs param vs headers vs data

I am new to HTTP requests (GET, POST, PUT, etc.) and I am having some issues understanding the "anatomy" of these procedures.
What exactly is the difference between the body and the data? Are they the same thing? Or are headers the same thing as the param? When authentication takes place, are the username and password params or headers or does it vary from API to API? Any help is greatly appreciated. Are there any tutorials or reads you recommend to better understand how to deal with HTTP requests?
Thank you!

Based on this article and points made by others, here are the differences between HTTP headers, HTTP parameters, and the body:
Header:
metadata about the request
HTTP headers are NOT part of the URL
if it's information about the request or about the client, then a header is appropriate
headers are hidden from end users
global data
can help restrict DoS attacks by checking authorisation in the header, because a header can be accessed before the body is downloaded
Param:
query params are within the URL
like this: "tag=networking&order=newest"
if it's the content of the request itself, then it's a parameter
the product id and requested image size are examples of "some detail" (or parameter) being supplied as part of the content of a request
parameters (query parameters) can be seen by end users in the URL
Body:
data of the business logic
important information
proxy servers are allowed to modify headers, unlike the body
data in specific kinds of requests
you can pass a token in the body, to be encoded and decoded on the server
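The three places listed above can be seen side by side in a small Python sketch using the standard library. The endpoint `api.example.com` and the field values are hypothetical; constructing a `Request` object does not send anything over the network.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

# Hypothetical values, used only for illustration.
query = urlencode({"tag": "networking", "order": "newest"})
body = json.dumps({"title": "What does HTTP look like?"}).encode("utf-8")

req = Request(
    url="https://api.example.com/questions?" + query,  # params: part of the URL
    data=body,                                         # body: the payload itself
    headers={"Content-Type": "application/json"},      # headers: metadata
    method="POST",
)

# Query parameters are recovered from the URL, not from the headers or body.
params = parse_qs(urlsplit(req.full_url).query)

print(params)
# urllib stores header names capitalized as "Content-type".
print(req.get_header("Content-type"))
print(req.data)
```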

For a full and correct understanding of these questions, RFC 2616, recommended by Remy Lebeau, is worth reading.
What exactly is the difference between the body and the data?
In most blog posts, the body (HTTP body) is what is used to transfer data (often in JSON format). The body carries the data; put another way, you get the data from the body.
Are they the same thing?
So they are not quite the same thing.
Or are headers the same thing as the param?
A header (HTTP header) is related to the body: both are parts of the HTTP message.
A param usually refers to an HTTP request parameter, which is the part of the URL after the question mark:
url?paramName=paramValue&paramTwo=Value2
When authentication takes place, are the username and password params
or headers or does it vary from API to API?
It varies between APIs; normally they are not sent as params, and are probably in the body of a POST request.
Again, starting from RFC 2616 would be a good choice.

"Data" is not an HTTP-specific term. Data can be anything.
A "parameter" is also not an HTTP-specific term. Many web frameworks consider everything after the ? in a URL to be parameters, but this is not an absolute truth.
Usernames and passwords sometimes appear in the request body, sometimes in headers. In web applications they are typically in the request body, but certain types of authentication systems place them in the Authorization header.
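As a rough illustration of the two placements, the Python sketch below builds the same hypothetical credentials once as an Authorization header (using the HTTP Basic scheme as one example of a header-based system) and once as a form-encoded request body. The username and password are invented, and Base64 here is an encoding, not encryption.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(username, password):
    """Return an Authorization header value for the HTTP Basic scheme."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


# Hypothetical credentials, for illustration only.
username, password = "alice", "s3cret"

# Option 1: credentials travel in a header.
auth_headers = {"Authorization": basic_auth_header(username, password)}

# Option 2: credentials travel in the body of a POST (e.g. a login form).
form_body = urlencode({"username": username, "password": password}).encode("ascii")

print(auth_headers)
print(form_body)
```

Either way the credentials are readable by anyone who can see the message, so both placements are normally combined with HTTPS.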

REST: HTTP headers or request parameters

I've been putting in some research around REST. I noticed that the Amazon S3 API uses mainly http headers for their REST interface. This was a surprise to me, since I assumed that the interface would work mainly off request parameters.
My question is this: Should I develop my REST interface using mainly http headers, or should I be using request parameters?

The main question is whether the parameters are part of the resource identifier (URI) or not. If so, use request parameters; otherwise use custom HTTP headers. For example, the id of an album in a music gallery must be part of the URI.
Remember, for example, that /employee/id/45 (or /employee?id=45; REST has no prejudice against query string parameters or in favour of clean slash-separated URIs) identifies one resource. Now you could use content negotiation by sending the request header Accept: text/plain or Accept: image/jpeg to get the info or the image. In this respect, the resource is deemed to be the same and the header is only used to define the format of the resource.
Generally, I am not a big fan of custom HTTP headers. They usually assume the client has prior knowledge of the server implementation (not discoverable through natural HTTP means, i.e. hypermedia), which is always considered a REST anti-pattern.
HTTP headers usually define aspects of HTTP orthogonal to what is to be achieved in the process of request/response. The Authorization header (really a misnomer, it should have been authentication) is a classic example.
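A minimal sketch of that idea in Python: one resource, several representations, with the format chosen from the Accept request header (the header HTTP uses for content negotiation). The stored representations are hypothetical placeholders, and the matching is deliberately simplified: it ignores q-values and wildcards such as */*.

```python
# Hypothetical representations of the single resource /employee/id/45.
REPRESENTATIONS = {
    "text/plain": b"Employee 45",
    "image/jpeg": b"\xff\xd8\xff",  # placeholder bytes, not a real image
}


def negotiate(accept_header):
    """Return (media_type, payload) for the first acceptable type, else None."""
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.5" and normalise case.
        media_type = item.split(";")[0].strip().lower()
        if media_type in REPRESENTATIONS:
            return media_type, REPRESENTATIONS[media_type]
    return None


print(negotiate("text/plain"))
print(negotiate("image/jpeg, text/plain;q=0.5"))
print(negotiate("application/xml"))
```

The URI stays the same in all three calls; only the header changes which representation comes back.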

Payloads of HTTP Request Methods

The Wikipedia entry on HTTP lists the following HTTP request methods:
HEAD: Asks for the response identical to the one that would correspond to a GET request, but without the response body.
GET: Requests a representation of the specified resource.
POST: Submits data to be processed (e.g., from an HTML form) to the identified resource. The data is included in the body of the request.
PUT: Uploads a representation of the specified resource.
DELETE: Deletes the specified resource.
TRACE: Echoes back the received request, so that a client can see what (if any) changes or additions have been made by intermediate servers.
OPTIONS: Returns the HTTP methods that the server supports for specified URL. This can be used to check the functionality of a web server by requesting '*' instead of a specific resource.
CONNECT: Converts the request connection to a transparent TCP/IP tunnel, usually to facilitate SSL-encrypted communication (HTTPS) through an unencrypted HTTP proxy.
PATCH: Is used to apply partial modifications to a resource.
I'm interested in knowing (specifically regarding the first five methods):
which of these methods are able (supposed to?) receive payloads
of the methods that can receive payloads, how do they receive it?
via query string in URL?
via URL-encoded body?
via raw / chunked body?
via a combination of ([all / some] of) the above?
I appreciate all input, if you could share some (preferably light) reading that would be great too!

Here is the summary from RFC 7231, an updated version of the link @Darrel posted:
HEAD - No defined body semantics.
GET - No defined body semantics.
PUT - Body supported.
POST - Body supported.
DELETE - No defined body semantics.
TRACE - Body not supported.
OPTIONS - Body supported but no semantics on usage (maybe in the future).
CONNECT - No defined body semantics.
As @John also mentioned, all request methods support query strings in the URL (one notable exception might be OPTIONS which only seems to be useful [in my tests] if the URL is HOST/*).
I haven't tested the CONNECT and PATCH methods since I have no interest in them ATM.
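The summary above can be written down as a small Python lookup table, which is sometimes handy in client code or tests. The wording of the values and the two helper functions are assumptions made for this sketch; PATCH is left out because the summary does not cover it.

```python
# Request-body semantics per method, transcribed from the summary above.
BODY_SEMANTICS = {
    "HEAD": "no defined semantics",
    "GET": "no defined semantics",
    "PUT": "supported",
    "POST": "supported",
    "DELETE": "no defined semantics",
    "TRACE": "not supported",
    "OPTIONS": "supported, no defined use",
    "CONNECT": "no defined semantics",
}


def may_send_body(method):
    """True unless the summary says a body is not supported at all."""
    return BODY_SEMANTICS[method.upper()] != "not supported"


def body_is_meaningful(method):
    """True only where the summary lists the body as plainly supported."""
    return BODY_SEMANTICS[method.upper()] == "supported"


for name in BODY_SEMANTICS:
    print(name, may_send_body(name), body_is_meaningful(name))
```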

RFC 7231, HTTP/1.1 Semantics and Content, is the most up-to-date and authoritative source on the semantics of the HTTP methods. This spec says that there is no defined meaning for a payload that may be included in a GET, HEAD, OPTIONS, or CONNECT message. Section 4.3.8 says that the client must not send a body for a TRACE request. So only TRACE cannot have a payload, but GET, HEAD, OPTIONS, and CONNECT probably won't, and the server isn't expected to know how to handle one if the client sends it (meaning it can ignore it).
If you believe anything is ambiguous, then there is a mailing list where you can voice your concerns.

I'm pretty sure it's not clear whether or not GET requests can have payloads. GET requests generally post form data through the query string, same for HEAD requests. HEAD is essentially GET - except it doesn't want a response body.
(Side note: I say it's not clear because a GET request could technically upgrade to another protocol; in fact, a version of websockets did just this, and while some proxy software worked fine with it, others choked on the handshake.)
POST generally has a body. Nothing is stopping you from using a query string, but the POST body will generally contain form data in a POST.
For more (and more detailed) information, I'd hit the actual HTTP/1.1 specs.
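To illustrate the usual split, this Python sketch encodes the same hypothetical form fields once into the query string of a GET and once into the body of a POST. Nothing is sent over the network; `example.com` and the field names are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

form = {"q": "http methods", "page": "2"}  # hypothetical form fields
encoded = urlencode(form)                  # 'q=http+methods&page=2'

# GET: the form data rides in the query string, and there is no body.
get_req = Request("https://example.com/search?" + encoded)

# POST: the same encoded data becomes the request body.
post_req = Request(
    "https://example.com/search",
    data=encoded.encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

# urllib infers the method: GET when there is no data, POST when there is.
print(get_req.get_method(), get_req.full_url, get_req.data)
print(post_req.get_method(), post_req.full_url, post_req.data)
```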

Specify supported media types when sending "415 unsupported media type"

If a client sends data in an unsupported media type to an HTTP server, the server answers with status "415 unsupported media type". But how can it tell the client which media types are supported? Is there a standard or at least a recommended way to do so? Or would it just be written to the response body as text?

There is no specification at all for what to do in this case, so expect implementations to be all over the place. (What would be sensible would be if the server's response included something like an Accept: header since that has pretty much the right semantics, if currently in the wrong direction.)
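One way to act on that suggestion, sketched in Python: a hypothetical handler that answers 415 and lists the supported types both in an Accept response header and in a plain-text body. The supported types, the function name, and the 204 returned on success are all assumptions made for this example, and clients should not be expected to understand the header unless the API documents it.

```python
SUPPORTED_TYPES = ("application/json", "application/xml")  # hypothetical


def check_media_type(content_type):
    """Return (status, headers, body) for a request with this Content-Type."""
    # Ignore parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type in SUPPORTED_TYPES:
        return 204, {}, ""
    supported = ", ".join(SUPPORTED_TYPES)
    headers = {"Accept": supported, "Content-Type": "text/plain"}
    body = f"Unsupported media type {media_type!r}. Supported: {supported}"
    return 415, headers, body


status, headers, body = check_media_type("text/csv; charset=utf-8")
print(status)
print(headers)
print(body)
```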

I believe you can do this with the OPTIONS HTTP verb.
Also the status code of 300 Multiple Choices could be used if your scenario fits a certain use case. If they send a request with an Accept header of application/xml and you only support text/plain and that representation lives at a distinct URL then you can respond with a 300 and in the Location header the URL of that representation. I realize this might not exactly fit your question, but it's another possible option.
And from the HTTP Spec:
10.4.7 406 Not Acceptable
The resource identified by the request is only capable of generating response entities which have content characteristics not acceptable according to the accept headers sent in the request.
Unless it was a HEAD request, the response SHOULD include an entity containing a list of available entity characteristics and location(s) from which the user or user agent can choose the one most appropriate. The entity format is specified by the media type given in the Content-Type header field. Depending upon the format and the capabilities of the user agent, selection of the most appropriate choice MAY be performed automatically. However, this specification does not define any standard for such automatic selection.
Note: HTTP/1.1 servers are allowed to return responses which are
not acceptable according to the accept headers sent in the
request. In some cases, this may even be preferable to sending a
406 response. User agents are encouraged to inspect the headers of
an incoming response to determine if it is acceptable.

tl;dr;
Edited the generated proxy class to inherit from Microsoft.Web.Services3.WebServicesClientProtocol.
I came across this question when troubleshooting this error, so I thought I would help the next person who might come through here, although not sure if it answers the question as stated. I ran into this error when at some point I had to take over an existing solution which was utilizing WSE and MTOM encoding. It was a windows client calling a web service.
To the point, the client was calling the web service where it would throw that error.
Something that contributed to resolving that error for me was to check the web service proxy class that apparently is generated by default to inherit from System.Web.Services.Protocols.SoapHttpClientProtocol.
Essentially that meant that it didn't actually use WSE3.
Anyhow I manually edited the proxy and changed it to inherit from Microsoft.Web.Services3.WebServicesClientProtocol.
BTW, to see the generated proxy class in VS click on the web reference and then click the 'Show All Files' toolbar button. The reference.cs is da place of joy!
Hope it helps.

In his book "HTTP Developer's Handbook" on page 81 Chris Shiflett explains what a 415 means, and then he says, "The media type used in the content of the HTTP response should be indicated in the Content-Type entity header."
1) So is Content-Type a possible answer? It would presumably be a comma-separated list of accepted content types. The obvious problem with this possibility is that Content-Type is an entity header not a response header.
2) Or is this a typo in the book? Did he really mean to say "the HTTP request"?

Alternative bodies for HTTP PUT

I'm developing a REST-ful webservice, and I have a question about the HTTP PUT method.
I want to allow people to submit content using an application/form-data request body. However, the default response will be in application/xml.
Is this acceptable?
Evert

Content types are only important within the scope of a single request. All they do is describe the format of the content that is being sent.
Your web service should provide the response most acceptable to the client request that it is capable of providing. The client request should include an Accept header that describes the acceptable content types. If your service can't provide any of the content types in this header, then return 406 Not Acceptable.
In your situation, if your client GET requests include application/xml in the Accept header then it is fine to respond with application/xml, regardless of any PUT request made on the requested resources.
EDIT:
The status code definition for 406 Not Acceptable includes a note with the following:
Note: HTTP/1.1 servers are allowed to return responses which are
not acceptable according to the accept headers sent in the
request. In some cases, this may even be preferable to sending a
406 response. User agents are encouraged to inspect the headers of
an incoming response to determine if it is acceptable.
So you can return application/xml whenever you want.
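A small Python sketch of the situation in the question: a hypothetical handler takes a form-encoded PUT body and answers in XML. It assumes the body is application/x-www-form-urlencoded (the question's "application/form-data" is read that way here) and that the field names are valid XML element names; the function name and sample fields are invented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def handle_put(body, content_type):
    """Accept a URL-encoded PUT body and respond with an XML document."""
    if content_type != "application/x-www-form-urlencoded":
        return 415, {}, b""
    fields = parse_qs(body.decode("utf-8"))
    root = ET.Element("resource")
    for name, values in fields.items():
        # parse_qs returns a list per field; keep the first value only.
        ET.SubElement(root, name).text = values[0]
    # The response format is independent of the request format.
    return 200, {"Content-Type": "application/xml"}, ET.tostring(root)


status, headers, payload = handle_put(
    b"title=Hello&lang=en", "application/x-www-form-urlencoded"
)
print(status, headers)
print(payload)
```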

RESTful services should use the correct HTTP method (GET,HEAD,PUT,DELETE or POST) for the action, ensure that any scoping information is contained in the URI and ensure that the HTTP message envelope does not contain another envelope i.e. SOAP.
Roy Fielding's 2000 Ph.D. dissertation, Architectural Styles and the Design of Network-based Software Architectures, forms the foundation of REST.
