I am trying to create an HTTP parser, and I am interested to know how can I retrieve the URL from HTTP response message text. Does every response contain the URL of the page? Do I need to look at some fields in the header for that or just look at the packet's body? Do I need to save information from previous packets (such as location message)?
Requests have URLs, responses are data packets sent back to the client. But if you make an HTTP request, the immediate response will come from the same URL.
Related
I am looking for an appropriate HTTP status code that tells the receiver that just the meta-data is being sent, not the complete data.
For example, say you do an HTTP GET:
GET /foo?meta_data_only=yes
the server won't look up the complete data, just send some metadata back about the endpoint, for example. Is there an HTTP status code for the response that can represent this? I would guess it's in the 200s or 300s somewhere?
Since your metadata is being returned in the headers, I would send a status code of 204 No Content.
https://httpstatuses.com/204
The server has successfully fulfilled the request and that there is no
additional content to send in the response payload body.
Metadata in
the response header fields refer to the target resource and its
selected representation after the requested action was applied.
This sounds exactly like what you’re looking for: a successful response that contains no body, and metadata in the headers that provide additional about the resource.
Another thing worth noting is that it’s common practice to use the HTTP verb HEAD when you only want metadata. HEAD is very similar to GET, except that it specifies that you do not want a body back. For example if you do a HEAD to an image url, you will get a 204 No Content response and some metadata about the file such as Content-Type, Content-Size, maybe ETag, but you won’t be sent all of the file data. A lot of web servers (such as Nginx) support this behavior out of the box for static files. I would recommend that you stop using your querystring parameter, and instead implement HEAD versions of your endpoints. That would make the intention even more clear and intuitive.
Does every HTTP request need to be paired with a response? When you do some POST or DELETE actions, my understanding is that sometimes you don't need to send back data. I've always been told to send back an empty object, but is that necessary? Also, is sending a status code considered a response?
Q1: Does every HTTP request need to be paired with a response?
Yes, unless client cancel the request. Actually, one HTTP request needs to be paired with one or more HTTP responses. According to RFC7231:
A server listens on a connection for a request, parses each message received, interprets the message semantics in relation to the identified request target, and responds to that request with one or more response messages.
Q2: When you do some POST or DELETE actions, my understanding is that sometimes you don't need to send back data. I've always been told to send back an empty object, but is that necessary?
It's NOT necessary to send back an empty object (payload). According to RFC7230, the response payload is not required:
A server responds to a client's request by sending one or more HTTP response messages, each beginning with... and finally a message body containing the payload body (if any).
However, although you don't have to "send back data", you still need to send back message, such as HTTP response statuc code and some necessary response headers.
Q3: is sending a status code considered a response?
Yes. Theoretically, a minimal HTTP response can only include HTTP protocol version, status code and status code textual phrase.
I am having difficulties understanding the HTTP request and response flow. I am working with a system where I can "hijack" incoming HTTP request and give my own response. The problem I am having is that some type of GET request seem to assume that all data is sent back in first request.
For instance, JPEG image requests, no matter the size (my tests include 0-20 MB JPEG files) seems to assume that the entire data is sent in the first response. Even if I don't send any data and explicitly set range header to 0 I never get a response back from the client asking for the data.
Other data request types, such as mp4 video, the client seems perfectly fine with getting a response with only header information with no data and then sends a new request explicitly asking for byte range 0-.
Is there some kind of agreement between the the client and server that some types should be sent back in one request while others can be split up in a number of requests?
I wish to get the complete url of file being set by server through its http-header(response). I am doing some sort of packet filtering. Is it possible to generate complete url from http response header.
No, it's not possible. You need the actual HTTP request.
Do SOAP and REST put their respective payloads as a URL? For example:
http://localhost/action/?var=datadatadata
I know SOAP uses XML and sometimes runs on a different port on the server, but do you still submit data like the example above or do you send it as one big XML encapsulated packet to that port?
It depends on your HTTP method. GET method will put everything into URL while POST method only put path information in URL and the rest of them are streamed into the HTTP request body.
SOAP should also rely on HTTP protocol and hence should follow the same rule. Check out http://www.w3.org/TR/soap12-part0/#L10309