HTTP GET with Transfer-Encoding Chunked

I recently saw a GET call with the Transfer-Encoding header set to chunked. What we noticed on this call was a delay followed by a 500 and a socket timeout.
Digging deeper, we found this 500 behaviour coming from both ELBs and Apache web servers.
Of note, if we make the GET call with Transfer-Encoding: chunked and include an empty payload, the ELB in our case allows the request through as normal.
Given that ELBs and Apache web servers exhibit this behaviour, my question is: is sending Transfer-Encoding: chunked on an HTTP GET call valid or not?

The Transfer-Encoding header is now under the authority of RFC 7230, section 3.3.1 (see also the IANA Message Header Registry). You will find that this header is part of HTTP message framing, so it is valid for both requests and responses.
Keeping in mind that GET requests may carry bodies, the reaction of the server(s) is absolutely correct. What is causing the delay followed by the 500 is probably the following: the server expects a body but fails to find one, since the chunk-encoded representation of an empty string is not an empty string but a zero-sized chunk. In consequence, the server runs into a timeout waiting for chunk data that never arrives.
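To make the zero-sized chunk concrete, here is a minimal sketch in Python using a raw socket (the host name is only a placeholder). The chunked encoding of an empty body is the terminator 0 followed by a blank line; leave that terminator out and the server sits waiting for body data, which matches the delay-then-timeout described above:

import socket

# Placeholder host; any HTTP/1.1 server will do.
HOST, PORT = "example.com", 80

request = (
    "GET / HTTP/1.1\r\n"
    f"Host: {HOST}\r\n"
    "Transfer-Encoding: chunked\r\n"
    "Connection: close\r\n"
    "\r\n"
    # The chunked encoding of an *empty* body is not zero bytes:
    # it is a single zero-sized chunk followed by a blank line.
    "0\r\n"
    "\r\n"
)

with socket.create_connection((HOST, PORT), timeout=10) as sock:
    sock.sendall(request.encode("ascii"))
    # Omitting the "0\r\n\r\n" terminator above would leave the server
    # waiting for chunks that never arrive, hence the timeout and the 500.
    print(sock.recv(4096).decode("latin-1", errors="replace"))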

Related

Network Tracing for http requests

One of our clients, who uses our REST-based API, raised an issue: whenever he sends a POST request to our server without an Accept-Encoding HTTP header, he gets compressed content in return. I checked the IIS logs on the API server that handled his request, and the request received by the server arrived with the Accept-Encoding header set to gzip. Between the client machine and our server sit intermediaries (proxies) and a load balancer. Which network tracing tool should I use to investigate where this HTTP header is being added?
One way to prevent an HTTP message from being compressed is to add Cache-Control: no-transform to the request headers, which forbids payload alteration by proxies, as stated in RFC 7234, section 5.2.1.6.
Also, the Via header may contain useful comments that can help you work out what each proxy added to the request.
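As a rough sketch of both suggestions (Python with the third-party requests library; the URL is hypothetical), you can send no-transform on the request and then inspect Via on the response to see which intermediaries touched the message:

import requests

# Hypothetical endpoint; substitute your API's URL.
url = "https://api.example.com/resource"

# RFC 7234, section 5.2.1.6: no-transform asks intermediaries
# not to modify the payload in transit.
resp = requests.post(url, json={"ping": True},
                     headers={"Cache-Control": "no-transform"})

# Each compliant proxy should append itself to Via (RFC 7230, section
# 5.7.1), which can reveal where along the chain a header was added.
print("Via:", resp.headers.get("Via", "<no Via header>"))
print("Content-Encoding:", resp.headers.get("Content-Encoding", "identity"))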

Returning 400 in virtual host environments where Host header has no match

Consider a web server from which three virtual hosts are served:
mysite.com
myothersite.com
imnotcreative.com
Now assume that the server receives the following raw request message (code formatting removes the terminating \r\n sequences):
GET / HTTP/1.1
Host: nothostedhere.com
I haven't seen any guidance in RFC 2616 (perhaps I missed it?) on how to respond to a request for a host name that does not exist at the current server. Apache, for example, will simply use the first virtual host defined in its configuration as the "primary host" and pretend the client requested that host. Obviously this is more robust than returning a 400 Bad Request response and guarantees the client always sees some representation.
So my question is ...
Can anyone provide reasons aside from the "robustness vs. correctness" argument to dissuade me from responding with a 400 (or other error code) should the client request a non-existent host when employing the HTTP/1.1 protocol?
Note that all HTTP/1.1 requests MUST specify a Host: header as per RFC 2616. For HTTP/1.0 requests the only real option is to serve the "primary" host result. This question specifically addresses HTTP/1.1 protocol requests.
400 is not really the semantically correct response code in this scenario.
10.4.1 400 Bad Request
The request could not be understood by the server due to malformed syntax.
This is not what has happened. The request is syntactically valid, and by the time your server has reached the routing phase (when you are inspecting the value of the header) this will already have been determined.
I would say the correct response code here is 403:
10.4.4 403 Forbidden
The server understood the request, but is refusing to fulfill it.
This describes what has happened more accurately. The server is refusing to fulfill the request because it is unable to, and a more verbose error message can be provided in the message entity.
There is also an argument that 404 would be acceptable/correct, since a suitable document with which to satisfy the request could not be found, but personally I think that this is not the correct option, because 404 states:
10.4.5 404 Not Found
The server has not found anything matching the Request-URI
This explicitly mentions a problem with the Request-URI, and at this early stage of the routing phase you are probably not interested in the URI, since the server first needs to allocate the request to a host before it can determine whether it has a suitable document to handle the URI path.
In HTTP/1.1 Host: headers are mandatory. If a client states that it is using version 1.1 and does not supply a Host: header then 400 is definitely the correct response code. If the client states that it is using version 1.0 then it is not required to supply a host header and this should be handled gracefully - and this scenario amounts to the same situation as an unrecognised domain.
Really you have two options in this event: route the request to a default virtual host container, or respond with an error. As outlined above, if you are going to respond with an error, I believe the error should be 403.
I'd say this largely depends on what type(s) of clients you expect to consume your service and the type of service you offer.
For a general website:
Pretty safe to assume that requests are triggered from a user's browser, in which case I'd be more forgiving regarding the lack or incorrectness of a Host: header. I'd even go so far and say that the way Apache handles the case (i.e. fallback to the first appropriate VHost) is perfectly fine. After all, you don't want to scare your customers away.
For an API/RPC type of service:
That's a totally different case. You SHOULD expect whatever/whoever consumes your service to adhere to your specifications. So, if these require a consumer to pass a valid Host: header and the consumer fails to do so, you SHOULD return with a reasonable response (400 Bad Request seems fine to me).
It seems the previously accepted answer is no longer correct.
Per RFC 7230:
"A server MUST respond with a 400 (Bad Request) status code to any HTTP/1.1 request message that lacks a Host header field and to any request message that contains more than one Host header field or a Host header field with an invalid field-value."

Is it legitimate http/rest to have requests that are compressed?

I asked this question a few days ago and didn't get much activity on it, which got me thinking that perhaps my question was nonsensical.
My understanding of HTTP is that a client (typically a browser) sends a request (GET) to a server, in my case IIS. Part of this request is the Accept-Encoding header, which indicates to the server what encodings the client would like the resource returned in; typically this could include gzip. And if the server is set up correctly, it will return the requested resource in the requested encoding.
The response will include a Content-Encoding header indicating what compression has been applied to the resource. Also included in the response is the Content-Type header, which indicates the MIME type of the resource. So if the response includes both Content-Type: application/json and Content-Encoding: gzip, the client knows the resource is JSON that has been compressed using gzip.
Now the scenario I am facing is that I am developing a web service whose clients are not browsers but mobile devices, and instead of requesting resources, these devices will be posting data to the service.
So I have implemented a RESTful service that accepts POST requests with JSON in the body, and my clients send their POST requests with Content-Type: application/json. But some of my clients have asked to compress their requests to speed up transmission. My understanding, however, is that there is no way to indicate in a request that the body has been encoded using gzip.
That is to say, there is no Content-Encoding header for requests, only responses.
Is this the case?
Is it incorrect usage of http to attempt to compress requests?
According to another answer here on SO, it is within the HTTP standard to have a Content-Encoding header on the request and send the entity deflated.
It seems that no server automatically inflates the data for you, though, so you'll have to write the server-side code yourself (check the request header and act accordingly).
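As a sketch of both halves (Python standard library only; the URL, payload, and helper names are hypothetical), the client declares Content-Encoding: gzip on the request and the server inflates the body by hand before parsing:

import gzip
import json
import urllib.request

# Hypothetical endpoint; replace with your service's URL.
url = "https://api.example.com/ingest"

# Client side: compress the JSON body and label it.
body = gzip.compress(json.dumps({"reading": 42}).encode("utf-8"))
req = urllib.request.Request(
    url, data=body, method="POST",
    headers={
        "Content-Type": "application/json",
        # Content-Encoding is legal on requests, but the server
        # must be prepared to inflate it itself.
        "Content-Encoding": "gzip",
    },
)
resp = urllib.request.urlopen(req)

# Server side (framework-agnostic): inflate manually, since servers
# generally do not do this for you.
def read_json_body(headers, raw_body):
    if headers.get("Content-Encoding", "").lower() == "gzip":
        raw_body = gzip.decompress(raw_body)
    return json.loads(raw_body)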

What are the consequences of not setting "cache-control" in http response header?

Say my web application responds to an HTTP request with a response that has no Cache-Control header. If the client submits the same request again within a relatively short time, what happens? Does a cached copy of the response get used, so that the request never reaches the server? Or does the request get sent to the server just like the first time?
If the answer is "it depends", please indicate what the dependencies are. Thanks.
HTTP/1.1 defines no mandatory caching behaviour for a resource served without any cache-related headers, so it is really up to the implementation of the HTTP client (or any intermediary cache); in particular, a cache MAY assign a heuristic freshness lifetime in this case.
Here is the link to the RFC.
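To make "it depends" a little more concrete: when a response carries no explicit freshness information, a cache is allowed to assign a heuristic freshness lifetime (RFC 7234, section 4.2.2). One commonly cited heuristic, mentioned in the RFC, is 10% of the time elapsed since Last-Modified; a toy sketch:

from datetime import datetime, timedelta

# Heuristic noted in RFC 7234, section 4.2.2: treat the response as
# fresh for 10% of the interval since Last-Modified. This is only what
# a cache MAY do when no explicit freshness information is present.
def heuristic_freshness(date: datetime, last_modified: datetime) -> timedelta:
    return (date - last_modified) * 0.1

fetched = datetime(2024, 1, 10, 12, 0)
modified = datetime(2024, 1, 1, 12, 0)
print(heuristic_freshness(fetched, modified))  # 21:36:00, i.e. ~21.6 hours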

How do HTTP proxy caches decide between serving identity- vs. gzip-encoded resources?

An HTTP server uses content-negotiation to serve a single URL identity- or gzip-encoded based on the client's Accept-Encoding header.
Now say we have a proxy cache like squid between clients and the httpd.
If the proxy has cached both encodings of a URL, how does it determine which to serve?
The non-gzip instance (not originally served with Vary) can be served to any client, but the encoded instances (having Vary: Accept-Encoding) can only be sent to clients whose Accept-Encoding header value is identical to the one used in the original request.
E.g. Opera sends "deflate, gzip, x-gzip, identity, *;q=0" but IE8 sends "gzip, deflate". According to the spec, then, caches shouldn't share content-encoded cache entries between the two browsers. Is this true?
First of all, it's IMHO incorrect not to send "Vary: Accept-Encoding" when the entity indeed varies by that header (or its absence).
That being said, the spec currently indeed disallows serving the cached response to Opera, because the Vary header does not match per the definitions in HTTPbis, Part 6, Section 2.6. Maybe this is an area where we should relax the requirements for caches (you may want to follow up on the IETF HTTP mailing list...).
UPDATE: turns out that this was already marked as an open question; I just added an issue in our issue tracker for it, see Issue 147.
Julian is right, of course. Lesson: Always send Vary: Accept-Encoding when sniffing Accept-Encoding, no matter what the response encoding.
To answer my question: if you mistakenly leave Vary out and a proxy receives a non-encoded response (without Vary), it can simply cache it and return it for every subsequent request, ignoring Accept-Encoding. Squid does this.
The big problem with leaving out Vary is that if the cache receives an encoded variant without Vary, it MAY send it in response to other requests even if their Accept-Encoding indicates that the client cannot understand the content.
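A server-side sketch of that lesson (Python standard library; the naive substring check on Accept-Encoding ignores q-values and is for illustration only): every response whose selection depended on Accept-Encoding carries Vary, including the identity-encoded one:

import gzip
from http.server import BaseHTTPRequestHandler, HTTPServer

BODY = b'{"hello": "world"}'

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Naive check; a real server should parse q-values.
        accepts_gzip = "gzip" in self.headers.get("Accept-Encoding", "")
        body = gzip.compress(BODY) if accepts_gzip else BODY
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        if accepts_gzip:
            self.send_header("Content-Encoding", "gzip")
        # The lesson above: emit Vary on every variant, including the
        # identity one, so a shared cache never hands one client's
        # variant to another.
        self.send_header("Vary", "Accept-Encoding")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), Handler).serve_forever()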
