The source of confusion is this answer.
To be honest, I know what an HTTP request and an HTTP session are, but I have never heard of an HTTP connection. So it boils down to this: what exactly is the difference between an HTTP request and an HTTP connection?
Related
Say my client is posting an HTTP request whose body is 1 TB.
The first time my server's HTTP handler gets invoked, how much of the request body will already be in my server's TCP socket receive queue?
Edited:
Actually, what I want to know relates to the details of HTTP.
Say the whole HTTP request is very large (large headers and a large body):
Basically, I could send the whole (very large) HTTP request in a single TCP send, or I could split it into several smaller slices and send them as several TCP segments. Which way does the HTTP protocol prescribe?
At what point, from the HTTP protocol's point of view, can the HTTP server start working on the request?
For example:
When the full path of the URL (the part after host:port) is available in the server's memory? Or,
When all the HTTP headers are available in the server's memory? Or,
When the entire request body is available in the server's memory?
I'm building a distributed system in which I use HTTP requests to communicate. I want the requests to be fault tolerant. The requests have no timeout; should I retry a request after some period if I get no response, or does the HTTP request automatically retry TCP connections? I used the async http client library in Java. Thanks
... does the HTTP request automatically retry TCP connections?
The HTTP request is not a thing that can retry something by itself. An HTTP request is just data. It is up to the application to retry the request if something goes wrong. Some libraries used in applications might offer this, others do not. Most don't, since it is often not clear whether the request should be retried in the first place: it might have unintended side effects if the web application receives the request twice (it may have received the first one even though it returned no response).
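Since the retrying has to happen at the application level, a minimal sketch of such a retry loop might look like the following. It uses the JDK's built-in java.net.http.HttpClient (Java 11+) rather than the async-http-client library from the question, and the URL, timeout values, and back-off policy are placeholder assumptions. Note that it only retries a GET, which is idempotent; retrying a non-idempotent POST needs more care for exactly the reason given above.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class RetryingGet {
    // Retry only idempotent requests (GET here); a POST that creates a resource
    // might be applied twice if the first attempt actually reached the server.
    static HttpResponse<String> getWithRetry(HttpClient client, URI uri, int maxAttempts)
            throws InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))   // per-request timeout, so we never wait forever
                .GET()
                .build();
        for (int attempt = 1; ; attempt++) {
            try {
                return client.send(request, HttpResponse.BodyHandlers.ofString());
            } catch (java.io.IOException e) {
                if (attempt >= maxAttempts) {
                    throw new RuntimeException("giving up after " + attempt + " attempts", e);
                }
                Thread.sleep(1000L * attempt);     // simple linear back-off between attempts
            }
        }
    }

    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpResponse<String> response =
                getWithRetry(client, URI.create("http://example.com/"), 3);
        System.out.println(response.statusCode());
    }
}
```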
Can the HTTP client send a request while receiving the HTTP response?
For example, a client sends HTTP request A to the server, and the server starts sending the response. Before the client finishes receiving response A, the client sends an additional request B. Is that possible? Does it comply with the HTTP RFC?
I think the scenario above is different from pipelining. What I know of pipelining is the scenario where the client sends multiple requests A, B, C and then the server responds with A, B, C consecutively. In the scenario above, however, request B is issued while response A is still being received.
Thank you
With the same connection object you must read the whole response before you can send a new request to the server, because the response provides access to the response headers, status code, and entity body. If you send a new request before fully reading the response, the client may get confused by mismatched responses.
Again, it totally depends on the client library you are using. A library could allow asynchronous requests.
There are concepts such as
AsyncTask in Android, promises in AngularJS, etc.
that allow asynchronous requests.
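For illustration, here is a minimal sketch of asynchronous requests using the JDK's own java.net.http.HttpClient (Java 11+) instead of AsyncTask or AngularJS promises; the URLs are placeholders. Each sendAsync() call returns a CompletableFuture immediately, so the application does not have to block on the first response before issuing the second request, and the library decides internally how to manage connections.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class AsyncRequests {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();

        // Each sendAsync() returns immediately with a CompletableFuture; the
        // client library decides whether to reuse one connection or open several,
        // so the caller never interleaves reads and writes by hand.
        HttpRequest first  = HttpRequest.newBuilder(URI.create("http://example.com/")).build();
        HttpRequest second = HttpRequest.newBuilder(URI.create("http://example.com/other")).build();

        CompletableFuture<HttpResponse<String>> f1 =
                client.sendAsync(first, HttpResponse.BodyHandlers.ofString());
        CompletableFuture<HttpResponse<String>> f2 =
                client.sendAsync(second, HttpResponse.BodyHandlers.ofString());

        // Wait for both and print the status codes.
        CompletableFuture.allOf(f1, f2).join();
        System.out.println(f1.join().statusCode() + " / " + f2.join().statusCode());
    }
}
```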
I'm currently trying to automatically reconstruct an HTTP browsing session from only a pcap (basically, that means matching an HTTP reply to the subsequent HTTP requests). Most of the time it works fine, but sometimes a certain URL, u, is present in the data of multiple HTTP replies.
For example, if the replies for u1 and u2 both contain u in their data, and the request to u happens after the request to u2, how can I decide whether the request to u was caused by u1 or by u2? Note that no request to u was made between u1 and u2.
Are there any fields, in any network layer, that I can use to make this match?
Thanks!
HTTP runs on top of TCP, which is connection-oriented. You have access to the IP and TCP headers of the connection used for the HTTP request (client IP/port -> server IP/port).
HTTP is a request/response protocol; there is one response for each request.
So, simply look for an HTTP response immediately following the HTTP request on the same TCP connection (server IP/port -> client IP/port).
HTTP is stateless, and the connection may be closed between requests without affecting the overall browsing model (closing the connection is the required behavior in HTTP/0.9, the default behavior in HTTP/1.0, and not the default behavior in HTTP/1.1+). So it is possible for an HTTP response to trigger subsequent requests on new connections, and you need to be ready to handle that. The Connection header in the HTTP request tells you whether the client is asking for the connection to remain open. The Connection header in the HTTP response tells you whether the server is actually closing the connection after sending the response. But even if the server leaves the connection open, there is no guarantee that the client will actually reuse the same connection for later requests to the same server (though it likely will, unless a timeout elapses between requests).
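As a sketch of that matching logic, assuming the pcap has already been parsed into per-message events (the Event record below is a made-up placeholder, not part of any pcap library; Java 16+ for records), you can key a FIFO queue of outstanding requests by the connection's address/port pair and pop one entry per response, since responses on a single connection come back in request order:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Sketch: pairing HTTP requests and responses from a pcap by TCP connection.
public class HttpPairing {

    record Event(String srcIp, int srcPort, String dstIp, int dstPort,
                 boolean isRequest, String urlOrStatus) {}

    // Key both directions of a connection the same way so a request
    // (client -> server) and its response (server -> client) map to one queue.
    static String connectionKey(Event e) {
        String a = e.srcIp() + ":" + e.srcPort();
        String b = e.dstIp() + ":" + e.dstPort();
        return a.compareTo(b) < 0 ? a + "|" + b : b + "|" + a;
    }

    public static void main(String[] args) {
        // Pending requests per connection, in the order they were sent.
        Map<String, Deque<String>> pending = new HashMap<>();

        Event[] trace = {
            new Event("10.0.0.1", 50000, "93.184.216.34", 80, true,  "/index.html"),
            new Event("10.0.0.1", 50000, "93.184.216.34", 80, true,  "/style.css"),   // pipelined
            new Event("93.184.216.34", 80, "10.0.0.1", 50000, false, "200 OK"),
            new Event("93.184.216.34", 80, "10.0.0.1", 50000, false, "200 OK"),
        };

        for (Event e : trace) {
            Deque<String> queue = pending.computeIfAbsent(connectionKey(e), k -> new ArrayDeque<>());
            if (e.isRequest()) {
                queue.addLast(e.urlOrStatus());
            } else {
                // Responses come back in the same order as the requests on one connection.
                String request = queue.pollFirst();
                System.out.println(e.urlOrStatus() + " answers request for " + request);
            }
        }
    }
}
```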
I am writing an HTTP proxy server and I noticed that many clients use the "Connection: Keep-Alive" header to keep a persistent connection. Is it possible for the client to send another HTTP request before the server processes the first?
For example, the client sends "GET / HTTP/1.1" but before the server has a chance to respond, the client sends "GET /favicon.ico HTTP/1.1". Is that possible? Or will the client pause for the response before sending the second request?
Also, when using a persistent connection, is it safe to assume all requests through that connection will have the same "Host: " header?
"Also, when using a persistent connection, is it safe to assume all requests through that connection will have the same "Host: " header?"
I don't think so; see HTTPbis P1, Section 2.2:
Recipients MUST consider every message in a connection in isolation; because HTTP is a stateless protocol, it cannot be assumed that two requests on the same connection are from the same client or share any other common attributes. In particular, intermediaries might mix requests from different clients into a single server connection. Note that some existing HTTP extensions (e.g., [RFC4559]) violate this requirement, thereby potentially causing interoperability and security problems.
Yes, it is possible for the client to pipeline requests. (See http://en.wikipedia.org/wiki/HTTP_pipelining).
Turning your last question around... it would not be safe for a client to assume that requests to multiple hosts would be served by a single pipeline. There may be no specs that directly address your question on the Host: header, but it's a safe bet they'll be the same.
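To make the pipelining scenario concrete, here is a minimal raw-socket sketch that writes two requests on one connection before reading anything back; example.com is just a placeholder host, and keep in mind that many servers and intermediaries handle pipelining poorly, which is part of why browsers largely abandoned it.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class PipelineDemo {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("example.com", 80)) {
            OutputStream out = socket.getOutputStream();

            // Two requests written back-to-back on one connection, before any
            // response has been read: this is HTTP/1.1 pipelining.
            String requests =
                "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n" +
                "GET /favicon.ico HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n";
            out.write(requests.getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // The server must answer in the same order the requests were sent,
            // so both responses simply appear one after the other on the stream
            // until the server closes the connection (Connection: close above).
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```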
Regarding the first question:
Is it possible that the client sends another HTTP request before the server processes the first?
I believe that yes, it is possible (perhaps I am wrong; I remember having read that a couple of years ago, and the definitive answer is in the HTTP protocol specifications). But I don't understand why you are asking. Also, the client can open several TCP connections at once to the same HTTP server. And of course you have many simultaneous clients.
About the second question
Also, when using a persistent connection, is it safe to assume all requests through that connection will have the same "Host: " header?
I believe it is usually the case, but I wouldn't assume it to be certain. I could imagine that some clever HTTP client, recognizing that two URLs with different Host: headers share the same IP, could reuse the same connection.
But I don't understand why you are asking. Persistent HTTP connections were invented to minimize TCP connections, which are costly, and the two questions you are asking are an edge case of that. Perhaps few HTTP clients do what you describe today.
And you should be strict in what you send (with respect to standards conformance), but liberal in what you accept.
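Following that advice on the Host: question, a conservative proxy would simply parse the Host header out of every request on a persistent connection rather than caching the value from the first one. A minimal sketch of that per-request parsing (the sample requests and host names are invented):

```java
import java.io.BufferedReader;
import java.io.StringReader;

public class HostPerRequest {

    // Extract the Host header from one request's header block.
    // A proxy should call this for every request on a persistent connection
    // instead of remembering the value from the first request.
    static String hostOf(BufferedReader headers) throws java.io.IOException {
        String line;
        String host = null;
        while ((line = headers.readLine()) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).equalsIgnoreCase("Host")) {
                host = line.substring(colon + 1).trim();
            }
        }
        return host;
    }

    public static void main(String[] args) throws Exception {
        // Two requests that might arrive on the same keep-alive connection,
        // e.g. through an intermediary that multiplexes several clients.
        String first  = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n";
        String second = "GET / HTTP/1.1\r\nHost: other.example.org\r\n\r\n";

        System.out.println(hostOf(new BufferedReader(new StringReader(first))));
        System.out.println(hostOf(new BufferedReader(new StringReader(second))));
    }
}
```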