I understand the rule that, if client and server both support persistent connection, they can use it through a Connection:keep-alive header in the first request. After that, both client and server will still keep the underlying TCP connection open when they are done with the first request/response, and then use the same connection in the following requests/responses.
What I'm not clear about is the programming model. Consider the following client code in go:
resp, _ := client.Get("http://www.stackoverflow.com")
// do some other things
resp, _ = client.Get("http://www.stackoverflow.com/questions")
As far as I know, keep-alive is the default strategy in HTTP/1.1.
Q1: Do these two requests use the same TCP connection?
On the server side:
When one request comes, the Go HTTP framework dispatches it to a handler, then due to keep-alive, the framework should prepare for the next request on the same TCP connection. However I don't see any block-mode read code in a handler. So,
Q2: Does the Go HTTP framework use some kind of non-block mode to deal with keep-alive?
I mean, a handler will not block for reading, but just return when done with a request, then the framework will poll each non-block TCP connection, if one of them has data arrive, it dispatches it to an associated handler, and so on. Only when the request has a header of Connection:Close, the framework will close the TCP connection when the handler returns.
Q1: Do these two requests use the same TCP connection?
Yes, if the connection wasn't closed by the server, the http.Transport can re-use the connection in the default configuration.
Q2: Does the go http framework use some kind of non-block mode to deal with keep-alive?
The go HTTP Server handles keepalive connections be default. There's no such thing as a HTTP "non-block mode". It handles requests and responses on a single connection in a serial manner as described by the HTTP specification.
Related
I need to operate with a SOAP service from Erlang. SOAP implementation is not a subject, I have a problem with HTTP requests at a client side.
I use IBrowse as a HTTP client. This SOAP service uses a specific authorization mechanism, which relates an opened session to a client connection (socket). So, the client should use only one persistent connection to server (socket), and if it try to send a request via another socket (e.g., connection from pool) - authorization will fail.
I use IBrowse in this way:
Spawn connection process to server (ibrowse:spawn_worker_process/1)
Send request to server via spawned process with {max_sessions, 1} and {max_pipeline_size, 0}.
If I understand the docs right, this should use one socket for server connection with disabled pipelining, also, I use Connection: Keep-Alive header and HTTP version explicitly set to 1.0. But my connection is always closed after the response is received.
How can I use IBrowse (or another http-client) the way I described above?
I think you could that with hackney by reusing a connection.
Also gun is quite nice http client, easy to use, keeping connection, but with little less connection control.
I've read some articles comparing the differences between WebSocket and the other push methods like Long polling. All the conclusions tend to be WebSocket is better then HTTP with low latency in the server and client bidirectional communication process.
But if server push is not a must, for example, a client game program just make a few queries to the server for some information, does it still better to use WebSocket then HTTP? More specially, I have two doubts here:
1. In a single Request-Response procedure, which is more efficency ? (I establish a WebSocket connection each time querying in the above case.)
2. Will the server capacity (The total number of clients that the server can serve) be affected by the unnecessary long-lived connection if I keep an WebSocket connection during the life cycle of the client?
Added Question:
3. Suppose there is only one TCP connection between the server and the client, will the stability of the connection go down and down as time flows?
The basic thing behind both the WebSocket and HTTP is the socket. In HTTP, it opens a connection on request and closes on response. For WebSocket, concept is a 2 way communication (full duplex) rather than request-response cycle.
Answers to your question:
Either you can use HTTP server or can create request-response design
using WebSocket
That's obvious. Each connection is a socket object. Server capacity
will be affected if we are not managing connections.
In WebSocket, it's using ping-pong mechanism to make sure that the client or
the server is alive. For every ping requests from one end, other end is
subjected to reply a pong response. This mechanism helps to detect failures and hence to maintain stability.
I just went through the specification of http 1.1 at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8 that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on http state management at https://www.rfc-editor.org/rfc/rfc2965 that says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section about the need to have persistent connections in the RFC 2616 also said that prior to persistent connections every time a client wished to fetch a url it had to establish a new TCP connection for each and every new request.
Now my question is, if we have persistent connections in http/1.1 then as mentioned above a client does not need to make a new connection for every new request. It can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, would it not be obvious that the request is from the same client? And hence would this just not suffice to maintain the state and would this just nit be enough for the server to understand that the request was from the same client ? In this case then why is a separate state management mechanism required at all ?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate administrative TCP/IP overhead of connection handling (e.g. connect/disconnect/reconnect, etc.). It is not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might an intermediate (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple TCP connections.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.
I am writing a HTTP proxy server and I noticed that many clients use the "Connection: Keep-Alive" header to keep a persistent connection. Is it possible that the client sends another HTTP request before the server processes the first?
For example, the client sends "GET / HTTP/1.1" but before the server has a chance to respond, the client sends "GET /favicon.ico HTTP/1.1". Is that possible? Or will the client pause for the response before sending the second request?
Also, when using a persistent connection, is it safe to assume all requests through that connection will have the same "Host: " header?
"Also, when using a persistent connection, is it safe to assume all requests through that connection will have the same "Host: " header?"
I don't think so, see HTTPbis P1, Section 2.2:
Recipients MUST consider every message in a connection in isolation; because HTTP is a stateless protocol, it cannot be assumed that two requests on the same connection are from the same client or share any other common attributes. In particular, intermediaries might mix requests from different clients into a single server connection. Note that some existing HTTP extensions (e.g., [RFC4559]) violate this requirement, thereby potentially causing interoperability and security problems.
Yes, it is possible for the client to pipeline requests. (See http://en.wikipedia.org/wiki/HTTP_pipelining).
Turning your last question around... it would not be safe for a client to assume that requests to multiple hosts would be served by a single pipeline. There may be no specs that directly address your question on the Host: header, but it's a safe bet they'll be the same.
Regarding the first question:
Is it possible that the client sends another HTTP request before the server processes the first?
I believe that yes, it can be possible (perhaps I am wrong, I remembered having read that a couple of years ago; the definitive answer is in the HTTP protocol specifications). But I don't understand why you are asking. Also, the client can open several TCP connections at once to the same HTTP server. And of course you have many simultaneous clients.
About the second question
Also, when using a persistent connection, is it safe to assume all requests through that connection will have the same "Host: " header?
I believe it is usually the case, but I won't assume that to be certain. I could imagine that some clever HTTP clients, recognizing that two URL with different Host: headers share the same IP, could re-use the same connection.
But I don't understand why you are asking. Persistent HTTP connections have been invented to minimize the TCP connections which are costly, and the two questions you are asking are an extreme point on that. Perhaps few HTTP clients are doing what you describe today.
And you should be strict on what you send (w.r.t. standard conformance), but flexible on what you accept receiving.
I've observed a HTTP 1.1 Server implementation, which terminates a client connection as soon as it detects a client-side connection shutdown of its outgoing channel (or rather, either before or after sending a proper http response). Is this a conforming HTTP 1.1 implementation?
RFC 2616 Section 8.1.4 seems to suggest this is to be the proper behaviour:
When a client or server wishes to time-out it SHOULD issue a graceful
close on the transport connection. Clients and servers SHOULD both
constantly watch for the other side of the transport close, and
respond to it as appropriate.
...
Servers SHOULD NOT close a connection in the middle of transmitting a response, unless a network or client failure is suspected.
Am I interpreting it right? Is there a more explicit reference about half-closed connection handling in the context of HTTP 1.1?
As far as i know, thats is all we need to know about Half-closed connections.
The server will only close the connection if it detects that the client closed it (it can ben when the server is about to write to the socket) or at the end of the request, if it does not support connection: keep-alive.
The client can disconnect any time, but it should tell the server why is it disconnecting (time_out, request cancel). But it is not very used by those who write sockets components. They just close the socket when they need to force a time_out.
But the client implementation is not the problem. You should worry about server implementation since suffer a lot with those unexpected disconnects.
EDIT
Maybe those links can help you.
Transmission Control Protocol - Functional Specification
TRANSMISSION CONTROL PROTOCOL