I need to work with a SOAP service from Erlang. The SOAP implementation itself is not the issue; my problem is with HTTP requests on the client side.
I use IBrowse as an HTTP client. This SOAP service uses a specific authorization mechanism that ties an opened session to a client connection (socket). So the client must use a single persistent connection (socket) to the server, and if it tries to send a request via another socket (e.g., a connection from a pool), authorization will fail.
I use IBrowse in this way:
Spawn connection process to server (ibrowse:spawn_worker_process/1)
Send request to server via spawned process with {max_sessions, 1} and {max_pipeline_size, 0}.
If I understand the docs correctly, this should use one socket for the server connection with pipelining disabled. I also send a Connection: Keep-Alive header and explicitly set the HTTP version to 1.0. But my connection is always closed after the response is received.
How can I use IBrowse (or another HTTP client) the way I described above?
I think you could do that with hackney by reusing a connection.
Also, gun is quite a nice HTTP client: easy to use and it keeps the connection open, but it gives you a little less control over the connection.
I'm implementing an HTTP-over-TLS proxy server (sni-proxy) that makes two socket connections:
Client to ProxyServer
ProxyServer to TargetServer
and transfers data between the Client and the TargetServer (the TargetServer is detected using the server_name extension in the ClientHello).
The problem is that the client doesn't close the connection after the response has been received and the proxy server waits for data to transfer and uses resources when the request has been done.
What is the best practice for implementing this project?
The client's behavior is perfectly normal: it may be HTTP keep-alive inside the TLS connection, or even a WebSocket connection. Given that the proxy transparently forwards the encrypted traffic, it cannot inspect the HTTP traffic to determine exactly when the connection can be closed. A good approach is therefore to keep connections open as long as resources allow and, on resource shortage, close the connections that have been idle (no traffic) the longest.
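The "close the longest-idle connection first" policy can be sketched in Go roughly as follows. This is only an illustration under assumptions: the `idleTracker` type, the string connection IDs, and the `touch`/`evictOldest` names are all hypothetical, and a real proxy would also close both sockets of the evicted pair.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// idleTracker records when traffic was last seen on each proxied connection.
// The string key is a hypothetical connection ID chosen for this sketch.
type idleTracker struct {
	mu       sync.Mutex
	lastSeen map[string]time.Time
}

func newIdleTracker() *idleTracker {
	return &idleTracker{lastSeen: make(map[string]time.Time)}
}

// touch is meant to be called whenever bytes are forwarded in either direction.
func (t *idleTracker) touch(id string, now time.Time) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.lastSeen[id] = now
}

// evictOldest removes and returns the connection that has been idle longest.
func (t *idleTracker) evictOldest() (string, bool) {
	t.mu.Lock()
	defer t.mu.Unlock()
	var oldestID string
	var oldest time.Time
	found := false
	for id, seen := range t.lastSeen {
		if !found || seen.Before(oldest) {
			oldestID, oldest, found = id, seen, true
		}
	}
	if found {
		delete(t.lastSeen, oldestID)
	}
	return oldestID, found
}

// demoEvictionOrder shows the order in which two connections would be dropped.
func demoEvictionOrder() []string {
	base := time.Unix(0, 0)
	t := newIdleTracker()
	t.touch("a", base)
	t.touch("b", base.Add(2*time.Second))
	t.touch("a", base.Add(5*time.Second)) // "a" saw traffic again later
	var order []string
	for {
		id, ok := t.evictOldest()
		if !ok {
			break
		}
		order = append(order, id)
	}
	return order
}

func main() {
	fmt.Println(demoEvictionOrder())
}
```

A linear scan is fine for a sketch; with many connections a min-heap or a list ordered by last activity would avoid scanning the whole map on every eviction.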
I understand the rule that, if client and server both support persistent connection, they can use it through a Connection:keep-alive header in the first request. After that, both client and server will still keep the underlying TCP connection open when they are done with the first request/response, and then use the same connection in the following requests/responses.
What I'm not clear about is the programming model. Consider the following client code in go:
resp, _ := client.Get("http://www.stackoverflow.com")
// do some other things
resp, _ = client.Get("http://www.stackoverflow.com/questions")
As far as I know, keep-alive is the default strategy in HTTP/1.1.
Q1: Do these two requests use the same TCP connection?
On the server side:
When one request comes, the Go HTTP framework dispatches it to a handler, then due to keep-alive, the framework should prepare for the next request on the same TCP connection. However I don't see any block-mode read code in a handler. So,
Q2: Does the Go HTTP framework use some kind of non-block mode to deal with keep-alive?
I mean, a handler will not block for reading, but just return when done with a request, then the framework will poll each non-block TCP connection, if one of them has data arrive, it dispatches it to an associated handler, and so on. Only when the request has a header of Connection:Close, the framework will close the TCP connection when the handler returns.
Q1: Do these two requests use the same TCP connection?
Yes, if the connection wasn't closed by the server, the http.Transport can re-use the connection in the default configuration.
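One way to check this yourself is with net/http/httptrace, whose GotConnInfo reports whether a connection was reused. The sketch below runs against a local httptest server rather than a real site; the `fetch` and `reuseDemo` helpers are names made up for this example.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// fetch performs one GET and reports whether the transport reused an
// existing TCP connection for it.
func fetch(client *http.Client, url string) (bool, error) {
	reused := false
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req = req.WithContext(httptrace.WithClientTrace(req.Context(), trace))
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	// Drain and close the body so the connection can go back to the idle pool.
	_, _ = io.Copy(io.Discard, resp.Body)
	resp.Body.Close()
	return reused, nil
}

// reuseDemo starts a local test server and issues two sequential requests.
func reuseDemo() (first, second bool, err error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	}))
	defer srv.Close()
	client := srv.Client()
	if first, err = fetch(client, srv.URL); err != nil {
		return
	}
	second, err = fetch(client, srv.URL+"/questions")
	return
}

func main() {
	first, second, err := reuseDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("first reused:", first, "second reused:", second)
}
```

Note that the response body has to be read to the end and closed; otherwise the transport may not be able to reuse the connection for the next request.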
Q2: Does the go http framework use some kind of non-block mode to deal with keep-alive?
The Go HTTP server handles keep-alive connections by default. There's no such thing as an HTTP "non-block mode". It handles requests and responses on a single connection serially, as described by the HTTP specification.
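To make the "serial" part concrete, here is a simplified sketch of what a per-connection loop can look like. It is not the actual net/http server code; `serveConn` and `serialDemo` are hypothetical names, and the hand-written response is only enough for this demo. The point is that the blocking read lives in a per-connection goroutine, not in the handler.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"net"
	"net/http"
)

// serveConn answers requests on one connection, one after another. The read
// blocks, but only this connection's goroutine waits on it.
func serveConn(conn net.Conn) {
	defer conn.Close()
	br := bufio.NewReader(conn)
	for n := 1; ; n++ {
		req, err := http.ReadRequest(br) // blocks until the next request or EOF
		if err != nil {
			return
		}
		io.Copy(io.Discard, req.Body)
		req.Body.Close()
		body := fmt.Sprintf("request %d: %s\n", n, req.URL.Path)
		fmt.Fprintf(conn, "HTTP/1.1 200 OK\r\nContent-Length: %d\r\n\r\n%s", len(body), body)
		if req.Close { // the client sent "Connection: close"
			return
		}
	}
}

// serialDemo sends two requests over a single TCP connection.
func serialDemo() ([]string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return nil, err
	}
	defer ln.Close()
	go func() {
		for {
			c, err := ln.Accept()
			if err != nil {
				return
			}
			go serveConn(c) // one goroutine per connection
		}
	}()
	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return nil, err
	}
	defer conn.Close()
	br := bufio.NewReader(conn)
	var bodies []string
	for _, path := range []string{"/", "/questions"} {
		fmt.Fprintf(conn, "GET %s HTTP/1.1\r\nHost: example.test\r\n\r\n", path)
		resp, err := http.ReadResponse(br, nil)
		if err != nil {
			return nil, err
		}
		b, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		bodies = append(bodies, string(b))
	}
	return bodies, nil
}

func main() {
	bodies, err := serialDemo()
	if err != nil {
		panic(err)
	}
	fmt.Print(bodies[0], bodies[1])
}
```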
I just went through the specification of http 1.1 at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8 that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on http state management at https://www.rfc-editor.org/rfc/rfc2965 that says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section about the need to have persistent connections in the RFC 2616 also said that prior to persistent connections every time a client wished to fetch a url it had to establish a new TCP connection for each and every new request.
Now my question is: if we have persistent connections in HTTP/1.1 then, as mentioned above, a client does not need to make a new connection for every new request. It can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, would it not be obvious that the request is from the same client? Would this not suffice to maintain state, and be enough for the server to understand that the request was from the same client? In that case, why is a separate state management mechanism required at all?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate administrative TCP/IP overhead of connection handling (e.g. connect/disconnect/reconnect, etc.). It is not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might be an intermediary (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple TCP connections.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.
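To illustrate why state is tied to something carried in the request rather than to the connection, here is a small Go sketch. The client deliberately disables keep-alive, so every request arrives on a fresh TCP connection, yet the server still recognises the session through a cookie. The `visitCounter` type, the `sid` cookie name, and the sequential session IDs are assumptions for this example only (real session IDs must be unguessable).

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/cookiejar"
	"net/http/httptest"
	"sync"
)

// visitCounter keeps per-session state keyed by a cookie value, not by socket.
type visitCounter struct {
	mu     sync.Mutex
	nextID int
	visits map[string]int
}

func (v *visitCounter) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	v.mu.Lock()
	defer v.mu.Unlock()
	sid := ""
	if c, err := r.Cookie("sid"); err == nil {
		sid = c.Value
	}
	if _, known := v.visits[sid]; !known {
		// Unknown or missing session: issue a new ID (predictable, demo only).
		v.nextID++
		sid = fmt.Sprintf("s%d", v.nextID)
		http.SetCookie(w, &http.Cookie{Name: "sid", Value: sid, Path: "/"})
	}
	v.visits[sid]++
	fmt.Fprintf(w, "%s visit %d", sid, v.visits[sid])
}

// cookieDemo makes two requests, each over its own TCP connection.
func cookieDemo() ([]string, error) {
	srv := httptest.NewServer(&visitCounter{visits: make(map[string]int)})
	defer srv.Close()
	jar, err := cookiejar.New(nil)
	if err != nil {
		return nil, err
	}
	client := &http.Client{
		Jar:       jar,
		Transport: &http.Transport{DisableKeepAlives: true}, // no connection reuse
	}
	var out []string
	for i := 0; i < 2; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return nil, err
		}
		b, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		out = append(out, string(b))
	}
	return out, nil
}

func main() {
	out, err := cookieDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The same logic keeps working when a proxy multiplexes many clients onto one upstream connection, which is exactly the case where connection identity tells the server nothing.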
I've observed an HTTP 1.1 server implementation that terminates a client connection as soon as it detects that the client has shut down its outgoing channel (either before or after sending a proper HTTP response). Is this a conforming HTTP 1.1 implementation?
RFC 2616 Section 8.1.4 seems to suggest this is the proper behaviour:
When a client or server wishes to time-out it SHOULD issue a graceful
close on the transport connection. Clients and servers SHOULD both
constantly watch for the other side of the transport close, and
respond to it as appropriate.
...
Servers SHOULD NOT close a connection in the middle of transmitting a response, unless a network or client failure is suspected.
Am I interpreting it right? Is there a more explicit reference about half-closed connection handling in the context of HTTP 1.1?
As far as I know, that is all we need to know about half-closed connections.
The server will only close the connection if it detects that the client closed it (this can be when the server is about to write to the socket) or at the end of the request, if it does not support Connection: keep-alive.
The client can disconnect at any time, but it should tell the server why it is disconnecting (timeout, request cancel). In practice, those who write socket components rarely do this; they just close the socket when they need to force a timeout.
But the client implementation is not the problem. You should worry about the server implementation, since servers suffer a lot from those unexpected disconnects.
EDIT
Maybe those links can help you.
Transmission Control Protocol - Functional Specification
TRANSMISSION CONTROL PROTOCOL
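For reference, this is what a half-close looks like at the TCP level in Go. It is plain TCP rather than HTTP, and `halfCloseDemo` is a name invented for this sketch: the client shuts down only its sending side with CloseWrite, the server sees EOF on read, and the server can still write a reply on the other direction.

```go
package main

import (
	"fmt"
	"io"
	"net"
)

// halfCloseDemo sends data, half-closes the connection, and reads the reply.
func halfCloseDemo() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer ln.Close()

	go func() {
		c, err := ln.Accept()
		if err != nil {
			return
		}
		defer c.Close()
		// ReadAll returns once the peer shuts down its sending side.
		data, _ := io.ReadAll(c)
		// The server-to-client direction is still open, so a reply can be sent.
		fmt.Fprintf(c, "received %d bytes", len(data))
	}()

	conn, err := net.Dial("tcp", ln.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	tcp, ok := conn.(*net.TCPConn)
	if !ok {
		return "", fmt.Errorf("not a TCP connection")
	}
	if _, err := tcp.Write([]byte("hello")); err != nil {
		return "", err
	}
	// Half-close: no more data will be sent, but reading is still possible.
	if err := tcp.CloseWrite(); err != nil {
		return "", err
	}
	reply, err := io.ReadAll(tcp)
	return string(reply), err
}

func main() {
	reply, err := halfCloseDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(reply)
}
```

A server that treats the read-side EOF as "client is gone" and drops the connection immediately would never send that reply, which is the behaviour the question describes.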
From this article on Wikipedia:
Keepalive messages were not officially supported in HTTP 1.0. In HTTP 1.1 all connections are considered persistent, unless declared otherwise.
Does this mean that using this mechanism I can actually simulate a TCP socket connection?
Using this can I make a Server "push" data to a client?
Are all HTTP connections, even the one I am using to connect to Stack Overflow "HTTP persistent"?
Does the COMET technology of server push use this mechanism of HTTP persistent connection to push data to clients?
Does this mean that using this mechanism I can actually simulate a TCP socket connection?
Not really, sockets have MANY more features and flexibility.
Using this can I make a Server "push" data to a client?
Not directly, it's still a request/response protocol; the persistent connection just means the client can use the same underlying socket to send multiple requests and receive the respective responses.
Are all HTTP connections, even the one I am using to connect to Stack Overflow "HTTP persistent"?
Unless your browser (or a peculiar server) says otherwise, yes.
Does the COMET technology of server push use this mechanism of HTTP persistent connection to push data to clients?
Kinda (for streaming, at least), but with a lot of whipped cream on top. There are other Comet implementation approaches, such as hidden iframes and AJAX long polling, that may not require persistent connections (which give some firewalls &c the fits anyway;-).
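As a rough illustration of the streaming flavour, here is a Go sketch in which the server keeps one response open and flushes several chunks into it. It is not a full Comet implementation; `streamHandler`, `streamDemo`, and the "event N" payloads are made up for the example, and a real server would wait for actual events between writes.

```go
package main

import (
	"bufio"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// streamHandler writes several chunks into a single, still-open response.
func streamHandler(w http.ResponseWriter, r *http.Request) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		http.Error(w, "streaming unsupported", http.StatusInternalServerError)
		return
	}
	for i := 1; i <= 3; i++ {
		fmt.Fprintf(w, "event %d\n", i)
		flusher.Flush() // send this chunk now instead of buffering it
	}
}

// streamDemo reads the streamed lines as they arrive on one response body.
func streamDemo() ([]string, error) {
	srv := httptest.NewServer(http.HandlerFunc(streamHandler))
	defer srv.Close()
	resp, err := srv.Client().Get(srv.URL)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var events []string
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		events = append(events, sc.Text())
	}
	return events, sc.Err()
}

func main() {
	events, err := streamDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println(events)
}
```

It is still request/response: the client asked first, and the server simply takes its time finishing the answer.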
Actually, the HTTP server can "push" data to a connected HTTP client without the client requesting it. See "HTTP server push" at http://en.wikipedia.org/wiki/Push_technology. However, it does not seem to be commonly implemented.