How are pools of HTTPS connections managed?

Consider an application that accesses a remote HTTPS server, sending POST requests with JSON-formatted bodies to a URL on the server and receiving JSON-formatted answers. The server does not support HTTP/2 multiplexing.
There are many requests, with a widely varying workload (from idle to hundreds of TPS). JSON messages are on the order of 1 kbyte. Client and server are authenticated by certificates and private keys. The requests can be considered independent (in particular, the server treats requests alike for all HTTPS channels opened with the same client certificate).
HTTP/1.1 does not allow* multiple concurrent POST requests over the same connection. Therefore the throughput can't exceed N/(Tr+Ts) TPS, where N is the number of open HTTPS/TLS channels in use, Tr is the network round-trip delay, and Ts is the processing time on the server side (on the order of 30 ms under low load, due to database access and other factors). For example, with Tr = 20 ms and Ts = 30 ms, each channel sustains at most 1/(0.050 s) = 20 TPS, so reaching 300 TPS requires N ≥ 15. Opening an HTTPS connection costs at least 4 Tr, plus sizable CPU time on both sides. It looks like something is needed to manage a pool of HTTPS connections on the client side.
How is this issue usually handled?
What are common libraries or background daemon/services, automatically opening new HTTPS connections as needed, reusing them when possible?
It would be nice if it detected when the server becomes unresponsive, and handled fallback to a backup server at a different URL, with a return to the main server when it is up again.
Note: the next step would be load balancing, but then my load-balancing layer must somehow handle affinity between the requests, since they are not fully independent (sending a dependent request to the wrong server is reliably detected by the server, though).
[*] Due to how RFC 2616 is interpreted, I'm told.
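For concreteness, the kind of client-side pooling I have in mind looks roughly like this Node.js/TypeScript sketch (the host name, file paths, URL path and pool size are all placeholders):

```typescript
import https from "node:https";
import { readFileSync } from "node:fs";

// A keep-alive Agent caps the pool at N sockets and reuses idle TLS
// channels, so the ~4*Tr handshake cost is paid at most N times.
const agent = new https.Agent({
  keepAlive: true,
  maxSockets: 20,     // N: upper bound on concurrent HTTPS channels
  maxFreeSockets: 20, // idle channels kept open for reuse
  cert: readFileSync("client-cert.pem"), // client authentication
  key: readFileSync("client-key.pem"),
});

function post(json: object): Promise<string> {
  const body = JSON.stringify(json);
  return new Promise((resolve, reject) => {
    const req = https.request(
      {
        host: "main-server.example", // placeholder host
        path: "/api",                // placeholder URL path
        method: "POST",
        agent,                       // requests draw channels from the pool
        headers: {
          "Content-Type": "application/json",
          "Content-Length": Buffer.byteLength(body),
        },
      },
      (res) => {
        let data = "";
        res.on("data", (chunk) => (data += chunk));
        res.on("end", () => resolve(data));
      }
    );
    req.on("error", reject); // natural hook for fallback to a backup URL
    req.end(body);
  });
}
```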

Related

Is there any advantage to using socket.io over a simple HTTP HEAD request from a browser to server with Javascript to measure internet ping speed?

I'm trying to measure the ping (round-trip time) from a client to a server.
I'm building the complete application, so I'm writing the server logic as well. I'm wondering whether I should use socket.io to ping the server from the client or simply send a plain HTTP HEAD request to the server, and whether either of them is more accurate than the other.
It depends on the frequency of the ping, I guess. For a simple ping every so often, I would probably just fire off an HTTP request as needed.
Every connection requires resources, so X connected clients means X sockets being managed, which has some overhead. This may not be an issue given your anticipated number of concurrent users, but you would get away with using fewer server resources by having clients connect via HTTP.
If you're making the rest of your calls via HTTP, then it makes sense to get a more representative measurement by using the same protocol.
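If you go the HTTP route, the client-side measurement can be as simple as timing a HEAD request; a minimal browser sketch, assuming some cheap /ping URL exists on your server:

```typescript
// Time an HTTP HEAD round trip from the browser. The /ping endpoint is
// an assumption: any lightweight URL on your server will do.
async function measurePing(url: string): Promise<number> {
  const start = performance.now();
  await fetch(url, { method: "HEAD", cache: "no-store" });
  return performance.now() - start;
}

measurePing("/ping").then((ms) => console.log(`round trip: ${ms.toFixed(1)} ms`));
```

Note that the first sample includes TCP (and TLS) connection setup if no connection is already open, so taking several samples and discarding the first gives a steadier number.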

What is the relationship between an HTTP connection and a request?

While configuring nginx, I found two modules: ngx_http_limit_conn_module and ngx_http_limit_req_module
One is for limiting connections per defined key, and the other for limiting requests.
My question is: what is the relationship (and the difference) between an HTTP connection and a request?
It seems that multiple HTTP requests can use one common HTTP connection; what is the principle behind this?
Basically, connections are established to make requests over them. So, for instance, an endpoint may accept 5 connections per hour from a given IP address for a given key. But that doesn't mean only 5 requests can be made; it can be many more, if the connection is not closed after a request (as of HTTP/1.1, connections are kept alive by default).
E.g. suppose an endpoint accepts 5 connections and 10 requests from a given IP address. If a new connection is established for every request, only 5 requests overall can be made. If connections are kept alive, a single client may make all 10 requests. If there are 5 clients, each establishing a connection and keeping it alive, each can make roughly 2 requests (though one client can make all 10 if it is fast enough).
HTTP connection - client and server introduce themselves.
HTTP request - client asks something of the server.
Making a connection with a server involves a TCP handshake; it is basically creating a socket connection with the server. To make an HTTP request, you must already have an established connection with the server. Once you have established a connection, you can make multiple requests over it (HTTP/1.0 defaults to one request per connection; HTTP/1.1 defaults to keep-alive). Most web pages need multiple resources from the server (e.g. 100 photos to render the screen), so it is a lower burden on the server to keep the connection open and request those 100 images over the same connection, with no need to go through the connection establishment process 100 times. That is why HTTP/1.1 made keep-alive the default.
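To see the reuse in practice, here is a minimal Node.js/TypeScript sketch (example.com and the single-socket cap are just for illustration): two sequential requests sharing one kept-alive connection.

```typescript
import http from "node:http";

// A keep-alive agent capped at one socket: sequential requests must
// reuse the same TCP connection.
const agent = new http.Agent({ keepAlive: true, maxSockets: 1 });

function get(path: string): Promise<void> {
  return new Promise((resolve, reject) => {
    const req = http.request({ host: "example.com", path, agent }, (res) => {
      // reusedSocket is true when this request rode an existing connection
      console.log(path, res.statusCode, "socket reused:", req.reusedSocket);
      res.resume().on("end", () => resolve());
    });
    req.on("error", reject);
    req.end();
  });
}

async function main(): Promise<void> {
  await get("/"); // opens the connection
  await get("/"); // reuses it: one connection, two requests
}

main().catch(console.error);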
A request is a functional execution: "Do something for me, and return the result to me" - which is made by the client over a channel that the server is listening on, the "connection". Think of it as making a phone call to a restaurant. When the restaurant picks up the phone, you have an established "connection" - and can now place multiple requests over the same connection. The restaurant can handle multiple simultaneous customer calls if it has multiple phone lines open to receive them. This is your "connection pool" - at any point in time, you can only have as many simultaneous open connections (at most) as the size of your connection pool. The number of requests, however, will vary. Some client may make 3 requests and hang up, while another may make 10 requests before hanging up.
The size of your connection pool determines concurrency - how many simultaneous clients can you talk to at any point in time? The length of those conversations will be use case specific.

Number of connections from a client to servers with proxy

I want to make simple HTTP proxy server.
Here, I have some problem of designing the program because of the number of connections.
When a client attempts to make connections to two servers, there would be two connections: one from the client to server A, and the other from the client to server B. That is natural, at least I think.
However, I'm confused about the case where there is a proxy between the client and the servers. I thought the client might make only one connection to the proxy, and send all of its HTTP messages (to server A and server B) over that connection. The first method (making two connections for the two servers) seems very natural, but I want to double-check this before starting the implementation!
Clients might make only one connection to your proxy server (using HTTP keepalive and/or pipelining to sequentially make more than one request through the same connection), or they might make multiple connections to your proxy server (especially if they want to make more than one HTTP request in parallel). You should be prepared for both eventualities because it's up to the client what it does.
The case of two HTTP requests coming over the same connection is semantically identical to the case of the same two HTTP requests coming over separate connections.
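For illustration, a minimal sketch of such a proxy in Node.js/TypeScript, assuming clients send absolute-form URLs as they do to forward proxies (the listen port and agent settings are illustrative):

```typescript
import http from "node:http";

// One keep-alive agent pools upstream connections per host:port, so it
// does not matter whether clients arrive on one connection or many.
const upstreamAgent = new http.Agent({ keepAlive: true });

http
  .createServer((clientReq, clientRes) => {
    // Forward-proxy clients send absolute-form URLs (http://host/path);
    // the base URL only matters if a client sends a relative path.
    const target = new URL(clientReq.url ?? "/", "http://localhost");
    const proxyReq = http.request(
      {
        host: target.hostname,
        port: target.port || 80,
        path: target.pathname + target.search,
        method: clientReq.method,
        headers: clientReq.headers,
        agent: upstreamAgent, // upstream connections reused across clients
      },
      (proxyRes) => {
        clientRes.writeHead(proxyRes.statusCode ?? 502, proxyRes.headers);
        proxyRes.pipe(clientRes);
      }
    );
    proxyReq.on("error", () => {
      clientRes.writeHead(502);
      clientRes.end();
    });
    clientReq.pipe(proxyReq);
  })
  .listen(8080);
```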

HTTP and Sessions

I just went through the HTTP/1.1 specification at http://www.w3.org/Protocols/rfc2616/rfc2616.html and came across a section about connections, http://www.w3.org/Protocols/rfc2616/rfc2616-sec8.html#sec8, that says
" A significant difference between HTTP/1.1 and earlier versions of HTTP is that persistent connections are the default behavior of any HTTP connection. That is, unless otherwise indicated, the client SHOULD assume that the server will maintain a persistent connection, even after error responses from the server.
Persistent connections provide a mechanism by which a client and a server can signal the close of a TCP connection. This signaling takes place using the Connection header field (section 14.10). Once a close has been signaled, the client MUST NOT send any more requests on that connection. "
Then I also went through a section on HTTP state management at https://www.rfc-editor.org/rfc/rfc2965 that says in its section 2 that
"Currently, HTTP servers respond to each client request without relating that request to previous or subsequent requests;"
A section in RFC 2616 about the need for persistent connections also says that, prior to persistent connections, a client had to establish a new TCP connection for each and every URL it wished to fetch.
Now my question is: if we have persistent connections in HTTP/1.1, then, as mentioned above, a client does not need to make a new connection for every request; it can send multiple requests over the same connection. So if the server knows that every subsequent request is coming over the same connection, would it not be obvious that the requests are from the same client? Would this not suffice to maintain state, and be enough for the server to understand that the requests are from the same client? In that case, why is a separate state-management mechanism required at all?
Basically, yes, it would make sense, but HTTP persistent connections are used to eliminate the administrative TCP/IP overhead of connection handling (e.g. connect/disconnect/reconnect). They are not meant to say anything about the state of the data moving across the connection, which is what you're talking about.
No. For instance, there might be an intermediary (such as a proxy or a reverse proxy) in the request path that aggregates requests from multiple TCP connections.
See http://greenbytes.de/tech/webdav/draft-ietf-httpbis-p1-messaging-21.html#intermediaries.
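To see why state needs its own mechanism, consider a server that keys state on a session cookie rather than on the connection; a minimal Node.js/TypeScript sketch, with the "sid" cookie name chosen for illustration:

```typescript
import http from "node:http";
import crypto from "node:crypto";

// State lives in a session keyed by a cookie, not in the TCP connection.
const sessions = new Map<string, { hits: number }>();

http
  .createServer((req, res) => {
    const match = req.headers.cookie?.match(/sid=([a-f0-9]+)/);
    let sid = match && sessions.has(match[1]) ? match[1] : undefined;
    if (!sid) {
      // New client: mint a session id and hand it back as a cookie.
      sid = crypto.randomBytes(16).toString("hex");
      sessions.set(sid, { hits: 0 });
      res.setHeader("Set-Cookie", `sid=${sid}; HttpOnly`);
    }
    const session = sessions.get(sid)!;
    session.hits += 1;
    res.end(`requests seen for this session: ${session.hits}\n`);
  })
  .listen(8080);
```

The counter keeps working whether a client's requests arrive over one kept-alive connection or many, and whether or not a proxy funnels several clients through a single connection.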

How do you load balance TCP traffic?

I'm trying to determine how to load balance TCP traffic. I understand how HTTP load balancing works because it is a simple request/response architecture. However, I'm unsure how you load balance TCP traffic when your servers are trying to write data to other clients. I've attached an image of the workflow for a simple TCP chat server where we want to balance traffic across N application servers. Are there any load balancers out there that can do what I'm trying to do, or do I need to research a different topic? Thanks.
Firstly, your diagram assumes that the load balancer is acting as a (TCP) proxy, which is not always the case. Often Direct Routing (or Direct Server Return) is used, or Destination NAT is performed. In both cases the connection between the backend server and the client is direct, so in this case it is essentially the TCP handshake that is distributed amongst the backend servers. See the following for more info:
http://www.linuxvirtualserver.org/VS-DRouting.html
http://www.linuxvirtualserver.org/VS-NAT.html
Obviously TCP proxies do exist (HAProxy being one), in which case the proxy manages both sides of the connection, so your app would need to identify the client by the incoming IP/port (which would happen to be the proxy's rather than the client's). The proxy will handle getting the messages back to the client.
Either way, it comes down to application design, as I would imagine the tricky bit is having a common session store (a database of some kind, or a key=>value store such as Redis), so that when your app server says "I need to send a message to Frank" it can determine which backend server Frank is connected to (from the DB) and signal that server to send him the message. You reduce the problem of connections from the same client moving around between backend servers by having persistent (sticky) connections (all load balancers can do this), or by using something intrinsically persistent like a WebSocket.
This is probably a vast oversimplification as I have no experience with chat software. Obviously DB servers themselves can be distributed amongst several machines, for fault-tolerance and load balancing.
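For illustration, a minimal sketch of the session-store lookup described above, with an in-memory Map standing in for the shared store (Redis or a database in a real deployment) and all names invented for the example:

```typescript
// Each app server records which backend a user is connected to, so any
// server can route a message to the right one.
type ServerId = string;

const connectedOn = new Map<string, ServerId>(); // user -> backend server

function onClientConnect(user: string, serverId: ServerId): void {
  // In a real deployment this would be e.g. SET user:frank server-3 in Redis.
  connectedOn.set(user, serverId);
}

function routeMessage(toUser: string, deliver: (server: ServerId) => void): void {
  const server = connectedOn.get(toUser); // look up Frank's backend
  if (server) {
    deliver(server); // signal that backend to push the message to Frank
  }
}

// Usage: Frank connects through the balancer and lands on server-3.
onClientConnect("frank", "server-3");
routeMessage("frank", (s) => console.log(`forward message via ${s}`));
```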
