I'm trying to determine how to load balance TCP traffic. I understand how HTTP load balancing works because it is a simple request/response architecture. However, I'm unsure how you load balance TCP traffic when your servers need to push data out to other clients. I've attached an image of the workflow for a simple TCP chat server where we want to balance traffic across N application servers. Are there any load balancers out there that can do what I'm trying to do, or do I need to research a different topic? Thanks.
Firstly, your diagram assumes that the load balancer is acting as a (TCP) proxy, which is not always the case. Often Direct Routing (or Direct Server Return) is used, or Destination NAT is performed. In both cases the connection between the backend server and the client is direct, so in this case it is essentially the TCP handshake that is distributed amongst the backend servers. See the following for more info:
http://www.linuxvirtualserver.org/VS-DRouting.html
http://www.linuxvirtualserver.org/VS-NAT.html
Obviously TCP proxies do exist (HAProxy being one), in which case the proxy manages both sides of the connection, so your app would need to be able to identify the client by the incoming IP/port (which would happen to be the proxy's rather than the client's). The proxy will handle getting the messages back to the client.
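For illustration, here is a minimal sketch (in Python, and only the backend's point of view, not HAProxy itself) of identifying clients by the incoming IP/port of the accepted socket; behind a TCP proxy, that address is the proxy's rather than the real client's:

    import socket

    clients = {}  # (ip, port) -> socket; the app's only handle on "who" this is

    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("0.0.0.0", 9000))
    server.listen()

    while True:
        conn, peer = server.accept()  # behind a TCP proxy, peer is the proxy's IP/port
        clients[peer] = conn
        print("client as seen by the app:", peer)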
Either way, it comes down to application design. I would imagine the tricky bit is having a common session store (a database of some kind, or a key=>value store such as Redis), so that when your app server says "I need to send a message to Frank" it can determine which backend server Frank is connected to (from the DB) and signal that server to deliver the message. You reduce the problem of connections (from the same client) moving between different backend servers by having persistent connections (all load balancers can do this), or by using something intrinsically persistent like a websocket.
This is probably a vast oversimplification, as I have no experience with chat software. Obviously the DB servers themselves can be distributed across several machines, for fault tolerance and load balancing.
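To make the shared session store idea concrete, here is a minimal sketch assuming Redis (via redis-py) and made-up key/channel names; any database or key=>value store would work the same way:

    import json
    import redis

    r = redis.Redis(host="localhost", port=6379)
    SERVER_ID = "app-02"  # this server's identity (assumed to come from config)

    def register_connection(username):
        # Called when a user connects to *this* app server.
        r.hset("chat:connections", username, SERVER_ID)

    def send_message(to_user, text):
        server = r.hget("chat:connections", to_user)
        if server is None:
            return  # user is not connected anywhere
        # Signal the owning server (e.g. via pub/sub); it then writes the
        # message down its local connection to the user.
        r.publish("chat:server:" + server.decode(),
                  json.dumps({"to": to_user, "text": text}))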
I'm trying to calculate the ping speed from a client to a server.
I'm creating a complete application, so I'm writing the server logic as well. I'm wondering if I should use socket.io to ping the server from the client, or simply send a random HTTP HEAD request to the server, and whether one is more accurate than the other.
It depends on the frequency of the ping, I guess. For a simple ping every so often, I would probably just fire off an HTTP request as needed.
Every connection requires resources, so x connected clients means x connected sockets being managed, which has some overhead. This may not be an issue given your anticipated number of concurrent users, but you would get away with using fewer server resources by having clients connect via HTTP.
If you're making the rest of your calls via HTTP, then it makes sense to measure over the same protocol, since that gives a more representative number.
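As a rough illustration of the "fire off an HTTP request" approach, the sketch below (Python, placeholder URL) times a HEAD request; note that it measures the whole HTTP exchange, including connection setup unless the session reuses a connection, not a raw ICMP ping:

    import time
    import requests

    session = requests.Session()  # reuse the connection between pings

    def http_ping(url="https://example.com/"):
        start = time.perf_counter()
        session.head(url, timeout=5)
        return (time.perf_counter() - start) * 1000  # milliseconds

    print("%.1f ms" % http_ping())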
Consider an application that accesses a remote HTTPS server, sending JSON-formatted POST requests to a URL on the server and receiving JSON-formatted answers. The server does not support HTTP/2 multiplexing.
There are many requests, with widely varying workload (from idle to hundreds of TPS). JSON messages are on the order of 1 kbyte. Client and server are authenticated by certificates + private keys. The requests can be considered independent (in particular, the server treats requests alike for all HTTPS channels opened with the same client certificate).
HTTP/1.1 does not allow* multiple concurrent POST requests over the same connection. Therefore the throughput can't exceed N/(Tr+Ts) TPS, where N is the number of open HTTPS/TLS channels in use, Tr is the network round-trip delay, and Ts is the processing time on the server side (on the order of 30 ms under low load, due to database access and other factors). Opening an HTTPS connection costs at least 4 Tr, plus sizable CPU time on both sides. It looks like something is needed to manage a pool of HTTPS connections on the client side.
How is this issue usually handled?
What are common libraries or background daemons/services that automatically open new HTTPS connections as needed and reuse them when possible?
It would be nice if they detected when the server becomes unresponsive and handled fallback to a backup server at a different URL, with a return to the main server when it is up again.
Note: the next step would be load balancing, but then my load-balancing layer must somehow handle affinity between the requests, since they are not fully independent (sending a dependent request to the wrong server is reliably detected by the server, though).
[*] Due to how RFC 2616 is interpreted, I'm told.
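For reference, a sketch of the kind of client-side pool management the question is asking about, assuming Python's requests library; the hostnames, certificate paths, pool size and fallback logic are placeholders, not a recommendation:

    import requests
    from requests.adapters import HTTPAdapter

    session = requests.Session()
    session.cert = ("client.crt", "client.key")  # client certificate + key
    session.mount("https://", HTTPAdapter(pool_connections=1,
                                          pool_maxsize=20,  # up to N parallel channels
                                          max_retries=2))

    def post_json(payload):
        try:
            resp = session.post("https://primary.example.com/api",
                                json=payload, timeout=5)
            resp.raise_for_status()
            return resp.json()
        except requests.RequestException:
            # Crude fallback to a backup server when the primary is unresponsive.
            resp = session.post("https://backup.example.com/api",
                                json=payload, timeout=5)
            return resp.json()

Connections opened by the pool are reused across POSTs, so the TLS handshake cost is paid once per pooled connection rather than once per request.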
I know how load balancers work for HTTP requests: a client opens a connection with the LB, the LB forwards the request to the backend servers, the LB gets a response, and over the same connection it sends the response back to the client and closes the connection. I want to know the internal details of a load balancer for websockets: how the connections are maintained and how responses are sent to the client. I've read many questions on Stack Overflow, but none of them gave a clear picture of the internal implementation of an LB.
The LB just routes the connection to a server behind it.
So as long as you keep the connection open, you will stay connected to the same server and will not communicate with the LB again.
Depending on the client, you could be routed to another server on reconnection.
I'm not sure how it works when some libraries fall back to JSON-P, though.
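To make "the LB just routes the connection" concrete, here is a toy layer-4 proxy sketch (Python, made-up backend addresses): a backend is picked once when the client connects, and after that the proxy only shovels bytes in both directions until either side closes:

    import itertools
    import socket
    import threading

    BACKENDS = itertools.cycle([("10.0.0.1", 8080), ("10.0.0.2", 8080)])

    def pump(src, dst):
        # Copy bytes one way until the connection drops, then close the other side.
        try:
            while data := src.recv(4096):
                dst.sendall(data)
        except OSError:
            pass
        finally:
            dst.close()

    listener = socket.create_server(("0.0.0.0", 8000))
    while True:
        client, _ = listener.accept()
        backend = socket.create_connection(next(BACKENDS))  # chosen once, then sticky
        threading.Thread(target=pump, args=(client, backend), daemon=True).start()
        threading.Thread(target=pump, args=(backend, client), daemon=True).start()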
Implementations of load balancers vary greatly. There are load balancers that support websockets, like F5's BIG-IP (https://support.f5.com/kb/en-us/solutions/public/14000/700/sol14754.html), and LBs that I don't think support websockets, like AWS ELB (there is a thread where somebody says they could make it work with ELB, but I suppose they added some other component behind the ELB: How do you get Amazon's ELB with HTTPS/SSL to work with Web Sockets?).
Load balancers not only act as terminators of HTTP connections; they can also terminate HTTPS, SSL, and TCP connections. They can implement stickiness based on different parameters, like cookies, origin IP, etc. (as F5 does). ELBs use only cookies, which can be either application-generated or LB-generated (both only with HTTP or HTTPS). Stickiness can also be kept for a certain defined, sometimes configurable, time.
Now, in order to forward data corresponding to websockets, they need to terminate, and forward, connections at the SSL or TCP level (not HTTP or HTTPS), unless they understand the websocket protocol (I don't know whether any do). Additionally, they need to keep stickiness to the server with which the connection was opened. This is not possible with ELB, but it is with more complex LBs like BIG-IP.
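As a toy illustration of stickiness based on origin IP (one of the persistence methods mentioned above), the sketch below hashes the client's IP so the same client always maps to the same backend while the pool is unchanged; real LBs like BIG-IP offer far richer persistence options:

    import hashlib

    BACKENDS = ["ws1.internal:8080", "ws2.internal:8080", "ws3.internal:8080"]

    def pick_backend(client_ip):
        digest = hashlib.sha1(client_ip.encode()).digest()
        return BACKENDS[int.from_bytes(digest[:4], "big") % len(BACKENDS)]

    print(pick_backend("203.0.113.7"))  # same IP -> same backend every time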
These are two phrases I hear about very often, mainly associated with Nginx. Can someone give me a layman's definition?
Definitions are often difficult to understand. I guess you just need some explanation for their use case.
A short explanation is: load balancing is one of the functionalities of a reverse proxy, and a reverse proxy is one kind of software that can do load balancing.
And a long explanation is given below.
For example, say a service of your company has customers in the UK and Germany. Because the policy is different for these two countries, your company has two web servers, uk.myservice.com for the UK and de.myservice.com for Germany, each with different business logic. In addition, your company wants there to be only one unified endpoint, myservice.com, for the service. In this case, you need to set up a reverse proxy as the unified endpoint. The proxy takes the URL myservice.com and rewrites the URL of incoming requests so that requests from the UK (determined by source IP) go to uk.myservice.com and requests from Germany go to de.myservice.com. From the point of view of a client in the UK, it never knows the response was actually generated by uk.myservice.com.
In this case, the load of request traffic to the service is in fact spread across the servers uk.myservice.com and de.myservice.com as a side effect, but we normally don't say it is being used as a load balancer; we just call it a reverse proxy.
But let's say your company uses the same policy for all countries and has two servers, a.myservice.com and b.myservice.com, only because the workload is too heavy for one server machine. In this case, we normally call the reverse proxy a load balancer, to emphasize the reason it is being used.
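As a toy sketch of the routing decision described above (the IP ranges and hostnames are invented for illustration), the reverse proxy might map the request's source IP to a country and pick the upstream accordingly:

    import ipaddress

    UK_NET = ipaddress.ip_network("198.51.100.0/24")  # pretend: UK customers
    DE_NET = ipaddress.ip_network("203.0.113.0/24")   # pretend: German customers

    def choose_upstream(client_ip):
        ip = ipaddress.ip_address(client_ip)
        if ip in UK_NET:
            return "uk.myservice.com"
        if ip in DE_NET:
            return "de.myservice.com"
        return "uk.myservice.com"  # arbitrary default for the sketch

    # The client only ever talks to myservice.com; the proxy forwards the
    # request to the upstream chosen here and relays the response back.
    print(choose_upstream("198.51.100.42"))  # -> uk.myservice.com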
Here is the basic definition:
A reverse proxy is a proxy host that receives requests from a client and sends them to one of the servers behind itself. Nginx and Apache httpd are commonly used as reverse proxies. These sit in the administrative network of the web server that serves the request.
This is in contrast to a (forward) proxy, which sits in front of a client and sends requests to a web server on the client's behalf. As an example, your corporate network address translator is a forward proxy. These sit in the administrative network of the client from which the request originates.
Load balancing is a function performed by reverse proxies. The client requests are received by a load balancer, and the load balancer tries to send that request to one of the nodes (hosts) in the server pool, in an attempt to balance the load across various nodes.
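For illustration, here is a tiny sketch of one balancing strategy (least connections): each new request goes to the pool member currently handling the fewest in-flight requests. Real load balancers track this per backend and support several such strategies (round robin, source hashing, etc.):

    active = {"a.myservice.com": 0, "b.myservice.com": 0}  # in-flight requests per node

    def pick_node():
        node = min(active, key=active.get)  # node with the fewest active requests
        active[node] += 1                   # request starts
        return node

    def release(node):
        active[node] -= 1                   # request finished

    print(pick_node(), pick_node())         # spreads across a. and b.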
I see both of them as functionality of an HTTP/web server.
The load balancer's job is to distribute the workload between server nodes in a way that makes the best use of them.
A reverse proxy is an interface to the external world, forwarding requests to a server node (even when there is only a single node).
Its other use cases are caching of static content, compression, etc.
I have read in many blogs/books that maintaining a persistent connection between the proxy cache and server can provide a lot of benefits. So you have:
C ---- Proxy ------ S
Now, if the connection between the proxy and the server is persistent it can provide many benefits.
1) We save on the TCP 3-way handshake.
2) We don't have to perform slow start. The congestion window will already be large, since many clients will be sending data through this TCP flow.
The only problem is: how do I de-multiplex the replies I receive from the server in order to send them back to the correct client? How do I find out which reply corresponds to which client?
Recent nginx versions provide this feature. You may refer to the code (upstream, proxy, upstream keepalive, ...).
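For intuition, here is a rough sketch of the bookkeeping involved (the names are illustrative, not nginx's internals): because HTTP/1.1 returns responses on a given connection in the order the requests were sent, the proxy can keep a FIFO queue of waiting clients per persistent upstream connection and hand each response to the client at the head of the queue:

    from collections import deque

    class UpstreamConnection:
        """Bookkeeping for one persistent proxy-to-server connection."""

        def __init__(self):
            self.pending = deque()  # one callback per in-flight request, in send order

        def send_request(self, request_bytes, deliver_to_client):
            self.pending.append(deliver_to_client)
            # ... write request_bytes down the persistent upstream socket ...

        def on_response(self, response_bytes):
            # Responses on one HTTP/1.1 connection arrive in request order,
            # so the head of the queue is always the right client.
            self.pending.popleft()(response_bytes)

    conn = UpstreamConnection()
    conn.send_request(b"GET /a HTTP/1.1\r\n\r\n", lambda body: print("to client A:", body))
    conn.on_response(b"HTTP/1.1 200 OK ...")  # delivered to client A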