I need some suggestions for setting up auto-scaling of websocket connections in nginx. Let's say I have nginx configured to proxy websocket connections to 4 upstream backend servers. When I try to add a 5th server to the upstream block and reload nginx, I see that nginx keeps the existing worker processes running and additionally creates new ones to serve new websocket connections. I guess the old workers remain until the earlier connections close.
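For reference, a minimal sketch of the kind of config I mean (addresses and the location path are placeholders):

[nginx.conf]
# Four websocket backends; the 5th is what I add before "nginx -s reload"
upstream ws_backend {
    server 10.0.0.1:8080;
    server 10.0.0.2:8080;
    server 10.0.0.3:8080;
    server 10.0.0.4:8080;
    # server 10.0.0.5:8080;
}

server {
    listen 80;
    location /ws/ {
        proxy_pass http://ws_backend;
        # Needed for the websocket Upgrade handshake to pass through
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}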
Ideally, as we auto-scale, we want the number of nginx worker processes to remain constant. Is there a way to transfer the socket connections from the older worker processes to the newer ones?
Thanks.
Related
Hi, I am using nginx stream mode to proxy TCP connections. Is it possible that, if I restart my app on the upstream, nginx could automatically reconnect to the upstream without losing the TCP connection on the downstream?
I found a clue in a comment on a HiveMQ blog post; hope this helps. I copied it below:
Hi Sourav,
the load balancer doesn't have any knowledge of MQTT; at least I don't know of any MQTT-aware load balancer.
HiveMQ replicates its state automatically in the cluster. If a cluster node goes down and the client reconnects (and is assigned to another broker instance by the LB), it can resume its complete session. The client does not need to resubscribe.
Hope this helps, Dominik from the HiveMQ Team
I have an ejabberd cluster in AWS that I want to load balance. I initially tried putting an ELB in front of the nodes, but that makes the sessions non-sticky. I then enabled proxy protocol on the ELB and introduced an HAProxy node between the ELB and the ejabberd cluster. My assumption/understanding was that the HAProxy instance would use the TCP proxy and ensure the sessions stay sticky on the ejabberd servers.
However, that still does not seem to be happening! Is this even possible in the first place? Introducing the cookie config in the haproxy.cfg file gives an error that cookies are enabled only for HTTP, so how can I have TCP sessions stay sticky on the server (something like the sketch below)?
Please do help, as I seem to be out of ideas here!
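For reference, the TCP-mode equivalent of cookie stickiness in HAProxy is a stick-table keyed on the client's source IP; a rough sketch of what I was trying to achieve (backend name and addresses are placeholders):

[haproxy.cfg]
backend ejabberd_nodes
    mode tcp
    balance roundrobin
    # Cookies only exist in HTTP mode; in TCP mode, stick by source IP
    stick-table type ip size 200k expire 30m
    stick on src
    server ejabberd1 10.0.1.10:5222 check
    server ejabberd2 10.0.1.11:5222 check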
ejabberd does not require sticky load balancing; you do not need to implement it. Just use an ejabberd cluster with an ELB or HAProxy in front, without stickiness.
Thanks #Michael-sqlbot and #Mickael - it seems it had to do with the idle timeout on the ELB. That was set to 60 seconds, so the TCP connection was getting refreshed if I didn't push any data from the client to the ejabberd server. After playing with that plus the health check interval, I can see the ELB giving me a long-running connection... Thanks.
I still have to figure out how to get the client IPs captured in ejabberd (I believe enabling proxy protocol on the ELB would help), but that is a separate investigation...
I plan to use nginx for proxying websockets. When performing an nginx reload/HUP, I understand that nginx waits for the old worker processes to finish processing all requests. With websocket connections, however, this may not happen for a long time, as the connections are persistent. Is there an option/roadmap to forcibly kill old worker processes after a timeout on reload?
References:
http://nginx.org/en/docs/control.html
http://forum.nginx.org/read.php?21,247573,247651#msg-247651
Thanks
Unless you either set proxy_read_timeout 1d or have the backend send a ping message to keep the connection alive, nginx closes websocket connections after 60 seconds. That default value was chosen for a reason.
See what an nginx core developer says:
There is proxy_read_timeout (http://nginx.org/r/proxy_read_timeout), which applies to WebSocket connections as well. You have to bump it if your backend does not send anything for a long time. Alternatively, you may configure your backend to send websocket ping frames periodically to reset the timeout (and check whether the connection is still alive).
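In config terms, that advice amounts to something like this sketch (upstream name and path are placeholders):

[nginx.conf]
location /ws/ {
    proxy_pass http://ws_backend;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
    # Don't close idle (but alive) websocket connections after the
    # default 60s; alternatively, have the backend send ping frames
    proxy_read_timeout 1d;
}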
Having said that, nothing should stop you from using the USR2+QUIT signal combination usually used when you gracefully restart nginx during a binary upgrade. nginx master/worker processes rarely consume more than 50 MB of memory, so keeping multiple masters isn't that expensive. USR2 forks a new master and spawns its workers, after which QUIT gracefully shuts down the old workers and master.
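A sketch of that signal sequence, assuming the default pid file location:

# Fork a new master process (which spawns fresh workers);
# the old master's pid file is renamed to nginx.pid.oldbin
kill -USR2 $(cat /var/run/nginx.pid)
# Gracefully shut down the old master and its workers
kill -QUIT $(cat /var/run/nginx.pid.oldbin)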
Since HAProxy v1.5.0 it has been possible to temporarily stop reverse-proxying traffic to a frontend using the
set maxconn frontend <frontend_name> 0
command.
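The command is issued over the stats socket, which has to be declared with admin level; a sketch, assuming a socket at /var/run/haproxy.sock and a frontend named fe_main:

# In haproxy.cfg:
#   global
#       stats socket /var/run/haproxy.sock level admin
echo "set maxconn frontend fe_main 0" | socat stdio /var/run/haproxy.sock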
I've noticed that if HAProxy is configured to maintain keepalive connections between HAProxy and a client, those connections continue to be served, whereas new ones keep waiting for the frontend to be "un-paused".
The question is: is it possible to terminate the current keepalive connections gracefully, so that clients are required to establish new connections?
I've only found the shutdown session and shutdown sessions commands, but they are obviously not graceful at all.
The purpose of all this is to make some changes on the server seamlessly; otherwise, with the current configuration, it would require a scheduled maintenance window.
I'm using nginx as a reverse proxy, and I find more than 30k ports in the TIME_WAIT state on the upstream server (Windows 2003). I know my servers are "out of ports", as discussed here (http://nginx.org/pipermail/nginx/2009-April/011255.html), and I have set both nginx and the upstream server to reuse TIME_WAIT sockets and to recycle them more quickly.
[sysctl -p]
……
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
But nginx hangs, and the "connection timed out while connecting to upstream" error can still be found in the nginx error log when the RPS to the upstream stays above 1000 for a minute. When the upstream is Windows, the server runs "out of ports" in seconds.
Any ideas? A connection pool with a waiting queue? Maxim Dounin wrote a useful module for keeping connections to memcached alive, but why can't it support web servers?
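(For reference, I count the TIME_WAIT sockets with something like the following; the netstat variant is what I run on the Windows 2003 upstream:)

ss -tan state time-wait | wc -l
netstat -an | find /c "TIME_WAIT"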
I am new to nginx, but from what I know so far, you need to reduce your net.ipv4.tcp_fin_timeout value, which defaults to 60 seconds. Out of the box, nginx doesn't support HTTP connection pooling with the backend, so every request to the backend creates a new connection. With 64K ports and 60 seconds of wait before a port can be reused, the average RPS cannot exceed roughly 64K / 60 ≈ 1K per second. You can either reduce your net.ipv4.tcp_fin_timeout value on both the nginx server and the backend server, or you can assign multiple IP addresses to the backend box and configure nginx to treat these "same servers" as different servers, as sketched below.
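A sketch of the multiple-address workaround, combined with the upstream keepalive support that nginx 1.1.4+ added for HTTP backends (addresses are placeholders):

[nginx.conf]
upstream backend {
    # The same physical backend reachable via two addresses: each
    # (local IP, local port, remote IP, remote port) tuple is distinct,
    # so the usable port space is effectively doubled
    server 192.168.1.10:80;
    server 192.168.1.11:80;
    # Keep up to 32 idle connections per worker for reuse instead of
    # opening a fresh connection per request (nginx 1.1.4+)
    keepalive 32;
}

server {
    location / {
        proxy_pass http://backend;
        # Upstream keepalive requires HTTP/1.1 and a cleared
        # Connection header
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}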