nginx - ungraceful worker termination after timeout

I plan to use nginx for proxying WebSockets. When performing an nginx reload / HUP, I understand that nginx waits for the old worker processes to finish processing all outstanding requests. With WebSocket connections, however, this may not happen for a long time because the connections are persistent. Is there an option / roadmap item to forcibly kill old worker processes after a timeout on reload?
References:
http://nginx.org/en/docs/control.html
http://forum.nginx.org/read.php?21,247573,247651#msg-247651
Thanks

Unless you either set proxy_read_timeout to something large (e.g. 1d) or have the application send ping messages to keep the connection alive, nginx closes idle connections after 60 seconds. That default value was chosen for a reason.
See what an nginx core developer says:
There is proxy_read_timeout (http://nginx.org/r/proxy_read_timeout)
which as well applies to WebSocket connections. You have to bump it
if your backend does not send anything for a long time. Alternatively,
you may configure your backend to send websocket ping frames
periodically to reset the timeout (and check if the connection is
still alive).
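
In configuration terms, that advice looks roughly like the following sketch of a WebSocket proxy location (the upstream name and the 1d value are placeholder assumptions, not from the thread):

    location /ws/ {
        proxy_pass http://backend;               # "backend" is a hypothetical upstream
        proxy_http_version 1.1;                  # WebSocket requires HTTP/1.1
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_read_timeout 1d;                   # bump the 60s default for long-lived sockets
    }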
Having said that, nothing stops you from using the USR2+QUIT signal combination that is usually used when gracefully restarting nginx during a binary upgrade. nginx master/worker processes rarely consume more than 50MB of memory, so keeping multiple masters around isn't that expensive. USR2 forks a new master and spawns its workers, after which the old workers and master can be shut down gracefully.
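
A sketch of that signal sequence, assuming the default pid file location (may be /var/run/nginx.pid or elsewhere on your system):

    kill -USR2 $(cat /run/nginx.pid)           # fork a new master and spawn its workers
    # the old master's pid is now stored in /run/nginx.pid.oldbin
    kill -WINCH $(cat /run/nginx.pid.oldbin)   # gracefully shut down the old workers
    kill -QUIT $(cat /run/nginx.pid.oldbin)    # quit the old master once it has drained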

Related

nginx reload ignores worker_shutdown_timeout

Clients connect to my nginx instance with a keep-alive of 15s. I set worker_shutdown_timeout to 30s, and the server keep-alive to 90s.
When I send a HUP signal or run nginx -s reload, nginx creates new workers and immediately shuts down the old ones. This causes my clients to get 499 EOFs.
Any idea what I am doing wrong?
Keepalive connections are closed immediately regardless of the worker_shutdown_timeout value, as clients are expected to re-open them as needed. The worker_shutdown_timeout applies to connections with actual requests being processed; these requests will be terminated when the shutdown timeout expires.
If your clients cannot handle a keepalive connection being closed by the server, there is probably room for improvement in the client code.
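
For reference, a minimal sketch of where these directives live; worker_shutdown_timeout belongs in the main context, and the values below are simply the ones from the question:

    worker_shutdown_timeout 30s;    # main context: cap on graceful worker shutdown

    events { }

    http {
        keepalive_timeout 90s;      # idle keepalive connections are still closed
                                    # immediately on reload, regardless of the cap
    }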

nginx stream mode reconnect to upstream without close downstream connection

Hi, I am using nginx stream mode to proxy TCP connections. Would it be possible, when I restart my app on the upstream, for nginx to reconnect to the upstream automatically without losing the TCP connection on the downstream side?
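
For context, a stream proxy of the kind described might look like this minimal sketch (addresses and ports are hypothetical):

    stream {
        upstream app {
            server 127.0.0.1:9001;   # the backend app that gets restarted
        }
        server {
            listen 9000;
            proxy_pass app;          # downstream TCP clients connect here
        }
    }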
I found a clue in this HiveMQ blog post comment; hope it helps. I copied it below:
Hi Sourav,
the load balancer doesn’t have any knowledge of MQTT; at least I don’t
know any MQTT-aware load balancer.
HiveMQ replicates its state automatically in the cluster. If a cluster
node goes down and the client reconnects (and is assigned to another
broker instance by the LB), it can resume its complete session. The
client does not need to resubscribe.
Hope this helps, Dominik from the HiveMQ Team

Autoscaling WebSocket connections in nginx

I need some suggestions for setting up auto-scaling of WebSocket connections in nginx. Let's say I have nginx configured to proxy WebSocket connections to 4 upstream backend servers. When I try to add a 5th server to the upstream block and reload nginx, I see that nginx keeps the existing worker processes running and additionally creates new ones to serve new WebSocket connections. I guess the old workers remain until the earlier connections close.
Ideally, as we auto-scale, we want the number of nginx worker processes to remain the same. Is there a way to transfer the socket connections from the older worker processes to the newer worker processes?
Thanks.
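
For context, the change in question is just one more server line in the upstream block; a sketch with hypothetical names and addresses:

    upstream ws_backend {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;
        server 10.0.0.4:8080;
        server 10.0.0.5:8080;   # newly added; the reload leaves old workers draining
    }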

Freezing haproxy traffic with maxconn 0 and keepalive connections

Since haproxy v1.5.0 it has been possible to temporarily stop reverse-proxying traffic to frontends using the
set maxconn frontend <frontend_name> 0
command.
I've noticed that if haproxy is configured to maintain keepalive connections between haproxy and a client, those connections continue to be served, whereas new ones keep waiting for the frontend to be "un-paused".
The question is: is it possible to terminate the current keepalive connections gracefully, so that clients are forced to establish new connections?
I've only found the shutdown session and shutdown sessions commands, but they are obviously not graceful at all.
The purpose of all this is to make some changes on the server seamlessly; otherwise, with the current configuration, it would require a scheduled maintenance window.
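
For reference, these runtime API commands are issued over the admin socket; the frontend/server names and socket path below are assumptions:

    echo "set maxconn frontend fe_main 0" | socat stdio unix-connect:/var/run/haproxy.sock
    echo "shutdown sessions server bk_app/srv1" | socat stdio unix-connect:/var/run/haproxy.sock   # abrupt, not graceful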

HTTP Proxy/FastCGI/SCGI not closing connection when client disconnected - bug or feature?

I'm working on Comet support for the CppCMS framework via long XMLHttpRequest polls. In many cases, such a request is closed by the client before any response from the server is given: for example, the page is closed, the user moves to another page, or the page is simply refreshed.
On the server side I expect to receive a notification that the connection has been dropped. I tested the application via 3 connectors: FastCGI, SCGI and plain HTTP proxy.
Of the 3 major UNIX web servers (Apache2, lighttpd and nginx), only the last closed the connection as expected, allowing my application to remove the request from the wait queue; this worked for both the FastCGI and HTTP proxy connectors. (nginx does not have an SCGI module by default.)
The others, Apache and lighttpd, do not close the connection or inform the backend about disconnected clients; they proceed as if the client were still online. This happens for all 3 supported APIs: FastCGI, SCGI and HTTP proxy.
I have opened an issue for lighttpd, but what concerns me more is that Apache, a web server as mature and well supported as lighttpd, also does not disclose to the backend that the client has gone.
Questions:
1. Is this a bug or a feature? Is there any reason not to close the connection between the web server and the application backend?
2. Are there real-life Comet applications working behind these servers via FastCGI/SCGI/HTTP-proxy backends?
3. If so, how do they deal with this issue? I understand that I can time out all connections every 10 seconds, but I would like to keep them idle as long as the client is listening, because this allows easier scale-up: each connection is very cheap, the cost being only the open socket.
Thanks!
(1) Feature. Or, more specifically, fallout from an implementation detail.
A TCP/IP connection does not involve a constant flow of traffic back and forth. Thus, there is no way to know that a client is gone without (a) the client telling you it is closing the connection or (b) a timeout.
(2) I'm not specifically familiar with Comet or CppCMS. But, yes, there are all kinds of CMS servers running behind the mentioned web servers and they all have to deal with this issue (and, yes, it is a pain).
(3) Timeouts are the only way, but you can mitigate the pain, so to speak. Have the client ping the server across the connection every N seconds when there is otherwise no activity. The ping doesn't have to do anything, and you can tack stuff onto the reply: notifications of concurrent edits or whatever you need.
You are correct in that it is surprising that mod_fastcgi doesn't support telling the backend that Apache has detected the disconnect or the connection timed out. And you aren't the first to be dismayed.
The second patch on this page should fix that particular issue:
http://osdir.com/ml/web.fastcgi.devel/2006-02/msg00015.html
http://ncannasse.fr/blog/tora_comet
I don't have any concrete information for you, but this article does mention that they can detect when the client has disconnected from Apache. See tora.Queue. And it sounds like the source is available in the neko CVS, so you might be able to find some clues there. Good luck.
