How can Nginx be upgraded without dropping any requests? - unix

According to the Nginx documentation:
If you need to replace nginx binary
with a new one (when upgrading to a
new version or adding/removing server
modules), you can do it without any
service downtime - no incoming
requests will be lost.
My coworker and I were trying to figure out: how does that work?. We know (we think) that:
Only one process can be listening on port 80 at a time
Nginx creates a socket and connects it to port 80
A parent process and any of its children can all bind to the same socket, which is how Nginx can have multiple worker children responding to requests
We also did some experiments with Nginx, like this:
Send a kill -USR2 to the current master process
Repeatedly run ps -ef | grep unicorn to see any unicorn processes, with their own pids and their parent pids
Observe that the new master process is, at first, a child of the old master process, but when the old master process is gone, the new master process has a ppid of 1.
So apparently the new master process can listen to the same socket as the old one while they're both running, because at that time, the new master is a child of the old master. But somehow the new master process can then become... um... nobody's child?
I assume this is standard Unix stuff, but my understanding of processes and ports and sockets is pretty darn fuzzy. Can anybody explain this in better detail? Are any of our assumptions wrong? And is there a book I can read to really grok this stuff?

For specifics: http://www.csc.villanova.edu/~mdamian/Sockets/TcpSockets.htm describes the C library for TCP sockets.
I think the key is that after a process forks while holding a socket file descriptor, the parent and child are both able to call accept() on it.
So here's the flow. Nginx, started normally:
Calls socket() and bind() and listen() to set up a socket, referenced by a file descriptor (integer).
Starts a thread that calls accept() on the file descriptor in a loop to handle incoming connections.
Then Nginx forks. The parent keeps running as usual, but the child immediately execs the new binary. exec() wipes out the old program, memory, and running threads, but inherits open file descriptors: see http://linux.die.net/man/2/execve. I suspect the exec() call passes the number of the open file descriptor as a command line parameter.
The child, started as part of an upgrade:
Reads the open file descriptor's number from the command line.
Starts a thread that calls accept() on the file descriptor in a loop to handle incoming connections.
Tells the parent to drain (stop accept()ing, and finish existing connections), and to die.

I have no idea how nginx does it, but basically, it could just exec the new binary, carrying the listening socket with it the new process (actually, it remains the same process, it just replaces the program executing in it). The listening socket has a backlog of incoming connections, and as long as it's fast enough to boot up, it should be able to start processing them before it overflows. If not, it could probably fork first, exec, and wait for it to boot up to the point where it's ready to process incoming requests, then hand over the command of the listening socket (file descriptors are inherited when forking, both have access to it) via some internal mechanism, before exiting. Noting your observations, this looks like what it's doing (if your parent process dies, your ppid is reassigned to init, i.e. pid 1)
If it has multiple processes competing to accept on the same listening socket (again, I have no idea how nginx does it, perhaps it has a dispatching process?), then you could replace them one by one, by ordering them to exec the new program, as above, but one at a time, as to never drop the ball. Note that during such a process there would never be any new pids or parent/child relationship changes.
At least, I think that's probably how I would do it, off the top of my head.

Related

handle server shutdown while serving http request

Scenario : The server is in middle of processing a http request and the server shuts down. There are multiple points till where the code has executed. How are such cases typically handled ?. A typical example could be that some downstream http calls had to be made as a part of the incoming http request. How to find whether such calls were made or not made when the shutdown occurred. I assume that its not possible to persist every action in the code flow. Suggestions and views are welcome.
There are two kinds of shutdowns to consider here.
There are graceful shutdowns: when the execution environment politely asks your process to stop (e.g. systemd sends a SIGTERM) and expects it to exit on its own. If your process doesn’t exit within a few seconds, the environment proceeds to kill the process in a more forceful way.
A typical way to handle a graceful shutdown is:
listen for the signal from the environment
when you receive the signal, stop accepting new requests...
...and then wait for all current requests to finish
Exactly how you do this depends on your platform/framework. For instance, Go’s standard net/http library provides a Server.Shutdown method.
In a typical system, most shutdowns will be graceful. For example, when you need to restart your process to deploy a new version of code, you do a graceful shutdown.
There can also be unexpected shutdowns: e.g. when you suddenly lose power or network connectivity (a disconnected server is usually as good as a dead one). Such faults are harder to deal with. There’s an entire body of research dedicated to making distributed systems robust to arbitrary faults. In the simple case, when your server only writes to a single database, you can open a transaction at the beginning of a request and commit it before returning the response. This will guarantee that either all the changes are saved to the database or none of them are. But if you call multiple downstream services as part of one upstream HTTP request, you need to coordinate them, for example, with a saga.
For some applications, it may be OK to ignore unexpected shutdowns and simply deal with any inconsistencies manually if/when they arise. This depends on your application.

How nginx reload work ? why it is zero-downtime

refer to nginx official docs . the reload command of nginx is for reload of configuration files ,and during the progress , there's no downtime of the service .
i've learned that it wait requests that already connected until it finished ,and stop accept any new request . the idea is cool , but how does it deal with the keep-live connections ? because those long-live connections won't close and there continuous request comes along .
Here's the summary:
http://nginx.org/en/docs/control.html
The master process first checks the syntax validity, then tries to
apply new configuration. If this succeeds, it starts new worker
processes, and sends messages to old worker processes requesting them
to shut down gracefully.
That means it would keep older processes handling unclosed connections while having new processes working according to the updated configuration.
From this perspective connections with keep-alive are no different from other unclosed connections.
In versions prior to 1.11.11 such "old" processes could hang indefinitely long (according to #Alexey, haven't checked it though), from 1.11.11 there’s a configuration setting controlling this
http://nginx.org/en/docs/ngx_core_module.html#worker_shutdown_timeout

Signal and waitpid coexistence

I have the following question: can I use a signal handler for SIGCHLD and at specific places use waitpid(3) instead?
Here is my scenario: I start a daemon process that listens on a socket (at this point it's irrelevant if it's a TCP or a UNIX socket). Each time a client connects, the daemon forks a child to handle the request and the parent process keeps on accepting incoming connections. The child handling the request needs at some point to execute a command on the server; let's assume in our example that it needs to perform a copy like this:
cp -a /src/folder /dst/folder
In order to do so, the clild forks a new process that uses execl(3) (or execve(3), etc.) to execute the copy command.
In order to control my code better, I would ideally wish to catch the exit status of the child executing the copy with waitpid(3). Moreover, since my daemon process is forking children to handle requests, I need to have a signal handler for SIGCHLD so as to prevent zombie processes from being created.
In my code, I setup a signal handler for SIGCHLD using signal(3), I daemonize my program by forking twice, then I listen on my socket for incoming connections, I fork a process to handle each coming request and my child-process forks a grand-child-process to perform the copy, trying to catch its exit status via waitpid(3).
What happens is that SIGCHLD is caught by my handler when a grand-child-process dies, before waitpid(3) takes action and waitpid(3) returns -1 even though the grand-child-process exits with success.
My first thought was to add:
signal(SIGCHLD, SIG_DFL);
just before forking the child process to handle my connecting clients, without any success. Using SIG_IGN didn't work either.
Is there a suggestion on how to make my scenario work?
Thank you all for your help in advance!
PS. If you need code, I'll post it, but due to its size I decided to do so only if necessary.
PS2. My intention is to use my code in FreeBSD, but my checks are performed in Linux.
EDIT [SOLVED]:
The problem I was facing is solved. The "unexpected" behaviour was caused by my waitpid(3) handling code which was buggy at some point.
Hence, the above method can indeed be used to allow for signal(3) and waitpid(3) coexistence in daemon-like programs.
Thanx for your help and I hope that this method helps someone wishing to accomplish such a thing!

How should worker processes listen on same port

I am trying to write a simple Web Server in QT which will handle HTTP requests.
I know QT has QThreads and I can use it for handling simultaneous connections.
What I want is to have atleast 2 worker processes so that I can pass most of the load of the requests from the master to the workers.
The problem is the master is listening on port 80 and it does the "accept" when there is an incoming connection.
How should I write the code for the worker to accept an incoming connection ?
The worker process cannot listen on the same port as the master binds to the port 80.
I know NGinx and LigHTTPD have worker processes and it is only the master that listens on the port 80.
Please let me know how should I do this or how does NGINX or Lighttpd implement this.
You have two choices:
The master process does not accept connections on the socket but the workers do. When a connection comes in, it will be accept()ed by one of the workers at random. You can't control which of the workers which are blocked on accept() will receive it.
Only the master process accepts connections. Then the master process dispatches the accepted connections to the workers over another socket using sendmsg() with SCM_RIGHTS. You could have a bunch of socketpair()s, one between the master and each worker, to accomplish this.
If you are using worker threads rather than worker processes, then you don't have to use the SCM_RIGHTS on a socket trick to send the accepted connections to workers because the file descriptors are already valid in the worker thread (there is no separate file descriptor namespace for separate threads of a single process). In this case you can just use simple signaling like a semaphore and a queue to let the worker know it should pick up a new connection.
The way I would do it would be similar to the Threaded Fortune Cookie example except I wouldn't have the FortuneThread classes derived from QThread itself. I would rather have them derived from QRunnable and then pass them on to QThreadPool.
This way, you can put a limit on the number of active threads running in the system and you don't have to constantly keep creating and destroying threads, you just reuse them until your thread pool has been idle for a while.

Erlang accept incoming tcp connections dynamically

What I am trying to solve: have an Erlang TCP server that listens on a specific port (the code should reside in some kind of external facing interface/API) and each incoming connection should be handled by a gen_server (that is even the gen_tcp:accept should be coded inside the gen_server), but I don't actually want to initially spawn a predefined number of processes that accepts an incoming connection). Is that somehow possible ?
Basic Procedure
You should have one static process (implemented as a gen_server or a custom process) that performs the following procedure:
Listens for incoming connections using gen_tcp:accept/1
Every time it returns a connection, tell a supervisor to spawn of a worker process (e.g. another gen_server process)
Get the pid for this process
Call gen_tcp:controlling_process/2 with the newly returned socket and that pid
Send the socket to that process
Note: You must do it in that order, otherwise the new process might use the socket before ownership has been handed over. If this is not done, the old process might get messages related to the socket when the new process has already taken over, resulting in dropped or mishandled packets.
The listening process should only have one responsibility, and that is spawning of workers for new connections. This process will block when calling gen_tcp:accept/1, which is fine because the started workers will handle ongoing connections concurrently. Blocking on accept ensure the quickest response time when new connections are initiated. If the process needs to do other things in-between, gen_tcp:accept/2 could be used with other actions interleaved between timeouts.
Scaling
You can have multiple processes waiting with gen_tcp:accept/1 on a single listening socket, further increasing concurrency and minimizing accept latency.
Another optimization would be to pre-start some socket workers to further minimize latency after accepting the new socket.
Third and final, would be to make your processes more lightweight by implementing the OTP design principles in your own custom processes using proc_lib (more info). However, this you should only do if you benchmark and come to the conclusion that it is the gen_server behavior that slows you down.
The issue with gen_tcp:accept is that it blocks, so if you call it within a gen_server, you block the server from receiving other messages. You can try to avoid this by passing a timeout but that ultimately amounts to a form of polling which is best avoided. Instead, you might try Kevin Smith's gen_nb_server instead; it uses an internal undocumented function prim_inet:async_accept and other prim_inet functions to avoid blocking.
You might want to check out http://github.com/oscarh/gen_tcpd and use the handle_connection function to convert the process you get to a gen_server.
You should use "prim_inet:async_accept(Listen_socket, -1)" as said by Steve.
Now the incoming connection would be accepted by your handle_info callback
(assuming you interface is also a gen_server) as you have used an asynchronous
accept call.
On accepting the connection you can spawn another ger_server(I would recommend
gen_fsm) and make that as the "controlling process" by calling
"gen_tcp:controlling_process(CliSocket, Pid of spwned process)".
After this all the data from socket would be received by that process
rather than by your interface code. Like that a new controlling process
would be spawned for another connection.

Resources