Could you please explain the logic of the UNIX signal system to me: first it sends SIGHUP to the process group and then it sends SIGCONT, even though the main idea of SIGHUP is "kill yourself, there is no terminal anymore"?
The SIGCONT is sent in case the process is stopped, for example by SIGTSTP (which is what pressing Ctrl+Z sends) or by SIGSTOP. A stopped process cannot respond to SIGHUP until it is continued.
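The effect can be observed with a small experiment. The following is a minimal sketch, assuming a POSIX system; the function name `hup_then_cont` is made up for illustration. A child stops itself with SIGSTOP, the parent sends SIGHUP (which stays pending), and only after SIGCONT does the child die from the SIGHUP.

```python
import os
import signal
import time


def hup_then_cont():
    """Return (alive_while_stopped, terminating_signal) for a stopped child."""
    pid = os.fork()
    if pid == 0:
        # Child: make sure SIGHUP has its default (terminating) action,
        # then stop itself, much like a suspended job.
        try:
            signal.signal(signal.SIGHUP, signal.SIG_DFL)
            os.kill(os.getpid(), signal.SIGSTOP)
            time.sleep(60)  # only reached if SIGHUP never arrives
        finally:
            os._exit(0)

    # Parent: wait until the child is reported as stopped.
    os.waitpid(pid, os.WUNTRACED)

    # SIGHUP alone stays pending; the stopped child cannot act on it.
    os.kill(pid, signal.SIGHUP)
    time.sleep(0.2)
    reaped, _ = os.waitpid(pid, os.WNOHANG)
    alive_while_stopped = (reaped == 0)

    # SIGCONT resumes the child, and the pending SIGHUP is then delivered.
    os.kill(pid, signal.SIGCONT)
    _, status = os.waitpid(pid, 0)
    sig = os.WTERMSIG(status) if os.WIFSIGNALED(status) else None
    return alive_while_stopped, sig
```

On a typical Linux system the child is still alive after the SIGHUP and is reported as killed by SIGHUP once it has been continued.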
I have a Java process that is being shut down by a signal. It must be one of SIGTERM, SIGINT, or SIGHUP, since the shutdown hook is running.
I can't figure out why we are getting the signal. The process runs on Ubuntu and I can't find anything in dmesg to indicate the OS sent the signal.
Is there anywhere else that these messages would go? Are there any tools I can attach to the PID to get information about the signal?
Thanks in advance
So I found the OOM message in the syslog. I was expecting it to be in dmesg. My mistake.
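For a process whose code you control, one way to find out who sent a catchable signal is to ask the kernel for the sender's PID. This is only a sketch, assuming a single-threaded POSIX process and Python 3.3 or later; the function name and the use of SIGUSR1 are illustrative choices, and signals that cannot be blocked, such as SIGKILL, cannot be observed this way.

```python
import os
import signal


def wait_for_signal_from_child(signum=signal.SIGUSR1):
    """Return (si_signo, si_pid, child_pid) for a signal sent by a forked child."""
    # Block the signal so it is queued for sigwaitinfo() instead of
    # triggering its default action.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signum})
    try:
        child = os.fork()
        if child == 0:
            # The child plays the role of the unknown sender.
            try:
                os.kill(os.getppid(), signum)
            finally:
                os._exit(0)
        info = signal.sigwaitinfo({signum})
        os.waitpid(child, 0)
        return info.si_signo, info.si_pid, child
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
```

The `si_pid` field of the returned structure is the PID of the sending process, which can then be looked up with the usual process tools while the sender is still running.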
I have the following question: can I use a signal handler for SIGCHLD and at specific places use waitpid(3) instead?
Here is my scenario: I start a daemon process that listens on a socket (at this point it's irrelevant if it's a TCP or a UNIX socket). Each time a client connects, the daemon forks a child to handle the request and the parent process keeps on accepting incoming connections. The child handling the request needs at some point to execute a command on the server; let's assume in our example that it needs to perform a copy like this:
cp -a /src/folder /dst/folder
In order to do so, the child forks a new process that uses execl(3) (or execve(2), etc.) to execute the copy command.
In order to control my code better, I would ideally wish to catch the exit status of the child executing the copy with waitpid(3). Moreover, since my daemon process is forking children to handle requests, I need to have a signal handler for SIGCHLD so as to prevent zombie processes from being created.
In my code, I set up a signal handler for SIGCHLD using signal(3), I daemonize my program by forking twice, then I listen on my socket for incoming connections. I fork a process to handle each incoming request, and my child process forks a grandchild process to perform the copy, trying to catch its exit status via waitpid(3).
What happens is that SIGCHLD is caught by my handler when a grand-child-process dies, before waitpid(3) takes action and waitpid(3) returns -1 even though the grand-child-process exits with success.
My first thought was to add:
signal(SIGCHLD, SIG_DFL);
just before forking the child process to handle my connecting clients, without any success. Using SIG_IGN didn't work either.
Is there a suggestion on how to make my scenario work?
Thank you all for your help in advance!
PS. If you need code, I'll post it, but due to its size I decided to do so only if necessary.
PS2. My intention is to use my code in FreeBSD, but my checks are performed in Linux.
EDIT [SOLVED]:
The problem I was facing is solved. The "unexpected" behaviour was caused by my waitpid(3) handling code which was buggy at some point.
Hence, the above method can indeed be used to allow for signal(3) and waitpid(3) coexistence in daemon-like programs.
Thanks for your help, and I hope that this method helps someone wishing to accomplish such a thing!
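The arrangement described above can be sketched roughly as follows. This is not the poster's code (which was never shown) but a minimal illustration under stated assumptions: the names `reap_children`, `run_command`, and `handle_one_request` are hypothetical, the socket handling is left out, and the command is passed in as an argument list. The daemon reaps its workers in a SIGCHLD handler, while each worker restores the default SIGCHLD action before forking so that its own waitpid call gets the grandchild's status.

```python
import os
import signal
import sys
import time

reaped = {}  # pid -> raw wait status, filled in by the SIGCHLD handler


def reap_children(signum, frame):
    """Daemon-side SIGCHLD handler: collect every child that has exited."""
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left
        if pid == 0:
            return  # children exist, but none has exited yet
        reaped[pid] = status


def run_command(argv):
    """Worker side: run argv in a grandchild and return its exit code."""
    # Restore the default action so the inherited handler cannot reap
    # the grandchild before the explicit waitpid() below.
    signal.signal(signal.SIGCHLD, signal.SIG_DFL)
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return 128 + os.WTERMSIG(status)


def handle_one_request(argv):
    """Daemon side: fork one worker and return the exit code it reports."""
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        worker = os.fork()
        if worker == 0:
            code = 1
            try:
                code = run_command(argv)
            finally:
                os._exit(code)
        # The handler, not this loop, reaps the worker.
        while worker not in reaped:
            time.sleep(0.01)
        return os.WEXITSTATUS(reaped.pop(worker))
    finally:
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    # Stand-in for the copy command: a grandchild that exits with status 7.
    print(handle_one_request([sys.executable, "-c", "import sys; sys.exit(7)"]))
```

The worker simply passes the grandchild's exit code on as its own, which makes it easy to check that the explicit waitpid saw the real status rather than failing with "no child processes".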
What's the difference between the SIGINT signal and the SIGTERM signal? I know that SIGINT is equivalent to pressing Ctrl+C on the keyboard, but what is SIGTERM for? If I wanted to stop some background process gracefully, which of these should I use?
Any difference in how the two are handled is up to the developer. If the developer wants the application to respond to SIGTERM differently than to SIGINT, then different handlers will be registered. If you want to stop a background process gracefully, you would typically send SIGTERM. If you are developing an application, you should respond to SIGTERM by exiting gracefully.
SIGINT is often handled the same way, but not always. For example, it is often convenient to respond to SIGINT by reporting status or partial computation. This makes it easy for the user running the application on a terminal to get partial results, but slightly more difficult to terminate the program, since it generally requires the user to open another shell and send a SIGTERM via kill.
In other words, it depends on the application. The default action for both signals is termination, the convention is to respond to SIGTERM by shutting down gracefully, and most applications respond to SIGINT by stopping gracefully as well.
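As an illustration of registering different handlers, here is a small sketch in which SIGINT reports status and SIGTERM requests a graceful shutdown. The handler names and the `events` list are made up for the example, and it assumes the code runs in the main thread of a POSIX process.

```python
import os
import signal
import time

events = []  # what the (hypothetical) application decided to do


def on_sigint(signum, frame):
    events.append("report status")


def on_sigterm(signum, frame):
    events.append("shut down gracefully")


def demo():
    """Install both handlers, signal ourselves, and return the recorded events."""
    events.clear()
    old_int = signal.signal(signal.SIGINT, on_sigint)
    old_term = signal.signal(signal.SIGTERM, on_sigterm)
    try:
        os.kill(os.getpid(), signal.SIGINT)
        os.kill(os.getpid(), signal.SIGTERM)
        # Give the interpreter a moment to run both Python-level handlers.
        deadline = time.monotonic() + 2
        while len(events) < 2 and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.signal(signal.SIGINT, old_int)
        signal.signal(signal.SIGTERM, old_term)
    return list(events)
```

A real application would set a flag or start its shutdown path in the SIGTERM handler instead of appending to a list.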
If I wanted to stop some background process gracefully, which of these should I use?
The Unix list of signals dates back to the time when computers had serial terminals and modems, which is where the concept of a controlling terminal originates. When a modem drops the carrier, the line is hung up.
SIGHUP(1) therefore would indicate a loss of connection, forcing programs to exit or restart. For daemons like syslogd and sshd, processes without a terminal connection that are supposed to keep running, SIGHUP is typically the signal used to restart or reset.
SIGINT(2) and SIGQUIT(3) are literally "interrupt" or "quit" - "from keyboard" - giving the user immediate control if a program goes haywire. With a physical character based terminal this would be the only way to stop a program!
SIGTERM(15) is not related to any terminal handling: the terminal driver never generates it, so it has to be sent explicitly by a process, for example with kill. This would be the conventional signal to send to a background process.
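The numbers in parentheses are the signal numbers, which can be checked on the system at hand. A tiny sketch, assuming a POSIX platform; the helper name `signal_numbers` is made up:

```python
import signal


def signal_numbers(names=("SIGHUP", "SIGINT", "SIGQUIT", "SIGTERM")):
    """Map each signal name to its number on the current platform."""
    return {name: int(getattr(signal, name)) for name in names}


if __name__ == "__main__":
    for name, number in signal_numbers().items():
        print(name, number)
```

Sending the conventional termination request to a background process with a known PID is then `os.kill(pid, signal.SIGTERM)`, which is what the `kill` command does by default.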
SIGINT is a program interrupt signal, which is sent when the user presses Ctrl+C.
SIGTERM is a termination signal. It is sent to a process to request that it terminate, but it can be caught or ignored by that process.
I have a bash script in which I kill a running process by sending SIGTERM to its process ID. However, I want to know the return code of the process I just signaled.
Is that possible?
I cannot use 'wait' because the process to kill was not started from my script, and I'm receiving
"pid ##### is not a child of this shell"
I did some tests on the command line: in the console where the process was running, after I sent SIGTERM (from another console), I checked the exit code and it was 143.
I want to kill the process from a different script and catch that number.
As shellter said, you cannot get the exit code of a process except using wait (or waitpid(), etc...) and you can only do that if you are its parent.
But even if you could, think about this:
When you send a process a SIGTERM, only one of three things can happen:
The process has not installed any signal handler for SIGTERM. In this case it dies immediately as a result of the signal. But in this case the exit code is uninteresting – you already know what it is. On most platforms it is 143 (128 + integer value of SIGTERM), indicating, unsurprisingly, that the process has died as a result of SIGTERM.
The process has configured SIGTERM to be ignored. In this case, nothing happens, the process does not die, and so there is no exit code to obtain anyway.
The process has installed a signal handler for SIGTERM. In this case, the handler is invoked. The handler might do anything at all: possibly nothing, possibly exit immediately, possibly carry out some cleanup operation and exit later, possibly something completely different. Even if the process does exit, that's only an indirect result of the signal, and it happens at a later time, so there is no exit code to obtain that comes directly from the delivery of the signal.
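The first and third cases can be reproduced when the signalling code is also the parent of the process, which is the only situation in which the status is available. The following is a sketch under that assumption; the two child programs and the helper names are hypothetical. Python reports death by a signal as a negative return code, and `shell_style_status` converts it to the 128 + N convention that shells use.

```python
import signal
import subprocess
import sys

# Hypothetical child programs; both announce when they are ready.
DEFAULT_CHILD = "import time; print('ready', flush=True); time.sleep(60)"
HANDLER_CHILD = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGTERM, lambda signum, frame: sys.exit(3))\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def terminate_and_wait(source):
    """Start a child, send it SIGTERM once it is ready, return its return code."""
    proc = subprocess.Popen([sys.executable, "-c", source],
                            stdout=subprocess.PIPE)
    with proc:
        proc.stdout.readline()            # wait for the 'ready' line
        proc.send_signal(signal.SIGTERM)
        proc.wait()
    return proc.returncode


def shell_style_status(returncode):
    """Convert Python's -N (killed by signal N) to the shell's 128 + N."""
    return 128 - returncode if returncode < 0 else returncode
```

With no handler installed the status is determined by the signal alone, while with a handler the exit code is whatever the handler chooses, here 3.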
Apparently, mpirun uses a SIGINT handler which "forwards" the SIGINT signal to each of the processes it spawned.
This means you can write an interrupt handler for your mpi-enabled code, execute mpirun -np 3 my-mpi-enabled-executable and then SIGINT will be raised for each of the three processes. Shortly after that, mpirun exits. This works fine when you have a small custom handler which only prints an error message and then exits. However, when your custom interrupt handler is doing a non-trivial job (e.g. doing serious computations or persisting data), the handler does not run to completion. I'm assuming this is because mpirun decided to exit too soon.
Here's the stderr upon pressing ctrl-c (i.e. causing SIGINT) after executing my-mpi-enabled-executable. This is the desirable expected behavior:
interrupted by signal 2.
running viterbi... done.
persisting parameters... done.
the master process will now exit.
Here's the stderr upon pressing ctrl-c after executing mpirun -np 1 my-mpi-enabled-executable. This is the problematic behavior:
interrupted by signal 2.
running viterbi... mpirun: killing job...
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 8970 on node pharaoh exited on signal 0 (Unknown signal 0).
--------------------------------------------------------------------------
mpirun: clean termination accomplished
Answering any of the following questions will solve my problem:
How to override the mpirun SIGINT handler (if at all possible)?
How to avoid the termination of the processes mpirun spawned right after mpirun terminates?
Is there another signal which mpirun may be sending to the children processes before mpirun terminates?
Is there a way to "capture" the so-called "signal 0 (Unknown signal 0)" (see the second stderr above)?
I'm running openmpi-1.6.3 on Linux.
As per the OpenMPI manpage, you can send SIGUSR1 or SIGUSR2 to mpirun, which will forward it and not shut itself down.
When having the same issue, I came across this question and the answer by @Zulan.
In particular I wanted to catch a SIGINT (Ctrl+C) from the user, do some stuff and then exit in an orderly fashion. Thus, using SIGUSR1 was not an option. Reading the man page that @Zulan linked, however, shows that mpirun (at least the OpenMPI version) catches a SIGINT and then sends a SIGTERM signal to the child processes. Thus, catching SIGTERM in my code allowed me to call the proper exit routines.
Note that signal handling is not safe with MPI, as noted here.
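Given that caveat, one cautious pattern is to keep the handler trivial and do the real work in the main loop. The sketch below contains no MPI calls at all; `run`, `request_stop`, and the `step` callback are hypothetical names, and the "persisting parameters" entry merely stands in for the cleanup mentioned in the question. The handler only sets a flag for both SIGINT and SIGTERM, and the loop notices the flag between steps.

```python
import os
import signal

stop_requested = False


def request_stop(signum, frame):
    """Handler for SIGINT and SIGTERM: only record that a stop was requested."""
    global stop_requested
    stop_requested = True


def run(max_steps, step):
    """Call step(i) until done or a stop is requested; return a log of actions."""
    global stop_requested
    stop_requested = False
    watched = (signal.SIGINT, signal.SIGTERM)
    old = {s: signal.signal(s, request_stop) for s in watched}
    log = []
    try:
        for i in range(max_steps):
            if stop_requested:
                # Cleanup happens here, outside the signal handler.
                log.append("persisting parameters")
                break
            step(i)
            log.append("step %d" % i)
    finally:
        for s, handler in old.items():
            signal.signal(s, handler)
    return log
```

For example, a step function that sends SIGTERM to its own process on the third step lets that step finish and then triggers the cleanup branch on the next iteration.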