What is a signal in Unix? - unix

This comment confuses me: "kill -l generally lists all signals". I thought that a signal means a quantized amount of energy.
[Added] Please clarify the relationship between the (computational) signal in Unix and the physical signal. Are they totally different concepts?
[Added] Are there major differences between paradigms? Is the meaning the same in languages such as C, Python and Haskell? The signal seems to be a general term.

I am surprised that nobody has compared hardware with software here, or stressed the role of the OS.
Comparison between a signal and an interrupt:
The difference is that while interrupts are sent to the operating system by the hardware, signals are sent to the process by the operating system, or by other processes. Note that signals have nothing to do with software interrupts, which are still sent by the hardware (the CPU itself, in this case). (source)
Definitions
process = a program in execution, according to the book below
Further reading
compare the signal to Interrupts and Exceptions
Tanenbaum's book Modern Operating Systems

The manual refers to a very basic mechanism that allows processes or the operating system to notify other processes by sending them a signal. To name two examples: a process is aborted with SIGABRT (usually raised by its own call to abort()), and the operating system reports a segmentation fault (often caused by dereferencing a null pointer) with SIGSEGV.
Some unix servers use signals so the administrator can use kill to send them a signal, causing them to re-read their configuration file, without requiring them to restart.
Each signal has a default action: some terminate the process and others are simply ignored. For example, on receipt of SIGSEGV the program terminates, whereas SIGCHLD, which means a child process has terminated or stopped, is ignored by default.
There is an ANSI C standard function called signal (see man signal) that installs a signal handler, a function that runs when the signal is received. That function behaves differently across Unix variants, so its use is discouraged. Its man page refers to the sigaction function (see man sigaction), which behaves consistently and is also more powerful.
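To make the idea of a handler concrete, here is a minimal sketch in Python, whose signal module wraps the same POSIX mechanism. It assumes a POSIX system and a single-threaded script; the names received and on_usr1 are made up for illustration.

```python
import os
import signal

received = []  # signal numbers seen by the handler (illustrative name)

def on_usr1(signum, frame):
    # Called by the interpreter in the main thread once SIGUSR1 is delivered.
    received.append(signum)

# Install the handler for SIGUSR1.
signal.signal(signal.SIGUSR1, on_usr1)

# Send SIGUSR1 to ourselves; another process would use the same call with our PID.
os.kill(os.getpid(), signal.SIGUSR1)

print("handler ran:", received == [signal.SIGUSR1])

# Put the default disposition back so later code is unaffected.
signal.signal(signal.SIGUSR1, signal.SIG_DFL)
```

In C the equivalent installation step would be a call to sigaction, as recommended above.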

A physical signal and a Unix signal are indeed different concepts. When a Unix signal is sent from one process to another, there is no specific corresponding physical signal. Unix signals are merely an abstraction so programmers can talk about processes communicating with one another.
Unix signals could have been called messages, events, notifications, or even a made-up term like "frobs". The designers just chose the name "signal", and it stuck.

A signal is a message, either to the target process, or to the OS about the target process. It is part of the unix API (and is defined in various POSIX standards).
Read man kill, man signal, and man sigaction.
Other SO questions that might be helpful:
What is the difference between sigaction and signal?

Some points from my notes:
Allows asynchronous communication
    Between processes belonging to the same user
    From the system to any process
    From the system manager to any process
All associated information is in the signal itself
Many different signals
    SIGINT
        From the system to all processes associated with a terminal
        Trigger: ^C pressed
        Usual way to stop a running process
    SIGFPE
        From the kernel to a single process
        Trigger: error in floating point operation
    SIGKILL
        To a single process
        Terminates the destination process
    SIGALRM
        From the kernel to a single process
        Trigger: timer expiration
    SIGTERM
        To a single process
        Recommends the process to terminate gracefully
    SIGUSR1, SIGUSR2
        From any process to any other
        Without a predefined semantic
        Freely usable by programmers
Sending a signal to another process
    int kill(pid_t pid, int sig)
The programmer can decide what to do when a signal is received
    Use the default behavior
    Ignore it
    Execute a user function
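Detecting an interrupted write
    /* needs <errno.h>, <stdio.h> and <unistd.h> */
    if (write(fd, buff, SIZE) < 0) {
        switch (errno) {
        case EINTR:  /* a signal handler ran before any data was written */
            fprintf(stderr, "Interrupted write\n");
            break;
        }
    }…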
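The three choices listed in the notes (default behavior, ignore, user function) can be sketched like this in Python on a POSIX system. The helper send_to_self and the list calls are illustrative names, not part of any API.

```python
import os
import signal

def send_to_self(sig):
    # kill() with our own PID; the same call works for any PID we may signal.
    os.kill(os.getpid(), sig)

# 1. Ignore it: the signal is discarded and nothing happens.
signal.signal(signal.SIGUSR2, signal.SIG_IGN)
send_to_self(signal.SIGUSR2)
survived_ignore = True  # reached only because SIGUSR2 was ignored

# 2. Execute a user function.
calls = []
signal.signal(signal.SIGUSR2, lambda signum, frame: calls.append(signum))
send_to_self(signal.SIGUSR2)

# 3. Use the default behavior again (for SIGUSR2 that is "terminate the
#    process", so this sketch does not send the signal a third time).
signal.signal(signal.SIGUSR2, signal.SIG_DFL)
print(calls, signal.getsignal(signal.SIGUSR2))
```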

A signal is a message which can be sent to a running process.
For example, to tell the Internet Daemon (inetd) to re-read its configuration file, it should be sent a SIGHUP signal.
If the current process ID (PID) of inetd is 1234, you would type:
kill -SIGHUP 1234
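On the receiving side, a daemon typically reacts to SIGHUP by setting a flag in the handler and reloading in its main loop. The following is only a sketch of that pattern in Python, not how inetd itself is written; config, load_config and poll_once are hypothetical names.

```python
import os
import signal

config = {"version": 0}   # stand-in for the parsed configuration
reload_requested = False

def on_hup(signum, frame):
    # Keep the handler tiny: only record that a reload was requested.
    global reload_requested
    reload_requested = True

def load_config():
    # Hypothetical loader; a real daemon would re-read its file here.
    config["version"] += 1

def poll_once():
    """One iteration of the daemon's main loop."""
    global reload_requested
    if reload_requested:
        reload_requested = False
        load_config()

signal.signal(signal.SIGHUP, on_hup)
load_config()                          # initial load at startup
os.kill(os.getpid(), signal.SIGHUP)    # what `kill -SIGHUP <pid>` does
poll_once()                            # the loop notices the flag and reloads
print("config loaded", config["version"], "times")

signal.signal(signal.SIGHUP, signal.SIG_DFL)
```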

A signal is "an event, message, or data structure transmitted between computational processes" (from Wikipedia).

In this case signal means 'message'. So it's sending a message to a process which can tell the process to do various things.

A Unix signal is a kind of message that can be sent to Unix processes. Signals can, for example, force a process to quit (SIGKILL), report that the process made an invalid memory reference (SIGSEGV), or tell it that the user pressed Ctrl-C (SIGINT).
From a *nix command line, type in:
man signal
That will show you all the signals available (on Linux, the list is in man 7 signal).
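The same list can be produced programmatically. A small Python sketch, assuming Python 3.5 or later where signal.Signals is an enum of the signals defined on the current platform:

```python
import signal

# Print number and name for every signal the platform defines,
# roughly what `kill -l` shows.
for sig in sorted(signal.Signals, key=lambda s: s.value):
    print(f"{sig.value:2d} {sig.name}")

names = {sig.name for sig in signal.Signals}
```

The exact set and most of the numbers vary between systems, so the output is deliberately not reproduced here.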

A signal is basically an interrupt that tells a process that a particular event has happened.
Signals are generally sent by the kernel, but a process can also send a signal to another process (subject to permissions) with the kill and killall commands, and a process can send a signal to itself with raise.
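Both points, the permission check and raise, can be tried from Python on a POSIX system. This is a sketch: can_signal is a made-up helper built on the documented behavior that signal number 0 performs only the existence and permission checks, and signal.raise_signal needs Python 3.8 or later.

```python
import os
import signal

def can_signal(pid):
    """Return True if this process is allowed to signal `pid`."""
    try:
        os.kill(pid, 0)          # signal 0: error checking only, nothing is sent
    except ProcessLookupError:   # no such process
        return False
    except PermissionError:      # process exists, but we may not signal it
        return False
    return True

seen = []
signal.signal(signal.SIGUSR1, lambda signum, frame: seen.append(signum))

# Like raise() in C: send a signal to the calling process itself.
signal.raise_signal(signal.SIGUSR1)

print(can_signal(os.getpid()), seen)
signal.signal(signal.SIGUSR1, signal.SIG_DFL)
```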
Major use of signal:
To handle the interrupt.
Process synchronization.

A signal is an interrupt used to notify a process that a particular event has happened.
A signal can be sent by the kernel to a running process, or by one process to another.
In bash, the kill and killall commands are used to send signals.

Related

Signals and Interrupts When Executing a Program and Killing it

I want to better understand the signal and interrupt mechanisms in the UNIX OS. As far as I understand it, interrupts are used to communicate between the CPU and the OS kernel. Signals are used to communicate between the OS kernel and OS processes.
I'm having a hard time understanding what happens in certain scenarios, and finding out which signals and interrupts are raised and when.
For example, when executing a program and killing it using kill pid: which interrupts are triggered when typing the name of the program in the shell (e.g. pluma and then kill pluma_id)?
I've tried using strace on the kill command. The first call that is executed is: execve("/bin/kill", ["kill", "10057"], [/* 47 vars */]) = 0
As far as I can see, this is a standard syscall, but I cannot tell which interrupts were triggered and which signals were sent when the key-down events happened. I also cannot tell which signals were sent to the process when it was killed using the kill syscall (maybe none was sent at all?).
What is the full sequence of events (signals, syscalls and interrupts) that happens in the following scenario:
Typing pluma in the shell
Hitting the enter key and executing pluma
Executing kill pluma_id
(Concise description is more than enough, just to understand the general flow)
Typing pluma in the shell
Keyboard interrupts occur. The CPU receives the keyboard interrupts, executes the handler, reads the scan code and key code, etc. An event is generated in /dev/input/event*, which is either read by a terminal emulator program or forwarded to the program by your input system. Your desktop environment, X server, etc. are involved.
Hitting the enter key and executing pluma
Same as above. Upon receiving the enter key, the shell would fork() and exec() pluma.
Executing kill pluma_id
The kill program started by the shell makes the kill() system call. My manual for kill says "The default signal for kill is TERM. Use -l or -L to list available signals". The system call switches the CPU into kernel mode. After verifying the permissions, the kernel finds the process table entry for the specified process ID and marks the signal as pending for pluma in that entry.
The signal has now been sent; the process still needs to handle it. If it has installed a signal handler for that signal, the handler gets called. Otherwise the kernel takes the default action. In Unix systems, a process's pending signals are usually handled when it returns from kernel mode to user mode, for example after a system call or interrupt, or when it is scheduled again.
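The gap between a signal being recorded by the kernel and being handled by the process can be observed by blocking the signal for a moment. A minimal Python sketch for a POSIX system, assuming a single-threaded script; delivered is an illustrative name.

```python
import os
import signal

delivered = []
signal.signal(signal.SIGUSR1, lambda signum, frame: delivered.append(signum))

# Block SIGUSR1: the kernel records it as pending instead of delivering it.
signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
os.kill(os.getpid(), signal.SIGUSR1)

pending_while_blocked = signal.SIGUSR1 in signal.sigpending()
handled_while_blocked = bool(delivered)

# Unblock: the pending signal is delivered and the handler finally runs.
signal.pthread_sigmask(signal.SIG_UNBLOCK, {signal.SIGUSR1})

print(pending_while_blocked, handled_while_blocked, delivered)
signal.signal(signal.SIGUSR1, signal.SIG_DFL)
```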
The Design of the UNIX Operating System by Maurice J. Bach has a very simple and detailed explanation of the whole process. You might want to have a look at it.
Underneath, kill (the program) uses the kill() system call, and this system call always takes a signal number as an argument.
The kill command simply sends a default signal, TERM, when none is specified. What you are looking at in the strace output is the program invocation. Look further down the trace for the place where the system call is made; there you will see the numerical value of the signal.
You should take a look at the kill program documentation I think. It mentions which signal is sent to the process by default, if you don't specify the signal explicitly. It also shows you how to send a specific signal, if you want to.
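To see which signal ended a process without reading an strace log, the exit status can be inspected. A sketch in Python for a POSIX system; the sleeping child is only a stand-in for pluma.

```python
import os
import signal
import subprocess
import sys

# A child that would otherwise sleep for a minute (stand-in for pluma).
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# `kill <pid>` without a signal option ends up in kill(pid, SIGTERM).
os.kill(child.pid, signal.SIGTERM)
child.wait(timeout=10)

# subprocess reports "terminated by signal N" as the return code -N.
print("killed by SIGTERM:", child.returncode == -signal.SIGTERM)
```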

Communication between two programs: signals or shared memory?

I need to implement (in Qt) some solution to communicate between two programs running on a Linux machine. One program is Worker, and the second is Watchdog. Basically I need Watchdog to periodically check on Worker and, in case something is wrong (no process, or a hang-up with no answer from Worker), kill Worker (if present) and start it again.
Worker runs as a daemon, so I think starting it from unix /etc/init.d/worker would be appropriate.
I can see two solutions
Unix signals - both of them can send and receive Unix SIGUSR1
Shared memory
Which one to choose?
With signals, both programs will have to know each other's PID, probably by reading it from the filesystem under /var/run, which looks like a drawback.
With shared memory, all I need is a "key" that both programs have hardcoded, so there is no need to read PIDs from the filesystem. Since Watchdog should start first, it can create the shared memory segment, and Worker will only attach to it and maybe update a timestamp value. However, to stop Worker (in case of a hang), Watchdog will still need Worker's PID to send it SIGKILL; maybe it can read it from shared memory? Both concepts are new to me.
So what is the proper way to build a reliable Watchdog, or am I missing something?
best regards
Marek
I think this is the best solution available through Qt:
http://qt-project.org/doc/qt-4.8/qlocalsocket.html
http://qt-project.org/doc/qt-4.8/qlocalserver.html
The QLocalSocket class provides a local socket. On Windows this is a named pipe and on Unix this is a local domain socket.
http://qt-project.org/doc/qt-4.8/ipc-localfortuneserver.html
http://qt-project.org/doc/qt-4.8/ipc-localfortuneclient.html
Hope that helps.
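For illustration only, here is the heartbeat idea as a rough Python sketch rather than Qt code. It swaps the local socket for a plain pipe, and it assumes the Watchdog starts the Worker itself, so the PID is known without a pid file. WORKER, start_worker and watch are made-up names, and the worker deliberately hangs after two heartbeats.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical worker: writes two heartbeats to stdout, then hangs.
WORKER = (
    "import sys, time\n"
    "for _ in range(2):\n"
    "    sys.stdout.write('.'); sys.stdout.flush(); time.sleep(0.1)\n"
    "time.sleep(60)\n"
)

def start_worker():
    return subprocess.Popen([sys.executable, "-c", WORKER], stdout=subprocess.PIPE)

def watch(worker, timeout=3.0):
    """Return the number of heartbeats seen before the worker went silent."""
    beats = 0
    while True:
        ready, _, _ = select.select([worker.stdout], [], [], timeout)
        if not ready:
            return beats              # no heartbeat in time: assume a hang
        data = os.read(worker.stdout.fileno(), 64)
        if not data:
            return beats              # end of file: the worker exited
        beats += len(data)

worker = start_worker()
beats = watch(worker)
worker.send_signal(signal.SIGKILL)    # kill the hung worker ...
worker.wait()
restarted = start_worker()            # ... and start a fresh one
print("heartbeats before hang:", beats)

restarted.kill()                      # tidy up the demo process
restarted.wait()
```

A real watchdog would loop instead of restarting once, and the timeout would have to match the worker's heartbeat interval.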

Kill an mpi process

I would like to know whether there is a way for an MPI process to send a kill signal to another MPI process.
Or, put differently, is there a way to exit from an MPI environment gracefully when one of the processes is still active (i.e. mpi_abort() prints an error message)?
Thanks
No, this is not possible within an MPI application using the MPI library.
Individual processes would not be aware of the location of the other processes, nor of the process IDs of the other processes - and there is nothing in the MPI spec to make the kill you are wanting.
If you were to do this manually, then you'd need to MPI_Alltoall to exchange process IDs and hostnames across the system, and then you would need to spawn ssh/rsh to visit the required node when you wanted to kill something. All in all, it's not portable, not clean.
MPI_Abort is the right way to do what you are trying to achieve. From the Open MPI manual:
"This routine makes a "best attempt" to abort all tasks in the group of comm." (i.e. MPI_Abort(MPI_COMM_WORLD, -1) is what you need.)
Any output during MPI_Abort would be machine specific - so you may, or may not, receive the error message you mention.

If an interrupt happens, how does the Unix kernel determine which process it's for?

Let's say Unix is executing process A and a higher-priority interrupt occurs. The OS then gets an interrupt number and looks up the routine to call in the IVT.
Now how does the OS know that this interrupt was for process A and not for process B? Process B might have issued a disk read whose result came back while the OS was executing process A.
Thanks
Start with this: http://en.wikipedia.org/wiki/MINIX
Go buy the book and read it; it will really help a lot.
Interrupts aren't "for" processes. They're for devices and handled by device drivers.
The device driver handles the interrupt and updates the state of the device.
If the device driver concludes that an I/O operation is complete, it can then consult its queue of I/O requests to determine which operation completed. The operation is removed from the queue of pending operations.
The process which is waiting for that operation is now ready-to-run and can resume execution.
You are talking about a hardware interrupt and these are not targeted at processes.
If a process A requests a file, the filesystem layer, which already resides in the kernel, will fetch the file from the block device. The block device itself is handled by a driver.
When the interrupt occurs, triggered by the block device, the OS has this interrupt associated with the driver. So the driver is told to handle the interrupt. It will then query which blocks were read and determine which request they belong to.
After the filesystem is told that the requested data is ready, it may further process it. Then, the process leaves blocked state.
In the next round of the scheduler, the scheduler may select to wake up this process. It may also select to wake up another process first.
As you can see, the occurrence of the interrupt is fully disconnected from the operation of the process.
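A toy model may make this clearer. This is a deliberately simplified sketch, not how any real kernel is written: DiskDriver and its methods are invented for illustration, and it assumes requests complete in the order they were submitted.

```python
from collections import deque

class DiskDriver:
    """Toy model: the driver, not the interrupt, knows who is waiting."""

    def __init__(self):
        self.pending = deque()   # I/O requests issued but not yet completed
        self.ready = []          # processes that may be scheduled again

    def submit(self, process, block):
        # The process issues a read and goes to sleep until it completes.
        self.pending.append((process, block))

    def on_interrupt(self):
        # The interrupt only says "the device finished something".
        # The driver looks in its own queue to find out whose request it was.
        process, _block = self.pending.popleft()
        self.ready.append(process)
        return process

driver = DiskDriver()
driver.submit("B", block=42)     # process B issues a disk read and blocks
running = "A"                    # the CPU is executing A when the disk interrupts
woken = driver.on_interrupt()
print(f"running={running}, made ready by the interrupt: {woken}")
```

Whichever process happens to be running when the interrupt arrives plays no part in the lookup.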

Does the Unix kill command ensure that dynamically allocated memory will return properly?

I found a bunch of scripts in the project I have been newly assigned to that are the "shutdown" scripts. They just do some basic searches and run the Unix kill command. Is there any reason they shouldn't shut down the process this way? Does this ensure that dynamically allocated memory will be returned properly? Are there any other negative effects? I've operated under an intuition that this is a last resort way of terminating a process.
The kill command sends a signal to a Unix process. That signal defaults to SIGTERM, which is a polite request for the program to exit.
When a process exits for any reason, the Unix OS does clean up its memory allocations, file handles and other resources. The only resources that do not get cleaned up are those that are supposed to be shared, like the contents of files and of shared memory (like System V IPC).
Many programs do not need to do any special cleanup on exit and use the default SIGTERM behavior, which is to let the OS stop the process.
If a program does need special behavior, it can install a signal handler, and it can then run a function to handle the signal.
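As a sketch of such a handler in Python (POSIX assumed): the process below removes a pid file when it is asked to terminate. The pid file location, on_term and shutdown are hypothetical names chosen for this example.

```python
import os
import signal
import tempfile

# Hypothetical pid file, the kind of resource the OS will not clean up for us.
pid_dir = tempfile.mkdtemp()
pid_file = os.path.join(pid_dir, "worker.pid")
with open(pid_file, "w") as f:
    f.write(str(os.getpid()))

shutting_down = False

def on_term(signum, frame):
    # Keep the handler small: just record the request.
    global shutting_down
    shutting_down = True

def shutdown():
    # Application-level cleanup done before exiting.
    os.remove(pid_file)
    os.rmdir(pid_dir)

signal.signal(signal.SIGTERM, on_term)
os.kill(os.getpid(), signal.SIGTERM)    # stands in for `kill <pid>` from a script

if shutting_down:
    shutdown()                          # a real program would then exit

signal.signal(signal.SIGTERM, signal.SIG_DFL)
```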
Now the SIGKILL signal, which is number 9, is evil, but also necessary. This signal never gets to the process itself; the OS simply stops the process. This should only be used when really, really necessary. It often becomes necessary in multithreaded programs that get into deadlocks or programs that have installed a TERM signal handler, but screwed up during their exit process.
kill is a polite request for the program to end: it sends SIGTERM, and the program can clean up its memory, close its handles and attend to other such niceties.
kill -9 tells the operating system to grab the process by the balls and throw it the hell out of the bar. Obviously it is not concerned with niceties, although it does reclaim all the memory, as it's the operating system's responsibility to keep track of that. But because it's a forceful shutdown you may have problems when trying to run the program again (.pid files not being cleaned up, for example).
See also Wikipedia: http://en.wikipedia.org/wiki/Kill_(Unix)
Each process runs in its own protected address space, and when the process ends (whether it exits voluntarily or is killed by an external signal) that address space is fully reclaimed. So yes, all of its memory is released properly.
Depending on the process, it may or may not cause other problems next time you try to run it. For example, it may have some files open and leave them in an inconsistent state if it's killed unexpectedly. (The files will be closed automatically, but it could be in the middle of writing some application data, for example, and the files may contain incomplete/inconsistent data if interrupted.)
Typically when the system is shutting down, all processes will be sent signal 15 (SIGTERM), at which they can perform whatever cleanup/shutdown actions they need to do. Then a short time later, they'll get signal 9 (SIGKILL), which immediately kills them, without giving them any chance to react in any way. This gives all processes a chance to clean up for themselves, and then forcefully kills any processes that aren't responding promptly.
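A shutdown script can follow the same two-step pattern. Below is a minimal Python sketch for a POSIX system; stop is a made-up helper, and the stubborn child that ignores SIGTERM exists only to show the fallback.

```python
import signal
import subprocess
import sys

def stop(proc, grace=1.0):
    """Ask `proc` to exit with SIGTERM; force it with SIGKILL after `grace` seconds."""
    proc.terminate()                     # sends SIGTERM
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()                      # sends SIGKILL, which cannot be caught
        proc.wait()
    return proc.returncode

# A stubborn child (hypothetical) that ignores SIGTERM.
stubborn = subprocess.Popen(
    [sys.executable, "-c",
     "import signal, time; signal.signal(signal.SIGTERM, signal.SIG_IGN); "
     "print('ready', flush=True); time.sleep(60)"],
    stdout=subprocess.PIPE,
)
stubborn.stdout.readline()               # wait until SIGTERM is being ignored
code = stop(stubborn)
print("needed SIGKILL:", code == -signal.SIGKILL)
```

The grace period is an assumption; how long a program needs to clean up depends on the program.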
kill -9 is the last resort, not kill.
Yes memory is reclaimed (this is the OS's responsibility)
The programs can respond to the signal however they want, it's up to the particular program to do "the right thing"
kill by default will send a terminate signal which will allow the process to exit gracefully. If the process does not seem to exit in a timely fashion, some scripts will then fall back on kill -9 which forces an exit, 'ready or not'.
In all cases OS managed things such as dynamic memory will be returned, files closed etc. But application level things may not be tidied up on a -9 kill.
kill merely sends a signal to the process. The process can trap signals (except for signal 9) and run code to perform shutdown. An app's shutdown is supposed to be brief, but it may not be instantaneous.
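That signal 9 cannot be trapped is easy to check. A short Python sketch for a POSIX system; try_to_trap is an illustrative helper, not a library function.

```python
import signal

def try_to_trap(sig):
    """Return True if a handler can be installed for `sig`."""
    try:
        signal.signal(sig, lambda signum, frame: None)
    except OSError:
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    signal.signal(sig, signal.SIG_DFL)   # put the default action back
    return True

print("SIGTERM:", try_to_trap(signal.SIGTERM))
print("SIGKILL:", try_to_trap(signal.SIGKILL))
```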
In any case, once the process exits, the operating system will reclaim dynamically allocated memory, close open file descriptors, and release other resources.
There could be some resources that survive, for example if the app held shared memory or sockets that are also held by other (still living) processes.

Resources