I executed a perl script in background using the following command
nohup perl myPerlSCript.pl >debug_log &
After few minutes I got the status as
[1]+ Stopped
I wasn't expecting it to stop, nor do I know what stopped it. How can I debug this and find out why it stopped? I am actually interested in knowing the unix commands to debug.
There are several ways a process running in the background can be stopped. All of them involve one of these signals:
SIGSTOP
SIGTSTP
SIGTTOU
SIGTTIN
SIGSTOP is severe. It's unblockable, unignorable, unhandlable. It stops the process as surely as SIGKILL would kill it. The others can be handled by the background process to prevent stopping.
A signal was sent by another process using kill(2), or by the process to itself using raise(3) or kill(2)
The process attempted to write to the terminal, and the terminal option tostop is enabled (see output of stty -a). This generates SIGTTOU.
The process attempted to change the terminal modes with tcsetattr(3) or an equivalent ioctl. (These are the same modes shown by stty.) This generates SIGTTOU regardless of the current state of the tostop flag.
The process attempted to read from the terminal. This generates SIGTTIN.
This list is probably very incomplete.
Are you using tcsh by any chance? Tcsh actually comes with a built-in nohup command that I've had lots of problems with before, seeing the exact behavior you're seeing.
Try using /usr/bin/nohup directly if that is the case.
Related
We have a process (written in c++ /managed), which receives network data via tcpip.
After running the process for a while while tracking network load, it seems that network get into freeze state and the process does not getting data, there are other processes in the system that using networking (same nic) which operates normally.
the process gets out of this frozen situation by itself after several minutes.
Any idea what is happening?
Any counter i can track to see if my process reach some limitations ?
It is going to be very difficult to answer specifically,
-- without knowing what exactly is your process/application about,
-- whether it is a network chat application, or a file server/client, or ......
-- without other details about your process how it is implemented, what libraries it uses, if relevant to problem.
Also you haven't mentioned what OS and environment you are running this process under,
there is very little anyone can help . It could be anything, a busy wait loopl in your code, locking problems if its a multi-threaded code,....
Nonetheless , here are some options to check:
If its linux try below commands to debug and monitor the behaviour of the process and see what could be problem-
top
Check top to see ow much resources(CPU, memory) your process is using and if there is anything abnormally high values in CPU usage for it.
pstack
This should stack frames of the process executing at time of the problem.
netstat
Run this with necessary options (tcp/udp) to check what is the stae of the network sockets opened by your process
gcore -s -c
This forces your process to core when the mentioned problem happens, and then analyze that core file using gdb
gdb
and then use command where at gdb prompt to get full back trace of the process (which functions it was executing last and previous function calls.
I have a little program written in Python (version 2.7.3 on Linux) that runs an external command on a number of files in a loop. It does this using subprocess.check_output and it then writes data from the output into a Sqlite3 database. I have 2 problems:
When I hit Ctrl+C to stop the program, all that happens is the executing subprocess is killed and the main program just continues with the next iteration and launches the next subprocess.
If I force-kill it with a kill -9 from another window it does exit but the databse file does not contain any changes.
I have been reading for several hours and trying various things including signal handlers, try/finally and so-on. I have so far been unable to do what ought to be very simple.
So, how can I have a Python program accept a Ctrl+C and cleanly terminate so that its subprocess ends and its sqlite3 database is correctly saved?
(just adding that, if the program is left to run to completion, it does exit cleanly and the database file is updated as expected)
There is a very good discussion/resources on understanding killing child processes here: How does Ctrl-C terminate a child process?
The sql issue(without more info) sounds like you're interrupting the process from committing the data when you kill the process immaturely
As I recall, you can check for Ctrl-C using try/except:
try:
#code here...
except KeyboardInterrupt:
#cleanup code...
I do this all the time during development so that I can cleanly stop if I find something wrong.
When I start a process in background in a terminal and some how if terminal gets closed then we can not interact that process any more. I am not sure but I think process also get killed. Can any one please tell me how can I detach that process from my terminal. So even if I close terminal then I can interact with same process in new terminal ?
I am new to unix so your extra information will help me.
The command you're looking for is disown.
disown <processid>
This is as close as you can get to a nohup. It detaches the process from the current login and allows it to continue running. Thanks David Korn!
http://www2.research.att.com/~gsf/man/man1/disown.html
and I just found reptyr which lets you reparent a disowned process.
https://github.com/nelhage/reptyr
It's already in the packages for ubuntu.
BUT if you haven't started the process yet and you're planning on doing this in the future then the way to go is screen and tmux. I prefer screen.
You might also consider the screen command. It has the "restore my session" functionality. Admittedly I have never used it, and forgot about it.
Starting the process as a daemon, or with nohup might not do everything you want, in terms of re-capturing stdout/stdin.
There's a bunch of examples on the web. On google try, "unix screen command" and "unix screen tutorial":
http://www.thegeekstuff.com/2010/07/screen-command-examples/
GNU Screen: an introduction and beginner's tutorial
First google result for "UNIX demonizing a process":
See the daemon(3) manpage for a short overview. The main thing of daemonizing
is going into the background without quiting or holding anything up. A list of
things a process can do to achieve this:
fork()
setsid()
close/redirect stdin/stdout/stderr to /dev/null, and/or ignore SIGHUP/SIGPIPE.
chdir() to /.
If started as a root process, you also want to do the things you need to be root
for first, and then drop privileges. That is, change effective user to the "daemon"
user or "nobody" with setuid()/setgid(). If you can't drop all privileges and need
root access sometimes, use seteuid() to temporary drop it when not needed.
If you're forking a daemon then also setup child handlers and, if calling exec,
set the close on exec flags on all file descriptors your children won't need.
And here's a HOWTO on creating Unix daemons: http://www.netzmafia.de/skripten/unix/linux-daemon-howto.html
'Interact with' can mean a couple of things.
The reason why a program, started at the command-line, exits when the terminal ends, is because the shell, when it exits, sends that process a HUP signal (see documentation for kill(1) for some introduction; HUP, by the way, is short for 'hang up', and originally indicated that the user had hung up the modem/telephone). The default response to a HUP signal is that a process is terminated – that is, the invoked program exits.
The details are slightly more fiddly, but this is the general intuition.
The nohup command tells the shell to start the program, and to do so in a way that this HUP signal is ignored. That is, the program keeps going after the invoking terminal exits.
You can still interact with this program by sending it signals (see kill(1) again), but this is a very limited sort of interaction, and depends on your program being written to do sensible things when it receives those signals (signals USR1 and USR2 are useful things to trap, if you're into that sort of thing). Alternatively, you can interact via named pipes, or semaphores, or other bits of inter-process communication (IPC). That gets fiddly pretty quickly.
I suspect what you're after, though, is being able to reattach a terminal to the process. That's a rather more complicated process, and applications like screen do suitably complicated things behind the scenes to make that happen.
The nohup thing is a sort of quick-and-dirty daemonisation. The daemon(3) function does the daemonisation 'properly', doing various bits of tidy-up as described in YePhIcK's answer, to comprehensively break the link with the process/terminal that invoked it. You can interact with that daemonised process with the same IPC tools as above, but not straightforwardly with a terminal.
I'm using mac/linux and I know that ctrl-z stops the currently running command in terminal, but I frequently see the process is still running when i check the system monitor. What is the right way to stop a command in terminal?
Typically I run into this issue when running python or ruby apps, i'm not sure if that has something to do with it, just thought I would add that.
Using control-z suspends the process (see the output from stty -a which lists the key stroke under susp). That leaves it running, but in suspended animation (so it is not using any CPU resources). It can be resumed later.
If you want to stop a program permanently, then any of interrupt (often control-c) or quit (often control-\) will stop the process, the latter producing a core dump (unless you've disabled them). You might also use a HUP or TERM signal (or, if really necessary, the KILL signal, but try the other signals first) sent to the process from another terminal; or you could use control-z to suspend the process and then send the death threat from the current terminal, and then bring the (about to die) process back into the foreground (fg).
Note that all key combinations are subject to change via the stty command or equivalents; the defaults may vary from system to system.
if you do ctrl-z and then type exit it will close background applications.
Ctrl+Q is another good way to kill the application.
Take a look at Job Control on UNIX systems
If you don't have control of your shell, simply hitting ctrl + C should stop the process. If that doesn't work, you can try ctrl + Z and using the jobs and kill -9 %<job #> to kill it. The '-9' is a type of signal. You can man kill to see a list of signals.
In a UNIX-y way, I'm trying to start a process, background it, and tie the lifetime of that process to my shell.
What I'm talking about isn't simply backgrounding the process, I want the process to be sent SIGTERM, or for it to have an open file descriptor that is closed, or something when the shell exits, so that the user of the shell doesn't have to explicitly kill the process or get a "you have running jobs" warning.
Ultimately I want a program that can run, uniquely, for each shell and carry state along with that shell, and close when the shell closes.
IBM's DB2 console commands work this way. When you connect to the database, it spawns a "db2bp" process, that carries the database state and connection and ties it to your shell. You can connect in multiple different terminals or ssh connections, each with its own db2bp process, and when those are closed the appropriate db2bp process dies and that connection is closed.
DB2 queries are then started with the db2 command, which simply hands it off to the appropriate db2bp process. I don't know how it communicates with the correct db2bp process, but maybe it uses the tty device connected to stdin as a unique key? I guess I need to figure that out too.
I've never written anything that does tty manipulation, so I have no clue where to even start. I think I can figure the rest out if I can just spawn a process that is automatically killed on shell exit. Anyone know how DB2 does it?
If your shell isn't a subshell, you can do the following; Put the following into a script called "ttywatch":
#!/usr/bin/perl
my $p=open(PI, "-|") || exec #ARGV; sleep 5 while(-t); kill 15,$p;
Then run your program as:
$ ttywatch commandline... & disown
Disowning the process will prevent the shell from complaining that there are running processes, and when the terminal closes, it will cause SIGTERM (15) to be delivered to the subprocess (your app) within 5 seconds.
If the shell isn't a subshell, you can use a program like ttywrap to at least give it its own tty, and then the above trick will work.
Okay, I think I figured it out. I was making it too complicated :)
I think all db2 is daemon-izing db2bp, then db2bp is calling waitpid on the parent PID (the shell's PID) and exiting after waitpid returns.
The communication between the db2 command and db2bp seems to be done via fifo with a filename based on the parent shell PID.
Waaaay simpler than I was thinking :)
For anyone who is curious, this whole endeavor was to be able to tie a python or groovy interactive session to a shell, so I could test code while easily jumping in and out of a session that would retain database connections and temporary classes / variables.
Thank you all for your help!
Your shell should be sending a SIGHUP signal to any running child processes when it shuts down. Have you tried adding a SIGHUP handler to your application to shut it down cleanly
when the shell exits?
Is it possible that your real problem here is the shell and not your process. My understanding agrees with Jim Lewis' that when the shell dies its children should get SIGHUP. But what you're complaining about is the shell (or perhaps the terminal) trying to prevent you from accidentally killing a running shell with active children.
Consider reading the manual for the shell or the terminal to see if this behavior is configurable.
From the bash manual on my MacBook:
The shell exits by default upon receipt of a SIGHUP. Before exiting, an interactive shell resends the SIGHUP
to all jobs, running or stopped. Stopped jobs are sent SIGCONT to ensure that they receive the SIGHUP. To
prevent the shell from sending the signal to a particular job, it should be removed from the jobs table with
the disown builtin (see SHELL BUILTIN COMMANDS below) or marked to not receive SIGHUP using disown -h.
If the huponexit shell option has been set with shopt, bash sends a SIGHUP to all jobs when an interactive
login shell exits.
which might point you in the right direction.