Running Airflow 1.9.0 with python 2.7. How do I gracefully stop a DAG?
In this case, I have a DAG that's running a file upload with bad code that causes everything to take 4 times as long, and I'd really prefer not to have to wait a day for it to finally time out (timeout is set to 10 hours).
The DAG looks for a tar file. When it finds one, it goes through every file in the tar, looking for files to process, and processes them.
I could not find any way to stop the DAG. I tried clicking on the "Running" circle in the "DAG Runs" column (the one to the right). It let me select the process and mark it as "failed". But it didn't stop running.
I tried clicking on the "Running" circle in the "Recent Tasks" column (the one to the left). It let me select processes, but trying to set them to failed (or to success) generated an exception in Airflow.
Browse -> DAG Runs -> Checkbox -> With Selected "Delete"
If you constructed your process the way it sounds like you did, you won't be able to stop it from Airflow. You'll need to find the process identifier that is executing and forcibly terminate it to get it to actually stop.
You can add a Sensor Operator which runs in parallel with your Task which you want killed.
The sensor should monitor the value of some Variable and should complete when it detects a certain value in the Variable, like 'STOP' let's say.
The Sensor should be followed by a BashOperator task which should kill your long-running task using a command like the one below (substitute a pattern that matches your own process for the placeholder):

kill -9 $(ps -ef | grep <your_task_pattern> | grep -v grep | awk '{print $2;}')
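The sensor-plus-kill pattern above can be sketched in plain Python, without Airflow: a watcher polls a stop flag (standing in for the Airflow Variable) and kills the long-running process when the flag flips to 'STOP'. All names here are illustrative, not Airflow API.

```python
import os
import signal
import subprocess
import time

# The long-running "task" (a sleep stands in for the real upload).
child = subprocess.Popen(["sleep", "60"])

# Stands in for Variable.get('stop_signal') in the Airflow scheme.
stop_flag = {"value": "RUN"}

def watch_and_kill(proc, flag, poke_interval=0.1):
    """Poll the flag like a sensor; on 'STOP', kill -9 the process."""
    while proc.poll() is None:
        if flag["value"] == "STOP":
            os.kill(proc.pid, signal.SIGKILL)  # equivalent to kill -9
            proc.wait()
            return True
        time.sleep(poke_interval)
    return False

stop_flag["value"] = "STOP"  # someone sets the Variable to 'STOP'
killed = watch_and_kill(child, stop_flag)
print(killed)  # True: the task was terminated early
```

In real Airflow you would read the flag from `Variable.get()` inside a sensor's `poke` method, but the control flow is the same.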
Related
I am running a python script on PuTTY using & at the end of the command. When I checked ps -p processid, it showed some name under TTY. After some time the internet got disconnected. After connecting back, I checked the process status using ps -p processid, but this time I found '?' under TTY. Does this mean my script broke?
No, it only means that the TTY the script was started from was closed in the meantime.
If you want to know more about the status, look into the STAT column:
It shows the status of the process. S stands for sleeping: the process
is waiting for something to happen. Z stands for a zombied process. A
zombied processes is one whose parent has died, leaving the child
processes behind. This is not a good thing. D stands for a process
that has entered an uninterruptible sleep. Often, these processes
refuse to die even when passed a SIGKILL. You can read more about
SIGKILL later in the next section on kill. W stands for paging. A
dead process is marked with an X. A process marked T is traced, or
stopped. R means that the process is runable.
source: http://www.slackbook.org/html/process-control-ps.html
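You can read the STAT and TTY columns programmatically too. A small sketch, inspecting the current process with ps (the output format options used here are standard POSIX ps flags):

```python
import os
import subprocess

# Ask ps for just the STAT and TTY columns of this process.
# The trailing '=' in '-o stat=,tty=' suppresses the header row.
out = subprocess.check_output(
    ["ps", "-o", "stat=,tty=", "-p", str(os.getpid())]
)
stat, tty = out.decode().split()[:2]
print(stat, tty)  # e.g. 'S+' and 'pts/0' from a terminal; TTY is '?' when detached
```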
The command ls -lu script_name.sh only gives the last access time of the script.
Is there any way to determine.
Processes in Linux do not normally leave traces after they terminate, unless they create or modify files, write syslog messages, or the audit subsystem is on and keeping track of exec* calls.
I have a little program written in Python (version 2.7.3 on Linux) that runs an external command on a number of files in a loop. It does this using subprocess.check_output and it then writes data from the output into a Sqlite3 database. I have 2 problems:
When I hit Ctrl+C to stop the program, all that happens is the executing subprocess is killed and the main program just continues with the next iteration and launches the next subprocess.
If I force-kill it with a kill -9 from another window it does exit, but the database file does not contain any changes.
I have been reading for several hours and trying various things including signal handlers, try/finally and so-on. I have so far been unable to do what ought to be very simple.
So, how can I have a Python program accept a Ctrl+C and cleanly terminate so that its subprocess ends and its sqlite3 database is correctly saved?
(just adding that, if the program is left to run to completion, it does exit cleanly and the database file is updated as expected)
There is a very good discussion/resources on understanding killing child processes here: How does Ctrl-C terminate a child process?
The sql issue (without more info) sounds like you're interrupting the process before it commits the data when you kill it prematurely.
As I recall, you can check for Ctrl-C using try/except:
try:
    # code here...
    pass
except KeyboardInterrupt:
    # cleanup code...
    pass
I do this all the time during development so that I can cleanly stop if I find something wrong.
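Putting the pieces together for the question as asked, here is a hedged sketch of the subprocess-loop-plus-sqlite pattern: commit after each file, and let the KeyboardInterrupt handler fall through to cleanup instead of dying mid-loop. The file list and the echo command are placeholders for the real external command.

```python
import os
import sqlite3
import subprocess
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "results.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE results (fname TEXT, output TEXT)")

files = ["a.txt", "b.txt", "c.txt"]  # placeholder input files
try:
    for fname in files:
        # placeholder for the real external command
        out = subprocess.check_output(["echo", fname])
        conn.execute("INSERT INTO results VALUES (?, ?)",
                     (fname, out.decode().strip()))
        conn.commit()  # commit per file: an abort loses at most one row
except KeyboardInterrupt:
    pass  # swallow Ctrl+C and fall through to the cleanup below
finally:
    conn.commit()
    conn.close()
```

Committing inside the loop is the key difference from the original program: even a kill -9 after iteration N leaves rows 1..N safely on disk, because sqlite3 only persists data at commit time.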
Is there a way, from the command line utility in unix (more specifically linux), to pipe input to a process knowing its PID? For example, I start a Python process in the background, and keep track of the PID. Then, using the PID and the command line, decide to execute "print 'Hello World'", and wish to receive the output to my terminal. Is it possible to do this?
On Linux, you can use the 'jobs' command to get the job number of the program you put into the background.
Then you can use the 'fg' command to bring that program to the foreground.
Say your python program is job 3. Calling 'fg 3' will bring the program to the foreground.
Not sure if this is what you're looking for. If not, it might help to elaborate on your example.
In a UNIX-y way, I'm trying to start a process, background it, and tie the lifetime of that process to my shell.
What I'm talking about isn't simply backgrounding the process, I want the process to be sent SIGTERM, or for it to have an open file descriptor that is closed, or something when the shell exits, so that the user of the shell doesn't have to explicitly kill the process or get a "you have running jobs" warning.
Ultimately I want a program that can run, uniquely, for each shell and carry state along with that shell, and close when the shell closes.
IBM's DB2 console commands work this way. When you connect to the database, it spawns a "db2bp" process, that carries the database state and connection and ties it to your shell. You can connect in multiple different terminals or ssh connections, each with its own db2bp process, and when those are closed the appropriate db2bp process dies and that connection is closed.
DB2 queries are then started with the db2 command, which simply hands it off to the appropriate db2bp process. I don't know how it communicates with the correct db2bp process, but maybe it uses the tty device connected to stdin as a unique key? I guess I need to figure that out too.
I've never written anything that does tty manipulation, so I have no clue where to even start. I think I can figure the rest out if I can just spawn a process that is automatically killed on shell exit. Anyone know how DB2 does it?
If your shell isn't a subshell, you can do the following; Put the following into a script called "ttywatch":
#!/usr/bin/perl
my $p=open(PI, "-|") || exec @ARGV; sleep 5 while(-t); kill 15,$p;
Then run your program as:
$ ttywatch commandline... & disown
Disowning the process will prevent the shell from complaining that there are running processes, and when the terminal closes, it will cause SIGTERM (15) to be delivered to the subprocess (your app) within 5 seconds.
If the shell is a subshell, you can use a program like ttywrap to at least give it its own tty, and then the above trick will work.
Okay, I think I figured it out. I was making it too complicated :)
I think all db2 is daemon-izing db2bp, then db2bp is calling waitpid on the parent PID (the shell's PID) and exiting after waitpid returns.
The communication between the db2 command and db2bp seems to be done via fifo with a filename based on the parent shell PID.
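The fifo scheme described above can be sketched in a few lines: the background process (playing db2bp) reads commands from a fifo whose name includes the parent shell's PID, and the front-end command (playing db2) writes into it. Paths and names here are made up for illustration; a thread stands in for the second process.

```python
import os
import tempfile
import threading

# A fifo keyed by the parent shell's PID, so each shell gets its own.
fifo = os.path.join(tempfile.mkdtemp(), "session.%d" % os.getppid())
os.mkfifo(fifo)

def front_end():
    # Plays the role of the 'db2' command: hand the query off.
    with open(fifo, "w") as f:
        f.write("SELECT * FROM t\n")

t = threading.Thread(target=front_end)
t.start()

# Plays the role of db2bp: block until a command arrives.
with open(fifo) as f:
    command = f.readline().strip()
t.join()
print(command)
```

Opening a fifo for reading blocks until a writer opens it (and vice versa), which is what lets the two sides rendezvous without any extra coordination.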
Waaaay simpler than I was thinking :)
For anyone who is curious, this whole endeavor was to be able to tie a python or groovy interactive session to a shell, so I could test code while easily jumping in and out of a session that would retain database connections and temporary classes / variables.
Thank you all for your help!
Your shell should be sending a SIGHUP signal to any running child processes when it shuts down. Have you tried adding a SIGHUP handler to your application to shut it down cleanly
when the shell exits?
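A minimal SIGHUP handler along those lines, sketched in Python: flip a flag so the main loop can shut down cleanly when the shell that started the process goes away. Here the process sends itself SIGHUP to simulate the shell exiting.

```python
import os
import signal

hangup = {"received": False}

def on_hangup(signum, frame):
    # In a real program you'd set a flag the main loop checks,
    # then close connections and exit.
    hangup["received"] = True

signal.signal(signal.SIGHUP, on_hangup)

# Simulate the shell exiting by delivering SIGHUP to ourselves:
os.kill(os.getpid(), signal.SIGHUP)
print(hangup["received"])  # True
```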
Is it possible that your real problem here is the shell and not your process? My understanding agrees with Jim Lewis' that when the shell dies its children should get SIGHUP. But what you're complaining about is the shell (or perhaps the terminal) trying to prevent you from accidentally killing a running shell with active children.
Consider reading the manual for the shell or the terminal to see if this behavior is configurable.
From the bash manual on my MacBook:
The shell exits by default upon receipt of a SIGHUP. Before exiting, an interactive shell resends the SIGHUP
to all jobs, running or stopped. Stopped jobs are sent SIGCONT to ensure that they receive the SIGHUP. To
prevent the shell from sending the signal to a particular job, it should be removed from the jobs table with
the disown builtin (see SHELL BUILTIN COMMANDS below) or marked to not receive SIGHUP using disown -h.
If the huponexit shell option has been set with shopt, bash sends a SIGHUP to all jobs when an interactive
login shell exits.
which might point you in the right direction.