On my hosting account I run a chat application in Node.js. Everything works fine, but my host times out processes every 12 hours. Apparently a daemonized process will not time out, so I tried to daemonize it in several ways:
I used Forever.js, running forever start chat.js. Running forever list confirms it is running, and ps -ef shows ? in the TTY column.
I tried nohup node chat.js. In ps -ef the TTY column shows pts/0 and the PPID is 1.
I tried to disconnect stdin, stdout, and stderr and make the process ignore the hangup signal (SIGHUP) with nohup ./myscript 0<&- &> my.admin.log.file &, with no luck. In ps -ef the TTY column is pts/0 and the PPID is anything but 1.
I tried (nohup ./myscript 0<&- &>my.admin.log.file &), with no luck again. In ps -ef the TTY column is pts/0 and the PPID is 1.
After all this, the process still times out after about 12 hours.
Now I have tried (nohup ./myscript 0<&- &>my.admin.log.file &) & and am waiting, but I am not getting my hopes up and need someone's help.
The hosting staff claim that daemon processes do not time out, but how can I make sure my process is a daemon? Nothing I have tried seems to work, even though, with my limited understanding, ps -ef seems to suggest the process is daemonized.
What should I do to daemonize the process without moving to a much more expensive hosting plan? Can I argue with the host that this process is in fact a daemon and that they have simply got something wrong?
Upstart is a really easy way to daemonize processes
http://upstart.ubuntu.com/
There's some info on using it with node and monit, which will restart Node for you if it crashes
http://howtonode.org/deploying-node-upstart-monit
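On the question of how to check whether a process is really detached: the TTY column only tells you whether it still has a controlling terminal, and pts/0 means it does. A minimal sketch, assuming a Linux host with util-linux setsid and procps ps; sleep 30 is a stand-in for node chat.js, and pid_file is a hypothetical temporary file. Whether the host's 12-hour limit looks at any of these fields is something only the host can confirm.

```shell
#!/bin/sh
# Sketch: start a command in its own session with no terminal attached,
# then print the fields that show whether it is detached.
# "sleep 30" is a stand-in for: node chat.js
pid_file=$(mktemp)

# The inner sh records its own PID, then exec replaces it with the real
# command, so the recorded PID stays valid even if setsid has to fork.
setsid sh -c 'echo "$$" > "$1"; exec sleep 30' sh "$pid_file" \
    < /dev/null > /dev/null 2>&1 &

# Give the child a moment to start and write its PID.
sleep 1
daemon_pid=$(cat "$pid_file")

# TTY "?" means no controlling terminal; SID equal to PID means the
# process leads its own session.
ps -o pid=,ppid=,sid=,tty=,comm= -p "$daemon_pid"
daemon_tty=$(ps -o tty= -p "$daemon_pid" | tr -d ' ')
daemon_sid=$(ps -o sid= -p "$daemon_pid" | tr -d ' ')
```

If the TTY column still shows pts/0 for the real process, it has kept the login terminal as its controlling terminal, which nohup alone does not change.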
OK, so I have paramiko v2.2.1 and I am trying to log in to a machine and restart a service. The service script starts a process via nohup. However, if I let paramiko disconnect as soon as the command is done, the started process terminates with a PIPE signal when it writes to stdout.
If I start the service by ssh'ing into the box and starting it manually, there is no issue and it runs fine in the background. Also, if I add a long sleep (10 seconds) before disconnecting (closing) paramiko, it also seems to work fine.
The service is started from an init.d script with a line like this:
env LD_LIBRARY_PATH=$bin_path nohup $bin_path/ServerLoop.sh \
"$bin_path/Service service args" "$@" &
Where ServerLoop.sh simply calls the service forever in a loop like this so it will never die:
SERVER=$1
shift
ARGS=$@
logger $ARGS
while [ 1 ]; do
    $SERVER $ARGS
    STATUS=$?
    logger "$SERVER terminated with exit code: $STATUS. Server has been restarted"
    sleep 1
done
I have noticed that when I start the service by ssh'ing into the box, a nohup.out file is written to the root directory. However, when I run it through paramiko, no nohup.out is written anywhere on the system. For example, this is after I manually ssh into the box and start the service:
root@ts4700:/mnt/mc.fw/bin# find / -name "nohup*"
/usr/bin/nohup
/usr/share/man/man1/nohup.1.gz
/nohup.out
And this is after I run through paramiko:
root@ts4700:/mnt/mc.fw/bin# find / -name "nohup*"
/usr/bin/nohup
/usr/share/man/man1/nohup.1.gz
As I understand it, nohup only redirects the output to nohup.out "if standard output is a terminal" (from the manual); otherwise it assumes the output is already going somewhere else and does not redirect it. Hence I tried the following:
In [43]: import paramiko
In [44]: paramiko.__version__
Out[44]: '2.2.1'
In [45]: ssh = paramiko.SSHClient()
In [46]: ssh.set_missing_host_key_policy(AutoAddPolicy())
In [47]: ssh.connect(ip, username='root', password=not_for_so_sorry, look_for_keys=False, allow_agent=False)
In [48]: stdin, stdout, stderr = ssh.exec_command("tty")
In [49]: stdout.read()
Out[49]: 'not a tty\n'
So I am thinking that nohup is not redirecting to nohup.out when I run it through paramiko because tty is not returning a terminal. I don't know why adding a sleep(10) would fix this though as the service if run on the command line is quite verbose.
I have also noticed that if the service is started from a manual ssh its tty in the ps ax output is still set to the ssh tty ... however if the process is started by paramiko its tty in the ps ax output is set to "?" .. since both processes are run through nohup I would have expected this to be the same.
If the problem is indeed that nohup is not redirecting the output to nohup.out because there is no tty, is there a way to force the redirect, or a better way to run this sort of command via paramiko?
Thanks all, any help with this would be great :)
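One thing that may be worth trying, given the nohup behaviour quoted above: when stdout is not a terminal, nohup leaves it attached to whatever it inherited, which here is likely the SSH channel that disappears on disconnect. Redirecting all three streams explicitly avoids depending on nohup.out at all. A minimal sketch, where echo stands in for ServerLoop.sh and LOG_FILE is a hypothetical log path:

```shell
#!/bin/sh
# Sketch: detach a nohup'd command from the inherited stdin/stdout/stderr.
# LOG_FILE is a hypothetical path; "echo" stands in for ServerLoop.sh.
LOG_FILE=$(mktemp)

# stdin comes from /dev/null, stdout and stderr go to a file, so nothing
# is left pointing at the SSH channel once the client disconnects.
nohup echo "service started" > "$LOG_FILE" 2>&1 < /dev/null &
wait $!

cat "$LOG_FILE"

# Applied to the init.d line from the question it might look like this
# (the log path is an assumption):
#   env LD_LIBRARY_PATH=$bin_path nohup $bin_path/ServerLoop.sh \
#       "$bin_path/Service service args" "$@" \
#       > /var/log/serverloop.log 2>&1 < /dev/null &
```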
I use a cron task to regularly run an Rscript. Unfortunately, I need to do this on a small AWS instance, and the process may hang, piling up more and more processes until the whole system lags.
I would like to write a cron task to kill all R processes that have been running longer than one minute. I adapted another Stack Overflow answer into something I think should solve the problem. I came up with:
if [[ "$(uname)" = "Linux" ]];then killall --older-than 1m "/usr/lib/R/bin/exec/R --slave --no-restore --file=/home/ubuntu/script.R";fi
I copied the task directly from htop, but it does not work as I expect. I get the No such file or directory error but I've checked it a few times.
I need to kill all R processes that have lasted longer than a minute. How can I do this?
You may want to avoid killing processes that belong to another user, and to try SIGKILL (kill -9) only after SIGTERM (kill -15). Here is a script you could execute every minute with a cron job:
#!/bin/bash
PROCESS="R"
MAXTIME=`date -d '00:01:00' +'%s'`
function killpids()
{
    PIDS=`pgrep -u "${USER}" -x "${PROCESS}"`
    # Loop over all matching PIDs
    for pid in ${PIDS}; do
        # Retrieve the cumulative CPU time of the process ([DD-]hh:mm:ss)
        TIME=`ps -o time:1= -p "${pid}" |
              egrep -o "[0-9]{0,2}:?[0-9]{0,2}:[0-9]{2}$"`
        # Convert TIME to timestamp
        TTIME=`date -d "${TIME}" +'%s'`
        # Check if the process should be killed
        if [ "${TTIME}" -gt "${MAXTIME}" ]; then
            kill ${1} "${pid}"
        fi
    done
}
# Leave a chance to kill processes properly (SIGTERM)
killpids "-15"
sleep 5
# Now kill remaining processes (SIGKILL)
killpids "-9"
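Note that ps -o time reports accumulated CPU time, so a process that hangs while idle may never cross the one-minute threshold in the script above. A sketch of the same age check based on elapsed wall-clock time instead, assuming procps-ng ps (for the etimes column); pids_older_than is a hypothetical helper name:

```shell
#!/bin/bash
# Print each given PID whose elapsed run time exceeds MAX_AGE seconds.
# Usage: pids_older_than MAX_AGE PID...
pids_older_than() {
    local max_age=$1 pid age
    shift
    for pid in "$@"; do
        # etimes = seconds since the process was started (procps-ng)
        age=$(ps -o etimes= -p "$pid" 2>/dev/null | tr -d ' ')
        if [ -n "$age" ] && [ "$age" -gt "$max_age" ]; then
            echo "$pid"
        fi
    done
}

# Possible cron usage: send SIGTERM to this user's R processes that are
# older than 60 seconds (xargs -r is a GNU extension):
#   pids_older_than 60 $(pgrep -u "$(id -u)" -x R) | xargs -r kill -15
```

The helper only prints PIDs, so the SIGTERM-then-SIGKILL sequence from the script above can be kept by calling it twice with a sleep in between.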
Why involve an additional process every minute with cron?
Would it not be easier to start R with timeout from coreutils? The processes will then be killed automatically after the time you choose.
timeout [option] duration command [arg]…
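A minimal sketch of the timeout approach, assuming GNU coreutils. Here sleep 5 stands in for the hanging Rscript call so the snippet runs anywhere, and the limit is shortened to one second; -k adds a SIGKILL follow-up in case the process ignores SIGTERM.

```shell
#!/bin/sh
# Sketch: let coreutils timeout enforce the limit instead of a cron reaper.
# In the crontab the command might look like this (script path taken from
# the question):
#   timeout -k 5 60 Rscript /home/ubuntu/script.R

# Stand-in demo: "sleep 5" plays the hanging script, the limit is 1 second,
# and SIGKILL follows 1 second later if SIGTERM was not enough.
status=0
timeout -k 1 1 sleep 5 || status=$?

# GNU timeout exits with status 124 when the time limit was reached.
echo "exit status: $status"
```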
I think the best option is to do this with R itself. I am no expert, but it seems the future package will allow executing a function in a separate thread. You could run the actual task in a separate thread, and in the main thread sleep for 60 seconds and then stop().
Previous Update
user1747036's answer which recommends timeout is a better alternative.
My original answer
This question is more appropriate for Super User, but here are a few things wrong with:
if [[ "$(uname)" = "Linux" ]];then
killall --older-than 1m \
"/usr/lib/R/bin/exec/R --slave --no-restore --file=/home/ubuntu/script.R";
fi
The name argument is either the name of the executable image or the path to it. You have included its command-line parameters as well.
If -s signal is not specified, killall sends SIGTERM, which your process may ignore. Are you able to kill a long-running script with this on the command line? You may need SIGKILL (-9).
More at http://linux.die.net/man/1/killall
Below is my command for spawning an FCGI script for nginx.
spawn-fcgi -d /home/ubuntu/workspace -f /home/ubuntu/workspace/index.py -a 127.0.0.1 -p 9001
Now, let's say I want to make changes to the index.py script and reload it without bringing down the system. How do I reload the spawned program so that new connections use the updated program while the existing ones finish? For now I am killing the spawned process and running the command again. I am hoping for something more graceful.
I tried this by the way.
sudo kill -1 `sudo lsof -t -i:9001`
I have recently made something similar for node.js.
The idea is to have index.py as a very simple bootstrap script (which doesn't actually change much over time). It should catch SIGHUP, and reload/reread the application files (which are expected to change frequently).
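The catch-SIGHUP-and-reload idea can be sketched in a few lines. This is shell rather than the Python of index.py, purely to show the signal handling; APP_FILE and handle_request are hypothetical names, and a real bootstrap would re-import its application module instead of sourcing a file.

```shell
#!/bin/sh
# Sketch: a long-lived bootstrap that re-reads its application code on SIGHUP.
# APP_FILE is a hypothetical file that defines handle_request().
APP_FILE=${APP_FILE:-$(mktemp)}

load_app() {
    # Sourcing the file again replaces the old function definitions.
    . "$APP_FILE"
}

# Reload the application code whenever SIGHUP arrives.
trap 'load_app' HUP

# First version of the application.
echo 'handle_request() { echo "v1"; }' > "$APP_FILE"
load_app
before=$(handle_request)

# Deploy a new version, then signal the bootstrap. The child shell sends
# SIGHUP to its parent, which is what `kill -1 <pid>` does from outside.
echo 'handle_request() { echo "v2"; }' > "$APP_FILE"
sh -c 'kill -HUP "$PPID"'
after=$(handle_request)

echo "before reload: $before, after reload: $after"
```

The bootstrap process itself never exits, so the listening socket handed over by spawn-fcgi stays open across reloads.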
I have a shell script that starts an ssh session to a remote host and pipes the output to another, local script, like so:
#!/bin/sh
ssh user@host 'while true ; do get-info ; sleep 1 ; done' | awk -f parse-info.awk
It works fine. I run it under the 'supervise' program from djb's daemontools. The only problem is shutting down the daemon. If I terminate the process for this shell script, the ssh and awk processes continue running as orphans. Normally I would solve this problem with exec to replace the supervising shell process, but the two processes run in their own subshells and can't replace the shell process.
What I would like to do is have the supervising shell script 'forward' any signals it receives to at least one of the child processes, so that I can break the pipe and shut down cleanly. Is there an easy way to do this?
Inter process communications.
You should be looking at pipes, etc.
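For the signal-forwarding part of the question, one option in plain sh is to replace the anonymous pipe with a FIFO and start each command as a background job. Both commands are then direct children with known PIDs, and a trap can pass signals on to them. A minimal sketch under that assumption; sleep 3071 and cat are stand-ins for the ssh and awk commands, and run_pipeline is a hypothetical name:

```shell
#!/bin/sh
# Sketch: run both halves of a pipeline as direct children connected by a
# FIFO, so the script knows both PIDs and can forward signals to them.
run_pipeline() {
    workdir=$(mktemp -d) || return 1
    fifo="$workdir/pipe"
    mkfifo "$fifo" || return 1

    # Stand-in for: ssh user@host '...' > "$fifo" &
    sleep 3071 > "$fifo" &
    producer_pid=$!

    # Stand-in for: awk -f parse-info.awk < "$fifo" &
    cat < "$fifo" > /dev/null &
    consumer_pid=$!

    # On TERM or INT, pass the signal on to both children.
    trap 'kill "$producer_pid" "$consumer_pid" 2>/dev/null || true' TERM INT

    # wait returns early when a trapped signal arrives, so wait a second
    # time to reap the children after the handler has signalled them.
    wait "$producer_pid" "$consumer_pid" || true
    trap - TERM INT
    wait || true

    rm -rf "$workdir"
}
```

In the run script used by supervise, the two stand-ins would be replaced with the real commands and run_pipeline called as the last line.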
In UNIX, I have a utility, say 'Test_Ex', a binary file. How can I write a job or a shell script (as a cron job) that always runs in the background and checks every 5 seconds whether 'Test_Ex' is still running (and, ideally, hide this job)? If it is running, do nothing. If not, delete a directory at a specified path.
Try this script:
pgrep Test_Ex > /dev/null || rm -r dir
If you don't have pgrep, use
ps -e -ocomm | grep Test_Ex || ...
instead.
Utilities like upstart, originally part of the Ubuntu Linux distribution I believe, are good for monitoring running tasks.
The best way to do this is to not do it. If you want to know if Test_Ex is still running, then start it from a script that looks something like:
#!/bin/sh
Test_Ex
logger "Test_Ex died"
rm -r /p/a/t/h
or
#!/bin/sh
while ! Test_Ex
do
    logger "Test_Ex terminated unsuccessfully, restarting in 5 seconds"
    sleep 5
done
Querying ps regularly is a bad idea, and trying to monitor it from cron is a horrible, horrible idea. There seems to be some comfort in the idea that crond will always be running, but you can no more rely on that than you can rely on the wrapper script staying alive; either one can be killed at any time. Waking up every 10 seconds to query ps is just a waste of resources.