How to stop R from leaving zombie processes behind

Here is a little reproducible example:
library(doMC)
library(doParallel)
registerDoMC(4)
timing <- system.time( fitall <- foreach(i=1:1000, .combine = "c") %dopar% {
print(i)
})
I start up R and look at the process table:
> system("ps -efl")
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S chbr 1 0 5 80 0 - 21399 wait 10:58 ? 00:00:00 /usr/local/lib/R/bin/exec/R --no-save --no-restore
0 S chbr 9 1 0 80 0 - 1113 wait 10:58 ? 00:00:00 sh -c ps -efl
0 R chbr 10 9 0 80 0 - 4294 - 10:58 ? 00:00:00 ps -efl
If I run the aforementioned simple foreach loop, doMC (or doParallel) leaves a zombie process behind. Output of ps -efl after running the loop:
> system("ps -efl")
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S chbr 1 0 4 80 0 - 25256 wait 11:00 ? 00:00:00 /usr/local/lib/R/b
1 Z chbr 10 1 0 80 0 - 0 exit 11:00 ? 00:00:00 [R] <defunct>
0 S chbr 12 1 0 80 0 - 1113 wait 11:00 ? 00:00:00 sh -c ps -efl
0 R chbr 13 12 0 80 0 - 4294 - 11:00 ? 00:00:00 ps -efl
If I repeat the loop without issuing registerDoMC(4) again, no additional zombie process gets created. However, if I issue registerDoMC(4) again, an additional zombie process gets created:
> system("ps -efl")
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
4 S chbr 1 0 0 80 0 - 25554 wait 11:00 ? 00:00:01 /usr/local/lib/R/b
1 Z chbr 21 1 0 80 0 - 0 exit 11:02 ? 00:00:00 [R] <defunct>
1 Z chbr 22 1 0 80 0 - 0 exit 11:02 ? 00:00:00 [R] <defunct>
0 S chbr 26 1 0 80 0 - 1113 wait 11:03 ? 00:00:00 sh -c ps -efl
0 R chbr 27 26 0 80 0 - 4294 - 11:03 ? 00:00:00 ps -efl
That's how I figured it could be doMC that is doing something it shouldn't. If doMC is causing this, is there a way to stop it from leaving zombie processes behind? (stopCluster() does not work, as no cluster gets created in the first place.)
> sessionInfo()
R Under development (unstable) (2014-08-16 r66404)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_IE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_IE.UTF-8 LC_COLLATE=en_IE.UTF-8
[5] LC_MONETARY=en_IE.UTF-8 LC_MESSAGES=en_IE.UTF-8
[7] LC_PAPER=en_IE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] doParallel_1.0.8 doMC_1.3.3 iterators_1.0.7 foreach_1.4.2
loaded via a namespace (and not attached):
[1] codetools_0.2-8 compiler_3.2.0

This really has nothing to do with foreach or doMC; as Steve Weston has pointed out in answers to other Stack Overflow questions, doMC is essentially just a wrapper for mclapply, and you can see zombie processes created with a simple call to mclapply:
library(parallel)
mclapply(rep(5,4), rnorm)
On my system, this leaves two zombie processes:
[richcalaway@richcalaway-pc ~]$ ps -efl | grep defunct
1 Z 1660945517 28701 28624 0 77 0 - 0 exit 12:00 pts/1 00:00:00 [R] <defunct>
1 Z 1660945517 28702 28624 0 78 0 - 0 exit 12:00 pts/1 00:00:00 [R] <defunct>
0 S 1660945517 28704 28308 0 78 0 - 15306 pipe_w 12:00 pts/2 00:00:00 grep defunct
Under normal circumstances, these zombie processes won't cause any trouble, and they do disappear when the R session ends. You can avoid them by using doParallel and a fork cluster instead of using doMC.
Cheers,
Rich Calaway
Principal Program Manager
Revolution Analytics
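
Following up on Rich's suggestion, here is a minimal sketch of the fork-cluster route with doParallel (not part of the original answer; it simply mirrors the foreach loop from the question, under the assumption that an explicitly created cluster is acceptable):
library(doParallel)
# Create an explicit fork cluster (Unix only) and register it as the
# foreach backend; unlike doMC's implicit forking, this cluster object
# can be shut down explicitly when the work is done.
cl <- makeForkCluster(4)
registerDoParallel(cl)
timing <- system.time( fitall <- foreach(i=1:1000, .combine = "c") %dopar% {
  i
})
# Stopping the cluster terminates and reaps the worker processes, so no
# <defunct> entries linger in the process table.
stopCluster(cl)
Because the fork cluster is an explicit object, stopCluster() has something to tear down, which is exactly what is missing in the doMC case.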

Related

"Out of memory" in hibrid MPI/OpenMP for GPU acceleration

I have compiled Quantum ESPRESSO (Program PWSCF v.6.7MaX) for GPU acceleration (hybrid MPI/OpenMP) with the following options:
module load compiler/intel/2020.1
module load hpc_sdk/20.9
./configure F90=pgf90 CC=pgcc MPIF90=mpif90 --with-cuda=yes --enable-cuda-env-check=no --with-cuda-runtime=11.0 --with-cuda-cc=70 --enable-openmp BLAS_LIBS='-lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core'
make -j8 pw
Apparently, the compilation ends successfully. Then, I execute the program:
export OMP_NUM_THREADS=1
mpirun -n 2 /home/my_user/q-e-gpu-qe-gpu-6.7/bin/pw.x < silverslab32.in > silver4.out
Then, the program starts running and prints out the following info:
Parallel version (MPI & OpenMP), running on 8 processor cores
Number of MPI processes: 2
Threads/MPI process: 4
...
GPU acceleration is ACTIVE
...
Estimated max dynamical RAM per process > 13.87 GB
Estimated total dynamical RAM > 27.75 GB
But after 2 minutes of execution, the job ends with an error:
0: ALLOCATE: 4345479360 bytes requested; status = 2(out of memory)
0: ALLOCATE: 4345482096 bytes requested; status = 2(out of memory)
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[47946,1],1]
Exit code: 127
--------------------------------------------------------------------------
This node has > 180 GB of available RAM. I check the memory use with the top command:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
89681 my_user 20 0 30.1g 3.6g 2.1g R 100.0 1.9 1:39.45 pw.x
89682 my_user 20 0 29.8g 3.2g 2.0g R 100.0 1.7 1:39.30 pw.x
I noticed that the process stops when RES memory reaches 4 GB. These are the characteristics of the node:
(base) [my_user@gpu001]$ numactl --hardware
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41
node 0 size: 95313 MB
node 0 free: 41972 MB
node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55
node 1 size: 96746 MB
node 1 free: 70751 MB
node distances:
node 0 1
0: 10 21
1: 21 10
(base) [my_user@gpu001]$ free -lm
total used free shared buff/cache available
Mem: 192059 2561 112716 260 76781 188505
Low: 192059 79342 112716
High: 0 0 0
Swap: 8191 0 8191
The version of MPI is:
mpirun (Open MPI) 3.1.5
This node is a compute node in a cluster, but whether I submit the job with SLURM or run it directly on the node, the error is the same.
Note that I compile it on the login node and run it on this GPU node; the difference is that the login node has no GPU attached.
I would really appreciate it if you could help me figure out what could be going on.
Thank you in advance!

Can a child be killed by a not-parent process?

I'm working on a project in which the parent forks a terminator process that kills one random child of the parent. It seems to create some problems.
Is it allowed?
Not only is that allowed: any process running as the same user or as the root user can kill any other process.
Running as my user ID in a terminal I can kill anything with my user ID. Even the terminal that I am running in. Or my GUI process.
The only process on Unix-like systems that is immune to kill is PID 1, a.k.a. init: if PID 1 exits for any reason, such as an internal bug causing a segmentation fault, the result is an immediate kernel panic.
It is allowed. Write the following code in parent.sh
terminator() {
sleep 2;
echo "(terminator) Going to kill Pid $1"
kill -9 "$1" && echo "(terminator) Pid $1 killed"
}
sleep 7 &
sleep 7 &
sleep 7 &
pid=$!
echo "Random pid=${pid} will be killed"
sleep 7 &
sleep 7 &
terminator ${pid} &
echo "All started"
ps -ef | sed -n '1p; /sleep 7/p'
sleep 3
echo "After kill"
ps -ef | sed -n '1p; /sleep 7/p'
Background processes are children. The terminator child will kill a random other child after 2 seconds.
Random pid=6781 will be killed
All started
UID PID PPID C STIME TTY TIME CMD
notroot 6779 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6780 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6781 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6782 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6783 6777 0 16:59 pts/0 00:00:00 sleep 7
(terminator) Going to kill Pid 6781
(terminator) Pid 6781 killed
parent.sh: line ...: 6781 Killed sleep 7
After kill
UID PID PPID C STIME TTY TIME CMD
notroot 6779 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6780 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6782 6777 0 16:59 pts/0 00:00:00 sleep 7
notroot 6783 6777 0 16:59 pts/0 00:00:00 sleep 7

Optimize CPU Usage in Nginx and PHP5-FPM

I have a site serving WordPress + Nginx + PHP5-FPM + WP Super Cache. The server is a Digital Ocean droplet with 4 GB of memory and 2 cores. At times (like yesterday), CPU usage is around 90% and the site becomes very slow. Most of the time it is between 30-50%.
Other articles I've read about "high" Nginx and PHP5-FPM usage mention rates of 40%+ for a single process. I'm not experiencing anything that unusual, but I wonder if my statistics are reasonable, given the machine and my Nginx configuration.
I've pasted nginx.conf here: http://pastebin.com/fD9HjqpB
General wordpress configuration file (for Nginx): http://pastebin.com/CVkHZtvU
And the site-specific config here: http://pastebin.com/KsVECwEa
Cpu(s): 47.8%us, 37.3%sy, 0.0%ni, 14.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4049968k total, 1363176k used, 2686792k free, 35376k buffers
Swap: 0k total, 0k used, 0k free, 549984k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
667 mysql 20 0 1314m 85m 7908 S 89 2.2 17:10.21 mysqld
1679 www-data 20 0 229m 54m 3836 S 15 1.4 0:04.23 php5-fpm
1688 www-data 20 0 234m 59m 3696 R 12 1.5 0:01.21 php5-fpm
1682 www-data 20 0 237m 61m 3780 S 10 1.6 0:02.93 php5-fpm
1681 www-data 20 0 231m 56m 3848 S 9 1.4 0:03.26 php5-fpm
1684 www-data 20 0 237m 62m 3816 S 9 1.6 0:02.15 php5-fpm
1680 www-data 20 0 240m 65m 5204 S 7 1.7 0:03.47 php5-fpm
1686 www-data 20 0 237m 62m 3836 S 6 1.6 0:01.33 php5-fpm
1687 www-data 20 0 237m 62m 3820 S 6 1.6 0:00.97 php5-fpm
1691 www-data 20 0 229m 54m 3828 S 6 1.4 0:00.52 php5-fpm
721 www-data 20 0 130m 57m 1104 S 0 1.5 0:00.48 nginx
Is there anything I could be optimizing further to lower CPU usage and avoid the 90%+ usage periods? Obviously we could increase the size of the droplet, but I'm hoping to optimize further rather than scale with hardware, if there is an opportunity to do so.
Any insight would be greatly appreciated!

What happens when you hit ctrl+z on a process?

If I'm running a long-running process and I stop it with Ctrl+Z, I get the following message in my terminal:
76381 suspended git clone git@bitbucket.org:kevinburke/<large-repo>.git
What actually happens when the process is suspended? Is the state held in memory? Is this functionality implemented at the operating system level? How is the process able to resume execution right where it left off when I restart it with fg?
When you hit Ctrl+Z in a terminal, the line-discipline of the (pseudo-)terminal device driver (the kernel) sends a SIGTSTP signal to all the processes in the foreground process group of the terminal device.
That process group is an attribute of the terminal device. Typically, your shell is the process that defines which process group is the foreground process group of the terminal device.
In shell terminology, a process group is called a "job"; you can put a job in the foreground or background with the fg and bg commands and find out about the currently running jobs with the jobs command.
The SIGTSTP signal is like the SIGSTOP signal except that contrary to SIGSTOP, SIGTSTP can be handled by a process.
Upon receiving such a signal, the process is suspended: it is paused but still there, and it simply won't be scheduled to run any more until it is killed or sent a SIGCONT signal to resume execution. The shell that started the job will be waiting for the leader of that process group; when the process is suspended, the wait() returns and reports that it was stopped. The shell can then update the state of the job and tell you it is suspended.
$ sleep 100 | sleep 200 & # start job in background: two sleep processes
[1] 18657 18658
$ ps -lj # note the PGID
F S UID PID PPID PGID SID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 10031 18657 26500 18657 26500 0 85 5 - 2256 - pts/2 00:00:00 sleep
0 S 10031 18658 26500 18657 26500 0 85 5 - 2256 - pts/2 00:00:00 sleep
0 R 10031 18692 26500 18692 26500 0 80 0 - 2964 - pts/2 00:00:00 ps
0 S 10031 26500 26498 26500 26500 0 80 0 - 10775 - pts/2 00:00:01 zsh
$ jobs -p
[1] + 18657 running sleep 100 |
running sleep 200
$ fg
[1] + running sleep 100 | sleep 200
^Z
zsh: suspended sleep 100 | sleep 200
$ jobs -p
[1] + 18657 suspended sleep 100 |
suspended sleep 200
$ ps -lj # note the "T" under the S column
F S UID PID PPID PGID SID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 T 10031 18657 26500 18657 26500 0 85 5 - 2256 - pts/2 00:00:00 sleep
0 T 10031 18658 26500 18657 26500 0 85 5 - 2256 - pts/2 00:00:00 sleep
0 R 10031 18766 26500 18766 26500 0 80 0 - 2964 - pts/2 00:00:00 ps
0 S 10031 26500 26498 26500 26500 0 80 0 - 10775 - pts/2 00:00:01 zsh
$ bg %1
[1] + continued sleep 100 | sleep 200
$ ps -lj
F S UID PID PPID PGID SID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 S 10031 18657 26500 18657 26500 0 85 5 - 2256 - pts/2 00:00:00 sleep
0 S 10031 18658 26500 18657 26500 0 85 5 - 2256 - pts/2 00:00:00 sleep
0 R 10031 18824 26500 18824 26500 0 80 0 - 2964 - pts/2 00:00:00 ps
0 S 10031 26500 26498 26500 26500 0 80 0 - 10775 - pts/2 00:00:01 zsh

Submitting R jobs using PBS

I am submitting a job using qsub that runs parallelized R. My intention is to have the R program running on 4 cores rather than 8 cores. Here are some of my settings in the PBS file:
#PBS -l nodes=1:ppn=4
....
time R --no-save < program1.R > program1.log
I am issuing the command ta job_id and I see that 4 cores are listed. However, the job occupies a large amount of memory (31944900k used vs 32949628k total). If I were to use 8 cores, the job would hang due to the memory limitation.
top - 21:03:53 up 77 days, 11:54, 0 users, load average: 3.99, 3.75, 3.37
Tasks: 207 total, 5 running, 202 sleeping, 0 stopped, 0 zombie
Cpu(s): 30.4%us, 1.6%sy, 0.0%ni, 66.8%id, 0.0%wa, 0.0%hi, 1.2%si, 0.0%st
Mem: 32949628k total, 31944900k used, 1004728k free, 269812k buffers
Swap: 2097136k total, 8360k used, 2088776k free, 6030856k cached
Here is a snapshot when issuing the command ta job_id:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1794 x 25 0 6247m 6.0g 1780 R 99.2 19.1 8:14.37 R
1795 x 25 0 6332m 6.1g 1780 R 99.2 19.4 8:14.37 R
1796 x 25 0 6242m 6.0g 1784 R 99.2 19.1 8:14.37 R
1797 x 25 0 6322m 6.1g 1780 R 99.2 19.4 8:14.33 R
1714 x 18 0 65932 1504 1248 S 0.0 0.0 0:00.00 bash
1761 x 18 0 63840 1244 1052 S 0.0 0.0 0:00.00 20016.hpc
1783 x 18 0 133m 7096 1128 S 0.0 0.0 0:00.00 python
1786 x 18 0 137m 46m 2688 S 0.0 0.1 0:02.06 R
How can I prevent other users from using the other 4 cores? I'd like to somehow make it appear that my job is using 8 cores, with 4 cores idling.
Could anyone kindly help me out with this? Can this be solved using PBS?
Many thanks!
"How can I prevent other users from using the other 4 cores? I like to mask somehow that my job is using 8 cores with 4 cores idling."
Maybe a simple way around it is to send a 'sleep' job to the other 4 cores? Seems hackish though! (And a warning: my PBS is rusty!)
Why not do the following: ask PBS for ppn=4 and, additionally, ask for all the memory on the node, i.e.
#PBS -l nodes=1:ppn=4 -l mem=31944900k
This might not be possible on your setup.
I am not sure how R is parallelized, but if it is OpenMP you could definitely ask for 8 cores and set OMP_NUM_THREADS to 4.
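For illustration only, here is a hypothetical first few lines of program1.R that cap the thread count from inside R; the value 4 is an assumption, and exporting OMP_NUM_THREADS in the PBS script before R starts (an export line above the "time R --no-save" command) is the more dependable place to set it, since some OpenMP runtimes read the variable only once at start-up:
# Hypothetical top of program1.R: limit OpenMP threads before any
# threaded code initializes.
Sys.setenv(OMP_NUM_THREADS = "4")
# Sanity check: print the value the R process actually sees.
print(Sys.getenv("OMP_NUM_THREADS"))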
