I've noticed strange behaviour with launching parallel processes in R that only appears when R is built with icc. The spawned parallel processes are not killed when the main process ends.
Example code is as follows:
library(foreach)
library(doMC)
registerDoMC(cores=4)
d <- rep(1,16)
t <- foreach(i=1:4, .combine=c) %dopar% {
s <- foreach(1:4, .combine=c) %do% 1*1
}
identical(t, d)
Here we see that the 4 spawned processes are orphaned at the completion of the script.
build$ Rscript HungRProcs.R
Loading required package: iterators
Loading required package: parallel
[1] TRUE
build$ ps -elf | grep R
F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
1 S root 1427 2 0 80 0 - 0 worker May15 ? 00:00:00 [SCIF INTR 0]
0 S build 19173 26999 0 80 0 - 35960 poll_s 12:27 pts/1 00:00:00 vim RStats-3.0.3-dw.spec
1 S walling 24425 1 1 80 0 - 51468 hrtime 13:11 pts/5 00:00:00 /home1/00157/walling/software/R-3.1.0/bin/exec/R --slave --no-restore --file=HungRProcs.R --args
1 S walling 24426 1 1 80 0 - 51468 hrtime 13:11 pts/5 00:00:00 /home1/00157/walling/software/R-3.1.0/bin/exec/R --slave --no-restore --file=HungRProcs.R --args
1 S walling 24427 1 1 80 0 - 51468 hrtime 13:11 pts/5 00:00:00 /home1/00157/walling/software/R-3.1.0/bin/exec/R --slave --no-restore --file=HungRProcs.R --args
1 S walling 24428 1 1 80 0 - 51468 hrtime 13:11 pts/5 00:00:00 /home1/00157/walling/software/R-3.1.0/bin/exec/R --slave --no-restore --file=HungRProcs.R --args
0 R walling 24430 21882 0 80 0 - 27561 - 13:11 pts/5 00:00:00 ps -elf
0 S walling 24431 21882 0 80 0 - 25814 pipe_w 13:11 pts/5 00:00:00 grep R
The configure used for the icc build is as follows:
build$ ./configure --prefix=/home1/00157/walling/software/R-3.1.0 CC=icc F77=ifort FC=ifort CXX=icpc
If built with gcc, the spawned processes are terminated when the main process completes. The configure used for the gcc build is as follows:
build$ ./configure --prefix=/home1/00157/walling/software/R-3.1.0 CC=gcc F77=gfortran FC=gfortran CXX=gcc
I have run tests against both R 3.0.3 and 3.1.0, with different parallel backends (doMC, doSNOW and plain mclapply). I've also tested with multiple versions of the GNU and Intel compilers, on both CentOS 5.10 and 6.5. All test cases have resulted in the same behaviour.
Any ideas why the compiler would affect proper termination of spawned sub-processes?
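A quick way to confirm the orphaning is to filter the process table for rows whose parent PID is 1, as in the listing above. The sketch below is illustrative only: `list_orphans` is a hypothetical helper name, and it is fed a shortened copy of the listing rather than live `ps` output.

```shell
# Hypothetical helper: print rows whose PPID (column 2) is 1 and whose
# command line matches the given pattern. Expects input in the layout of
# "ps -eo pid,ppid,args" on stdin and skips the header row.
list_orphans() {
    awk -v pat="$1" 'NR > 1 && $2 == 1 && $0 ~ pat'
}

# Shortened sample based on the listing above (columns: PID PPID COMMAND).
sample='  PID  PPID COMMAND
24425     1 R --slave --no-restore --file=HungRProcs.R --args
24426     1 R --slave --no-restore --file=HungRProcs.R --args
24430 21882 ps -elf'

orphans=$(printf '%s\n' "$sample" | list_orphans 'HungRProcs')
printf '%s\n' "$orphans"

# On a live system: ps -eo pid,ppid,args | list_orphans 'HungRProcs'
```

Running this right after the script exits, once for the icc build and once for the gcc build, would show whether any workers were left behind.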
I'm trying to make a call from within R to execute BASH commands, to get my feet wet:
I simply wanted to capture a listing of the files in a specific directory using the "ls -al" command. The output would be sent to a text file called a01_test.txt.
The directory I would like to capture the contents of is "C:\Users\user00\a01_TEST" which is referenced as "/mnt/c/Users/user00/a01_TEST/" from a WSL Ubuntu 20.04.5 LTS perspective.
The directory contains five (5) files: file_01.txt, file_02.txt ,..., file_05.txt.
FYI, I am running R (R version 4.2.0 (2022-04-22 ucrt)) via RStudio (2022.07.1 Build 554) on Windows 11 (Version 10.0.22000 Build 22000).
I tried:
PATH_UNIX <- "/mnt/c/Users/user00/a01_TEST/"
FILENAME_TEST <-"a01_test.txt"
paste0("system(\"bash -c \'ls -al ",PATH_UNIX," >",PATH_UNIX,FILENAME_TEST,"\'\")")
However that only returned a command prompt -- nothing else:
> paste0("system(\"bash -c \'ls -al ",PATH_UNIX," >",PATH_UNIX,FILENAME_TEST,"\'\")")
[1] "system(\"bash -c 'ls -al /mnt/c/Users/user00/a01_TEST/ >/mnt/c/Users/user00/a01_TEST/a01_test.txt'\")"
>
I thought one could test the code using:
cat(print(paste0("system(\"bash -c \'ls -al ",PATH_UNIX," >",PATH_UNIX,FILENAME_TEST,"\'\")")))
which resulted in:
> cat(print(paste0("system(\"bash -c \'ls -al ",PATH_UNIX," >",PATH_UNIX,FILENAME_TEST,"\'\")")))
[1] "system(\"bash -c 'ls -al /mnt/c/Users/user00/a01_TEST/ >/mnt/c/Users/user00/a01_TEST/a01_test.txt'\")"
system("bash -c 'ls -al /mnt/c/Users/user00/a01_TEST/ >/mnt/c/Users/user00/a01_TEST/a01_test.txt'")
If I do not use variables, such as, PATH_UNIX and FILENAME_TEST and code the entire path manually, I can create a text file (a01_test.txt) giving me the desired listing of the directory's contents:
system("bash -c 'ls -al /mnt/c/Users/user00/a01_TEST > /mnt/c/Users/user00/a01_TEST/a01_test.txt'")
which results in:
> system("bash -c 'ls -al /mnt/c/Users/user00/a01_TEST > /mnt/c/Users/user00/a01_TEST/a01_test.txt'")
[1] 0
>
giving me the file called "a01_test.txt" containing the directory's contents:
total 0
drwxrwxrwx 1 user00 user00 4096 Nov 3 2022 .
drwxrwxrwx 1 user00 user00 4096 Nov 3 05:07 ..
-rwxrwxrwx 1 user00 user00 0 Nov 3 2022 a01_test.txt
-rwxrwxrwx 1 user00 user00 0 Nov 3 05:26 file_01.txt
-rwxrwxrwx 1 user00 user00 0 Nov 3 05:26 file_02.txt
-rwxrwxrwx 1 user00 user00 0 Nov 3 05:26 file_03.txt
-rwxrwxrwx 1 user00 user00 0 Nov 3 05:26 file_04.txt
-rwxrwxrwx 1 user00 user00 0 Nov 3 05:26 file_05.txt
Any assistance to make use of the variables PATH_UNIX & FILENAME_TEST to make a call to Linux/Unix to obtain a directory listing would be appreciated.
sprintf (see ?sprintf for further details) is a convenient way to build formatted command strings that can subsequently be passed to system:
PATH_UNIX <- '/mnt/c/Users/user00/a01_TEST/'
FILENAME_TEST <- 'a01_test.txt'
# PATH_UNIX is used twice: as the directory to list and as the location of the output file
cmdstr <- sprintf('bash -c \'ls -al %s > %s%s\'', PATH_UNIX, PATH_UNIX, FILENAME_TEST)
message('bash command string = ', cmdstr)
system(command = cmdstr)
Expanding on the solution provided by br00t, and doing some testing, one could also use the paste0() function:
# DESIRED CMD TO BE PASSED VIA BASH
cat(paste0("system(bash -c \'ls -al ",PATH_UNIX," >",PATH_UNIX,FILENAME_TEST,"\')"))
# OUTPUT:
# system(bash -c 'ls -al /mnt/c/Users/user00/a01_TEST/ >/mnt/c/Users/user00/a01_TEST/a01_test.txt')
# PLACE DESIRED CMD IN A VAR:
cmdstr_test <- paste0("bash -c \'ls -al ",PATH_UNIX," > ",PATH_UNIX,FILENAME_TEST,"\'")
# CHECK VAR:
message('bash command string = ', cmdstr_test)
# OUTPUT:
# bash command string = bash -c 'ls -al /mnt/c/Users/user00/a01_TEST/ > /mnt/c/Users/user00/a01_TEST/a01_test.txt'
# RUN COMMAND USING system() function:
system(command = cmdstr_test)
# OUTPUT (Will get "0", if successful)
> system(command = cmdstr_test)
[1] 0
>
I tried the following commands to read an RDS file from stdin, but they don't work. My OS is Mac OS X.
$ lr -e "readRDS(file('stdin'))" < /tmp/x.rds
Error in readRDS(file("stdin")) : unknown input format
$ lr -p -e "readRDS('/dev/stdin')" < /tmp/x.rds
Error in readRDS("/dev/stdin") : error reading from connection
But this works.
$ lr -p -e "readRDS('/tmp/x.rds')"
x y
1 1 11
2 2 12
3 3 13
Does anybody know how to readRDS from stdin? Thanks.
It works for me (on Linux, using littler 0.3.9 on R-devel) using '/dev/stdin' instead of 'stdin'; so try:
lr -p -e "print(readRDS('/dev/stdin'))" < /tmp/x.rds
I am using R on my Ubuntu machine with the latest configuration.
In R, I get the result below:
> read.fwf(pipe('ps -ef | grep /var/lib/docker/'), width = 60)
V1
1 root 29155 29151 0 11:18 pts/0 00:00:00 sh -c ps -ef
2 root 29157 29155 0 11:18 pts/0 00:00:00 grep /var/li
However, in the Ubuntu console I get a different result:
ps -ef | grep /var/lib/docker/
root 29150 2509 0 11:17 pts/0 00:00:00 grep --color=auto /var/lib/docker/
I want R to fetch the PID of /var/lib/docker/, which according to Ubuntu is 2509.
Can anyone help me understand why I am getting a different result, and how to fetch the PID correctly?
Thanks,
Use ps() in the ps package. This function outputs a data.frame with the process id information.
library(ps)
pid_df <- ps()
pid_df$pid[grep("docker", pid_df$name)]
or in one line:
subset(ps(), grepl("docker", name))$pid
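As for why the two outputs in the question differ: in both listings the only matching rows are the grep command itself (plus, in R, the sh -c wrapper that pipe() starts), so neither listing contains a docker process. In ps -ef output the second column is the PID and the third is the parent PID, so 2509 is the parent of the grep process. A minimal shell sketch of a filter that avoids this trap by matching only the command name follows; `pids_by_command` is a hypothetical helper and the dockerd row in the sample is invented for illustration.

```shell
# Hypothetical helper: read "ps -ef" style rows on stdin and print the PID
# (column 2) of rows whose command name (column 8) matches the pattern.
pids_by_command() {
    awk -v pat="$1" '$8 ~ pat { print $2 }'
}

# Sample rows in "ps -ef" layout. The dockerd row is made up; the grep row
# is the one from the question.
sample='root      1234     1  0 11:00 ?        00:00:05 dockerd --data-root /var/lib/docker/
root     29150  2509  0 11:17 pts/0    00:00:00 grep --color=auto /var/lib/docker/'

docker_pids=$(printf '%s\n' "$sample" | pids_by_command 'docker')
echo "$docker_pids"

# On a live system: ps -ef | pids_by_command 'docker'
```

Because the pattern is tested against the command name only, the grep row is skipped even though its arguments contain the search string. Where it is installed, `pgrep docker` gives a similar result and never reports itself.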
Running ps aux returns:
USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -bash
USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -test
USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -test1
I am attempting to filter on bash with wildcards so that just
USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -bash
is returned:
ps aux|grep "*bash*"
which returns an invalid option error:
grep: invalid option -- 'p'
Usage: grep [OPTION]... PATTERN [FILE]...
Try `grep --help' for more information.
How can I filter the output for bash?
You should just use ps aux | grep 'bash' and it will work the way you want. In a grep pattern, * is the regex repetition operator ("zero or more" of the preceding item), not the shell's * wildcard.
ps aux | grep bash | grep -v grep
to return all bash processes while leaving out the grep command itself
Some versions of ps support this directly. For example, to list all processes whose name is bash, run ps like this:
ps -C bash
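To see the regex behaviour on fixed input, the sketch below runs two patterns against a copy of the three sample rows from the question. The variable names are illustrative only.

```shell
# The three sample rows from the question.
sample='USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -bash
USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -test
USER 131 2.1 0.1 23423 423 FFF/5 R 10:12 0:00 -test1'

# A plain pattern is enough: grep matches it anywhere in the line,
# so no leading or trailing wildcard is needed.
matches=$(printf '%s\n' "$sample" | grep 'bash')
printf '%s\n' "$matches"

# In a regex, "*" repeats the preceding item zero or more times, so
# "tes*t" matches "tet", "test", "tesst" and so on. Count matching rows.
star_count=$(printf '%s\n' "$sample" | grep -c 'tes*t')
echo "$star_count"
```

The first grep keeps only the -bash row. The second counts the two rows containing "test", which shows * acting on the preceding "s" rather than as a glob.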
I am using R 2.14.0 64 bit on Linux. I went ahead and used the example described here. I am then running the example -
library(doMC)
registerDoMC()
system.time({
r <- foreach(icount(trials), .combine=cbind) %dopar% {
ind <- sample(100, 100, replace=TRUE)
result1 <- glm(x[ind,2]~x[ind,1], family=binomial(logit))
coefficients(result1)
} })
However, I see in top that it is using only one CPU core. To prove it another way, if I check a process that uses all cores, I see -
ignorant#mybox: ~/R$ ps -p 5369 -L -o pid,tid,psr,pcpu
PID TID PSR %CPU
5369 5369 0 0.1
5369 5371 1 0.0
5369 5372 2 0.0
5369 5373 3 0.0
5369 5374 4 0.0
5369 5375 5 0.0
5369 5376 6 0.0
5369 5377 7 0.0
But in this case, I see -
ignorant#mybox: ~/R$ ps -p 7988 -L -o pid,tid,psr,pcpu
PID TID PSR %CPU
7988 7988 0 19.9
ignorant#mybox: ~/R$ ps -p 7991 -L -o pid,tid,psr,pcpu
PID TID PSR %CPU
7991 7991 0 19.9
How can I get it to use multiple cores? I am using multicore instead of doSMP or something else, because I do not want to have copies of my data for each process.
You could try executing your script using the command:
$ taskset 0xffff R --slave -f parglm.R
If this fixes the problem, then you may have a version of R that was built with OpenBLAS or GotoBLAS2, which sets the CPU affinity so that you can only use one core. This is a known problem.
If you want to run your example interactively, start R with:
$ taskset 0xffff R
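The 0xffff argument is a hexadecimal CPU bitmask with the low 16 bits set, so it allows CPUs 0 to 15. A minimal sketch for building such a mask for a different core count follows; `affinity_mask` is a hypothetical helper and not part of taskset.

```shell
# Hypothetical helper: hex bitmask with the lowest N bits set (CPUs 0..N-1).
affinity_mask() {
    printf '0x%x\n' $(( (1 << $1) - 1 ))
}

mask16=$(affinity_mask 16)   # same mask as used above
mask4=$(affinity_mask 4)     # CPUs 0..3 only
echo "$mask16 $mask4"

# Usage on Linux (taskset ships with util-linux):
#   taskset "$(affinity_mask 8)" R --slave -f parglm.R
#   taskset -p $$    # show the affinity mask of the current shell
```

Running taskset -p on the PID of the R process should also show whether its affinity was narrowed to a single core after the BLAS library loaded.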
First, you might want to look at htop, which is probably available for your distribution. You can clearly see the usage for each CPU.
Second, have you tried setting the number of cores on the machine directly?
Run this with htop open:
library(doMC)
registerDoMC(cores=12) # Try setting this appropriately.
system.time({
r <- foreach(1:1000, .combine=cbind) %dopar% {
mean(rnorm(100000))
} })
# I get:
# user system elapsed
# 12.789 1.136 1.860
If the user time is much higher than the elapsed time (not always, I know, but as a rule of thumb), you are probably using more than one core.