I'm running R on a remote Linux server and having an issue where parallel code causes the program to freeze (no error message). I've posted some toy code that replicates the problem below. The same code runs fine (< 1 second) on my PC at home so I'm at a loss for how to debug.
Even if it's unclear what the problem is, any advice on debugging this would be really useful. Thanks!
# Prelims
library(stringdist)
library(doParallel)
rm(list = ls())
cat("\014")
# Start parallel
registerDoParallel(cores=2)
# Works
cat("Test #1","\n")
foreach (i=1:2, .packages="stringdist") %dopar% {
cat(stringdist("JOHN","JAHN",method="jaccard",q=2),"\n")
}
# Works
cat("Test #2","\n")
foreach (i=1:2, .packages="stringdist") %do% {
cat(stringdist("JOHN",c("JAHN","DJIN"),method="jaccard",q=2),"\n")
}
# Doesn't work -- spawns two workers and freezes
cat("Test #3","\n")
test<-foreach (i=1:2, .packages="stringdist") %dopar% {
cat(i,"\n")
stringdist("JOHN",c("JAHN","DJIN"),method="jaccard",q=2)
}
stopImplicitCluster()
Only a partial solution, but looking more carefully I saw that "stringdist" already uses multiple threads. The "nested parallel" aspect of this seemed to be causing problems in the Linux server setup, albeit not always and not on my home PC [not sure why].
Setting "nthread=1" as a stringdist option allows me to use parallel foreach.
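As a minimal sketch of that workaround, the failing Test #3 from the question can be rerun with `nthread = 1` passed to `stringdist()`, so each foreach worker stays single-threaded. It assumes the same packages and two workers as the question; whether it removes the freeze on a given server is not guaranteed.

```r
library(stringdist)
library(doParallel)

registerDoParallel(cores = 2)

# nthread = 1 keeps each stringdist() call single-threaded inside its worker,
# so the only parallelism left is the foreach loop itself
test <- foreach(i = 1:2, .packages = "stringdist") %dopar% {
  stringdist("JOHN", c("JAHN", "DJIN"), method = "jaccard", q = 2, nthread = 1)
}

stopImplicitCluster()
```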
I would like to run a calculation on multiple threads in R. My code is below. It examines the "Quality" column of NGS data and records the rows that contain a comma (",").
It worked, but it didn't save any time, and I have now figured out why: multiple worker processes (11 of them) were created successfully, but only one did any work. See Update 1. I also tried doMC in place of doParallel; the example code worked, but not once I inserted my own code into it. See Update 2.
I also have to add an Update 3: on a 4-thread computer under Windows, the program occasionally runs as planned, but not consistently.
library(microseq)
library(foreach)
library(doParallel)
library(stringr)
cn<-detectCores()-1 ##i.e. 11
read1<-readFastq("pairIGK.fastq")
Cluster <- makeCluster(cn, type = "SOCK", methods = FALSE)
registerDoParallel(Cluster)
del_list <- foreach (i=1:nrow(read1), .inorder = FALSE, .packages = c("stringr")) %dopar% {
if(str_count(read1$Quality[i],",")!=0) i
}
stopCluster(Cluster)
del_list<-unlist(del_list)
read1<-read1[c(-(del_list)),]
Update 1. I tested the code on the same 12-thread computer, but under Linux. Essentially only one core was working for R. I saw 10 or 11 Rsession entries in the system monitor, and they were not doing any work at all.
Update 2. I found a slide deck from Microsoft about multi-threaded calculation with R.
https://blog.revolutionanalytics.com/downloads/Speed%20up%20R%20with%20Parallel%20Programming%20in%20the%20Cloud%20%28useR%20July2018%29.pdf
It highlighted the doMC package with an example that estimates the probability that two classmates share a birthday for varying class sizes. I ran the provided code as is, and it did use all cores. The code is:
pbirthdaysim <- function(n) {
ntests <- 100000
pop <- 1:365
anydup <- function(i)
any(duplicated(
sample(pop, n, replace=TRUE)))
sum(sapply(seq(ntests), anydup)) / ntests
}
library(doMC)
registerDoMC(11)
bdayp <- foreach(n=1:100) %dopar% pbirthdaysim(n)
It took ~20s to finish on my 12 thread machine, which agrees with the slide.
However, when I inserted my own function into this template, the same thing happened: multiple workers were created, but only one actually did any work. My code looks like this:
library(microseq)
library(foreach)
library(doParallel)
library(stringr)
library(doMC)
cn<-detectCores()-1
read1<-readFastq("pairIGK.fastq")
registerDoMC(cn)
del_list <- foreach (i=1:nrow(read1), .inorder = FALSE, .packages = c("stringr")) %dopar% {
if(str_count(read1$Quality[i],",")!=0) i
}
del_list<-unlist(del_list)
read1<-read1[c(-(del_list)),]
Update 3.
I'm really confused and will go on investigating.
I am new to programming and I am trying to use parallel processing in R on Windows, using existing code.
The following is a snippet of my code:
if (length(grep("linux", R.version$os)) == 1){
num_cores = detectCores()
impact_list <- mclapply(len_a, impact_func, mc.cores = (num_cores - 1))
}
# else if(length(grep("mingw32", R.version$os)) == 1){
# num_cores = detectCores()
# impact_list <- mclapply(len_a, impact_func, mc.cores = (num_cores - 1))
#
# }
else{
impact_list <- lapply(len_a, impact_func)
}
return(sum(unlist(impact_list, use.names = F)))
This works fine. I am using R on Windows, so the code takes the 'else' branch and runs with lapply() rather than in parallel.
I added the 'else if' block to make it work on Windows, but when I uncomment that block and run the code, I get the error "'mc.cores' > 1 is not supported on Windows".
Please suggest how I can use parallel processing on Windows so that the code takes less time to run.
Any help will be appreciated.
(disclaimer: I'm author of the future framework here)
The future.apply package provides parallel versions of R's built-in "apply" functions. It's cross platform, i.e. it works on Linux, macOS, and Windows. The package allows you to often just replace an existing lapply() with a future_lapply() call, e.g.
library(future.apply)
plan(multisession)
your_fcn <- function(len_a) {
impact_list <- future_lapply(len_a, impact_func)
sum(unlist(impact_list, use.names = FALSE))
}
Regarding mclapply() per se: If you use parallel::mclapply() in your code, make sure that there is always an option not to use it. The reason is that it is not guaranteed to work in all environments; that is, it might be unstable and crash R. In the R-devel thread 'mclapply returns NULLs on MacOS when running GAM' (https://stat.ethz.ch/pipermail/r-devel/2020-April/079384.html), the author of mclapply() wrote on 2020-04-28:
Do NOT use mcparallel() in packages except as a non-default option that user can set for the reasons Henrik explained. Multicore is intended for HPC applications that need to use many cores for computing-heavy jobs, but it does not play well with RStudio and more importantly you don't know the resource available so only the user can tell you when it's safe to use. Multi-core machines are often shared so using all detected cores is a very bad idea. The user should be able to explicitly enable it, but it should not be enabled by default.
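In the same spirit of keeping parallelism opt-in, here is a minimal sketch that uses only the base parallel package. It swaps in a socket cluster (makeCluster() with parLapply()), which also runs on Windows, instead of the future.apply approach shown above. The names sum_impacts and workers are hypothetical, and impact_func is a stand-in for the function in the question.

```r
library(parallel)

# Hypothetical stand-in for the question's impact_func
impact_func <- function(x) x^2

# Runs in parallel only when the caller explicitly asks for more than one worker
sum_impacts <- function(len_a, workers = 1L) {
  if (workers > 1L) {
    cl <- makeCluster(workers)            # socket cluster, not forking
    on.exit(stopCluster(cl), add = TRUE)  # always shut the workers down
    impact_list <- parLapply(cl, len_a, impact_func)
  } else {
    impact_list <- lapply(len_a, impact_func)
  }
  sum(unlist(impact_list, use.names = FALSE))
}

total_seq <- sum_impacts(1:10)                # sequential default
total_par <- sum_impacts(1:10, workers = 2L)  # user opts in to two workers
```

Defaulting to a single worker leaves it to the user to decide how many cores are safe to use on a shared machine.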
I have a long running task (about 3-4 hours) and I use the doMC backend and a foreach (...) %dopar% loop. Code:
registerDoMC(4)
res <- foreach(i=1:n, .combine=function(x,y) rbindlist(list(x,y)), .inorder=FALSE, .errorhandling="stop", .verbose=TRUE) %dopar%
{
# do some stuff with data.table and append row at the end (that's why I use this combine function)
}
At some point during my execution, the number of parallel workers reduces: I originally set it to 4, and the number of active workers reduces to 2 when I inspect my processes in htop. At the end of my foreach I don't get any errors (even with verbose on) so I am completely baffled as to what is happening. Has anyone seen this problem before? (I am running on Linux btw). Any help would be greatly appreciated and I am happy to provide more information if anyone requests.
As described in this question, doMC preschedules tasks by default, which might cause some workers to finish earlier and sit idle. The solution was the same as in that question:
opts <- list(preschedule=FALSE)
results <- foreach(i=1:10, .options.multicore=opts) %dopar% {
# ...
}
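Since doMC runs on top of the forking machinery in the parallel package, the same switch can be tried directly with parallel::mclapply() and its mc.preschedule argument. This is only a sketch: the task function and its sleep times are made up to mimic jobs of uneven length, and the core count falls back to 1 on Windows, where forking is unavailable.

```r
library(parallel)

# Hypothetical task with uneven run times: odd-numbered jobs are slow
task <- function(i) {
  Sys.sleep(if (i %% 2 == 1) 0.2 else 0.01)
  i
}

cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Prescheduled: the jobs are divided among the workers before anything runs
static <- mclapply(1:8, task, mc.cores = cores, mc.preschedule = TRUE)

# Not prescheduled: a new job is started whenever a worker slot becomes free
dynamic <- mclapply(1:8, task, mc.cores = cores, mc.preschedule = FALSE)
```

Both calls return the same results in the same order; only the assignment of jobs to workers differs.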
I have written the following code (running in RStudio for Windows) to read a long list of very large text files into memory using a parallel foreach loop:
open.raw.txt <- function() {
files <- choose.files(caption="Select .txt files for import")
cores <- detectCores() - 2
registerDoParallel(cores)
data <- foreach(file.temp = files[1:length(files)], .combine = cbind) %dopar%
as.numeric(read.table(file.temp)[, 4])
stopImplicitCluster()
return(data)
}
Unfortunately, however, the function fails to complete and debugging shows that it gets stuck at the foreach loop stage. Oddly, windows task manager indicated that I am at close to full capacity processor wise (I have 32 cores, and this should use 30 of them) for around 10 seconds, then it drops back to baseline. However the loop never completes, indicating that it is doing the work and then getting stuck.
Even more bizarrely, if I remove the 'function' bit and just run each step one-by-one as follows:
files <- choose.files(caption="Select .txt files for import")
cores <- detectCores() - 2
registerDoParallel(cores)
data <- foreach(file.temp = files[1:length(files)], .combine = cbind) %dopar%
as.numeric(read.table(file.temp)[, 4])
stopImplicitCluster()
Then it all works fine. What is going on?
Update: I ran the function and then left it for a while (around an hour) and finally it completed. I am not quite sure how to interpret this, given that multiple cores are still only used for the first 10 seconds or so. Could the issue be with how the tasks are being shared out? Or maybe memory management? I'm new to parallelism, so not sure how to investigate this.
The problem is that you have multiple processes opening and closing the same file. Usually, when a file is opened by one process it is locked to other processes, which prevents the file from being read in parallel.
I am developing parallel R code using the snow package, but when it calls C++ code through the Rcpp package, the program just hangs and becomes unresponsive.
As an example:
I have the following R code, which uses snow to split the work across a certain number of processes:
MyRFunction<-function(i) {
n=i
.Call("CppFunction",n,PACKAGE="MyPackage")
}
if (mpi) {
cl<-getMPIcluster()
clusterExport(cl, list("set.user.Random.seed"))
clusterEvalQ(cl, {library(Rcpp); NULL})
out<-clusterApply(cl,1:mc.cores,MyRFunction)
stopCluster(cl)
}
else
out <- parallel::mclapply(1:mc.cores,MyRFunction)
Whereas my C++ function looks like...
RcppExport SEXP CppFunction(SEXP n_) {
// the SEXP argument is named n_ so it does not clash with the local int n
int n=as<int>(n_);
return wrap(n);
}
If I run it with mpi=false and mc.cores=[some number of threads], the program runs beautifully, but if I run it with mpi=true, and therefore through snow, the program just hangs at the as<int>() call.
On the other hand if I define the C++ function as...
RcppExport SEXP CppFunction(SEXP n_) {
CharacterVector nn(n_);
int n=boost::lexical_cast<int>(nn[0]);
return wrap(n);
}
With this version the program runs perfectly on each MPI process. The problem is that this approach works for integers, doubles, etc., but not for matrices.
Also, I have to use lexical_cast from Boost to make it work, since as<> does not.
Does anybody know why this is, and what I am missing here, so that I can load my matrices as well?
It is not entirely clear from your question what you are doing, but I'd recommend that you
simplify: snow certainly works, and works with Rcpp as it does with other packages
trust packages: I have found parallel computing setups easier when all nodes have identical local package sets
be careful with threading: if you have trouble with explicit threading in the snow context, try it first without threading and then add it once the basic mechanics work
The issue was finally resolved. The problem seems to lie with getMPIcluster(), which works perfectly fine for pure R code but not as well with Rcpp, as explained above.
Using makeMPIcluster() instead:
mc.cores <- max(1, NumberOfNodes*CoresPerNode-1) # minus one for master
cl <- makeMPIcluster(mc.cores)
cat(sprintf("Running with %d workers\n", length(cl)))
clusterCall(cl, function() { library(MyPackage); NULL })
out<-clusterApply(cl,1:mc.cores,MyRFunction)
stopCluster(cl)
This works great! The drawback is that you have to define the number of nodes and the cores per node manually within the R code, instead of specifying them through the mpirun command.