I'm trying to run a NetLogo simulation (using the RNetLogo package) in R using parallel processing on my laptop. I'm trying to assess "t-feeding of females" using 3 different "minimum-separation" values (0, 25, and 50). For each "minimum-separation" value, I'd like to replicate the simulation 10 times. I can run everything correctly using lapply, but I'm having trouble with parLapply. I've just started using the "parallel" package, so I'm sure it is something in the syntax.
#Set up clusters for parallel
processors <- detectCores()
cl <- makeCluster(processors)
#Simulation
sim3 <- function(min_sep) {
NLCommand("set minimum-separation ", min_sep, "setup")
ret <- NLDoReport(720, "go", "[t-feeding] of females", as.data.frame=TRUE)
tot <- sum(ret[,1])
return(tot)
}
#Replicate simulations 10 times using lapply and create boxplots. This one works.
rep.sim3 <- function(min_sep, rep) {
return(
lapply(min_sep, function(min_sep) {
replicate(rep, sim3(min_sep))
})
)
}
d <- seq(0,50,25)
res <- rep.sim3(d,10)
boxplot(res,names=d, xlab="Minimum Separation", ylab="Time spent feeding")
#Replicate simulations 10 times using parLapply. This one does not work.
rep.sim3 <- function(min_sep, rep) {
return(
parLapply(cl, min_sep, function(min_sep) {
replicate(rep, sim3(min_sep))
})
)
}
d <- seq(0,50,25)
res <- rep.sim3(d,10)
# Error in checkForRemoteErrors(val) : 3 nodes produced errors; first error: could not find function "sim3"
#Replicate simulations 10 times using parLapply. This one does work but creates a list of the wrong length and therefore the boxplot cannot be plotted correctly.
rep.sim3 <- function(min_sep, rep) {
return(
parLapply(cl, replicate(rep, d), sim3))
}
d <- seq(0,50,25)
res <- rep.sim3(d,10)
Ideally I'd like to make the first parLapply work. Alternatively, I guess I could modify res from the parLapply that works so that the list has a length of 3 (one element per minimum-separation value) instead of 30. However, I can't seem to do that. Any help would be much appreciated!
Thanks in advance.
You need to initialize the cluster workers before executing rep.sim3. The error message indicates that your workers can't execute the sim3 function because you haven't exported it to them. Also, I noticed that you haven't loaded the RNetLogo package on the workers, either.
The easiest way to initialize the workers is with the clusterEvalQ and clusterExport functions:
clusterEvalQ(cl, library(RNetLogo))
clusterExport(cl, 'sim3')
Note that you shouldn't do this in your rep.sim3 function, since that would be inefficient and unnecessary. Do it just once after creating the cluster object and sim3 has been defined.
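To make the ordering concrete, here is a minimal sketch of the "initialize once, then call" pattern. It assumes a hypothetical stand-in function sim3_demo in place of sim3 (no NetLogo is involved), a two-worker cluster, and envir = environment() so the name is looked up wherever the script happens to run; none of these come from the original post.

```r
library(parallel)

# Hypothetical stand-in for sim3(): returns a noisy value, no NetLogo needed
sim3_demo <- function(min_sep) {
  min_sep + sum(runif(5))
}

cl <- makeCluster(2)   # create the cluster once

# Initialize the workers once, after sim3_demo has been defined.
# In the real script, clusterEvalQ(cl, library(RNetLogo)) would also go here.
clusterExport(cl, "sim3_demo", envir = environment())

rep_sim <- function(min_sep, rep) {
  # `rep` is handed to the worker function as an extra argument
  parLapply(cl, min_sep, function(ms, rep) replicate(rep, sim3_demo(ms)), rep)
}

d <- seq(0, 50, 25)
res <- rep_sim(d, 10)
stopCluster(cl)

lengths(res)   # one vector of replicates per separation value
```

Because the result keeps one list element per separation value, it has the same shape as the lapply version and can be passed to boxplot in the same way.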
This initialization is necessary because the workers started via makeCluster don't know anything about your variables or functions, or anything else about your R session. And parLapply doesn't analyze the function that you pass to it any more than lapply does. The difference is that lapply executes in your local R session where sim3 is defined and the RNetLogo package is loaded. parLapply executes the specified function in remote R sessions that have not been initialized by executing your R script.
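One way to see that the workers start as blank sessions is to ask them directly. The sketch below is an illustration only: helper is a hypothetical function name, and the two-worker cluster is an assumption.

```r
library(parallel)

helper <- function(x) x^2   # hypothetical function, defined on the master only

cl <- makeCluster(2)

# Ask each worker whether it knows `helper` before any export
before <- unlist(clusterEvalQ(cl, exists("helper")))

# Copy `helper` into each worker's global environment, then ask again
clusterExport(cl, "helper", envir = environment())
after <- unlist(clusterEvalQ(cl, exists("helper")))

res <- parLapply(cl, 1:4, function(i) helper(i))
stopCluster(cl)

before
after
```

Comparing before and after shows what the export changed on each worker.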
I tried to look for a duplicate question and I know many people have asked about parLapply in R so I apologize if I missed one that is applicable to my situation.
Problem: I have the following function that runs correctly in R, but when I try to run it in parallel using parLapply (I'm on a Windows machine) I get the error that the $ operator is invalid for atomic vectors. The error says that 3 nodes produced errors no matter how many nodes I give my cluster; for example, I have 8 cores on my desktop, so I set the cluster to 7 nodes.
Here is example code showing where the problem is:
library(parallel)
library(doParallel)
library(arrangements)
#Function
perms <- function(inputs)
{
x <- 0
L <- 2^length(inputs$w)
ip <- inputs$ip
for( i in 1:L)
{
y <- ip$getnext()%*%inputs$w
if (inputs$t >= y)
{
x <- x + 1
}
}
return(x)
}
#Inputs is a list of several other variables that are created before this
#function runs (w, t_obs and iperm), here is a reproducible example of them
#W is derived from my data, this is just an easy way to make a reproducible example
set.seed(1)
m <- 15
W <- matrix(runif(15,0,1))
iperm <- arrangements::ipermutations(0:1, m, replace = T)
t_obs <- 5
inputs <- list(W,t_obs, iperm)
names(inputs) <- c("w", "t", "ip")
#If I run the function not in parallel
perms(inputs)
#It gives a value of 27322 for this example data
This runs exactly as it should, however when I try the following to run in parallel I get an error
#make the cluster
cor <- detectCores()
cl<-makeCluster(cor-1,type="SOCK")
#passing library and arguments
clusterExport(cl, c("inputs"))
clusterEvalQ(cl, {
library(arrangements)
})
results <- parLapply(cl, inputs, perms)
I get the error:
Error in checkForRemoteErrors(val) :
3 nodes produced errors; first error: $ operator is invalid for atomic vectors
However I've checked to see if anything is an atomic vector using is.atomic(), and using is.recursive(inputs) it says this is TRUE.
My question is: why am I getting this error when I run this using parLapply when the function otherwise runs correctly, and is there a reason it says "3 nodes produced errors" even when I have 7 nodes?
It says "3 nodes" because, as you're passing it to parLapply, you are only activating three nodes. The first argument to parLapply should be a list of things, each element to pass to each node. In your case, your inputs is a list, correct, but it is being broken down, such that your three nodes are effectively seeing:
# node 1
perms(inputs[[1]]) # effectively inputs$w
# node 2
perms(inputs[[2]]) # effectively inputs$t
# node 3
perms(inputs[[3]]) # effectively inputs$ip
# nodes 4-7 idle
You could replicate this on the local host (not parallel) with:
lapply(inputs, perms)
and when you see it like that, perhaps it becomes a little more obvious what is being passed to your nodes. (If you want to look at it further, do debug(perms), then run the lapply above, and see what inputs looks like inside that function call.)
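The same effect can be reproduced without a cluster or the arrangements package. In this sketch the inputs list is a hypothetical stand-in (an empty list replaces the iterator), so only the structure matches the question.

```r
# Hypothetical stand-in for `inputs`; an empty list replaces the iterator
inputs <- list(w = matrix(runif(3)), t = 5, ip = list())

is.recursive(inputs)                       # the list as a whole is recursive
atomic_flags <- sapply(inputs, is.atomic)  # but its individual elements may be atomic

# What each call receives when lapply/parLapply iterates over `inputs`
seen <- lapply(inputs, function(piece) class(piece)[1])

# Using `$` on one of the atomic elements reproduces the error message
msg <- tryCatch(inputs[["t"]]$w, error = function(e) conditionMessage(e))
msg
```

So checking is.recursive(inputs) on the whole list says nothing about the individual pieces that the function is actually called with.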
To get this to work once on one node (I think not what you're trying to do), you could do
parLapply(cl, list(inputs), perms)
But that's only going to run one instance on one node. Perhaps you would prefer to do something like:
parLapply(cl, replicate(7, inputs, simplify=FALSE), perms)
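The difference between these calls is just the length of the outer list, which decides how many tasks there are. A small sketch with a hypothetical two-field inputs list, using lapply locally in place of parLapply:

```r
inputs <- list(w = c(0.2, 0.7), t = 1)   # hypothetical input with two fields

one_task <- list(inputs)                                # outer list of length 1
seven_tasks <- replicate(7, inputs, simplify = FALSE)   # outer list of length 7

c(length(inputs), length(one_task), length(seven_tasks))

# Each task now receives the complete `inputs` list, so `$` works inside it
res <- lapply(seven_tasks, function(inp) sum(inp$w) <= inp$t)
```

Note that replicating the same inputs seven times repeats the same work seven times; it does not divide one job among the workers.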
I'm adding an answer in case anyone with a similar problem comes across this. @r2evans answered my original question, which led to the realization that even fixing the above problems would not get me the desired result (see comments to his answer).
Problem: Using the package arrangements to generate a large number of combinations and apply a function to them. This becomes very time-consuming as the number of combinations gets huge. What we need to do is split the combinations into chunks, depending on the number of cores you will be using to run in parallel, and then have each node do the calculations only on its own chunk of the combinations.
Solution:
cor <- detectCores()-1
cl<-makeCluster(cor,type="SOCK")
set.seed(1)
m <- 15
W <- matrix(runif(15,0,1))
#iperm <- arrangements::ipermutations(0:1, m, replace = T)
t_obs <- 5
chunk_list <- list()
for (i in 1:cor)
{
chunk_list[i] <- i
}
chunk_size <- floor((2^m)/(cor))
chunk_size <- c(rep(chunk_size,cor-1), (2^m)-chunk_size*(cor-1))
inputs_list <- Map(list, t=list(t_obs), w=list(W), chunk_list = chunk_list, chunk_size = list(chunk_size))
#inputs <- list(W,t_obs, iperm)
#names(inputs) <- c("w", "t", "ip", "chunk_it")
perms <- function(inputs)
{
x <- 0
L <- 2^length(inputs$w)
ip <- arrangements::ipermutations(0:1, m, replace = T)
chunk_size <- floor((2^m)/(cor))
chunk_size <- c(rep(chunk_size,cor-1), (2^m)-chunk_size*(cor-1))
if (inputs$chunk_list !=1)
{
# skip the arrangements that belong to the earlier chunks
ip$getnext(sum(chunk_size[seq_len(inputs$chunk_list - 1)]))
}
for( i in 1:chunk_size[inputs$chunk_list])
{
y <- ip$getnext()%*%inputs$w
if (inputs$t >= y)
{
x <- x + 1
}
}
return(x)
}
clusterExport(cl, c("inputs_list", "m", "cor"))
clusterEvalQ(cl, {
library(arrangements)
})
system.time(results <- parLapply(cl, inputs_list, perms))
Reduce(`+`, results)
What I did was split the total number of combinations into chunks, i.e. the first 4681 (I have 7 nodes assigned to cor), the second 4681 and so on, and made sure I didn't miss any combinations. Then I changed my original function to generate the permutations in each node but to skip to the combination it should start calculating on, so node 1 starts with the first combination, node 2 starts with the 4682nd, and so on. I'm still working on optimizing this because it's currently only about 4 times as fast as running it sequentially, even though I'm using 7 cores. I think the skip in the permutation option will speed this up but I haven't checked yet. Hopefully this is helpful to someone else; it cuts my estimated time to run a simulation (with m = 25, not 15) from about 10 days to about 2.5 days.
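The chunk arithmetic described above can be checked on its own, without a cluster. This sketch assumes 7 workers and m = 15 as in the post; the names n_workers, offsets and first_index are introduced here for illustration.

```r
m <- 15
n_workers <- 7          # assumed worker count, as in the post
total <- 2^m            # number of arrangements to cover

# Equal-sized chunks, with the last chunk absorbing the remainder
base_size <- floor(total / n_workers)
chunk_size <- c(rep(base_size, n_workers - 1),
                total - base_size * (n_workers - 1))

# How many arrangements each worker skips before its own chunk begins
offsets <- c(0, cumsum(chunk_size)[-n_workers])
first_index <- offsets + 1

chunk_size
first_index
sum(chunk_size) == total   # every arrangement is covered exactly once
```

Computing the offsets once on the master and passing each worker its own offset would avoid recomputing chunk_size inside the worker function, though that is a design choice rather than something the post does.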
You need to pass dplyr to the nodes to solve this
clusterEvalQ(clust,{library (dplyr)})
The above code should solve your issue.
I'm currently developing an R package that will be using parallel computing to solve some tasks, through means of the "parallel" package.
I'm getting some really awkward behavior when utilizing clusters defined inside functions of my package, where the parLapply function assigns a job to a worker and waits for it to finish to assign a job to next worker.
Or at least this is what appears to be happening, through the observation of the log file "cluster.log" and the list of running processes in the unix shell.
Below is a mockup version of the original function declared inside my package:
.parSolver <- function( varMatrix, var1 ) {
no_cores <- detectCores()
#Rows in varMatrix
rows <- 1:nrow(varMatrix[,])
# Split rows in n parts
n <- no_cores
parts <- split(rows, cut(rows, n))
# Initiate cluster
cl <- makePSOCKcluster(no_cores, methods = FALSE, outfile = "/home/cluster.log")
clusterEvalQ(cl, library(raster))
clusterExport(cl, "varMatrix", envir=environment())
clusterExport(cl, "var1", envir=environment())
rParts <- parLapply(cl = cl, X = 1:n, fun = function(x){
part <- rasterize(varMatrix[parts[[x]],], raster(var1), .....)
print(x)
return(part)
})
do.call(merge, rParts)
}
NOTES:
I'm using makePSOCKcluster because I want the code to run on Windows and Unix systems alike, although this particular problem only manifests itself on a Unix system.
Functions rasterize and raster are defined in library(raster), exported to the cluster.
The weird part to me is that if I execute the exact same code of the function parSolver in the global environment, everything works smoothly: all workers take one job at the same time and the task completes in no time.
However if I do something like:
library(myPackage)
varMatrix <- (...)
var1 <- (...)
result <- parSolver(varMatrix, var1)
the described problem appears.
It appears to be a load balancing problem however that does not explain why it works ok in one situation and not in the other.
Am I missing something here?
Thanks in advance.
I don't think parLapply is running sequentially. More likely, it's just running inefficiently, making it appear to run sequentially.
I have a few suggestions to improve it:
Don't define the worker function inside parSolver
Don't export all of varMatrix to each worker
Create the cluster outside of parSolver
The first point is important, because as your example now stands, all of the variables defined in parSolver will be serialized along with the anonymous worker function and sent to the workers by parLapply. By defining the worker function outside of any function, the serialization won't capture any unwanted variables.
The second point avoids unnecessary socket I/O and uses less memory, making the code more scalable.
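The variable capture mentioned in the first point can be observed by serializing a function by hand. This is a rough sketch, assuming the code is run at top level (for example with Rscript); make_worker_fn, big, inner_fn and top_fn are hypothetical names.

```r
make_worker_fn <- function() {
  big <- numeric(1e6)    # stands in for a large object such as varMatrix
  function(x) x + 1      # closure: its environment still holds `big`
}

inner_fn <- make_worker_fn()   # worker function created inside another function
top_fn <- function(x) x + 1    # same body, defined at top level

# The closure's environment travels with it when it is serialized
ls(environment(inner_fn))

size_inner <- length(serialize(inner_fn, NULL))
size_top <- length(serialize(top_fn, NULL))
c(inner = size_inner, top = size_top)
```

The two functions have identical bodies, so any difference in serialized size comes from the environment attached to the one created inside make_worker_fn.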
Here's a fake, but self-contained example that is similar to yours that demonstrates my suggestions:
# Define worker function outside of any function to avoid
# serialization problems (such as unexpected variable capture)
workerfn <- function(mat, var1) {
library(raster)
mat * var1
}
parSolver <- function(cl, varMatrix, var1) {
parts <- splitIndices(nrow(varMatrix), length(cl))
varMatrixParts <- lapply(parts, function(i) varMatrix[i,,drop=FALSE])
rParts <- clusterApply(cl, varMatrixParts, workerfn, var1)
do.call(rbind, rParts)
}
library(parallel)
cl <- makePSOCKcluster(3)
r <- parSolver(cl, matrix(1:20, 10, 2), 2)
print(r)
Note that this takes advantage of the clusterApply function to iterate over a list of row-chunks of varMatrix so that the entire matrix doesn't need to be sent to everyone. It also avoids calls to clusterEvalQ and clusterExport, simplifying the code, as well as making it a bit more efficient.
Here, I am trying to translate the language of a text by using parallel processing in R. This is the first time I am using Parallel processing. My code is:
install.packages("RYandexTranslate")
install.packages("textcat")
install.packages("plyr")
install.packages("parallel")
library("RYandexTranslate")
library("textcat")
library("dplyr")
library("parallel")
api_key <- "trnsl.1.1.20160707T103515Z.90fa575d702ae81e.6ec78e064eb94a1c00a9bc506c615f223cf0cf5b"
cl <- makeCluster(4)
Query_L_German <- c("5 euro muenze stempelglanz","2 euro muenzen uebersicht")
Par_Conversion <- function(QUery_L_German)
{
for(i in 1:length(Query_L_German))
{
x <- translate(api_key,Query_L_German[i], "de-en")$text
return(x)
}
}
a <- length(Query_L_German)
parLapply(cl, seq(a), function(i,Query_L_German,Par_Conversion)
for(i in 1:length(Query_L_German)){
x <- Par_Conversion(Query_L_German)
return(x)
}, Query_L_German, Par_Conversion)
But, I am getting following error:
Error in checkForRemoteErrors(val) : 3 nodes produced errors; first
error: object 'Query_L_German' not found
When you use parLapply, you need to make the functions and variables used inside it available to the workers explicitly. This can be done by listing them in the varlist argument of clusterExport. Here is an in-depth question/answer on how to do this and other stuff with parLapply if you want to understand more.
Your example can be solved by inserting the following line before parLapply is used:
clusterExport(cl, varlist = c("api_key","Query_L_German","translate"))
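A self-contained version of the same idea, with no translation API involved, might look like the sketch below. The names prefix, queries and fake_translate are hypothetical stand-ins for api_key, Query_L_German and translate, and envir = environment() is an assumption so that the names are looked up where the script runs.

```r
library(parallel)

# Hypothetical stand-ins for api_key, Query_L_German and translate()
prefix <- "EN: "
queries <- c("eins", "zwei", "drei")
fake_translate <- function(key, text) paste0(key, toupper(text))

cl <- makeCluster(2)

# Copy every object the worker function refers to onto the workers
clusterExport(cl, varlist = c("prefix", "queries", "fake_translate"),
              envir = environment())

res <- parLapply(cl, seq_along(queries), function(i) {
  fake_translate(prefix, queries[i])   # one query per task, no inner loop
})
stopCluster(cl)

unlist(res)
```

Each task handles a single element, so the for loop from the question is not needed inside the worker function.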
I need to multi-thread my R application, as it takes 5 minutes to run and is only using 15% of the computer's available CPU.
An example of a process which takes a while to run is calculating the mean of a very large raster stack containing n layers:
mean = cellStats(raster_layers[[n]], stat='mean', na.rm=TRUE)
Using the parallel library, I can create a new cluster and pass a function to it:
cl <- makeCluster(8, type = "SOCK")
parLapply(cl, raster_layers[[1]], mean_function)
stopCluster(cl)
where mean_function is:
mean_function <- function(raster_object)
{
result = cellStats(raster_object, stat='mean', na.rm=TRUE)
return(result)
}
This method works fine except that it can't see the 'raster' package, which is required to use cellStats. So it fails, saying there is no function cellStats. I have tried including the library within the function but this doesn't help.
The raster package comes with a cluster function, and it CAN see the function cellStats, however as far as I can tell, the cluster function must return a raster object and must be passed a single raster object which isn't flexible enough for me, I need to be able to pass a list of objects and return a numeric variable... which I can do with normal clustering using the parallel library if only it can see the raster package functions.
So, does anybody know how I can pass a package to a node with multi-threading in R? Or, how I can return a single value from the raster cluster function perhaps?
The solution came from Ben Barnes, thank you.
The following code works fine:
mean_function <- function(variable)
{
result = cellStats(variable, stat='mean', na.rm=TRUE)
return(result)
}
cl <- makeCluster(procs, type = "SOCK")
clusterEvalQ(cl, library(raster))
result = parLapply(cl, a_list, mean_function)
stopCluster(cl)
Here procs is the number of worker processes you wish to use. In my case it equals the length of the list being passed (called a_list here), although parLapply does not require that: it splits the list across the available workers.
a_list needs to be a list of rasters on which cellStats can calculate the mean; in my case it contains procs rasters.
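The same clusterEvalQ pattern can be tried without raster installed. The sketch below swaps in the tools package (shipped with R but not attached by default) and a hypothetical strip_ext function; the five file names and the two-worker cluster are assumptions.

```r
library(parallel)

# Uses file_path_sans_ext() from the tools package without a tools:: prefix,
# so it only works where tools has been attached
strip_ext <- function(path) {
  file_path_sans_ext(path)
}

files <- list("a.tif", "b.tif", "c.tif", "d.tif", "e.tif")

cl <- makeCluster(2)
clusterEvalQ(cl, library(tools))        # attach the package on every worker
res <- parLapply(cl, files, strip_ext)  # five inputs shared between two workers
stopCluster(cl)

unlist(res)
```

Here the list is longer than the cluster, and parLapply splits it across the workers.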
My challenge is to parallel compute a recursive function. However, the recursion is quite deep, and therefore (in my own novice words) there is an issue with allocating a worker when all the workers are busy. In short, it crashes.
Here is some reproducible code. The code is very stupid, but the structure is what counts. This is a simplified version of what is going on.
I work on a Windows machine; if the solution is to go Linux, just say the word. Because the real function can be quite deep, managing the number of workers that are called for in the upper level will not solve the issue. Is there perhaps a way to know what level the recursion is at?
FUN <- function(optimizer,neighbors,considered,x){
considered <- c(considered,optimizer)
neighbors <- setdiff(x=neighbors,y=considered)
if (length(neighbors)==0) {
# this loop is STUPID, but it is just an example.
z <- numeric(10)
for (i in 1:100)
{
z[i] <- sample(x,1)
}
return(max(z))
} else {
# Something embarrassingly parallel,
# but cannot be vectorized.
z <- numeric(10)
z <- foreach(i=1:10, .combine='c') %dopar%{
FUN(optimizer=neighbors[1],neighbors=neighbors,
considered=considered,x=x)}
return(max(z))
}
}
require(doParallel,quietly=T)
cl <- makeCluster(3)
clusterExport(cl, c("FUN"))
registerDoParallel(cl)
getDoParWorkers()
>FUN(optimizer=1,neighbors=c(2),considered=c(),x=1:500)
[1] 500
>FUN(optimizer=1,neighbors=c(2,3),considered=c(),x=1:500)
Error in { : task 1 failed - "could not find function "%dopar%""
Is this error really because the recursion is too deep, or is it just because you haven't got require(doParallel) in your FUN function? If so, when FUN is called on the workers, that instance of R hasn't got the package attached.
Your first example doesn't hit this because it's simple enough not to reach the inner %dopar% loop.
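The mechanism can be illustrated with base R alone. This sketch is an analogy rather than the asker's code: it uses the parallel package instead of foreach, and toTitleCase from the tools package plays the role of %dopar%; both function names are hypothetical.

```r
library(parallel)

# Relies on tools being attached in whichever session runs it
title_no_pkg <- function(s) toTitleCase(s)

# Attaches the package itself, so it also works in a fresh worker session
title_with_pkg <- function(s) {
  require(tools, quietly = TRUE)
  toTitleCase(s)
}

words <- c("deep recursion", "nested loop")

cl <- makeCluster(2)
msg <- tryCatch(
  {
    parLapply(cl, words, title_no_pkg)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
ok <- parLapply(cl, words, title_with_pkg)
stopCluster(cl)

msg          # the worker-side failure reported back to the master
unlist(ok)
```

If the analogy holds, calling require(doParallel) at the top of FUN would play the same role as the require(tools) line here.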