I am having trouble understanding how to make my code parallel. I want to find the 3 columns, out of a matrix of 20, whose linear regression fits my measured variable most closely (1140 combinations in total). Currently I use 3 nested foreach loops that return the best vectors, but I would like the outer loop (or all of them?) to run in parallel. Any help would be appreciated!
Here is my code:
library(doParallel)

NIR <- matrix(rexp(80, rate = 0.01), ncol = 20, nrow = 4)  # matrix with 20 candidate vectors (4 observations each)
colnames(NIR) <- 1:20
S.measured <- c(7, 9, 11, 13)                             # measured variable
bestvectors <- matrix(data = NA, ncol = 3 + 1, nrow = 1)  # stores the best 3 vectors plus their error

###### Parallel stuff
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
registerDoParallel(cl)
numcols <- ncol(NIR)

# nested foreach loop to exhaustively search for the best vectors
foreach(i = 1:numcols) %:%
  foreach(j = i:numcols) %:%
    foreach(k = j:numcols) %do% {
      if (i == j | i == k | j == k) {
        # skip: prevents the same vector from being used twice
      } else {
        fit <- lm(S.measured ~ NIR[, c(i, j, k)] - 1)  # linear regression without intercept
        S.pred <- as.matrix(fit$fitted.values)         # predicted vector to compare with the measured one
        error <- sqrt(sum(((S.pred - S.measured) / S.measured)^2))  # the error we want to minimize
        # if the error is smaller than the best so far, replace it; otherwise nothing changes
        if (is.na(bestvectors[1, 3 + 1]) | error < as.numeric(bestvectors[1, 3 + 1])) {
          bestvectors[1, ] <- c(colnames(NIR)[i], colnames(NIR)[j], colnames(NIR)[k], as.numeric(error))
        }
      }
    }
General advice for using foreach:
Use foreach(i = 1:numcols) %dopar% { ... } if you want the loop to run on multiple cores. The %do% operator evaluates the loop sequentially in the current R session, so your code above never actually uses the cluster you registered.
Processes spawned by %dopar% cannot communicate with each other while the loop is running. So, write the loop body to return an R object, like a data.frame or vector, and do the comparison afterwards; in your case, the logic in the if(error < ...) line should be executed sequentially (not in parallel) after your main foreach loop.
Behavior of nested %dopar% loops is inconsistent across operating systems, and it is unclear how they spawn processes across cores. For best performance and portability, use a single foreach for the outermost loop and vanilla for loops within it, as in the sketch below.
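Applied to the question above, a minimal sketch of that restructuring (reusing NIR and S.measured from the question; the bounds i < j < k replace the duplicate checks, and each task returns a data.frame for the sequential comparison at the end):
library(doParallel)

no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
registerDoParallel(cl)

numcols <- ncol(NIR)

# one parallel outer loop; plain for loops inside; each task returns its rows
all_errors <- foreach(i = 1:(numcols - 2), .combine = rbind) %dopar% {
  out <- NULL
  for (j in (i + 1):(numcols - 1)) {
    for (k in (j + 1):numcols) {
      fit <- lm(S.measured ~ NIR[, c(i, j, k)] - 1)
      error <- sqrt(sum(((fit$fitted.values - S.measured) / S.measured)^2))
      out <- rbind(out, data.frame(i = i, j = j, k = k, error = error))
    }
  }
  out
}
stopCluster(cl)

# sequential comparison after the parallel loop
bestvectors <- all_errors[which.min(all_errors$error), ]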
Related
I am working with daily series of satellite images on a workstation with 64 cores.
For each image, I perform some algebra operations over all pixels using a foreach loop. Some testing revealed that the optimal number of cores for this foreach loop is 20.
This is roughly what I am doing now:
for (i in 1:length(number_of_daily_images)) {
  # perform some pre-processing on each image

  # register cluster to loop over pixels
  registerDoParallel(20)
  out <- foreach(j = 1:length(number_of_pixels_in_each_image)) %dopar% {
    # perform some calculations
  } # end inner loop
} # end outer loop
I only have to load each satellite image once, so there is very little I/O involved, and there is definitely room to speed this code up further. Since I am only using about one third of the cores available on the computer, I would like to run three days simultaneously to save some precious time in my workflow.
Therefore, I was thinking about also parallelizing my outer loop. It would be something like this:
# register cluster to loop over images
registerDoParallel(3)
out2 <- foreach(i = 1:length(number_of_daily_images)) %dopar% {
  # perform some pre-processing on each image

  # register cluster to loop over pixels
  registerDoParallel(20)
  out1 <- foreach(j = 1:length(number_of_pixels_in_each_image)) %dopar% {
    # perform some calculations
  } # end inner loop
} # end outer loop
However, when I run this code I get an error saying that one of the variables used inside the inner loop does not exist. It works fine with a "regular" outer for loop.
Therefore, my question is: can I use two nested %dopar% loops in foreach like I was planning? If not, is there any other alternative to also parallelize my outer loop?
Foreach maintainer here.
Use the %:% nesting operator: it turns the two loops into a single stream of tasks, so one registered backend can keep all of the workers busy:
registerDoParallel(60)
out2 <- foreach(i = 1:length(number_of_daily_images)) %:%
  foreach(j = 1:length(number_of_pixels_in_each_image)) %dopar% {
    # perform some calculations
    something(i, j)
  }
Well, I don't think anyone understood the question...
I have a dynamic script. Sometimes, it will iterate through a list of 10 things, and sometimes it will only iterate through 1 thing.
I want to use foreach to run the script in parallel when there is more than one item to iterate through, using 1 worker per item. So, if there are 5 things, I will parallelize across 5 workers.
My question is: what happens when the list to iterate through has only 1 item?
Is it better NOT to run in parallel and let the machine maximize throughput? Or can I have my script assign 1 worker, and will it run the same as if I had not told it to run in parallel at all?
So let's call the number of things you are iterating over iter, which you can set dynamically for different processes.
Scripting the parallelization might look something like this:
if (iter == 1) {
  Result <- # some function
} else {
  cl <- makeCluster(iter)
  registerDoParallel(cl)
  Result <- foreach(z = 1:iter) %dopar% {
    # some function
  }
  stopCluster(cl)
}
Here, if iter is 1 it will not invoke parallelization; otherwise it assigns workers dynamically according to iter. Note that if you intend to embed this in a function, makeCluster and registerDoParallel work fine inside functions; just make sure the cluster is stopped even if an error occurs, for example with on.exit(stopCluster(cl)), as in the sketch below.
Alternatively, you can register as many workers as you have cores, run the foreach dynamically, and the unused workers will simply remain idle.
EDIT: It is better NOT to run in parallel if you have only one item to iterate through, if only to avoid the extra time incurred by makeCluster(), registerDoParallel() and stopCluster(). But the difference will be small compared to going parallel with one worker. I modified the code above, adding a conditional to screen for the case of just one worker. Please provide feedback below if you need further assistance.
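A sketch of the same logic wrapped in a function, assuming iter is the item count; run_dynamic and run_one are hypothetical names, with run_one standing in for "# some function":
library(doParallel)

run_dynamic <- function(iter, run_one) {
  if (iter == 1) {
    return(list(run_one(1)))  # single item: skip cluster overhead entirely
  }
  cl <- makeCluster(iter)
  registerDoParallel(cl)
  on.exit(stopCluster(cl))    # cluster is torn down even if an error occurs
  foreach(z = 1:iter) %dopar% run_one(z)
}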
I am running a process in parallel using the doParallel/foreach backend in R. I register a cluster of 20 cores and run the process about 100 times. I pass a matrix to each iteration of the parallel process, and in the sub-process I replace the matrix with a random sample of its own rows. What I'm wondering is: should I expect this modification to persist across subsequent iterations handled by the same child process? E.g., when child process 1 finishes its first iteration, does it start its second iteration with the original matrix, or with the random sample?
A minimal example:
library(doParallel)
X <- matrix(1:400, ncol = 4)
cl <- makeCluster(2)
clusterExport(cl, 'X')
registerDoParallel(cl)
results <- foreach(i = 1:100) %dopar% {
  set.seed(12345)
  X <- X[sample.int(nrow(X), replace = TRUE), ]
  X
}
EDIT:
To be clear, if the object does persist across iterations on the same worker, that is not my desired behavior. Rather, I want each iteration to take a fresh random sample of the original matrix, not a random sample of the most recent sample. (I recognize that in my minimal example the fixed seed would moreover produce the same sample of the original matrix every time; in my actual application I deal with this.)
Side effects within a cluster worker that persist across iterations of a foreach loop are possible, but that is not a supported feature of foreach. Programs that take advantage of it probably won't be portable to different parallel backends, and may not work with newer versions of the software. In fact, I tried to make that kind of side effect impossible when I first wrote foreach, but I eventually gave up.
Note that in your case, you're not modifying the copy of X that was explicitly exported to the workers: you're modifying a copy that was auto-exported to the workers by doParallel. That has probably been a source of confusion to you.
If you really want to do this, I suggest that you turn off auto-exporting of X and then modify the explicitly exported copy so that the program should be well defined and portable, although a bit ugly. Here's an example:
library(doParallel)
cl <- makePSOCKcluster(2)
registerDoParallel(cl)
X <- matrix(0, nrow=4, ncol=4)
clusterExport(cl, 'X')  # explicitly export X to each worker
# give each worker a persistent ID in its global environment
ignore <- clusterApply(cl, seq_along(cl), function(i) ID <<- i)
results <-
  foreach(i=1:4, .noexport='X') %dopar% {  # turn off auto-export of X
    X[i,] <<- ID  # modify the explicitly exported copy on this worker
    X
  }
finalresults <- clusterEvalQ(cl, X)
results contains the matrices after each task, and finalresults contains the matrices on each of the workers after the foreach loop has completed.
Update
In general, the body of the foreach loop shouldn't modify any variable that is outside of the foreach loop. I only modify variables that I created previously in the same iteration of the foreach loop. If you want to make a modified version that is only used within that iteration, use a different variable name.
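Following that advice, the behavior the EDIT above asks for (a fresh random sample of the original matrix on every iteration) falls out of binding the sample to a new name inside the loop body; a minimal sketch, with the fixed seed from the minimal example omitted:
library(doParallel)

cl <- makeCluster(2)
registerDoParallel(cl)

X <- matrix(1:400, ncol = 4)

results <- foreach(i = 1:100) %dopar% {
  # bind the sample to a new name: X itself is never modified,
  # so every iteration samples from the original matrix
  Xi <- X[sample.int(nrow(X), replace = TRUE), ]
  Xi
}

stopCluster(cl)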
I'm having some trouble getting some nested foreach loops to run in parallel. Here's the situation:
This program essentially performs hypothesis tests using different numbers of observations and different levels of statistical significance. I have four nested foreach loops. The item data.structures is a list of matrices on which the tests are performed. I use two different lists for data.structures: one contains 243 matrices (the small list), and the other contains 19,683 (the large list).
number.observations = c(50,100,250,500,1000)
significance.levels = c(.001,.01,.05,.1,.15)

require(foreach)
require(doParallel)
cl = makeCluster(detectCores())
registerDoParallel(cl)

results = foreach(data = data.structures, .inorder = FALSE, .combine = 'rbind') %:%
  foreach(iter = 1:iterations, .inorder = FALSE, .combine = 'rbind') %:%
    foreach(observations = number.observations, .inorder = FALSE, .combine = 'rbind') %:%
      foreach(alpha = significance.levels, .inorder = FALSE, .combine = 'rbind') %dopar% {
        # SOME FUNCTIONS HERE
      }
When I use the small list of matrices for data.structures, I can see all of the cores fully utilized (100 percent CPU usage) in Windows' Resource Monitor, with six threads for each of the eight processes, and the job completes as expected in a much shorter time. When I switch to the larger list, however, the processes are initiated and appear in the Processes section of the Resource Monitor, but each of the eight processes shows three threads with no CPU activity. Total CPU usage is approximately 12 percent.
I'm new to parallelization in R. Even when I simplify the problem and the functions, I can only get the program to run in parallel with the small list. From my own reading, I wonder if this is a workload-distribution issue. I've included the .inorder = FALSE option to try to work around this, to no avail. I'm fairly certain this program is a good candidate for parallelization, because it performs the same task hundreds of thousands of times and the loops don't depend on previous values.
Any help is tremendously appreciated!
Similar issues happened in my code too.
y <- foreach(a = seq(1, 500, 1), .combine = 'rbind') %:%
  foreach(b = seq(1, 10, 1), .combine = 'rbind') %:%
    foreach(c = seq(1, 20, 1), .combine = 'rbind') %:%
      foreach(d = seq(1, 50, 1), .combine = 'rbind') %do% {
        data.frame(a, b, c, d)
      }
A very simple set of nested foreach loops: it executes, but not in parallel. The giveaway is the final operator: %do% runs sequentially, so any registered backend is never used.
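Consistent with the advice earlier on this page, switching the innermost operator to %dopar% and registering a backend should parallelize it; a minimal sketch, with the worker count as an assumption:
library(doParallel)

cl <- makeCluster(detectCores() - 1)  # assumed worker count
registerDoParallel(cl)

# %:% chains the loops into one task stream; only the innermost
# operator decides sequential (%do%) vs. parallel (%dopar%)
y <- foreach(a = seq(1, 500, 1), .combine = 'rbind') %:%
  foreach(b = seq(1, 10, 1), .combine = 'rbind') %:%
    foreach(c = seq(1, 20, 1), .combine = 'rbind') %:%
      foreach(d = seq(1, 50, 1), .combine = 'rbind') %dopar% {
        data.frame(a, b, c, d)
      }

stopCluster(cl)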
I am trying to get some nested loops to run faster in R (on Windows), with the master loop running through a large dataset (an 800000 x 3 matrix).
After trying to remove temporary variables from the intermediate loops, I am now trying to get R to run the loop on the 4 cores of my machine instead of 1.
Thus I did the following:
install.packages('doSNOW')
library(doSNOW)
library(foreach)

c1 <- makeCluster(4)
registerDoSNOW(c1)

foreach(k = 1:length(big_data[,1])) %dopar% {
  x <- big_data[k, 1]
  y <- big_data[k, 2]
  for (i in 1:length(data_2[,1])) {
    if (cond) {  # some condition on x and y
      new_data1 <- …
      new_data2 <- …
      new_data3 <- …
      for (j in 1:length(new_data3)) {
        # do something
      }
    }
  }
  rm(new_data1)
  rm(new_data2)
  rm(new_data3)
  gc()
}
stopCluster(c1)
My issue is that R keeps running, and when I stop the script manually after, say, 10 minutes, I still have k = 1 (without getting any explicit errors from R). I can see while R runs that it is using the 4 cores fine.
In comparison, when I use a simple for loop instead of foreach, only 1 core is used, but at least after 10 minutes my index k has increased and results are being stored.
So it appears that either foreach is much slower than for (which doesn't make sense), or foreach just doesn't get into the other loops for some reason?
Any ideas on how to overcome this problem would be appreciated.
When you stop execution, there is no single value of k to examine. A different k is passed to each of the nodes, so at the same moment in time, one node might be at k=3, and another might be at k=100. You don't have access to these different values of k. In fact, if you're using %dopar%, the k you get when you stop execution has nothing to do with the k in foreach: it's the same as the k you had before starting.
For example, try running this:
k <- 999
foreach(k=1:3) %dopar% { Sys.sleep(2) }
k
You'll get 999.
(On the other hand, if you were to try foreach(k=1:3) %do% { ... }, you'd get k=3, just as if you'd used k in a for loop.)
Your tasks are indeed running. You'll have to either wait it out or somehow speed up your (rather complex) loop.
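If the worry is simply that you cannot tell whether anything is happening, doSNOW (unlike doParallel) supports a progress callback via the .options.snow argument; a minimal sketch, assuming big_data as in the question (nrow(big_data) is equivalent to length(big_data[,1])):
library(doSNOW)

c1 <- makeCluster(4)
registerDoSNOW(c1)

# progress bar updated after each completed task
pb <- txtProgressBar(max = nrow(big_data), style = 3)
opts <- list(progress = function(n) setTxtProgressBar(pb, n))

res <- foreach(k = 1:nrow(big_data), .options.snow = opts) %dopar% {
  # ... loop body as in the question ...
}

close(pb)
stopCluster(c1)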