When I use a for loop, I typically have if constructs with next and break statements; solving some problems just requires that. However, I am unable to use next and break statements with the foreach package. How can I use these statements inside a foreach looping structure?
The general idea when using the foreach package is that every iteration can be performed in parallel, so if you had N iterations and N CPUs you would get (ignoring communication overhead) perfect speed-up. Because the iterations are meant to be independent, there is no sequential flow to break out of.
So instead of using break, return an NA or 0 as early as possible. For example:
library("foreach")
f = function(i) if(i < 3) sqrt(i) else NA
foreach(i=1:5) %do% f(i)
Now you could argue that you have wasted resources for i=4 and i=5, but this amounts to nano/microseconds and your total computation is measured in seconds/minutes.
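If you want to discard those NA placeholders once the loop has returned, you can filter them out afterwards; a minimal sketch, reusing the f defined above:
library("foreach")
f <- function(i) if (i < 3) sqrt(i) else NA
res <- foreach(i = 1:5) %do% f(i)            # a list with NA placeholders where "break" would have fired
res <- Filter(function(x) !is.na(x), res)    # keep only the real results
The same filtering works unchanged when the loop is run with %dopar% on a registered parallel backend.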
Related
I am working with daily series of satellite images on a workstation with 64 cores.
For each image, I perform some algebra operations over all pixels using a foreach loop. Some testing revealed that the optimal number of cores for this foreach loop is 20.
This is roughly what I am doing now:
for (i in 1:length(number_of_daily_images)) {
# perform some pre-processing on each image
# register cluster to loop over pixels
registerDoParallel(20)
out <- foreach(j=1:length(number_of_pixels_in_each_image)) %dopar% {
# perform some calculations
} # end inner loop
} # end outer loop
I only have to load the satellite image once, so there is very little I/O processing involved in this code. So there is definitely room for speeding up this code even further. Since I am only using one third of the cores available on the computer, I would like to run three days simultaneously to save some precious time in my workflow.
Therefore, I was thinking about also parallelizing my outer loop. It would be something like this:
# register cluster to loop over images
registerDoParallel(3)
out2 <- foreach(i = 1:length(number_of_daily_images)) %dopar% {
# perform some pre-processing on each image
# register cluster to loop over pixels
registerDoParallel(20)
out1 <- foreach(j = 1:length(number_of_pixels_in_each_image)) %dopar% {
# perform some calculations
} # end inner loop
} # end outer loop
However, when I run this code I get an error saying that one of the variables involved in the processing within the inner loop does not exist. But it works fine with a "regular" outer for loop.
Therefore, my question is: can I use two nested %dopar% loops in foreach like I was planning? If not, is there any other alternative to also parallelize my outer loop?
Foreach maintainer here.
Use the %:% operator:
registerDoParallel(60)
out2 <- foreach(i = 1:length(number_of_daily_images)) %:%
foreach(j = 1:length(number_of_pixels_in_each_image)) %dopar% {
# perform some calculations
something(i, j)
}
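If you would rather have the result flattened into a matrix than nested lists, .combine can be given on both loops. A minimal sketch, assuming something(i, j) (the placeholder from the answer above) returns a single number:
library(doParallel)
registerDoParallel(60)   # one backend for the whole nest; %:% turns it into a single stream of tasks
out2 <- foreach(i = 1:length(number_of_daily_images), .combine = 'rbind') %:%
  foreach(j = 1:length(number_of_pixels_in_each_image), .combine = 'c') %dopar% {
    something(i, j)
  }
# out2 is now an images-by-pixels matrix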
I am having trouble understanding how to make my code parallel. My desire is to find 3 vectors out of a matrix of 20 that produce the closest linear regression to my measured variable (which means that there are a total of 1140 different combinations). Currently, I was able to use 3 nested foreach loops that return the best vectors. However, my desire is to make the outer loop (or all of them?) work in parallel. Any help would be appreciated!
Here is my code:
NIR <- matrix(rexp(80, rate = 0.01), ncol = 20, nrow = 4)  # matrix whose 20 columns are the candidate vectors
colnames(NIR) <- 1:20
S.measured <- c(7, 9, 11, 13)  # measured variable
bestvectors <- matrix(data = NA, ncol = 3 + 1, nrow = 1)   # 1 x 4 matrix to hold the best (i, j, k, error) found so far
###### Parallel stuff
library(foreach)
library(doParallel)
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)
registerDoParallel(cl)
numcols <- ncol(NIR)
# nested foreach loop to exhaustively find the best vectors
foreach(i=1:numcols) %:%
foreach(j=i:numcols) %:%
foreach(k=j:numcols) %do% {
if(i==j|i==k|j==k){ #To prevent same vector from being used twice
}
else{
lm <- lm(S.measured ~ NIR[, c(i, j, k)] - 1)  # fit the linear regression (no intercept)
S.pred <- as.matrix(lm$fitted.values)  # predicted vector to be compared with the measured one
error <- sqrt(sum(((S.pred - S.measured) / S.measured)^2))  # the error metric we want to minimize
#if the error is smaller than the last best one, it replaces it. If not nothing changes
if(error<as.numeric(bestvectors[1,3+1])|is.na(bestvectors[1,3+1])){
bestvectors[1,]<-c(colnames(NIR)[i],colnames(NIR)[j],colnames(NIR)[k],as.numeric(error))
bestvectors[,3+1]<-as.numeric(bestvectors[,3+1])
}
}
}
General advice for using foreach:
Use foreach(i=1:numcols) %dopar% { ... } if you would like your code to run on multiple cores. The %do% operator evaluates the loop sequentially on a single core; it is handy for debugging but gives no speed-up.
Processes spawned by %dopar% cannot communicate with each other while the loop is running. So set up your code so that each iteration returns an R object, like a data.frame or vector, and do the comparison afterwards. In your case, the logic in the if(error<as.numeric ... line should be executed sequentially (not in parallel) after your main foreach loop; see the sketch below.
Behavior of nested %dopar% loops is inconsistent across operating systems, and it is unclear how it spawns processes across cores. For best performance and portability, use a single foreach for the outermost loop and plain for loops within it.
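Putting those points together, the exhaustive search above could be rewritten as one flat %dopar% loop over all 1140 index combinations, with the error comparison done sequentially afterwards. This is only a sketch of that restructuring (it uses combn() to enumerate the combinations; variable names follow the question):
library(foreach)
library(doParallel)
NIR <- matrix(rexp(80, rate = 0.01), ncol = 20, nrow = 4)
colnames(NIR) <- 1:20
S.measured <- c(7, 9, 11, 13)
cl <- makeCluster(detectCores() - 1)
registerDoParallel(cl)
combos <- t(combn(ncol(NIR), 3))   # all C(20, 3) = 1140 combinations, one row each
# one flat parallel loop; each task returns a row c(i, j, k, error)
errors <- foreach(r = 1:nrow(combos), .combine = rbind) %dopar% {
  idx <- combos[r, ]
  fit <- lm(S.measured ~ NIR[, idx] - 1)
  err <- sqrt(sum(((fitted(fit) - S.measured) / S.measured)^2))
  c(idx, err)
}
stopCluster(cl)
# sequential comparison after the parallel work
bestvectors <- errors[which.min(errors[, 4]), ]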
I know that loops are slow in R and that I should try to do things in a vectorised manner instead.
But, why? Why are loops slow and apply is fast? apply calls several sub-functions -- that doesn't seem fast.
Update: I'm sorry, the question was ill-posed. I was confusing vectorisation with apply. My question should have been,
"Why is vectorisation faster?"
It's not always the case that loops are slow and apply is fast. There's a nice discussion of this in the May 2008 issue of R News:
Uwe Ligges and John Fox. R Help Desk: How can I avoid this loop or
make it faster? R News, 8(1):46-50, May 2008.
In the section "Loops!" (starting on pg 48), they say:
Many comments about R state that using loops is a particularly bad idea. This is not necessarily true. In certain cases, it is difficult to write vectorized code, or vectorized code may consume a huge amount of memory.
They further suggest:
Initialize new objects to full length before the loop, rather
than increasing their size within the loop. Do not do things in a
loop that can be done outside the loop. Do not avoid loops simply
for the sake of avoiding loops.
They have a simple example where a for loop takes 1.3 sec but apply runs out of memory.
Loops in R are slow for the same reason any interpreted language is slow: every
operation carries around a lot of extra baggage.
Look at R_execClosure in eval.c (this is the function called to call a
user-defined function). It's nearly 100 lines long and performs all sorts of
operations -- creating an environment for execution, assigning arguments into
the environment, etc.
Think how much less happens when you call a function in C (push args on to
stack, jump, pop args).
So that is why you get timings like these (as joran pointed out in the comment,
it's not actually apply that's being fast; it's the internal C loop in mean
that's being fast. apply is just regular old R code):
A = matrix(as.numeric(1:100000))
Using a loop: 0.342 seconds:
system.time({
Sum = 0
for (i in seq_along(A)) {
Sum = Sum + A[[i]]
}
Sum
})
Using sum: unmeasurably small:
sum(A)
It's a little disconcerting because, asymptotically, the loop is just as good
as sum; there's no practical reason it should be slow; it's just doing more
extra work each iteration.
So consider:
# 0.370 seconds
system.time({
I = 0
while (I < 100000) {
10
I = I + 1
}
})
# 0.743 seconds -- double the time just adding parentheses
system.time({
I = 0
while (I < 100000) {
((((((((((10))))))))))
I = I + 1
}
})
(That example was discovered by Radford Neal)
Why? Because ( in R is an operator, i.e. a function, and it actually requires a name lookup every time you use it:
> `(` = function(x) 2
> (3)
[1] 2
Or, in general, interpreted operations (in any language) have more steps. Of course, those steps provide benefits as well: you couldn't do that ( trick in C.
The only answer to the question posed is: loops are not slow if what you need to do is iterate over a set of data performing some function, and that function or operation is not vectorised. A for() loop will, in general, be as quick as apply(), but possibly a little slower than an lapply() call. That last point is well covered on SO, for example in this Answer, and applies when the code involved in setting up and operating the loop is a significant part of the overall computational burden of the loop.
The reason many people think for() loops are slow is that they, the users, are writing bad code. In general (though there are several exceptions), if you need to expand or grow an object, that involves copying, so you incur the overhead of both copying and growing the object. This is not restricted to loops, but if you copy/grow at each iteration of a loop, the loop will of course be slow because you are incurring many copy/grow operations.
The general idiom for using for() loops in R is that you allocate the storage you require before the loop starts, and then fill in the object thus allocated. If you follow that idiom, loops will not be slow. This is what apply() manages for you, but it is just hidden from view.
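A small illustration of that idiom: both loops below compute the same squared values, but the first grows the result on every iteration while the second fills a preallocated vector.
n <- 1e5
x <- rnorm(n)
# slow: res is copied and grown on every iteration
res <- numeric(0)
for (i in seq_len(n)) res <- c(res, x[i]^2)
# fast: allocate the full length once, then fill in place
res <- numeric(n)
for (i in seq_len(n)) res[i] <- x[i]^2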
Of course, if a vectorised function exists for the operation you are implementing with the for() loop, use that instead of the loop. Likewise, don't use apply() etc. if a vectorised function exists (e.g. apply(foo, 2, mean) is better performed via colMeans(foo)).
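For instance, both lines below compute column means, but the dedicated vectorised function skips the per-column R-level function calls that apply() makes:
foo <- matrix(rnorm(1e6), ncol = 100)
system.time(apply(foo, 2, mean))   # R-level loop over the columns
system.time(colMeans(foo))         # handled internally in C; typically much faster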
Just as a comparison (don't read too much into it!): I ran a (very) simple for loop in R and in JavaScript in Chrome and IE 8.
Note that Chrome does compilation to native code, and R with the compiler package compiles to bytecode.
# In R 2.13.1, this took 500 ms
f <- function() { sum<-0.5; for(i in 1:1000000) sum<-sum+i; sum }
system.time( f() )
# And the compiled version took 130 ms
library(compiler)
g <- cmpfun(f)
system.time( g() )
#Gavin Simpson: Btw, it took 1162 ms in S-Plus...
And the "same" code as JavaScript:
// In IE8, this took 282 ms
// In Chrome 14.0, this took 4 ms
function f() {
var sum = 0.5;
for(i=1; i<=1000000; ++i) sum = sum + i;
return sum;
}
var start = new Date().getTime();
f();
time = new Date().getTime() - start;
I am trying to get some nested loops to run faster in R (on Windows); the master loop runs through a large dataset (an 800000 x 3 matrix).
After trying to remove the temporary variables from the intermediate loops, I am now trying to get R to run the loop on the 4 cores of my machine instead of 1.
Thus I did the following:
install.packages('doSNOW')
library(doSNOW)
library(foreach)
c1<-makeCluster(4)
registerDoSNOW(c1)
foreach(k=1:length(big_data[,1])) %dopar% {
x<-big_data[k,1]
y<-big_data[k,2]
for (i in 1:length(data_2[,1])) {
if ( ... ) { # condition on x and y
new_data1<- …
new_data2<- …
new_data3<- …
for (j in 1:length(new_data3)) {
# do something
}
}
}
rm(new_data1)
rm(new_data2)
rm(new_data3)
gc()
}
stopCluster(c1)
My issue is that R keeps running, and when I stop the script manually after, say, 10 minutes, I still have k=1 (without any explicit errors from R). I can see while R runs that it is using the 4 cores fine.
In comparison, when I use a simple for loop instead of foreach, only 1 core is used but at least after 10min my indices k have increased, and results are being stored.
So it appears that either foreach is much slower than for (which doesn't make sense), or foreach just doesn't get into the inner loops for some reason?
Any ideas on how to overcome this problem would be appreciated.
When you stop execution, there is no single value of k to examine. A different k is passed to each of the nodes, so at the same moment in time, one node might be at k=3, and another might be at k=100. You don't have access to these different values of k. In fact, if you're using %dopar%, the k you get when you stop execution has nothing to do with the k in foreach: it's the same as the k you had before starting.
For example, try running this:
k <- 999
foreach(k=1:3) %dopar% { Sys.sleep(2) }
k
You'll get 999.
(On the other hand, if you were to try foreach(k=1:3) %do% { ... }, you'd get k=3, just as if you'd used k in a for loop.)
Your tasks are indeed running. You'll have to either wait it out or somehow speed up your (rather complex) loop.
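If the hard part is waiting without feedback, doSNOW (unlike doParallel) can report progress to the master through a callback, so you can at least see that tasks are completing. A minimal sketch, with the real per-row work replaced by a short sleep:
library(doSNOW)
cl <- makeCluster(4)
registerDoSNOW(cl)
pb <- txtProgressBar(max = 100, style = 3)
opts <- list(progress = function(n) setTxtProgressBar(pb, n))   # called with the number of finished tasks
res <- foreach(k = 1:100, .options.snow = opts) %dopar% {
  Sys.sleep(0.05)   # stand-in for the real per-row work
  k
}
close(pb)
stopCluster(cl)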