Call function on its own output, N times - r

I want to run a function N times, with its input being the output it produced in the previous iteration. Here's a manual example (with N=3):
fun <- function(data) {
  x <- data$x
  y <- data$y
  new_x <- x + y
  new_y <- x * y
  list(x = new_x, y = new_y)
}
# Initialise:
data <- list(x = 2, y = 3)
#Run N times:
data <- fun(data)
data <- fun(data)
data <- fun(data)
Is there a simple/fast way to do this, without using slow loops?

Is there a simple/fast way to do this
Yes, this is a trivial loop:
N <- 3
for (i in 1:N) {
  data <- fun(data)
}
without using slow loops?
This is not slow.
Loops in R are slower than vectorized operations, but since each iteration here depends on the previous result, the computation cannot be vectorized. With R's JIT compilation (enabled by default for many years now), a for loop will likely be faster than the common ways of hiding loops in R, such as the *apply functions. In any case, it is difficult to make most of the *apply functions feed each iteration's output into the next, which is exactly what is needed here.
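If you prefer a loop-free expression for readability (it will not be faster), base R's Reduce() can thread the state through the iterations. A minimal sketch:
# Call fun() N times, feeding each result back in; the second argument of the
# anonymous function (the iteration index) is ignored
N <- 3
data <- list(x = 2, y = 3)
data <- Reduce(function(d, i) fun(d), seq_len(N), init = data)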

Related

Convert R apply statement to lapply for parallel processing

I have the following R "apply" statement:
for (i in 1:NROW(dataframe_stuff_that_needs_lookup_from_simulation))
{
  matrix_of_sums[, i] <-
    apply(simulation_results[, colnames(simulation_results) %in%
                                 dataframe_stuff_that_needs_lookup_from_simulation[i, ]], 1, sum)
}
So, I have the following data structures:
simulation_results: A matrix with column names that identify every possible piece of desired simulation lookup data for 2000 simulations (rows).
dataframe_stuff_that_needs_lookup_from_simulation: Contains, among other items, fields whose values match the column names in the simulation_results data structure.
matrix_of_sums: When the function is run, a 2000 row x 250,000 column (# of simulations x items being simulated) structure meant to hold simulation results.
So the apply call is, for each of the 250,000 rows of the lookup data frame, selecting the matching columns of simulation_results, computing the sum across each simulation, and storing it in the matrix_of_sums structure.
Unfortunately, this processing takes a very long time. I have explored the use of rowSums as an alternative, and it has cut the processing time in half, but I would like to try multi-core processing to see if that cuts processing time even more. Can someone help me convert the code above to "lapply" from "apply"?
Thanks!
With base R's parallel package, try:
library(parallel)
cl <- makeCluster(detectCores())
# Export the objects the workers need (a PSOCK cluster does not share memory)
clusterExport(cl, c("simulation_results",
                    "dataframe_stuff_that_needs_lookup_from_simulation"))
matrix_of_sums <- parLapply(cl, 1:nrow(dataframe_stuff_that_needs_lookup_from_simulation),
                            function(i)
  rowSums(simulation_results[, colnames(simulation_results) %in%
                                 dataframe_stuff_that_needs_lookup_from_simulation[i, ]]))
stopCluster(cl)
ans <- Reduce("cbind", matrix_of_sums)
You could also try foreach with %dopar%:
library(doParallel)  # will load parallel, foreach, and iterators
cl <- makeCluster(detectCores())
registerDoParallel(cl)
matrix_of_sums <- foreach(i = 1:NROW(dataframe_stuff_that_needs_lookup_from_simulation)) %dopar% {
  rowSums(simulation_results[, colnames(simulation_results) %in%
                                 dataframe_stuff_that_needs_lookup_from_simulation[i, ]])
}
stopCluster(cl)
ans <- Reduce("cbind", matrix_of_sums)
I wasn't quite sure how you wanted your output at the end, but it looks like you're doing a cbind of each result. Let me know if you're expecting something else however.
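As a side note, when the list has many elements, binding everything in one call should be faster than folding pairwise with Reduce(), since it avoids repeated copying:
# Bind all of the 2000-row chunks at once; the resulting matrix is the same
ans <- do.call(cbind, matrix_of_sums)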
Without really having any applicable or sample data to go off of, the process would look like this:
Create a holding matrix (matrix_of_sums)
Loop by row through the variable table (dataframe_stuff_that_needs_lookup_from_simulation)
Find the matching indices within the simulation model (simulation_results)
Bind the rowSums into the holding matrix (matrix_of_sums)
I recreated a sample set, which is meaningless but produces identical results, and it should work for your data:
# Holding matrix which will be our end-goal
msums <- matrix(nrow = 2000, ncol = 0)
# Loop: each worker returns its rowSums vector; note that assigning into
# msums with `<<-` inside mclapply() would be lost, because the assignment
# happens in a forked child process, so we collect the results and bind them
sum_list <- parallel::mclapply(1:nrow(ts_df), function(i) {
  # Store the row to its own variable for ease
  d <- ts_df[i, ]
  rowSums(
    sim_df[, which(colnames(sim_df) %in% colnames(d))]
  )
}, mc.cores = parallel::detectCores(), mc.allow.recursive = TRUE)
# cbind the collected results into the holding matrix
msums <- do.call(cbind, sum_list)

Speed up for loop assigning data to matrix in R

I am simulating data and filling a matrix using a for loop in R. Currently the loop is running slower than I would like. I've done some work to vectorize some of the variables to improve the loop's speed, but it is still taking some time. I believe the
mat[j,year] <- sum(vec==1)/x
part of the loop is slowing things down. I've looked into filling matrices more efficiently but could not find anything to help my current problem. Eventually this will be used as part of a Shiny app, so all of the variables I assign will need to be easily assigned different values.
Any advice to speed up the loop or more efficiently write this loop would be greatly appreciated.
Here is the loop:
# These variables are all specified because they need to change with different simulations
num.sims <- 20
time <- 50
mat <- matrix(nrow = num.sims, ncol = time)
x <- 1000
init <- 0.5 * x
vec <- vector(length = x)
ratio <- 1
freq <- -0.4
freq.vec <- numeric(nrow(mat))
## start a loop
for (j in 1:num.sims) {
  vec[1:init] <- 1; vec[(init + 1):x] <- 2
  year <- 2
  freq.vec[j] <- sum(vec == 1) / x
  for (i in 1:(x * (time - 1))) {
    freq.1 <- sum(vec == 1) / x; freq.2 <- 1 - freq.1
    fit.ratio <- exp(freq * (freq.1 - 0.5) + log(ratio))
    Pr.1 <- fit.ratio * freq.1 / (fit.ratio * freq.1 + freq.2)
    vec[ceiling(x * runif(1))] <- sample(c(1, 2), 1, prob = c(Pr.1, 1 - Pr.1))
    ## record data
    if (i %% x == 0) {
      mat[j, year] <- sum(vec == 1) / x
      year <- year + 1
    }
  }
}
The inner loop is what is slowing you down. You're doing x iterations to update each cell in the matrix. Since each modification of vec depends on the previous iteration, this would be difficult to simplify. @Andrew Feierman is probably correct that this would benefit from being moved to C++, at least the four lines before the if statement.
Alternatively, this only takes 10-20 seconds to run. Unless you're going to scale this up or run it many times, it might not be worth the trouble to speed it up. If you do keep it as is, you could put a progress bar in Shiny to let the user know things are still working.
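A minimal sketch of that progress bar, assuming the loop runs inside a Shiny reactive context (e.g. an observeEvent() in the server function):
# Wrap the outer simulation loop and advance the bar once per simulation
withProgress(message = "Running simulations", value = 0, {
  for (j in 1:num.sims) {
    # ... inner simulation loop from the question goes here ...
    incProgress(1 / num.sims, detail = paste("simulation", j))
  }
})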
Depending on how often you will need to call this loop, it could be worth rewriting it in C++. R's own internals are written in C, and compiled C++ code called from R will run many, many times faster than even efficient R code.
Rcpp's sourceCpp() is a good place to start: https://www.rdocumentation.org/packages/Rcpp/versions/0.12.11/topics/sourceCpp
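Here is a rough, untested sketch of what that could look like; sim_one_row() is a made-up name, and it mirrors the question's inner loop, recording the type-1 frequency every x steps:
library(Rcpp)
cppFunction('
NumericVector sim_one_row(int x, int time, double freq, double ratio) {
  // start with half 1s and half 2s, as in the question (init = 0.5 * x)
  NumericVector vec(x);
  for (int k = 0; k < x; ++k) vec[k] = (k < x / 2) ? 1.0 : 2.0;
  NumericVector out(time, NA_REAL);
  int year = 1;                       // 0-based index: first recorded column is 2
  for (int i = 1; i <= x * (time - 1); ++i) {
    double freq1 = 0.0;
    for (int k = 0; k < x; ++k) if (vec[k] == 1.0) freq1 += 1.0;
    freq1 /= x;
    double freq2 = 1.0 - freq1;
    double fit_ratio = std::exp(freq * (freq1 - 0.5) + std::log(ratio));
    double pr1 = fit_ratio * freq1 / (fit_ratio * freq1 + freq2);
    int idx = (int) std::ceil(x * R::runif(0.0, 1.0)) - 1;  // pick a random individual
    vec[idx] = (R::runif(0.0, 1.0) < pr1) ? 1.0 : 2.0;      // resample its type
    if (i % x == 0) {                 // record the updated frequency once per "year"
      double f = 0.0;
      for (int k = 0; k < x; ++k) if (vec[k] == 1.0) f += 1.0;
      out[year] = f / x;
      year = year + 1;
    }
  }
  return out;
}
')
# Hypothetical usage: fill each row of mat with one simulation
# for (j in 1:num.sims) mat[j, ] <- sim_one_row(x, time, freq, ratio)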

Which function inside a loop is more efficient (ncol/nrow() or dim())

In an exercise attempt I am trying to create a multiplication table using a for loop. I am new to programming, and R is the first language I am learning, so I would like to know which functions inside loops are faster and more efficient. For now, I am not using methods of the apply family because I think that understanding basic constructs like loops is important.
Here are two ways that I use to create a multiplication table:
Using dim() function:
mtx <- matrix(nrow = 10, ncol = 10)
for (i in 1:dim(mtx)[1]) {
  for (j in 1:dim(mtx)[2]) {
    mtx[i, j] <- i * j
  }
}
Using ncol/nrow() function:
mtx <- matrix(nrow = 10, ncol = 10)
for (i in 1:ncol(mtx)) {
  for (j in 1:nrow(mtx)) {
    mtx[i, j] <- i * j
  }
}
Which way is more efficient and generally better to use?
Thank you
If you use the functions like you do in your example, the difference is really negligible. This is because the functions get called only once per loop definition (not once per iteration!).
I would definitely prefer ncol/nrow because it's much easier to read than dim(x)[1].
That being said, if you just go for the timings, the dim function is faster than ncol/nrow. If you look at the source code, you can see that ncol is implemented as
function (x)
dim(x)[2L]
which means that ncol calls dim and is therefore marginally slower.
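If you want to see the (tiny) per-call difference yourself, a quick benchmark along these lines will show it, assuming the microbenchmark package is installed:
# Compare the per-call overhead of dim() vs nrow() on a small matrix
library(microbenchmark)
mtx <- matrix(nrow = 10, ncol = 10)
microbenchmark(dim(mtx)[1], nrow(mtx), times = 10000)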
If you really want to save some speed with big matrices, I would suggest creating the loop vectors beforehand, like this:
rows <- 1:nrow(mtx)
cols <- 1:ncol(mtx)
for (i in rows) {
  for (j in cols) {
    mtx[i, j] <- i * j
  }
}
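As a side note, seq_len() is a slightly safer way to build those index vectors, because it returns an empty sequence rather than c(1, 0) when a dimension happens to be zero:
# seq_len(0) is integer(0), whereas 1:0 is c(1, 0) and would still run the loop
rows <- seq_len(nrow(mtx))
cols <- seq_len(ncol(mtx))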

Need help on combine function in a Parallel Simulation study using doMC

I want to ask for some help on writing a combine function for foreach(). Consider the function below:
library(mvtnorm)
library(doMC)
mySimFunc <- function() {
  myNum <- runif(1)
  myVec <- rnorm(10)
  myMat <- rmvnorm(5, rep(0, 3), diag(3))
  myListRslt <- list("myNum" = myNum, "myVec" = myVec, "myMat" = myMat)
  return(myListRslt)
}
Now I'd like to run the code above for 1000 times using foreach() %dopar% and in each iteration I'd like to:
return myNum as is
get the average of myVec and return it
get the colMeans() of myMat and return it.
I'd like foreach() %dopar% to return a final list including:
a vector of length 1000 containing the 1000 myNum values, one per iteration
a vector of length 1000 containing the 1000 averages of myVec, one per iteration
a matrix with 1000 rows, where each row holds the colMeans of myMat from that iteration
My Ideal solution
My ideal solution is to find a way to make foreach() act exactly like for, so that I can simply define:
myNumRslt <- NULL
myVecRslt <- NULL
myMatRslt <- NULL
# and then simply aggregate the result of each iteration into the variables above:
foreach(i = 1:1000) %dopar% {
  rslt <- mySimFunc()
  myNumRslt <- c(myNumRslt, rslt$myNum)
  myVecRslt <- c(myVecRslt, mean(rslt$myVec))
  myMatRslt.tmp <- colMeans(rslt$myMat)
  myMatRslt <- rbind(myMatRslt, myMatRslt.tmp)
}
BUT, unfortunately it seems that this is not possible with foreach(), so I think the only solution is to write a combine function that aggregates the results in a similar way.
Challenge
1) How could I write a combine function that returns what I explained above?
2) When we do %dopar% (suppose using the doMC package), does doMC distribute each iteration to a CPU, or does it go further and divide each iteration into smaller pieces and distribute those?
3) Is there any better (more efficient) way than using doMC and foreach() ?
Ideas
In this question Brian mentioned a brilliant way to deal with lists including numeric values. In my case, I have numeric values as well as vectors and matrices. I don't know how to extend Brian's idea in my case.
Thanks very much for your help.
Edit
Cleaned up, generalizable solution using .combine:
# Modify the function to include aggregation
mySimFunc2 <- function() {
  myNum <- runif(1)
  myVec <- mean(rnorm(10))
  myMat <- colMeans(rmvnorm(5, rep(0, 3), diag(3)))
  myListRslt <- list("myNum" = myNum, "myVec" = myVec, "myMat" = myMat)
  return(myListRslt)
}
# .combine function
MyComb1 <- function(...) {
  lst <- list(...)
  vec <- sapply(1:length(lst), function(i) lst[[i]][[1]])
  vecavg <- sapply(1:length(lst), function(i) lst[[i]][[2]])
  colmeans <- t(sapply(1:length(lst), function(i) lst[[i]][[3]]))
  final <- list(vec, vecavg, colmeans)
  names(final) <- c("vec", "vecavg", "colmeans")
  return(final)
}
library(doParallel)
cl <- makeCluster(3)  # set cores
registerDoParallel(cl)
sim <- foreach(i = 1:1000, .export = c("mySimFunc2", "MyComb1"), .combine = MyComb1,
               .multicombine = TRUE, .maxcombine = 1000, .packages = c("mvtnorm")) %dopar% {
  mySimFunc2()
}
stopCluster(cl)
You should now have a list output containing the desired three objects, which I've titled respectively as vec, vecavg, and colmeans. Note that you must set .maxcombine to the number of iterations if there are more than 100, so that the combine function is called only once on all the results.
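Assuming the foreach() result is stored in sim as above, the three pieces can be inspected directly:
length(sim$vec)      # 1000 values of myNum
length(sim$vecavg)   # 1000 means of myVec
dim(sim$colmeans)    # 1000 x 3 matrix of the colMeans of myMat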
As a side note, it does not make sense to parallelize for this example task, although I'm guessing the real task may be more complex.

What is the easiest way to parallelize a vectorized function in R?

I have a very large list X and a vectorized function f. I want to calculate f(X), but this will take a long time if I do it with a single core. I have (access to) a 48-core server. What is the easiest way to parallelize the calculation of f(X)? The following is not the right answer:
library(foreach)
library(doMC)
registerDoMC()
foreach(x=X, .combine=c) %dopar% f(x)
The above code will indeed parallelize the calculation of f(X), but it will do so by applying f separately to every element of X. This ignores the vectorized nature of f and will probably make things slower as a result, not faster. Rather than applying f elementwise to X, I want to split X into reasonably-sized chunks and apply f to those.
So, should I just manually split X into 48 equal-sized sublists and then apply f to each in parallel, then manually put together the result? Or is there a package designed for this?
In case anyone is wondering, my specific use case is here.
Although this is an older question, it might be interesting for everyone who stumbles upon it via Google (like me): have a look at the pvec function in the multicore package. I think it does exactly what you want.
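A minimal sketch of what that looks like for the question's f and X (pvec has since been absorbed into the base parallel package):
# Split X into one chunk per core, apply the vectorized f to each chunk in a
# forked worker, and concatenate the results in their original order
library(parallel)
result <- pvec(X, f, mc.cores = 48)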
Here's my implementation. It's a function chunkapply that takes a vectorized function, a list of arguments that should be vectorized, and a list of arguments that should not be vectorized (i.e. constants), and returns the same result as calling the function on the arguments directly, except that the result is calculated in parallel.
For a function f, vector arguments v1, v2, v3, and scalar arguments s1, s2, the following should return identical results:
f(a=v1, b=v2, c=v3, d=s1, e=s2)
f(c=v3, b=v2, e=s2, a=v1, d=s1)
chunkapply(FUN=f, VECTOR.ARGS=list(a=v1, b=v2, c=v3), SCALAR.ARGS=list(d=s1, e=s2))
chunkapply(FUN=f, SCALAR.ARGS=list(e=s2, d=s1), VECTOR.ARGS=list(a=v1, c=v3, b=v2))
Since it is impossible for the chunkapply function to know which arguments of f are vectorized and which are not, it is up to you to specify when you call it, or else you will get the wrong results. You should generally name your arguments to ensure that they get bound correctly.
library(foreach)
library(iterators)
# Use your favorite doPar backend here
library(doMC)
registerDoMC()
get.chunk.size <- function(vec.length,
                           min.chunk.size = NULL, max.chunk.size = NULL,
                           max.chunks = NULL) {
  if (is.null(max.chunks)) {
    max.chunks <- getDoParWorkers()
  }
  size <- vec.length / max.chunks
  if (!is.null(max.chunk.size)) {
    size <- min(size, max.chunk.size)
  }
  if (!is.null(min.chunk.size)) {
    size <- max(size, min.chunk.size)
  }
  num.chunks <- ceiling(vec.length / size)
  actual.size <- ceiling(vec.length / num.chunks)
  return(actual.size)
}
ichunk.vectors <- function(vectors = NULL,
                           min.chunk.size = NULL,
                           max.chunk.size = NULL,
                           max.chunks = NULL) {
  ## Calculate number of chunks
  recycle.length <- max(sapply(vectors, length))
  actual.chunk.size <- get.chunk.size(recycle.length, min.chunk.size, max.chunk.size, max.chunks)
  num.chunks <- ceiling(recycle.length / actual.chunk.size)
  ## Make the chunk iterator
  i <- 1
  it <- idiv(recycle.length, chunks = num.chunks)
  nextEl <- function() {
    n <- nextElem(it)
    ix <- seq(i, length = n)
    i <<- i + n
    vchunks <- foreach(v = vectors) %do% v[1 + (ix - 1) %% length(v)]
    names(vchunks) <- names(vectors)
    vchunks
  }
  obj <- list(nextElem = nextEl)
  class(obj) <- c("ichunk", "abstractiter", "iter")
  obj
}
chunkapply <- function(FUN, VECTOR.ARGS, SCALAR.ARGS = list(), MERGE = TRUE, ...) {
  ## Check that the arguments make sense
  stopifnot(is.list(VECTOR.ARGS))
  stopifnot(length(VECTOR.ARGS) >= 1)
  stopifnot(is.list(SCALAR.ARGS))
  ## Choose appropriate combine function
  if (MERGE) {
    combine.fun <- append
  } else {
    combine.fun <- foreach:::defcombine
  }
  ## Chunk and apply, and maybe merge
  foreach(vchunk = ichunk.vectors(vectors = VECTOR.ARGS, ...),
          .combine = combine.fun) %dopar% {
    do.call(FUN, args = append(vchunk, SCALAR.ARGS))
  }
}
## Only do chunkapply if it will run in parallel
maybe.chunkapply <- function(FUN, VECTOR.ARGS, SCALAR.ARGS = list(), ...) {
  if (getDoParWorkers() > 1) {
    chunkapply(FUN, VECTOR.ARGS, SCALAR.ARGS, ...)
  } else {
    do.call(FUN, append(VECTOR.ARGS, SCALAR.ARGS))
  }
}
Here are some examples showing that chunkapply(f,list(x)) produces identical results to f(x). I have set the max.chunk.size extremely small to ensure that the chunking algorithm is actually used.
> # Generate all even integers from 2 to 100 inclusive
> identical(chunkapply(function(x,y) x*y, list(1:50), list(2), max.chunk.size=10), 1:50 * 2)
[1] TRUE
> ## Sample from a standard normal distribution, then discard values greater than 1
> a <- rnorm(n=100)
> cutoff <- 1
> identical(chunkapply(function(x,limit) x[x<=limit], list(x=a), list(limit=cutoff), max.chunk.size=10), a[a<cutoff])
[1] TRUE
If anyone has a better name than "chunkapply", please suggest it.
Edit:
As another answer points out, there is a function called pvec in the multicore package that has very similar functionality to what I have written. For simple cases, you should use that, and you should vote up Jonas Rauch's answer for it. However, my function is a bit more general, so if any of the following apply to you, you might want to consider using my function instead:
You need to use a parallel backend other than multicore (e.g. MPI). My function uses foreach, so you can use any parallelization framework that provides a backend for foreach.
You need to pass multiple vectorized arguments. pvec only vectorizes over a single argument, so you couldn't easily implement parallel vectorized addition with pvec, for example. My function allows you to specify arbitrary arguments.
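For instance, a quick hypothetical illustration of the multiple-vector-argument case, using chunkapply() as defined above with whatever foreach backend is registered:
# Parallel vectorized addition over two vector arguments, which pvec cannot
# express directly; the chunk results are concatenated back in order
x <- runif(1e6)
y <- runif(1e6)
res <- chunkapply(`+`, VECTOR.ARGS = list(x, y), max.chunk.size = 1e5)
all.equal(res, x + y)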
The itertools package was designed to address this kind of problem. In this case, I would use isplitVector:
n <- getDoParWorkers()
foreach(x=isplitVector(X, chunks=n), .combine='c') %dopar% f(x)
For this example, pvec is undoubtedly faster and simpler, but this approach can be used on Windows with the doParallel package, for example.
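For example, a sketch of the Windows-friendly setup, assuming f and X from the question:
# A PSOCK cluster works on Windows; isplitVector() hands each worker one
# contiguous chunk of X, and the chunk results are concatenated with c()
library(doParallel)
library(itertools)
cl <- makeCluster(detectCores())
registerDoParallel(cl)
result <- foreach(x = isplitVector(X, chunks = getDoParWorkers()), .combine = "c") %dopar% f(x)
stopCluster(cl)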
Map-Reduce might be what you're looking for; it's been ported to R
How about something like this? R will take advantage of all the available memory and multicore will parallelize over all available cores.
library(multicore)
result <- mclapply(X, f, mc.preschedule = FALSE, mc.set.seed = FALSE)
