Nested rolling sum in vector - r

I am struggling to write efficient code to compute the result vector r from an input vector v using this function:
r(i) = \sum_{j=i-N}^{i} [v(i)-v(j)] * exp(v(i)-v(j))
where i runs from N+1 to M over the vector v, and the size of v is M >> N.
Of course this is feasible with 2 nested for loops, but it is too slow for computational purposes, probably out of fashion and deprecated style...
for (i in (N+1):length(v)) {
  csum <- 0
  for (j in i:(i-N)) {
    csum <- csum + (v[i]-v[j])*exp(v[i]-v[j])
  }
  r[i] <- csum
}
In my real application M > 10^5 and the v vector is actually several vectors.
I have been trying with nested applications of lapply and rollapply without success.
Any suggestion is welcome.

I don't know if it is any more efficient but something you can try:
r[N:M] <- sapply(N:M, function(i) tail(cumsum((v[i]-v[1:N])*exp(v[i]-v[1:N])), 1))
To check that both computations give the same results, I computed r your way and r2 mine (initializing r2 to rep(NA, M)) and compared them:
all(abs(r-r2)<1e-12, na.rm=TRUE)
# [1] TRUE
NOTE: as in @lmo's answer, tail(cumsum(...), 1) can be replaced more efficiently by sum(...):
r[N:M] <- sapply(N:M, function(i) sum((v[i]-v[1:N])*exp(v[i]-v[1:N])))

Here is a method with a single for loop.
# create new blank vector
rr <- rep(NA, M)
for (i in N:length(v)) {
  rr[i] <- sum((v[i] - v[seq_len(N)]) * exp(v[i] - v[seq_len(N)]))
}
Check for equality:
all.equal(r, rr)
# [1] TRUE
You can avoid computing the difference twice by storing it, which should give a small speed-up.
for (i in N:length(v)) {
  x <- v[i] - v[seq_len(N)]
  rr[i] <- sum(x * exp(x))
}
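Note that the answers above index v[1:N] for every i, whereas the formula in the question sums over the moving window j = i-N, ..., i. Here is a minimal sketch of a windowed version, assuming that the moving window is what is wanted. The helper names rolling_r and loop_r and the test values of v and N are made up for illustration.

```r
# Hypothetical helper: rolling version that follows the question's formula,
# summing over the window j = i-N, ..., i for each i > N (assumes length(v) > N).
rolling_r <- function(v, N) {
  M <- length(v)
  r <- rep(NA_real_, M)
  r[(N + 1):M] <- vapply((N + 1):M, function(i) {
    x <- v[i] - v[(i - N):i]   # differences to every element in the window
    sum(x * exp(x))
  }, numeric(1))
  r
}

# Reference: the nested loops from the question
loop_r <- function(v, N) {
  r <- rep(NA_real_, length(v))
  for (i in (N + 1):length(v)) {
    csum <- 0
    for (j in i:(i - N)) {
      csum <- csum + (v[i] - v[j]) * exp(v[i] - v[j])
    }
    r[i] <- csum
  }
  r
}

set.seed(1)          # arbitrary seed
v <- rnorm(200)      # made-up input
all.equal(rolling_r(v, 5), loop_r(v, 5))
```

This still does O(M*N) work, but the inner loop runs in vectorized code, which is usually where most of the time goes.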


Is there a faster way in R of creating large dataframe without loops?

Below is an example of what I'm currently doing and it's rather slow. I figure that there must be something more efficient than this since it takes about 95 seconds to run.
df <- data.frame(matrix(vector(), 1000, 1000))
for (i in (1:1000)) {
  for (j in (i:1000)) {
    df[i, j] <- i * 1000 + j
    df[j, i] <- df[i, j]
  }
}
Also note that the i*1000+j is just for illustration. I'm doing something else there, but it's a calculation dependent on getting all the pairs for 1:1000. Thanks.
Consider outer, followed by a reassignment of the lower triangle of the returned matrix.
out_mat <- outer(1:1E3, 1:1E3, function(i,j) i * 1E3 + j)
out_mat[lower.tri(out_mat)] <- t(out_mat)[lower.tri(t(out_mat))]
df2 <- data.frame(out_mat)
The result is exactly equivalent to the df generated by the for loop:
identical(df, df2)
# [1] TRUE
However, whether this works depends on your actual calculation: outer passes whole vectors of i and j to the function in a single call, so anything beyond simple vectorized arithmetic may not work as written.
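If the real pairwise calculation is written for single values (for example, it uses if), one option is to wrap it in Vectorize() before handing it to outer. A small sketch, where pair_fun is a made-up stand-in for the real calculation:

```r
# Hypothetical pairwise function that only handles one (i, j) pair at a time
pair_fun <- function(i, j) {
  if (i == j) 0 else i * 1000 + j
}

n <- 5
# Vectorize() wraps pair_fun so that outer() can pass whole vectors of i and j
m <- outer(seq_len(n), seq_len(n), Vectorize(pair_fun))

# Mirror the upper triangle into the lower one, as in the answer above
m[lower.tri(m)] <- t(m)[lower.tri(m)]
isSymmetric(m)
```

Vectorize() is a convenience wrapper around mapply, so it still calls pair_fun once per pair. It removes the data frame indexing overhead, not the per-pair function calls.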

R - How can I make this loop faster?

Is there some way to make this loop faster in r?
V=array(NA, dim=c(nrow(pixDF), n))
for(i in 1:n)
where, in this case, pixDF is a matrix with 490000 rows and 4 columns filled with doubles. n = 5. Cov is a (4,5,4) array filled with "doubles". Mean is a (5,4) array filled with doubles as well.
This loop was taking about 30 min on my computer (before editing).
Right now it's taking 1 min.
As Ronak notes, it is hard to help without reproducible example. But, I think that apply could be avoided. Something like this COULD work:
V <- array(NA, dim = c(nrow(pixDF), n))
tpixDF <- t(pixDF)
for (i in 1:n) {
  x <- Cov[, i, ]
  sdC <- sqrt(det(x))
  iC <- solve(x)
  mi <- Mean[i, ]
  k <- t(tpixDF - mi)
  V[, i] <- sdC*exp(-0.5*rowSums(k %*% iC * k))
}
Also, as Roland mentions, inv is probably equivalent to solve.
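The rowSums(k %*% iC * k) term is a row-wise quadratic form, that is, the squared Mahalanobis distance of each row from the mean. As a sanity check on made-up data (X, mu and S are illustrative, not the asker's pixDF, Mean or Cov), it can be compared with mahalanobis() from the stats package:

```r
set.seed(1)                              # arbitrary seed
X  <- matrix(rnorm(40), ncol = 4)        # 10 made-up "pixels" with 4 columns
mu <- colMeans(X)
S  <- cov(X)

k  <- sweep(X, 2, mu)                    # subtract the mean from every row
d1 <- rowSums((k %*% solve(S)) * k)      # row-wise quadratic form, as in the loop above
d2 <- mahalanobis(X, center = mu, cov = S)

all.equal(d1, d2)
```

Computing all rows with one matrix product is what replaces the per-row apply call.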

How to vectorize triple nested loops?

I've searched for similar problems and I have a vague idea about what I should do: vectorize everything or use the apply() family. But I'm a beginner at R programming and both of these methods are quite confusing.
Here is my source code:
for (h in 1:200)
for (j in 0:N)
for (k in 0:N)
constjk=dbinom(k, j + k, 0.5)
Let me explain a bit. I want to collect 200 sum1 values (that's the first loop), and for every sum1 value, it is the summation of (lambda[h]/2)*constjk*wk*wj, thus the other two loops. Most tedious is that N changes with h, so I have no idea how to vectorize the j-loop and the k-loop. But of course I can vectorize the h-loop with lambda<-seq() and N<-ceiling(), and that's the best I can do. Is there a way to further simplify the code?
Your code can be perfectly vectorized with 3 nested sapply calls. It might be a bit hard to read for the untrained eye, but the essence of it is that instead of adding one value at a time to sum1[h] we calculate all the terms produced by the innermost loop in one go and sum them up.
Although this vectorized solution is faster than your triple for loop, the improvement is not dramatic. If you plan to use it many times I suggest you implement it in C or Fortran (with regular for loops), which improves the speed a lot. Beware though that it has high time complexity and will scale badly with increased values of lambda, ultimately reaching a point where it cannot be computed within reasonable time regardless of the implementation.
lambda <- 2 + 1:200/12.5
sum1 <- sapply(lambda, function(l){
  N <- ceiling(l*max(x))
  sum(sapply(0:N, function(j){
    wj <- (sum(x <= (j+1)/l) - sum(x <= j/l))/100
    sum(sapply(0:N, function(k){
      constjk <- dbinom(k, j + k, 0.5)
      wk <- (sum(x <= (k+1)/l) - sum(x <= k/l))/100
      # term described in the question: (lambda/2)*constjk*wk*wj
      (l/2)*constjk*wk*wj
    }))
  }))
})
Btw, you don't need to predefine variables like h, j, k, wj and wk, especially not when vectorizing: assignments to them inside the functions passed to sapply create local variables with the same names that shadow (i.e. ignore) the ones you predefined.
Let's wrap your simulation in a function and time it:
sim1 <- function(num=20){
for (h in 1:num)
for (j in 0:N)
for (k in 0:N)
constjk=dbinom(k, j + k, 0.5)
system.time(res1 <- sim1())
# user system elapsed
# 5.4 0.0 5.4
Now let's make it faster:
sim2 <- function(num=20){
  set.seed(42) #to make it reproducible
  x <- rlnorm(100,0,1.6)
  h <- 1:num
  sum1 <- numeric(num)
  lambda <- 2+1:num/12.5
  N <- ceiling(lambda*max(x))
  #functions for wj and wk
  #(bodies not shown in the original; assumed to match the formulas used in fun1)
  wjfun <- function(x,j,lambda,h){
    (sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
  }
  wkfun <- function(x,k,lambda,h){
    (sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100
  }
  #function to calculate values of sum1
  fun1 <- function(N,h,x,lambda) {
    sum1 <- 0
    set.seed(42) #to make it reproducible
    #calculate constants using outer
    const <- outer(0:N[h],0:N[h],FUN=function(j,k) dbinom(k, j + k, 0.5))
    wk <- numeric(N[h]+1)
    #loop only once to calculate wk
    for (k in 0:N[h]){
      wk[k+1] <- (sum(x<=(k+1)/lambda[h])-sum(x<=k/lambda[h]))/100
    }
    for (j in 0:N[h]){
      wj <- (sum(x<=(j+1)/lambda[h])-sum(x<=j/lambda[h]))/100
      for (k in 0:N[h]){
        sum1 <- sum1+(lambda[h]/2)*const[j+1,k+1]*wk[k+1]*wj
      }
    }
    sum1
  }
  for (h in 1:num){
    sum1[h] <- fun1(N,h,x,lambda)
  }
  sum1
}
system.time(res2 <- sim2())
#user system elapsed
#1.25 0.00 1.25
#[1] TRUE
Timings for @Backlin's code (with 20 iterations) for comparison:
user system elapsed
3.30 0.00 3.29
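The remaining loops over j and k can also be removed. Because the weights for j and k come from the same formula, the double sum is the elementwise product of the constant matrix with outer(w, w), summed. A sketch under that reading of the code above; sum1_one is a made-up helper name and only 5 lambda values are used to keep it quick:

```r
# Data in the style of the answer above
set.seed(42)
x <- rlnorm(100, 0, 1.6)
lambda <- 2 + 1:5 / 12.5

# Hypothetical helper: the sum over j and k for a single lambda value l
sum1_one <- function(l, x) {
  N <- ceiling(l * max(x))
  # counts[m + 1] = number of x values <= m / l, for m = 0, ..., N + 1
  counts <- vapply(0:(N + 1), function(m) sum(x <= m / l), numeric(1))
  w <- diff(counts) / 100                      # w[j + 1] is wj for j = 0, ..., N
  const <- outer(0:N, 0:N, function(j, k) dbinom(k, j + k, 0.5))
  (l / 2) * sum(const * outer(w, w))           # sum of const[j, k] * wj * wk
}

sum1 <- vapply(lambda, sum1_one, numeric(1), x = x)
```

The (N+1) x (N+1) matrices grow quadratically with lambda, so memory becomes the limit for large lambda.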
If this is still too slow and you cannot or don't want to use another language, there is also the possibility of parallelization. As far as I see the outer loop is embarrassingly parallel. There are some nice and easy packages for parallelization.
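As one possible illustration of that parallelization, here is a sketch using mclapply() from the base parallel package. The function work is a made-up stand-in for the per-lambda computation, and the choice of 2 cores is an assumption.

```r
library(parallel)

# Hypothetical stand-in for the per-lambda work (any function of one lambda value)
work <- function(l) {
  N <- ceiling(l * 10)
  sum(outer(0:N, 0:N, function(j, k) dbinom(k, j + k, 0.5)))
}

lambda <- 2 + 1:20 / 12.5

# mclapply() forks worker processes; forking is not available on Windows,
# so fall back to a single core there.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res_par <- unlist(mclapply(lambda, work, mc.cores = cores))

# Same computation done serially, for comparison
res_ser <- vapply(lambda, work, numeric(1))
all.equal(res_par, res_ser)
```

Each lambda value is independent of the others, which is why the outer loop can be split across processes without any coordination.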

Recursion in a prime generator

I'm making a prime generator, and to make it more efficient, I'm trying to only test numbers against primes that I've already found rather than all numbers < sqrt of the number being tested. I'm trying to get a to be my list of primes, but I'm not sure how to make it recur inside my second for loop. I think this is only testing against a <- 2 and not a <- c(a,i).
x <- 3:1000
a <- 2
for (i in x)
{for (j in a)
{if (i %% j == 0)
else {a <- unique(c(a,i))}}}
The solution might be to cut out the second loop and instead compare your proposed prime number to the entire vector instead, like:
x <- 3:1000
a <- 2
for (i in x) {
  if (!any(i %% a == 0)) {
    a <- c(a,i)
  }
}
That seemed to work for me.
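The question also mentions testing only up to the square root of the candidate. Both ideas can be combined by testing against the already-found primes that are at most sqrt(i). A small sketch (the variable name candidates is made up):

```r
x <- 3:1000
a <- 2
for (i in x) {
  candidates <- a[a <= sqrt(i)]        # only primes up to sqrt(i) need testing
  if (!any(i %% candidates == 0)) {
    a <- c(a, i)
  }
}
head(a, 10)
# [1]  2  3  5  7 11 13 17 19 23 29
```

This works because every composite number has a prime factor no larger than its square root, and that factor is already in a by the time i is tested.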
Below is a non-recursive version using a simple prime function that is quite fast for plain R. Rather than cycling through each individual value and testing its primeness, it removes all the multiples of each prime in big chunks, so that each subsequent remaining value is a prime. It takes out the multiples of 2, then of 3; 4 is already gone, so the multiples of 5 go next. It is an efficient way to do this in R.
primest <- function(n){
  p <- 2:n
  i <- 1
  while (p[i] <= sqrt(n)) {
    p <- p[p %% p[i] != 0 | p==p[i]]
    i <- i+1
  }
  p
}
(You might want to see this stack question for faster methods using a sieve and also my timings of the function. What's above will run 50, maybe 500x faster than the version you're working from.)
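For reference, a basic sieve of Eratosthenes on a logical vector could look like the sketch below. It is a generic textbook sieve, not the code from the linked question, and the function name sieve is made up.

```r
# Hypothetical helper name; a basic sieve of Eratosthenes on a logical vector
sieve <- function(n) {
  if (n < 2) return(integer(0))
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE
  for (i in seq_len(floor(sqrt(n)))) {
    if (is_prime[i]) {
      is_prime[seq(i * i, n, by = i)] <- FALSE   # cross out multiples of i
    }
  }
  which(is_prime)
}

sieve(30)
# [1]  2  3  5  7 11 13 17 19 23 29
```

Unlike the filtering approach, the sieve never computes a remainder; it only writes FALSE into positions that are multiples of a known prime.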

vector binding in R

I would like to implement a simulation program, which requires the following structure:
It has a for loop, and the program generates a vector in each iteration. I need each generated vector to be appended to the existing vector.
I do not know how to do this in R. Thanks for the help.
These answers work, but they all require a call to a non-deterministic function like sample() in the loop. This is not loop-invariant code (it is random each time), but it can still be moved out of the for loop. The trick is to use the n argument and generate all the random numbers you need beforehand (if your problem allows this; some may not, but many do). Now you make one call rather than n calls, which matters if your n is large. Here is a quick example random walk (but many problems can be phrased this way). Also, full disclosure: I haven't had any coffee today, so please point out if you see an error :-)
steps <- 30
n <- 100
directions <- c(-1, 1)
results <- vector('list', n)
for (i in seq_len(n)) {
  walk <- numeric(steps)
  for (s in seq_len(steps)) {
    walk[s] <- sample(directions, 1)
  }
  results[[i]] <- sum(walk)
}
We can rewrite this with one call to sample():
all.steps <- sample(directions, n*steps, replace=TRUE)
dim(all.steps) <- c(n, steps)
walks <- apply(all.steps, 1, sum)
Proof of speed increase (n=10000):
> system.time({
+ for (i in seq_len(n)) {
+ walk <- numeric(steps)
+ for (s in seq_len(steps)) {
+ walk[s] <- sample(directions, 1)
+ }
+ results[[i]] <- sum(walk)
+ }})
user system elapsed
4.231 0.332 4.758
> system.time({
+ all.steps <- sample(directions, n*steps, replace=TRUE)
+ dim(all.steps) <- c(n, steps)
+ walks <- apply(all.steps, 1, sum)
+ })
user system elapsed
0.010 0.001 0.012
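A small variation on the single-call version, in case it helps: building the matrix directly and using rowSums() avoids the apply() call as well. This is a sketch with an arbitrary seed, not a benchmarked claim.

```r
set.seed(1)                      # arbitrary seed, only for reproducibility
steps <- 30
n <- 100
directions <- c(-1, 1)

# One call to sample(), reshaped so that each row is one walk
all.steps <- matrix(sample(directions, n * steps, replace = TRUE), nrow = n)
walks <- rowSums(all.steps)      # same values as apply(all.steps, 1, sum)
```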
If your simulation needs just one random variable per simulation function call, use sapply(), or better yet mclapply() from the multicore package (now part of the base parallel package). Revolution Analytics's foreach package may be of use here too. Also, JD Long has a great presentation and post about simulating stuff in R on Hadoop via Amazon's EMR here (I can't find the video, but I'm sure someone will know).
Take home points:
Preallocate with numeric(n) or vector('list', n)
Push invariant code out of for loops. Cleverly push stochastic functions out of code with their n argument.
Try hard for sapply() or lapply(), or better yet mclapply.
Don't use x <- c(x, rnorm(100)). Every time you do this, a member of R-core kills a puppy.
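To make the preallocation point concrete, here is a sketch that fills a preallocated vector by index instead of growing it with c(). The values of n and k are made up.

```r
n <- 50          # number of iterations (made-up value)
k <- 100         # length of the vector produced per iteration

set.seed(1)
out <- numeric(n * k)                     # allocate the full result once
for (i in seq_len(n)) {
  idx <- ((i - 1) * k + 1):(i * k)        # slots reserved for iteration i
  out[idx] <- rnorm(k)
}

# Same values as growing with c(), given the same seed
set.seed(1)
grown <- c()
for (i in seq_len(n)) grown <- c(grown, rnorm(k))
identical(out, grown)
```

Both loops produce the same vector; the difference is that the c() version copies the whole vector on every iteration.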
Probably the best thing you can do is preallocate a list of length n (n is number of iterations) and flatten out the list after you're done.
n <- 10
start <- vector("list", n)
for (i in 1:n) {
  start[[i]] <- sample(10)
}
start <- unlist(start)
You could do it the old nasty way. This may be slow for larger vectors.
start <- c()
for (i in 1:n) {
  add <- sample(10)
  start <- c(start, add)
}
x <- rnorm(100)
for (i in 1:100) {
  x <- c(x, rnorm(100))
}
Assuming your simulation function -- call it func -- returns a vector with the same length each time, you can store the results in the columns of a pre-allocated matrix:
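sim1 <- function(reps, func) {
  first <- func()
  result <- matrix(first, nrow=length(first), ncol=reps)
  # column 1 already holds the first result; fill columns 2..reps
  for (i in seq_len(reps - 1) + 1) {
    result[, i] <- func()
  }
  return(as.vector(result))
}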
Or you could express it as follows using replicate:
sim2 <- function(reps, func) {
  return(as.vector(replicate(reps, func(), simplify=TRUE)))
}
> sim2(3, function() 1:3)
[1] 1 2 3 1 2 3 1 2 3
