R: Anything faster than outer()?

Using R, I need to evaluate an expression of the form [using latex notation]
\frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n f(x_i-x_j),
where x_i,x_j are real (scalar) numbers and f is a nonlinear function with scalar input and output.
My current best [now using R commands] is
mat <- outer(x,x,function(y,z) f(y-z))
res <- mean(mat)
where x is a vector of length n which holds all the x_i's.
For n = 10000, this operation takes about 26 seconds on my PC, and (as expected) computation time grows quickly with n, since the double sum has n^2 terms. I'd like to speed this up, mainly because I want to pass the above result to an optimizer later on. Any suggestions? Thanks!
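A minimal sketch of two ways to avoid building the full n x n matrix, offered as a starting point rather than a definitive answer. The function f below is a hypothetical stand-in (the question does not say what f is); it is assumed to be vectorised, and the last variant is only valid if f is also even, i.e. f(d) == f(-d).

```r
# Hypothetical stand-in for the nonlinear f: vectorised and even
f <- function(d) exp(-d^2)

set.seed(1)
x <- rnorm(400)
n <- length(x)

# Baseline from the question: allocates an n x n matrix
res_outer <- mean(outer(x, x, function(y, z) f(y - z)))

# Row by row: same n^2 evaluations of f, but only O(n) memory at a time
res_rows <- sum(vapply(x, function(xi) sum(f(xi - x)), numeric(1))) / n^2

# Only if f is even: evaluate each pair i < j once, double it,
# and add the n diagonal terms f(0)
pair_sum <- 0
for (i in seq_len(n - 1)) {
  pair_sum <- pair_sum + sum(f(x[i] - x[(i + 1):n]))
}
res_even <- (2 * pair_sum + n * f(0)) / n^2
```

The row-wise variant does the same amount of arithmetic as outer() and mainly saves memory; the symmetric variant roughly halves the number of calls to f, but depends on the evenness assumption.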

Related

R: difference between apply(object, 1, function(x) sum(x-a)/b) and rowsums((object-a)/b)

I'm new to R and am struggling with the apply function. It is really slow to execute and I was trying to optimize some code I received.
I am trying to do some matrix operations (element-wise multiplication and division on ~10^6 element matrices) then sum the rows of the resulting matrix. I found the fantastic library Rfast and it executes what I thought was the same code in about 1/30 the time, but I am getting systematic differences between my 'optimized' answer and the previous answer.
The original code was something along the lines of
ans <- apply(object, 1, function(x) sum((x - a) / b))
and my code is
ans = Rfast::rowsums((object-a)/b)
I'm not sure if it's because one of the methods is throwing away precision or making rounding errors - any thoughts?
Edit
Trying to reproduce the error is pretty hard...
I have been able to isolate the discrepancy to the step where I divide by my vector b, whose entries are each ~ 3000 (e.g. [3016.460436, 3021.210321, 3033.3303219]). If I take this term out, the two methods give the same answer.
I then tried two methods to improve my answer, one was dividing b by 1000 then dividing the sum by 1000 at the end. This didn't work, presumably because the float precision is the same either way.
I also tried forcing my b vector to be integers, which also didn't work.
Sample data doesn't reproduce my error either, which is frustrating...
objmat = rbind(rep(c(1,0,0),1000),rep(c(0,0,1),1000))
amat = rbind(rep(c(0.064384654, 0.025465132, 0.36543214),1000))
bmat = rbind(rep(c(1016.460431,1021.210431,1033.330431),1000))
ans = apply(objmat,1,function(x) sum((x-amat)/bmat))
gives
ans[1] = 0.5418828413
rowsums((objmat[1,]-amat)/bmat) = 0.5418828413
I think it has to be a floating point precision error, but I'm not sure why my dummy data doesn't reproduce it, or which method (apply or rowsums) would be more accurate!
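One possible cause that is not floating point precision is R's recycling rule. Inside apply(object, 1, ...) the vectors a and b line up with the elements of a single row, whereas in (object - a) / b on a whole matrix they are recycled down the columns. The sketch below illustrates this on a tiny made-up matrix. It assumes a and b are plain vectors with one entry per column, and it uses base R's rowSums instead of Rfast::rowsums to stay self-contained.

```r
object <- matrix(1:6, nrow = 2)   # 2 rows, 3 columns (made-up data)
a <- c(10, 20, 30)                # one value per column
b <- c(2, 4, 8)                   # one value per column

# Row-wise version: a and b align with the elements of each row
by_apply <- apply(object, 1, function(x) sum((x - a) / b))

# Whole-matrix version: a and b are recycled down the columns instead
naive <- rowSums((object - a) / b)

# sweep() applies a and b along the column margin, matching the apply() version
aligned <- rowSums(sweep(sweep(object, 2, a, "-"), 2, b, "/"))

isTRUE(all.equal(by_apply, aligned))  # agree
isTRUE(all.equal(by_apply, naive))    # differ
```

This would also explain why the single-row test above does not reproduce the discrepancy: objmat[1,] and amat have the same length, so no recycling across rows takes place.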

Arithmetic/logic operations with large sparse tensors in R

I need to make quick calculations (+,*,>) with large 3D arrays (tensors) in R (like 1500 x 150 x 30000). Since these arrays are very sparse (only 0.03% entries are non-zeros) I first use as_sptensor function from the 'tensorr' library to convert my tensors to sparse ones like:
x <- array(data = c(1,0,0,0,0,0,0,1,1,1,1,1) , dim = c(3,2,2))
s <- as_dtensor(x)
s1 <- as_sptensor(s)
And then I do some arithmetic operations, e.g. multiplication:
s1*s1
I also have a memory limit of 8 GB in total, so the sparse format also helps me to store the result.
The problem is that when I deal with large tensors like:
A <-some_index_matrix[1:3,1:1000000]
A2 <- sptensor(A, rep(1,ncol(A)), dims=c(max(A[1,]),max(A[2,]),max(A[3,])))
A2*A2
I fail to get this product result within reasonable time. How can I optimize my code for such calculations to be carried out within seconds?
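A rough sketch of one way to do an element-wise product directly on the coordinate representation, without going through the tensor classes. The helper sp_multiply is hypothetical and is not part of tensorr. It assumes each tensor is given as an index matrix with one column per non-zero entry (like A above) plus a vector of values, and that no index appears twice within a tensor.

```r
# Hypothetical helper: element-wise product of two sparse tensors given as
# (index matrix, value vector) pairs. Only entries present in both survive.
sp_multiply <- function(subs1, vals1, subs2, vals2, dims) {
  # Column-major linear index, kept as double so large dims do not overflow
  strides <- c(1, cumprod(as.numeric(dims))[-length(dims)])
  key1 <- as.vector(crossprod(strides, subs1 - 1))
  key2 <- as.vector(crossprod(strides, subs2 - 1))
  m <- match(key1, key2)          # position of each entry of tensor 1 in tensor 2
  hit <- !is.na(m)
  list(subs = subs1[, hit, drop = FALSE],
       vals = vals1[hit] * vals2[m[hit]])
}

# Small check against dense arrays
x <- array(c(1,0,0,0,0,0,0,1,1,1,1,1), dim = c(3,2,2))
y <- array(c(2,3,0,0,0,0,0,5,0,7,0,0), dim = c(3,2,2))
sx <- t(which(x != 0, arr.ind = TRUE))
sy <- t(which(y != 0, arr.ind = TRUE))
p <- sp_multiply(sx, x[x != 0], sy, y[y != 0], dim(x))

dense <- array(0, dim(x))
dense[t(p$subs)] <- p$vals
all.equal(dense, x * y)
```

The work is proportional to the number of non-zero entries rather than to the size of the dense array. For the special case A2*A2 the indices are identical, so the result keeps the same indices with the values squared.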

R: avoiding loops or recursion for a calculation in which the values of the result vector depend on earlier components

Say I have a function func that takes two scalar numeric inputs and delivers a scalar numeric result, and I have the following code to calculate a result vector u, based on input numeric vector v and initial value u0 for the result vector:
u<-rep(u0,1+length(v))
for (k in 2:length(u)){
u[k]<-func(u[k-1],v[k-1])
}
Note how a component of the result vector depends not only on the corresponding element of the input vector but also on the immediately prior element of the result vector. I can see no obvious way to vectorise this.
It is common to do this sort of thing in financial simulations, for instance when projecting forward company accounts, rolling them up with interest or inflation and adding in operational cash flows each year.
For some specific instances, it is possible to find a case-specific, non-iterative coding, but I would like to know if there's a general solution.
The problem can also be coded by recursion, as follows:
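calc.u<-function(v,u0){
  if (length(v)<2){
    func(u0,v[1])
  } else {
    # recurse on all but the last element, then extend the result by one step
    u.prior<-calc.u(v[-length(v)],u0)
    c(u.prior,func(u.prior[length(u.prior)],v[length(v)]))
  }
}
# returns length(v) values; unlike the loop version, u0 itself is not included
u<-calc.u(v,u0)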
Is there an R tactic for doing this without using either iteration or recursion, ie for vectorising it?
Answered: Thank you @MrFlick for introducing me to the Reduce function, which does exactly what I was wanting. I see that
Reduce('+',v,0,accumulate=T)[-1]
gives me
cumsum(v)
and
Reduce('*',v,1,accumulate=T)[-1]
gives me
cumprod(v)
as expected, where the [-1] is to discard the initial value.
Very nice indeed! Thanks again.

If you have this example
u0 <- 5
v <- (1:5)*2
func <- function(u,v) {u/2+v}
u <- rep(u0,1+length(v))
for (k in 2:length(u)){
u[k]<-func(u[k-1],v[k-1])
}
this is equivalent to
w <- Reduce(func, v, u0, accumulate=TRUE)
And we can check that
all(u==w)
# [1] TRUE

Output of parApply different from my input

I am still quite new to R (I used to program in Matlab) and I am trying to use the parallel package to speed up some calculations. Below is an example in which I calculate the rolling standard deviation of a matrix (by column) with the zoo package, with and without parallelising the code. However, the outputs have different shapes.
# load library
library('zoo')
library('parallel')
library('snow')
# Data
z <- matrix(runif(100000,0,1),100,1000)
#This is what I want to calculate with timing
system.time(zz <- rollapply(z,10,sd,by.column=T, fill=NA))
# Trying to achieve the same output with parallel computing
cl<-makeSOCKcluster(4)
clusterEvalQ(cl, library(zoo))
system.time(yy <-parCapply(cl,z,function(x) rollapplyr(x,10,sd,fill=NA)))
stopCluster(cl)
My first output zz has the same dimensions as input z, whereas output yy is a vector rather than a matrix. I understand that I can do something like matrix(yy,nrow(z),ncol(z)) however I would like to know if I have done something wrong or if there is a better way of coding to improve this. Thank you.

From the documentation:
parRapply and parCapply always return a vector. If FUN always returns
a scalar result this will be of length the number of rows or columns:
otherwise it will be the concatenation of the returned values.
And:
parRapply and parCapply are parallel row and column apply functions
for a matrix x; they may be slightly more efficient than parApply but
do less post-processing of the result.
So, I'd suggest you use parApply.
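A small sketch of the difference, assuming a local cluster with two workers can be started. To keep it self-contained it uses a hypothetical base-R function center in place of the zoo rolling standard deviation; like that function, it returns one value per input element.

```r
library(parallel)

z <- matrix(runif(100 * 20), 100, 20)
center <- function(x) x - mean(x)   # hypothetical stand-in for the rolling sd

cl <- makeCluster(2)
yy_vec <- parCapply(cl, z, center)      # concatenated into one long vector
yy_mat <- parApply(cl, z, 2, center)    # post-processed like apply(): a matrix
stopCluster(cl)

length(yy_vec)   # nrow(z) * ncol(z)
dim(yy_mat)      # same as dim(z)
```

Reshaping the parCapply result with matrix(yy_vec, nrow(z)) gives the same matrix, so the workaround mentioned in the question is also valid.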

Writing a window function with state using only R's basics

I am trying to write R code which acts as a "moving window", just with memory (state). I have figured out (thanks to this question) how to apply a function to subsequent tuples of elements. For example, if I wish to write a (simple) moving average with a typical period 4, I would do the following:
mapply(myfunc, x[1:(length(x)-4)], x[2:(length(x)-3)], x[3:(length(x)-2)], x[4:(length(x)-1)])
Where myfunc is a function with 4 arguments, which calculates their mean (I cannot use mean, as it expects only 1 argument, and I don't know how to make the 4 arguments a single vector).
That's quite cumbersome, though, and if the typical period is 100, say, I am not sure how to do it.
So here's my first question: how do I generalize this?
But here's another issue: suppose I wish the applied function to be able to save state. A simple example would be to keep record of how many values it was applied on so far. Another example is the exponential moving average (EMA), which is not really a window function, but instead a function which works on single values but which keeps state (the last resulted mean).
How can I write a function which when applied to a vector, works on its values one by one, returning a vector of the same length, which is able to retain its last output every time, or save any other "state" during its calculations? In Python, for example, I'd use classes for that, but that's quite difficult in R.
Important note: I am not interested in auxiliary R packages like zoo or TTR to do the work for me. I am trying to learn R, and in any case the functions I wish to write, while having similarities with MA or EMA, are custom, and do not exist in any of these packages.

Regarding your first question,
n <- length(x)
k <- 4
r <- embed(x, n-k)[1:k, seq(n-k, 1)]
do.call("mapply", c("myfunc", split(r, 1:k)))
Regarding the second question, Reduce can be used to iterate over a vector saving state.
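Both ideas can be sketched in base R as follows. The names roll_apply and ema are hypothetical helpers written for this illustration. Note that roll_apply passes each window to the function as a single vector, so mean can be used directly, and that it returns length(x) - k + 1 windows, one more than the mapply call in the question.

```r
# Apply fun to every window of length k; the window arrives as one vector
roll_apply <- function(x, k, fun) {
  starts <- seq_len(length(x) - k + 1)
  vapply(starts, function(i) fun(x[i:(i + k - 1)]), numeric(1))
}

# Exponential moving average: Reduce carries the previous output as state
ema <- function(x, alpha) {
  Reduce(function(prev, cur) alpha * cur + (1 - alpha) * prev,
         x, accumulate = TRUE)
}

x <- c(1, 2, 3, 4, 5, 6)
roll_apply(x, 4, mean)
# [1] 2.5 3.5 4.5
ema(x, 0.5)
# [1] 1.00000 1.50000 2.25000 3.12500 4.06250 5.03125
```

If more state is needed than the last output (for example a running count), the accumulator passed through Reduce can be a list, with the wanted component extracted afterwards.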

For things like this you should consider using a plain for loop:
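x <- runif(10000)
k <- 100
n <- length(x)
res <- numeric(n - k)
library(microbenchmark)
microbenchmark(times=5,
  # plain loop over the same n - k windows of length k as the embed version
  `for` = { for(i in 1:(n - k)) res[i] <- sum(x[i:(i + k - 1)]) },
  `embed/mapply` = {
    r <- embed(x, n-k)[1:k, seq(n-k, 1)]
    gg <- do.call("mapply", c("sum", split(r, 1:k)))
  },
  # stats::filter, named explicitly in case another package masks filter
  filter = { flt <- stats::filter(x, rep(1, k)) }
)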
Produces:
Unit: milliseconds
         expr       min        lq    median        uq       max neval
          for  163.5403  164.4929  165.2543  166.6315  167.0608     5
 embed/mapply 1255.2833 1307.3708 1338.2748 1341.5719 1405.1210     5
       filter    6.7101    6.7971    6.8073    6.8161    6.8991     5
Now, the results are not identical and I don't pretend to understand exactly what GGrothendieck is doing with embed, but generally speaking for loops are just as fast as *pply functions so long as you initialize your result vectors first. Windowed calculations don't lend themselves well to vectorization, so might as well use a for loop.
EDIT: as several have pointed out in comments, there appears to be an internally implemented function (filter) that does this and is quite a bit faster, so that seems to be the best option. You should still confirm it actually does what you want: again, the results are not exactly identical and I am not personally familiar with the function. In its default configuration it appears to do a rolling weighted sum (or a plain sum if the weights are 1) with a centered window.
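A short check of what stats::filter computes, as a sketch on made-up data. With sides = 1 the window covers the current value and the k - 1 values before it, so the first k - 1 outputs are NA and the rest are trailing sums. The default, sides = 2, centers the window instead.

```r
set.seed(42)
x <- runif(50)
k <- 5
n <- length(x)

# Trailing window: position i holds sum(x[(i - k + 1):i])
flt <- as.numeric(stats::filter(x, rep(1, k), sides = 1))

# Reference computed with an explicit loop over window end positions
ref <- vapply(k:n, function(i) sum(x[(i - k + 1):i]), numeric(1))

all(is.na(flt[seq_len(k - 1)]))   # leading values are NA
all.equal(flt[k:n], ref)          # remaining values match the loop
```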