How to modify my R code to accelerate computational speed

Here is my R code. Could you please give me some advice on how to accelerate the computation? :)
First, the function myfun() generates a complex number.
Second, I compute the elements of matrix M using myfun().
myfun<-function(a,b,nq,ul,uk)
{
m<-seq(1,(nq/2)+1,length=(nq/2)+1);
k<-m;
D<-matrix(NA,nrow = length(k),ncol = length(k));
for(i in 1:length(k)) # row
for(j in 1:length(m)) # column
{
D[i,j]<-(2/nq)*cos(((j-1)*(i-1)*pi)/(nq*0.5))
}
D[,1]<-D[,1]*0.5;
D[,ncol(D)]<-D[,ncol(D)]*0.5;
# compute the vector v
vseq<-seq(2,nq-2,by=2);
vr<-2/(1-vseq^2);
vr<-c(1,vr,1/(1-nq*nq));
v<-matrix(vr,ncol=1); # v is a N by 1 matrix
# compute the vector w, length(w)=nq/2+1
h<-function(x,ul,uk)
{
((b-a)/2)*(exp((b-a)/2*x+(a+b)/2)+1)^(1i*uk)*cos(((b-a)/2*x+(a+b)/2-a)*ul)
}
w<-matrix(rep(NA,length(v)),ncol=1);
for(i in 1:length(w))
{
w[i]<-h((cos((i-1)*pi/nq)),ul,uk)+h((-cos((i-1)*pi/nq)),ul,uk)
}
res<-t(t(D)%*%v)%*%w; # each element of matrix M
return(res)
}
Next, compute each element of matrix M. The N-th column and N-th row are zeros.
matrix.M<-matrix(0,ncol = N,nrow = N);
for(i in 1:(N-1)) # parentheses needed: 1:N-1 evaluates to 0:(N-1)
for(j in 1:(N-1))
{
matrix.M[i,j]<-myfun(a,b,nq,i-1,j-1)
}
We can set parameters as
a<--173.2;
b<-78;
alpha<-0.24;
Dt<-0.1;
M<-1000;
N<-150;
u<-seq(1,150,by=1)*pi/(b-a);
nq<-3000;
I appreciate your help!

Here are some suggestions for speeding the function up. I use three "tricks":
Vectorize as many functions as possible
Use the outer function instead of a double loop
Use the hidden gem crossprod for the final matrix products
myfun<-function(a,b,nq,ul,uk) {
m<-seq(1,(nq/2)+1,length=(nq/2)+1);
k<-m;
## Use outer to compute the elements of the matrix
D <- outer(1:length(k), 1:length(m), function(i, j) {(2/nq)*cos(((j-1)*(i-1)*pi)/(nq*0.5))} )
D[,1]<-D[,1]*0.5;
D[,ncol(D)]<-D[,ncol(D)]*0.5;
# compute the vector v
vseq<-seq(2,nq-2,by=2);
vr<-2/(1-vseq^2);
vr<-c(1,vr,1/(1-nq*nq));
v<-matrix(vr,ncol=1); # v is a N by 1 matrix
h<-function(x,ul,uk) {
((b-a)/2)*(exp((b-a)/2*x+(a+b)/2)+1)^(1i*uk)*cos(((b-a)/2*x+(a+b)/2-a)*ul)
}
## Compute the full w vector in one go
vect <- seq_along(v)-1
w <- h((cos(vect*pi/nq)),ul,uk) + h((-cos(vect*pi/nq)),ul,uk)
## Compute the cross products.
res <- crossprod(crossprod(D, v), w)
return(res)
}
I think this should save around 80% of the time compared to the original function. The time hog was the initial computation of D. Hope this helps.
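A further saving, offered as a sketch rather than as part of the answer above: D and v depend only on nq, not on ul or uk, so the product t(D) %*% v could be computed once and reused for every element of the matrix. The names make_weights and quad_element are hypothetical, and the small parameter values are for illustration only.

```r
# Hypothetical helper: weights t(D) %*% v, which depend only on nq (even, >= 4).
make_weights <- function(nq) {
  idx <- 0:(nq / 2)
  D <- (2 / nq) * cos(outer(idx, idx) * pi / (nq / 2))
  D[, 1] <- D[, 1] * 0.5
  D[, ncol(D)] <- D[, ncol(D)] * 0.5
  v <- c(1, 2 / (1 - seq(2, nq - 2, by = 2)^2), 1 / (1 - nq^2))
  drop(crossprod(D, v))   # plain numeric vector of length nq/2 + 1
}

# Hypothetical replacement for myfun() that reuses the precomputed weights.
quad_element <- function(a, b, ul, uk, nq, weights) {
  h <- function(x) {
    y <- (b - a) / 2 * x + (a + b) / 2
    ((b - a) / 2) * (exp(y) + 1)^(1i * uk) * cos((y - a) * ul)
  }
  x <- cos((0:(nq / 2)) * pi / nq)
  sum(weights * (h(x) + h(-x)))
}

# Usage with small illustrative values (not the question's parameters).
a <- -1; b <- 1; nq <- 8; N <- 4
weights <- make_weights(nq)                      # computed once
matrix.M <- matrix(0 + 0i, nrow = N, ncol = N)   # complex, last row/column stay 0
for (i in 1:(N - 1)) {
  for (j in 1:(N - 1)) {
    matrix.M[i, j] <- quad_element(a, b, i - 1, j - 1, nq, weights)
  }
}
```

With this split, the cost of building D is paid once instead of once per matrix element.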

Related

How to generate n random variables from x using sample() function when x has geometric distribution?

I am trying to write a function in R that generates n random variables from x using the sample() function when x~Ge(p) (that is, x has a geometric distribution). I would like to use a while loop in my function.
I think my function needs two inputs, size and p, and it also needs a for loop. I think something like the framework below will work:
rGE <- function(size,p){
for
i<-1
while()
...
return(i)
}
I would like to develop the function above so that it generates n random variables from x (when x~Ge(p)).
For a home-grown, inefficient (but comprehensible) version of rgeom, something like this should work:
my_rgeom <- function(n, p) {
x <- numeric(n) ## allocate space for the results (all zeros)
for (i in seq(n)) {
done <- FALSE
while (!done) {
x[i] <- x[i] + 1
done <- runif(1)<p
}
}
return(x)
}
I'm sure you could use sample() instead of runif() for the innermost loop, but it's not obvious to me how. One piece of advice: if you're unfamiliar with programming, try writing your proposed algorithm out as pseudocode rather than jumping in to R-bashing right away. It can be easier if you deal with the logic and the coding nuts-and-bolts separately ...
You could use rgeom:
set.seed(1)
rgeom(n = 10, p = .1)
#> [1] 6 3 23 3 24 13 15 2 20 3
I finally wrote the function below:
rge<- function(n, p) {
x <- numeric(n)
for (i in seq(n)) {
j <- 0
while (j==0) {
x[i] <- x[i] + 1
j <- sample(0:1,size=1,prob=c(1-p,p)) # size=1: one Bernoulli(p) draw per trial (the default size would draw two values)
}
}
return(x)
}
rge(10,.2)
I hope it really generates n random numbers from a geometric distribution.
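One way to check a sampler like this is to compare its sample mean with the theoretical value 1/p. The sketch below uses a hypothetical function name, rge_sample, and passes size = 1 explicitly, because sample() without a size argument draws length(x) values. Note that rgeom() counts failures before the first success, so 1 is added to put it on the same scale.

```r
# Hypothetical sampler: number of Bernoulli(p) trials up to and including
# the first success, drawing one 0/1 value per trial with sample().
rge_sample <- function(n, p) {
  x <- numeric(n)
  for (i in seq_len(n)) {
    success <- 0
    while (success == 0) {
      x[i] <- x[i] + 1
      success <- sample(0:1, size = 1, prob = c(1 - p, p))
    }
  }
  x
}

set.seed(42)
p <- 0.2
draws <- rge_sample(5000, p)
mean(draws)               # should be close to 1 / p = 5
mean(rgeom(5000, p) + 1)  # rgeom() counts failures, so add 1 to compare
```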

Nested rolling sum in vector

I am struggling to write efficient code that computes the result vector r from an input vector v using this function:
r(i) = \sum_{j=i-N}^{i} [v(i)-v(j)] * exp(v(i)-v(j))
where i loops (from N to M) over the vector v. The size of v is M>>N.
Of course this is feasible with 2 nested for loops, but that is too slow for my purposes and probably not idiomatic R.
A MWE:
for (i in c(N+1):length(v)){
csum <- 0
for (j in i:c(i-N)) {
csum <- csum + (v[i]-v[j])*exp(v[i]-v[j])
}
r[i] <- csum
}
In my real application M > 10^5 and the v vector is indeed several vectors.
I have been trying with nested applications of lapply and rollapply without success.
Any suggestion is welcome.
Thanks!
I don't know if it is any more efficient but something you can try:
r[N:M] <- sapply(N:M, function(i) tail(cumsum((v[i]-v[1:N])*exp(v[i]-v[1:N])), 1))
To check that both computations give the same results, I computed r your way and r2 my way (initializing r2 to rep(NA, M)) and compared them:
all((r-r2)<1e-12, na.rm=TRUE)
# [1] TRUE
NOTE: as in @lmo's answer, tail(cumsum(...), 1) can be replaced more efficiently by just using sum(...):
r[N:M] <- sapply(N:M, function(i) sum((v[i]-v[1:N])*exp(v[i]-v[1:N])))
Here is a method with a single for loop.
# create new blank vector
rr <- rep(NA,M)
for(i in N:length(v)) {
rr[i] <- sum((v[i] - v[seq_len(N)]) * exp(v[i] - v[seq_len(N)]))
}
check for equality
all.equal(r, rr)
[1] TRUE
You can avoid computing the difference twice by storing it. This should give a small speed-up.
for(i in N:length(v)) {
x <- v[i] - v[seq_len(N)]
rr[i] <- sum(x * exp(x))
}
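The snippets above take the differences against v[1:N], the first N elements. If the intended window is the rolling one from the question's loop (j from i-N to i), a loop-free sketch could use embed() to lay out all windows as rows of a matrix. The name rolling_r is hypothetical, and the approach assumes the (M-N) x (N+1) matrix fits in memory.

```r
# Hypothetical helper: rolling sum over the window j = i-N, ..., i.
rolling_r <- function(v, N) {
  E <- embed(v, N + 1)   # row for index i holds v[i], v[i-1], ..., v[i-N]
  x <- E[, 1] - E        # v[i] - v[j] for every j in the window
  c(rep(NA_real_, N), rowSums(x * exp(x)))
}

# Check against the double loop on a small random vector.
set.seed(1)
v <- rnorm(50); N <- 5
r_loop <- rep(NA_real_, length(v))
for (i in (N + 1):length(v)) {
  csum <- 0
  for (j in i:(i - N)) csum <- csum + (v[i] - v[j]) * exp(v[i] - v[j])
  r_loop[i] <- csum
}
r_fast <- rolling_r(v, N)
all.equal(r_loop, r_fast)
```

For very long vectors the same idea can be applied to blocks of v so that only part of the window matrix exists at any time.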

Paste elements together pairwise

Let's say I have two vectors and a distance matrix like below
v1 = sample(c(0,1),5,replace=TRUE)
v2 = sample(c(0,1),5,replace=TRUE)
d = matrix(rep(1,5*5),ncol=5)
diag(d) <- 0
I'm calculating the distance using the function below.
How do I paste them together?
Here is a first attempt by removing the inner loop and using the vectorization for the vector2 multiplication and sum:
f_d_categorical2 <- function(vector1, vector2, dist.matrix) {
ptm <- proc.time()
dist <- 0
for (i in 1:length(vector1)) {
dist <- dist + sum(vector1[i]*vector2*dist.matrix[i,])
}
print(proc.time()-ptm)
return(dist)
}
Process time went from 1.8 to 0.03 sec. I am sure there is room for improvement and additional test cases.
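The remaining loop can also be removed, because the accumulated quantity is the bilinear form t(vector1) %*% dist.matrix %*% vector2. The following is a minimal sketch under that reading; the function name f_d_quadratic is hypothetical.

```r
# Hypothetical loop-free version: dist = t(vector1) %*% dist.matrix %*% vector2
f_d_quadratic <- function(vector1, vector2, dist.matrix) {
  drop(crossprod(vector1, dist.matrix %*% vector2))
}

set.seed(1)
v1 <- sample(c(0, 1), 5, replace = TRUE)
v2 <- sample(c(0, 1), 5, replace = TRUE)
d <- matrix(1, 5, 5)
diag(d) <- 0

# Reference value from the single-loop version above.
loop_dist <- 0
for (i in seq_along(v1)) loop_dist <- loop_dist + sum(v1[i] * v2 * d[i, ])
fast_dist <- f_d_quadratic(v1, v2, d)
all.equal(loop_dist, fast_dist)
```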

Calculate "matrix of cofactors" in R?

Is there any way to calculate the matrix of cofactors in R directly?
(Without multiplying it by determinant!)
http://en.wikipedia.org/wiki/Minor_(linear_algebra)#Matrix_of_cofactors
Build your own function:
library(functional)
M<-matrix(1:9,3,3)
getCofactor = function(M, i, j)
{
stopifnot(length(unique(dim(M)))==1)
stopifnot(all(c(i,j)<=dim(M)))
det(M[-i,-j,drop=FALSE])*(-1)^(i+j) # drop=FALSE keeps a 1x1 minor as a matrix
}
grid = expand.grid(1:dim(M)[1], 1:dim(M)[2])
matrix(mapply(Curry(getCofactor, M=M), grid$Var1, grid$Var2), nrow=dim(M)[1])
You can write a function that gives you the whole matrix of cofactors in one call.
getCofactors <- function(M) {
stopifnot(length(unique(dim(M)))==1) # Check if Matrix = Square
cf <- M # creating a Matrix that has the same Dimensions as M
for(i in 1:dim(M)[1]){
for(j in 1:dim(M)[2]){
cf[i,j] <- (det(M[-i,-j,drop=FALSE])*(-1)^(i+j)) # overwriting the values of cf with cofactors (drop=FALSE keeps a 1x1 minor as a matrix)
}
}
return(cf) # output of cofactors matrix
}
If you want, you can save your function and load it on demand:
dump("getCofactors", file="getCofactors.R")
source("getCofactors.R")
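A base-R variant without explicit loops or extra packages might look like the sketch below. The name cofactor_matrix is hypothetical. The check uses the identity that, for an invertible matrix, the transposed cofactor matrix (the adjugate) equals det(A) * solve(A); the function itself does not rely on that identity.

```r
# Hypothetical helper: cofactor matrix built with outer() instead of explicit loops.
cofactor_matrix <- function(M) {
  stopifnot(is.matrix(M), nrow(M) == ncol(M), nrow(M) >= 2)
  # drop = FALSE keeps the minor a matrix even when M is 2 x 2
  one <- function(i, j) det(M[-i, -j, drop = FALSE]) * (-1)^(i + j)
  outer(seq_len(nrow(M)), seq_len(ncol(M)), Vectorize(one))
}

A <- matrix(c(2, 0, 1,
              1, 3, 0,
              0, 1, 4), nrow = 3)
C <- cofactor_matrix(A)
# Independent check for an invertible matrix: adjugate = det(A) * solve(A)
all.equal(t(C), det(A) * solve(A))
```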

Allocate a big matrix

I'm using the bigmemory package and want to calculate w. The length of v is 478000 and the length of k is 240500, so the matrix w built from them is very large.
I ran the code with a loop, but it is still running and I don't know whether it will ever give me a result.
I tried to calculate it without the for loop, but I got an error. Any help correcting my code to make it fast would be appreciated.
v <-read.big.matrix('v.dat',type='double')
k <-read.big.matrix('k.dat',type='double')
m=length(v);
n=length(k);
for(i in 1:m)
{
for(j in 1:n)
{
w[i,j] = 2 * cos(dt * v[i] * k[j]) - 2
}
}
How can I define w before the loop? Because w is so large, I couldn't simply do w <- matrix(nr,ncol).
Preallocating a matrix can be done like this:
m = matrix(0, number_of_rows, number_of_columns)
This creates a matrix with the number of rows and columns given by the variables number_of_rows and number_of_columns, initially filled with 0.
The likely problem is memory: w has length(v) x length(k) elements, which for the lengths in the question is 478000 x 240500, about 1.15e11 doubles or roughly 920 GB. You could work around this by also using a bigmemory matrix for w, or by running your analysis in chunks.
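The chunking idea can be sketched in base R as follows. The helper name fill_in_chunks and the chunk size are hypothetical. The demonstration uses an ordinary in-memory matrix; that the same row-block assignment would work on a big.matrix is an assumption not tested here.

```r
# Size estimate from the lengths quoted in the question.
m <- 478000; n <- 240500
m * n * 8 / 1e9     # gigabytes needed for a dense double matrix, about 920

# Hypothetical helper: fill w block by block so that only
# chunk_rows * length(k) temporary values exist at a time.
# w is assumed to support assignment of the form w[rows, ] <- ...
fill_in_chunks <- function(w, v, k, dt, chunk_rows = 1000) {
  starts <- seq(1, length(v), by = chunk_rows)
  for (s in starts) {
    rows <- s:min(s + chunk_rows - 1, length(v))
    w[rows, ] <- 2 * cos(dt * outer(v[rows], k)) - 2
  }
  w
}

# Small in-memory demonstration with an ordinary matrix.
set.seed(1)
v <- runif(25); k <- runif(7); dt <- 0.1
w <- fill_in_chunks(matrix(0, length(v), length(k)), v, k, dt, chunk_rows = 10)
```

Each block is computed with outer(), so the inner double loop disappears while peak memory stays bounded by the chunk size.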
You need to use the "big.matrix"-class constructors, and since you are obviously exceeding RAM resources, it appears necessary to define w as a "filebacked.big.matrix":
w <- filebacked.big.matrix( m, n , # additional arguments to allocate files and dims
)
See the last example in:
help(big.matrix, package=bigmemory)
agstudy is on the right track, but you could use outer here, as
w <- outer(v,k,FUN=function(x,y) 2*cos(x*y)-2 )
v<-runif(10)
k<-runif(10)
m=length(v);
n=length(k);
w<-matrix(nr=m,nc=n)
for(i in 1:m)
{
for(j in 1:n)
{
w[i,j] = 2 * cos( v[i] * k[j]) - 2
}
}
ww <- outer(v,k,function(x,y) 2*cos(x*y)-2)
Test: ww-w is a matrix of zeroes.
I would do something like this using R vectorization feature:
for(i in 1:m)
{
w[i, ] = 2 * cos(dt * v[i] * k) - 2 # fills row i: all n terms computed at once
}

Resources