In R, faster way than for loop or apply


For the two matrices below, I want to multiply the matrix X by each row of Q and use ifelse to flag whether each element of the product is greater than or equal to zero.
n1=1000; m=10000
X=cbind(rnorm(n1),rbinom(n1,size=1,prob=0.6)) # size and prob were missing; values taken from the answer below
Q=matrix(rnorm(2*m), ncol=2)
To do this, I tried a for loop and the apply function, as follows.
D=10000
ind=matrix(0,n1,D)
for (l in 1:D){
  ind[,l]=as.vector(ifelse(X%*%Q[l,]>=0,1,0))
}
and
ind=apply(Q,1,function(x){ifelse(X%*%x>=0,1,0)})
Both versions give the same result, but they are very slow.
Is there any way to make this fast? Thanks in advance.

How about:
Make data (reproducibly):
set.seed(101)
n1=1000; m=10000
X=cbind(rnorm(n1),rbinom(n1,size=1,prob=0.6))
Q=matrix(rnorm(2*m), ncol=2)
Your way takes about 2.5 seconds:
system.time(ind <- apply(Q,1,function(x){ifelse(X%*%x>=0,1,0)}))
This takes about 0.3 seconds:
system.time({
  XQ <- X %*% t(Q)
  ind2 <- matrix(as.numeric(XQ>=0),nrow(XQ))
})
Results match:
all.equal(ind,ind2) ## TRUE
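The speedup comes from replacing many small products with one matrix product: column l of X %*% t(Q) is exactly X %*% Q[l,]. Here is a minimal sketch on deliberately small matrices (the sizes and the names ind_loop and ind_mat are chosen here for illustration only) that checks the loop and the single product agree:

```r
set.seed(101)
n1 <- 50; m <- 20                 # small illustrative sizes, not the question's
X <- cbind(rnorm(n1), rbinom(n1, size = 1, prob = 0.6))
Q <- matrix(rnorm(2 * m), ncol = 2)

# Loop version: one column of 0/1 indicators per row of Q
ind_loop <- matrix(0, n1, m)
for (l in seq_len(m)) {
  ind_loop[, l] <- as.vector(ifelse(X %*% Q[l, ] >= 0, 1, 0))
}

# Single product: tcrossprod(X, Q) computes X %*% t(Q)
XQ <- tcrossprod(X, Q)
ind_mat <- (XQ >= 0) * 1          # logical matrix to 0/1, dimensions kept

all.equal(ind_loop, ind_mat)
```

Multiplying the logical matrix by 1 keeps its dimensions, so no extra call to matrix() is needed.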

Related

For Loop in R - Need Assistance

I'm struggling with for loops and hope that someone can assist me. I need to use a loop in R to determine the value of $\sum_{i=1}^{25} i^2$.
I'm new to learning R, and I can't seem to figure this out.
Thank you for your help.

With a for loop you can do :
n <- 25
vec <- numeric(n)
for(i in seq_len(n)) vec[i] <- i^2
sum(vec)
#[1] 5525
seq_len creates a sequence from 1 to n. For each value we square the number and store it in vec at the ith position.
However, you can do this directly, without a for loop.
sum(seq_len(n)^2)
#[1] 5525

A faster way for larger n is to use the closed-form formula for the sum of squares:
res <- n*(n+1)*(2*n+1)/6
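As a quick check, the closed form n(n+1)(2n+1)/6 can be compared with the direct sum. Note that R needs the multiplication in 2*n written out explicitly. A small sketch (the name closed_form is just for illustration):

```r
n <- 25
closed_form <- n * (n + 1) * (2 * n + 1) / 6   # sum of the first n squares
closed_form == sum(seq_len(n)^2)
#[1] TRUE
```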

R is vectorized, therefore a loop is not the natural way of doing this computation. Better to perform the sum this way:
sum(seq(1, 25, 1)^2)
#5525

Matrix operation efficiency in R

I have 3 matrices X, K and M as follows.
X <- matrix(c(1,2,3,1,2,3,1,2,3),ncol=3)
K <- matrix(c(4,5,4,5,4,5),ncol=3)
M <- matrix(c(0.1,0.2,0.3),ncol=1)
Here is what I need to accomplish.
For example,
Y(1,1)=(1-4)^2*0.1^2+(1-4)^2*0.2^2+(1-4)^2*0.3^2
Y(1,2)=(1-5)^2*0.1^2+(1-5)^2*0.2^2+(1-5)^2*0.3^2
...
Y(3,2)=(3-5)^2*0.1^2+(3-5)^2*0.2^2+(3-5)^2*0.3^2
Currently I use three nested for loops to calculate the final matrix in R, but for large matrices this takes extremely long. I also need to vary the elements of M to find the values that minimize the squared error. Is there a better way to code this, for example with a Euclidean norm?
Y <- matrix(0, nrow(X), nrow(K)) # one row per row of X, one column per row of K
for (lin in seq_len(nrow(X))) {
  for (col in seq_len(nrow(K))) {
    for (m in seq_len(ncol(X))) {
      Y[lin,col] <- Y[lin,col] + (X[lin,m]-K[col,m])^2 * M[m,1]^2
    }
  }
}
Edit:
I ended up using Rcpp to write the code in C++ and call it from R. It is significantly faster! It takes 2-3 seconds to fill up a 2000 * 2000 matrix.

Thank you. I was able to figure this out. The change made my calculation twice as fast as before. For anyone who may be interested, I replaced the last for loop for(m in 1:M) with the following:
Y[lin,col] <- norm(as.matrix((X[lin,]-K[col,]) * M[1,]),"F")^2
Note that I transposed the matrix M so that it has 3 columns instead of 1.
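The loops can also be removed entirely in plain R. This is not what the poster did, only a sketch of one alternative: expanding (x - k)^2 = x^2 - 2*x*k + k^2 turns the weighted sum into two matrix-vector products and one matrix product. The names w, xx, kk and Y_loop are introduced here for illustration.

```r
X <- matrix(c(1,2,3,1,2,3,1,2,3), ncol = 3)
K <- matrix(c(4,5,4,5,4,5), ncol = 3)
M <- matrix(c(0.1,0.2,0.3), ncol = 1)

w <- as.vector(M)^2                 # squared weights, one per column of X

# Weighted sums of x^2 and k^2, then the cross term -2 * sum(w * x * k)
xx <- as.vector(X^2 %*% w)          # one value per row of X
kk <- as.vector(K^2 %*% w)          # one value per row of K
Y  <- outer(xx, kk, "+") - 2 * X %*% (t(K) * w)

# Reference triple loop, used only to check the result
Y_loop <- matrix(0, nrow(X), nrow(K))
for (lin in seq_len(nrow(X)))
  for (col in seq_len(nrow(K)))
    for (m in seq_len(ncol(X)))
      Y_loop[lin, col] <- Y_loop[lin, col] + (X[lin, m] - K[col, m])^2 * M[m, 1]^2

all.equal(Y, Y_loop)
```

Because only w depends on M, trying a new M means recomputing three cheap products rather than rerunning the loops.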

How to create a function that finds the indexed slope of two vectors

I have looked everywhere for this answer, but I am having a hard time even figuring out how to ask this question. I am trying to create a function that builds a vector from two other vectors, using a for loop to index values at k and k+1. Here is an example of my code, which does not work:
x <- 1:10
y <- x^2
d <- data.frame(x,y)
invSlope <- NULL
invSlope.f <- function(X,Y){
  for(k in 1:length(X)-1){
    invSlope[k] = (X[k+1] - X[k])/ (Y[k+1] - Y[k])
    invSlope[length(X)] = 0
    return(invSlope)
  }
}
d$invSlope <- invSlope.f(d$x,d$y)
What I am trying to accomplish is at d$invSlope[1] I have the inverse of the slope of the line that comes after it (delta x/delta y). The last value of the vector would just be 0. I can accomplish this with a for loop (or even nested for loops), but I would like to generalize this to a function.
Thanks

The diff function is a vectorized approach... we don't need no steenkin' loops:
finvslope <- function(xseq, yseq) { c( diff(xseq)/diff(yseq) , 0) }
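For comparison, here is a sketch of the diff version next to a working loop. The loop (inv_slope_loop is a name made up here) shows the two changes the question's code would need: 1:length(X)-1 parses as (1:length(X)) - 1, so the range needs parentheses or seq_len, and the result has to be returned after the loop rather than inside it.

```r
x <- 1:10
y <- x^2

# Vectorized: successive differences of x over successive differences of y
finvslope <- function(xseq, yseq) c(diff(xseq) / diff(yseq), 0)

# Loop equivalent, for comparison only
inv_slope_loop <- function(X, Y) {
  n <- length(X)
  out <- numeric(n)                 # last element stays 0
  for (k in seq_len(n - 1)) {       # runs k = 1, ..., n - 1
    out[k] <- (X[k + 1] - X[k]) / (Y[k + 1] - Y[k])
  }
  out                               # returned once the loop has finished
}

d <- data.frame(x, y)
d$invSlope <- finvslope(d$x, d$y)
all.equal(d$invSlope, inv_slope_loop(d$x, d$y))
```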

Applying a function to a random sample of vector elements

I have been learning R for the past few days, and want to find out whether the problem below can be solved in a better way (perhaps with more compact code) than my solution.
Problem: A vector V of N (~ 1000) numeric elements, needs to be transformed in the following way.
Choose M (~ 100) elements at random.
Replace each such element x with f(x).
My Solution: for (i in sample(1:N, M)) V[i] = f(V[i])
Edit: The function f takes as input a single numeric value, and also outputs a single numeric value. Something like: f <- function (x) x^3 + 2
Edit: Thanks for everyone's contributions! I now understand the power of vectorized functions. :)

How about this
i <- sample(1:N, M)
V[i] <- f(V[i])
No need for a loop, since [<- is a vectorized function. See ?"[<-" for further details.
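A small self-contained check, assuming the example f from the question and made-up data for V, that the loop and the vectorized assignment produce the same vector when they use the same sampled indices:

```r
set.seed(1)
N <- 1000; M <- 100
f <- function(x) x^3 + 2          # the example f from the question
V <- rnorm(N)                     # illustrative data

i <- sample(N, M)                 # M distinct positions from 1..N

V_loop <- V
for (k in i) V_loop[k] <- f(V_loop[k])

V_vec <- V
V_vec[i] <- f(V_vec[i])           # one vectorized call on the sampled elements

all.equal(V_loop, V_vec)
```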

It depends on the type of your function. If f is vectorised, then
V[i] <- f(V[i]) # i holds the sampled indices, e.g. i <- sample(N, M)
will do the job. If f takes and returns a single value, then:
V[i] <- sapply(V[i], f)
Thankfully, most functions in R are vectorised, so the first approach works quite often.
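To illustrate the second case, here is a sketch with a hypothetical scalar-only function g (not from the question). It uses if, which inspects a single value, so it has to be applied element by element. vapply is used as a stricter alternative to sapply because it checks that each result is a single number.

```r
# Hypothetical scalar-only function: `if` expects a single TRUE/FALSE
g <- function(x) if (x < 0) 0 else x^3 + 2

set.seed(2)
V <- rnorm(1000)                  # illustrative data
i <- sample(length(V), 100)       # positions to transform

V_new <- V
V_new[i] <- vapply(V[i], g, numeric(1))   # apply g to each sampled element
```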

R sum over infinite series loop?

I have this:
time=1:200
m=1:1000
sum[i]= sum(1/(1+2*m)^2)*exp( (-kappa*(1+2*m)^2 * pi^2 * time[i])/(z1^2))
I need to find the sum of the expression above for m=1:1000 and time=1:200
I have tried many varieties of loops and cannot get any of them to work. I am even having trouble expressing the problem here.

This command will return a matrix:
time <- 1:200
m <- 1:1000
sapply(time,
       function(time) sum(1/(1+2*m)^2)*exp((-kappa*(1+2*m)^2*pi^2*time)/(z1^2)))
The matrix holds the result for every combination: rows correspond to the values of m, columns to the values of time.
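If the goal is instead a single sum over m for each time point, the closing parenthesis of sum() has to wrap the whole term, including exp(). Below is a sketch under that reading of the question. The values of kappa and z1 are placeholders invented here, since the question does not give them.

```r
# Placeholder constants: kappa and z1 are not given in the question
kappa <- 0.01
z1    <- 1

time <- 1:200
m    <- 1:1000
a    <- (1 + 2 * m)^2                     # (1 + 2m)^2 for every m

# One sum over m per time point
s <- sapply(time, function(tt) sum(exp(-kappa * a * pi^2 * tt / z1^2) / a))

# Same result from a 1000 x 200 matrix of terms and its column sums
terms <- exp(-kappa * pi^2 / z1^2 * outer(a, time)) / a
s2 <- colSums(terms)

length(s)
all.equal(s, s2)
```

The outer() version avoids the explicit per-time function call at the cost of holding all 200,000 terms in memory at once.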

Maybe this will work:
total <- 0
for (time in 1:200) {
  for (m in 1:1000) {
    total <- total + (1/(1+2*m)^2)*exp((-kappa*(1+2*m)^2*pi^2*time)/(z1^2))
  }
}
The loops evaluate the expression 200,000 times, once for each combination of m and time. At the end, total holds the sum of all these terms (it is named total rather than sum to avoid masking R's built-in sum function). However, I don't know what kappa and z1 are, so my script may need some tweaking.

Another way to do this:
output <- expand.grid(time = 1:200, m =1:1000)
output[,"sum"] <- with(output, sum(1/(1+2*m)^2)*exp( (-kappa*(1+2*m)^2 * pi^2 * time)/(z1^2)))
