I'd like to sort out what is the fastest way to get the norms of a set of vectors contained in a matrix. I was using apply (this is an example; my matrices are much bigger):
a = matrix(1:9, 3,3)
norm_a = apply(a, 1, function(x) sqrt(sum(x^2)))
but then I wanted to speed up my code and moved to:
norm_a = sqrt(a^2%*%rep(1,dim(a)[2]))
which is actually much faster (measured with system.time; I'm not an expert in benchmarking). But I haven't found any definitive answer to this question so far. Does anyone have an insight about this?
thanks
This depends on the size of your matrix:
library(microbenchmark)
microbenchmark(f1 = apply(a, 1, function(x) sqrt(sum(x^2))),
f2 = sqrt(a^2%*%rep(1,dim(a)[2])),
f3 = sqrt(rowSums(a^2)))
#Unit: microseconds
# expr min lq mean median uq max neval cld
# f1 44.656 46.812 52.21050 47.5815 49.4295 191.248 100 c
# f2 1.849 2.772 4.07532 4.3120 4.6210 16.323 100 a
# f3 6.160 7.392 9.25537 9.5480 10.1630 20.943 100 b
set.seed(42)
b <- matrix(rnorm(1e6), 1000)
microbenchmark(f1 = apply(b, 1, function(x) sqrt(sum(x^2))),
f2 = sqrt(b^2%*%rep(1,dim(b)[2])),
f3 = sqrt(rowSums(b^2)))
#Unit: milliseconds
# expr min lq mean median uq max neval cld
# f1 30.851752 55.513228 86.84168 109.439043 112.54796 152.27730 100 b
# f2 5.503050 7.434152 14.36080 8.861268 10.42327 66.41539 100 a
# f3 4.430403 5.895553 12.92235 7.359163 8.62321 74.65256 100 a
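As a quick sanity check (my addition, not part of the original answer), the three approaches should agree up to floating-point error:
n1 <- apply(b, 1, function(x) sqrt(sum(x^2)))
n2 <- c(sqrt(b^2 %*% rep(1, ncol(b))))  # c() drops the n x 1 matrix dimension
n3 <- sqrt(rowSums(b^2))
isTRUE(all.equal(n1, n2)) && isTRUE(all.equal(n1, n3))
#> expected: TRUE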
I have a sparse matrix with zeros on the main diagonal that I want to change to ones, but compared to a QR decomposition the operation is very slow:
mat <- matrix(c(0,1,1,1,0,1,1,1,0),ncol=3)
mat1 <- Matrix::bdiag(mat,mat,mat)
mat2 <- Matrix::bdiag(mat,mat,mat)
identity_mat <- Matrix::Diagonal(9)
microbenchmark::microbenchmark(
qr(mat1),
Matrix::diag(mat2) <- 1,
mat1 + identity_mat
)
results in
Unit: microseconds
expr min lq mean median uq max neval
qr(mat1) 55.825 69.0080 79.16561 72.9365 85.6095 149.676 100
Matrix::diag(mat2) <- 1 302.172 326.2365 379.60509 364.1985 401.8005 756.477 100
mat1 + identity_mat 1714.578 1762.8665 2006.50270 1974.4125 2073.1795 6671.644 100
How can I set the diagonal to ones faster?
This is slightly faster (it uses the triplet sparse matrix instead of the compressed one).
N <- 3
mat <- matrix(c(0,1,1,1,0,1,1,1,0),ncol=3)
mat1 <- do.call(Matrix::bdiag, replicate(N, mat, simplify = FALSE))
mat2 <- do.call(Matrix::bdiag, replicate(N, mat, simplify = FALSE))
mat3 <- Matrix::.bdiag(replicate(N, mat, simplify = FALSE))
identity_mat <- Matrix::Diagonal(3*N)
microbenchmark::microbenchmark(
qr(mat1),
Matrix::diag(mat2) <- 1,
Matrix::diag(mat3) <- 1,
mat1 + identity_mat
)
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> qr(mat1) 50.519 65.8000 83.40258 74.1075 84.9095 451.866 100
#> Matrix::diag(mat2) <- 1 266.200 318.6375 452.58706 338.8715 405.3270 5460.654 100
#> Matrix::diag(mat3) <- 1 164.340 181.7700 246.14324 204.1055 235.4700 3083.771 100
#> mat1 + identity_mat 1519.636 1739.8940 2297.10306 1863.0430 2251.7720 18617.782 100
For much larger matrices these barely take any longer (below is for N = 300), which makes me wonder if it's just creating the S4 objects that is slow (there's probably a lot of validation going on in the background).
N <- 300
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> qr(mat1) 239799.888 251484.867 260169.1626 257957.9940 265350.8880 321234.482 100
#> Matrix::diag(mat2) <- 1 396.399 415.131 529.8535 495.5805 575.4920 2367.596 100
#> Matrix::diag(mat3) <- 1 257.128 276.636 361.8436 322.2445 380.6375 2210.064 100
#> mat1 + identity_mat 1605.454 1692.756 2176.5367 1833.2210 2000.9815 16803.231 100
If you can make assumptions about your matrices, you may be able to hack it to work faster. In particular, if the matrix you are writing the diagonal to has no entries on the diagonal beforehand (as in your example), you could do this:
N <- 3
mat4 <- Matrix::.bdiag(replicate(N, mat, simplify = FALSE))
insert_diagonal <- function(m, d) {
  m@i <- c(m@i, 0:(d-1))
  m@j <- c(m@j, 0:(d-1))
  m@x <- c(m@x, rep(1, d))
  m
}
microbenchmark::microbenchmark(
qr(mat1),
Matrix::diag(mat2) <- 1,
Matrix::diag(mat3) <- 1,
insert_diagonal(mat4, 3*N)
)
#> Unit: microseconds
#> expr min lq mean median uq max neval
#> qr(mat1) 63.885 81.0315 97.14267 90.4660 99.4635 413.534 100
#> Matrix::diag(mat2) <- 1 325.229 368.2320 417.94677 408.6095 425.9595 755.734 100
#> Matrix::diag(mat3) <- 1 195.907 212.6790 266.83832 249.9585 266.0280 796.030 100
#> insert_diagonal(mat4, 3 * N) 23.676 30.2365 35.59022 35.5075 39.2745 62.028 100
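As a sanity check (my addition, not in the original answer; mat5 is just a throwaway copy), the hack should produce the same matrix as Matrix::diag<- whenever the diagonal really is empty beforehand:
mat5 <- Matrix::.bdiag(replicate(N, mat, simplify = FALSE))
Matrix::diag(mat5) <- 1
all(as.matrix(insert_diagonal(mat4, 3 * N)) == as.matrix(mat5))
#> expected: TRUE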
You can try
`diag<-`(as.matrix(mat1), 1)
and benchmark
microbenchmark::microbenchmark(
qr(mat1),
Matrix::diag(mat2) <- 1,
`diag<-`(as.matrix(mat1), 1)
)
gives
Unit: microseconds
expr min lq mean median uq max neval
qr(mat1) 52.4 56.55 61.457 60.85 64.05 103.5 100
Matrix::diag(mat2) <- 1 269.2 275.35 290.202 282.75 297.65 443.4 100
`diag<-`(as.matrix(mat1), 1) 38.6 42.40 47.721 46.10 48.90 147.4 100
Update
microbenchmark::microbenchmark(
qr(mat1),
Matrix::diag(mat2) <- 1,
as(`diag<-`(as.matrix(mat1), 1), "sparseMatrix")
)
shows
Unit: microseconds
                                             expr   min     lq    mean median     uq   max neval
                                         qr(mat1)  50.6  56.55  61.592   60.2  64.00 179.5   100
                          Matrix::diag(mat2) <- 1 270.6 277.20 290.993  280.3 294.35 492.3   100
 as(`diag<-`(as.matrix(mat1), 1), "sparseMatrix")  96.6 105.45 111.285  112.1 115.25 136.2   100
Here's a solution using RcppArmadillo:
Rcpp::cppFunction(code="
arma::sp_mat set_unit_diagonal(arma::sp_mat& A) {
A.diag().ones();
return A;
}", depends="RcppArmadillo")
mat <- matrix(c(0,1,1,1,0,1,1,1,0),ncol=3)
mat1 <- Matrix::bdiag(mat,mat,mat)
mat2 <- Matrix::bdiag(mat,mat,mat)
mat3 <- Matrix::bdiag(mat,mat,mat)
identity_mat <- Matrix::Diagonal(9)
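The benchmark call itself isn't shown here; presumably it was something along these lines (reconstructed from the expr column of the results below, so treat it as an assumption):
microbenchmark::microbenchmark(
  qr(mat1),
  Matrix::diag(mat2) <- 1,
  mat1 + identity_mat,
  set_unit_diagonal(mat3)
)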
The solution is very quick:
Unit: microseconds
expr min lq mean median uq max neval
qr(mat1) 52.374 62.4805 69.12798 65.747 75.4455 110.630 100
Matrix::diag(mat2) <- 1 272.026 289.9080 323.98992 300.557 358.5355 419.486 100
mat1 + identity_mat 1543.191 1620.2835 1913.68513 1637.990 1970.8930 13572.716 100
set_unit_diagonal(mat3) 7.426 11.9100 14.00686 13.970 15.5955 26.484 100
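A usage note (my assumption, not stated in the answer): as far as I can tell, RcppArmadillo copies the dgCMatrix into an arma::sp_mat, so the modified matrix is the function's return value rather than mat3 itself:
mat3_unit <- set_unit_diagonal(mat3)  # capture the returned matrix
Matrix::diag(mat3_unit)               # should now be all ones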
#BEGIN CODE
my.kernel <- function(Yt){
  for (i in 1:length(Yt)) {
    Yt[i] <- ifelse(abs(Yt[i]) <= 1, (35/32)*(1 - Yt[i]^2)^3, 0)
  }
  Yt
}
# Print results
my.kernel.density.estimator <- function(y, Yt, h){
  result <- 0
  for (i in 1:length(Yt)) {
    result <- result + (1/(length(Yt)*h))*my.kernel((Yt[i]-y)/h)
  }
  result
}
# Print results
my.loglik.cv <- function(Yt, h){
  result <- 0
  for (i in 1:length(Yt)) {
    result <- result + log(my.kernel.density.estimator(Yt[i], Yt[-i], h))
  }
  result
}
# Print the results
# END CODE
Yt, h and y can be any vector/number. Here is one example.
Yt<- seq(0, 10, 0.01)
h <- 1
y<- 1
The main point is to understand how to make it run faster.
In R, mathematical operations are vectorized. In other words, you do not need to apply the same mathematical operation to each vector element separately; you can perform it on all vector elements simultaneously.
The function
my.kernel <- function(Yt){
  for (i in 1:length(Yt)) {
    Yt[i] <- ifelse(abs(Yt[i]) <= 1, (35/32)*(1 - Yt[i]^2)^3, 0)
  }
  Yt
}
can be rewritten as
my.kernel.vec <- function(x) ifelse(abs(x) <= 1, (35/32)*(1 - x^2)^3, 0)
Yt <- seq(0, 10, 0.01)
h <- 1
y <- 1
all.equal(my.kernel(Yt),
my.kernel.vec(Yt))
#output
TRUE
The difference in speed is not minor:
library(microbenchmark)
microbenchmark(my.kernel(Yt),
my.kernel.vec(Yt))
Unit: microseconds
expr min lq mean median uq max neval cld
my.kernel(Yt) 1110.8 1179.2 1438.136 1311.35 1708.9 6756.4 100 b
my.kernel.vec(Yt) 54.3 66.3 104.204 70.20 74.3 3495.4 100 a
That is quite a speed-up.
Similarly
my.kernel.density.estimator <- function(y, Yt, h){
  result <- 0
  for (i in 1:length(Yt)) {
    result <- result + (1/(length(Yt)*h))*my.kernel((Yt[i]-y)/h)
  }
  result
}
can be changed to use R's vectorized operations:
my.kernel.density.estimator.vec <- function(y,Yt,h) sum((1/(length(Yt)*h))*my.kernel.vec((Yt-y)/h))
all.equal(my.kernel.density.estimator.vec(1, Yt, 1),
my.kernel.density.estimator(1, Yt, 1))
#output
TRUE
microbenchmark(my.kernel.density.estimator.vec(1, Yt, 1),
my.kernel.density.estimator(1, Yt, 1))
Unit: microseconds
expr min lq mean median uq max neval cld
my.kernel.density.estimator.vec(1, Yt, 1) 57.8 59.6 101.918 63.10 70.25 3716.4 100 a
my.kernel.density.estimator(1, Yt, 1) 2110.8 2163.6 2285.316 2231.35 2283.20 7826.7 100 b
Finally, in
my.loglik.cv <- function(Yt, h){
  result <- 0
  for (i in 1:length(Yt)) {
    result <- result + log(my.kernel.density.estimator(Yt[i], Yt[-i], h))
  }
  result
}
you need the loop in order to create the vectors Yt[i] and Yt[-i], so I left it as is.
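The benchmark below refers to my.loglik.cv.vec, which isn't defined above; presumably it is the same loop with the vectorized estimator substituted in, something like:
my.loglik.cv.vec <- function(Yt, h){
  result <- 0
  for (i in 1:length(Yt)) {
    result <- result + log(my.kernel.density.estimator.vec(Yt[i], Yt[-i], h))
  }
  result
}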
microbenchmark(my.loglik.cv.vec(Yt, 1),
my.loglik.cv(Yt, 1), times = 10)
Unit: milliseconds
expr min lq mean median uq max neval cld
my.loglik.cv.vec(Yt, 1) 59.1957 59.6794 79.13856 90.46365 92.7877 93.4487 10 a
my.loglik.cv(Yt, 1) 2240.7176 2280.7737 2309.83982 2299.39885 2343.6714 2412.8111 10 b
Not to mention the speedup on larger vectors:
Yt <- seq(0, 10, 0.001)
microbenchmark(my.loglik.cv.vec(Yt, 1),
my.loglik.cv(Yt, 1), times = 1)
Unit: seconds
expr min lq mean median uq max neval
my.loglik.cv.vec(Yt, 1) 5.460431 5.460431 5.460431 5.460431 5.460431 5.460431 1
my.loglik.cv(Yt, 1) 230.221194 230.221194 230.221194 230.221194 230.221194 230.221194 1
So I have created a function that generates a data frame of means and standard deviations for samples from a gamma distribution.
Currently, I have:
sample_gamma <- function(alpha,beta,n,iter) {
gamma.df <- as.data.frame(matrix(nrow = iter, ncol = 3))
colnames(gamma.df) <- c("iteration","mean","standard dev")
gamma.df$iteration <- c(1:iter)
for (i in 1:iter) {
gamma.dist <- rgamma(n,shape = alpha, rate = beta, scale = 1/beta)
gamma.df[i,2] <- mean(gamma.dist)
gamma.df[i,3] <- sd(gamma.dist)
}
print(gamma.df)
}
The function does everything I need it to do, but I was wondering if there are any alternative or cleaner ways to do it.
I would create a function which returns mean and sd for one iteration.
sample_gamma <- function(alpha,beta,n) {
dist <- rgamma(n,shape = alpha, rate = beta)
c(mean = mean(dist), sd = sd(dist))
}
and then repeat it using replicate
t(replicate(5, sample_gamma(2, 3, 4)))
# mean sd
#[1,] 0.5990206 0.2404226
#[2,] 0.6108976 0.3083426
#[3,] 1.0616542 0.4602403
#[4,] 0.3415355 0.1543885
#[5,] 1.0558066 0.9659599
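If you still want the exact data-frame shape of your original function, a small wrapper around the same idea could look like this (my addition, res and gamma.df are just illustrative names):
res <- t(replicate(5, sample_gamma(2, 3, 4)))
gamma.df <- data.frame(iteration = seq_len(nrow(res)),
                       mean = res[, "mean"],
                       `standard dev` = res[, "sd"],
                       check.names = FALSE)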
While I think Ronak Shah's answer is simple and relatively idiomatic (R-wise), here's one that is a little more efficient at high iter counts, since it makes a single call to rgamma() instead of one per iteration:
sample_gamma <- function(alpha, beta, n, iter) {
mtx <- matrix(rgamma(n*iter, shape=alpha, rate=beta), nrow=n, ncol=iter)
t(apply(mtx, 2, function(a) c(mean=mean(a), sd=sd(a))))
}
sample_gamma(2, 3, 4, 5)
# mean sd
# [1,] 0.6486220 0.22900833
# [2,] 0.8551055 0.07874287
# [3,] 0.7854750 0.72694260
# [4,] 0.7045878 0.24834502
# [5,] 1.1783301 0.25210538
Benchmarking (sample_gamma_RS is Ronak Shah's version, sample_gamma_r2 is the one above):
microbenchmark::microbenchmark(
RS=t(replicate(5, sample_gamma_RS(2,3,4))),
r2=sample_gamma_r2(2,3,4,5)
)
# Unit: microseconds
# expr min lq mean median uq max neval
# RS 413.7 493.70 757.884 743.80 946.1 1611.6 100
# r2 405.2 461.15 681.630 706.35 898.6 1348.2 100
microbenchmark::microbenchmark(
RS=t(replicate(500, sample_gamma_RS(2,3,4))),
r2=sample_gamma_r2(2,3,4,500)
)
# Unit: milliseconds
# expr min lq mean median uq max neval
# RS 31.271 40.58735 56.44298 57.85735 65.08605 95.1866 100
# r2 29.110 38.81230 53.99426 57.45820 61.35720 100.5820 100
microbenchmark::microbenchmark(
RS=t(replicate(500, sample_gamma_RS(2,3,400))),
r2=sample_gamma_r2(2,3,400,500)
)
# Unit: milliseconds
# expr min lq mean median uq max neval
# RS 60.6782 101.3112 121.3533 116.7464 140.8845 227.1904 100
# r2 66.3892 81.0329 106.9920 98.7170 126.7742 198.3947 100
I confess I thought it would be a more dramatic difference in performance.
If a function returns 2 or more values and fill = NA is used, rollapply becomes much slower.
Is there any way to avoid this?
library(zoo)
library(microbenchmark)
f1 <- function(v) c(mean(v) + median(v))  # returns a vector of length 1
f2 <- function(v) c(mean(v), median(v))   # returns a vector of length 2
v <- rnorm(1000)
microbenchmark(rollapplyr(v, 20, f1), rollapplyr(v, 20, f1, fill = NA))
# expr min lq mean median uq max neval
# rollapplyr(v, 20, f1) 50.84485 53.68726 57.21892 54.63793 57.78519 75.88305 100
# rollapplyr(v, 20, f1, fill = NA) 52.11355 54.69866 59.73473 56.20600 63.10546 99.96493 100
microbenchmark(rollapplyr(v, 20, f2), rollapplyr(v, 20, f2, fill = NA))
# expr min lq mean median uq max neval
# rollapplyr(v, 20, f2) 51.77687 52.29403 56.80307 53.44605 56.65524 105.6713 100
# rollapplyr(v, 20, f2, fill = NA) 69.93853 71.08953 76.48056 72.21896 80.58282 151.4455 100
The reason is to be found in the speed of applying na.fill() to different types of data, which is what happens internally in the rollapply() function. Your f1 returns a single vector, whereas f2 returns a two-column matrix (well, both are zoo objects actually, but you catch my drift).
The speed decrease for inserting the NA is not proportionate to the mere doubling of the number of elements, as this shows:
library(zoo)
library(microbenchmark)
v <- zoo(rnorm(1000))
m <- zoo(matrix(rnorm(2000), ncol=2))
ix <- seq(1000)>50
microbenchmark(na.fill(v, NA, ix), na.fill(m, NA, ix))
# Unit: microseconds
# expr min lq mean median uq max neval
# na.fill(v, NA, ix) 402.861 511.912 679.1114 659.597 754.8385 4716.46 100
# na.fill(m, NA, ix) 9746.643 10091.038 14281.5598 14057.304 17589.9670 22249.96 100
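One possible workaround (my sketch, not part of the original answer): skip fill = NA in rollapplyr() and pad the leading rows yourself, which avoids the slow na.fill() step on the two-column object:
v <- rnorm(1000)                                           # plain vector, as in the question
res <- rollapplyr(v, 20, f2)                               # (length(v) - 19) x 2 matrix
res_padded <- rbind(matrix(NA_real_, 19, ncol(res)), res)
# res_padded should have the same shape and values as rollapplyr(v, 20, f2, fill = NA)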
Is there a generic way of making the following R code faster? For example, in Python a dict comprehension (see the equivalent below) would be a nice, faster alternative.
R:
l1 <- 1:3
l2 <- c("a", "b", "c")
foo <- function(x) {return(5*x)}
bar <- list()
for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])
Python:
l1 = range(1, 4)
l2 = ["a", "b", "c"]
def foo(x):
return 5*x
{b: foo(a) for a, b in zip(l1, l2)}
We're talking about speed, so let's do some benchmarking:
library(microbenchmark)
microbenchmark(op = {for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])},
lapply = setNames(lapply(l1,foo),l2),
vectorised = setNames(as.list(foo(l1)), l2))
Unit: microseconds
expr min lq mean median uq max neval
op 7.982 9.122 10.81052 9.693 10.548 36.206 100
lapply 5.987 6.557 7.73159 6.842 7.270 55.877 100
vectorised 4.561 5.132 6.72526 5.417 5.987 80.964 100
But these small values don't mean much, so I pumped up the vector length to 10,000 where you'll really see a difference:
l <- 10000
l1 <- seq_len(l)
l2 <- sample(letters, l, replace = TRUE)
microbenchmark(op = {bar <- list(); for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])},
lapply = setNames(lapply(l1,foo),l2),
vectorised = setNames(as.list(foo(l1)), l2),
times = 100)
Unit: microseconds
expr min lq mean median uq max neval
op 30122.865 33325.788 34914.8339 34769.8825 36721.428 41515.405 100
lapply 13526.397 14446.078 15217.5309 14829.2320 15351.933 19241.767 100
vectorised 199.559 259.997 349.0544 296.9155 368.614 3189.523 100
But tacking onto what everyone else said, it doesn't have to be a list. If you remove the list requirement:
microbenchmark(setNames(foo(l1), l2))
Unit: microseconds
expr min lq mean median uq max neval
setNames(foo(l1), l2) 22.522 23.8045 58.06888 25.0875 48.322 1427.417 100
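One caveat worth adding (my note, not in the original answer): with unique names the vectorised versions reproduce the loop exactly, but with duplicated names (as in the length-10,000 benchmark above) the loop overwrites entries by name while setNames() keeps every element:
l1 <- 1:3; l2 <- c("a", "b", "c")  # unique names, as in the original question
bar <- list()
for (i in 1:length(l1)) bar[l2[i]] <- foo(l1[i])
identical(bar, setNames(lapply(l1, foo), l2))
#> expected: TRUE (this equivalence breaks down when l2 contains duplicates)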