Suppose A is some square matrix. How can I easily exponentiate this matrix in R?
I have tried two ways already: Trial 1 uses a for-loop hack and Trial 2 is a bit more elegant, but it is still a far cry from the simplicity of A^k.
Trial 1
t(matrix(rnorm(16),ncol=4,nrow=4)) -> a
for(i in 1:2){a <- a %*% a}
Trial 2
a <- t(matrix(c(0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0),nrow=4))
i <- diag(4)
(function(n) {if (n<=1) a else (i+a) %*% Recall(n-1)})(10)
If A is diagonalizable, you could use eigenvalue decomposition:
matrix.power <- function(A, n) { # only works for diagonalizable matrices
  e <- eigen(A)
  M <- e$vectors # matrix for changing basis
  d <- e$values # eigenvalues
  return(M %*% diag(d^n) %*% solve(M))
}
When A is not diagonalizable, the matrix M (matrix of eigenvectors) is singular. Thus, using it with A = matrix(c(0,1,0,0),2,2) would give Error in solve.default(M) : system is computationally singular.
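As a quick sanity check of the eigendecomposition approach, the sketch below compares it with repeated multiplication on a small matrix that is known to be diagonalizable. The helper name matrix_power_eigen and the test matrix are illustrative assumptions, not part of the answer above.

```r
# Sketch: A^n = M D^n M^{-1} for a diagonalizable A (illustrative helper name).
matrix_power_eigen <- function(A, n) {
  e <- eigen(A)
  M <- e$vectors                        # columns are eigenvectors
  d <- e$values                         # eigenvalues
  M %*% diag(d^n, nrow = length(d)) %*% solve(M)
}

# Upper-triangular matrix with distinct eigenvalues 2 and 3, hence diagonalizable.
A <- matrix(c(2, 0, 1, 3), 2, 2)
by_eigen  <- matrix_power_eigen(A, 3)
by_direct <- A %*% A %*% A
all.equal(by_eigen, by_direct)          # compares up to floating-point tolerance
```

The comparison uses all.equal rather than identical because the decomposition introduces small rounding errors.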
The expm package has an %^% operator:
findFn("{matrix power}")
Although Reduce is more elegant, a for-loop solution is faster and seems to be as fast as expm::%^%
m1 <- matrix(1:9, 3)
m2 <- matrix(1:9, 3)
m3 <- matrix(1:9, 3)
system.time(replicate(1000, Reduce("%*%" , list(m1,m1,m1) ) ) )
# user system elapsed
# 0.026 0.000 0.037
mlist <- list(m1,m2,m3)
m0 <- diag(1, nrow=3,ncol=3)
system.time(replicate(1000, for (i in 1:3 ) {m0 <- m0 %*% m1 } ) )
# user system elapsed
# 0.013 0.000 0.014
library(expm) # and I think this may be imported with pkg:Matrix
system.time(replicate(1000, m0%^%3))
# user system elapsed
#0.011 0.000 0.017
On the other hand the matrix.power solution is much, much slower:
system.time(replicate(1000, matrix.power(m1, 4)) )
user system elapsed
0.677 0.013 1.037
@BenBolker is correct (yet again). The for-loop's running time appears linear in the exponent, whereas the expm::%^% function appears to do even better than log(exponent).
> m0 <- diag(1, nrow=3,ncol=3)
> system.time(replicate(1000, for (i in 1:400 ) {m0 <- m0 %*% m1 } ) )
user system elapsed
0.678 0.037 0.708
> system.time(replicate(1000, m0%^%400))
user system elapsed
0.006 0.000 0.006
Indeed, the expm package does use exponentiation by squaring.
In pure R, this can be done rather efficiently like so:
"%^%" <- function(mat,power){
    base = mat
    out = diag(nrow(mat))
    while(power > 1){
        if(power %% 2 == 1){
            out = out %*% base
        }
        base = base %*% base
        power = power %/% 2
    }
    out %*% base
}
Timing this,
m0 <- diag(1, nrow=3,ncol=3)
system.time(replicate(10000, m0%^%4000))#expm's %^% function
user system elapsed
0.31 0.00 0.31
system.time(replicate(10000, m0%^%4000))# my %^% function
user system elapsed
0.28 0.00 0.28
So, as expected, they run at the same speed because they use the same algorithm. It looks like the overhead of the looping R code does not make a significant difference.
So, if you don't want to use expm, and need that performance, then you can just use this, if you don't mind looking at imperative code.
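To make the squaring idea concrete, here is a minimal sketch of the same technique with a small input check and support for power 0. The name mat_pow_sq is an assumption chosen for illustration; it is not expm's implementation.

```r
# Sketch: exponentiation by squaring for a square matrix (illustrative name).
mat_pow_sq <- function(mat, power) {
  stopifnot(nrow(mat) == ncol(mat), power >= 0, power == floor(power))
  out  <- diag(nrow(mat))               # identity, so power 0 returns I
  base <- mat
  while (power > 0) {
    if (power %% 2 == 1) out <- out %*% base   # use this binary digit of the exponent
    base  <- base %*% base                     # square for the next digit
    power <- power %/% 2
  }
  out
}

# The Fibonacci matrix makes the result easy to verify by hand:
# its 10th power is [[F11, F10], [F10, F9]] = [[89, 55], [55, 34]].
fib <- matrix(c(1, 1, 1, 0), 2, 2)
fib10 <- mat_pow_sq(fib, 10)
```

The loop runs once per binary digit of the exponent, which is why the cost grows with log(exponent) rather than with the exponent itself.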
A shorter solution with eigenvalue decomposition:
"%^%" <- function(S, power)
with(eigen(S), vectors %*% (values^power * t(vectors)))
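One caveat worth checking: this shorter form uses t(vectors) where the general formula needs the inverse of the eigenvector matrix, so it relies on the eigenvectors being orthonormal, which is what eigen() returns for a symmetric matrix. The sketch below, with the illustrative name pow_sym, shows it agreeing with direct multiplication for a symmetric input and disagreeing for a non-symmetric one.

```r
pow_sym <- function(S, power)          # illustrative name for the one-liner above
  with(eigen(S), vectors %*% (values^power * t(vectors)))

S <- matrix(c(2, 1, 1, 3), 2, 2)       # symmetric: eigenvectors are orthonormal
ok_sym <- isTRUE(all.equal(pow_sym(S, 3), S %*% S %*% S))

A <- matrix(c(2, 0, 1, 3), 2, 2)       # not symmetric: t(vectors) is not the inverse
ok_nonsym <- isTRUE(all.equal(pow_sym(A, 1), A))

c(symmetric = ok_sym, non_symmetric = ok_nonsym)
```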
Simple solution
`%^%` <- function(A, n) {
  A1 <- A
  for(i in seq_len(n-1)){
    A <- A %*% A1
  }
  A
}
I am having trouble with some code that takes a huge amount of time to run. Could someone give me some help? Thanks in advance!
for (k in 1:dim(b)[2]){
for (i in seq(1,ddimi,1)){
for (j in seq(i+1,ddim,1)){
Note that:
num [1:5, 1:30007] -0.000292 -0.001384 -0.001412 -0.002603 -0.002848
- attr(*, "dimnames")=List of 2 ..$ : NULL ..$ : chr [1:30007] "V1" "V2" "V3" "V4" ...
For-loops and growing your vector with c() are slowing you down. It's best to try to take advantage of vectorization, and to use *apply (or map) functions as much as possible. Here's something that does a little of both by iterating over the columns with sapply, creating combinations and computing the products and differences of those combinations:
mat <- sapply(b, function(x) {y <- combn(x, 2); y[1,] - y[2,] * y[1,] - y[2,]})
It should be fast – maybe not quite as fast as user10488504's very efficient solution, but still pretty fast. It also has very tight syntax, and you might also find it useful that the output is a matrix, with each column corresponding to a column from b.
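To see what the combn-based expression produces, here is a tiny worked example on a made-up two-column data frame (the data are an assumption for illustration). Note that, by operator precedence, y[1,] - y[2,] * y[1,] - y[2,] evaluates as y1 - (y2 * y1) - y2.

```r
# Toy data frame: 3 rows, so combn(x, 2) yields 3 pairs per column.
b <- data.frame(V1 = c(1, 2, 3), V2 = c(4, 5, 6))

mat <- sapply(b, function(x) {
  y <- combn(x, 2)                     # 2 x 3 matrix: one column per pair
  y[1, ] - y[2, ] * y[1, ] - y[2, ]
})

# For V1 the pairs are (1,2), (1,3), (2,3):
# 1 - 2*1 - 2 = -3, 1 - 3*1 - 3 = -5, 2 - 3*2 - 3 = -7
mat
```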
b <-*30007, -.001, -.0003), byrow = T, nrow = 5))
b <- matrix(rnorm(5*30007), nrow=5)
#With for-loop
for (k in 1:dim(b)[2]){
for (i in seq(1,ddimi,1)){
for (j in seq(i+1,ddim,1)){
# user system elapsed
# 104.568 3.636 108.206
#Vectorized with matrix indices
K <- 1:dim(b)[2] #for (k in 1:dim(b)[2]){... creates this vector
I <- seq(1,ddimi,1) #for (i in seq(1,ddimi,1)){... creates this vector
J <- unlist(lapply(I+1, function(x) seq(x,ddim,1))) #for (j in seq(i+1,ddim,1)){... creates this vector
IK <- as.matrix(expand.grid(I, K)) #Get all combinations between I and K as you will have with the nested for loops of k and i
IK <- IK[rep(seq_len(nrow(IK)), rep((ddim-1):1,length.out=nrow(IK))),] #IK-rows need to be repeated, as it is used repeatedly in the "for (j in seq(i+1,ddim,1)){" loop
JK <- as.matrix(expand.grid(j=J, k=K)) #Get all combinations between J and K as you will have with the nested for loops of k and j
#Now you have all the indexes of your for loop as vectors and can make the calculations
tt <- b[IK] - b[JK] * b[IK] - b[JK]
# user system elapsed
# 0.056 0.000 0.097
identical(all_dist, tt)
#[1] TRUE
As k is used only as a column index, without interacting with the other loops, you can partly vectorize by simply dropping the k loop and the k index.
tt <- NULL
for (i in seq(1,ddimi,1)){
  for (j in seq(i+1,ddim,1)){
    tt=c(tt, (b[i,])-(b[j,])*(b[i,])-(b[j,]))
  }
}
dim(tt) <- c(30007, 10)
tt <- as.vector(t(tt))
# user system elapsed
# 0.017 0.000 0.017
identical(all_dist, tt)
#[1] TRUE
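The idea of dropping the k loop can be checked end to end on a small matrix. The sketch below is self-contained and rests on assumptions: ddim is taken to be the number of rows, ddimi is ddim - 1, and the triple loop is a plausible reading of the question's loop body rather than a copy of it.

```r
set.seed(1)
b <- matrix(rnorm(5 * 4), nrow = 5)
ddim  <- nrow(b)                        # assumed meaning of the question's variable
ddimi <- ddim - 1                       # assumed meaning of the question's variable

# Reference: element-by-element triple loop (assumed form of the original).
all_dist <- NULL
for (k in seq_len(ncol(b))) {
  for (i in seq(1, ddimi, 1)) {
    for (j in seq(i + 1, ddim, 1)) {
      all_dist <- c(all_dist, b[i, k] - b[j, k] * b[i, k] - b[j, k])
    }
  }
}

# Vectorized: every (i, j) row pair is processed for all columns at once.
pairs <- combn(ddim, 2)                 # columns are the (i, j) pairs with i < j
bi <- b[pairs[1, ], , drop = FALSE]
bj <- b[pairs[2, ], , drop = FALSE]
tt <- as.vector(bi - bj * bi - bj)      # column-major: all pairs for k = 1, then k = 2, ...

all.equal(all_dist, tt)
```

Using combn to enumerate the row pairs avoids growing a vector with c() inside a loop, which is the main cost in the original.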
Or you can replace the inner two for loops with index vectors and make an apply loop instead of the k-for loop:
I <- seq(1,ddimi,1)
J <- unlist(lapply(I+1, function(x) seq(x,ddim,1)))
I <- I[rep(seq_along(I), rep((ddim-1):1,length.out=length(I)))]
tt <- as.vector(apply(b, 2, function(x) {x[I] - x[J] * x[I] - x[J]}))
# user system elapsed
# 0.085 0.000 0.085
identical(all_dist, tt)
#[1] TRUE
Used time of the nice solution from gersht:
mat <- as.vector(sapply(as.data.frame(b), function(x) {y <- combn(x, 2); y[1,] - y[2,] * y[1,] - y[2,]}))
# user system elapsed
# 1.083 0.000 1.082
identical(all_dist, mat)
#[1] TRUE
Long time lurker, first time asker.
I'm trying to calculate 'items in common between 2 sets of items' for a 20M+ items dataset. Sample data looks like this.
#serially numbered items
parents <- rep(1:10000)
#generate rnorm # of children items
numchild <- round(rnorm(10000, mean=30, sd=10))
#fill the parent-child list
parent_child <- list()
for (x in 1:length(parents)){
  if (numchild[x]>0){
    f1 <- sample(1:length(parents), size=numchild[x])
    f2 <- list(parents[f1])
    parent_child <- c(parent_child, f2)
  } else {
    parent_child <- c(parent_child, list(x+1)) #if numchild=0, make up something
  }
}
Here is what I want to do: say parent item #1 has 5 children items-- 1,2,3,4,5 and parent item #2 has 3 children item-- 4,10,22.
I want to compute the length(intersection) of every (parent_i, parent_j) combination. In the above case, it would be 1 common item-- 4.
I am doing this for 10M+ parent items that on average have 15-20 children items with a (0,100) range. So that's a 10M x 10M item-item matrix.
I have a foreach loop that I am testing out on a smaller subset that works but doesn't quite scale for the full dataset (64 core machine with 256GB RAM). With the loop below I am already computing only half of the user-user matrix--> (parent_i, parent_j) same as (parent_j, parent_i) for this purpose.
#small subset
a <- parent_child[1:1000]
outerresults <- foreach (i = 1:(length(a)), .combine=rbind, .packages=c('foreach','doParallel')) %dopar% {
  b <- a[[i]]
  rest <- a[i+1:length(a)]
  foreach (j = 1:(length(rest)), .combine=rbind) %dopar% {
    common <- length(intersect(b, rest[[j]]))
    if (common > 0) {g <- data.frame(u1=i, u2=j+1, common)}
  }
}
I've been experimenting with variations on this (using Reduce, storing parent-children in a data frame, etc.) but haven't had much luck.
Is there a way to make this scale?
I reversed the split, so that we have a child-parent relationship
len <- sapply(parent_child, length)
child_parent <- split(rep(seq_along(parent_child), len),
unlist(parent_child, use.names=FALSE))
Something like the following constructs a string with pairs of parents sharing a child, across all children
keep <- sapply(child_parent, length) > 1
int <- lapply(child_parent[keep], function(x) {
    x <- combn(sort(x), 2)
    paste(x[1,], x[2,], sep=".")
})
and tallying
table(unlist(int, use.names=FALSE))
or a little more quickly
xx <- unlist(int, use.names=FALSE)
nms <- unique(xx)
cnt <- match(xx, nms)
setNames(tabulate(cnt, length(nms)), nms)
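The steps above can be traced on a tiny example. The three-parent list below is made up for illustration (the first two parents follow the example in the question); the result is checked against a direct intersect().

```r
# Parent 1 has children 1..5, parent 2 has 4, 10, 22, parent 3 has 4, 5.
parent_child <- list(c(1, 2, 3, 4, 5), c(4, 10, 22), c(4, 5))

# Reverse the relationship: for each child, which parents contain it?
len <- lengths(parent_child)
child_parent <- split(rep(seq_along(parent_child), len),
                      unlist(parent_child, use.names = FALSE))

# Every child shared by two or more parents contributes one count per parent pair.
keep <- lengths(child_parent) > 1
int <- lapply(child_parent[keep], function(x) {
  x <- combn(sort(x), 2)
  paste(x[1, ], x[2, ], sep = ".")
})
counts <- table(unlist(int, use.names = FALSE))
counts
```

Here child 4 is shared by parents 1, 2 and 3 and child 5 by parents 1 and 3, so the pair "1.3" is counted twice and the pairs "1.2" and "2.3" once each.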
f1 <- function(parent_child) {
    len <- sapply(parent_child, length)
    child_parent <- split(rep(seq_along(parent_child), len),
                          unlist(parent_child, use.names=FALSE))
    keep <- sapply(child_parent, length) > 1
    int <- lapply(child_parent[keep], function(x) {
        x <- combn(sort(x), 2)
        paste(x[1,], x[2,], sep=".")
    })
    xx <- unlist(int, use.names=FALSE)
    nms <- unique(xx)
    cnt <- match(xx, nms)
    setNames(tabulate(cnt, length(nms)), nms)
}
with (this is for all 10000 parent-child elements)
> system.time(ans1 <- f1(parent_child))
user system elapsed
14.625 0.012 14.668
> head(ans1)
542.1611 542.1832 542.2135 542.2435 542.2527 542.2806
1 1 1 1 1 1
I'm not sure that this would really scale to the size of problem you're talking about, though -- it's polynomial in the number of parents per child.
One possibility for speed-up is to 'memoize' the combinatorial calculation, using the length of the argument as a 'key' and storing the combination as 'value'. This reduces the number of times combn is called to the number of unique lengths of elements of child_parent.
combn1 <- local({
    memo <- new.env(parent=emptyenv())
    function(x) {
        key <- as.character(length(x))
        if (!exists(key, memo))
            memo[[key]] <- t(combn(length(x), 2))
        paste(x[memo[[key]][,1]], x[memo[[key]][,2]], sep=".")
    }
})
f2 <- function(parent_child) {
    len <- sapply(parent_child, length)
    child_parent <- split(rep(seq_along(parent_child), len),
                          unlist(parent_child, use.names=FALSE))
    keep <- sapply(child_parent, length) > 1
    int <- lapply(child_parent[keep], combn1)
    xx <- unlist(int, use.names=FALSE)
    nms <- unique(xx)
    cnt <- match(xx, nms)
    setNames(tabulate(cnt, length(nms)), nms)
}
which helps somewhat
> system.time(ans2 <- f2(parent_child))
user system elapsed
5.337 0.000 5.347
> identical(ans1, ans2)
[1] TRUE
The slow part is now paste
> Rprof(); ans2 <- f2(parent_child); Rprof(NULL); summaryRprof()
self.time self.pct total.time total.pct
"paste" 3.92 73.41 3.92 73.41
"match" 0.74 13.86 0.74 13.86
"unique.default" 0.40 7.49 0.40 7.49
"as.character" 0.08 1.50 0.08 1.50
"unlist" 0.08 1.50 0.08 1.50
"combn" 0.06 1.12 0.06 1.12
"lapply" 0.02 0.37 4.00 74.91
"any" 0.02 0.37 0.02 0.37
"setNames" 0.02 0.37 0.02 0.37
We can avoid this by encoding the parents with shared child id into a single integer; because of the way floating point numbers are represented in R, this will be exact until about 2^21
encode <- function(x, y, n)
(x - 1) * (n + 1) + y
decode <- function(z, n)
list(x=ceiling(z / (n + 1)), y = z %% (n + 1))
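A quick round trip shows that the encoding is reversible as long as both ids lie between 1 and n. The ids below are arbitrary values chosen for illustration.

```r
encode <- function(x, y, n)
    (x - 1) * (n + 1) + y
decode <- function(z, n)
    list(x = ceiling(z / (n + 1)), y = z %% (n + 1))

n <- 10000                              # number of parents
x <- c(1, 542, 9999)                    # first parent of each pair
y <- c(2, 1611, 10000)                  # second parent of each pair

z <- encode(x, y, n)                    # one number per pair
d <- decode(z, n)                       # recovers the original ids
str(d)
```

Because 1 <= y <= n, the remainder modulo n + 1 recovers y and the ceiling of the quotient recovers x.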
and adjusting our combn1 and f2 functions as
combn2 <- local({
    memo <- new.env(parent=emptyenv())
    function(x, encode_n) {
        key <- as.character(length(x))
        if (!exists(key, memo))
            memo[[key]] <- t(combn(length(x), 2))
        encode(x[memo[[key]][,1]], x[memo[[key]][,2]], encode_n)
    }
})
f3 <- function(parent_child) {
    encode_n <- length(parent_child)
    len <- sapply(parent_child, length)
    child_parent <-
        unname(split(rep(seq_along(parent_child), len),
                     unlist(parent_child, use.names=FALSE)))
    keep <- sapply(child_parent, length) > 1
    int <- lapply(child_parent[keep], combn2, encode_n)
    id <- unlist(int, use.names=FALSE)
    uid <- unique(id)
    n <- tabulate(match(id, uid), length(uid))
    do.call(data.frame, c(decode(uid, encode_n), list(n=n)))
}
leading to
> system.time(f3(parent_child))
user system elapsed
2.140 0.000 2.146
This compares very favorably (note that the timing in the previous line is for 10,000 parent-child relations) with jlhoward's revised answer
> system.time(result.3 <- do.call("rbind",lapply(1:99,gg)))
user system elapsed
2.465 0.000 2.468
> system.time(f3(parent_child[1:99]))
user system elapsed
0.016 0.000 0.014
and scales in a much more reasonable way.
For what it's worth, the data generation routine is in the second circle of Patrick Burns' R Inferno, using the 'copy-and-append' algorithm rather than pre-allocating the space and filling it in. Avoid this by writing the for loop body as a function and using lapply. Avoid the need for the complicated conditional in the for loop by fixing the issue beforehand
numchild <- round(rnorm(10000, mean=30, sd=10))
numchild[numchild < 0] <- sample(numchild[numchild > 0], sum(numchild < 0))
or by sampling from a distribution (rpois, rbinom) that generates positive integer values. Data generation is then
n_parents <- 10000
numchild <- round(rnorm(n_parents, mean=30, sd=10))
numchild[numchild < 0] <- sample(numchild[numchild > 0], sum(numchild < 0))
parent_child <- lapply(numchild, sample, x=n_parents)
Here is another approach that is about 10X faster than my previous answer, and 17X faster than the original code (also simpler):
ff <- function(u2, u1, a) {
  common <- length(intersect(a,parent_child[[u2]]))
  if (common>0) {return(data.frame(u1,u2,common))}
}
gg <- function(u1) {
  a <- parent_child[[u1]]
  do.call("rbind",lapply((u1+1):100,ff,u1,a))
}
system.time(result.3 <- do.call("rbind",lapply(1:99,gg)))
user system elapsed
1.04 0.00 1.03
result.3 is identical to result.2 from previous answer:
[1] 0
Well, a small improvement (I think):
Original code (wrapped in function call):
f = function(n) {
  #small subset
  a <- parent_child[1:n]
  outerresults <- foreach (i = 1:(length(a)),
                           .packages=c('foreach','doParallel')) %dopar% {
    b <- a[[i]]
    rest <- a[i+1:length(a)]
    foreach (j = 1:(length(rest)), .combine=rbind) %dopar% {
      common <- length(intersect(b, rest[[j]]))
      if (common > 0) {g <- data.frame(u1=i, u2=j+1, common)}
    }
  }
}
Modified code:
g <- function(n) {
  a <- parent_child[1:n]
  outerresults <- foreach (i = 1:n,
                           .packages=c('foreach','doParallel')) %dopar% {
    b <- a[[i]]
    foreach (j = (i):n, .combine=rbind) %dopar% {
      if (i!=j) {
        c <- a[[j]]
        common <- length(intersect(b, c))
        if (common > 0) {g <- data.frame(u1=i, u2=j, common)}
      }
    }
  }
}
user system elapsed
17.21 0.00 17.33
user system elapsed
10.42 0.00 10.47
The numbering for u2 is a little different because of the different approaches, but both produce the same vector of matches:
[1] 0
I tried this with data table joins replacing intersect(...) and it was actually much slower(!!)
I have 3 vectors v, w and a. For each element of v, I want to compute the sum over i of indicator(v > w_i) * a_i. Is there any faster way than the following code?
v = rnorm(1600)
w = runif(500)
a = rnorm(500)
m = v > rep(w, each = length(v))
dim(m)=c(length(v), length(w))
system.time({
  m = v > rep(w, each = length(v))
  dim(m)=c(length(v), length(w))
  rowSums(m %*% diag(a))
})
user system elapsed
0.03 0.00 0.04
Even a non-vectorized solution will be faster than setting up a giant sparse matrix like you have done with diag.
# user system elapsed
# 0.032 0.000 0.031
m = v > rep(w, each = length(v))
dim(m)=c(length(v), length(w))
res<-rowSums(m %*% diag(a))
# user system elapsed
# 0.364 0.000 0.362
But, if you wanted to get fancy, you could do something like this:
# user system elapsed
# 0 0 0
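For readers who want something runnable, here is one possible sketch of a faster route. It is not the answerer's code: it swaps in a sort-and-cumulative-sum technique, where w is sorted once, running sums of a are taken in that order, and findInterval() looks up how many w values lie strictly below each v. The variable names ref, o, csum and fast are assumptions for illustration.

```r
set.seed(42)
v <- rnorm(1600)
w <- runif(500)
a <- rnorm(500)

# Reference: indicator matrix times a (avoids building diag(a)).
ref <- drop(outer(v, w, ">") %*% a)

# Sort w once and accumulate a in the same order.
o    <- order(w)
csum <- c(0, cumsum(a[o]))              # csum[k + 1] = sum of a over the k smallest w

# left.open = TRUE counts the w values strictly less than each v.
cnt  <- findInterval(v, w[o], left.open = TRUE)
fast <- csum[cnt + 1]

all.equal(ref, fast)
```

The lookup costs roughly length(v) * log(length(w)) comparisons instead of a full length(v) by length(w) matrix, at the price of a slightly different summation order, hence all.equal rather than identical.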
I need to calculate the mean of each off-diagonal element in an n × n matrix. The lower and upper triangles are redundant. Here's the code I'm currently using:
A <- replicate(500, rnorm(500))
sapply(1:(nrow(A)-1), function(x) mean(A[row(A) == (col(A) - x)]))
Which seems to work but does not scale well with larger matrices. The ones I have aren't huge, around 2-5000^2, but even with 1000^2 it's taking longer than I'd like:
A <- replicate(1000, rnorm(1000))
system.time(sapply(1:(nrow(A)-1), function(x) mean(A[row(A) == (col(A) - x)])))
> user system elapsed
> 26.662 4.846 31.494
Is there a smarter way of doing this?
Edit: To clarify, I'd like the mean of each diagonal independently, e.g. for:
1 2 3 4
1 2 3 4
1 2 3 4
1 2 3 4
I would like:
You can get significantly faster just by extracting the diagonals directly using linear addressing. superdiag here extracts the ith superdiagonal from A (i=1 is the principal diagonal):
superdiag <- function(A,i) {
  n <- nrow(A)
  len <- n-i+1
  r <- 1:len                 # row indices
  c <- i:n                   # column indices
  A[(c-1)*n+r]               # linear (column-major) addressing
}
superdiagmeans <- function(A) {
  sapply(2:nrow(A), function(i){mean(superdiag(A,i))})
}
Running this on a 1K square matrix gives a ~800x speedup:
> A <- replicate(1000, rnorm(1000))
> system.time(sapply(1:(nrow(A)-1), function(x) mean(A[row(A) == (col(A) - x)])))
user system elapsed
26.464 3.345 29.793
> system.time(superdiagmeans(A))
user system elapsed
0.033 0.006 0.039
This gives you results in the same order as the original.
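The linear-addressing idea can be verified on the 4 x 4 example from the question. The sketch below is a self-contained variant with an illustrative helper name, offdiag_means, indexed by the offset k from the main diagonal; it is checked against the original row()/col() approach.

```r
# Sketch: mean of each superdiagonal via column-major linear indices.
offdiag_means <- function(A) {          # illustrative helper name
  n <- nrow(A)
  sapply(seq_len(n - 1), function(k) {
    r  <- seq_len(n - k)                # rows 1 .. n-k
    cc <- (k + 1):n                     # matching columns k+1 .. n
    mean(A[(cc - 1) * n + r])           # element (r, cc) sits at index (cc-1)*n + r
  })
}

# Every row is 1 2 3 4, as in the question's example.
A <- matrix(rep(1:4, each = 4), 4)

fast <- offdiag_means(A)
slow <- sapply(1:(nrow(A) - 1), function(x) mean(A[row(A) == (col(A) - x)]))
all.equal(fast, slow)
```

For this matrix the first superdiagonal holds 2, 3, 4, the second 3, 4 and the third 4, so the means are 3, 3.5 and 4.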
You can use the following function :
diagmean <- function(x){
  id <- row(x) - col(x)
  sol <- tapply(x,id,mean)
  return(sol)
}
If we check this on your matrix, the speed gain is substantial:
> system.time(diagmean(A))
user system elapsed
2.58 0.00 2.58
> system.time(sapply(1:(nrow(A)-1), function(x) mean(A[row(A) == (col(A) - x)])))
user system elapsed
38.93 4.01 42.98
Note that this function calculates both upper and lower triangles. You can calculate e.g. only the upper triangle (row < col) using:
diagmean <- function(A){
  id <- row(A) - col(A)
  id[id>=0] <- NA
  tapply(A,id,mean)
}
This results in another speed gain. Note that the solution will be reversed compared to yours :
> A <- matrix(rep(c(1,2,3,4),4),ncol=4)
> sapply(1:(nrow(A)-1), function(x) mean(A[row(A) == (col(A) - x)]))
[1] 2.0 1.5 1.0
> diagmean(A)
-3 -2 -1
1.0 1.5 2.0
Suppose I have a large matrix:
M <- matrix(rnorm(1e7),nrow=20)
Further suppose that each column represents a sample. Say I would like to apply t.test() to each column, is there a way to do this that is much faster than using apply()?
apply(M, 2, t.test)
It took slightly less than 2 minutes to run the analysis on my computer:
> system.time(invisible( apply(M, 2, t.test)))
user system elapsed
113.513 0.663 113.519
You can do better than this with the colttests function from the genefilter package (on Bioconductor).
> library(genefilter)
> M <- matrix(rnorm(40),nrow=20)
> my.t.test <- function(c){
+ n <- sqrt(length(c))
+ mean(c)*n/sd(c)
+ }
> x1 <- apply(M, 2, function(c) my.t.test(c))
> x2 <- colttests(M, gl(1, nrow(M)))[,"statistic"]
> all.equal(x1, x2)
[1] TRUE
> M <- matrix(rnorm(1e7), nrow=20)
> system.time(invisible(apply(M, 2, function(c) my.t.test(c))))
user system elapsed
27.386 0.004 27.445
> system.time(invisible(colttests(M, gl(1, nrow(M)))[,"statistic"]))
user system elapsed
0.412 0.000 0.414
Ref: "Computing thousands of test statistics simultaneously in R", SCGN, Vol 18 (1), 2007.
If you have a multicore machine there are some gains from using all the cores, for example using mclapply.
> library(multicore)
> M <- matrix(rnorm(40),nrow=20)
> x1 <- apply(M, 2, t.test)
> x2 <- mclapply(1:dim(M)[2], function(i) t.test(M[,i]))
> all.equal(x1, x2)
[1] "Component 1: Component 9: 1 string mismatch" "Component 2: Component 9: 1 string mismatch"
# str(x1) and str(x2) show that the difference is immaterial
This mini-example shows that things go as we planned. Now scale up:
> M <- matrix(rnorm(1e7), nrow=20)
> system.time(invisible(apply(M, 2, t.test)))
user system elapsed
101.346 0.626 101.859
> system.time(invisible(mclapply(1:dim(M)[2], function(i) t.test(M[,i]))))
user system elapsed
55.049 2.527 43.668
This is using 8 virtual cores. Your mileage may vary. Not a huge gain, but it comes from very little effort.
If you only care about the t-statistic itself, extracting the corresponding field ($statistic) makes things a bit faster, in particular in the multicore case:
> system.time(invisible(apply(M, 2, function(c) t.test(c)$statistic)))
user system elapsed
80.920 0.437 82.109
> system.time(invisible(mclapply(1:dim(M)[2], function(i) t.test(M[,i])$statistic)))
user system elapsed
21.246 1.367 24.107
Or even faster, compute the t value directly
my.t.test <- function(c){
  n <- sqrt(length(c))
  mean(c)*n/sd(c)
}
> system.time(invisible(apply(M, 2, function(c) my.t.test(c))))
user system elapsed
21.371 0.247 21.532
> system.time(invisible(mclapply(1:dim(M)[2], function(i) my.t.test(M[,i]))))
user system elapsed
144.161 8.658 6.313
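If only the one-sample t-statistics are needed, the per-column function calls can be avoided altogether. The following is a minimal base-R sketch, assuming a test against mean 0 as in my.t.test; it computes all statistics with colMeans and colSums and checks them against t.test on a smaller matrix. The variable names are illustrative.

```r
set.seed(1)
M <- matrix(rnorm(20 * 1000), nrow = 20)   # 1000 samples of size 20

n  <- nrow(M)
mu <- colMeans(M)                                   # column means
s  <- sqrt(colSums(sweep(M, 2, mu)^2) / (n - 1))    # column standard deviations
tstat <- mu * sqrt(n) / s                           # one-sample t-statistic, H0: mean = 0

# Reference values from t.test, one column at a time.
ref <- apply(M, 2, function(x) unname(t.test(x)$statistic))
all.equal(tstat, ref, check.attributes = FALSE)
```

This trades the convenience of the full t.test output (p-values, confidence intervals) for whole-matrix arithmetic.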