More general or efficient approach for this matrix multiplication? - r

In R, is there a more efficient and/or general way to produce the desired output from the two matrices below? I'm suspicious that what I've done is just some esoteric matrix multiplication operation of which I'm not aware.
ff <- matrix(1:6,ncol=2)
# [,1] [,2]
# [1,] 1 4
# [2,] 2 5
# [3,] 3 6
bb <- matrix(7:10,ncol=2)
# [,1] [,2]
# [1,] 7 9
# [2,] 8 10
# DESIRE:
# 7 36
# 14 45
# 21 54
# 8 40
# 16 50
# 24 60
This works, but isn't the general solution I'm looking for:
rr1 <- t(t(ff) * bb[1,])
rr2 <- t(t(ff) * bb[2,])
rbind(rr1,rr2)
# [,1] [,2]
# [1,] 7 36
# [2,] 14 45
# [3,] 21 54
# [4,] 8 40
# [5,] 16 50
# [6,] 24 60
This next code block seems pretty efficient and is general. But is there a better way?
Something like kronecker(ffa,bba)? (which clearly doesn't work in this case)
ffa <- matrix(rep(t(ff),2), ncol=2, byrow=T)
bba <- matrix(rep(bb,each=3), ncol=2)
ffa * bba
# [,1] [,2]
# [1,] 7 36
# [2,] 14 45
# [3,] 21 54
# [4,] 8 40
# [5,] 16 50
# [6,] 24 60
This is related to my other questions:
Using apply function over the row margin with expectation of stacked results, where I'm trying to understand the behavior of apply itself and:
Is this an example of some more general matrix product?, where I'm asking about the theoretical math, specifically.

Use a kronecker product and pick off the appropriate columns:
kronecker(bb, ff)[, c(diag(ncol(bb))) == 1]
or using the infix operator for kronecker:
(bb %x% ff)[, c(diag(ncol(bb))) == 1]
Another approach is to convert the arguments to data frames and mapply kronecker across them. For the case in the question this performs the calculation cbind(bb[, 1] %x% ff[, 1], bb[, 2] %x% ff[, 2]) but in a more general manner without resorting to indices:
mapply(kronecker, as.data.frame(bb), as.data.frame(ff))
or using the infix operator for kronecker:
mapply(`%x%`, as.data.frame(bb), as.data.frame(ff))
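As a quick check of the two approaches above, the following sketch rebuilds ff and bb from the question and compares both results against the rbind() construction. The names full, keep, res_kron, res_map and expected are introduced here for illustration only.

```r
ff <- matrix(1:6, ncol = 2)
bb <- matrix(7:10, ncol = 2)

# Full Kronecker product: 6 x 4, one column per (bb column, ff column) pair
full <- kronecker(bb, ff)

# Keep only the pairs where the bb column index equals the ff column index
keep <- c(diag(ncol(bb))) == 1   # TRUE FALSE FALSE TRUE
res_kron <- full[, keep]

# Column-by-column Kronecker product via mapply
res_map <- mapply(kronecker, as.data.frame(bb), as.data.frame(ff))

# Reference result built the manual way from the question
expected <- rbind(t(t(ff) * bb[1, ]), t(t(ff) * bb[2, ]))

all.equal(as.vector(res_kron), as.vector(expected))
all.equal(as.vector(res_map), as.vector(expected))
```

Note that the full Kronecker product has ncol(bb) * ncol(ff) columns, most of which are discarded, so the mapply version is likely the better fit when the matrices have many columns.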

The functionality you are looking for is available in the Matrix package as the function KhatriRao. Since the function is in Matrix, the output is a sparse matrix of class "dgCMatrix". You can convert it to an ordinary matrix of class "matrix" with as.matrix.
library(Matrix)
as.matrix(KhatriRao(bb, ff))
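For readers without the Matrix package at hand, the column-wise Kronecker (Khatri-Rao) product can be sketched in base R. The helper name khatri_rao_base is hypothetical and not part of any package; it only illustrates what the product computes for dense matrices.

```r
# Hypothetical helper: column j of the result is kronecker(X[, j], Y[, j])
khatri_rao_base <- function(X, Y) {
  stopifnot(ncol(X) == ncol(Y))
  cols <- lapply(seq_len(ncol(X)), function(j) {
    as.vector(kronecker(X[, j], Y[, j]))
  })
  do.call(cbind, cols)
}

ff <- matrix(1:6, ncol = 2)
bb <- matrix(7:10, ncol = 2)
kr <- khatri_rao_base(bb, ff)
kr
```

With the matrices from the question, kr should match the desired 6 x 2 output.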

Related

Multiplication of matrix in R

I have the following problem :
w <- matrix(1:3,nrow=3,ncol=1)
mymat <- as.matrix(cbind(a = 6:15, b = 16:25, c= 26:35))
mymat
a b c
[1,] 6 16 26
[2,] 7 17 27
[3,] 8 18 28
[4,] 9 19 29
[5,] 10 20 30
[6,] 11 21 31
[7,] 12 22 32
[8,] 13 23 33
[9,] 14 24 34
[10,] 15 25 35
I want to obtain the following results in a matrix the same size as mymat:
a b c
[1,] 6*1 16*2 26*3
[2,] 7*1 17*2 27*3
[3,] 8*1 18*2 28*3
...
I've tried the lapply function but I am unable to get the results I want. Thanks!
Using sweep():
sweep(mymat, 2, w, "*")
Converting w into a matrix of the same dimensions:
mymat * t(w)[rep(1, NROW(mymat)), ]
1) diag Post-multiply it by the appropriate diagonal matrix. If w is a vector rather than a matrix, c() can be omitted, although it won't hurt.
mymat %*% diag(c(w))
2) KhatriRao We could alternately use the KhatriRao product. If w is the w defined in the question then matrix could be optionally omitted but we included it in case w is actually a vector. Note that the Matrix package comes with R so it does not have to be installed.
library(Matrix)
KhatriRao(mymat, t(matrix(w)))
3) mapply
mapply(`*`, as.data.frame(mymat), w)
We can also use col to replicate the values and then multiply in base R
mymat * w[col(mymat)]
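A small sketch, using the data from the question, to confirm that the base R approaches above agree. The by_* names are made up for this comparison.

```r
w <- matrix(1:3, nrow = 3, ncol = 1)
mymat <- as.matrix(cbind(a = 6:15, b = 16:25, c = 26:35))

by_sweep <- sweep(mymat, 2, w, "*")               # scale column j by w[j]
by_diag  <- mymat %*% diag(c(w))                  # post-multiply by diag(w)
by_col   <- mymat * w[col(mymat)]                 # look up w by column index
by_rep   <- mymat * t(w)[rep(1, NROW(mymat)), ]   # expand w to mymat's shape

# Attributes differ (%*% drops the column names), so compare values only
all.equal(by_sweep, by_diag, check.attributes = FALSE)
all.equal(by_sweep, by_col,  check.attributes = FALSE)
all.equal(by_sweep, by_rep,  check.attributes = FALSE)
```

The first row of each result is 6, 32, 78, that is 6*1, 16*2, 26*3.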

how to avoid for loop using apply function in this question

mat<-matrix(1:9,nrow=3,ncol=3)
for(i in 1:3){
print(colSums(mat[1:i,]))
}
I'm trying to calculate mean of colSums of part of a matrix.
How do I avoid for loop in this case? The answer may be similar to the code below but I don't know how to proceed.
apply(mat,2,function(x) colSums(mat[]))
Thanks in advance!
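The simplest way is to use cumsum() to get the running column sums and rowMeans() to get their means. The output below is for a 4 x 4 matrix, mat <- matrix(1:16, nrow = 4), keeping rows 2 to 4: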
apply(mat, 2, cumsum)[2:4, ]
# [,1] [,2] [,3] [,4]
# [1,] 3 11 19 27
# [2,] 6 18 30 42
# [3,] 10 26 42 58
rowMeans(apply(mat, 2, cumsum)[2:4, ])
# [1] 15 24 34
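For the 3 x 3 matrix in the question, the same idea can be checked against a loop-style reference. This is only a sketch: the names loop_sums and cum_sums are made up, and drop = FALSE is added so that a one-row selection stays a matrix and colSums() also works for i = 1.

```r
mat <- matrix(1:9, nrow = 3, ncol = 3)

# Loop-style reference: column sums of rows 1..i, one row of output per i
loop_sums <- t(sapply(1:3, function(i) colSums(mat[1:i, , drop = FALSE])))

# Vectorised version: row i holds the column sums of rows 1..i
cum_sums <- apply(mat, 2, cumsum)

all.equal(loop_sums, cum_sums)   # TRUE
rowMeans(cum_sums)               # 4 9 15
```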

do.call in r with matrix and lists

In R, to generate a new matrix (N*6) from an existing one (N*3), is there a better way than the code below, one that avoids having to "unpack/unlist" the inner lists created in the apply function in order to "expand" the source matrix?
transformed <- matrix(byrow=T)
transformed <- as.matrix(
do.call("rbind", as.list(
apply(dataset, 1, function(x) {
x <- list(x[1], x[2], x[3], x[2]*x[3], x[2]^2, x[3]^2)
})
))
)
#Unpack all inner lists from the expanded matrix
ret_trans <- as.matrix( apply(transformed, 2, function(x) unlist(x)) )
EDIT: I add an example of that
dataset
[,1] [,2] [,3]
[1,] 1 6 11
[2,] 2 7 12
[3,] 3 8 13
[4,] 4 9 14
[5,] 5 10 15
and on applying the code above I want to expand to N*6, 5*6 (sorry, I misspelled the column dimension up there, and the margin of apply function) it should be like that
transformed
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 6 11 66 36 121
[2,] 2 7 12 84 49 144
[3,] 3 8 13 104 64 169
[4,] 4 9 14 126 81 196
[5,] 5 10 15 150 100 225
The question is if there is another way of doing that without having to use the last apply function, without having to coerce the x to be a list
thanks all for your replies
As suggested in the comments, do:
cbind(dataset, dataset[,2] * dataset[,3], dataset[,c(2, 3)]^2)
It will be a lot faster than using apply, which should have looked like this (apply returns one column per row of the input, hence the t()):
transformed <- function(x) c(x[1], x[2], x[3], x[2]*x[3], x[2]^2, x[3]^2)
t(apply(dataset, 1, transformed))
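A short sketch comparing the two routes on the example data from the question. The names by_cbind, expand_row and by_apply are introduced here for illustration. Because apply() over rows returns one column per input row, its result is transposed before comparing.

```r
dataset <- matrix(1:15, nrow = 5)

# Vectorised: bind the original columns, the product, and the squares
by_cbind <- cbind(dataset, dataset[, 2] * dataset[, 3], dataset[, c(2, 3)]^2)

# Row-wise: return a plain numeric vector per row, not a list
expand_row <- function(x) c(x[1], x[2], x[3], x[2] * x[3], x[2]^2, x[3]^2)
by_apply <- t(apply(dataset, 1, expand_row))

dim(by_cbind)                                             # 5 6
all.equal(by_cbind, by_apply, check.attributes = FALSE)
```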

Creating a symmetric matrix in R

I have a matrix in R that is supposed to be symmetric, however, due to machine precision the matrix is never symmetric (the values differ by around 10^-16). Since I know the matrix is symmetric I have been doing this so far to get around the problem:
s.diag = diag(s)
s[lower.tri(s,diag=T)] = 0
s = s + t(s) + diag(s.diag,S)
Is there a better one line command for this?
s<-matrix(1:25,5)
s[lower.tri(s)] = t(s)[lower.tri(s)]
You can force the matrix to be symmetric using the forceSymmetric function in the Matrix package:
library(Matrix)
x<-Matrix(rnorm(9), 3)
> x
3 x 3 Matrix of class "dgeMatrix"
[,1] [,2] [,3]
[1,] -1.3484514 -0.4460452 -0.2828216
[2,] 0.7076883 -1.0411563 0.4324291
[3,] -0.4108909 -0.3292247 -0.3076071
A <- forceSymmetric(x)
> A
3 x 3 Matrix of class "dsyMatrix"
[,1] [,2] [,3]
[1,] -1.3484514 -0.4460452 -0.2828216
[2,] -0.4460452 -1.0411563 0.4324291
[3,] -0.2828216 0.4324291 -0.3076071
Is the workaround really necessary if the values only differ by that much?
Someone pointed out that my previous answer was wrong. I like some of the other ones better, but since I can't delete this one (accepted by a user who left), here's yet another solution using symMatrix from the micEcon package (the function is now provided by miscTools):
symMatrix(s[upper.tri(s, TRUE)], nrow=nrow(s), byrow=TRUE)
s<-matrix(1:25,5)
pmean <- function(x,y) (x+y)/2
s[] <- pmean(s, matrix(s, nrow(s), byrow=TRUE))
s
#-------
[,1] [,2] [,3] [,4] [,5]
[1,] 1 4 7 10 13
[2,] 4 7 10 13 16
[3,] 7 10 13 16 19
[4,] 10 13 16 19 22
[5,] 13 16 19 22 25
I was curious to compare all the methods, so ran a quick microbenchmark. Clearly, the simplest 0.5 * (S + t(S)) is the fastest.
The specific function Matrix::forceSymmetric() is sometimes slightly faster, but it returns an object of a different class (dsyMatrix instead of matrix), and converting back to matrix takes a lot of time (although one might argue that it is a good idea to keep the output as dsyMatrix for further gains in computation).
S <-matrix(1:50^2,50)
pick_lower <- function(M) M[lower.tri(M)] = t(M)[lower.tri(M)]
microbenchmark::microbenchmark(micEcon=miscTools::symMatrix(S[upper.tri(S, TRUE)], nrow=nrow(S), byrow=TRUE),
Matri_raw =Matrix::forceSymmetric(S),
Matri_conv =as.matrix(Matrix::forceSymmetric(S)),
pick_lower = pick_lower(S),
base =0.5 * (S + t(S)),
times=100)
#> Unit: microseconds
#> expr min lq mean median uq max neval cld
#> micEcon 62.133 74.7515 136.49538 104.2430 115.6950 3581.001 100 a
#> Matri_raw 14.766 17.9130 24.15157 24.5060 26.6050 63.939 100 a
#> Matri_conv 46.767 59.8165 5621.96140 66.3785 73.5380 555393.346 100 a
#> pick_lower 27.907 30.7930 235.65058 48.9760 53.0425 12484.779 100 a
#> base 10.771 12.4535 16.97627 17.1190 18.3175 47.623 100 a
Created on 2021-02-08 by the reprex package (v1.0.0)
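To illustrate the averaging approach and the tolerance question raised above, here is a small sketch with a deliberately perturbed 2 x 2 matrix. The values are made up for illustration.

```r
# Off-diagonal entries differ by about 1e-15, mimicking rounding noise
S <- matrix(c(2, 0.5, 0.5 + 1e-15, 3), nrow = 2)

identical(S, t(S))   # FALSE: not exactly symmetric
isSymmetric(S)       # TRUE: the default tolerance absorbs the noise

# Averaging with the transpose gives an exactly symmetric matrix,
# because S[i, j] + S[j, i] is computed from the same two numbers on both sides
S_sym <- 0.5 * (S + t(S))
identical(S_sym, t(S_sym))   # TRUE
```

So whether the workaround is needed depends on whether downstream code tests symmetry exactly or with a tolerance, as isSymmetric() does by default.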
as.dist() will overwrite the upper triangle of a matrix with the lower one and replace the diagonal with zeros. This method only works on numeric matrices.
mat <- matrix(1:25, 5)
unname(`diag<-`(as.matrix(as.dist(mat)), diag(mat)))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 2 3 4 5
# [2,] 2 7 8 9 10
# [3,] 3 8 13 14 15
# [4,] 4 9 14 19 20
# [5,] 5 10 15 20 25
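Inspired by user3318600 (indexing t(s) pairs each lower-triangle entry (i, j) with its mirror (j, i); s[upper.tri(s)] on its own lists the entries in a different order):
s<-matrix(1:25,5)
s[lower.tri(s)]<-t(s)[lower.tri(s)]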

How do I apply a multi-parameter function in R?

I have the following data frame and vector.
> y
v1 v2 v3
1 1 6 43
2 4 7 5
3 0 2 32
> v
[1] 1 2 3
I want to apply the following function to every ROW in that data frame such that v is added to every ROW of y:
x <- function(vector1,vector2) {
x <- vector1 + vector2
}
... in order to get THESE results:
v1 v2 v3
1 2 8 46
2 5 9 8
3 1 4 35
mapply applies the function to COLUMNS:
> z <- mapply(x, y, MoreArgs=list(vector2=v))
> z
v1 v2 v3
[1,] 2 7 44
[2,] 6 9 7
[3,] 3 5 35
I've tried transposing the data frame so that the function will be applied to rows and not columns, but mapply gives me weird results after transposing:
> transposed <- t(y)
> transposed
[,1] [,2] [,3]
v1 1 4 0
v2 6 7 2
v3 43 5 32
> z <- mapply(x, transposed, MoreArgs=list(vector2=v))
> z
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,] 2 7 44 5 8 6 1 3 33
[2,] 3 8 45 6 9 7 2 4 34
[3,] 4 9 46 7 10 8 3 5 35
...Help?
############################ EDIT #########################
Thanks for all the answers! I'm learning tons of new R functions that I've never seen before, which is fantastic.
I want to clarify my earlier question a bit. What I'm really asking is a much more general question - how to apply a multi-parameter function to each row in R (at the moment, I'm tempted to conclude that I should just use a loop, but I would like to figure out if it IS possible, just for future reference...) (I also purposefully refrained from showing the code I'm working with since it's kind of messy).
I tried using the sweep function as was suggested, but I get the following error:
testsweep <- function(vector, z, n) {
testsweep <- z
}
> n <- names(Na_exp)
> n
[1] "NaCl.10000.2hr.AVG_Signal" "NaCl.10000.4hr.AVG_Signal"
> t <- head(Li_fcs,n=1)
> t
LiCl.1000.1hr.FoldChange LiCl.2000.1hr.FoldChange LiCl.5000.1hr.FoldChange
[1,] -0.05371838 -0.1010928 -0.01939986
LiCl.10000.1hr.FoldChange LiCl.1000.2hr.FoldChange
[1,] 0.1275617 -0.107154
LiCl.2000.2hr.FoldChange LiCl.5000.2hr.FoldChange
[1,] -0.06760782 -0.09770226
LiCl.10000.2hr.FoldChange LiCl.1000.4hr.FoldChange
[1,] -0.1124188 -0.06140386
LiCl.2000.4hr.FoldChange LiCl.5000.4hr.FoldChange
[1,] -0.04323497 -0.04275953
LiCl.10000.4hr.FoldChange LiCl.1000.8hr.FoldChange
[1,] 0.03633496 0.01879461
LiCl.2000.8hr.FoldChange LiCl.5000.8hr.FoldChange
[1,] 0.257977 -0.06357423
LiCl.10000.8hr.FoldChange
[1,] 0.07214176
> z <- colnames(Li_fcs)
> z
[1] "LiCl.1000.1hr.FoldChange" "LiCl.2000.1hr.FoldChange"
[3] "LiCl.5000.1hr.FoldChange" "LiCl.10000.1hr.FoldChange"
[5] "LiCl.1000.2hr.FoldChange" "LiCl.2000.2hr.FoldChange"
[7] "LiCl.5000.2hr.FoldChange" "LiCl.10000.2hr.FoldChange"
[9] "LiCl.1000.4hr.FoldChange" "LiCl.2000.4hr.FoldChange"
[11] "LiCl.5000.4hr.FoldChange" "LiCl.10000.4hr.FoldChange"
[13] "LiCl.1000.8hr.FoldChange" "LiCl.2000.8hr.FoldChange"
[15] "LiCl.5000.8hr.FoldChange" "LiCl.10000.8hr.FoldChange"
But when I try to apply sweep...
> test <- sweep(t, 2, z, n, FUN="testsweep")
Error in if (check.margin) { : argument is not interpretable as logical
In addition: Warning message:
In if (check.margin) { :
the condition has length > 1 and only the first element will be used
When I remove the n parameter from this test example, sweep works fine. This suggests to me that sweep cannot be used unless all the parameters provided to sweep either have the same number of columns as the t vector or are of length 1. Please correct me if I am mistaken...
You are asking to sweep v across the columns of y with the "+" function, so that v[j] is added to column j (in other words, v is added to every row):
sweep(y, 2, v, FUN="+")
v1 v2 v3
1 2 8 46
2 5 9 8
3 1 4 35
If your actual problem is really no more complicated than this, you can take advantage of R's recycling rules. You need to transpose y first, then add, then transpose the result because R matrices are stored in column-major order.
t(t(y)+v)
v1 v2 v3
1 2 8 46
2 5 9 8
3 1 4 35
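On the error in the question's edit: sweep() is defined as sweep(x, MARGIN, STATS, FUN = "-", check.margin = TRUE, ...), so when FUN is supplied by name, a fourth unnamed argument is matched to check.margin. That appears to be what produced the "not interpretable as logical" message. Extra arguments intended for FUN should be passed by name so that they fall through `...`. A minimal sketch, in which the function shift_scale and its parameter k are hypothetical:

```r
m <- matrix(1:6, nrow = 2)

# Hypothetical three-parameter function: x and stats come from sweep(), k via ...
shift_scale <- function(x, stats, k) (x + stats) * k

# k is named, so it is forwarded to shift_scale instead of matching check.margin
res <- sweep(m, 2, c(10, 20, 30), FUN = shift_scale, k = 2)
res
```

Here column j of m is shifted by the j-th statistic and then every entry is multiplied by k.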
I don't think you need mapply here. Just use t() directly or you can use rep() to make the recycling match as you want:
> set.seed(1)
> mat <- matrix(sample(1:100, 9, TRUE), ncol = 3)
> vec <- 1:3
>
> mat
[,1] [,2] [,3]
[1,] 27 91 95
[2,] 38 21 67
[3,] 58 90 63
#Approach 1 using t()
> ans1 <- t(t(mat) + vec)
#Approach 2 using rep()
> ans2 <- mat + rep(vec, each = nrow(mat))
#Are they the same?
> identical(ans1, ans2)
[1] TRUE
#Hurray!
> ans1
[,1] [,2] [,3]
[1,] 28 93 98
[2,] 39 23 70
[3,] 59 92 66
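The reason for the transpose or rep() is that R recycles a shorter vector down the columns of a matrix. A small sketch with the values of y from the question, stored as a plain matrix (the names naive, by_t and by_rep are illustrative):

```r
mat <- matrix(c(1, 4, 0, 6, 7, 2, 43, 5, 32), ncol = 3)
vec <- 1:3

naive  <- mat + vec                          # vec runs down each column: row i gets vec[i]
by_t   <- t(t(mat) + vec)                    # column j gets vec[j]
by_rep <- mat + rep(vec, each = nrow(mat))   # same, without transposing

naive[1, ]   # 2 7 44
by_t[1, ]    # 2 8 46
```

Only by_t and by_rep add vec to every row, which is the result the question asks for.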
How about using apply?
t(apply(y, 1, function(x) x + v))
[,1] [,2] [,3]
[1,] 2 8 46
[2,] 5 9 8
[3,] 1 4 35
I don't know why apply returns the rows as columns, so the result needs to be transposed.
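apply() calls the function once per row and binds each returned vector as a column of the result, which is why the transpose is needed. Additional arguments can be passed by name after the function, which covers the more general multi-parameter case from the question. A sketch, where add_vectors is a stand-in for the question's function x:

```r
y <- data.frame(v1 = c(1, 4, 0), v2 = c(6, 7, 2), v3 = c(43, 5, 32))
v <- 1:3

# Two-parameter function: vector1 is supplied by apply(), vector2 by name
add_vectors <- function(vector1, vector2) vector1 + vector2

raw <- apply(y, 1, add_vectors, vector2 = v)   # one column per row of y
res <- t(raw)                                  # back to one row per row of y
res
```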
I would definitely take a look at mdply from the plyr package. It does exactly what you want to do:
library(plyr)
mdply(data.frame(mean = 1:5, sd = 1:5), rnorm, n = 2)
