Apply a bunch of functions to columns of a matrix in R - r

There is a way to apply a function f to every column of a matrix:
M <- matrix(seq(1,16), 4, 4)
apply(M, 2, mean)
#[1] 2.5 6.5 10.5 14.5
But if I want to build descriptive statistics for a matrix, I need more than one index, for example max, min, mean, etc.
But R doesn't allow something like this:
apply(M, 2, c(mean, max))
to get this output:
# [,1] [,2] [,3] [,4]
#mean 2.5 6.5 10.5 14.5
#max 4 8 12 16
How can I solve this problem?

apply(M, 2, function(x) c(mean(x), max(x)))
# [,1] [,2] [,3] [,4]
# [1,] 2.5 6.5 10.5 14.5
# [2,] 4.0 8.0 12.0 16.0
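If the row labels from the desired output matter, naming the elements of the returned vector should be enough, since apply() uses those names as row names. A minimal sketch reusing M from the question (the name stats is only illustrative):

```r
M <- matrix(seq(1, 16), 4, 4)

# Naming the elements of the returned vector labels the rows of the result.
stats <- apply(M, 2, function(x) c(mean = mean(x), max = max(x)))
stats
#      [,1] [,2] [,3] [,4]
# mean  2.5  6.5 10.5 14.5
# max   4.0  8.0 12.0 16.0
```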

Try the following:
f <- c("max", "min", "mean")
sapply(f, function(x) apply(M, 2, x))
max min mean
[1,] 4 1 2.5
[2,] 8 5 6.5
[3,] 12 9 10.5
[4,] 16 13 14.5
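The result above has one column per statistic; transposing it gives the row-per-statistic layout asked for in the question. A sketch under that assumption, where the named list fs and the result res are illustrative names:

```r
M <- matrix(seq(1, 16), 4, 4)

# A named list of functions works as well as a character vector of names.
fs <- list(mean = mean, max = max, min = min)

# sapply() builds one column per function; t() turns the columns into rows.
res <- t(sapply(fs, function(f) apply(M, 2, f)))
res
```

Passing functions rather than names avoids the lookup by string, at the cost of having to name the list elements by hand.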

Related

element-wise averages of two (or more) nested lists of matrices

I have two lists A_1 and A_2, each contains two matrices.
A_1 <- list(a=matrix(1:8, 2), b=matrix(2:9, 2))
A_2 <- list(a=matrix(10:17, 2), b=matrix(5:12, 2))
I'd like to calculate the element-wise averages of these two lists, which should result in a list of
tibble::lst((A_1$a + A_2$a)/2, (A_1$b + A_2$b)/2)
I used
purrr::pmap(list(A_1 , A_2), mean)
but got
Error in mean.default(.l[[1L]][[i]], .l[[2L]][[i]], ...) :
'trim' must be numeric of length one
or
purrr::map2(A_1, A_2, mean)
Error in mean.default(.x[[i]], .y[[i]], ...) :
'trim' must be numeric of length one
In base R, we could use:
A <-list(A_1, A_2)
lapply(Reduce(\(x, y)Map('+', x, y), A), '/', length(A))
$a
[,1] [,2] [,3] [,4]
[1,] 5.5 7.5 9.5 11.5
[2,] 6.5 8.5 10.5 12.5
$b
[,1] [,2] [,3] [,4]
[1,] 3.5 5.5 7.5 9.5
[2,] 4.5 6.5 8.5 10.5
This code is generic in that we can use it to find the mean of several lists.
Note that A_1 and A_2 must contain the same number of matrices, though not necessarily 2; it could be 10, for example. Also note that corresponding matrices must have the same dimensions. Example below:
B_1 <- list(matrix(c(1,2,3,4), 2), matrix(c(1,3,4,2), 2),
matrix(c(1:10), 5), matrix(c(1:20), 5))
B_2 <- lapply(B_1, '*', 2) # in this case, B_2 is B_1 * 2
B_3 <- lapply(B_2, '*', 3) # and B_3 is B_2 * 3
Now you could use the code provided above:
B <-list(B_1, B_2, B_3)
lapply(Reduce(\(x, y)Map('+', x, y), B), '/', length(B))
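To see why this works, it can help to run the two steps separately: Reduce() folds the lists together with an element-wise Map('+', ...), and lapply() then divides each summed matrix by the number of lists. A small sketch with B_1, B_2 and B_3 as defined above (sums and means are illustrative names). Because B_2 is 2 * B_1 and B_3 is 6 * B_1, each mean should equal 3 * B_1:

```r
B_1 <- list(matrix(c(1, 2, 3, 4), 2), matrix(c(1, 3, 4, 2), 2),
            matrix(1:10, 5), matrix(1:20, 5))
B_2 <- lapply(B_1, '*', 2)
B_3 <- lapply(B_2, '*', 3)
B <- list(B_1, B_2, B_3)

# Step 1: add the lists pairwise, matrix by matrix.
sums <- Reduce(function(x, y) Map('+', x, y), B)

# Step 2: divide every summed matrix by the number of lists.
means <- lapply(sums, '/', length(B))

# B_1 + 2 * B_1 + 6 * B_1 = 9 * B_1, so each mean is 3 * B_1.
all.equal(means, lapply(B_1, '*', 3))
# [1] TRUE
```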
Your mistake is passing the second matrix as the trim= argument of mean, which is its second argument. You need to concatenate the matrices. Example:
mean(1:3, 2:4)
# Error in mean.default(1:3, 2:4) : 'trim' must be numeric of length one
mean(c(1:3, 2:4))
# [1] 2.5
As a solution, you may use Map:
Map(\(x, y) (x + y)/2, A_1, A_2)
# $a
# [,1] [,2] [,3] [,4]
# [1,] 5.5 7.5 9.5 11.5
# [2,] 6.5 8.5 10.5 12.5
#
# $b
# [,1] [,2] [,3] [,4]
# [1,] 3.5 5.5 7.5 9.5
# [2,] 4.5 6.5 8.5 10.5
Or why not use arrays?
AA_1 <- array(unlist(A_1), dim=c(dim(A_1$a), length(A_1)))
AA_2 <- array(unlist(A_2), dim=c(dim(A_2$a), length(A_2)))
(AA_1 + AA_2)/2
# , , 1
#
# [,1] [,2] [,3] [,4]
# [1,] 5.5 7.5 9.5 11.5
# [2,] 6.5 8.5 10.5 12.5
#
# , , 2
#
# [,1] [,2] [,3] [,4]
# [1,] 3.5 5.5 7.5 9.5
# [2,] 4.5 6.5 8.5 10.5
In base R:
item_names <- names(A_1)
structure(
lapply(item_names, function(name){
0.5 * (A_1[[name]] + A_2[[name]])
## or, if you want the scalar mean of all elements:
## mean(c(A_1[[name]], A_2[[name]]))
}),
names = item_names
)
#> $a
#> [,1] [,2] [,3] [,4]
#> [1,] 5.5 7.5 9.5 11.5
#> [2,] 6.5 8.5 10.5 12.5
#>
#> $b
#> [,1] [,2] [,3] [,4]
#> [1,] 3.5 5.5 7.5 9.5
#> [2,] 4.5 6.5 8.5 10.5

Dividing a list of matrices by a matrix

I have a list of matrices, and I would like to divide each value in each matrix by a different value.
l1 <- list(1,2,3,4,5,6)
l2 <- list(7,8,9,10,11,12)
mat <- Map(
function(x, y) outer(unlist(x), unlist(y), `+`) / 2,
split(l1, ceiling(seq_along(l1) / 3)),
split(l2, ceiling(seq_along(l2) / 3))
)
For example the output below shows one of the elements in the mat list:
$`1`
[,1] [,2] [,3]
[1,] 4.0 4.5 5.0
[2,] 4.5 5.0 5.5
[3,] 5.0 5.5 6.0
I would like to divide the values in the matrix by another matrix with different values,
maybe a matrix that looks like this (I wasn't sure how to create a matrix in R):
2 1 2
3 2 3
1 2 3
My desired output would then look like this:
[,1] [,2] [,3]
[1,] 4.0/2 4.5/1 5.0/2
[2,] 4.5/3 5.0/2 5.5/3
[3,] 5.0/1 5.5/2 6.0/3
How could I create this output? How do I create a matrix with my desired values in R?
Thank you.
If your matrices are the same dimensions you can divide them with the / operator.
# create matrix to divide by
mat_div <- matrix(c(2,3,1,1,2,2,2,3,3), nrow = 3)
# divide each matrix in the list
lapply(mat, `/`, mat_div)
#------
$`1`
[,1] [,2] [,3]
[1,] 2.0 4.50 2.500000
[2,] 1.5 2.50 1.833333
[3,] 5.0 2.75 2.000000
$`2`
[,1] [,2] [,3]
[1,] 3.5 7.50 4.000000
[2,] 2.5 4.00 2.833333
[3,] 8.0 4.25 3.000000
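Since the question also asks how to type the matrix in as written, one option is byrow = TRUE, which fills the matrix row by row instead of R's default column by column. A sketch (mat_div_byrow is an illustrative name):

```r
# Values entered in reading order: first row, then second, then third.
mat_div_byrow <- matrix(c(2, 1, 2,
                          3, 2, 3,
                          1, 2, 3), nrow = 3, byrow = TRUE)

# Same matrix as the column-wise definition used above.
identical(mat_div_byrow, matrix(c(2, 3, 1, 1, 2, 2, 2, 3, 3), nrow = 3))
# [1] TRUE
```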
We can use Map
mat <- Map(`/`, mat, list(mat2))
-output
mat
$`1`
[,1] [,2] [,3]
[1,] 2.0 4.50 2.500000
[2,] 1.5 2.50 1.833333
[3,] 5.0 2.75 2.000000
$`2`
[,1] [,2] [,3]
[1,] 3.5 7.50 4.000000
[2,] 2.5 4.00 2.833333
[3,] 8.0 4.25 3.000000
data
mat2 <- cbind(c(2, 3, 1), c(1, 2, 2), c(2, 3, 3))

apply and lapply in one function return an NAN

I have a function that returns a list of lists, and I would like to find the standard deviation of the matrices in its output. The output of my function is a list of two lists. I tried this code, but it returns NaN. Since my function is complex, I am using an example from another question, which is quite close to what I am trying to do.
> A <- matrix(c(1:9), 3, 3)
> A
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> B <- matrix(c(2:10), 3, 3)
> B
[,1] [,2] [,3]
[1,] 2 5 8
[2,] 3 6 9
[3,] 4 7 10
> my.list1 <- list(A, B)
So the mean of the first list is:
[,1] [,2] [,3]
[1,] 1.5 4.5 7.5
[2,] 2.5 5.5 8.5
[3,] 3.5 6.5 9.5
Then the standard deviation will be:
[,1] [,2] [,3]
[1,] 0.7071068 0.7071068 0.7071068
[2,] 0.7071068 0.7071068 0.7071068
[3,] 0.7071068 0.7071068 0.7071068
> c <- matrix(c(1:9), 3, 3)
> c
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> d <- matrix(c(2:10), 3, 3)
> d
[,1] [,2] [,3]
[1,] 2 5 8
[2,] 3 6 9
[3,] 4 7 10
> my.list2 <- list(c, d)
my.list <-list(my.list1,my.list2)
How can I get the element-by-element standard deviation of the matrices in each list?
Try ?rapply
> rapply(my.list, sd)
[1] 2.738613 2.738613 2.738613 2.738613
You could bind your lists into an array, or perhaps make your function return an array(?), then you could use apply() to apply your chosen functions...
A <- matrix(1:9, 3, 3)
B <- matrix(2:10, 3, 3)
my.list1 <- list(A, B)
c <- matrix(1:9, 3, 3)
d <- matrix(2:10, 3, 3)
my.list2 <- list(c, d)
Create an array from all 4 matrices
my.array1 <- abind::abind(c(my.list1, my.list2), along = 3)
Find the mean() and sd() of each cell across the third dimension
apply(my.array1, c(1, 2), mean)
apply(my.array1, c(1,2), sd)
Output
[,1] [,2] [,3]
[1,] 1.5 4.5 7.5
[2,] 2.5 5.5 8.5
[3,] 3.5 6.5 9.5
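Note that the array above pools all four matrices, so a standard deviation taken from it uses four values per cell. If the aim is the 0.7071068 matrix from the question, i.e. one result per inner list, a sketch along the same lines could apply the idea to each inner list separately. simplify2array() is base R; sd_by_list is an illustrative name:

```r
A <- matrix(1:9, 3, 3)
B <- matrix(2:10, 3, 3)
my.list <- list(list(A, B), list(A, B))

# For each inner list: stack its matrices into a 3 x 3 x 2 array,
# then take the standard deviation across the third dimension.
sd_by_list <- lapply(my.list, function(l) {
  apply(simplify2array(l), c(1, 2), sd)
})
sd_by_list[[1]]
```

Every cell compares two values that differ by 1, so each entry should be sqrt(0.5), about 0.7071068.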

Unexpected apply function behaviour in R

I've discovered a surprising behaviour by apply that I wonder if anyone can explain. Let's take a simple matrix:
> (m = matrix(1:8,ncol=4))
[,1] [,2] [,3] [,4]
[1,] 1 3 5 7
[2,] 2 4 6 8
We can flip it vertically thus:
> apply(m, MARGIN=2, rev)
[,1] [,2] [,3] [,4]
[1,] 2 4 6 8
[2,] 1 3 5 7
This applies the rev() vector reversal function iteratively to each column. But when we try to apply rev by row we get:
> apply(m, MARGIN=1, rev)
[,1] [,2]
[1,] 7 8
[2,] 5 6
[3,] 3 4
[4,] 1 2
.. a 90 degree anti-clockwise rotation! Apply delivers the same result using FUN=function(v) {v[length(v):1]} so it is definitely not rev's fault.
Any explanation for this?
This is because apply returns a matrix that is filled column-wise, and you're iterating over the rows.
apply passes each row to the function in turn, and each result then becomes a column of the output.
Passing print as the function shows what's being passed to rev at each iteration:
x <- apply(m, 1, print)
[1] 1 3 5 7
[1] 2 4 6 8
That is, each call to print is passed a vector. Two calls, and c(1,3,5,7) and c(2,4,6,8) are being passed to the function.
Reversing these gives c(7,5,3,1) and c(8,6,4,2), then these are used as the columns of the return matrix, giving the result that you see.
The documentation states that
If each call to FUN returns a vector of length n, then apply returns
an array of dimension c(n, dim(X)[MARGIN]) if n > 1.
From that perspective, this behaviour is not a bug at all; that's how it is intended to work.
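The dimension rule quoted from the documentation can be checked directly on the matrix from the question. A short sketch:

```r
m <- matrix(1:8, ncol = 4)

# Each row has length 4 and there are 2 rows, so the result is 4 x 2.
dim(apply(m, MARGIN = 1, rev))
# [1] 4 2

# Each column has length 2 and there are 4 columns, so the result is 2 x 4.
dim(apply(m, MARGIN = 2, rev))
# [1] 2 4
```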
One may wonder why this is chosen to be a default setting, instead of preserving the structure of the original matrix. Consider the following example:
> apply(m, 1, quantile)
[,1] [,2]
0% 1.0 2.0
25% 2.5 3.5
50% 4.0 5.0
75% 5.5 6.5
100% 7.0 8.0
> apply(m, 2, quantile)
[,1] [,2] [,3] [,4]
0% 1.00 3.00 5.00 7.00
25% 1.25 3.25 5.25 7.25
50% 1.50 3.50 5.50 7.50
75% 1.75 3.75 5.75 7.75
100% 2.00 4.00 6.00 8.00
> all(rownames(apply(m, 2, quantile)) == rownames(apply(m, 1, quantile)))
[1] TRUE
Consistent? Indeed, why would we expect anything else?
When you pass a row vector to rev, it drops the dimensions and returns a plain vector.
t(c(1,2,3,4))
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
rev(t(c(1,2,3,4)))
[1] 4 3 2 1
which is not what you expected
[,1] [,2] [,3] [,4]
[1,] 4 3 2 1
So, you'll have to transpose the call to apply to get what you want
t(apply(m, MARGIN=1, rev))
[,1] [,2] [,3] [,4]
[1,] 7 5 3 1
[2,] 8 6 4 2
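For this particular case, a horizontal flip, plain indexing may be simpler than apply() plus t(). A sketch, with flipped as an illustrative name:

```r
m <- matrix(1:8, ncol = 4)

# Reverse the column order directly; drop = FALSE keeps the result a matrix
# even if m happens to have a single row.
flipped <- m[, ncol(m):1, drop = FALSE]
flipped
#      [,1] [,2] [,3] [,4]
# [1,]    7    5    3    1
# [2,]    8    6    4    2
```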

Generate covariance matrix from correlation matrix

I have a correlation matrix:
a <- matrix(c(1, .8, .8, .8, 1, .8, .8, .8, 1), 3)
## [,1] [,2] [,3]
## [1,] 1.0 0.8 0.8
## [2,] 0.8 1.0 0.8
## [3,] 0.8 0.8 1.0
I would now like to create a covariance matrix from the correlation matrix. How can this be done in R?
I tried:
e1.sd <- 3
e2.sd <- 10
e3.sd <- 3
e.cov <- a * as.matrix(c, e1.sd, e2.sd, e3.sd) %*% t(as.matrix(c(e1.sd, e2.sd, e3.sd)))
But I get the error:
Error in a * as.matrix(c, e1.sd, e2.sd, e3.sd) %*% t(as.matrix(c(e1.sd, :
non-conformable arrays
What am I doing wrong?
If you know the standard deviations of your individual variables, you can:
stdevs <- c(e1.sd, e2.sd, e3.sd)
#stdevs is the vector that contains the standard deviations of your variables
b <- stdevs %*% t(stdevs)
# b is an n*n matrix whose generic term is stdevs[i]*stdevs[j] (n is your number of variables)
a_covariance <- b * a #your covariance matrix
On the other hand, if you don't know the standard deviations, it's impossible.
require(MBESS)
a <- matrix(c(1,.8,.8,.8,1,.8,.8,.8,1),3)
> cor2cov(a,c(3,10,3))
[,1] [,2] [,3]
[1,] 9.0 24 7.2
[2,] 24.0 100 24.0
[3,] 7.2 24 9.0
Building on S4M's answer, in base R, I would write this function:
cor2cov <- function(V, sd) {
V * tcrossprod(sd)
}
tcrossprod will calculate the product of each combination of elements of the sd vector (equivalent to x %*% t(x)), which we then multiply element-wise by the correlation matrix.
Here's a quick check that the function is correct using the built in mtcars data set:
all.equal(
cor2cov(cor(mtcars), sapply(mtcars, sd)),
cov(mtcars)
)
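As a further sanity check on the values from the question, the diagonal of the result should hold the variances, and stats::cov2cor() should recover the original correlation matrix. A sketch that reuses the cor2cov definition above (e.cov and sds are illustrative names):

```r
cor2cov <- function(V, sd) {
  V * tcrossprod(sd)
}

a <- matrix(c(1, .8, .8, .8, 1, .8, .8, .8, 1), 3)
sds <- c(3, 10, 3)
e.cov <- cor2cov(a, sds)

# The diagonal holds the variances, i.e. the squared standard deviations.
diag(e.cov)
# [1]   9 100   9

# Round trip: converting back should give the correlation matrix again.
all.equal(cov2cor(e.cov), a)
# [1] TRUE
```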
The answer marked as correct is wrong.
The correct solution seems to be the one provided by MBESS package, so see the post from dayne.
> a
[,1] [,2] [,3]
[1,] 1.0 0.8 0.8
[2,] 0.8 1.0 0.8
[3,] 0.8 0.8 1.0
> b <- c(3,10,3)
> b %*% t(b)
[,1] [,2] [,3]
[1,] 9 30 9
[2,] 30 100 30
[3,] 9 30 9
> c <- b %*% t(b)
> c %*% a
[,1] [,2] [,3]
[1,] 40.2 44.4 40.2
[2,] 134.0 148.0 134.0
[3,] 40.2 44.4 40.2
> cor2cov(cor.mat=a, b )
[,1] [,2] [,3]
[1,] 9.0 24 7.2
[2,] 24.0 100 24.0
[3,] 7.2 24 9.0
> a %*% c
[,1] [,2] [,3]
[1,] 40.2 134 40.2
[2,] 44.4 148 44.4
[3,] 40.2 134 40.2
>
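The transcript above uses %*% (matrix multiplication), whereas the formula in the earlier answer uses * (element-wise multiplication), which may explain the mismatch. A sketch comparing the two with the same a and b (outer_sd is an illustrative name):

```r
a <- matrix(c(1, .8, .8, .8, 1, .8, .8, .8, 1), 3)
b <- c(3, 10, 3)
outer_sd <- b %*% t(b)   # 3 x 3 matrix of b[i] * b[j]

# Element-wise product: entry [i, j] is a[i, j] * b[i] * b[j].
outer_sd * a
#      [,1] [,2] [,3]
# [1,]  9.0   24  7.2
# [2,] 24.0  100 24.0
# [3,]  7.2   24  9.0

# Matrix product: sums over rows and columns, which is a different operation.
outer_sd %*% a
```

The element-wise result matches the cor2cov output shown above, so under this reading both approaches agree.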
