Component-wise addition of matrices as list elements - r

So I have a 1131 element list, with each element being a 5 by 5 matrix. The first element looks much like the other ones
sotest.corr.try[1]
[[1]]
[,1] [,2] [,3]
[1,] 1.00000000 -0.04125426 0.1565728
[2,] -0.04125426 1.00000000 0.1199373
[3,] 0.15657281 0.11993733 1.0000000
[4,] 0.10209354 0.06125212 0.1937589
[5,] -0.19069820 0.17598585 -0.1235949
[,4] [,5]
[1,] 0.10209354 -0.19069820
[2,] 0.06125212 0.17598585
[3,] 0.19375885 -0.12359492
[4,] 1.00000000 -0.08771679
[5,] -0.08771679 1.00000000
Starting at element 126, I'd like to just add the preceding 125 matrices to 126. So that the component in the 1,2 spot, for example, would be the sum of the first 126 1,2 components. I've noticed that something like this gets what I want
sotest.corr.try[[1]]+sotest.corr.try[[2]]
[,1] [,2] [,3] [,4]
[1,] 2.00000000 -0.08842164 0.3155670 0.2063603
[2,] -0.08842164 2.00000000 0.2363135 0.1156103
[3,] 0.31556697 0.23631345 2.0000000 0.3869373
[4,] 0.20636030 0.11561033 0.3869373 2.0000000
[5,] -0.38288102 0.35103362 -0.2489587 -0.1804376
[,5]
[1,] -0.3828810
[2,] 0.3510336
[3,] -0.2489587
[4,] -0.1804376
[5,] 2.0000000
But this doesn't
sum(sotest.corr.try[[1:126]])
Error in sotest.corr.try[[1:126]] : recursive indexing failed at level 2
Is there any way to do this quickly? Maybe using lapply?
Thanks

For purposes of illustration suppose we have a list L of 5 2x2 matrices and we want the output to be the first two, followed by the cumulative sums for the others.
1) We concatenate the first two components of the list with everything but the first two components of the cumulative sum list computed using Reduce.
# test input
M <- matrix(1:4, 2)
L <- list(M, 2*M, 3*M, 4*M, 5*M)
ix <- 1:2
out1 <- c(L[ix], Reduce(`+`, L, accumulate = TRUE)[-ix])
# check
out2 <- list(L[[1]], L[[2]], L[[1]] + L[[2]] + L[[3]],
L[[1]] + L[[2]] + L[[3]] + L[[4]], L[[1]] + L[[2]] + L[[3]] + L[[4]] + L[[5]])
identical(out1, out2)
## [1] TRUE
2) A simple for loop would also work. Input L is from (1).
L2 <- L
for(i in seq_along(L2)[-1]) L2[[i]] <- L2[[i]] + L2[[i-1]]
ix <- 1:2
out3 <- c(L[ix], L2[-ix])
# check - out2 is from (1)
identical(out2, out3)
## [1] TRUE
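The same idea can be wrapped in a small function for the list in the question. This is only a sketch: cumsum_from is a hypothetical helper name, and it assumes the goal is a running sum in which the first start - 1 matrices are kept as they are. Note that sotest.corr.try[[1:126]] fails because [[ with a vector indexes recursively; a sub-list needs single brackets, as in sotest.corr.try[1:126].

```r
# Hypothetical helper (not from the original answer): keep the first
# `start - 1` matrices unchanged and replace the rest by running sums.
cumsum_from <- function(lst, start) {
  # running[[k]] is lst[[1]] + ... + lst[[k]]
  running <- Reduce(`+`, lst, accumulate = TRUE)
  untouched <- seq_along(lst) < start
  c(lst[untouched], running[!untouched])
}

# Small stand-in for the 1131-element list of 5 x 5 matrices
set.seed(1)
mats <- replicate(10, matrix(rnorm(25), 5, 5), simplify = FALSE)
res <- cumsum_from(mats, start = 4)

# Element 4 should be the sum of the first four matrices
all.equal(res[[4]], mats[[1]] + mats[[2]] + mats[[3]] + mats[[4]])
```

For the question's data the call would be cumsum_from(sotest.corr.try, start = 126).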

Here are two other options using apply or rowSums with array (borrowing the data from G. Grothendieck's answer). Both return the total over all matrices rather than the running sums:
> apply(
+ array(
+ do.call(
+ cbind,
+ L
+ ), c(2, 2, length(L))
+ ), 1:2, sum
+ )
[,1] [,2]
[1,] 15 45
[2,] 30 60
> rowSums(
+ array(
+ do.call(
+ cbind,
+ L
+ ), c(2, 2, length(L))
+ ),
+ dims = 2
+ )
[,1] [,2]
[1,] 15 45
[2,] 30 60
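A slightly shorter way to build the 3-D array is simplify2array, which stacks equally sized matrices along a new third dimension. The sketch below assumes the same test list L as above and checks the result against Reduce.

```r
M <- matrix(1:4, 2)
L <- list(M, 2*M, 3*M, 4*M, 5*M)

A <- simplify2array(L)          # 2 x 2 x 5 array, one slice per list element
total <- rowSums(A, dims = 2)   # sum over the third dimension

# Same total as adding the matrices one by one
all.equal(total, Reduce(`+`, L))
```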

Related

dividing first row of each matrix in list of matrices by the lagged sum of all rows in each matrix of a list matrices

I have a list of 100 matrices, each 3 x 51, in R. I would like to divide the first row of each matrix by the lagged (n = 2) column sums of that same matrix. I know how to do this within a single row for a list of vectors with the following code:
Example_List2 <- lapply(Example_List1,function(x){x/lag(x,n=2)})
My attempt for the list of 3-row matrices is below. I would ultimately like to make this the replacement first row of a new DB and repeat the process for each row with different lags. My attempted code is:
List2 <- List1
lapply(List2, `[`,1,) <- lapply(List1,function(x){lapply(x, `[`,1,)/lag(colSums(x),n=2)})
lapply(List2, `[`,2,) <- lapply(List1,function(x){lapply(x, `[`,2,)/lag(colSums(x),n=3)})
lapply(List2, `[`,3,) <- lapply(List1,function(x){lapply(x, `[`,3,)/lag(colSums(x),n=4)})
We may use
library(data.table)  # provides shift()
List2 <- lapply(List1, \(x) x/do.call(rbind, shift(colSums(x), n = 2:4)))
For the second case
List3 <- lapply(List1, \(x) {
n1 <- 2:4
x1 <- colSums(x)
x2 <- x
for(i in seq_along(n1)) {
x2[i,] <- shift(x[i,], n = n1[i], type = "lead")}
colSums(x2)/x1
})
data
set.seed(24)
List1 <- replicate(100, matrix(rnorm(3*51), nrow = 3), simplify = FALSE)
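If data.table is not available, the first case can be sketched in base R. The helper lag_vec is hypothetical (it is not part of the answer above) and is meant to mimic shift() with its default lag type and NA fill for a single n. The lags 2:4 for rows 1 to 3 are taken from the question.

```r
# Hypothetical helper: shift a vector right by n positions, padding with NA
lag_vec <- function(v, n) c(rep(NA, n), head(v, -n))

set.seed(24)
List1 <- replicate(100, matrix(rnorm(3 * 51), nrow = 3), simplify = FALSE)

lags <- 2:4  # lag used for rows 1, 2 and 3 respectively
List2 <- lapply(List1, function(x) {
  cs <- colSums(x)
  # 3 x 51 matrix: row i holds the column sums lagged by lags[i]
  denom <- t(sapply(lags, function(n) lag_vec(cs, n)))
  x / denom
})

# Row 1, column 3 is divided by the column sum of column 1
all.equal(List2[[1]][1, 3], List1[[1]][1, 3] / sum(List1[[1]][, 1]))
```

The first lags[i] entries of row i are NA because no lagged sum exists for them.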
Try this.
lapply(List1, \(x) {
u <- 1:2
x[1, -u] <- x[1, -u]/colSums(x[, -u])
# x[1, u] <- NA_real_ ## uncomment if you want NA for x[1, 1:2]
x
})
# [[1]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 4 0.2916667 0.3030303 0.3095238
# [2,] 2 5 8.0000000 11.0000000 14.0000000
# [3,] 3 6 9.0000000 12.0000000 15.0000000
#
# [[2]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 4 0.2916667 0.3030303 0.3095238
# [2,] 2 5 8.0000000 11.0000000 14.0000000
# [3,] 3 6 9.0000000 12.0000000 15.0000000
#
# [[3]]
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 4 0.2916667 0.3030303 0.3095238
# [2,] 2 5 8.0000000 11.0000000 14.0000000
# [3,] 3 6 9.0000000 12.0000000 15.0000000
Data:
List1 <- replicate(3, matrix(1:15, 3, 5), simplify=FALSE)

Create a symmetric matrix from circular shifts of a vector

I'm struggling with the creation of a symmetric matrix.
Let's say a vector v <- c(1,2,3)
I want to create a matrix like this:
matrix(ncol = 3, nrow = 3, c(1,2,3,2,3,1,3,1,2), byrow = FALSE)
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 2 3 1
[3,] 3 1 2
(This is just a reprex; I have many vectors with different lengths.)
Notice this is a symmetric matrix with diagonal c(1,3,2) (different from vector v) and the manual process to create the matrix would be like this:
Using the first row as base (vector v) the process is to fill the empty spaces with the remaining values on the left side.
Any help is appreciated. Thanks!
Let me answer my own question in order to close it properly, using the incredibly simple and easy solution from Henrik's comment:
matrix(v, nrow = 3, ncol = 4, byrow = TRUE)[ , 1:3]
Maybe the byrow = TRUE matches the three steps of the illustration best conceptually, but the output is the same with:
matrix(v, nrow = 4, ncol = 3)[1:3, ]
# [,1] [,2] [,3]
# [1,] 1 2 3
# [2,] 2 3 1
# [3,] 3 1 2
Because there may be "many vectors with different lengths", it could be convenient to make a simple function and apply it to the vectors stored in a list:
cycle = function(x){
len = length(x)
matrix(x, nrow = len + 1, ncol = len)[1:len , ]
}
l = list(v1 = 1:3, v2 = letters[1:4])
lapply(l, cycle)
# $v1
# [,1] [,2] [,3]
# [1,] 1 2 3
# [2,] 2 3 1
# [3,] 3 1 2
#
# $v2
# [,1] [,2] [,3] [,4]
# [1,] "a" "b" "c" "d"
# [2,] "b" "c" "d" "a"
# [3,] "c" "d" "a" "b"
# [4,] "d" "a" "b" "c"
Another option is to use Reduce with accumulate = TRUE, applying the rotation c(x[-1], x[1]) repeatedly and keeping every intermediate result.
do.call(rbind, Reduce(function(x, y) c(x[-1], x[1]), v[-1], v, accumulate = TRUE))
# [,1] [,2] [,3]
#[1,] 1 2 3
#[2,] 2 3 1
#[3,] 3 1 2
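The pattern can also be written with explicit index arithmetic: entry (i, j) of the matrix is the element of v at position ((i + j - 2) mod n) + 1. A minimal sketch, where cycle_outer is a hypothetical name used only for illustration:

```r
# Hypothetical helper: build the matrix of circular shifts by modular indexing
cycle_outer <- function(v) {
  n <- length(v)
  outer(seq_len(n), seq_len(n), function(i, j) v[(i + j - 2) %% n + 1])
}

v <- c(1, 2, 3)
m <- cycle_outer(v)
m
isSymmetric(m)  # the index formula depends only on i + j, so m equals t(m)
```

Because the formula is symmetric in i and j, this also makes it easy to see why the result is always a symmetric matrix.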

Output converted from matrix to vector in apply

I want to apply a function over one margin (columns in my example) of a matrix. The problem is that the function returns a matrix, and apply flattens each result to a vector, so the overall result is again a matrix. My goal is to get a three-dimensional array. Here is an example (note that matrix() is not the function of interest, just an illustration):
x <- matrix(1:12, 4, 3)
apply(x, 2, matrix, nrow = 2, ncol = 2)
The output is exactly the same as the input. I have a rather dull solution to this:
library(abind)
library(magrittr)  # provides %>%
abind2 <- function (x, ...)
abind(x, ..., along = length(dim(x)) + 1)
apply(x, 2, list) %>%
lapply(unlist) %>%
lapply(matrix, nrow = 2, ncol = 2) %>%
do.call(what = 'abind2')
I believe there must exist something better than this. Something that does not include list()ing and unlist()ing columns.
Edit:
Also, the solution should be easily applicable to an array of any dimension with any choice of MARGIN, which my solution is not.
For example, I want the following to return a 4-dimensional array:
x <- array(1:24, c(4,3,2))
apply(x, 2:3, list) %>%
lapply(unlist) %>%
lapply(matrix, nrow = 2, ncol = 2) %>%
do.call(what = 'abind2')
Not that complicated at all. Simply use
array(x, dim = c(2, 2, ncol(x)))
Matrices and general arrays are stored in column-major order as one long 1D vector in memory, so you can simply reassign the dimensions.
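A quick check of that claim, as a sketch using the x from the question: assigning a new dim attribute leaves the underlying data untouched, and slice k of the result holds column k of x reshaped to 2 x 2.

```r
x <- matrix(1:12, 4, 3)

y <- x
dim(y) <- c(2, 2, ncol(x))   # only the dim attribute changes

# Slice 2 is column 2 of x laid out as a 2 x 2 matrix
identical(y[, , 2], matrix(x[, 2], nrow = 2, ncol = 2))

# The underlying data vector is the same
identical(as.vector(y), as.vector(x))
```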
OK, here is possibly what you want to do in general:
tapply(x, col(x), FUN = matrix, nrow = 2, ncol = 2)
#$`1`
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
#
#$`2`
# [,1] [,2]
#[1,] 5 7
#[2,] 6 8
#
#$`3`
# [,1] [,2]
#[1,] 9 11
#[2,] 10 12
You can try to convert your matrix into a data.frame and use lapply to apply your function on the columns (as a data.frame is a list), it will return a list, where each element represents the function result for a column:
lapply(as.data.frame(x), matrix, nrow = 2, ncol = 2)
# $V1
# [,1] [,2]
# [1,] 1 3
# [2,] 2 4
# $V2
# [,1] [,2]
# [1,] 5 7
# [2,] 6 8
# $V3
# [,1] [,2]
# [1,] 9 11
# [2,] 10 12
EDIT with the second definition of x:
x <- array(1:24, c(4,3,2))
lapply(as.data.frame(x), matrix, nrow = 2, ncol = 2)
# $V1
# [,1] [,2]
# [1,] 1 3
# [2,] 2 4
# $V2
# [,1] [,2]
# [1,] 5 7
# [2,] 6 8
# $V3
# [,1] [,2]
# [1,] 9 11
# [2,] 10 12
# $V4
# [,1] [,2]
# [1,] 13 15
# [2,] 14 16
# $V5
# [,1] [,2]
# [1,] 17 19
# [2,] 18 20
# $V6
# [,1] [,2]
# [1,] 21 23
# [2,] 22 24
EDIT2: an attempt to get an array as the result
Based on this similar question, you may try this code:
x <- array(1:24, c(4,3,2))
sapply(1:3,
function(y) sapply(1:ncol(x[, y, ]),
function(z) matrix(x[,y,z], ncol=2, nrow=2),
simplify="array"),
simplify="array")
Dimension of the result is 2 2 2 3.
Actually, the problem here is that two nested calls are needed when x is an array with more than 2 dimensions. In the last example of the question (with x <- array(1:24, c(4,3,2))), we want to apply, to each element of the third dimension, a function that in turn applies the matrix function to each element of the second dimension.
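One way to avoid nesting is to split the array along MARGIN, apply the function to each piece, and then rebuild an array whose leading dimensions are those of a single result. The sketch below uses asplit (available in base R since 3.6.0); apply_keep_dims is a hypothetical helper, and it assumes every call to FUN returns an array of the same dimensions. The resulting dimension order (2 2 3 2 here) differs from the nested sapply version.

```r
# Hypothetical helper: apply FUN over MARGIN and keep the dimensions of each
# result instead of flattening it. Assumes all results share the same dim.
apply_keep_dims <- function(x, MARGIN, FUN, ...) {
  pieces <- lapply(asplit(x, MARGIN), FUN, ...)   # one result per slice
  array(unlist(pieces, use.names = FALSE),
        dim = c(dim(pieces[[1]]), dim(x)[MARGIN]))
}

x2 <- matrix(1:12, 4, 3)
dim(apply_keep_dims(x2, 2, matrix, nrow = 2, ncol = 2))    # 3-D result

x3 <- array(1:24, c(4, 3, 2))
out <- apply_keep_dims(x3, 2:3, matrix, nrow = 2, ncol = 2)
dim(out)                                                   # 4-D result
identical(out[, , 2, 1], matrix(x3[, 2, 1], 2, 2))
```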

Initializing a matrix in R

I want to initialise a matrix with randomly generated numbers such that the sum of the numbers in each row or column is 1, in one go. Both do not need to be 1 simultaneously, i.e. either the row sums are 1 or the column sums are 1.
For sum of rows = 1 you could try something like:
num_rows <- 5
num_cols <- 5
random_uniform_matrix <- matrix(runif(num_rows * num_cols), nrow = num_rows, ncol = num_cols)
random_uniform_matrix_normalised <- random_uniform_matrix / rowSums(random_uniform_matrix)
random_uniform_matrix_normalised
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0.23587728 0.09577532 0.28102271 0.03763127 0.34969342
# [2,] 0.07252286 0.42979916 0.19738456 0.19545165 0.10484177
# [3,] 0.12868304 0.30537875 0.08245634 0.26911364 0.21436823
# [4,] 0.31938540 0.37610285 0.18834984 0.10297283 0.01318908
# [5,] 0.10775810 0.09167090 0.54077248 0.16717661 0.09262190
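The row version works because R recycles the vector of row sums down the columns, so element [i, j] is divided by the sum of row i. For column sums of 1 the divisor has to be applied along the second margin instead. A minimal sketch using sweep, with prop.table shown as an equivalent base R shortcut:

```r
set.seed(42)
m <- matrix(runif(5 * 5), nrow = 5, ncol = 5)

# Divide every column by its own sum
col_norm <- sweep(m, 2, colSums(m), "/")
colSums(col_norm)

# prop.table with margin = 2 gives the same matrix
all.equal(col_norm, prop.table(m, margin = 2))
```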

How to keep certain values in an array in R?

Suppose I have a data array,
dat <- array(NA, c(115,45,248))
Q1: What do I do if I want to get a new data array,
datnew <- array(NA, c(115,45,248))
in which all the positive values remain and the negative values are changed to NA?
Q2: What do I do if I want to get a new data array,
datnew <- array(NA,c(115,45,31))
by averaging over the third dimension, but only averaging every 8 values?
Thanks a lot.
For question 2, you can reverse the order of the dimensions, then add a dimension representing the groups to average over, then use apply:
tmp <- array( 1:32, c(2,2,8) )
tmp2 <- array( aperm(tmp), c(4,2,2,2) )
apply( tmp2, 2:4, mean )
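The same reshaping idea can be sketched without aperm, so that the result keeps the original dimension order. This assumes the values to be averaged are consecutive along the third dimension and that its length is a multiple of the group size k; a small stand-in array is used here instead of the 115 x 45 x 248 one.

```r
set.seed(1)
dat <- array(rnorm(6 * 5 * 16), c(6, 5, 16))  # stand-in for 115 x 45 x 248
k <- 8                                         # number of values per average

d <- dim(dat)
# Split the third dimension into (position within group, group)
grouped <- array(dat, c(d[1], d[2], k, d[3] / k))
datnew <- apply(grouped, c(1, 2, 4), mean)

dim(datnew)
all.equal(datnew[2, 3, 2], mean(dat[2, 3, 9:16]))
```

With the dimensions from the question this would give a 115 x 45 x 31 array.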
Answer to Q1:
dat[dat < 0] <- NA
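This treats dat as if it were a vector (which it is, just with a dim attribute), so the logical index dat < 0 picks out elements regardless of their position in the array.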
Answer to Q2:
Following Greg's nice, succinct solution, the solution I had in mind when posting my comment earlier was this (using Greg's tmp)
foo <- function(x, grp) aggregate(x, by = list(grp = grp), mean)$x
apply(tmp, 2:1, foo, grp = gl(2,4))
Examples:
Q1
> dat <- array(rnorm(3*3*3), c(3,3,3))
> dat
, , 1
[,1] [,2] [,3]
[1,] 0.1427815 0.1642626 -0.6876034
[2,] 0.6791252 2.1420478 -0.7073936
[3,] -0.9695173 -1.1050933 -0.3068230
, , 2
[,1] [,2] [,3]
[1,] 0.8246182 0.5132398 2.5428203
[2,] -0.4328711 0.9080648 -0.1231653
[3,] -0.7798170 -1.1160706 -0.9237559
, , 3
[,1] [,2] [,3]
[1,] -0.79505298 0.8795420 0.4520150
[2,] 0.04154077 -1.0422061 0.4657002
[3,] -0.67168971 0.7925304 -0.5461143
> dat[dat < 0] <- NA
> dat
, , 1
[,1] [,2] [,3]
[1,] 0.1427815 0.1642626 NA
[2,] 0.6791252 2.1420478 NA
[3,] NA NA NA
, , 2
[,1] [,2] [,3]
[1,] 0.8246182 0.5132398 2.542820
[2,] NA 0.9080648 NA
[3,] NA NA NA
, , 3
[,1] [,2] [,3]
[1,] NA 0.8795420 0.4520150
[2,] 0.04154077 NA 0.4657002
[3,] NA 0.7925304 NA
Q2
> foo <- function(x, grp) aggregate(x, by = list(grp = grp), mean)$x
> apply(tmp, 2:1, foo, grp = gl(2,4))
, , 1
[,1] [,2]
[1,] 7 9
[2,] 23 25
, , 2
[,1] [,2]
[1,] 8 10
[2,] 24 26
> all.equal(apply(tmp, 2:1, foo, grp = gl(2,4)), apply( tmp2, 2:4, mean ))
[1] TRUE
For question 1:
tmp2 <- ifelse(tmp1 < 0, NA, tmp1)
For question 2 see Greg's solution.

Resources