R: How to compare 2 matrices

I have 2 matrices
matrix1 (nrow=3, ncol=3)
matrix2 (nrow=5, ncol=5)
I know how to conditionally replace an element of one matrix with an element of another matrix, but only when the two elements share the same [i,j] position, like this:
ifelse(matrix1<0.5, matrix2[,], matrix1[,])
Question:
Here I'd like to replace an element of matrix1 with an element of matrix2 taken from a different column, like this:
If matrix1[i,j] < 0.5, then I want to replace it with matrix2[i,j+2].
Otherwise, I want to replace it with matrix2[i,j+1].
The problem is:
I can't use a loop because of performance.
I don't know how to tell ifelse to look in another column.
How can I do this kind of comparison efficiently on a big matrix?
Here is the data:
> dput(matrix1)
structure(c(0.782534098718315, 0.279918688116595, 0.139927505282685,
0.485497816000134, 0.150636059232056, 0.976677459431812, 0.101831247797236,
0.491994257550687, 0.492571017006412), .Dim = c(3L, 3L))
> dput(matrix2)
structure(1:25, .Dim = c(5L, 5L))

Here m1 and m2 are the two matrices (matrix1 and matrix2). Based on @alexis_laz's comments, you could try
indx1 <- tail(seq(ncol(m1)+1),ncol(m1))
indx2 <- tail(seq(ncol(m1)+2),ncol(m1))
rowInd <- 1:nrow(m1)
ifelse(m1 < 0.5, m2[rowInd,indx2], m2[rowInd, indx1])
# [,1] [,2] [,3]
#[1,] 6 16 21
#[2,] 12 17 22
#[3,] 13 13 23
Or you can build the index matrices explicitly:
indx <- cbind(c(row(m1)), c(col(m1)))
indx1 <- cbind(indx[,1], indx[,2]+1)
indx2 <- cbind(indx[,1], indx[,2]+2)
ifelse(m1 < 0.5, m2[indx2], m2[indx1])
# [,1] [,2] [,3]
#[1,] 6 16 21
#[2,] 12 17 22
#[3,] 13 13 23
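Both variants above build two candidate results and let ifelse pick between them. As a minimal sketch of a variant (not part of the original answer), the column offset can be computed first, so that m2 is indexed only once with a two-column index matrix. The names off, idx and res are introduced here for illustration only.

```r
# Data from the question
m1 <- structure(c(0.782534098718315, 0.279918688116595, 0.139927505282685,
                  0.485497816000134, 0.150636059232056, 0.976677459431812,
                  0.101831247797236, 0.491994257550687, 0.492571017006412),
                .Dim = c(3L, 3L))
m2 <- structure(1:25, .Dim = c(5L, 5L))

# Column offset per element: +2 where m1 < 0.5, otherwise +1
off <- ifelse(m1 < 0.5, 2L, 1L)

# One (row, column) pair per element of m1, in column-major order
idx <- cbind(c(row(m1)), c(col(m1) + off))

# Index m2 once and restore the shape of m1
res <- matrix(m2[idx], nrow = nrow(m1))
res
```

This assumes m2 has at least ncol(m1) + 2 columns and at least nrow(m1) rows, as in the question's data.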

Related

How to vectorize this operation?

I have an n x 3 x m array, call it I. It contains 3 columns, n rows (say n=10), and m slices. I have a computation that must be done to replace the third column in each slice based on the other 2 columns in the slice.
I've written a function insertNewRows(I[,,simIndex]) that takes a given slice and replaces the third column. The following for-loop does what I want, but it's slow. Is there a way to speed this up by using one of the apply functions? I cannot figure out how to get them to work in the way I'd like.
for (simIndex in 1:m) {
  I[, , simIndex] = insertNewRows(I[, , simIndex])
}
I can provide more details on insertNewRows if needed, but the short version is that it takes a probability based on the columns I[,1:2, simIndex] of a given slice of the array, and generates a binomial RV based on the probability.
It seems like one of the apply functions should work just by using
I = apply(FUN = insertNewRows, MARGIN = c(1,2,3)) but that just produces gibberish..?
Thank you in advance!
IK
The question defines neither the input, the transformation, nor the expected result, so we can't really answer it. Here is an example that appends a row of ones to a[,,i] for each i, which may suggest how you could solve the problem yourself.
The code below shows five equivalent ways to do it: sapply, apply, plyr::aaply (followed by aperm), reshaping with matrix, and abind::abind.
# input array and function
a <- array(1:24, 2:4)
f <- function(x) rbind(x, 1) # append a row of 1's
aa <- array(sapply(1:dim(a)[3], function(i) f(a[,,i])), dim(a) + c(1,0,0))
aa2 <- array(apply(a, 3, f), dim(a) + c(1,0,0))
aa3 <- aperm(plyr::aaply(a, 3, f), c(2, 3, 1))
aa4 <- array(rbind(matrix(a, dim(a)[1]), 1), dim(a) + c(1,0,0))
aa5 <- abind::abind(a, array(1, dim(a)[2:3]), along = 1)
dimnames(aa3) <- dimnames(aa5) <- NULL
sapply(list(aa2, aa3, aa4, aa5), identical, aa)
## [1] TRUE TRUE TRUE TRUE
aa[,,1]
## [,1] [,2] [,3]
## [1,] 1 3 5
## [2,] 2 4 6
## [3,] 1 1 1
aa[,,2]
## [,1] [,2] [,3]
## [1,] 7 9 11
## [2,] 8 10 12
## [3,] 1 1 1
aa[,,3]
## [,1] [,2] [,3]
## [1,] 13 15 17
## [2,] 14 16 18
## [3,] 1 1 1
aa[,,4]
## [,1] [,2] [,3]
## [1,] 19 21 23
## [2,] 20 22 24
## [3,] 1 1 1
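Applied to the situation in the question, the apply-and-reshape pattern (aa2 above) might look like the sketch below. The function replace_third is a hypothetical stand-in for insertNewRows, and the logistic rule for the probability is an assumption made purely for illustration, since the question does not give the real computation.

```r
set.seed(1)
n <- 10
m <- 4
arr <- array(runif(n * 3 * m), dim = c(n, 3, m))   # plays the role of I

# Hypothetical stand-in for insertNewRows: returns a slice of the same shape
replace_third <- function(slice) {
  p <- plogis(slice[, 1] - slice[, 2])              # assumed probability rule
  slice[, 3] <- rbinom(nrow(slice), size = 1, prob = p)
  slice
}

# apply() flattens each returned slice into one column, so reshape afterwards
out <- array(apply(arr, 3, replace_third), dim = dim(arr))

# If the rule is elementwise, the slices need not be visited one by one
out2 <- arr
out2[, 3, ] <- rbinom(n * m, size = 1, prob = plogis(arr[, 1, ] - arr[, 2, ]))
```

MARGIN = 3 hands whole slices to the function. MARGIN = c(1,2,3) would hand it single elements instead, which is why that attempt does not produce a usable result.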

Convert bigger dimension matrix to smaller dimension matrix with a loop

I currently have a 185*185 matrix, and the goal is to convert it into a 35*35 matrix by aggregating values over groups of rows and columns of the 185*185 matrix.
Example:
I have a 8*8 matrix as below:
matrix_x <- matrix(1:64, nrow = 8)
Then I want to convert it into a 4*4 matrix:
matrix_y <- matrix(NA, nrow = 4, ncol = 4)
The list below defines which rows and columns of the 8*8 matrix are aggregated into each row and column of the 4*4 matrix:
col_list <- list(
  1,
  2:3,
  c(4, 8),
  5:7
)
What I've done so far is assign each value manually, as below:
matrix_y[1,1] <- sum(matrix_x[col_list[[1]],col_list[[1]]])
matrix_y[1,2] <- sum(matrix_x[col_list[[1]],col_list[[2]]])
matrix_y[1,3] <- sum(matrix_x[col_list[[1]],col_list[[3]]])
matrix_y[1,4] <- sum(matrix_x[col_list[[1]],col_list[[4]]])
matrix_y[2,1] <- sum(matrix_x[col_list[[2]],col_list[[1]]])
matrix_y[2,2] <- sum(matrix_x[col_list[[2]],col_list[[2]]])
matrix_y[2,3] <- sum(matrix_x[col_list[[2]],col_list[[3]]])
matrix_y[2,4] <- sum(matrix_x[col_list[[2]],col_list[[4]]])
matrix_y[3,1] <- sum(matrix_x[col_list[[3]],col_list[[1]]])
matrix_y[3,2] <- sum(matrix_x[col_list[[3]],col_list[[2]]])
matrix_y[3,3] <- sum(matrix_x[col_list[[3]],col_list[[3]]])
matrix_y[3,4] <- sum(matrix_x[col_list[[3]],col_list[[4]]])
matrix_y[4,1] <- sum(matrix_x[col_list[[4]],col_list[[1]]])
matrix_y[4,2] <- sum(matrix_x[col_list[[4]],col_list[[2]]])
matrix_y[4,3] <- sum(matrix_x[col_list[[4]],col_list[[3]]])
matrix_y[4,4] <- sum(matrix_x[col_list[[4]],col_list[[4]]])
This approach works, but I'm looking for a more efficient way to achieve this, since it takes so many lines of code.
There should be a neater or easier way to do this, but here is one straightforward option:
n <- 4
t(sapply(seq_len(n), function(p) sapply(col_list, function(q) sum(matrix_x[p, q]))))
# [,1] [,2] [,3] [,4]
#[1,] 1 26 82 123
#[2,] 2 28 84 126
#[3,] 3 30 86 129
#[4,] 4 32 88 132
This matches matrix_y as defined in the original version of the question, before it was updated.
For the updated question, we can use outer
apply_fun <- function(x, y) sum(matrix_x[x, y])
outer(col_list, col_list, Vectorize(apply_fun))
# [,1] [,2] [,3] [,4]
#[1,] 1 26 82 123
#[2,] 5 58 170 255
#[3,] 12 72 184 276
#[4,] 18 108 276 414
Or, following the same approach as in the original answer, with nested sapply:
t(sapply(col_list, function(p) sapply(col_list, function(q) sum(matrix_x[p, q]))))
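For a larger matrix such as the 185*185 one, the same block sums can also be written as two matrix products with a 0/1 membership matrix. This is a sketch of an alternative technique, not taken from the answer above; G is a name introduced here for illustration.

```r
matrix_x <- matrix(1:64, nrow = 8)
col_list <- list(1, 2:3, c(4, 8), 5:7)

# G[i, g] is 1 when index i belongs to group g, and 0 otherwise
G <- sapply(col_list, function(ix) as.numeric(seq_len(nrow(matrix_x)) %in% ix))

# Entry [a, b] sums matrix_x over the rows in group a and the columns in group b
matrix_y <- t(G) %*% matrix_x %*% G
matrix_y
```

The sketch assumes a square matrix whose rows and columns are grouped by the same list, as in the question.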

rowsums for matrix over randomly specified subsets of columns in R

I have this matrix
mu<-1:100
sigma<-100:1
sample.size<-10
toy.mat<-mapply(function(x,y){rnorm(x,y,n=sample.size)},x=mu,y=sigma)
colnames(toy.mat) <- rep(1:10, each = 10)
For the 10 columns named 1, I would like to randomly select 5 pairs and take the row sums of each pair, to generate 5 columns named 1a, 1b, 1c, 1d, 1e. I will do the same for the columns named 2, 3, and so on up to 10.
Is there a data.table method to do this?
I'm still unsure about what you're trying to do.
This is what I understood.
I first split toy.mat into a list of 10 matrices (chunks). This is for convenience.
# Split toy.mat into list of matrices
lst <- lapply(seq(1, 100, by = 10), function(i) toy.mat[, i:(i+9)]);
Next, generate 5 random pairs by sampling all 10 numbers from the sequence 1:10 without replacement and arranging them into a 2x5 matrix, so that each column holds one pair. Repeat for all 10 matrix chunks.
# Generate 5 random pairs
set.seed(2017); # For reproducibility of results
rand <- replicate(10, matrix(sample(1:10, 10), ncol = 5), simplify = FALSE);
head(rand, n = 2);
#[[1]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 10 4 9 1 6
#[2,] 5 3 8 2 7
#
#[[2]]
# [,1] [,2] [,3] [,4] [,5]
#[1,] 7 9 3 5 10
#[2,] 1 4 2 6 8
Select corresponding columns based on pairs from rand and calculate the rowSums. Do that for every matrix chunk.
# Select column pairs and calculate rowSums
lst.rand <- lapply(1:10, function(i)
  sapply(as.data.frame(rand[[i]]), function(w) rowSums(lst[[i]][, w])));
Bind list elements into matrix, and set column names.
# Bind into a single matrix
mat <- do.call(cbind, lst.rand);
colnames(mat) <- as.vector(sapply(1:10, function(i) paste0(i, letters[1:5])));
mat[1:5, 1:6];
# 1a 1b 1c 1d 1e 2a
#[1,] 21.410826 34.90337 -11.297396 -50.56332 -115.82456 51.32369
#[2,] 5.323713 -144.26640 169.697538 -58.35540 96.25637 -78.95717
#[3,] -78.925937 -45.32790 -177.546469 251.69348 -52.85132 123.38741
#[4,] -33.673704 -95.64937 3.561921 -253.95046 -136.88182 -10.20650
#[5,] 51.080564 -180.87033 -161.861342 108.41120 188.07454 52.34226
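The same steps can be condensed by working on column indices directly instead of splitting the matrix first. This is only a sketch under the same reading of the question; toy, grp and pair_sums are illustrative names, and toy is a simplified stand-in for toy.mat.

```r
set.seed(42)
toy <- matrix(rnorm(10 * 100), nrow = 10)   # stand-in for toy.mat
grp <- rep(1:10, each = 10)                 # group label of each column

# For one group of column indices: shuffle them, read them off as 5 pairs
# (one pair per column of a 2 x 5 matrix) and sum the two columns of each pair
pair_sums <- function(cols) {
  pairs <- matrix(sample(cols), nrow = 2)
  apply(pairs, 2, function(p) rowSums(toy[, p]))
}

res <- do.call(cbind, lapply(split(seq_len(ncol(toy)), grp), pair_sums))
colnames(res) <- paste0(rep(1:10, each = 5), letters[1:5])
dim(res)
```

Because every column of a group is used in exactly one pair, the five new columns of a group add up to the row sums of that group's ten original columns, which gives a quick sanity check.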

Subset matrix with arrays in r

It is probably fairly basic but I have not found an easy solution.
Assume I have a three-dimensional array:
m <- array(seq_len(18),dim=c(3,3,2))
and I would like to subset the array with these vectors of row and column indexes:
idxrows <- c(1,2,3)
idxcols <- c(1,1,2)
obtaining the vectors along the third dimension at positions (1,1), (2,1) and (3,2), one per column, that is:
[,1] [,2] [,3]
[1,] 1 2 6
[2,] 10 11 15
I have tried m[idxrows,idxcols,] but without any luck.
Is there any way to do it (obviously without using a for loop)?
Not sure if there is any easy built in extract syntax, but you can work around this with mapply:
mapply(function(i, j) m[i,j,], idxrows, idxcols)
# [,1] [,2] [,3]
#[1,] 1 2 6
#[2,] 10 11 15
Or, slightly more convoluted, create an index matrix with one column per dimension of the original array:
thirdDim <- dim(m)[3]
index <- cbind(rep(idxrows, each = thirdDim), rep(idxcols, each = thirdDim), 1:thirdDim)
matrix(m[index], nrow = thirdDim)
# [,1] [,2] [,3]
#[1,] 1 2 6
#[2,] 10 11 15
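A third way to express the same extraction, sketched here as an alternative, is to apply a two-column index matrix to each slice along the third dimension. The names pos and res are introduced for this example only.

```r
m <- array(seq_len(18), dim = c(3, 3, 2))
idxrows <- c(1, 2, 3)
idxcols <- c(1, 1, 2)

# One (row, column) pair per requested position
pos <- cbind(idxrows, idxcols)

# Each slice m[, , k] is a matrix, so pos picks one element per pair;
# apply() returns one column per slice, hence the transpose
res <- t(apply(m, 3, function(s) s[pos]))
res
```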

Sum Every N Values in Matrix

I have looked at an earlier question about summing every 2 values in each row of a matrix ("sum specific columns among rows"). I also looked at another question, "R Sum every k columns in matrix", which is more similar to mine, but I could not get its solution to work in my case. Here is the code that I am working with:
y <- matrix(1:27, nrow = 3)
y
m1 <- as.matrix(y)
n <- 3
dim(m1) <- c(nrow(m1)/n, ncol(m1), n)
res <- matrix(rowSums(apply(m1, 1, I)), ncol=n)
identical(res[1,],rowSums(y[1:3,]))
sapply(split.default(y, 0:(length(y)-1) %/% 3), rowSums)
I just get an error message when applying this. The desired output is a matrix with the following values:
[,1] [,2] [,3]
[1,] 12 39 66
[2,] 15 42 69
[3,] 18 45 72
To sum consecutive sets of n elements from each row, you just need to write a function that does the summing and apply it to each row:
n <- 3
t(apply(y, 1, function(x) tapply(x, ceiling(seq_along(x)/n), sum)))
# 1 2 3
# [1,] 12 39 66
# [2,] 15 42 69
# [3,] 18 45 72
Transform the matrix to an array and use colSums (as suggested by @nongkrong):
y <- matrix(1:27, nrow = 3)
n <- 3
a <- y
dim(a) <- c(nrow(a), ncol(a)/n, n)
b <- aperm(a, c(2,1,3))
colSums(b)
# [,1] [,2] [,3]
#[1,] 12 39 66
#[2,] 15 42 69
#[3,] 18 45 72
Of course this assumes that ncol(y) is divisible by n.
PS: You can of course avoid creating so many intermediate objects. They are there for didactic purposes.
I would do something similar to the OP -- apply rowSums on subsets of the matrix:
n = 3
ng = ncol(y)/n
sapply( 1:ng, function(jg) rowSums(y[, (jg-1)*n + 1:n ]))
# [,1] [,2] [,3]
# [1,] 12 39 66
# [2,] 15 42 69
# [3,] 18 45 72
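When ncol(y) is not a multiple of n, a grouping vector still works. The sketch below uses base R's rowsum on the transposed matrix; it is an alternative to the answers above, and sum_every is a name made up for this example.

```r
sum_every <- function(y, n) {
  grp <- ceiling(seq_len(ncol(y)) / n)   # 1,1,1,2,2,2,... one label per column
  # rowsum() adds up rows by group, so transpose before and after
  unname(t(rowsum(t(y), grp)))
}

y <- matrix(1:27, nrow = 3)
sum_every(y, 3)

# Ten columns: the last group holds only the single leftover column
y2 <- matrix(1:30, nrow = 3)
sum_every(y2, 3)
```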
