R: Mapping unique columns of matching elements in 2 matrices

I am trying to map matching columns between 2 matrices. For simplicity, I have 2 simple matrices, a and b:
a <- matrix(c(1, 2), nrow = 2, ncol = 2)
b <- matrix(c(1,2,1,2,3:8), nrow = 2, ncol = 5)
> a
[,1] [,2]
[1,] 1 1
[2,] 2 2
> b
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 3 5 7
[2,] 2 2 4 6 8
I want to create a vector of length ncol(a) = 2, i.e.
> out
[1] 1 2
Where the first element of out is the column number in b that matches the first column of a, and the second element of out is the column number in b that matches the second column in a. I have tried
> match(data.frame(a), data.frame(b))
[1] 1 1
but I need each element of the resulting vector to be unique. Probably simple solution, but I am not seeing it. Thanks!

Maybe you are looking for something like intersect.
a <- matrix(c(10, 20), nrow = 2, ncol = 2)
b <- matrix(c(10,20,1,2,3:6,10,20), nrow = 2, ncol = 5)
#> b
# [,1] [,2] [,3] [,4] [,5]
#[1,] 10 1 3 5 10
#[2,] 20 2 4 6 20
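# Find the columns of b that match a. Only the 1st column of a is considered.
# a[, 1] is recycled down each column of b, so every element of b is compared with the element of a[, 1] in the same row.
matched <- b == a[, 1]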
#> matched
# [,1] [,2] [,3] [,4] [,5]
#[1,] TRUE FALSE FALSE FALSE TRUE
#[2,] TRUE FALSE FALSE FALSE TRUE
desired <- which(matched[1,], arr.ind = TRUE)
#> desired
#[1] 1 5
Columns 1 and 5 are returned as the matching columns.

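I guess I'm not allowed to comment on here. Anyhoo... the above answer by MKR looks good, but I would add this line before creating the "desired" object. It ensures that every element of a column matches (instead of testing the first row only). Note that matched then becomes a logical vector, so desired has to be computed as which(matched) rather than which(matched[1,]).
matched <- sapply(seq_len(ncol(matched)), function(x) all(matched[, x]))
desired <- which(matched)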
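Neither answer above directly produces the unique mapping the question asks for, so here is a minimal sketch of one way it could be done, assuming both matrices have the same number of rows and contain no NA values. The helper name match_unique_cols is hypothetical (it is not from the original posts); it assigns each column of a to the first matching column of b that has not been used yet.

```r
# Hypothetical helper: map each column of `a` to a distinct matching column of `b`.
# Assumes nrow(a) == nrow(b) and no NA values.
match_unique_cols <- function(a, b) {
  out  <- rep(NA_integer_, ncol(a))   # NA marks "no unused match found"
  used <- rep(FALSE, ncol(b))         # columns of b already assigned
  for (j in seq_len(ncol(a))) {
    # a[, j] is recycled down the columns of b; a column matches
    # when all of its rows are equal to a[, j]
    hits <- which(colSums(b == a[, j]) == nrow(b) & !used)
    if (length(hits) > 0) {
      out[j] <- hits[1]
      used[hits[1]] <- TRUE
    }
  }
  out
}

a <- matrix(c(1, 2), nrow = 2, ncol = 2)
b <- matrix(c(1, 2, 1, 2, 3:8), nrow = 2, ncol = 5)
out <- match_unique_cols(a, b)
out
# [1] 1 2
```

Taking the first unused match is a greedy choice; it is enough when matching columns are exact duplicates, as in the question.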

Related

row sum based on matrix with logicals

I have two matrices that look similar to this example:
> matrix(1:9, nrow = 3, ncol = 3)
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
> matrix(rexp(9), 3) < 1
[,1] [,2] [,3]
[1,] TRUE TRUE FALSE
[2,] FALSE TRUE FALSE
[3,] FALSE FALSE TRUE
I want to sum individual entries of a row, but only when the logical matrix of the same size is TRUE, else this row element should not be in the sum. All rows have at least one case where one element of matrix 2 is TRUE.
The result should look like this
[,1]
[1,] 5
[2,] 5
[3,] 9
Thanks for the help.

Multiplying your TRUE/FALSE matrix by the other one zeroes out all the elements where it is FALSE. You can then sum by row.
m1 <- matrix(1:9, nrow = 3, ncol = 3)
m2 <- matrix(rexp(9), 3) < 1
matrix(rowSums(m1 * m2), ncol = 1)
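Because rexp() is random, the snippet above gives a different result on every run. As a reproducible sketch, the logical matrix printed in the question can be typed in by hand (the names keep and row_totals are only illustrative):

```r
m1 <- matrix(1:9, nrow = 3, ncol = 3)

# The logical matrix from the question, entered column by column
keep <- matrix(c(TRUE,  FALSE, FALSE,
                 TRUE,  TRUE,  FALSE,
                 FALSE, FALSE, TRUE), nrow = 3)

# FALSE is treated as 0 and TRUE as 1 in the multiplication
row_totals <- matrix(rowSums(m1 * keep), ncol = 1)
row_totals
#      [,1]
# [1,]    5
# [2,]    5
# [3,]    9
```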

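We can replace the unwanted elements with NA and use rowSums with na.rm = TRUE. Note that m2 as defined in the data below is TRUE for the elements to exclude, which is the opposite of the mask in the question; with the question's mask, use !m2 instead.
matrix(rowSums(replace(m1, m2, NA), na.rm = TRUE))
# [,1]
#[1,] 12
#[2,] 5
#[3,] 9
Or use ifelse
matrix(rowSums(ifelse(m2, 0, m1)))
data (rexp is random, so the output above varies between runs)
m1 <- matrix(1:9, nrow = 3, ncol = 3)
m2 <- matrix(rexp(9), 3) >= 1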
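The answer above defines m2 with >= 1, so its mask marks the elements to drop. As a hedged sketch, the same two approaches can be written against the question's mask (TRUE means "include"), here hard-coded as keep for reproducibility; all variable names are illustrative. The last lines show one case where these approaches and plain multiplication may differ: an NA sitting in an excluded cell.

```r
m1 <- matrix(1:9, nrow = 3, ncol = 3)
keep <- matrix(c(TRUE,  FALSE, FALSE,
                 TRUE,  TRUE,  FALSE,
                 FALSE, FALSE, TRUE), nrow = 3)

# Blank out the excluded cells, then ignore them when summing
via_replace <- rowSums(replace(m1, !keep, NA), na.rm = TRUE)

# Substitute 0 for the excluded cells
via_ifelse <- rowSums(ifelse(keep, m1, 0))

# Put a missing value in an excluded cell (keep[2, 1] is FALSE)
m1_na <- m1
m1_na[2, 1] <- NA
via_mult_na   <- rowSums(m1_na * keep)           # NA * FALSE is NA, so row 2 becomes NA
via_ifelse_na <- rowSums(ifelse(keep, m1_na, 0)) # the NA is never selected
```

Here via_replace and via_ifelse both give 5 5 9, and via_ifelse_na still gives 5 5 9 while via_mult_na has NA in its second element.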

How to check if I have repeated elements in a matrix (checking rows and columns) in R?

M = matrix(data = c(1, 1, 3, 4, 4, 6, 7, 8, 9), nrow = 3, ncol = 3)
print (M)
## [,1] [,2] [,3]
## [1,] 1 4 7
## [2,] 1 4 8
## [3,] 3 6 9
I want to check which elements in row 1 are equal to the elements in row 2, or which elements in row 1 also appear in column 3, for example.
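No answer was captured for this question, so the following is only a sketch of how such checks might look in base R, using the matrix as printed above. The variable names are illustrative. == compares elements position by position, while %in% tests membership regardless of position.

```r
# The matrix as printed in the question
M <- matrix(c(1, 1, 3, 4, 4, 6, 7, 8, 9), nrow = 3, ncol = 3)

# Position-wise comparison of row 1 and row 2
same_position <- M[1, ] == M[2, ]
same_position
# [1]  TRUE  TRUE FALSE

# Which elements of row 1 occur anywhere in column 3?
in_col3 <- M[1, ] %in% M[, 3]
M[1, in_col3]
# [1] 7

# Does the matrix contain any repeated value at all?
has_repeats <- anyDuplicated(as.vector(M)) > 0
has_repeats
# [1] TRUE
```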

Preserve structure, when indexing a matrix with another matrix in R

Dear StackOverflowers,
I have an integer matrix in R and I would like to subset it so that I remove 1 specified cell in each column. So that, for instance, a 4x3 matrix becomes a 3x3 matrix. I have tried doing it by creating the second logical matrix of the same dimensions.
(subject.matrix <- matrix(1:12, nrow = 4))
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 6 10
[3,] 3 7 11
[4,] 4 8 12
(query.matrix <- matrix(c(T, T, F, T, T, F, T, T, T, T, T, F), nrow = 4))
[,1] [,2] [,3]
[1,] TRUE TRUE TRUE
[2,] TRUE FALSE TRUE
[3,] FALSE TRUE TRUE
[4,] TRUE TRUE FALSE
The problem is that, when I index the first matrix by the second one, it is simplified to an integer vector.
subject.matrix[query.matrix]
[1] 1 2 4 5 7 8 9 10 11
I've tried adding drop=F, but to no avail. I know, I can just wrap the resulting vector into a 3x3 matrix. So the expected outcome would be:
matrix(subject.matrix[query.matrix], nrow = 3)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
But I wonder if there's a more elegant/direct solution. I'm also not attached to using a logical matrix as the index, if that means a simpler solution. Perhaps, I could subset it with a vector of indices for the rows to be removed in each column, which in this case would translate into c(3, 2, 4).
Many thanks!
Edit based on @LyzandeR's suggestion: My final goal was to take column sums of the resulting matrix. So replacing the redundant values with NAs seems to be the best way to go.

I think the only way to preserve the matrix structure is a more general version of what you already do in your question, i.e.:
matrix(subject.matrix[query.matrix], ncol = ncol(subject.matrix))
You could even convert it into a function if you plan on using it multiple times:
subset.mat <- function(mat, index, cols = ncol(mat)) {
  matrix(mat[index], ncol = cols)
}
Output:
> subset.mat(subject.matrix, query.matrix)
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
Also (sorry just read your updated comment) you might consider using NAs in the matrix instead of subsetting them out, which will allow you to calculate the column sums as you say:
subject.matrix[!query.matrix] <- NA
subject.matrix
# [,1] [,2] [,3]
#[1,] 1 5 9
#[2,] 2 NA 10
#[3,] NA 7 11
#[4,] 4 8 NA
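The question also mentions a vector of row indices to remove, c(3, 2, 4), and column sums as the final goal. As a rough sketch under those assumptions (the names drop.rows, masked and reduced are illustrative, not from the original posts), a two-column index matrix can set exactly one cell per column to NA, and sapply can build the reduced 3x3 matrix column by column:

```r
subject.matrix <- matrix(1:12, nrow = 4)
drop.rows <- c(3, 2, 4)   # row to remove in columns 1, 2 and 3

# Option 1: mark the unwanted cells as NA via (row, column) index pairs
masked <- subject.matrix
masked[cbind(drop.rows, seq_len(ncol(masked)))] <- NA
col_totals <- colSums(masked, na.rm = TRUE)
col_totals
# [1]  7 20 30

# Option 2: drop one row per column and rebuild a 3 x 3 matrix
reduced <- sapply(seq_len(ncol(subject.matrix)),
                  function(j) subject.matrix[-drop.rows[j], j])
reduced
#      [,1] [,2] [,3]
# [1,]    1    5    9
# [2,]    2    7   10
# [3,]    4    8   11
```

Option 2 only returns a matrix when the same number of cells is removed from every column.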

This is a little brute-forceish, but I think you'll be able to extrapolate it into something more general:
new.matrix = matrix(ncol = ncol(subject.matrix), nrow = nrow(subject.matrix) - 1)
for (i in seq_len(ncol(subject.matrix))) {
  # keep only the rows of column i where query.matrix is TRUE
  new.matrix[, i] = subject.matrix[query.matrix[, i], i]
}
new.matrix
[,1] [,2] [,3]
[1,] 1 5 9
[2,] 2 7 10
[3,] 4 8 11
Essentially, I just initialized an empty matrix, and then iterated through each column of subject.matrix taking only the TRUE values for query.matrix.

Selecting rows conditional on column value from multiple matrices

I have
mat1 = matrix(c(2, 4, 3, 6, 7, 8), nrow=2, ncol=3)
mat2 = matrix(c(5, 6, 7, 1, 2, 3), nrow=2, ncol=3)
mat3 = matrix(c(8, 5, 8, 6, 7, 9), nrow=2, ncol=3)
which gives me 3 matrices:
[,1] [,2] [,3]
[1,] 2 3 7
[2,] 4 6 8
[,1] [,2] [,3]
[1,] 5 7 2
[2,] 6 1 3
[,1] [,2] [,3]
[1,] 8 8 7
[2,] 5 6 9
What I would like to do is compare the three matrices per row per first column, and select the row of the matrix that has the highest value on the first column.
For example: in row 1 column 1, matrix3 has the highest value (8) compared to matrix1 (2) and matrix2 (5). In row 2 column 1, matrix2 has the highest value (6). I would like to create a new matrix that copies the row of the matrix that has that highest value, resulting in:
[,1] [,2] [,3]
[1,] 8 8 7 <- From mat3
[2,] 6 1 3 <- From mat2
I know how to get a vector with the highest values from column 1, but I cannot get the whole row of the matrix copied into a new matrix. I have:
mat <- (mat1[1,])
which just copies the first row of the first matrix
[1] 2 3 7
I can select which number is the maximum number:
max(mat1[,1],mat2[,1],mat3[,1])
[1] 8
But I cannot seem to combine the two to return a matrix with the whole row.
Getting the code to loop for each row will be no problem, but I cannot seem to get it to work for the first row and as such, I am missing the essential code. Any help would be greatly appreciated. Thank you.

Are you working interactively? Do you manipulate multiple matrices spread in your workspace? A straightforward answer to your problem could be:
#which matrices have the largest element of column 1 in each row?
max.col(cbind(mat1[, 1], mat2[, 1], mat3[, 1]))
#[1] 3 2
rbind(mat3[1, ], mat2[2, ]) #use the above information to get your matrix
# [,1] [,2] [,3]
#[1,] 8 8 7
#[2,] 6 1 3
For a more general use case, one way could be:
mat_ls = list(mat1, mat2, mat3) #put your matrices in a "list"
which_col = 1 #compare column 1
which_mats = max.col(do.call(cbind, lapply(mat_ls, function(x) x[, which_col])))
do.call(rbind, lapply(seq_along(which_mats),
                      function(i) mat_ls[[which_mats[i]]][i, ]))
# [,1] [,2] [,3]
#[1,] 8 8 7
#[2,] 6 1 3
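The general approach above could be wrapped in a small function. This is only a sketch: the name pick_rows_by_max is hypothetical, and ties.method = "first" is an added assumption, because max.col() breaks ties at random by default, which would make the result non-reproducible when two matrices share the same maximum.

```r
# Hypothetical wrapper around the approach above.
# Assumes all matrices in mat_ls have the same dimensions.
pick_rows_by_max <- function(mat_ls, which_col = 1) {
  # one column per matrix, one row per matrix row
  keys <- do.call(cbind, lapply(mat_ls, function(x) x[, which_col]))
  # index of the matrix holding the largest key in each row
  winners <- max.col(keys, ties.method = "first")
  do.call(rbind, lapply(seq_along(winners),
                        function(i) mat_ls[[winners[i]]][i, ]))
}

mat1 <- matrix(c(2, 4, 3, 6, 7, 8), nrow = 2, ncol = 3)
mat2 <- matrix(c(5, 6, 7, 1, 2, 3), nrow = 2, ncol = 3)
mat3 <- matrix(c(8, 5, 8, 6, 7, 9), nrow = 2, ncol = 3)

picked <- pick_rows_by_max(list(mat1, mat2, mat3))
picked
#      [,1] [,2] [,3]
# [1,]    8    8    7
# [2,]    6    1    3
```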

Probably not the prettiest solution (it relies on each matrix having exactly two rows):
temp <- rbind(mat1, mat2, mat3)
rbind(temp[c(TRUE, FALSE), ][which.max(temp[c(TRUE, FALSE), ][, 1]), ],
      temp[c(FALSE, TRUE), ][which.max(temp[c(FALSE, TRUE), ][, 1]), ])
## [,1] [,2] [,3]
## [1,] 8 8 7
## [2,] 6 1 3

You may also try:
a2 <- aperm(simplify2array(mget(ls(pattern = "^mat[0-9]+$"))), c(3, 2, 1)) # gets all matrices named mat1, mat2, ... (a plain "mat" pattern would also pick up other objects such as `mat`)
t(sapply(1:(dim(a2)[3]), function(i) {x1 <- a2[,,i]; x1[which.max(x1[,1]),]}))
# [,1] [,2] [,3]
#[1,] 8 8 7
#[2,] 6 1 3

Extracting a row from a data frame in R

Let's say we have a matrix something like this:
> A = matrix(
+ c(2, 4, 3, 1, 5, 7), # the data elements
+ nrow=2, # number of rows
+ ncol=3, # number of columns
+ byrow = TRUE) # fill matrix by rows
> A # print the matrix
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 5 7
Now, I just used this small example, but imagine the matrix was much bigger, like 200 rows and 5 columns. What I want to do is get the minimum value from column 3 and extract that row. In other words, find and return the row whose 3rd attribute is the lowest in that entire column.
dataToReturn <- which(A== min(A[, 3])
but this doesn't work.

Another way is to use which.min
A[which.min(A[, 3]), ]
##[1] 2 4 3

You can do this with a simple subsetting via [] and min:
A[A[,3] == min(A[,3]),]
[1] 2 4 3
This reads: Return those row(s) of A where the value of column 3 equals the minimum of column 3 of A.
If you have a matrix like this:
A <- matrix(c(2, 4, 3, 1, 5, 7, 1, 3, 3), nrow = 3, byrow = TRUE)
> A
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 5 7
[3,] 1 3 3
> A[which.min(A[, 3]), ] #returns only the first row with minimum condition
[1] 2 4 3
> A[A[,3] == min(A[,3]),] #returns all rows with minimum condition
[,1] [,2] [,3]
[1,] 2 4 3
[2,] 1 3 3
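One detail the outputs above hint at: when only a single row is selected, R drops the result to a plain vector, while several matching rows come back as a matrix. If downstream code expects a matrix either way, adding drop = FALSE to the subsetting is one option. A small sketch, with illustrative variable names:

```r
A <- matrix(c(2, 4, 3, 1, 5, 7, 1, 3, 3), nrow = 3, byrow = TRUE)

# First row with the minimum in column 3, kept as a 1 x 3 matrix
first_min <- A[which.min(A[, 3]), , drop = FALSE]
dim(first_min)
# [1] 1 3

# All rows with the minimum in column 3, always a matrix
all_min <- A[A[, 3] == min(A[, 3]), , drop = FALSE]
dim(all_min)
# [1] 2 3
```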