subset matrix by more than one thing fast - r

I have a matrix I would like to subset quickly using two criteria: 1) the colnames of a second matrix match the rownames of the first, and 2) the value in that second matrix is FALSE.
m
[,1]
A 1
B 2
C 3
D 4
E 5
tf
E B A
[1,] FALSE FALSE TRUE
the output should be
m2
[,1]
E 5
B 2

Because 'tf' has only a single row, subsetting it with the negated logical index returns a named vector, since subsetting ([) uses drop=TRUE by default. Extract the names from that vector and use them as the row index to subset 'm'. Here we pass drop=FALSE because 'm' has only a single column and the result would otherwise be simplified to a plain vector.
m[names(tf[,!tf]), , drop=FALSE]
# [,1]
#E 5
#B 2
data
m <- matrix(1:5, nrow=5, 1, dimnames=list(LETTERS[1:5], NULL))
tf <- matrix(c(FALSE, FALSE, TRUE), ncol=3, dimnames=list(NULL, c("E", "B", "A")))
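The one-liner above can be unpacked into two steps. This is only a sketch of the same idea, assuming 'tf' always has exactly one row; the name keep is introduced here for illustration.

```r
m  <- matrix(1:5, nrow = 5, ncol = 1, dimnames = list(LETTERS[1:5], NULL))
tf <- matrix(c(FALSE, FALSE, TRUE), ncol = 3,
             dimnames = list(NULL, c("E", "B", "A")))

# Column names of tf whose entry in the (only) row is FALSE
keep <- colnames(tf)[!tf[1, ]]
keep
# [1] "E" "B"

# Index m by row name; drop = FALSE keeps the one-column result a matrix
m2 <- m[keep, , drop = FALSE]
m2
#   [,1]
# E    5
# B    2
```

Indexing by row name also takes care of the ordering: the rows come back in the order of the names in keep, not in the order they appear in m.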

Related

Matrix Indexes that fulfill Condition

I have a matrix, that looks like this
myMatrix <- matrix(data = TRUE, nrow = 3, ncol = 3)
myMatrix[as.matrix(expand.grid(1:2, 1:2))] <- FALSE
myMatrix
[,1] [,2] [,3]
[1,] FALSE FALSE TRUE
[2,] FALSE FALSE TRUE
[3,] TRUE TRUE TRUE
and I would like to get a dataframe or a matrix that lists all row and column indexes where myMatrix is TRUE:
column row
1 3 1
2 3 2
3 1 3
4 2 3
5 3 3
How can I do this?
We can either use which
which(myMatrix, arr.ind = TRUE)
Or with arrayInd and specify the .dim
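No code is shown for the arrayInd variant, so here is a minimal sketch of what it might look like. It assumes the linear indices come from which(); the column labels are added here only for readability.

```r
myMatrix <- matrix(data = TRUE, nrow = 3, ncol = 3)
myMatrix[as.matrix(expand.grid(1:2, 1:2))] <- FALSE

# which() returns the linear (column-major) positions of the TRUE cells;
# arrayInd() converts them to row/column pairs given the matrix dimensions
idx <- arrayInd(which(myMatrix), .dim = dim(myMatrix))
colnames(idx) <- c("row", "col")
idx
#      row col
# [1,]   3   1
# [2,]   3   2
# [3,]   1   3
# [4,]   2   3
# [5,]   3   3
```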
We could use row and col to get row and column index of each element in matrix and subset the TRUE values using logical values of myMatrix.
data.frame(column = col(myMatrix)[myMatrix], row = row(myMatrix)[myMatrix])

Extract different col in every row of an R data.frame [duplicate]

I have a vector of colnames that is as long as the number of rows in a data frame:
> x <- data.frame(a=c(1,2,3), b=c(3,2,1), c=c(5,6,4))
> cols <- c("c", "a", "b")
> x
a b c
1 1 3 5
2 2 2 6
3 3 1 4
Now I want to extract from x the column cols[i] for each row i of x, that is 5, 2, 1 in this case. I have tried to create a matrix with T and F depending on the match:
> A <- matrix(rep(colnames(x),nrow(x)), nrow=nrow(x), ncol=ncol(x), byrow=TRUE) == cols
> A
[,1] [,2] [,3]
[1,] FALSE FALSE TRUE
[2,] TRUE FALSE FALSE
[3,] FALSE TRUE FALSE
This looks correct, but when I use it as an index, the values come back in column order rather than row order:
> x[A]
[1] 2 1 5
Does someone know of the proper way to solve this indexing problem?
x <- data.frame(a=c(1,2,3), b=c(3,2,1), c=c(5,6,4))
cols <- c("c", "a", "b")
sapply(1:length(cols),function(i){x[i,cols[i]]})
[1] 5 2 1
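As an alternative to looping, a two-column (row, column) index matrix can pick one cell per row in a single step. This is a sketch, assuming all columns of x are numeric so that as.matrix() does not change their type; idx and picked are names introduced here.

```r
x <- data.frame(a = c(1, 2, 3), b = c(3, 2, 1), c = c(5, 6, 4))
cols <- c("c", "a", "b")

# One (row, column) pair per row: row i paired with the position of cols[i]
idx <- cbind(seq_len(nrow(x)), match(cols, names(x)))

# Matrix indexing returns the cells in the order of the rows of idx
picked <- as.matrix(x)[idx]
picked
# [1] 5 2 1
```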

Split matrix to a list of matrix by vector

I am trying to split my matrix into a list by the unique values in a vector. The vector has as many values as the matrix has rows.
Here is an example:
#matrix
b <- cbind(c(2,2,1,0), c(2,2,1,5), c(2,2,5,6))
#vector
a <- c(5,5,4,1)
#??
#my outcome should look like
v <- list(cbind(c(2,2), c(2,2), c(2,2)), c(1,1,5), c(0,5,6))
So basically, I want to split my matrix by rows into multiple matrices, one per unique value in the vector. More specifically, my vector is sorted from highest to lowest value and I need to keep that order in the list. As you can see in the example, v[[1]] is the matrix for unique(a)[1], and so on.
lapply(split(seq_along(a), a), #split indices by a
function(m, ind) m[ind,], m = b)[order(unique(a))]
#$`5`
# [,1] [,2] [,3]
#[1,] 2 2 2
#[2,] 2 2 2
#
#$`4`
#[1] 1 1 5
#
#$`1`
#[1] 0 5 6
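A variant that builds the list directly in the order of unique(a) might look like the sketch below. It does not rely on a being sorted; the function argument val is a name introduced here.

```r
b <- cbind(c(2, 2, 1, 0), c(2, 2, 1, 5), c(2, 2, 5, 6))
a <- c(5, 5, 4, 1)

# One list element per unique value of a, in order of first appearance.
# Single-row selections are simplified to vectors, as in the desired output.
v <- lapply(unique(a), function(val) b[a == val, ])
names(v) <- unique(a)
v
```

Adding drop = FALSE inside the subsetting call would keep every element a matrix, if that is preferred over the mixed matrix/vector result.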

Named rows and cols for matrices in R

Is it possible to have named rows and columns in Matrices?
for example:
[,a] [,b]
[a,] 1 , 2
[b,] 3 , 4
Is it even reasonable to have such a thing for exploring the data?
Sure. Use dimnames:
> a <- matrix(1:4, nrow = 2)
> a
[,1] [,2]
[1,] 1 3
[2,] 2 4
> dimnames(a) <- list(c("A", "B"), c("AA", "BB"))
> a
AA BB
A 1 3
B 2 4
With dimnames, you can provide a list of (first) rownames and (second) colnames for your matrix. Alternatively, you can specify rownames(x) <- whatever and colnames(x) <- whatever.
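For illustration, the rownames/colnames route and name-based indexing could look like this (a small sketch using the same matrix as above):

```r
a <- matrix(1:4, nrow = 2)
rownames(a) <- c("A", "B")
colnames(a) <- c("AA", "BB")

# Once named, rows and columns can be selected by name
a["B", "AA"]
# [1] 2
a[, "BB"]
# A B
# 3 4
```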

How to create a factor from a binary indicator matrix?

Say I have the following matrix mat, which is a binary indicator matrix for the levels A, B, and C for a set of 5 observations:
mat <- matrix(c(1,0,0,
1,0,0,
0,1,0,
0,1,0,
0,0,1), ncol = 3, byrow = TRUE)
colnames(mat) <- LETTERS[1:3]
> mat
A B C
[1,] 1 0 0
[2,] 1 0 0
[3,] 0 1 0
[4,] 0 1 0
[5,] 0 0 1
I want to convert that into a single factor such that the output is equivalent to fac, defined as:
> fac <- factor(rep(LETTERS[1:3], times = c(2,2,1)))
> fac
[1] A A B B C
Levels: A B C
Extra points if you get the labels from the colnames of mat, but a set of numeric codes (e.g. c(1,1,2,2,3)) would also be acceptable as desired output.
Elegant solution with matrix multiplication (and shortest up to now):
as.factor(colnames(mat)[mat %*% 1:ncol(mat)])
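To see why the multiplication works, it may help to look at the intermediate result. This is a sketch that assumes every row contains exactly one 1; codes is a name introduced here.

```r
mat <- matrix(c(1,0,0,
                1,0,0,
                0,1,0,
                0,1,0,
                0,0,1), ncol = 3, byrow = TRUE)
colnames(mat) <- LETTERS[1:3]

# Each row has a single 1, so row %*% c(1, 2, 3) is the column number of that 1
codes <- as.vector(mat %*% seq_len(ncol(mat)))
codes
# [1] 1 1 2 2 3

# Setting levels explicitly keeps unused levels and their order
fac <- factor(colnames(mat)[codes], levels = colnames(mat))
fac
# [1] A A B B C
# Levels: A B C
```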
Another solution makes use of the arr.ind=TRUE argument of which, which returns the matching positions as array locations (row and column). The column positions are then used to index the colnames:
> factor(colnames(mat)[which(mat==1, arr.ind=TRUE)[, 2]])
[1] A A B B C
Levels: A B C
Decomposing into steps:
> which(mat==1, arr.ind=TRUE)
row col
[1,] 1 1
[2,] 2 1
[3,] 3 2
[4,] 4 2
[5,] 5 3
Use the values of the second column, i.e. which(...)[, 2] and index colnames:
> colnames(mat)[c(1, 1, 2, 2, 3)]
[1] "A" "A" "B" "B" "C"
Then convert the result to a factor.
One way is to repeat each column name once per row, index that vector directly with the matrix (converted to logical), and then wrap the result in factor to restore the levels:
factor(rep(colnames(mat), each = nrow(mat))[as.logical(mat)])
[1] A A B B C
Levels: A B C
If this is from model.matrix, the colnames have fac prepended, and so this should work the same but removing the extra text:
factor(gsub("^fac", "", rep(colnames(mat), each = nrow(mat))[as.logical(mat)]))
You could use something like this:
lvls<-apply(mat, 1, function(currow){match(1, currow)})
fac<-factor(lvls, 1:3, labels=colnames(mat))
Here is another one
factor(rep(colnames(mat), colSums(mat)))
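One caveat worth checking: the approaches that walk the matrix column by column (rep with colSums, as.logical indexing, which with arr.ind) return labels in column order, which matches the row order only when the rows are already grouped by level, as in the example. The sketch below uses a hypothetical matrix mat2 with ungrouped rows to show the difference, and uses max.col as one possible row-wise alternative; it assumes exactly one 1 per row.

```r
# Hypothetical indicator matrix whose rows are NOT grouped by level
mat2 <- matrix(c(1,0,0,
                 0,0,1,
                 0,1,0,
                 1,0,0), ncol = 3, byrow = TRUE)
colnames(mat2) <- LETTERS[1:3]

# Column-wise expansion loses the original row order
by_column <- factor(rep(colnames(mat2), colSums(mat2)))
as.character(by_column)
# [1] "A" "A" "B" "C"

# max.col() gives, for each row, the column holding the row maximum (the 1)
by_row <- factor(colnames(mat2)[max.col(mat2, ties.method = "first")],
                 levels = colnames(mat2))
as.character(by_row)
# [1] "A" "C" "B" "A"
```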
