Split matrix to a list of matrix by vector - r

I am trying to split my matrix into a list by the unique values in a vector. The vector has one value per row of the matrix, that is, as many values as there are entries in each column.
Here is an example:
#matrix
b <- cbind(c(2,2,1,0), c(2,2,1,5), c(2,2,5,6))
#vector
a <- c(5,5,4,1)
#??
#my outcome should look like
v <- list(cbind(c(2,2), c(2,2), c(2,2)), c(1,1,5), c(0,5,6))
So basically, I want to split my matrix by rows into multiple matrices according to the unique values in a vector. More specifically, my vector is sorted from highest value to lowest value, and I need to keep that order in the list. As you can see in the example, v[[1]] is the matrix for unique(a)[1], and so on.

lapply(split(seq_along(a), a), #split indices by a
function(m, ind) m[ind,], m = b)[order(unique(a))]
#$`5`
# [,1] [,2] [,3]
#[1,] 2 2 2
#[2,] 2 2 2
#
#$`4`
#[1] 1 1 5
#
#$`1`
#[1] 0 5 6
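
The reordering with order(unique(a)) relies on a being sorted from highest to lowest, as stated in the question. A sketch of a variant that does not depend on that, assuming the groups should appear in order of first occurrence in a: build a factor whose levels follow unique(a). The drop = FALSE argument is an optional addition that keeps one-row groups as matrices; leave it out to get plain vectors as in the expected outcome.

```r
b <- cbind(c(2, 2, 1, 0), c(2, 2, 1, 5), c(2, 2, 5, 6))
a <- c(5, 5, 4, 1)

# Split the row indices by a factor whose levels follow first appearance in `a`
groups <- split(seq_along(a), factor(a, levels = unique(a)))

# drop = FALSE keeps single-row groups as 1 x ncol matrices
v <- lapply(groups, function(ind) b[ind, , drop = FALSE])

names(v)
```

The list names are then in the same order as unique(a), whatever the ordering of a is.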

Related

Select one data point per row using indexing vector with negative values

Suppose you have a matrix a
a <- matrix(1:9, 3, 3)
a
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
and a vector b indicating which element of each row you want to extract. That is, the vector b indicates the column of the element, for instance:
b <- c(1, 3, 1)
If we want to extract the indicated data points, we can simply index each desired element like this:
a[cbind(1:nrow(a),b)]
[1] 1 8 3
I would like to do it with a negative index vector. That is, R should return a matrix where exactly one element per row is omitted (in this case, a 3x2 matrix). If I try it in a naive approach, R throws an error:
c = -b
a[cbind(1:nrow(a),c)]
Error in a[cbind(1:nrow(a), c)] :
negative values are not allowed in a matrix subscript
Thank you!
Not pretty, but you could do
b <- c(1, 3, 1) + 3 * 0:2
matrix(c(t(a))[-b], 3, 2, byrow = TRUE)
Maybe this is another naive approach: loop over every row of the matrix and remove the column index specified in b for that row.
t(sapply(seq_len(nrow(a)), function(x) a[x, -b[x]]))
# [,1] [,2]
#[1,] 4 7
#[2,] 2 5
#[3,] 6 9
Or using mapply with split
t(mapply(`[`, split(a, seq_len(nrow(a))), -b))
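
Another possibility, sketched here under the assumption that exactly one column is dropped per row: build a logical mask with col() and pick the remaining elements in row-major order. The names keep and res are just illustrative.

```r
a <- matrix(1:9, 3, 3)
b <- c(1, 3, 1)

# keep[i, j] is TRUE unless column j is the one to drop from row i;
# b recycles down the rows because length(b) == nrow(a)
keep <- col(a) != b

# Transpose both so that the kept elements are read row by row
res <- matrix(t(a)[t(keep)], nrow = nrow(a), byrow = TRUE)
res
```

This avoids an explicit loop over rows, at the cost of creating a mask the same size as a.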

Delete specific values in a matrix according to two position vectors

My aim is to delete specific positions in a matrix according to a vector. Just giving you a small example.
Users_pos <- c(1,2)
Items_pos <- c(3,2)
Given a Matrix A:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
My aim, according to the two vectors Users_pos and Items_pos, is to delete the following values
A[1,3] and A[3,2]
I'm wondering if there's a possibility to do so without typing in the values for rows and columns by hand.
You can index k elements in a matrix A using A[X], where X is a k-row, 2-column matrix where each row is the (row, col) value of the indicated element. Therefore, you can index your two elements in A with the following indexing matrix:
rbind(Users_pos, Items_pos)
# [,1] [,2]
# Users_pos 1 2
# Items_pos 3 2
Using this indexing, you could choose to extract the information currently stored with A[X] or replace those elements with A[X] <- new.values. If you, for instance, wanted to replace these elements with NA, you could do:
A[rbind(Users_pos, Items_pos)] <- NA
A
# [,1] [,2] [,3]
# [1,] 1 NA 3
# [2,] 4 5 6
# [3,] 7 NA 9
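
Note that rbind treats each vector as one (row, col) pair. The question is ambiguous on this point, so here is a sketch of the other reading, assuming Users_pos holds the row indices and Items_pos the matching column indices: pair them element-wise with cbind instead.

```r
A <- matrix(1:9, 3, 3, byrow = TRUE)
Users_pos <- c(1, 2)  # assumed to be row indices
Items_pos <- c(3, 2)  # assumed to be column indices

# cbind pairs the vectors element-wise: the index rows are (1, 3) and (2, 2)
idx <- cbind(Users_pos, Items_pos)
A[idx] <- NA
A
```

Under this reading the elements set to NA are A[1, 3] and A[2, 2].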

Comparing two vectors in a different order and obtaining position of matches

I have a matrix where the colnames are sample names and I have created a vector of the colnames.
I also have a vector of sample names I need to subset from the matrix which I have found are not in the same order as the colnames of the matrix.
To subset the matrix I need to find which columns in the matrix correspond to the samples I need.
To illustrate this:
colnames <- c("A","B","C","D","E","F","G","H","I")
sample_names<- c("B","D","I")
I need a way to get R to return the position information such that for the example sample names "B","D","I", the colnames position is: [1] 2 4 9
Sample data:
> m=matrix(rep(1:4,3),ncol=4)
> colnames(m)<-c("A","C","D","B")
> m
A C D B
[1,] 1 4 3 2
[2,] 2 1 4 3
[3,] 3 2 1 4
> vec<-c("A","B")
> vec
[1] "A" "B"
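To answer your exact question, combine which with %in%; which returns the indices of the TRUE values in a logical vector. (Comparing with == would recycle vec along the column names and match only by coincidence.)
> which(colnames(m) %in% vec)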
[1] 1 4
But as your goal seems to be subsetting the matrix, just use directly the sample names vector to get it like this:
> m[, vec]
A B
[1,] 1 2
[2,] 2 3
[3,] 3 4
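
If the positions are needed in the order of the sample vector rather than in column order, match() is an option. A small sketch using the names from the question, with col_names as an illustrative variable name chosen to avoid masking the colnames() function:

```r
col_names <- c("A", "B", "C", "D", "E", "F", "G", "H", "I")
sample_names <- c("I", "B", "D")

# Positions in the order given by sample_names
pos_match <- match(sample_names, col_names)

# Positions in the order of the columns
pos_which <- which(col_names %in% sample_names)

pos_match
pos_which
```

match() returns NA for a sample name that is not among the column names, which can be a useful check before subsetting.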

Use row and columns indices in matrix to extract values from matrix

I have a Matrix A which looks like
A = matrix(1:9,3,3)
A
[,1] [,2] [,3]
[1,] 1 4 7
[2,] 2 5 8
[3,] 3 6 9
and a matrix of indices of elements I am interested in. Column 1 contains row indices, and column 2 contains column indices:
v = matrix(c(1, 3, 2, 2, 2, 3), nrow = 3, ncol = 2)
v
[,1] [,2]
[1,] 1 2
[2,] 3 2
[3,] 2 3
I want to use the row and column indices in 'v' to extract numbers from 'A'; the indices correspond to the numbers 4 (A[1, 2]), 6 (A[3, 2]) and 8 (A[2, 3]).
How can I extract those numbers directly from 'A' without using a loop?
When I use
A[v[ , 1], v[ , 2]]
I get
[,1] [,2] [,3]
[1,] 4 4 7
[2,] 6 6 9
[3,] 5 5 8
because R takes all combinations of the first and second column of 'v'.
What I want is an expression which gives me directly 4, 6, 8.
I could just take the diagonal elements but there must be an easier way.
From ?"[", you will find the following:
When indexing arrays by [ a single argument i can be a matrix with as many columns as there are dimensions of x; the result is then a vector with elements corresponding to the sets of indices in each row of i.
and later on...
A third form of indexing is via a numeric matrix with the one column for each dimension: each row of the index matrix then selects a single element of the array, and the result is a vector. Negative indices are not allowed in the index matrix. NA and zero values are allowed: rows of an index matrix containing a zero are ignored, whereas rows containing an NA produce an NA in the result.
Thus, what you are looking for is simply:
A[v]
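
For completeness, a short sketch with the objects from the question, comparing matrix indexing with the diagonal workaround mentioned there (picked and via_diag are illustrative names):

```r
A <- matrix(1:9, 3, 3)
v <- matrix(c(1, 3, 2, 2, 2, 3), nrow = 3, ncol = 2)

# Each row of v is one (row, col) pair, so this returns one element per row of v
picked <- A[v]

# Same values, but this builds the full 3 x 3 cross product first
via_diag <- diag(A[v[, 1], v[, 2]])

picked
```

Both give the same three values; the matrix-index form simply avoids computing the combinations that are then thrown away.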

efficiently update matrix element with a matrix of indices

I have a matrix of indices I where some of the indices are repeated. I put an example below.
I have another matrix A with dimensions compatible with the indices and initiated to 0 everywhere. I would like to do something like
A[I] += 1
I face two issues:
A[I] = A[I] + 1 is too inefficient
matrix I has redundant indices. For example rows 2 & 6 are identical and I would like to obtain A[1,2] = 2
A partial answer would be to create a 3 columns matrix with the two first columns being the product of unique(I) and the third column with the counts, but I don't see any solution for that either. Any pointer or help would be greatly appreciated!
> I is:
[,1] [,2]
[1,] 1 1
[2,] 1 2
[3,] 1 3
[4,] 1 4
[5,] 1 1
[6,] 1 2
[7,] 1 3
This may be quickest using sparse matrix methods (see the Matrix package and others).
Using standard matrices, you could collapse the identical rows using the xtabs function and then use matrix assignment (edited based on comment):
I <- cbind(1, c(1:4,1:3))
tmp <- as.data.frame(xtabs( ~I[,1]+I[,2] ))
A <- matrix(0, nrow=5, ncol=5)
tmp2 <- as.matrix(tmp[,1:2])
tmp3 <- as.numeric(tmp2)
dim(tmp3) <- dim(tmp2)
A[ tmp3 ] <- tmp[,3]
A
You could probably make it a little quicker by pulling the core functionality out of as.data.frame.table rather than converting to data frame and back again.
Here is another version that may be more efficient. It will overwrite some 0's with other 0's computed by xtabs:
I <- cbind(1:5,1:5)
A <- matrix(0, 5, 5)
tmp <- xtabs( ~I[,1]+I[,2] )  # rows of tmp follow I[,1] (row index), columns follow I[,2] (column index)
A[ as.numeric(rownames(tmp)), as.numeric(colnames(tmp)) ] <- c(tmp)
A
If the A matrix has dimnames and the I matrix holds the names instead of the indexes, then this latter version will also work (just remove the as.numeric calls).
Here you go:
## Reproducible versions of your A and I objects
A <- matrix(0, nrow=2, ncol=5)
## For computations that follow, you'll be better off having this as a data.frame
## (Just use `I <- as.data.frame(I)` to convert a matrix object I).
I <- read.table(text=" 1 1
1 2
1 3
1 4
1 1
1 2
1 3", header=FALSE)
## Create data.frame with number of times each matrix element should
## be incremented
I$count <- ave(I[,1], I[,1], I[,2], FUN=length)
I <- unique(I)
## Replace desired elements, using a two column matrix (the "third form of
## indexing" mentioned in the "Matrices and arrays" section of ?"[").
A[as.matrix(I[1:2])] <- I[[3]]
A
# [,1] [,2] [,3] [,4] [,5]
# [1,] 2 2 2 1 0
# [2,] 0 0 0 0 0
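
One more base-R possibility, given as a sketch rather than a benchmarked solution: convert the (row, col) pairs to column-major linear indices and count them with tabulate(). The name lin is illustrative, and the approach assumes I holds positive integer indices that fit inside A.

```r
A <- matrix(0, nrow = 2, ncol = 5)
I <- cbind(1, c(1:4, 1:3))  # (row, col) pairs with repeats, as in the question

# Column-major linear index of each (row, col) pair
lin <- (I[, 2] - 1) * nrow(A) + I[, 1]

# Count how often each cell is hit and add all counts in one vectorised step
A[] <- A + tabulate(lin, nbins = length(A))
A
```

Because the counts are added to the existing values, this also works when A does not start at zero.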
