R: List of indices, including empty ones, to binary matrix

Say I have a list of indices, including elements that are empty, like:
l <- list(c(1,2,3), c(1), c(1,5), NULL, c(2, 3, 5))
Which specify the non-zero elements in a matrix, like:
(m <- matrix(c(1,1,1,0,0, 1,0,0,0,0, 1,0,0,0,5, 0,0,0,0,0, 0,1,1,0,1), nrow=5, byrow=TRUE))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 0 0
[2,] 1 0 0 0 0
[3,] 1 0 0 0 5
[4,] 0 0 0 0 0
[5,] 0 1 1 0 1
What is the fastest way, using R, to make m from l, given that the matrix is very big, say 50,000 rows and 2,000 columns?

You can Filter the non-NULL list elements from 'l' and use melt to reshape the result into a 'data.frame' with 'key/value' columns, which here serve as the row/column index columns.
library(reshape2)
d2 <- melt(Filter(Negate(is.null), setNames(l, seq_along(l))))
Un1 <- unlist(l)
m1 <- matrix(0, nrow=length(l), ncol=max(Un1))
m1[cbind(as.numeric(d2[,2]), d2[,1])] <- 1
m1
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 1 1 0 0
#[2,] 1 0 0 0 0
#[3,] 1 0 0 0 1
#[4,] 0 0 0 0 0
#[5,] 0 1 1 0 1
Or
library(Matrix)
as.matrix(sparseMatrix(as.numeric(d2[,2]), d2[,1], x=1))
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 1 1 0 0
#[2,] 1 0 0 0 0
#[3,] 1 0 0 0 1
#[4,] 0 0 0 0 0
#[5,] 0 1 1 0 1
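The reshape2 step is not strictly required. As a minimal sketch, assuming the same `l` as in the question, the row indices can be built with `rep()` and `lengths()`, since a NULL element has length 0 and therefore contributes no rows. Passing `dims` explicitly is an addition of mine, not part of the answer above: it keeps empty rows at the end of the list, which would otherwise be lost when the dimensions are inferred from the largest index. The names `i`, `j`, `sm` and `m2` are only illustrative.

```r
library(Matrix)

l <- list(c(1, 2, 3), c(1), c(1, 5), NULL, c(2, 3, 5))

# Row index: position k of the list is repeated once per index it holds.
# NULL elements have length 0, so they add nothing here.
i <- rep(seq_along(l), lengths(l))
j <- unlist(l)

# dims is given explicitly so that empty trailing rows are not dropped.
sm <- sparseMatrix(i = i, j = j, x = 1, dims = c(length(l), max(j)))

# Only convert to a dense matrix if it is really needed.
m2 <- as.matrix(sm)
```

For the 50,000 x 2,000 case it may be worth staying with the sparse object `sm`, because the dense equivalent stored as doubles takes roughly 800 MB.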

You can do:
do.call(rbind, lapply(l, function(x) (1:max(unlist(l)) %in% x)+0L))
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 1 1 0 0
#[2,] 1 0 0 0 0
#[3,] 1 0 0 0 1
#[4,] 0 0 0 0 0
#[5,] 0 1 1 0 1
That said, akrun's solution should be faster.
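A base R variant of the same idea, offered as a sketch rather than a benchmarked recommendation: build the (row, column) pairs once and fill a zero matrix through two-column matrix indexing. It assumes the `l` from the question; `rows`, `cols` and `m3` are names I made up for the example.

```r
l <- list(c(1, 2, 3), c(1), c(1, 5), NULL, c(2, 3, 5))

# One row index per stored column index; NULL elements contribute nothing.
rows <- rep(seq_along(l), lengths(l))
cols <- unlist(l)

# Start from an all-zero integer matrix and set the listed cells to 1.
m3 <- matrix(0L, nrow = length(l), ncol = max(cols))
m3[cbind(rows, cols)] <- 1L
m3
```

This avoids both the per-row `%in%` test and the `rbind` of many small vectors, at the cost of allocating the full dense matrix up front.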

Related

How to set all rows of a list of matrices to zero using if condition statement in R

Suppose I have a matrix, mat. Suppose further that the sum of one row of this matrix is equal to zero. Then I need to set all subsequent rows (the rows after the zero row) to zero. For example,
mat <- c(1,2,0,0,0,
3,4,0,2,1,
0,0,0,1,0,
1,2,0,0,0,
0,1,0,1,0)
mat <- matrix(mat,5,5)
mat
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 0 1 0
[2,] 2 4 0 2 1
[3,] 0 0 0 0 0
[4,] 0 2 1 0 1
[5,] 0 1 0 0 0
All the entries of row 3 are zero. Hence, I want rows 4 and 5 to become zero as well. I have a list of matrices and would like to apply the same operation to all of them using the lapply function. For simplicity, I make a list of 3 matrices similar to mat.
mat <- c(1,2,0,0,0,
3,3,0,2,1,
0,0,0,4,0,
1,3,0,0,0,
0,1,0,1,0)
mat <- matrix(mat,5,5)
mat1 <- c(1,2,0,0,0,
3,4,0,2,1,
0,0,0,1,0,
1,2,0,0,0,
0,1,0,1,0)
mat1 <- matrix(mat1,5,5)
mat2 <- c(1,2,0,0,0,
3,4,0,2,1,
0,0,0,2,0,
1,2,0,0,0,
0,2,0,3,0)
mat2 <- matrix(mat2,5,5)
Mat <- list(mat1, mat2, mat3)
You did not actually post mat3 in your data so I just used mat3 <- matrix(1, 5, 5), i.e. a 5x5 matrix of ones. This was to ensure it could handle cases where there is no row where all values are zero.
This will return a list of matrices where all rows are zero after the first row of zeroes:
lapply(Mat, \(mat) {
first_zero_row <- which(rowSums(mat)==0)[1]
if(!is.na(first_zero_row)) {
mat[first_zero_row:nrow(mat),] <- 0
}
mat
})
Output:
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 0 1 0
[2,] 2 4 0 2 1
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 0 1 0
[2,] 2 4 0 2 2
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 1 1 1 1 1
[3,] 1 1 1 1 1
[4,] 1 1 1 1 1
[5,] 1 1 1 1 1
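Another option could be the following (judging by the output, it was run on list(mat, mat1, mat2) rather than on the list used above):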
lapply(Mat, function(x) {x[cumsum(rowSums(x != 0) == 0) != 0, ] <- 0; x})
[[1]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 0 1 0
[2,] 2 3 0 3 1
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 0 1 0
[2,] 2 4 0 2 1
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
[[3]]
[,1] [,2] [,3] [,4] [,5]
[1,] 1 3 0 1 0
[2,] 2 4 0 2 2
[3,] 0 0 0 0 0
[4,] 0 0 0 0 0
[5,] 0 0 0 0 0
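To make the cumsum approach easier to follow, here is a small step-by-step sketch on a made-up 4x3 matrix (`x`, `is_zero_row`, `seen_zero` and `cancels` are illustrative names, not from the question). It also shows why testing `rowSums(x != 0) == 0` can be safer than `rowSums(x) == 0`: the latter would also flag a row whose positive and negative entries cancel out.

```r
x <- matrix(c(1, 3, 0,
              0, 0, 0,
              2, 0, 1,
              0, 0, 0), nrow = 4, byrow = TRUE)

# TRUE for rows that contain no non-zero entry.
is_zero_row <- rowSums(x != 0) == 0     # FALSE TRUE FALSE TRUE

# cumsum() counts the zero rows seen so far; once the count is above 0,
# we are at or below the first zero row.
seen_zero <- cumsum(is_zero_row) > 0    # FALSE TRUE TRUE TRUE

x[seen_zero, ] <- 0
x

# A row such as c(1, -1, 0) sums to zero although it is not a zero row.
cancels <- matrix(c(1, -1, 0), nrow = 1)
rowSums(cancels) == 0        # TRUE
rowSums(cancels != 0) == 0   # FALSE
```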

Change multiple Matrix elements by Index Vectors

I have a matrix
myMatrix <- matrix(data = 0, nrow = 4, ncol = 4)
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 0 0 0
[4,] 0 0 0 0
and I want to change particular values
myMatrix[1,1] <- 1
myMatrix[2,3] <- 1
myMatrix[4,4] <- 1
myMatrix
[,1] [,2] [,3] [,4]
[1,] 1 0 0 0
[2,] 0 0 1 0
[3,] 0 0 0 0
[4,] 0 0 0 1
How can I do this efficiently and elegantly if I have two vectors containing the row and column indexes:
rowIndexes <- c(1,2,4)
colIndexes <- c(1,3,4)
The assigned value is constant (in this case 1).
I know how to do it with a for-loop, but this feels inefficient.
We can cbind the row and column indexes into a two-column matrix, use that to index myMatrix, and assign 1 to the selected elements:
myMatrix[cbind(rowIndexes, colIndexes)] <- 1
myMatrix
# [,1] [,2] [,3] [,4]
#[1,] 1 0 0 0
#[2,] 0 0 1 0
#[3,] 0 0 0 0
#[4,] 0 0 0 1
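As a short sketch of how this differs from ordinary two-argument indexing, the example below reuses the question's vectors. The different values 10, 20, 30 and the names `idx`, `picked` and `block` are my own additions for illustration. A two-column index matrix addresses one cell per row of the index, whereas `myMatrix[rowIndexes, colIndexes]` addresses the whole block of all row/column combinations.

```r
myMatrix <- matrix(0, nrow = 4, ncol = 4)
rowIndexes <- c(1, 2, 4)
colIndexes <- c(1, 3, 4)

# Each row of idx is one (row, column) pair.
idx <- cbind(rowIndexes, colIndexes)

# Assignment with one value per pair.
myMatrix[idx] <- c(10, 20, 30)

# Reading with the same index returns the three cells as a vector.
picked <- myMatrix[idx]

# Two separate index vectors select a 3 x 3 block instead.
block <- myMatrix[rowIndexes, colIndexes]
dim(block)
```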

How can I make an identical matrix with a column vector?

I have a column vector with dimensions 4000x1, and I need to make a matrix with that vector, but the matrix needs to have the column vector as a diagonal and the other numbers as zero. Something like this:
Column Vector
> vector <- matrix(c(1:5), ncol=1, nrow=5)
> vector
[,1]
[1,] 1
[2,] 2
[3,] 3
[4,] 4
[5,] 5
Matrix
[,1] [,2] [,3] [,4]
a 1 0 0 0
b 0 2 0 0
c 0 0 3 0
How can I produce this output?
This sounds like the diag() function, e.g.,
> my_vect <- 1:5
> diag(my_vect)
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 0 2 0 0 0
[3,] 0 0 3 0 0
[4,] 0 0 0 4 0
[5,] 0 0 0 0 5
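By the way, as you have written it, vector is actually a 5x1 matrix, so you would need to convert it to, well, a vector first (when diag() is given a matrix, it extracts the diagonal instead of building a diagonal matrix):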
> diag(as.vector(vector))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 0 2 0 0 0
[3,] 0 0 3 0 0
[4,] 0 0 0 4 0
[5,] 0 0 0 0 5
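The difference between the two cases can be checked directly. This is a small sketch using a 5x1 column matrix like the one in the question; the names `v`, `d_extract`, `d_build` and `d_drop` are hypothetical, and `drop()` is shown only as one alternative to `as.vector()`.

```r
# A 5 x 1 column matrix, as in the question.
v <- matrix(1:5, ncol = 1, nrow = 5)

# Given a matrix, diag() extracts its diagonal: here just the element [1, 1].
d_extract <- diag(v)

# Given a plain vector, diag() builds a square matrix with it on the diagonal.
d_build <- diag(as.vector(v))

# drop() removes the length-one dimension and gives the same result.
d_drop <- diag(drop(v))

dim(d_build)
```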

How to create a matrix with 1s and 0s based off a 2-column data frame

Here is an example of what my data looks like (what I have is actually 1300 lines, or 1300 connections/edges between two different nodes):
node# node#
1 3
1 4
2 4
2 5
3 4
3 5
I currently have the above data in a data frame. This represent a network where a car can drive from node 1 to 3 or 1 to 4, and from node 2 to 4 or node 2 to 5, etc. I'd like to create a matrix that looks like this:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 0 0 0 0 0
[2,] 0 0 0 0 0 0
[3,] 0 0 0 0 0 0
[4,] 0 0 0 0 0 0
[5,] 0 0 0 0 0 0
Where I'm stuck: I want to input 1s into the matrix from the leaving node, and a -1 in the matrix of the destination node, in the same column. So for this 6 node-connection data frame, the matrix would look like:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 1 0 0 0 0
[2,] 0 0 1 1 0 0
[3,] -1 0 0 0 1 1
[4,] 0 -1 -1 0 -1 0
[5,] 0 0 0 -1 0 -1
But like I said, I have more than 1300 connections, so doing this by hand would take a while. So I'm guessing matrix(0, 5, 1300) would be where I start?
You can index specific row/column pairs of a matrix using a 2-column indexing matrix. This provides a convenient way to set all the 1's and then set all the -1's:
mat <- matrix(0, nrow=max(dat), ncol=nrow(dat))
mat[cbind(dat$node1, seq_len(nrow(dat)))] <- 1
mat[cbind(dat$node2, seq_len(nrow(dat)))] <- -1
mat
# [,1] [,2] [,3] [,4] [,5] [,6]
# [1,] 1 1 0 0 0 0
# [2,] 0 0 1 1 0 0
# [3,] -1 0 0 0 1 1
# [4,] 0 -1 -1 0 -1 0
# [5,] 0 0 0 -1 0 -1
(Thanks to @PierreLafortune for the trick about calling max on a data frame!)
Data:
dat <- data.frame(node1=c(1, 1, 2, 2, 3, 3), node2=c(3, 4, 4, 5, 4, 5))
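With 1300 edges the result is too wide to inspect by eye, so a few sanity checks may help. The sketch below rebuilds the matrix from the sample `dat` and verifies that every column holds exactly one 1 and one -1. The helper name `edge` and the checks themselves are my additions. They assume that no edge starts and ends at the same node, since in that case the -1 would overwrite the 1.

```r
dat <- data.frame(node1 = c(1, 1, 2, 2, 3, 3), node2 = c(3, 4, 4, 5, 4, 5))

# One column per edge, one row per node.
edge <- seq_len(nrow(dat))
mat <- matrix(0, nrow = max(dat), ncol = nrow(dat))
mat[cbind(dat$node1, edge)] <- 1    # leaving node
mat[cbind(dat$node2, edge)] <- -1   # destination node

# Each column should contain exactly one 1 and one -1, so it sums to 0.
stopifnot(all(colSums(mat == 1) == 1))
stopifnot(all(colSums(mat == -1) == 1))
stopifnot(all(colSums(mat) == 0))
```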
We could also use sparseMatrix from library(Matrix)
library(Matrix)
B <- sparseMatrix(dat$node2, seq_len(nrow(dat)), x= -1)
mat <- sparseMatrix(dat$node1, seq_len(nrow(dat)), x= 1,
dims=dim(B)) + B
as.matrix(mat)
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 1 1 0 0 0 0
#[2,] 0 0 1 1 0 0
#[3,] -1 0 0 0 1 1
#[4,] 0 -1 -1 0 -1 0
#[5,] 0 0 0 -1 0 -1
NOTE: dat taken from @josliber's post.

use of the [<- operator to modify a line of data

I have some data
data <- diag(5)
I want to use the [<- operator to change a line.
The result should be:
data[1,] <- 2
> data
[,1] [,2] [,3] [,4] [,5]
[1,] 2 2 2 2 2
[2,] 0 1 0 0 0
[3,] 0 0 1 0 0
[4,] 0 0 0 1 0
[5,] 0 0 0 0 1
I know I can do e.g.
`[<-`(data, i=1, j=3, 2)
which gives
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 2 0 0
[2,] 0 1 0 0 0
[3,] 0 0 1 0 0
[4,] 0 0 0 1 0
[5,] 0 0 0 0 1
but how can I operate on a whole row (or column, same issue)?
I tried j=NULL, j=integer(0), it doesn't work. I could do j=1:5 and get what I want but I am wondering how to mimic data[1,] <- 2 and not data[1,1:5] <- 2.
> `[<-`(data, 1, , 2) # blank 2nd argument
[,1] [,2] [,3] [,4] [,5]
[1,] 2 2 2 2 2
[2,] 0 1 0 0 0
[3,] 0 0 1 0 0
[4,] 0 0 0 1 0
[5,] 0 0 0 0 1
Alternatively, you can use ncol to ensure that all columns are set:
`[<-`(data, i = 1, j = seq_len(ncol(data)), 2)
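The blank-argument form works for columns as well. As a brief sketch, the calls below compare the functional form against the usual assignment syntax. The names `by_assign`, `by_call` and `col_call` are made up for the example, and a fresh `diag(5)` is passed to each call so that the comparison does not depend on whether the input object is modified.

```r
# Usual assignment syntax on a whole row.
by_assign <- diag(5)
by_assign[1, ] <- 2

# Functional form: the empty argument stands for "all columns".
by_call <- `[<-`(diag(5), 1, , 2)

# Same idea for a whole column: the empty argument stands for "all rows".
col_call <- `[<-`(diag(5), , 3, 2)

identical(by_assign, by_call)
```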
