How do I convert this sparse matrix to the normal one? - r

I have a sparse matrix represented as
> (f <- data.frame(row=c(1,2,3,1,2,1,2,3,4,1,1,2),value=1:12))
row value
1 1 1
2 2 2
3 3 3
4 1 4
5 2 5
6 1 6
7 2 7
8 3 8
9 4 9
10 1 10
11 1 11
12 2 12
Here the first column of each row is always present (in fact, the leading columns of each row are present and the remaining ones are missing).
I want to get the data into the matrix format:
> t(matrix(c(1,2,3,NA,4,5,NA,NA,6,7,8,9,10,NA,NA,NA,11,12,NA,NA),nrow=4,ncol=5))
[,1] [,2] [,3] [,4]
[1,] 1 2 3 NA
[2,] 4 5 NA NA
[3,] 6 7 8 9
[4,] 10 NA NA NA
[5,] 11 12 NA NA
Here is what seems to be working:
> library(Matrix)
> as.matrix(sparseMatrix(i = cumsum(f[[1]] == 1), j=f[[1]], x=f[[2]]))
[,1] [,2] [,3] [,4]
[1,] 1 2 3 0
[2,] 4 5 0 0
[3,] 6 7 8 9
[4,] 10 0 0 0
[5,] 11 12 0 0
Except that I have to replace 0 with NA myself.
Is there a better solution?

You can do everything with base functions. The trick is to use indexing by a 2-col (row and col indices) matrix:
j <- f$row
i <- cumsum(j == 1)
x <- f$value
m <- matrix(NA, max(i), max(j))
m[cbind(i, j)] <- x
m
Whether this is better than using the Matrix package is subjective. In my opinion, loading Matrix is overkill if you are not doing anything else with it. Also, if your data had 0 in the f$value column, the Matrix approach followed by a blanket replacement of 0 with NA would turn those genuine zeros into NA unless you are careful.
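A minimal sketch of the zero pitfall mentioned above, using only base R. The helper name to_dense and the zero placed in the sample data are assumptions made for illustration. Because the matrix is pre-filled with NA and only the listed cells are overwritten, a genuine 0 in the value column stays 0.

```r
# Hypothetical helper: rebuild the dense matrix from position/value pairs.
# Assumes a new output row starts whenever the position column equals 1.
to_dense <- function(pos, value) {
  i <- cumsum(pos == 1)              # output row index of each entry
  m <- matrix(NA, max(i), max(pos))  # start with every cell missing
  m[cbind(i, pos)] <- value          # fill only the cells that are present
  m
}

f <- data.frame(row = c(1, 2, 3, 1, 2, 1, 2, 3, 4, 1, 1, 2), value = 1:12)
f$value[5] <- 0L                     # illustrative: a genuine zero in the data
res <- to_dense(f$row, f$value)
res
```

The cell that held 5 in the original example now holds 0, while the cells that were never listed remain NA.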

Related

Convert matrix into symmetrical matrix in R [duplicate]

I currently have a matrix output from a program that looks like the following where the bottom left has all 1s:
B C D E
A 0 1 2 3
B 1 1 3 3
C 1 1 1 3
D 1 1 1 0
Is there a way to convert it into a symmetrical matrix instead of having all the 1s?
I do not think that the solution of @RonakShah is correct in general. Consider:
M = matrix(1:16, nrow=4)
M
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 2 6 10 14
[3,] 3 7 11 15
[4,] 4 8 12 16
M[lower.tri(M)] <- M[upper.tri(M)]
M
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 5 6 10 14
[3,] 9 13 11 15
[4,] 10 14 15 16
This is not symmetric. Instead, use
M = matrix(1:16, nrow=4)
M[lower.tri(M)] <- t(M)[lower.tri(M)]
M
[,1] [,2] [,3] [,4]
[1,] 1 5 9 13
[2,] 5 6 10 14
[3,] 9 10 11 15
[4,] 13 14 15 16
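The difference between the two assignments can be checked directly. This is a small sketch; the helper name symmetrize_from_upper is an assumption made for illustration. Both triangles are read in column-major order, so the upper triangle has to be transposed first for each value to land in its mirrored cell.

```r
# Hypothetical helper: mirror the upper triangle into the lower triangle.
symmetrize_from_upper <- function(m) {
  stopifnot(nrow(m) == ncol(m))      # only defined for square matrices
  lo <- lower.tri(m)
  m[lo] <- t(m)[lo]                  # t(m)[i, j] is m[j, i], the mirrored cell
  m
}

M <- matrix(1:16, nrow = 4)

# Direct copy: values arrive in the wrong cells for a general matrix
naive <- M
naive[lower.tri(naive)] <- naive[upper.tri(naive)]

S <- symmetrize_from_upper(M)

naive_ok <- all(naive == t(naive))   # FALSE for this M
fixed_ok <- all(S == t(S))           # TRUE
```

The comparison with t() is used here instead of isSymmetric() because isSymmetric() also compares row and column names, which differ in the asker's matrix.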
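You can copy the upper triangular values to the lower triangle. Note that the direct assignment below gives a symmetric result for this particular matrix only by coincidence, because both triangles are read and filled in column-major order; for a general matrix, use t(mat)[lower.tri(mat)] on the right-hand side instead.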
mat[lower.tri(mat)] <- mat[upper.tri(mat)]
mat
# B C D E
#A 0 1 2 3
#B 1 1 3 3
#C 2 3 1 3
#D 3 3 3 0

Putting a sequence into a matrix around existing values

I want to generate a symmetric matrix around a diagonal of zeroes and a predetermined sequence around them. In theory the lines should show as
0 1 3 5 7 9
1 0 3 5 7 9
I've tried tweaking with the conditionals, but I suspect that it's wonky because of indexing, which I am nowhere near skilled enough to fix.
bend <- function(n){
m <- seq(1, n, by=2)
a <- length(m)
y <- matrix(nrow= a, ncol = a, byrow= TRUE)
y <- ifelse(row(y) == col(y), 0, m)
y
}
Assuming that the input is a 9, expected output is
0 1 3 5 7 9
1 0 3 5 7 9
1 3 0 5 7 9
1 3 5 0 7 9
1 3 5 7 0 9
1 3 5 7 9 0
Actual output is
0 3 5 7 9 1
3 0 7 9 1 3
5 7 0 1 3 5
7 9 1 0 5 7
9 1 3 5 0 9
1 3 5 7 9 0
There's a simpler way to do what you need. Start by creating a matrix with length(x) + 1 rows and columns in which every element is a logical TRUE. Then set the diagonal to FALSE using diag(). Now you can replace the TRUEs with your desired vector, which is recycled across them; the FALSE diagonal is not affected and becomes 0 when the matrix is coerced to numeric. Since the values are filled in column-wise, you need a final transpose t() to get the correct result.
This way, you don't need to worry about tracking indices.
x <- c(1,3,5,7,9)
make_matrix <- function(x) {
  m <- matrix(TRUE, ncol = length(x) + 1, nrow = length(x) + 1)
  diag(m) <- FALSE
  m[m] <- x
  t(m)
}
make_matrix(x)
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 1 3 5 7 9
[2,] 1 0 3 5 7 9
[3,] 1 3 0 5 7 9
[4,] 1 3 5 0 7 9
[5,] 1 3 5 7 0 9
[6,] 1 3 5 7 9 0
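As a sketch, the same idea can be adapted to the asker's original signature, which takes the upper bound n rather than the vector itself. The original ifelse() version goes wrong because ifelse() recycles m across all cells in column-major order, including the diagonal positions, so the sequence drifts out of step; masking the diagonal first avoids that.

```r
# Sketch of bend() rewritten with the logical-mask approach.
bend <- function(n) {
  x <- seq(1, n, by = 2)             # e.g. 1 3 5 7 9 for n = 9
  k <- length(x) + 1                 # one extra row/column for the zero diagonal
  m <- matrix(TRUE, nrow = k, ncol = k)
  diag(m) <- FALSE                   # protect the diagonal from being filled
  m[m] <- x                          # x is recycled down each column
  t(m)                               # transpose so each row reads 1 3 5 ... with a 0
}

res <- bend(9)
res
```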
Here's another way with sapply. This creates the necessary row elements in each iteration and puts them in a matrix by column. Again, you need a t() to get the correct result.
t(sapply(0:length(x), function(a) append(x, 0, after = a)))
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0 1 3 5 7 9
[2,] 1 0 3 5 7 9
[3,] 1 3 0 5 7 9
[4,] 1 3 5 0 7 9
[5,] 1 3 5 7 0 9
[6,] 1 3 5 7 9 0
Benchmarks
sapply is slower, likely because it's creating the matrix elements one row at a time and calls append for every row. All this overhead is avoided in the make_matrix() approach.
library(microbenchmark)
x <- sample(100)
microbenchmark(
  make_matrix = make_matrix(x),
  sapply = t(sapply(0:length(x), function(a) append(x, 0, after = a))),
  akrun_forloop = {
    n <- length(x) + 1
    m1 <- matrix(0, n, n)
    for(i in seq_len(nrow(m1))) m1[i, -i] <- x
  },
  times = 1000
)
Unit: microseconds
expr min lq mean median uq max neval
make_matrix 111.495 117.5610 128.3135 126.890 135.7540 225.323 1000
sapply 520.620 551.1765 592.2642 573.335 602.2585 10477.221 1000
akrun_forloop 3380.292 3526.3080 3837.1570 3648.765 3812.5075 20943.245 1000
Using a simple for loop
n <- length(x) + 1
m1 <- matrix(0, n, n)
for(i in seq_len(nrow(m1))) m1[i, -i] <- x
m1
# [,1] [,2] [,3] [,4] [,5] [,6]
#[1,] 0 1 3 5 7 9
#[2,] 1 0 3 5 7 9
#[3,] 1 3 0 5 7 9
#[4,] 1 3 5 0 7 9
#[5,] 1 3 5 7 0 9
#[6,] 1 3 5 7 9 0
data
x <- c(1,3,5,7,9)
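As a sanity check on the three approaches discussed in this thread, the following sketch builds the matrix each way and confirms that the results agree. The variable names m_mask, m_sapply and m_loop are chosen here for illustration.

```r
x <- c(1, 3, 5, 7, 9)
n <- length(x) + 1

# Approach 1: logical mask, fill column-wise, then transpose
m_mask <- matrix(TRUE, nrow = n, ncol = n)
diag(m_mask) <- FALSE
m_mask[m_mask] <- x
m_mask <- t(m_mask)

# Approach 2: insert a 0 at each position with append(), one row per call
m_sapply <- t(sapply(0:length(x), function(a) append(x, 0, after = a)))

# Approach 3: fill every row except its diagonal cell
m_loop <- matrix(0, n, n)
for (i in seq_len(n)) m_loop[i, -i] <- x

agree <- all(m_mask == m_sapply) && all(m_sapply == m_loop)
agree
```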

Individual shift of each column in a matrix

I am looking for R code that transforms a matrix as follows (a: the original matrix, b: the desired output). Example:
a <- matrix(c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6), nrow = 6, ncol = 4)
b <- matrix(c(1,2,3,4,5,6,2,3,4,5,6,0,3,4,5,6,0,0,4,5,6,0,0,0), nrow = 6, ncol = 4)
a
[,1] [,2] [,3] [,4]
[1,] 1 1 1 1
[2,] 2 2 2 2
[3,] 3 3 3 3
[4,] 4 4 4 4
[5,] 5 5 5 5
[6,] 6 6 6 6
b
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
[3,] 3 4 5 6
[4,] 4 5 6 0
[5,] 5 6 0 0
[6,] 6 0 0 0
Thus, the first column is not shifted, the second column is shifted up one step, the third column shifted up two steps, and so on. The shifted columns are padded with zeros.
The following links didn't help me (nor did a double for-loop, a function with different variables, or the functions diag and kronecker).
R: Shift values in single column of dataframe UP
r matrix individual shift operations of elements
Rotate a Matrix in R
Have you any ideas? Thanks.
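This seems to work with data.table and should perform well with a large matrix. Note that both calls only produce shifted copies of the first column, which is sufficient here because all columns of a are identical: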
library(data.table)
# One way
dt[, shift(.SD, 0:3, 0, "lead", FALSE), .SDcols = 1]
# Alternatively
dt[, shift(dt, 0:3, 0, "lead", FALSE)][, 1:4]
Both return:
V1 V2 V3 V4
1: 1 2 3 4
2: 2 3 4 5
3: 3 4 5 6
4: 4 5 6 0
5: 5 6 0 0
6: 6 0 0 0
Using the following data:
a <- matrix(c(1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6,1,2,3,4,5,6), nrow = 6, ncol = 4)
dt <- setDT(as.data.frame(a))
I have a rough solution using sapply. Each iteration of sapply shifts one column, and sapply then combines all the outputs, which you can feed to matrix with the right dimensions (those of your initial matrix).
matrix(sapply(1:dim(a)[2], function(x){c(a[x:dim(a)[1], x], rep(0, (x - 1) ))}), ncol = dim(a)[2], nrow = dim(a)[1])
[,1] [,2] [,3] [,4]
[1,] 1 2 3 4
[2,] 2 3 4 5
[3,] 3 4 5 6
[4,] 4 5 6 0
[5,] 5 6 0 0
[6,] 6 0 0 0
You can shift the columns by filling a matrix that has one more row than "a" with the values from "a" (a warning is generated because the values are recycled). Then select the original number of rows and replace the lower right triangle with zeros.
nr <- nrow(a)
a2 <- matrix(a, ncol = ncol(a), nrow = nr + 1)[1:nr, ]
a2[col(a2) + row(a2) > nr + 1] <- 0
a2
# [,1] [,2] [,3] [,4]
# [1,] 1 2 3 4
# [2,] 2 3 4 5
# [3,] 3 4 5 6
# [4,] 4 5 6 0
# [5,] 5 6 0 0
# [6,] 6 0 0 0
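The same result can also be computed without relying on recycling (and its warning) by indexing the source cells explicitly. This is a sketch in base R; the function name shift_up_by_column and its fill argument are assumptions made for illustration. Target cell (i, j) takes its value from source cell (i + j - 1, j) whenever that row exists.

```r
# Hypothetical helper: shift column j up by j - 1 rows, padding with `fill`.
shift_up_by_column <- function(a, fill = 0) {
  nr  <- nrow(a)
  src <- row(a) + col(a) - 1         # source row for every target cell
  ok  <- src <= nr                   # cells whose source row exists
  out <- matrix(fill, nr, ncol(a))
  out[ok] <- a[cbind(src[ok], col(a)[ok])]
  out
}

a <- matrix(rep(1:6, 4), nrow = 6, ncol = 4)
b <- shift_up_by_column(a)
b

# Columns that differ from each other are shifted independently
a2 <- matrix(1:24, nrow = 6, ncol = 4)
b2 <- shift_up_by_column(a2)
```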
Building on tyluRp's answer, which almost worked for me, I suggest looping through the columns and calling shift on each one individually. Let's start with a matrix of random numbers:
a <- matrix(floor(10*runif(24)), ncol=4)
a
[,1] [,2] [,3] [,4]
[1,] 8 4 8 3
[2,] 0 6 9 0
[3,] 1 6 0 7
[4,] 0 3 9 7
[5,] 2 4 2 9
[6,] 4 8 5 6
library(data.table)
dt <- setDT(as.data.frame(a))
Now the loop that does the job...
for (i in 2:length(dt)) dt[, i] <- shift(dt[, i, with = FALSE], i - 1, 0, "lead")
...by replacing columns with their shifted version.
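The original answer replaced all columns with shifted copies of the first column, thus losing data. This happens because .SDcols = 1 restricts .SD to the first column, and passing the vector 0:3 as n makes shift return one shifted copy of that column for each value of n.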

How can I find repeated values/ data points and their index in 2D matrix of a dataframe in R?

For example suppose I have matrix A
x y z f
1 1 2 A 1005
2 2 4 B 1002
3 3 2 B 1001
4 4 8 C 1001
5 5 10 D 1004
6 6 12 D 1004
7 7 11 E 1005
8 8 14 E 1003
From this matrix I want to find the repeated values like 1001, 1005, D, 2 (in third column) and I also want to find their index (which row, or which position).
I am new to R!
Obviously it is possible to do with simple searching element by element by using a for loop, but I want to know, is there any function available in R for this kind of problem.
Furthermore, I tried using duplicated and unique, both functions are giving me the duplicated row number or column number, they are also giving me how many of them were repeated, but I can not search for whole matrix using both of them!
You can write a fairly simple function to get this information. Note that this solution works with a matrix, not with a data.frame. A similar function could be written for a data.frame using the fact that a data.frame is a list of columns.
# example data
set.seed(234)
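m <- matrix(sample(1:10, size=100, replace=TRUE), 10)
find_matches <- function(mat, value) {
  nr <- nrow(mat)
  val_match <- which(mat == value)        # linear (column-major) indices
  out <- matrix(NA, nrow= length(val_match), ncol= 2)
  # subtract 1 before dividing so that matches in the last row map correctly
  out[,2] <- (val_match - 1) %/% nr + 1   # column index
  out[,1] <- (val_match - 1) %% nr + 1    # row index
  return(out)
}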
R> m
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 8 6 6 7 6 7 4 10 6 9
[2,] 8 6 6 3 10 4 5 4 6 9
[3,] 1 6 9 2 9 2 3 6 4 2
[4,] 8 6 7 8 3 9 9 4 9 2
[5,] 1 1 5 6 7 1 5 1 10 6
[6,] 7 5 4 7 8 2 4 4 7 10
[7,] 10 4 7 8 3 1 8 6 3 4
[8,] 8 8 2 2 7 5 6 4 10 4
[9,] 10 2 9 6 6 9 7 2 4 7
[10,] 3 9 9 4 2 7 7 2 9 6
R> find_matches(m, 8)
[,1] [,2]
[1,] 1 1
[2,] 2 1
[3,] 4 1
[4,] 8 1
[5,] 8 2
[6,] 4 4
[7,] 7 4
[8,] 6 5
[9,] 7 7
In this function, the row index is output in column 1 and the column index is output in column 2
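Base R can also return row and column positions directly through the arr.ind argument of which(), and duplicated() can flag repeated values column by column in a data.frame. The sketch below uses a small made-up matrix and a data.frame rebuilt from the question's example; the object names pos and repeated_rows are chosen here for illustration.

```r
# Positions of a value in a matrix, including a match in the last row
m <- matrix(c(8, 6, 1,
              8, 3, 8,
              2, 8, 5), nrow = 3, byrow = TRUE)
pos <- which(m == 8, arr.ind = TRUE)     # matrix with columns "row" and "col"
pos

# Repeated values per column of a data.frame (a data.frame is a list of columns)
A <- data.frame(
  x = 1:8,
  y = c(2, 4, 2, 8, 10, 12, 11, 14),
  z = c("A", "B", "B", "C", "D", "D", "E", "E"),
  f = c(1005, 1002, 1001, 1001, 1004, 1004, 1005, 1003)
)
# For each column, the rows whose value occurs more than once in that column
repeated_rows <- lapply(A, function(v) {
  which(duplicated(v) | duplicated(v, fromLast = TRUE))
})
repeated_rows
```

Combining duplicated(v) with duplicated(v, fromLast = TRUE) marks every occurrence of a repeated value, not only the second and later ones.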

Order a matrix by multiple column in r

I have a matrix
df<-matrix(data=c(3,7,5,0,1,0,0,0,0,8,0,9), ncol=2)
rownames(df)<-c("a","b","c","d","e","f")
[,1] [,2]
a 3 0
b 7 0
c 5 0
d 0 8
e 1 0
f 0 9
and I would like to order the matrix in descending order first by column 1 and then by column two resulting in the matrix
df.ordered<-matrix(data=c(7,5,3,1,0,0,0,0,0,0,9,8),ncol=2)
rownames(df.ordered)<-c("b","c","a","e","f","d")
[,1] [,2]
b 7 0
c 5 0
a 3 0
e 1 0
f 0 9
d 0 8
Any suggestions on how I could achieve this? Thanks.
The order function should do it.
df[order(df[,1],df[,2],decreasing=TRUE),]
To complete the main answer, here is a way to do it programmatically, without having to specify the columns by hand:
set.seed(2013) # preparing my example
mat <- matrix(sample.int(10,size = 30, replace = TRUE), ncol = 3)
mat
[,1] [,2] [,3]
[1,] 5 1 6
[2,] 10 3 1
[3,] 8 8 1
[4,] 8 9 9
[5,] 3 7 3
[6,] 8 8 5
[7,] 10 10 2
[8,] 8 10 7
[9,] 10 1 9
[10,] 9 4 5
As a simple example, let's say I want to use all the columns, in their order of appearance, to sort the rows of the matrix. (One could just as easily pass a vector of column indexes to the matrix.)
mat[do.call(order, as.data.frame(mat)),] #could be ..as.data.frame(mat[,index_vec])..
[,1] [,2] [,3]
[1,] 3 7 3
[2,] 5 1 6
[3,] 8 8 1
[4,] 8 8 5
[5,] 8 9 9
[6,] 8 10 7
[7,] 9 4 5
[8,] 10 1 9
[9,] 10 3 1
[10,] 10 10 2
The order function will help you out. Try this:
df[order(-df[,1],-df[,2]),]
[,1] [,2]
b 7 0
c 5 0
a 3 0
e 1 0
f 0 9
d 0 8
Negating a column with the minus sign reverses its sort order, so the rows come out in decreasing order. When every column is sorted in decreasing order, you get the same result by setting decreasing=TRUE:
df[order(df[,1],df[,2],decreasing=TRUE),]
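The programmatic do.call approach and the decreasing argument can be combined, so that the columns do not have to be listed by hand. This is a sketch under the assumption that every column should be sorted in decreasing order; the names keys, ord and sorted are chosen here for illustration.

```r
df <- matrix(c(3, 7, 5, 0, 1, 0, 0, 0, 0, 8, 0, 9), ncol = 2)
rownames(df) <- c("a", "b", "c", "d", "e", "f")

# One unnamed sort key per column, in column order
keys <- unname(as.list(as.data.frame(df)))

# Equivalent to order(df[, 1], df[, 2], decreasing = TRUE)
ord <- do.call(order, c(keys, list(decreasing = TRUE)))
sorted <- df[ord, ]
sorted
```

Negating columns, as in the previous answer, only works for numeric keys, whereas decreasing = TRUE also applies to character columns.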
