Is there a built-in function in either the slam package or the Matrix package to convert a sparse matrix in simple triplet matrix form (from slam) to a sparse matrix in dgTMatrix/dgCMatrix form (from Matrix)?
And is there a built-in way to access the non-zero entries of a simple triplet matrix?
I'm working in R.
Actually, there is a built-in way:
simple_triplet_matrix_sparse <- sparseMatrix(i=simple_triplet_matrix_sparse$i, j=simple_triplet_matrix_sparse$j, x=simple_triplet_matrix_sparse$v,
dims=c(simple_triplet_matrix_sparse$nrow, simple_triplet_matrix_sparse$ncol))
In my experience, this trick saved me a great deal of time and misery, and spared me the crashes I used to get when doing large-scale text mining with the tm package. This question doesn't really need a reproducible example: a simple triplet matrix is a simple triplet matrix no matter what data it contains. The question merely asks whether either package has a built-in function to convert between the two.
Slight modification: sparseMatrix takes integers as inputs, whereas slam takes i and j as factors and v can be anything. This version also carries over the dimnames.
as.sparseMatrix <- function(simple_triplet_matrix_sparse) {
sparseMatrix(
i = simple_triplet_matrix_sparse$i,
j = simple_triplet_matrix_sparse$j,
x = simple_triplet_matrix_sparse$v,
dims = c(
simple_triplet_matrix_sparse$nrow,
simple_triplet_matrix_sparse$ncol
),
dimnames = dimnames(simple_triplet_matrix_sparse)
)
}
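To try the conversion on a small object, and to address the second part of the question, here is a minimal sketch. It assumes slam and Matrix are installed; the toy values and the names stm, nonzero and sp are made up for illustration. A simple triplet matrix is a list, so the non-zero values and their coordinates can be read from its v, i and j components.

```r
library(slam)
library(Matrix)

# Toy 3 x 4 simple triplet matrix with three non-zero entries (illustrative data)
stm <- simple_triplet_matrix(
  i = c(1L, 2L, 3L),
  j = c(1L, 3L, 4L),
  v = c(10, 20, 30),
  nrow = 3L, ncol = 4L
)

# Non-zero entries and their coordinates are plain list components
nonzero <- data.frame(row = stm$i, col = stm$j, value = stm$v)

# Same conversion as above, applied to the toy object
sp <- sparseMatrix(i = stm$i, j = stm$j, x = stm$v,
                   dims = c(stm$nrow, stm$ncol))
```

By default sparseMatrix() returns a column-compressed matrix (dgCMatrix for numeric values).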
Related
Might be a very silly question, but I cannot seem to find a proper way to create a sparse diagonal matrix in R.
I've found the functions:
diag.spam()
spdiags()
and tried them with the Matrix and spam packages installed, but R did not seem to recognize these functions. Does anyone know which function or library I need?
I need it because I want to create diagonal matrices larger than 256 by 256.
Use the Diagonal() function in the Matrix package. (Matrix is a "recommended" package, which means it is automatically available when you install R.)
library(Matrix)
m <- Diagonal(500)
image(m)
Diagonal(n) creates an n x n identity matrix. To create a diagonal matrix with a specified diagonal x, use Diagonal(x=<your vector>).
Alternatively, use bandSparse from the Matrix package.
To get an n-by-n matrix with m on its diagonal, write:
bandSparse(n,n,0,list(rep(m, n+1)))
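As a rough illustration of the two approaches, the sketch below builds small diagonal matrices both ways. The values and the names d1, b and d2 are arbitrary choices for the example; only the Matrix package is assumed.

```r
library(Matrix)

n <- 4

# Diagonal matrix with an arbitrary diagonal
d1 <- Diagonal(x = c(2, 5, 7, 11))

# Constant diagonal (value 3) via bandSparse: k = 0 selects the main diagonal
b <- bandSparse(n, n, k = 0, diagonals = list(rep(3, n)))

# The same constant diagonal via Diagonal()
d2 <- Diagonal(x = rep(3, n))
```

Diagonal() stores only the diagonal entries, while bandSparse() returns a general sparse matrix, which may matter if you later add off-diagonal bands.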
Given a sparse matrix M, which is 32x24 I'm trying to create a larger sparse matrix of this form:
A = [[O(32),M],[t(M),O(24)]]
Here O(n) is a zero sparse matrix of dimension nxn.
M itself is a block matrix:
M = [[m.aa,m.ab],[m.ba,m.bb]]
where m.ij is 16x12.
I'm using the Matrix package for sparsematrix and blockmatrix for blockmatrix. One problem I have is that the use.as.blockmatrix=FALSE parameter, which works nicely for ordinary block matrices, seems not to work properly for block sparse matrices. I can't take the transpose of the block matrices in question, which makes the construction of A difficult.
Here's how I'm generating m.ij:
m.aa<-rsparsematrix(
#dimensions:
nrow=16,ncol=12,
nnz=20,
rand.x=function(x) 1 )
m.ab<-rsparsematrix(
#dimensions:
nrow=16,ncol=12,
nnz=10,
rand.x=function(x) 1 )
m.ba<-rsparsematrix(
nrow=16,ncol=12,
nnz=0,
rand.x=function(x) 1 )
m.bb<-m.aa
M<-blockmatrix(dim=c(2,2),names=c("maa","mba","mab","mbb"),
maa=m.aa,mab=m.ab,mba=m.ba,mbb=m.bb,
use.as.blockmatrix=FALSE)
But attr(M,"class") shows M is still a blockmatrix, even though I have use.as.blockmatrix=FALSE.
I can create O(32) and O(24), but t(M) gives me the error message argument is not a matrix, so I can't use it for block A(2,1) :(
A might be constructed with something like:
Mt<-t(M)
O32<-rsparsematrix(nrow=32,ncol=32,nnz=0)
O24<-rsparsematrix(nrow=24,ncol=24,nnz=0)
A<-blockmatrix(dim=c(2,2),names=c("RR","BR","RB","BB"), RR=O32,RB=M,BR=Mt,BB=O24)
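This is perhaps a little awkward, but rather than using blockmatrix you can put together the appropriate blocks yourself using rbind(), cbind(), and Matrix(0,...). (Older versions of Matrix needed rBind() and cBind() for this; those were later deprecated, and the base rbind() and cbind() work on sparse matrices in R >= 3.2.0.)
M <- cbind(rbind(m.aa,m.ba),rbind(m.ab,m.bb))
A <- rbind(
cbind(Matrix(0,32,32), M ),
cbind(t(M), Matrix(0,24,24))
)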
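For an end-to-end check, here is a self-contained sketch of the same assembly with reproducible stand-in blocks. The seed and the use of Matrix(0, 16, 12, sparse = TRUE) for the empty block are assumptions made for the example, not part of the original question.

```r
library(Matrix)

set.seed(1)
ones <- function(n) rep(1, n)  # every non-zero entry is 1

# Stand-ins for the 16 x 12 blocks from the question
m.aa <- rsparsematrix(16, 12, nnz = 20, rand.x = ones)
m.ab <- rsparsematrix(16, 12, nnz = 10, rand.x = ones)
m.ba <- Matrix(0, 16, 12, sparse = TRUE)  # empty block
m.bb <- m.aa

# M = [[m.aa, m.ab], [m.ba, m.bb]], 32 x 24
M <- cbind(rbind(m.aa, m.ba), rbind(m.ab, m.bb))

# A = [[O(32), M], [t(M), O(24)]], 56 x 56
A <- rbind(
  cbind(Matrix(0, 32, 32, sparse = TRUE), M),
  cbind(t(M), Matrix(0, 24, 24, sparse = TRUE))
)

dim(A)
isSymmetric(A)
```

Because the off-diagonal blocks are M and t(M), A should come out symmetric, which makes isSymmetric() a cheap sanity check.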
I'm working with a vector (~14000x1) of various values that I would like to put on the diagonal of a sparse matrix where I'm using the library Matrix. I want to do this while avoiding the need of creating a full matrix and then converting back to a sparse matrix after.
So far I can do this with a for loop, but it takes a long time. Can you think of a more efficient and less memory-intensive way of doing it?
Here's a simple reproducible example:
library(Matrix)
x = Matrix(matrix(1,14000,1),sparse=TRUE)
X = Diagonal(14000)
for(i in 1:13383){
X[i,i]=x[i]
print(i)
}
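One loop-free possibility, building on the Diagonal(x=<your vector>) form mentioned earlier, is sketched below. The vector vals is a hypothetical stand-in for the real data; only the Matrix package is assumed.

```r
library(Matrix)

set.seed(42)
vals <- runif(14000)  # hypothetical stand-in for the real ~14000 values

# Builds the 14000 x 14000 diagonal matrix directly; no dense matrix is created
X <- Diagonal(x = vals)

# If later code expects a general column-compressed sparse matrix
X_csc <- as(X, "CsparseMatrix")
```

Only the diagonal values are stored, so memory use should stay on the order of the vector itself rather than the full matrix.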
I have started working on a few ML projects and use R as the preferred language. I am trying to build a basic recommendation system
http://www.dataperspective.info/2014/05/basic-recommendation-engine-using-r.html
I need to find the similarity matrix (according to the website), and I am using the cosine function (in the 'lsa' package) to find user_similarity.
library(lsa)
data_rating <- read.csv("recommendation_basic1.csv", header = TRUE)
x = data_rating[,2:7]
x[is.na(x)] = 0
print(x)
similarity_users <- cosine(as.matrix(x))
similarity_users
But I need the similarity matrix among users, and this code gives me a similarity matrix among the movies. Do I need to modify the line below?
x = data_rating[,2:7]
PS. The recommendation_basic1.csv is the same as in the link.
Putting this in so the question is not unanswered.
You can just use similarity_users <- cosine(as.matrix(t(x)))
Here, the t is matrix transpose, so it just switches the rows and columns which is equivalent to switching the users and the movies.
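To see why the transpose switches the result from movies to users, here is a small base-R sketch. cosine_cols is a hypothetical helper written for this example (it is not the lsa function); it compares columns, so its result has one row and one column per column of the input. The ratings are made up.

```r
# Hypothetical stand-in for a column-wise cosine similarity
cosine_cols <- function(m) {
  cp <- crossprod(m)        # t(m) %*% m: dot products between columns
  norms <- sqrt(diag(cp))   # Euclidean norm of each column
  cp / outer(norms, norms)
}

# Made-up ratings: users in rows, movies in columns
ratings <- matrix(
  c(5, 0, 3,
    4, 1, 0),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("user1", "user2"), c("movieA", "movieB", "movieC"))
)

movie_sim <- cosine_cols(ratings)     # 3 x 3, movie by movie
user_sim  <- cosine_cols(t(ratings))  # 2 x 2, user by user
```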
I have a matrix of factors in R and want to convert it to a matrix of dummy variables 0-1 for all possible levels of each factors.
However this "dummy" matrix is very large (91690x16593) and very sparse. I need to store it in a sparse matrix, otherwise it does not fit in my 12GB of ram.
Currently, I am using the following code, which works fine and takes only seconds:
library(Matrix)
X_factors <- data.frame(lapply(my_matrix, as.factor))
#encode factor data in a sparse matrix
X <- sparse.model.matrix(~.-1, data = X_factors)
However, I want to use the e1071 package in R, and eventually save this matrix to libsvm format with write.matrix.csr(), so first I need to convert my sparse matrix to the SparseM format.
I tried to do:
library(SparseM)
X2 <- as.matrix.csr(X)
but it very quickly fills my RAM and eventually R crashes. I suspect that internally, as.matrix.csr first converts the sparse matrix to a dense matrix that does not fit in my computer memory.
My other alternative would be to create my sparse matrix directly in the SparseM format.
I tried as.matrix.csr(X_factors) but it does not accept a data-frame of factors.
Is there an equivalent to sparse.model.matrix(~.-1, data = X_factors) in the SparseM package? I searched the documentation but did not find one.
Quite tricky but I think I got it.
Let's start with a sparse matrix from the Matrix package:
i <- c(1,3:8)
j <- c(2,9,6:10)
x <- 7 * (1:7)
X <- sparseMatrix(i, j, x = x)
The Matrix package uses a column-oriented compression format, while SparseM supports both column and row oriented formats and has functions that can easily handle the conversion from one format to the other.
So we will first convert our column-oriented Matrix into a column-oriented SparseM matrix. We just need to be careful to call the right constructor and to account for the two packages' different index conventions (the Matrix slots are 0-based, while SparseM expects 1-based indices):
X.csc <- new("matrix.csc", ra = X@x,
ja = X@i + 1L,
ia = X@p + 1L,
dimension = X@Dim)
Then, change from column-oriented to row-oriented format:
X.csr <- as.matrix.csr(X.csc)
And you're done! You can check that the two matrices are identical (on my small example) by doing:
range(as.matrix(X) - as.matrix(X.csc))
# [1] 0 0
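The two steps can be wrapped into a small helper, sketched below under the assumption that the input is a dgCMatrix (which is what sparseMatrix() returns for numeric values). The function name dgC_to_csr is hypothetical; Matrix and SparseM are assumed to be installed.

```r
library(Matrix)
library(SparseM)

# Hypothetical helper: dgCMatrix (Matrix) -> matrix.csr (SparseM), no dense step
dgC_to_csr <- function(X) {
  stopifnot(is(X, "dgCMatrix"))
  X.csc <- new("matrix.csc",
               ra = X@x,            # non-zero values
               ja = X@i + 1L,       # row indices, shifted to 1-based
               ia = X@p + 1L,       # column pointers, shifted to 1-based
               dimension = X@Dim)
  as.matrix.csr(X.csc)              # column-oriented -> row-oriented
}

# Same small example as above
i <- c(1, 3:8)
j <- c(2, 9, 6:10)
x <- 7 * (1:7)
X <- sparseMatrix(i, j, x = x)

X.csr <- dgC_to_csr(X)
```

On a small matrix, comparing as.matrix(X) with as.matrix(X.csr) is a quick way to confirm the round trip; on the full-size matrix that dense comparison would not fit in memory.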