R pnorm function with sparse matrix - r

I would like to compute p-values for a large sparse matrix. All the elements in this matrix are standard normal z-scores. I want to use the pnorm function, but pnorm does not support sparse matrices. Apart from converting the sparse matrix to a full matrix, is there a more efficient way?
Any suggestions are appreciated!

Because the matrix is sparse, every zero entry maps to the same value, pnorm(0), which is 0.5. You can fill the result with that value and then compute pnorm only for the non-zero entries. For example, build a sparse matrix:
library(Matrix)                      # provides Matrix() and the sparse classes
data <- rnorm(1e5)
zero_index <- sample(1e5, 9e4)       # positions of the 90% of entries to zero out
data[zero_index] <- 0
mat <- matrix(data, ncol=100)
mat_sparse <- Matrix(mat, sparse=TRUE)
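Create a matrix filled with pnorm(0), then overwrite the non-zero positions:
mat_pnorm <- matrix(pnorm(0), nrow=nrow(mat_sparse), ncol=ncol(mat_sparse))
nzData <- summary(mat_sparse)        # data frame with columns i, j, x for the non-zero entries
mat_pnorm[as.matrix(nzData[,1:2])] <- pnorm(nzData$x)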
all.equal(mat_pnorm,pnorm(mat))
[1] TRUE
You did not specify how you would like the p-values returned, but you can easily cast the result into a vector instead of the matrix used above.
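The result above is a dense matrix, because pnorm(0) is 0.5 rather than 0. If the output should stay sparse, one possible sketch is to store pnorm(z) - 0.5 instead, which maps zeros to zeros. This assumes the matrix is a dgCMatrix (what Matrix(..., sparse=TRUE) typically returns for a general numeric matrix) whose non-zero values live in the x slot; the name shifted is only illustrative.

```r
library(Matrix)

set.seed(1)
z <- rnorm(1000)
z[sample(1000, 900)] <- 0                 # zero out 90% of the entries
mat <- matrix(z, ncol = 10)
mat_sparse <- Matrix(mat, sparse = TRUE)  # assumed to be a dgCMatrix

# Transform only the stored non-zero values; zeros stay zero
# because pnorm(0) - 0.5 == 0.
shifted <- mat_sparse
shifted@x <- pnorm(shifted@x) - 0.5

# Check against the dense computation (adds the 0.5 offset back).
all.equal(as.matrix(shifted) + 0.5, pnorm(mat))
```

Any downstream code then has to remember the 0.5 offset, so this only pays off when the dense matrix is too large to hold in memory.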

Related

Multiplying a matrix with a vector results in a matrix

I have a document-term matrix:
document_term_matrix <- as.matrix(DocumentTermMatrix(corpus, control = list(stemming = FALSE, stopwords=FALSE, minWordLength=3, removeNumbers=TRUE, removePunctuation=TRUE )))
For this document-term matrix, I've calculated the local and global term weighting as follows:
lw_tf <- lw_tf(document_term_matrix)
gw_idf <- gw_idf(document_term_matrix)
lw_tf is a matrix with the same dimensionality as the document-term-matrix (nxm) and gw_idf is a vector of size n. However, when I run:
tf_idf <- lw_tf * gw_idf
The dimensionality of tf_idf is again nxm.
Originally, I would not expect this multiplication to work, as the dimensionalities are not conformable. However, given this output I now expect the dimensionality of gw_idf to be mxm. Is this indeed the case? And if so: what happened to the gw_idf vector of size n?
Matrix multiplication is done in R by using %*%, not * (the latter is element-wise multiplication). Your reasoning about conformability is correct for matrix multiplication; you were just using the other operator.
A matrix product is only defined if the second dimension of the first matrix equals the first dimension of the second matrix. The result has the first dimension of the first matrix and the second dimension of the second matrix.
For instance, a 1 x n matrix multiplied by an n x m matrix gives a 1 x m matrix. You can check this with the following example:
a <- matrix(runif(100, 0 , 1), nrow = 1, ncol = 100)
b <- matrix(runif(100 * 200, 0, 1), nrow = 100, ncol = 200)
c <- a %*% b
dim(c)
[1] 1 200
Now, about your specific case: I don't have the package that builds the document-term matrix (a reproducible example would help), but note that * is element-wise. When you multiply an n x m matrix element-wise by a vector of length n, R recycles the vector down each column, so row i is scaled by gw_idf[i] and the result keeps the n x m shape. That matches what you observed: gw_idf is still a vector of length n and was never turned into a matrix.
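For reference, here is a small sketch of how element-wise * behaves in R between a matrix and a vector whose length equals the number of rows. The vector is recycled in column-major order, so each row is scaled by the matching vector element. The names m and v are made up for the illustration.

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix: rows are (1, 3, 5) and (2, 4, 6)
v <- c(10, 100)              # one weight per row

scaled <- m * v              # element-wise, v recycled down each column
dim(scaled)                  # still 2 x 3

# Same result written as a matrix product with a diagonal matrix,
# and with sweep() over the row margin.
all.equal(scaled, diag(v) %*% m)
all.equal(scaled, sweep(m, 1, v, `*`))
```

If the weights were meant to apply per column instead, sweep(m, 2, w, `*`) with a weight vector w of length ncol(m) would be the analogous call.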

Compute the null space of a sparse matrix

I found the function (null OR nullspace) to find the null space of a regular matrix in R, but I couldn't find any function or package for a sparse matrix (sparseMatrix).
Does anybody know how to do this?
If you take a look at the code of ggm::null, you will see that it is based on the QR decomposition of the input matrix.
On the other hand, the Matrix package provides its own method to compute the QR decomposition of a sparse matrix.
For example:
require(Matrix)
A <- matrix(rep(0:1, 3), 3, 2)
As <- Matrix(A, sparse = TRUE)
qr.Q(qr(A), complete=TRUE)[, 2:3]
qr.Q(qr(As), complete=TRUE)[, 2:3]
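To make the QR idea concrete, here is a dense, base-R sketch of how a null space basis can be read off a complete QR decomposition: the columns of Q beyond the rank are orthogonal to the column space of A, so they span the null space of t(A). This is an illustration of the technique only, not the code of ggm::null, and it has not been checked against the sparse qr method; the names qrA, r and N are made up.

```r
A <- matrix(rep(0:1, 3), 3, 2)      # same 3 x 2 example matrix as above

qrA <- qr(A)
r <- qrA$rank                       # numerical rank found by qr()
Q <- qr.Q(qrA, complete = TRUE)     # full 3 x 3 orthogonal factor

# Columns of Q after the first r are orthogonal to every column of A,
# i.e. they form a basis of the null space of t(A).
N <- Q[, seq_len(ncol(Q)) > r, drop = FALSE]

max(abs(crossprod(A, N)))           # numerically zero
```

To get the null space of A itself rather than of t(A), the same steps can be applied to t(A).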

Construct diagonal matrix using bigalgebra

I'd like to construct a large diagonal matrix from a vector. I installed the bigalgebra package, but it doesn't have a diag function. In addition, how can I compute the inverse (solve) and transpose (t) of big matrices?
v <- runif(42109)
V <- diag(v)
Error: cannot allocate vector of size 13.2 Gb
If sparse matrices are an option, you can use the Matrix package (supplied with R).
library(Matrix)
V <- Matrix(0, nrow=42109, ncol=42109)
diag(V) <- v
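As a variation on the same idea, the Matrix package also has a dedicated constructor, Diagonal(), which stores only the diagonal entries. The sketch below assumes a purely diagonal matrix is all that is needed; in that case the inverse and the transpose stay diagonal as well.

```r
library(Matrix)

set.seed(42)
v <- runif(42109)

V <- Diagonal(x = v)     # diagonal matrix; only the 42109 diagonal values are stored
V_inv <- solve(V)        # inverse of a diagonal matrix: reciprocals on the diagonal
V_t <- t(V)              # transpose of a diagonal matrix is the matrix itself

all.equal(diag(V_inv), 1 / v)
dim(V_t)
```

Once other non-diagonal sparse matrices enter the computation, solve() may need far more memory, since the inverse of a general sparse matrix is usually dense.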

how to calculate mean of multiple matrices

I have 2000 covariance matrices of size 27x27, and I want to get the mean covariance matrix over all 2000 matrices. The result I want is one matrix of size 27x27 in which position [1,1] is the mean of position [1,1] of the given 2000 matrices.
I could see from other posts that I should make an array and use the apply function, but it does not work!
My code:
a<-array(ml.1[c(1:2000)])
apply(a,c(1,2),mean)
I get this error message:
Error in if (d2 == 0L) { : missing value where TRUE/FALSE needed
I would appreciate it if anyone could help me solve this problem.
First, @eipi10 is right: your question is not reproducible. But the key here is in how you set up your array.
#Make some fake data 10 matrices 10x10
m <- lapply(1:10, function(x) matrix(rnorm(100), nrow = 10))
#bind the matrices together
a <- do.call(cbind, m)
#recast the matrix into three dimensions
dim(a) <- c(10,10,10)
#now apply should work
apply(a, c(1,2), mean)
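An alternative that skips the array altogether, assuming the matrices are held in a list as in the fake data above, is to add them up with Reduce and divide by the count. The sketch below compares both routes on simulated data; mean_reduce and mean_apply are illustrative names.

```r
set.seed(1)
# Fake data: a list of 10 matrices, each 10 x 10
m <- lapply(1:10, function(i) matrix(rnorm(100), nrow = 10))

# Route 1: element-wise sum of all matrices, divided by how many there are
mean_reduce <- Reduce(`+`, m) / length(m)

# Route 2: stack into a 10 x 10 x 10 array and average over the third dimension
a <- array(unlist(m), dim = c(10, 10, 10))
mean_apply <- apply(a, c(1, 2), mean)

all.equal(mean_reduce, mean_apply)
```

For 2000 matrices of size 27x27 the same code applies with the dimensions changed to c(27, 27, 2000).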

making matrix with eigenvalues

I am trying to make a diagonal matrix of eigenvalues.
Here is my code:
E = eigen(cor(A))
VAL = E$values
VEC = E$vectors
So I get a vector of eigenvalues, but how do I turn it into a matrix?
I guess I could use cbind() and enter the eigenvalue matrix by hand, but there has to be a better way.
You can use diag:
diag(E$values)
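A small sketch of this in context, using simulated data in place of the question's A (which is not shown): the diagonal matrix of eigenvalues, together with the eigenvectors, reconstructs the correlation matrix. The names R and D are made up for the example.

```r
set.seed(1)
A <- matrix(rnorm(200), nrow = 50, ncol = 4)   # stand-in data, 50 observations of 4 variables

R <- cor(A)
E <- eigen(R)

# diag() on a vector of length > 1 builds a diagonal matrix from it.
# Passing nrow explicitly also covers the length-1 case, where diag(x)
# would otherwise be read as "identity matrix of size x".
D <- diag(E$values, nrow = length(E$values))

# Spectral decomposition check: R = V D V'
all.equal(E$vectors %*% D %*% t(E$vectors), R)
```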
