An efficient way to diagonalize a sparse vector in R

I'm working with a vector (~14000x1) of various values that I would like to put on the diagonal of a sparse matrix, using the Matrix library. I want to do this without creating a full matrix and then converting it back to a sparse matrix afterwards.
So far I can do this with a for loop, but it takes a long time. Can you think of a more efficient and less memory-intensive way of doing it?
Here's a simple reproducible example:
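library(Matrix)
x <- Matrix(matrix(1, 14000, 1), sparse = TRUE)  # 14000 x 1 sparse column
X <- Diagonal(14000)                             # start from a sparse identity
for (i in 1:14000) {
  X[i, i] <- x[i]  # one sparse assignment per iteration: this is the slow part
  print(i)         # progress indicator
}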
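One way to avoid the loop entirely, sketched here under the assumption that the values are available as (or can be converted to) a plain numeric vector, is to pass them to Diagonal() through its x argument. The name v below is illustrative and does not come from the question.

```r
library(Matrix)

# Illustrative values; in the question these would come from the ~14000 x 1 vector.
v <- seq_len(14000) / 14000

# Diagonal() builds a sparse diagonal matrix directly from its diagonal entries,
# so no dense 14000 x 14000 matrix is ever created.
X <- Diagonal(x = v)

dim(X)
```

If a general compressed-column sparse matrix is needed later, sparseMatrix(i = seq_along(v), j = seq_along(v), x = v) should produce the same diagonal in that form.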

Related

Block-diagonal matrix from array in R

How can we construct a block-diagonal matrix from a three-dimensional array in R? There are several possibilities when starting from a list of matrices (e.g., Reduce(magic::adiag, list_of_matrices)) or individual matrices (e.g., magic::adiag(matrix1, matrix2)). However, I could not find anything when we start with an array:
matrices <- array(NA, c(3,3,2))
matrices[,,1] <- diag(1,3)
matrices[,,2] <- matrix(rnorm(9), 3, 3)
Are there any efficient solutions for constructing the corresponding 9x9 block matrix or is it a better idea to just convert to a list and use magic::adiag? The latter seems relatively inefficient, especially when the number of matrices is large.
I guess converting to a list and using magic::adiag is the fastest way. Try the following code, which is rather short and which I use frequently:
library(magic)
arr <- array(1:8, c(2,2,3))
do.call("adiag", lapply(seq(dim(arr)[3]), function(x) arr[ , , x]))
This essentially reduces to a one-liner but uses lists.
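If the Matrix package is an option, a similar construction can be sketched with Matrix::bdiag(), which accepts a list of matrices and returns the block-diagonal result as a sparse matrix. This is an alternative to magic::adiag rather than part of the answer above, and the slices are still extracted into a list first.

```r
library(Matrix)

arr <- array(as.numeric(1:12), c(2, 2, 3))  # three 2 x 2 slices

# Split the array along its third dimension into a list of matrices.
slices <- lapply(seq_len(dim(arr)[3]), function(k) arr[, , k])

# bdiag() places the slices along the diagonal of a sparse matrix.
B <- bdiag(slices)

as.matrix(B)  # dense view, for inspection only
```

Because the off-diagonal blocks are never stored, this may be preferable when the number of slices is large.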

Block sparse matrices as sparse matrices

Given a sparse matrix M, which is 32x24 I'm trying to create a larger sparse matrix of this form:
A = [[O(32),M],[t(M),O(24)]]
Here O(n) is a zero sparse matrix of dimension nxn.
M itself is a block matrix:
M = [[m.aa,m.ab],[m.ba,m.bb]]
where m.ij is 16x12.
I'm using the Matrix package for sparsematrix and blockmatrix for blockmatrix. One problem I have is that the use.as.blockmatrix=FALSE parameter, which works nicely for ordinary block matrices, seems not to work properly for block sparse matrices. I can't take the transpose of the block matrices in question, which makes the construction of A difficult.
Here's how I'm generating m.ij:
m.aa <- rsparsematrix(
  # dimensions:
  nrow = 16, ncol = 12,
  nnz = 20,
  rand.x = function(x) 1)
m.ab <- rsparsematrix(
  # dimensions:
  nrow = 16, ncol = 12,
  nnz = 10,
  rand.x = function(x) 1)
m.ba <- rsparsematrix(
  nrow = 16, ncol = 12,
  nnz = 0,
  rand.x = function(x) 1)
m.bb <- m.aa
M <- blockmatrix(dim = c(2, 2), names = c("maa", "mba", "mab", "mbb"),
                 maa = m.aa, mab = m.ab, mba = m.ba, mbb = m.bb,
                 use.as.blockmatrix = FALSE)
But attr(M,"class") shows M is still a blockmatrix, even though I have use.as.blockmatrix=FALSE.
I can create O(32) and O(24), but t(M) gives me the error message argument is not a matrix, so I can't use it for block A(2,1) :(
A might be constructed with something like:
Mt<-t(M)
O32<-rsparsematrix(nrow=32,ncol=32,nnz=0)
O24<-rsparsematrix(nrow=24,ncol=24,nnz=0)
A<-blockmatrix(dim=c(2,2),names=c("RR","BR","RB","BB"), RR=O32,RB=M,BR=Mt,BB=O24)
This is perhaps a little awkward, but rather than using blockmatrix you can put together the appropriate blocks yourself using rbind(), cbind(), and Matrix(0,...). (The Matrix package's older rBind() and cBind() are deprecated; since R 3.2.0 the base functions work on Matrix objects.)
M <- cbind(rbind(m.aa,m.ba),rbind(m.ab,m.bb))
A <- rbind(
  cbind(Matrix(0,32,32), M ),
  cbind(t(M), Matrix(0,24,24))
)
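A self-contained sketch of this construction, with a couple of sanity checks, might look as follows. It uses base rbind()/cbind(), which accept Matrix objects in R >= 3.2.0. The helper ones() and the use of Matrix(0, 16, 12, sparse = TRUE) for the empty block are illustrative choices, not taken from the question.

```r
library(Matrix)
set.seed(1)

# Random 16 x 12 sparse blocks whose non-zero entries are all 1.
ones <- function(n) rep(1, n)
m.aa <- rsparsematrix(16, 12, nnz = 20, rand.x = ones)
m.ab <- rsparsematrix(16, 12, nnz = 10, rand.x = ones)
m.ba <- Matrix(0, 16, 12, sparse = TRUE)  # empty block
m.bb <- m.aa

# M = [[m.aa, m.ab], [m.ba, m.bb]], a 32 x 24 sparse matrix.
M <- rbind(cbind(m.aa, m.ab), cbind(m.ba, m.bb))

# A = [[0, M], [t(M), 0]], a 56 x 56 sparse matrix.
A <- rbind(cbind(Matrix(0, 32, 32), M),
           cbind(t(M), Matrix(0, 24, 24)))

dim(A)
isSymmetric(A)
```

By construction A is symmetric, and its upper-right block is M itself, so both properties are cheap to verify after assembly.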

Preallocate sparse matrix with max nonzeros in R

I'm looking to preallocate a sparse matrix in R (using simple_triplet_matrix) by providing the dimensions of the matrix, m x n, and also the number of non-zero elements I expect to have. Matlab has the function "spalloc" (see below), but I have not been able to find an equivalent in R. Any suggestions?
S = spalloc(m,n,nzmax) creates an all zero sparse matrix S of size m-by-n with room to hold nzmax nonzeros.
Whereas it may make sense to preallocate a traditional dense matrix in R (in the same way that it is much more efficient to preallocate a regular (atomic) vector than to grow it one element at a time), I'm pretty sure it will not pay to preallocate sparse matrices in R in most situations.
Why?
For dense matrices, you allocate and then assign "piece by piece", e.g.,
m[i,j] <- value
For sparse matrices, however, the situation is very different. If you do something like
S[i,j] <- value
the internal code has to check whether [i,j] is an existing entry (typically non-zero) or not. If it is, it can change the value; otherwise, one way or the other, the triplet (i, j, value) needs to be stored, and that means extending the current structure. If you do this piece by piece, it is inefficient, largely irrespective of whether you did some preallocation or not.
If, on the other hand, you already know in advance all the [i,j] combinations which will contain non-zeroes, you could "pre-allocate", but in this case,
just store the vectors i and j, each of length nnzero, say. Then use your underlying "algorithm" to also construct a vector x of the same length which contains all the corresponding values, i.e., the entries.
Now, indeed, as @Pafnucy suggested, use spMatrix() or sparseMatrix(), two slightly different versions of the same functionality: constructing a sparse matrix, given its contents.
I am happy to help further, as I am the maintainer of the Matrix package.
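As a minimal sketch of the triplet approach described above: collect the row indices, column indices, and values first, then build the sparse matrix in a single call to sparseMatrix(). The dimensions and the concrete values below are made up for illustration.

```r
library(Matrix)

m <- 5  # number of rows
n <- 4  # number of columns

# Triplets (i, j, x): all positions and values are known before construction.
i <- c(1, 3, 5, 2)
j <- c(1, 2, 4, 4)
x <- c(10, 20, 30, 40)

# One call builds the whole matrix; no element-by-element assignment is needed.
S <- sparseMatrix(i = i, j = j, x = x, dims = c(m, n))

S
```

In a real computation, i, j, and x would typically be filled by the underlying algorithm (for example as ordinary preallocated vectors) before this final call.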

Populating Large Matrix and Computations

I am trying to populate a 25000 x 25000 matrix in a for loop, but R locks up on me. The data has many zero entries, so would a sparse matrix be suitable?
Here is some sample data and code.
x<-c(1,3,0,4,1,0,4,1,1,4)
y<-x
z <- matrix(NA, nrow = 10, ncol = 10)
for (i in 1:10) {
  if (x[i] == 0) {
    z[i, ] <- 0
  } else {
    for (j in 1:10) {
      if (x[i] == y[j]) {
        z[i, j] <- 1
      } else {
        z[i, j] <- 0
      }
    }
  }
}
One other question: is it possible to do computations on matrices this large? When I perform some calculations on sample matrices of this size, I either get an output of NA with an integer overflow warning, or R locks up completely.
You could vectorize this and that should help you. Also, if your data is indeed sparse and you can conduct your analysis on a sparse matrix it definitely is something to consider.
library(Matrix)
# set up all pairs
pairs <- expand.grid(x,x)
# get matrix indices
idx <- which(pairs[,1] == pairs[,2] & pairs[,1] != 0)
# create a matrix of zeros instead of NAs
z<-matrix(0,nrow=10,ncol=10)
z[idx] = 1
# create empty sparse matrix
z2 <-Matrix(0,nrow=10,ncol=10, sparse=TRUE)
z2[idx] = 1
all(z == z2)
[1] TRUE
The comment by @alexis_lax would make this even simpler and faster. I had completely forgotten about the outer function.
# normal matrix
z = outer(x, x, "==") * (x!=0)
# sparse matrix
z2 = Matrix(outer(x, x, "==") * (x!=0), sparse=TRUE)
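Note that outer() still materializes a dense matrix before the conversion; at 25000 x 25000 that is roughly 2.5 GB as a logical matrix. A possible way around this, sketched below as an alternative and not taken from the answer, is to build a sparse indicator matrix Z with one column per distinct non-zero value and take its cross-product. The names nz, g, and Z are illustrative.

```r
library(Matrix)

x <- c(1, 3, 0, 4, 1, 0, 4, 1, 1, 4)

nz <- which(x != 0)   # positions holding a non-zero value
g  <- factor(x[nz])   # one level per distinct non-zero value

# Indicator matrix: row i has a single 1 in the column of its value's level;
# rows where x is 0 stay empty.
Z <- sparseMatrix(i = nz, j = as.integer(g), x = rep(1, length(nz)),
                  dims = c(length(x), nlevels(g)))

# (Z %*% t(Z))[i, j] is 1 exactly when x[i] and x[j] share a non-zero value.
z2 <- tcrossprod(Z)
```

Whether the result is actually sparse depends on the data: if many entries share the same value, the corresponding block of z2 is fully populated.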
To answer your second question: yes, computations can be done on a matrix this big. You just need to approach it more cautiously and use the appropriate tools. Sparse matrices are convenient here: many typical matrix functions are available for them, and some other packages are compatible. Here is a link to a page with some examples.
Another thought: if you are working with really large matrices, you may want to look into other packages like bigmemory, which is designed to deal with R's large memory overhead.

using interp1 in R for matrix

I am trying to use the interp1 function in R for linearly interpolating a matrix without using a for loop. So far I have tried:
bthD <- c(0,2,3,4,5) # original depth vector
bthA <- c(4000,3500,3200,3000,2800) # original array of area
Temp <- c(4.5,4.2,4.2,4,5,5,4.5,4.2,4.2,4)
Temp <- matrix(Temp,2) # matrix for temperature measurements
# -- interpolating bathymetry data --
depthTemp <- c(0.5,1,2,3,4)
layerZ <- seq(depthTemp[1],depthTemp[5],0.1)
library(signal)
layerA <- interp1(bthD,bthA,layerZ);
# -- interpolate matrix --
layerT <- list()
for (i in 1:2){
t <- Temp[i,]
layerT[[i]] <- interp1(depthTemp,t,layerZ)
}
layerT <- do.call(rbind,layerT)
So, here I have used interp1 on each row of the matrix in a for loop. I would like to know how I could do this without using a for loop. I can do this in matlab by transposing the matrix as follows:
layerT = interp1(depthTemp,Temp',layerZ)'; % matlab code
but when I attempt to do this in R
layerT <- interp1(depthTemp,t(Temp),layerZ)
it does not return a matrix of interpolated results, but a numeric array. How can I ensure that R returns a matrix of the interpolated values?
There is nothing wrong with your approach; I would probably just avoid the intermediate t <- assignment.
If you want to feel R-ish, try
apply(Temp,1,function(t) interp1(depthTemp,t,layerZ))
You may have to wrap the whole call in t() (transpose) if you really need the result in that orientation.
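For reference, the same row-wise pattern can be sketched with base R's approx() in place of signal::interp1, which avoids the extra package dependency. This swaps in a different interpolation function, so treat it as an assumption that plain linear interpolation is all that is needed here.

```r
depthTemp <- c(0.5, 1, 2, 3, 4)
Temp <- matrix(c(4.5, 4.2, 4.2, 4, 5, 5, 4.5, 4.2, 4.2, 4), 2)
layerZ <- seq(depthTemp[1], depthTemp[5], 0.1)

# apply() returns one column per row of Temp, so transpose to get
# one row per temperature profile and one column per depth in layerZ.
layerT <- t(apply(Temp, 1, function(row) approx(depthTemp, row, xout = layerZ)$y))

dim(layerT)
```

The result has the same shape as the rbind() version in the question: two rows, one per profile.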
Since this is a 3D field, per-row interpolation might not be optimal. My favorite is interp.loess in package tgp, but for regular spacings other options might be available. That method does not work for your mini-example (which is fine for the question), since it requires a larger grid.
