I am trying to calculate the pseudoinverse of a large sparse matrix in R using the singular value decomposition. The matrix $L$ is roughly 240,000 x 240,000 and is stored as a dgCMatrix. $L$ is the Laplacian of a large-diameter graph, and I happen to know that the pseudoinverse $L^+$ should also be sparse. From empirical observations on a smaller subset of this graph, $L$ is ~0.07% sparse, while $L^+$ is ~2.5% sparse.
I have tried using pinv, ginv, and other standard pseudoinverse functions, but they error out due to memory constraints. I then tried to opt for the sparse matrix svd provided by the package irlba, which I was then going to use to compute the pseudoinverse using the standard formula after converting all outputs to sparse matrices. My code is here:
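library(Matrix)
library(irlba)

lim = 40     # number of singular triplets to compute
digits = 4   # rounding precision used to sparsify the factors
SVD = irlba(L, lim)
# Sparse t(U): row and column indices are swapped on purpose
U = round(SVD$u, digits)
nonZeroU = which(abs(U) > 0, arr.ind = TRUE)
sparseU = sparseMatrix(i = nonZeroU[,2], j = nonZeroU[,1], x = U[nonZeroU], dims = rev(dim(U)))
# Sparse V (values must come from V, not U)
V = round(SVD$v, digits)
nonZeroV = which(abs(V) > 0, arr.ind = TRUE)
sparseV = sparseMatrix(i = nonZeroV[,1], j = nonZeroV[,2], x = V[nonZeroV], dims = dim(V))
D = as(Diagonal(x = 1/SVD$d), "sparseMatrix")
pL = D %*% sparseU
pL = sparseV %*% pL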
I am able to get to the last line without an issue, but then I get an error due to memory constraints that says
Error in sparseV %*% pL :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
Of course I could piece together the pseudoinverse entry by entry using a for loop and vector multiplications, but I would like to calculate it with a simple function that takes advantage of the sparsity of the resulting pseudoinverse. Is there any way to use the SVD of $L$ to efficiently and approximately compute the pseudoinverse $L^+$, other than calculating each row individually?
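For reference, the "standard formula" mentioned above, $L^+ = V D^{-1} U^T$ restricted to the nonzero singular values, can be checked on a toy problem. This is only a minimal dense sketch in base R, assuming a hypothetical 6-node path graph; it does not address the memory problem at the 240,000 x 240,000 scale. Note that the pseudoinverse is dominated by the smallest nonzero singular values, whereas a truncated SVD that returns only the largest ones captures the part of $L^+$ with the least weight.

```r
# Toy Laplacian of a path graph on n nodes (hypothetical example data)
n <- 6
A <- matrix(0, n, n)
A[cbind(1:(n - 1), 2:n)] <- 1
A <- A + t(A)                      # symmetric adjacency matrix
Lap <- diag(rowSums(A)) - A        # graph Laplacian, rank n - 1

s <- svd(Lap)
# Drop singular values that are numerically zero before inverting
tol <- sqrt(.Machine$double.eps) * max(s$d)
keep <- s$d > tol

# V %*% diag(1/d) %*% t(U), using only the retained triplets
Lplus <- s$v[, keep] %*% (t(s$u[, keep]) / s$d[keep])

# Moore-Penrose check: Lap %*% Lplus %*% Lap should reproduce Lap
mp_error <- max(abs(Lap %*% Lplus %*% Lap - Lap))
```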
Related
I used the function linshrink of the nlshrink package to get a shrinkage estimate of the eigenvalues of a symmetric matrix M. Unfortunately the function does not return the eigenvectors, which I also need. How can I compute them manually? I thought about applying the definition and using (M − λI)x = 0 for every eigenvalue λ, but I'm not sure how to do it properly, since computing the matrix A = M − λI and passing it to solve(A,b) with b=rep(0,nrow(M)) obviously returns a vector of zeros. Can anybody help me? Here are a few lines to provide a working example:
library(nlshrink)
M <- matrix(1:16,4)
M[lower.tri(M)] = t(M)[lower.tri(M)]
M <- M/16.1
shrinkval <- linshrink(M) #eigenvalues
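A minimal base-R sketch of two ways to get eigenvectors, assuming the vectors wanted are those of the symmetric matrix M itself (whether they are the right companions for the values returned by linshrink is an assumption here, not something taken from the nlshrink documentation). The first uses eigen(); the second follows the (M − λI)x = 0 idea, but takes a null-space vector from the SVD instead of calling solve(), which can only return the trivial zero solution.

```r
M <- matrix(1:16, 4)
M[lower.tri(M)] <- t(M)[lower.tri(M)]   # make M symmetric
M <- M / 16.1

# 1) Direct: eigen() returns values and vectors together
e <- eigen(M, symmetric = TRUE)
vecs <- e$vectors                        # one eigenvector per column
resid_all <- max(abs(M %*% vecs - vecs %*% diag(e$values)))

# 2) From the definition: a nonzero x with (M - lambda I) x = 0 is the
#    right singular vector belonging to the smallest singular value
lambda <- e$values[1]
A <- M - lambda * diag(nrow(M))
v <- svd(A)$v[, nrow(M)]
resid_one <- max(abs(M %*% v - lambda * v))
```

Both residuals should be at the level of rounding error. Eigenvectors are only defined up to sign, so v may be the negative of vecs[, 1].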
I want to run rpca on a data frame df:
rpca(df, k = 1000, center = FALSE, scale = TRUE, retx = TRUE, p = 10,
     q = 2, rand = TRUE)
I got the error message: "cannot allocate vector of size 500mb"
So I created a sparse matrix m and repeated the rpca (not knowing that this would be larger than the original df) so I got the error message
Error: cannot allocate vector of size 1.5 Gb
Is there a way to transform this sparse matrix to a sparse dataframe?
Note: I know there are entries here on converting a sparse matrix to a data frame, but since the data frame was already too large I am looking for a sparse data frame, which I couldn't find anything on. Any help is highly appreciated. (Please bear in mind I am new to ML and R.)
I have a numeric vector of input values (starting values for the optim par argument):
my.data.var <- c(10,0.25,0.25,0.25,0.25,0.25,
10,0.25,0.25,0.25,0.25,0.25,
10,0.25,0.25,0.25,0.25,0.25,
10,0.25,0.25,0.25,0.25,0.25)
The optimization problem is a minimization problem.
The error function computes the root-mean-square difference between two matrices (the given values matrix and the calculated matrix), averaged over the columns.
The calculated matrix is the one that uses the vector above.
Hence, in the error function, I stack the vector into a matrix as
my.data.var.mat <- matrix(my.data.var,nrow = 4,ncol = 6,byrow = TRUE)
The constraint that I must introduce is colSums(my.data.var.mat) <= 1
The optim is defined as
sols <- optim(my.data.var, Error.func, method = "L-BFGS-B",
              upper = rep(c(Inf, 1, 1, 1, 1, 1), 4),
              lower = rep(0, 24))
Error Function is defined as
Error.func <- function(my.data.var){
  # Stack the parameter vector row by row into a matrix
  my.data.var.mat <- matrix(my.data.var, nrow = ncol(my.data.matrix.prod),
                            ncol = ncol(my.data.matrix.inj) + 1, byrow = TRUE)
  Calc.Qjk.Value <- Qjk.Cal.func(my.data.timet0, my.data.qo, my.data.matrix.time,
                                 my.data.matrix.inj, my.data.matrix.prod,
                                 my.data.var, my.data.var.mat)
  # Difference between the original matrix and the calculated matrix
  diff.values <- my.data.matrix.prod - Calc.Qjk.Value
  # Root-mean-square error per column
  Error <- ((colSums((diff.values^2), na.rm = FALSE, dims = 1)) / nrow(my.data.matrix.inj))^0.5
  # Total average error over all columns
  Error_total <- sum(Error, na.rm = FALSE) / ncol(my.data.matrix.prod)
  Error_total
}
Given Dataset: my.data.matrix.prod , my.data.timet0, my.data.qo, my.data.matrix.time, my.data.matrix.inj
So, my question is: how and where should I introduce the matrix column-sum constraint? Or, to put it another way, how can optim vary the parameter vector subject to the matrix column-sum constraint?
I realized that nloptr is a better option than optim since my problem consisted of "inequality constraints".
I modified the implementation as I explain in this post here. "multiple inequality constraints" - Minimization with R nloptr package
Hence, closing this thread.
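For readers without nloptr, a column-sum constraint of this kind is linear, so base R's constrOptim can also express it. The sketch below is not the nloptr solution referred to above. It uses a hypothetical toy objective (obj, with a made-up target vector) in place of Error.func, and it assumes the constraint applies only to columns 2 to 6, since the first column starts at 10 and has an upper bound of Inf.

```r
nr <- 4; nc <- 6
n_par <- nr * nc
# Position of matrix element (r, j) in the byrow = TRUE parameter vector
idx <- function(r, j) (r - 1) * nc + j

# Hypothetical toy objective; the real problem would use Error.func
target <- rep(c(10, rep(0.5, nc - 1)), nr)
obj  <- function(th) sum((th - target)^2)
grad <- function(th) 2 * (th - target)

# constrOptim expects constraints in the form ui %*% th - ci >= 0
ui_lower <- diag(n_par)                      # th >= 0
ui_sum <- matrix(0, nc - 1, n_par)           # -colSum_j >= -1 for j = 2..nc
for (j in 2:nc) ui_sum[j - 1, idx(1:nr, j)] <- -1
ui <- rbind(ui_lower, ui_sum)
ci <- c(rep(0, n_par), rep(-1, nc - 1))

# The start must lie strictly inside the feasible region
start <- rep(c(5, rep(0.2, nc - 1)), nr)
fit <- constrOptim(start, obj, grad, ui = ui, ci = ci)
fit_mat <- matrix(fit$par, nrow = nr, ncol = nc, byrow = TRUE)
col_sums <- colSums(fit_mat)
```

Note that the starting values in the question (0.25 in every row) put the column sums exactly at 1, which sits on the boundary; constrOptim needs a strictly interior starting point, hence the 0.2 used here.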
I have to create a model which is a mixture of a normal and a log-normal distribution. To create it, I need to estimate the 2 covariance matrices and the mixing parameter (7 parameters in total) by maximizing the log-likelihood function. This maximization has to be performed by the nlm routine.
As I use relative data, the means are known and equal to 1.
I've already tried to do it in 1 dimension (with 1 set of relative data) and it works well. However, when I introduce the 2nd set of relative data I get illogical results for the correlation and a lot of warning messages (25 in total).
To estimate these parameters I first defined the log-likelihood function with the 2 commands dmvnorm and dlnorm.rplus. Then I assign starting values to the parameters and finally I use the nlm routine to estimate the parameters (see script below).
`P <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc", return.header=
FALSE );
V <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc", return.header=
FALSE );
p <- c(P); # transform matrix into a vector
v <- c(V);
p<- p[!is.na(p)] # removing NA values
v<- v[!is.na(v)]
p_rel <- p/mean(p) #Transforming the data to relative values
v_rel <- v/mean(v)
PV <- cbind(p_rel, v_rel) # create a matrix of vectors
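L <- function(par, p_rel, v_rel) {
  # Covariance matrices built from standard deviations and correlations
  sigN  <- matrix(c(par[1]^2, par[1]*par[2]*par[3],
                    par[1]*par[2]*par[3], par[2]^2), nrow = 2, ncol = 2)
  sigLN <- matrix(c(par[4]^2, par[4]*par[5]*par[6],
                    par[4]*par[5]*par[6], par[5]^2), nrow = 2, ncol = 2)
  # Negative log-likelihood of the normal / log-normal mixture
  return(-sum(log((1 - par[7]) * dmvnorm(PV, mean = c(1, 1), sigma = sigN) +
                  par[7] * dlnorm.rplus(PV, meanlog = c(1, 1), varlog = sigLN))))
}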
par.start<- c(0.74, 0.66 ,0.40, 1.4, 1.2, 0.4, 0.5) # starting values for the log-likelihood maximization
result<-nlm(L,par.start,v_rel=v_rel,p_rel=p_rel, hessian=TRUE, iterlim=200, check.analyticals= TRUE)
Warning messages:
1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
NaNs produced
2: In sqrt(2 * pi * det(varlog)) : NaNs produced
3: In nlm(L, par.start, p_rel = p_rel, v_rel = v_rel, hessian = TRUE) :
NA/Inf replaced by maximum positive value
4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
NaNs produced
... and so on, up to 25.
par.hat <- result$estimate
cat("sigN_p =", par.hat[1],"\n","sigN_v =", par.hat[2],"\n","rhoN =", par.hat[3],"\n","sigLN_p =", par.hat[4],"\n","sigLN_v =", par.hat[5],"\n","rhoLN =", par.hat[6],"\n","mixing parameter =", par.hat[7],"\n")
sigN_p = 0.5403361
sigN_v = 0.6667375
rhoN = 0.6260181
sigLN_p = 1.705626
sigLN_v = 1.592832
rhoLN = 0.9735974
mixing parameter = 0.8113369`
Does someone know what is wrong in my model, or what I should do to estimate these parameters in 2 dimensions?
Thank you very much for taking time to look at my questions.
Regards,
Gladys Hertzog
When I do this kind of optimization problem, I find that it's important to make sure that all the variables I'm optimizing over are constrained to plausible values. For example, standard deviation variables have to be positive, and from knowledge of the situation that I'm modelling I'll probably be able to put an upper bound on all my standard deviation variables as well. So if s is one of my standard deviation variables, and m is the maximum value that I want it to take, instead of working with s I'll solve for the variable z, which is related to s via
$s = m / (1 + e^{-z})$
In that formula, z is unconstrained, but s must lie between 0 and m. This is vital because optimization routines whose variables are not constrained to plausible values will often try completely implausible values while they're trying to bound the solution. Implausible values often cause problems with, e.g., precision, which then result in NaNs and the like. The general formula that I use for constraining a single variable x to lie between a and b is
$x = a + (b - a) / (1 + e^{-z})$
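A minimal sketch of this reparameterization in R, with nlm doing the unconstrained search. The helper names to_bounded and to_unbounded, the simulated data, and the upper bound of 5 are all assumptions made for illustration.

```r
# Map an unconstrained z to a value strictly between a and b, and back
to_bounded   <- function(z, a, b) a + (b - a) / (1 + exp(-z))
to_unbounded <- function(x, a, b) log((x - a) / (b - x))

# Toy problem: estimate the standard deviation of data with known mean 1
set.seed(1)
x <- rnorm(200, mean = 1, sd = 0.7)
s_max <- 5                                   # assumed plausible upper bound

negloglik <- function(z) {
  s <- to_bounded(z, 0, s_max)               # s always stays in (0, s_max)
  -sum(dnorm(x, mean = 1, sd = s, log = TRUE))
}

fit <- nlm(negloglik, to_unbounded(1, 0, s_max))
s_hat <- to_bounded(fit$estimate, 0, s_max)  # back on the original scale
```

Because nlm only ever sees z, it cannot propose a negative or absurdly large standard deviation, whatever step it takes.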
However, for your particular problem, where you're looking for covariance matrices, a more sophisticated approach is necessary than simply bounding all the individual variables. Covariance matrices must be positive semi-definite, so if you're simply optimizing the individual values in the matrix, the optimization will probably fail (producing NaNs) whenever a matrix which isn't positive definite is fed into the likelihood function. My guess is that this is what's causing your optimization to fail. To get round the problem, one approach is to solve for the Cholesky decomposition of the covariance matrix instead of the covariance matrix itself.
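One way the Cholesky idea could look for a 2 x 2 covariance matrix is sketched below. The function name chol_to_cov and the choice to exponentiate the diagonal are assumptions for illustration; other parameterizations work as well. The optimizer would then search over the three unconstrained numbers in theta rather than over two standard deviations and a correlation.

```r
# Build a 2 x 2 covariance matrix from three unconstrained parameters.
# theta[1], theta[3]: logs of the diagonal of the Cholesky factor
# theta[2]:           off-diagonal entry of the Cholesky factor
chol_to_cov <- function(theta) {
  Lc <- matrix(0, 2, 2)
  Lc[1, 1] <- exp(theta[1])      # exp() keeps the diagonal positive
  Lc[2, 1] <- theta[2]
  Lc[2, 2] <- exp(theta[3])
  Lc %*% t(Lc)                   # positive definite by construction
}

theta <- c(0.3, -1.2, -0.5)      # arbitrary values, any reals are allowed
S <- chol_to_cov(theta)

# Recover the quantities of interest after the optimization
sds <- sqrt(diag(S))             # standard deviations
rho <- cov2cor(S)[1, 2]          # correlation, always strictly inside (-1, 1)
```

Inside the likelihood, sigma and varlog would each be built with chol_to_cov from their own three entries of the parameter vector, so neither matrix can become indefinite during the search.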
I did a sparse qr decomposition with "Matrix" package in R like
library(Matrix)
a <- Matrix(runif(20), nrow = 5, sparse = TRUE)
a[3:5, ] <- 0  # now a is a 5 x 4 matrix
b <- qr.R(qr(a), complete = TRUE)  # but now b is a 7 x 4 matrix!
Does anyone know why? Note that if I keep a dense, the bug(?) does not appear.
I'll assume you did not see the warning, otherwise you would have mentioned it, right?
Warning message:
In qr.R(qr(a), complete = T) :
qr.R(< sparse >) may differ from qr.R(< dense >) because of permutations
Now if you are asking what those permutations mean, it's a different story...
The help("sparseQR-class") page may have more info on the issue:
However, because the matrix Q is not uniquely defined, the results of qr.qy and qr.qty do not necessarily match those from the corresponding dense matrix calculations.
Maybe it is the same with qr.R?
Finally, further down on the same help page:
qr.R --- signature(qr = "sparseQR"): compute the upper triangular R matrix of the QR decomposition. Note that this currently warns because of possible permutation mismatch with the classical qr.R() result, and you can suppress these warnings by setting options() either "Matrix.quiet.qr.R" or (the more general) either "Matrix.quiet" to TRUE.