How to compute the inverse of a close-to-singular matrix in R?

I want to minimize the function FLogV (working with a multivariate normal distribution; Z is an N x C data matrix, SIGMA is a C x C variance-covariance matrix of the data, and R is a vector of length C):
FLogV <- function(P){
  # (here I define the parameters P, within R and SIGMA)
  logC <- (C/2)*N*log(2*pi) + (1/2)*N*log(det(SIGMA))
  SOMA.t <- 0
  for (j in 1:N){
    SOMA.t <- SOMA.t + sum(t(Z[j,]-R) %*% solve(SIGMA) %*% (Z[j,]-R))
  }
  MlogV <- logC + (1/2)*SOMA.t
  return(MlogV)
}
minLogV <- optim(P, FLogV)
All this is part of a larger program that has already been tested and works well, except for the most important part: I can't run the optimization because I get this error:
“Error in solve.default(SIGMA) :
system is computationally singular: reciprocal condition number = 3.57726e-55”
If I use ginv() or pseudoinverse() or qr.solve() I get:
“Error in svd(X) : infinite or missing values in 'x'”
The thing is: if I capture the SIGMA matrix after the error message, I can run solve(SIGMA) on it, the eigenvalues are all positive, and the determinant is very small but positive:
det(SIGMA)
[1] 3.384674e-76
eigen(SIGMA)$values
[1] 0.066490265 0.024034173 0.018738777 0.015718562 0.013568884 0.013086845
….
[31] 0.002414433 0.002061556 0.001795105 0.001607811
I have already read several papers about matrices like SIGMA (close to singular) and tried several transformations of the data's scale and form, but I found that for a 34x34 matrix like this one, once det(SIGMA) gets close to 1e-40, R treats it as 0 and the calculation fails. I also can't reduce the matrix dimensions, and I can't add correction algorithms for near-singular matrices inside my function because R can't evaluate them while working with optimization functions like optim. I really appreciate any suggestion for this problem.
Thanks in advance,
Maria D.

It isn't clear from your post whether the failure is coming from det() or solve().
If it's just the solve() in the quadratic term, you may want to try the two-argument version of solve(); it can be a bit more stable. solve(X, Y) is the same as solve(X) %*% Y.
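For instance, the loop in the question could use the two-argument form directly (a minimal sketch reusing the question's Z, R, SIGMA and N; solve(SIGMA, d) solves the linear system rather than forming the explicit inverse):
SOMA.t <- 0
for (j in 1:N) {
  d <- Z[j, ] - R                               # deviation of observation j from R
  SOMA.t <- SOMA.t + sum(d * solve(SIGMA, d))   # d' SIGMA^{-1} d without computing solve(SIGMA)
}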
If you can factor SIGMA using chol(), you get an upper-triangular matrix L such that t(L) %*% L = SIGMA. The log-determinant is then 2*sum(log(diag(L))) (which avoids the underflowing det(SIGMA) entirely), and you might try this for the quadratic term:
crossprod(backsolve(L, Z[j,] - R, transpose = TRUE))
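Putting both ideas together, a sketch of the whole objective built on a single Cholesky factorization per call might look like this (assuming the question's Z, R, SIGMA, N and C are in scope):
FLogV.chol <- function(P){
  # (define P, R and SIGMA here as in the original function)
  L <- chol(SIGMA)                 # upper triangular, t(L) %*% L == SIGMA
  logdet <- 2 * sum(log(diag(L)))  # log(det(SIGMA)) without underflowing to 0
  logC <- (C/2)*N*log(2*pi) + (1/2)*N*logdet
  SOMA.t <- 0
  for (j in 1:N){
    y <- backsolve(L, Z[j,] - R, transpose = TRUE)  # solves t(L) %*% y = Z[j,] - R
    SOMA.t <- SOMA.t + sum(y^2)                     # equals (Z[j,]-R)' SIGMA^{-1} (Z[j,]-R)
  }
  logC + (1/2)*SOMA.t
}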

Related

Calculate the reconstruction error as the difference between the original and the reconstructed matrix

I am currently in an online class in genomics, coming in as a wetlab physician, so my statistical knowledge is not the best. Right now we are working on PCA and SVD in R. I got a big matrix:
head(mat)
ALL_GSM330151.CEL ALL_GSM330153.CEL ALL_GSM330154.CEL ALL_GSM330157.CEL ALL_GSM330171.CEL ALL_GSM330174.CEL ALL_GSM330178.CEL ALL_GSM330182.CEL
ENSG00000224137 5.326553 3.512053 3.455480 3.472999 3.639132 3.391880 3.282522 3.682531
ENSG00000153253 6.436815 9.563955 7.186604 2.946697 6.949510 9.095092 3.795587 11.987291
ENSG00000096006 6.943404 8.840839 4.600026 4.735104 4.183136 3.049792 9.736803 3.338362
ENSG00000229807 3.322499 3.263655 3.406379 9.525888 3.595898 9.281170 8.946498 3.473750
ENSG00000138772 7.195113 8.741458 6.109578 5.631912 5.224844 3.260912 8.889246 3.052587
ENSG00000169575 7.853829 10.428492 10.512497 13.041571 10.836815 11.964498 10.786381 11.953912
Those are just the first few columns and rows; the full matrix has 60 columns and 1000 rows. Columns are cancer samples, rows are genes.
The task is:
remove the eigenvectors and reconstruct the matrix using SVD, then calculate the reconstruction error as the difference between the original and the reconstructed matrix. HINT: You have to use the svd() function and set the eigenvalue to 0 for the component you want to remove.
I have been all over google, but can't find a way to solve this task, which might be because I don't really get the question itself.
So I performed SVD on my matrix mat:
d <- svd(mat)
which gives me three components (eigenassays, eigenvalues and eigenvectors), which I can access using d$u and so on.
How do I set the eigenvalue to zero and ultimately calculate the error?
https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/svd
The decomposition expresses your matrix mat as a product of three matrices:
mat = d$u %*% diag(d$d) %*% t(d$v)
So first confirm that you are able to do the matrix multiplication and get back mat.
Once you are able to do this, set the last couple of elements of d$d to zero before doing the matrix multiplication.
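A minimal sketch of that idea, assuming mat is the matrix from the question and you want to drop the last component:
d <- svd(mat)                                  # mat == d$u %*% diag(d$d) %*% t(d$v)
d.mod <- d$d
d.mod[length(d.mod)] <- 0                      # set the last singular value to 0
mat.rec <- d$u %*% diag(d.mod) %*% t(d$v)      # reconstruction without that component
recon.error <- sqrt(sum((mat - mat.rec)^2))    # Frobenius norm of the difference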
It helps to create a function that handles the singular values.
Here, for instance, is one that zeros out any singular value that is too small compared to the largest singular value:
zap <- function(d, digits = 3) ifelse(d < 10^(-digits) * max(abs(d)), 0, d)
Although mathematically all singular values are guaranteed non-negative, numerical issues with floating point algorithms can--and do--create negative singular values, so I have prophylactically wrapped the singular values in a call to abs.
Apply this function to the diagonal matrix in the SVD of a matrix X and reconstruct the matrix by multiplying the components:
X. <- with(svd(X), u %*% diag(zap(d)) %*% t(v))
There are many ways to assess the reconstruction error. One is the Frobenius norm of the difference,
sqrt(sum((X - X.)^2))

MLE using nlminb in R - understand/debug certain errors

This is my first question here, so I will try to make it as well written as possible. Please bear with me should I make a silly mistake.
Briefly, I am trying to do a maximum likelihood estimation where I need to estimate 5 parameters. The general form of the problem I want to solve is as follows: A weighted average of three copulas, each with one parameter to be estimated, where the weights are nonnegative and sum to 1 and also need to be estimated.
There are packages in R for doing MLE on single copulas or on a weighted average of copulas with fixed weights. However, to the best of my knowledge, no packages exist to directly solve the problem I outlined above. Therefore I am trying to code the problem myself. There is one particular type of error I am having trouble tracing to its source. Below I have tried to give a minimal reproducible example where only one parameter needs to be estimated.
library(copula)
set.seed(150)
x <- rCopula(100, claytonCopula(250))

# Copula density
clayton_density <- function(x, theta){
  dCopula(x, claytonCopula(theta))
}

# Negative log-likelihood function
nll.clayton <- function(theta){
  theta_trans <- -1 + exp(theta) # admissible theta values for Clayton copula
  nll <- -sum(log(clayton_density(x, theta_trans)))
  return(nll)
}

# Initial guess for optimization
guess <- function(x){
  init <- rep(NA, 1)
  tau.n <- cor(x[,1], x[,2], method = "kendall")
  # Guess using method of moments
  itau <- iTau(claytonCopula(), tau = tau.n)
  # In case itau is negative, we need a conditional statement
  # Use log because it is (almost) inverse of theta transformation above
  if (itau <= 0) {
    init[1] <- log(0.1) # Ensures positive initial guess
  } else {
    init[1] <- log(itau)
  }
}

estimate <- nlminb(guess(x), nll.clayton)
(parameter <- -1 + exp(estimate$par)) # Retrieve estimated parameter
fitCopula(claytonCopula(), x) # Compare with fitCopula function
This works great when simulating data with small values of the copula parameter, and gives almost exactly the same answer as fitCopula() every time.
For large values of the copula parameter, such as 250, the following error shows up when I run the line with nlminb():
Error in .local(u, copula, log, ...) : parameter is NA
Called from: .local(u, copula, log, ...)
Error during wrapup: unimplemented type (29) in 'eval'
When I run fitCopula(), the optimization finishes, but this message pops up:
Warning message:
In dlogcdtheta(copula, u) :
  dlogcdtheta() returned NaN in column(s) 1 for this explicit copula; falling back to numeric derivative for those columns
I have been able to find out using debug() that somewhere in the optimization process of nlminb(), the parameter of interest is assigned the value NaN, which then yields this error when dCopula() is called. However, I do not know at which iteration it happens, nor what nlminb() is doing when it happens. I suspect that at some iteration the objective function is evaluated at Inf/-Inf, but I do not know what nlminb() does next. Something similar seems to happen with fitCopula(), but there the optimization is still carried out to the end, only with the above-mentioned warning.
I would really appreciate any help in understanding what is going on, how I might debug it myself and/or how I can deal with the problem. As might be evident from the question, I do not have a strong background in coding. Thank you so much in advance to anyone that takes the time to consider this problem.
Update:
When I run dCopula(x, claytonCopula(-1+exp(guess(x)))) or, equivalently, clayton_density(x, -1+exp(guess(x))), it becomes apparent that the density evaluates to 0 at several data points. Unfortunately, creating pseudo-observations with x <- pobs(x) does not solve the problem, which can be seen by repeating dCopula(x, claytonCopula(-1+exp(guess(x)))). The result is that when applying the logarithm we get several -Inf evaluations, which of course implies that the whole negative log-likelihood evaluates to Inf, as can be seen by running nll.clayton(guess(x)). Hence, in addition to the above queries, any tips on handling log(0) when doing MLE numerically are welcome and appreciated.
Second update
Editing the second line in nll.clayton as follows seems to work okay:
nll <- -sum(log(clayton_density(x, theta_trans) + 1e-8))
However, I do not know if this is a "good" way to circumvent the problem, in the sense that it does not introduce potential for large errors (though it would surprise me if it did).
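One alternative worth trying (an untested sketch of my own, not from the original post): dCopula() accepts a log argument, so you can work on the log scale directly and replace any non-finite log-densities with a large penalty instead of adding a constant to the density:
nll.clayton2 <- function(theta){
  theta_trans <- -1 + exp(theta)
  ll <- dCopula(x, claytonCopula(theta_trans), log = TRUE)  # per-observation log-density
  ll[!is.finite(ll)] <- -1e10   # penalize points where the density underflows to 0
  -sum(ll)
}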

What is R algorithm for dummy coding model matrix?

I noticed that when using dummy coding to fit my linear models, R excludes certain parameters when forming the model matrix. What algorithm does R use for this?
It is not well documented, but it goes back to the column-pivoting strategy used by the underlying (LINPACK-derived) QR code. From the source code of lm.fit:
z <- .Call(C_Cdqrls, x, y, tol, FALSE)
...
coef <- z$coefficients
pivot <- z$pivot
...
r2 <- if(z$rank < p) (z$rank+1L):p else integer()
if (is.matrix(y)) {
    ....
} else {
    coef[r2] <- NA
    ## avoid copy
    if(z$pivoted) coef[pivot] <- coef
    ...
}
If you want to dig back further, you need to look into dqrdc2.f, which says (for what it's worth):
c dqrdc2 uses householder transformations to compute the qr
c factorization of an n by p matrix x. a limited column
c pivoting strategy based on the 2-norms of the reduced columns
c moves columns with near-zero norm to the right-hand edge of
c the x matrix. this strategy means that sequential one
c degree-of-freedom effects can be computed in a natural way.
In practice I have generally found that R eliminates the last (rightmost) column of a set of collinear predictor variables ...
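As a quick illustration of that behaviour (a hypothetical example of my own, not from the answer), a deliberately collinear predictor entered last is the one that gets dropped:
set.seed(1)
x1 <- rnorm(20)
x2 <- rnorm(20)
x3 <- x1 + x2                    # exactly collinear with x1 and x2
y  <- x1 + 2*x2 + rnorm(20)
coef(lm(y ~ x1 + x2 + x3))
# the coefficient for x3 (the rightmost collinear column) comes back as NA,
# i.e. it was pivoted out of the fit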

I want to maximize returns on a portfolio ensuring risk is below a certain level. Which function can I use for optimization?

Objective function to be maximized: pos %*% mu, where pos is the row vector of weights and mu is the column vector of mean returns of the d stocks.
Constraints: 1) pos %*% t(ones) = 1, where ones is a 1*d row vector of 1's (d is the number of stocks), i.e. the weights sum to 1.
2) pos %*% cov %*% t(pos) = rb^2, where cov is the d*d covariance matrix and rb is the risk budget, the free parameter whose value will be varied to draw the efficient frontier.
I want to write code for this optimization problem in R but I can't think of any function or library that would help.
PS: solve.QP in the quadprog library has been used to minimize covariance subject to a target return. Can this function also be used to maximize return subject to a risk budget? How should I specify the Dmat matrix and dvec vector for this problem?
EDIT :
library(quadprog)
mu <- matrix(c(0.01,0.02,0.03),3,1)
cov # predefined covariance matrix of size 3*3
pos <- matrix(c(1/3,1/3,1/3),1,3) # random weights vector
edr <- pos%*%mu # expected daily return on portfolio
m1 <- matrix(1,1,3) # constraint no.1 ( sum of weights = 1 )
m2 <- pos%*%cov # constraint no.2
Amat <- rbind(m1,m2)
bvec <- matrix(c(1,0.1),2,1)
solve.QP(Dmat= ,dvec= ,Amat=Amat,bvec=bvec,meq=2)
How should I specify Dmat and dvec? I want to optimize over pos.
Also, I think I have not specified constraint no. 2 correctly; it should make the variance of the portfolio equal to the risk budget.
(Disclaimer: There may be a better way to do this in R. I am by no means an expert in anything related to R, and I'm making a few assumptions about how R is doing things, notably that you're using an interior-point method. Also, there is likely an R package for what you're trying to do, but I don't know what it is or how to use it.)
Minimising risk subject to a target return is a linearly-constrained problem with a quadratic objective, looking like this:
min x^T Q x
subject to sum x_i = 1
sum ret_i x_i >= target
(and x >= 0 if you want to be long-only).
Maximising return subject to a risk budget is quadratically-constrained, however; it looks like this:
max ret^T x
subject to sum x_i = 1
x^T Q x <= riskbudget
(and maybe x >= 0).
Convex quadratic terms in the objective impose less of a computational cost in an interior-point method compared to introducing a convex quadratic constraint. With a quadratic objective term, the Q matrix just shows up in the augmented system. With a convex quadratic constraint, you need to optimise over a more complicated cone containing a second-order cone factor and you need to be careful about how you solve the linear systems that arise.
I would suggest you use the risk-minimisation formulation repeatedly, doing a binary search on the target parameter until you've found a portfolio approximately maximising return subject to your risk budget. I am suggesting this approach because it is likely sufficient for your needs.
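A rough sketch of that bisection approach with quadprog (my own illustration, not part of the original answer; it assumes long-only weights, a mean vector mu and a positive-definite covariance matrix cov as in the question's EDIT):
library(quadprog)
# minimum-variance portfolio with expected return >= target, long-only, weights summing to 1
min_risk <- function(mu, cov, target){
  d <- length(mu)
  Amat <- cbind(rep(1, d), mu, diag(d))     # columns: sum-to-one, return floor, w_i >= 0
  bvec <- c(1, target, rep(0, d))
  sol <- solve.QP(Dmat = 2 * cov, dvec = rep(0, d),
                  Amat = Amat, bvec = bvec, meq = 1)
  list(w = sol$solution,
       risk = sqrt(drop(t(sol$solution) %*% cov %*% sol$solution)))
}
# bisect on the target return until the portfolio risk matches the budget rb
max_return_given_risk <- function(mu, cov, rb, iters = 50){
  lo <- min(mu); hi <- max(mu)
  for (i in 1:iters){
    mid <- (lo + hi) / 2
    p <- min_risk(mu, cov, mid)
    if (p$risk <= rb) lo <- mid else hi <- mid  # risk is nondecreasing in the target return
  }
  min_risk(mu, cov, lo)
}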
If you really want to solve your problem directly, I would suggest using an interface to Todd, Toh, and Tutuncu's SDPT3. This really is overkill; SDPT3 lets you formulate and solve symmetric cone programs of your choosing. I would also note that portfolio optimisation problems are rather special cases of symmetric cone programs; other approaches exist that are reportedly very successful. Unfortunately, I'm not studied up on them.

Inaccuracies w/ prcomp? R lang PCA for eigenfaces

My question is: in the case of having a matrix we want to do PCA on, where the number of features greatly outnumbers the number of trials, why doesn't prcomp behave as expected (or am I missing something)?
Below is a summary of the issue, full code is here, compressed 7MB data source is here (is 55MB uncompressed), target image is here.
My exact situation is that I have a p by n matrix X (p = features, n = trials), where the trials are photos taken of faces and the features are the pixels in the photos (so a 32256 by 148 matrix). What I want to do is find the principal component score vectors of that matrix. Since finding the covariance matrix XX^T is too expensive, an easy solution is to find the eigenvectors v_i of X^TX and transform them by X (i.e. Xv_i) (more info).
XTX <- t(X) %*% X # missing the 1/(n - 1) for the cov matrix b/c we normalize later anyway
eigen <- eigen(XTX)
eigenvectors.XTX.col <- eigen$vectors
principal.component.scores <- apply(eigenvectors.XTX.col, 2, function(c) {
  normalize.vector(X %*% matrix(c, ncol = 1))
})
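(normalize.vector is defined in the full code linked above; for this excerpt it can be assumed to scale a vector to unit length, e.g. normalize.vector <- function(v) v / sqrt(sum(v^2)).)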
The principal component scores are eigenfaces in my case, and can be used to successfully reconstruct the target face as seen here: http://cl.ly/image/260w0N0u0Z3y (refer to my full code for how)
Passing X to prcomp should do something equivalent, but gives a different result than the homegrown approach above:
pca <- prcomp(X)
pca$x # right size, but wrong pc scores
The result of using pca$x in reconstructing the face is not total crap, but much worse: http://cl.ly/image/2p19360u2P43
I also checked that using prcomp on t(X) yielded a different rotation matrix, so prcomp is doing something fancy, but something mysterious, under the hood. I know from here that prcomp uses SVD to calculate the principal component loading vectors instead of an eigendecomposition, but that should not be leading to any errors here (or so I think...).
What is the correct way of using the built in prcomp method, there must be a way, right?
Wow, the answer is not a fun one at all, and rather has to do with default parameters in the prcomp method:
To solve this issue, first I looked at the R source of prcomp and saw that the rotation matrix should equal svd(X)$v. Checking this on the R command line proved that, with my X (data here), it did not. This is because even though prcomp defaults to scale. = FALSE, it still runs R's scale() function, if only to center the matrix, since center defaults to TRUE (as seen here). In my case this is bad because I passed in data that was already centered (I had subtracted the mean image).
So, rerunning with prcomp(X, center = FALSE) yields a rotation matrix equal to svd(X)$v, as expected. From this point forward, the only "mistake" prcomp makes when constructing prcomp(X, center = FALSE)$x is that it does not normalize the columns, so each one is off only by a scalar multiple from the principal.component.scores matrix I reference above in my code. Without normalizing prcomp(X, center = FALSE)$x the results are better, but still not quite right, as seen here:
http://cl.ly/image/3u2y3m1h2S0o
But after normalizing via pca.x.norm <- apply(pca$x, 2, normalize.vector), the result of prcomp in reconstructing the face is identical:
http://cl.ly/image/24390O3x0A0x
tl;dr - prcomp unexpectedly centers the data even with scale. = FALSE; in addition, for the purposes of eigenfaces you will need to normalize the columns of prcomp(X, center = FALSE)$x, and then everything will work as desired!
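In code, the fix amounts to the following (a short recap of what the answer describes, reusing the normalize.vector helper assumed earlier):
pca <- prcomp(X, center = FALSE)                 # skip prcomp's internal centering
pca.x.norm <- apply(pca$x, 2, normalize.vector)  # unit-normalize each score column
# pca.x.norm now matches the homegrown principal.component.scores (up to sign flips),
# and the face reconstruction is identical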
