Computing eigenvectors given shrinkage eigenvalues - r

I used the function linshrink of the nlshrink package to have a shrinkage estimation of the eigenvalues of a symmetric matrix M. Unfortunately the function does not return the eigenvectors, which I also need. How can I manually compute them? I thought about applying the definition and use (M − λI)x = 0 for every eigenvalue λ, but I'm not sure how to properly do it, since computing the matrix A = M − λI and using it as an input in solve(A,b) with b=rep(0,nrow(M)) obviously returns a vector of zero. Can anybody help me? Here are a few lines to provide a working example:
library(nlshrink)
M <- matrix(1:16,4)
M[lower.tri(M)] = t(M)[lower.tri(M)]
M <- M/16.1
shrinkval <- linshrink(M) #eigenvalues

Related

R - Standardize matrix to have unit diagonals

I am seeking to generate the below matrix:
Θ = B + δIp ∈ Rp×p, where Ip is the identity matrix, each off-diagonal entry
in B (symmetric matrix) is generated independently and equals 0.5 with probability
0.1 or 0 with probability 0.9. The parameter δ > 0 is chosen such that Θ is positive definite. The matrix is standardized to have unit diagonals (transforming from covariance matrix to correlation matrix).
I think that I have most of the code, but I'm unsure of how to standardize the matrix to have unit diagonals syntactically in R (and theoretically, why that is a useful feature of a matrix).
# set number of cols/rows
p <- 5
set.seed(123)
# generate matrix B with values of 0.5 given probabilities
B <- matrix(sample(c(0,0.5), p^2, replace=TRUE, prob=c(0.9,0.1)), p)
# call the matrix lower triangle, need a symmetric matrix
i <- lower.tri(B)
B[i] <- t(B)[i]
diag(B) <- rep(0, p)
# finding parameter delta, such that Θ is positive definite.
(delta <- -min(eigen(B, symmetric=TRUE, only.values=TRUE)$values))
# set theta (delta is 2.8802)
theta <- B + 2.89*(diag(p))
# now to standardize the matrix to have unit diagonals ?
There are many ways to do this, but the following is very fast in timing experiments:
v <- 1/sqrt(diag(theta))
B <- theta * outer(v, v)
This divides all rows and columns by their standard deviations, which are the square roots of the diagonal elements.
It will fail whenever any diagonal is zero (or negative): but in that case such standardization isn't possible. Computing the square roots and their reciprocals first allows you to learn as soon as possible--with minimal computation--whether the procedure will succeed.
BTW, a direct way to compute B in the first part of your code (which has a zero diagonal) is
B <- as.matrix(structure(sample(c(0,1/2), p*(p-1)/2, replace=TRUE, prob=c(.9,.1),
Size=p, Diag=TRUE, class="dist"))
This eliminates the superfluous sampling.

Efficient multivariate normal inference

I'm trying to implement a modified version of a Kalman filter, in which I have a n-dimensional Normal prior on my vector of n hidden variables, and then a sequence of m independent data vectors which are distributed with different-but-known covariance matrices according to the hidden variables.
More formally:
The generalised solution for the posterior distribution of any given hidden variable conditional on all previous observed variables is:
So it is fairly "easy" to compute the final posterior distribution – all you have to do is apply the above transformations iteratively, starting with your prior and using the known covariance matrices and observed values at each step i. So a potential way to code this, assuming I have a list Sigma with the known covariance matrices, a matrix O with the observed value vectors, and the other variables stored in Lambda, Delta, and mu, is:
inv_Sigma = solve (Sigma [[1]])
inv_Lambda_plus_Delta = solve (Delta)
Delta_i = solve (inv_Sigma + inv_Lambda_plus_Delta)
mu_i = Delta_i %*% (inv_Sigma %*% O [, 1] + inv_Lambda_plus_Delta %*% mu)
if (n > 1) {
for (i in seq (2, n)) {
inv_Sigma = solve (Sigma [[i]])
inv_Lambda_plus_Delta = solve (Lambda + Delta_i)
Delta_i = solve (inv_Sigma + inv_Lambda_plus_Delta)
mu_i = Delta_i %*% (inv_Sigma %*% O [, i] + inv_Lambda_plus_Delta %*% mu_i)
}
}
However, as the dimension of the vectors grows, having to calculate these matrix inverses iteratively over and over and over gets increasingly computationally expensive, to the point where (for the problem I want to solve) it's prohibitive. Is there a more efficient way to do this calculation, either with some already-existing function or package or with a more efficient implementation of matrix inversion? Is a simulation-based solution better for my problem?

How do I minimize a linear least squares function in R?

I'm reading Deep Learning by Goodfellow et al. and am trying to implement gradient descent as shown in Section 4.5 Example: Linear Least Squares. This is page 92 in the hard copy of the book.
The algorithm can be viewed in detail at https://www.deeplearningbook.org/contents/numerical.html with R implementation of linear least squares on page 94.
I've tried implementing in R, and the algorithm as implemented converges on a vector, but this vector does not seem to minimize the least squares function as required. Adding epsilon to the vector in question frequently produces a "minimum" less than the minimum outputted by my program.
options(digits = 15)
dim_square = 2 ### set dimension of square matrix
# Generate random vector, random matrix, and
set.seed(1234)
A = matrix(nrow = dim_square, ncol = dim_square, byrow = T, rlnorm(dim_square ^ 2)/10)
b = rep(rnorm(1), dim_square)
# having fixed A & B, select X randomly
x = rnorm(dim_square) # vector length of dim_square--supposed to be arbitrary
f = function(x, A, b){
total_vector = A %*% x + b # this is the function that we want to minimize
total = 0.5 * sum(abs(total_vector) ^ 2) # L2 norm squared
return(total)
}
f(x,A,b)
# how close do we want to get?
epsilon = 0.1
delta = 0.01
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
steps = vector()
while(L2_norm > delta){
x = x - epsilon * value
value = (t(A) %*% A) %*% x - t(A) %*% b
L2_norm = (sum(abs(value) ^ 2)) ^ 0.5
print(L2_norm)
}
minimum = f(x, A, b)
minimum
minimum_minus = f(x - 0.5*epsilon, A, b)
minimum_minus # less than the minimum found by gradient descent! Why?
On page 94 of the pdf appearing at https://www.deeplearningbook.org/contents/numerical.html
I am trying to find the values of the vector x such that f(x) is minimized. However, as demonstrated by the minimum in my code, and minimum_minus, minimum is not the actual minimum, as it exceeds minimum minus.
Any idea what the problem might be?
Original Problem
Finding the value of x such that the quantity Ax - b is minimized is equivalent to finding the value of x such that Ax - b = 0, or x = (A^-1)*b. This is because the L2 norm is the euclidean norm, more commonly known as the distance formula. By definition, distance cannot be negative, making its minimum identically zero.
This algorithm, as implemented, actually comes quite close to estimating x. However, because of recursive subtraction and rounding one quickly runs into the problem of underflow, resulting in massive oscillation, below:
Value of L2 Norm as a function of step size
Above algorithm vs. solve function in R
Above we have the results of A %% x followed by A %% min_x, with x estimated by the implemented algorithm and min_x estimated by the solve function in R.
The problem of underflow, well known to those familiar with numerical analysis, is probably best tackled by the programmers of lower-level libraries best equipped to tackle it.
To summarize, the algorithm appears to work as implemented. Important to note, however, is that not every function will have a minimum (think of a straight line), and also be aware that this algorithm should only be able to find a local, as opposed to a global minimum.

Invert singular matrices in R

I am trying to grasp the basic concept of invertible and non-invertible matrices.
I created a random non-singular square matrix
S <- matrix(rnorm(100, 0, 1), ncol = 10, nrow = 10)
I know that this matrix is positive definite (thus invertible) because when I decompose the matrix S into its eigenvalues, their product is positive.
eig_S <- eigen(S)
eig_S$values
[1] 3.0883683+0.000000i -2.0577317+1.558181i -2.0577317-1.558181i 1.6884120+1.353997i 1.6884120-1.353997i
[6] -2.1295086+0.000000i 0.1805059+1.942696i 0.1805059-1.942696i -0.8874465+0.000000i 0.8528495+0.000000i
solve(S)
According to this paper, we can compute the inverse of a non-singular matrix by its SVD too.
Where
(where U and V are eigenvectors and D eigenvalues, please do correct me if I am wrong).
The inverse then is, .
Indeed, I can run the formula in R:
s <- svd(S)
s$v%*%solve(diag(s$d))%*%t(s$u)
Which produces exactly the same result as solve(S).
My first question is:
1) Are s$d indeed represent the eigenvalues of S? Because s$d and eig_S$values are quite different.
Now the second part,
If I create a singular matrix
I <- matrix(rnorm(100, 0, 1), ncol = 5, nrow = 20)
I <- I%*%t(I)
eig_I <- eigen(I)
eig_I$values
[1] 3.750029e+01 2.489995e+01 1.554184e+01 1.120580e+01 8.674039e+00 3.082593e-15 5.529794e-16 3.227684e-16
[9] 2.834454e-16 5.876634e-17 -1.139421e-18 -2.304783e-17 -6.636508e-17 -7.309336e-17 -1.744084e-16 -2.561197e-16
[17] -3.075499e-16 -4.150320e-16 -7.164553e-16 -3.727682e-15
The solve function will produce an error
solve(I)
system is computationally singular: reciprocal condition number =
1.61045e-19
So, again according to the same paper we can use the SVD
i <- svd(I)
solve(i$u %*% diag(i$d) %*% t(i$v))
which produces the same error.
Then I tried to use the Cholesky decomposition for matrix inversion
Conj(t(I))%*%solve(I%*%Conj(t(I)))
and again I get the same error.
Could someone please explain where am I using the equations wrong?
I know that for matrix I%*%Conj(t(I)), the determinant of the eigenvalue matrix is positive but the matrix is not a full rank due to the initial multiplication that I did.
j <- eigen(I%*%Conj(t(I)))
det(diag(j$values))
[1] 3.17708e-196
qr(I %*% Conj(t(I)))$rank
[1] 5
UPDATE 1: Following the comments bellow, and after going through the paper/Wikipedia page again. I used these two codes, which they produce some results but I am not sure about their validity. The first example seems more believable. The SVD solution
i$v%*%diag(1/i$d)%*%t(i$u)
and the Cholesky
Conj(t(I))%*%(I%*%Conj(t(I)))^(-1)
I am not sure if I interpreted the two sources correctly though.

A function for calculating the eigenvalues of a matrix in R

I want to write a function like eigen() to calculating eigenvalues and eigenvectors of an arbitary matrix. I wrote the following codes for calculation of eigenvalues and I need a function or method to solve the resulted linear equation.
eig <- function(x){
if(nrow(x)!=ncol(x)) stop("dimension error")
ff <- function(lambda){
for(i in 1:nrow(x)) x[i,i] <- x[i,i] - lambda
}
det(x)
}
I need to solve det(x)=0 that is a polynomial linear equation to find the values of lambda. Is there any way?
Here is one solution using uniroot.all:
library(rootSolve)
myeig <- function(mat){
myeig1 <- function(lambda) {
y = mat
diag(y) = diag(mat) - lambda
return(det(y))
}
myeig2 <- function(lambda){
sapply(lambda, myeig1)
}
uniroot.all(myeig2, c(-10, 10))
}
R > x <- matrix(rnorm(9), 3)
R > eigen(x)$values
[1] -1.77461906 -1.21589769 -0.01010515
R > myeig(x)
[1] -1.77462211 -1.21589767 -0.01009019
Computing determinant is such a bad idea as it is not numerically stable. You can easily get Inf etc even for a moderately big matrix. I suggest reading the following answers (read them otherwise you have no idea what my code is doing):
Are eigenvectors returned by R function eigen() wrong?
eigenvectors when A-lx is singular with no solution
then use either of the following
NullSpace(A - diag(lambda, nrow(A)))
nullspace(A - diag(lambda, nrow(A)))
The solution from #liuminzhao won't work if there is two repeated eigenvalues. The function will fail to find the roots, because the characteristic polynomial of the matrix will not change sign (it is zero and does not cross the zero line), which is what rootSolve::uniroot.all() is doing when looking for roots. So you need another way to find a local minima (like optim()). Moreover, it will failed to determine the number of repeated eigenvalues.
A better way is to find the characteristic equation with, which is easily done with pracma::charpoly() and then using polyroot().
par <- pracma::charpoly(M) # find parameters of the CP of matrix M
par <- par[length(par):1] # reverse order for polyroot()
roots <- Re(polyroot(par)) # keep real part of the polyroot()
The pracma::charpoly() is not too complicated in itself, see its source code, starting at line a1 <- a.

Resources