how does R choose eigenvectors? - r

When given a matrix with repeated eigenvalues, but non-defective, how does the R function eigen choose a basis for the eigenspace? Eg if I call eigen on the identity matrix, it gives me the standard basis. How did it choose that basis over any other orthonormal basis?

Still not a full answer, but digging a little deeper: the source code of eigen shows that for real, symmetric matrices it calls .Internal(La_rs(x, only.values))
The La_rs function is found here, and going through the code shows that it calls the LAPACK function dsyevr
The dsyevr function is documented here:
DSYEVR first reduces the matrix A to tridiagonal form T with a call
to DSYTRD. Then, whenever possible, DSYEVR calls DSTEMR to compute
the eigenspectrum using Relatively Robust Representations. DSTEMR
computes eigenvalues by the dqds algorithm, while orthogonal
eigenvectors are computed from various "good" L D L^T representations
(also known as Relatively Robust Representations).
The comments provide this link that gives more expository detail:
The next task is to compute an eigenvector for $\lambda - s$. For each $\hat{\lambda}$ the algorithm computes, with care, an optimal twisted factorization
...
obtained by implementing triangular factorization both from top down and bottom up and joining them at a well chosen index r ...
[emphasis added]. The emphasized words suggest that there are some devils in the details; if you want to go further down the rabbit hole, it looks like the internal dlarrv function is where the eigenvectors actually get calculated ...
For more details, see DSTEMR's documentation and:
Inderjit S. Dhillon and Beresford N. Parlett: "Multiple representations
to compute orthogonal eigenvectors of symmetric tridiagonal matrices,"
Linear Algebra and its Applications, 387(1), pp. 1-28, August 2004.
Inderjit Dhillon and Beresford Parlett: "Orthogonal Eigenvectors and
Relative Gaps," SIAM Journal on Matrix Analysis and Applications, Vol. 25, 2004. Also LAPACK Working Note 154.
Inderjit Dhillon: "A new O(n^2) algorithm for the symmetric
tridiagonal eigenvalue/eigenvector problem",
Computer Science Division Technical Report No. UCB/CSD-97-971,
UC Berkeley, May 1997.

It probably uses some algorithm written in FORTRAN a long time ago.
I suspect there is a procedure which is performed on the matrix to adjust it into a form from which eigenvalues and eigenvectors can be easily determined. I also suspect that this procedure won't need to do anything to an identity matrix to get it into the required form and so the eigenvalues and eigenvectors are just read off immediately.
In the general case of degenerate eigenvalues the answers you get will depend on the details of this algorithm. I doubt there is any choice being made - it's just whatever it spits out first.

Related

Homogeneous eigenvalue sampling of a sparse unitary matrix

I work with Julia, but I think the question is more general. Suppose that one wants to find the spectrum of a very large (sparse) unitary matrix U numerically. As is reported in many entries, diagonalizing by brute force using eigs ends without eigenvalue convergence.
The trick would be then to work with simpler expressions, i.e. with
U_Re = real(U + U')*0.5
U_Im = real((U - U')*-0.5im)
My question is, is there a way to obtain a uniform sampling in finding the eigenvalues? That is, I would like to obtain, say 10e3 eigenvalues for U_Re and U_Im in the interval [-1,1].
I am not entirely sure how uniform sampling of the eigenvalues would work, but I think you are looking for ARPACK. ARPACK would use matrix-vector products to find your eigenvalues, so I am not entirely sure if the Real/Im decomposition is required in this case (hard to say without knowing a lot about the U).
Also, you might want to look at FEAST algorithm, which would benefit a lot from the given search contour.
I am not aware of the existing linking of Julia to those libraries, but I don't think it is a problem since Julia can call C functions.
Here, I gave some brief ideas, and Computational Science might be a better place to find the right crowd. However, a lot more details about U, its sparsity, size, and what does "uniform sampling of eigenvalues in the interval" means would be required.

LAPACK's `dtrcon` underlying algorithm

I am currently trying to reconstruct some of the function of R's kappa condition number estimation function, which estimates the condition number of a matrix X by:
Working out the QR decomposition of X.
Calling to LAPACK's dtrcon or LINPACK's dtrco (depending on what the underlying dependencies on the users system are), and calculating the condition number of R, the upper triangular matrix which should have the same condition number as X (see here).
I have been trying to understand what the LAPACK and LINPACK algorithms do as it may be extremely useful for my own coding.
I have managed to find the algorithm that LINPACK uses, which was described here, but have had no luck finding the origin of LAPACK's algorithm. The comments in R's kappa function suggest that these are using different algorithms (see here) but I am unsure...
Long story short, my question is:
Does anyone know if LAPACK's dtrcon and LINPACK's dtrco are using the same algorithm and if not, what algorithm is LAPACK's dtrcon using?
Thank you in advance!

8 point algorithm for estimating Fundamental Matrix

I'm watching a lecture about estimating the fundamental matrix for use in stereo vision using the 8 point algorithm. I understand that once we recover the fundamental matrix between two cameras we can compute the epipolar line on one camera given a point on the other. To my understanding this epipolar line (after it's been rectified) makes it easy to find feature correspondences, because we are simply matching features along a 1D line.
The confusion comes from the fact that 8-point algorithm itself requires at least 8 feature correspondences to estimate the Fundamental Matrix.
So, we are finding point correspondences to recover a matrix that is used to find point correspondences?
This seems like a chicken-egg paradox so I guess I'm misunderstanding something.
The fundamental matrix can be precomputed. This leads to two advantages:
You can use a nice environment in which features can be matched easily (like using a chessboard) to compute the fundamental matrix.
You can use more computationally expensive operations like a sequence of SIFT, FLANN and RANSAC across the entire image since you only need to do that once.
After getting the fundamental matrix, you can find correspondences in a noisy environment more efficiently than using the same method when you compute the fundamental matrix.

Warm starting symmetric eigenvalue computation?

Do any standard (LAPACK / ARPACK / etc) implementations of the symmetric eigenvalue problem allow "warm starting"? That is, can they be accelerated if I already have a pretty good guess for the eigenvalues and eigenvectors of my matrix.
With Rayleigh quotient iteration or power iteration, this should be pretty obvious, but I don't see how to do this with standard eigensolver software. I'd prefer not to write my own eigensolver.
What you need is an iterative eigenvalue solve algorithm.
LAPACK uses a direct eigensolver and having an estimation of eigenvectors is of no use. There is a QR iterative refinement in its routines. However It requires Hessenberg matrices. I do not think you could use these routines.
You could use ARPACK library, specify the starting vector a set info argument equal to one.
Also I suggest to reconsider writing your own QR solver. It is very simple.
A basic QR implementation using lapack could be:
Initialize Q, A
repeat
QR = A (dgeqrf)
A = RQ (dormqr)
until convergence (dnrm2)

In what situation would a taylor series for a polynomial be necessary?

I'm having a hard time understanding why it would be useful to use the Taylor series for a function in order to gain an approximation of a function, instead of just using the function itself when programming. If I can tell my computer to compute e^(.1) and it will give me an exact value, why would I take an approximation instead?
Taylor series are generally not used to approximate functions. Usually, some form of minimax polynomial is used.
Taylor series converge slowly (it takes many terms to get the accuracy desired) and are inefficient (they are more accurate near the point around which they are centered and less accurate away from it). The largest use of Taylor series is likely in mathematics classes and papers, where they are useful for examining the properties of functions and for learning about calculus.
To approximate functions, minimax polynomials are often used. A minimax polynomial has the minimum possible maximum error for a particular situation (interval over which a function is to be approximated, degree available for the polynomial). There is usually no analytical solution to finding a minimax polynomial. They are found numerically, using the Remez algorithm. Minimax polynomials can be tailored to suit particular needs, such as minimizing relative error or absolute error, approximating a function over a particular interval, and so on. Minimax polynomials need fewer terms than Taylor series to get acceptable results, and they “spread” the error over the interval instead of being better in the center and worse at the ends.
When you call the exp function to compute ex, you are likely using a minimax polynomial, because somebody has done the work for you and constructed a library routine that evaluates the polynomial. For the most part, the only arithmetic computer processors can do is addition, subtraction, multiplication, and division. So other functions have to be constructed from those operations. The first three give you polynomials, and polynomials are sufficient to approximate many functions, such as sine, cosine, logarithm, and exponentiation (with some additional operations of moving things into and out of the exponent field of floating-point values). Division adds rational functions, which is useful for functions like arctangent.
For two reasons. First and foremost - most processors do not have hardware implementations of complex operations like exponentials, logarithms, etc... In such cases the programming language may provide a library function for computing those - in other words, someone used a taylor series or other approximation for you.
Second, you may have a function that not even the language supports.
I recently wanted to use lookup tables with interpolation to get an angle and then compute the sin() and cos() of that angle. Trouble is that it's a DSP with no floating point and no trigonometric functions so those two functions are really slow (software implementation). Instead I put sin(x) in the table instead of x and then used the taylor series for y=sqrt(1-x*x) to compute the cos(x) from that. This taylor series is accurate over the range I needed with only 5 terms (denominators are all powers of two!) and can be implemented in fixed point using plain C and generates code that is faster than any other approach I could think of.

Resources