I'm implementing a metric learning algorithm and want to reduce the dimension of the data. I am using Java with the Jama library, and PCA for the dimensionality reduction.
When I use eig from the Jama library to get the eigenvalues, it takes a lot of time, even for a matrix of size 300 by 20. I need a Java implementation for computing eigenvalues and eigenvectors. For your information, I also tried other libraries such as Jblas, which has PCA, but its performance on eigenvalues and eigenvectors is really poor.
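Try the Apache Commons Math library. Look for the class EigenDecomposition in the package org.apache.commons.math3.linear. By the way, I think you can only find eigenvalues and eigenvectors of square matrices, so for PCA you would decompose the square covariance matrix of the data rather than the 300 by 20 data matrix itself.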
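To illustrate the point about square matrices, here is a minimal sketch in plain Python (not Java, and not any library's API) that builds the covariance matrix PCA would decompose. The helper name covariance_matrix and the random 300 by 20 data set are assumptions made up for the example.

```python
import random

def covariance_matrix(rows):
    """Return the d x d sample covariance matrix of n rows with d features each."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

random.seed(0)
# Hypothetical data: 300 samples with 20 features, as in the question.
data = [[random.gauss(0.0, 1.0) for _ in range(20)] for _ in range(300)]
cov = covariance_matrix(data)
print(len(cov), len(cov[0]))  # 20 20
```

The eigendecomposition then only has to handle a symmetric 20 by 20 matrix, which should be cheap for any of the libraries mentioned.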
When given a non-defective matrix with repeated eigenvalues, how does the R function eigen choose a basis for the eigenspace? E.g., if I call eigen on the identity matrix, it gives me the standard basis. How did it choose that basis over any other orthonormal basis?
Still not a full answer, but digging a little deeper: the source code of eigen shows that for real, symmetric matrices it calls .Internal(La_rs(x, only.values))
The La_rs function is found here, and going through the code shows that it calls the LAPACK function dsyevr
The dsyevr function is documented here:
DSYEVR first reduces the matrix A to tridiagonal form T with a call
to DSYTRD. Then, whenever possible, DSYEVR calls DSTEMR to compute
the eigenspectrum using Relatively Robust Representations. DSTEMR
computes eigenvalues by the dqds algorithm, while orthogonal
eigenvectors are computed from various "good" L D L^T representations
(also known as Relatively Robust Representations).
The comments provide this link that gives more expository detail:
The next task is to compute an eigenvector for $\lambda - s$. For each $\hat{\lambda}$ the algorithm computes, with care, an optimal twisted factorization
...
obtained by implementing triangular factorization both from top down and bottom up and joining them at a well chosen index r ...
[emphasis added]. The emphasized words suggest that there are some devils in the details; if you want to go further down the rabbit hole, it looks like the internal dlarrv function is where the eigenvectors actually get calculated ...
For more details, see DSTEMR's documentation and:
Inderjit S. Dhillon and Beresford N. Parlett: "Multiple representations
to compute orthogonal eigenvectors of symmetric tridiagonal matrices,"
Linear Algebra and its Applications, 387(1), pp. 1-28, August 2004.
Inderjit Dhillon and Beresford Parlett: "Orthogonal Eigenvectors and
Relative Gaps," SIAM Journal on Matrix Analysis and Applications, Vol. 25, 2004. Also LAPACK Working Note 154.
Inderjit Dhillon: "A new O(n^2) algorithm for the symmetric
tridiagonal eigenvalue/eigenvector problem",
Computer Science Division Technical Report No. UCB/CSD-97-971,
UC Berkeley, May 1997.
It probably uses some algorithm written in FORTRAN a long time ago.
I suspect there is a procedure which is performed on the matrix to adjust it into a form from which eigenvalues and eigenvectors can be easily determined. I also suspect that this procedure won't need to do anything to an identity matrix to get it into the required form and so the eigenvalues and eigenvectors are just read off immediately.
In the general case of degenerate eigenvalues the answers you get will depend on the details of this algorithm. I doubt there is any choice being made - it's just whatever it spits out first.
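As a small illustration of why no basis is "the" right one for a repeated eigenvalue, the sketch below (plain Python; the angle 0.7 is an arbitrary value chosen for the example) checks that a rotated orthonormal basis is just as valid a set of eigenvectors of the identity matrix as the standard basis.

```python
import math

def matvec(a, v):
    """Multiply matrix a (list of rows) by vector v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

identity = [[1.0, 0.0], [0.0, 1.0]]
theta = 0.7  # arbitrary rotation angle, chosen only for illustration
basis = [[math.cos(theta), math.sin(theta)],
         [-math.sin(theta), math.cos(theta)]]

for v in basis:
    # I v = 1 * v, so every vector of the rotated basis has eigenvalue 1.
    residual = [x - y for x, y in zip(matvec(identity, v), v)]
    print(max(abs(r) for r in residual))
```

Which of these equally valid bases a solver returns is then a side effect of its internal steps, as the answers above suggest.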
I work with Julia, but I think the question is more general. Suppose that one wants to find the spectrum of a very large (sparse) unitary matrix U numerically. As is reported in many entries, diagonalizing by brute force using eigs ends without eigenvalue convergence.
The trick would then be to work with simpler (Hermitian) expressions, i.e. with
U_Re = real(U + U')*0.5
U_Im = real((U - U')*-0.5im)
My question is, is there a way to obtain a uniform sampling in finding the eigenvalues? That is, I would like to obtain, say 10e3 eigenvalues for U_Re and U_Im in the interval [-1,1].
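For reference, the relation between the spectrum of U and the spectra of the two parts can be written out as follows. This is a sketch that assumes U is exactly unitary and takes the two parts as (U + U')/2 and (U - U')/(2i), i.e. without the outer real(...) call in the snippet above.

```latex
U v = e^{i\theta} v
\;\Longrightarrow\;
U^{\dagger} v = e^{-i\theta} v,
\qquad
\frac{U + U^{\dagger}}{2}\, v = \cos\theta \, v,
\qquad
\frac{U - U^{\dagger}}{2i}\, v = \sin\theta \, v .
```

So both parts share the eigenvectors of U, their eigenvalues lie in [-1, 1], and the phase \theta can be recovered from the pair (\cos\theta, \sin\theta) belonging to the same eigenvector.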
I am not entirely sure how uniform sampling of the eigenvalues would work, but I think you are looking for ARPACK. ARPACK uses matrix-vector products to find the eigenvalues, so I am not sure the Real/Im decomposition is required in this case (hard to say without knowing more about U).
Also, you might want to look at the FEAST algorithm, which would benefit a lot from a given search contour.
I am not aware of existing Julia bindings for those libraries, but I don't think that is a problem, since Julia can call C functions.
These are only some brief ideas, and Computational Science might be a better place to find the right crowd. However, a lot more detail about U, its sparsity and size, and what "uniform sampling of eigenvalues in the interval" means would be required.
Do any standard (LAPACK / ARPACK / etc) implementations of the symmetric eigenvalue problem allow "warm starting"? That is, can they be accelerated if I already have a pretty good guess for the eigenvalues and eigenvectors of my matrix.
With Rayleigh quotient iteration or power iteration, this should be pretty obvious, but I don't see how to do this with standard eigensolver software. I'd prefer not to write my own eigensolver.
What you need is an iterative eigenvalue solver.
LAPACK uses a direct eigensolver, for which an estimate of the eigenvectors is of no use. There is a QR iterative refinement in its routines. However, it requires Hessenberg matrices, so I do not think you could use these routines.
You could use the ARPACK library: supply the starting vector and set the info argument to one.
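To make the effect of a starting vector concrete, here is a sketch using plain power iteration in Python. This is not ARPACK and not its interface; the function name, tolerance and the 2 by 2 test matrix are assumptions chosen for the example.

```python
import math

def power_iteration(a, v0, tol=1e-10, max_iter=10_000):
    """Return (eigenvalue, eigenvector, iterations) for symmetric a, starting from v0."""
    n = len(a)
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    lam = 0.0
    for it in range(1, max_iter + 1):
        w = [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam_new = sum(w[i] * v[i] for i in range(n))  # Rayleigh quotient of v
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
        if abs(lam_new - lam) < tol:  # stop when the estimate stops changing
            return lam_new, v, it
        lam = lam_new
    return lam, v, max_iter

a = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues 3 and 1, dominant eigenvector (1, 1)
_, _, cold_iters = power_iteration(a, [1.0, 0.0])      # poor initial guess
lam, _, warm_iters = power_iteration(a, [1.0, 0.99])   # guess close to (1, 1)
print(lam, cold_iters, warm_iters)
```

The warm start needs fewer iterations because the initial vector already has only a small component along the unwanted eigenvector. A Krylov method fed a good starting vector benefits from the same effect.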
I would also suggest reconsidering writing your own QR solver. It is very simple.
A basic QR implementation using LAPACK could be:
Initialize Q, A
repeat
    QR = A    (dgeqrf)
    A = RQ    (dormqr)
until convergence    (dnrm2)
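The loop above can be sketched in plain Python as follows. This is the basic unshifted QR iteration with a Gram-Schmidt factorization standing in for dgeqrf/dormqr, assuming a nonsingular symmetric matrix with eigenvalues of distinct magnitude. The function names and tolerances are made up for the example, and there are no shifts or warm starts.

```python
import math

def qr_gram_schmidt(a):
    """Factor a square matrix a into (q, r) using modified Gram-Schmidt."""
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]  # columns of a
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k, qk in enumerate(q_cols):
            r[k][j] = sum(qk[i] * v[i] for i in range(n))
            v = [v[i] - r[k][j] * qk[i] for i in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / r[j][j] for x in v])
    q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return q, r

def qr_eigenvalues(a, tol=1e-12, max_iter=1000):
    """Iterate A = QR, A <- RQ until the part below the diagonal is negligible."""
    n = len(a)
    a = [row[:] for row in a]
    for _ in range(max_iter):
        q, r = qr_gram_schmidt(a)
        a = [[sum(r[i][k] * q[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        off = math.sqrt(sum(a[i][j] ** 2 for i in range(n) for j in range(i)))
        if off < tol:
            break
    return [a[i][i] for i in range(n)]

print(qr_eigenvalues([[2.0, 1.0], [1.0, 2.0]]))  # approximately [3.0, 1.0]
```

Production implementations first reduce the matrix to Hessenberg or tridiagonal form and use shifts, which is why this version converges much more slowly.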
I am using OpenCL to calculate the eigenvectors of a matrix. AMD has an example of eigenvalue calculation so I decided to use inverse iteration to get the eigenvectors.
I was following the algorithm described here and I noticed that in order to solve step 4 I need to solve a system of linear equations (or calculate the inverse of a matrix).
What is the best way to do this on a GPU using OpenCL? Are there any examples/references that I should look into?
EDIT: I'm sorry, I should have mentioned that my matrix is symmetric tridiagonal. From what I have been reading, this could be important and may simplify the whole process a lot.
The fact that the matrix is tridiagonal is VERY important - that reduces the complexity of the problem from O(N^3) to O(N). You can probably get some speedup from the fact that it's symmetric too, but that won't be as dramatic.
The method for solving a tridiagonal system is here: http://en.wikipedia.org/wiki/Tridiagonal_matrix_algorithm.
Also note that you don't need to store all N^2 elements of the matrix, since almost all of them will be zeroes. You just need one vector of length N (for the diagonal) and two of length N-1 for the sub- and superdiagonals. And since your matrix is symmetric, the sub- and superdiagonals are the same.
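A sequential sketch of the tridiagonal (Thomas) algorithm with this three-vector storage might look like the following. It is plain Python rather than OpenCL, the function name is made up for the example, and there is no pivoting, so it assumes no zero pivots occur. In inverse iteration the shifted matrix is nearly singular by design, so library routines such as LAPACK's dgtsv use pivoting instead.

```python
def solve_tridiagonal(sub, diag, sup, rhs):
    """Solve T x = rhs for tridiagonal T given by its three diagonals.

    sub and sup have length n - 1, diag and rhs have length n.
    """
    n = len(diag)
    c = [0.0] * n  # modified superdiagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = sup[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination, O(n)
        denom = diag[i] - sub[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = sup[i] / denom
        d[i] = (rhs[i] - sub[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution, O(n)
        x[i] = d[i] - c[i] * x[i + 1]
    return x

off = [-1.0, -1.0]          # symmetric: sub- and superdiagonal are the same
diag = [2.0, 2.0, 2.0]
print(solve_tridiagonal(off, diag, off, [0.0, 0.0, 4.0]))  # approximately [1.0, 2.0, 3.0]
```

Both loops carry a dependency from one element to the next, so a GPU version would more likely use a parallel variant such as cyclic reduction rather than this direct translation.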
Hope that's helpful...
I suggest using LU decomposition.
Here's an example.
It's written in CUDA, but I think it wouldn't be hard to rewrite it in OpenCL.
How expensive is it to compute the eigenvalues of a matrix?
What is the complexity of the best algorithms?
How long might it take in practice if I have a 1000 x 1000 matrix? I assume it helps if the matrix is sparse?
Are there any cases where the eigenvalue computation would not terminate?
In R, I can compute the eigenvalues as in the following toy example:
m <- matrix(c(13, 2, 5, 4), ncol = 2, nrow = 2)
eigen(m, only.values = TRUE)
$values
[1] 14 3
Does anyone know what algorithm it uses?
Are there any other (open-source) packages that compute eigenvalues?
Most algorithms for eigenvalue computation scale as O(n^3), where n is the row/column dimension of the (symmetric and square) matrix.
To find the time complexity of the best algorithm to date, you would have to consult recent research papers in scientific computing and numerical methods.
But assuming cubic scaling, you would need on the order of 1000^3 = 10^9 operations for a 1000x1000 matrix.
By default, R uses the LAPACK routines DSYEVR, DGEEV, ZHEEV and ZGEEV. In older versions of R you could pass EISPACK=TRUE to use EISPACK's RS, RG, CH and CG routines instead; current versions ignore that argument.
The most popular open-source packages for eigenvalue computation are LAPACK and EISPACK.
With big matrices you usually don't want all the eigenvalues. You just want the top few to do (say) a dimension reduction.
The canonical algorithm is the Arnoldi-Lanczos iterative algorithm implemented in ARPACK:
www.caam.rice.edu/software/ARPACK/
There is a MATLAB interface in eigs:
http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/ref/eigs.html
eigs(A,k) and eigs(A,B,k) return the k largest magnitude eigenvalues.
And there is now an R interface as well:
http://igraph.sourceforge.net/doc-0.5/R/arpack.html
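To show what "only the top few" looks like without any library, here is a sketch of orthogonal (subspace) iteration in plain Python. It is a much simpler relative of the Arnoldi/Lanczos methods in ARPACK, not the same algorithm, and it assumes a symmetric matrix whose k dominant eigenvalues have distinct magnitudes. The function names, iteration count and test matrix are made up for the example.

```python
import math

def orthonormalize(vectors):
    """Gram-Schmidt: return an orthonormal set spanning the same space."""
    out = []
    for v in vectors:
        v = v[:]
        for q in out:
            dot = sum(a * b for a, b in zip(q, v))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        out.append([a / norm for a in v])
    return out

def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def top_k_eigenvalues(a, k, iters=200):
    """Estimate the k largest-magnitude eigenvalues of symmetric a."""
    n = len(a)
    # Start from the first k standard basis vectors (an arbitrary choice).
    basis = [[1.0 if i == j else 0.0 for i in range(n)] for j in range(k)]
    for _ in range(iters):
        # Only matrix-vector products with a are needed, as in ARPACK.
        basis = orthonormalize([matvec(a, q) for q in basis])
    # Rayleigh quotients of the converged vectors approximate the eigenvalues.
    return [sum(x * y for x, y in zip(q, matvec(a, q))) for q in basis]

a = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
print(top_k_eigenvalues(a, 2))  # approximately [4.732, 3.0]
```

Because the matrix only appears inside matvec, a sparse matrix format with a fast product could be swapped in without touching the rest.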
I assume it helps if the matrix is sparse?
Yes, there are algorithms that perform well on sparse matrices.
See for example: http://www.cise.ufl.edu/research/sparse/
How long might it take in practice if I have a 1000x1000 matrix?
On a dual-core 1.83 GHz machine, MATLAB (based on LAPACK) computes all eigenvalues of a random 1000x1000 matrix in roughly 5 seconds. When the matrix is symmetric, the computation is significantly faster and takes only about 1 second.
I would take a look at Eigenvalue algorithms, which links to a number of different methods. They all have different characteristics, and hopefully one will be suitable for your purposes.
You can use the GuessCompx package from CRAN to estimate the empirical complexity of your eigenvalue computation and predict the full running time (although it's still small in your example). You need a little helper function because the fitting process only subsets the rows, so you must make the matrix square:
library(GuessCompx)
m = matrix(rnorm(1e6), ncol=1000, nrow=1000)
# custom function to subset the increasing-size matrix to a square one:
eigen. = function(m) eigen(as.matrix(m[, 1:nrow(m)]))
CompEst(m, eigen.)
#### $`TIME COMPLEXITY RESULTS`
#### $`TIME COMPLEXITY RESULTS`$best.model
#### [1] "CUBIC"
#### $`TIME COMPLEXITY RESULTS`$computation.time.on.full.dataset
#### [1] "5.23S"
#### $`TIME COMPLEXITY RESULTS`$p.value.model.significance
#### [1] 1.784406e-34
You get cubic complexity for time and N log(N) complexity for memory usage of the base R eigen() function. It takes 5.2 seconds and 37 MB to run the whole computation.
Apache Mahout is an open-source framework built on map-reduce (i.e. it works for really, really big matrices). Note that for a lot of matrix work the question isn't "what's the big-O runtime?" but rather "how parallelizable is it?" Mahout says they use Lanczos, which can essentially be run in parallel on as many processors as you care to give it.
It uses the QR algorithm. See Wilkinson, J. H. (1965) The Algebraic Eigenvalue Problem. Clarendon Press, Oxford. It does not exploit sparsity.