How can you find the condition number in Eigen? - math

In Matlab there are cond and rcond, and in LAPACK too. Is there any routine in Eigen to find the condition number of a matrix?
I have a Cholesky decomposition of a matrix and I want to check if it is close to singularity, but cannot find a similar function in the docs.
UPDATE:
I think I can use something like this algorithm, which makes use of the triangular factorization. The method by Ilya is useful for more accurate answers, so I will mark it as correct.

Probably easiest way to compute the condition number is using the expression:
cond(A) = max(sigma) / min(sigma)
where sigma is an array of singular values, the result of SVD. Eigen author suggests this code:
JacobiSVD<MatrixXd> svd(A);
double cond = svd.singularValues()(0)
/ svd.singularValues()(svd.singularValues().size()-1);
Other ways are (less efficient)
cond(A) = max(lambda) / min(lambda)
cond(A) = norm2(A) * norm2(A^-1)
where lambda is an array of eigenvalues.
It looks like Cholesky decomposition does not directly help here, but I cant tell for sure at the moment.

You could use the Gershgorin circle theorem to get a rough estimate.
But as Ilya Popov has already pointed out, calculating the eigenvalues/singular values is more reliable. However, it doesn't make sense to calculate all eigenvalues, which gets very expensive. You only need the largest and the smallest eigenvalues, for that you can use the Power method for the largest and Inverse Iteration for the smallest eigenvalue.
Or you could use a library that can do this already, e.g. Spectra.

You can use norms. In my robotics experience, this is computationally faster than singular values:
pseudoInverse(matrix).norm() * matrix.norm()
I found this to be 2.6X faster than singular values for 6x6 matrices. It's also recommended in this book:
B. Siciliano, and O. Khatib, Springer Handbook of Robotics. Berlin: Springer Science and Business Media, 2008, p. 236.

Related

Which sparse linear solver is faster? SparseLU or BiCGSTAB?

I tested Eigen's SparseLU and BicGSTAB method on some sparse matrix, whose dense counterparts' size ranges from 3000*3000 to 16000*16000. All the cases shows that SparseLU is around 13% faster than BicGSTAB method.
I didn't feed the BiCGSTAB a RowMajor sparse matrix, or give it any pre-conditioner. That might be the reason of slow.
So I am wondering, if I do both methods well, which one should be faster?
How about if the matrix size goes up to millions*millions?
Thanks a lot!
You already mentioned the main reason of performance difference.
The iterative methods are getting much faster when you choose the "right" preconditioner.
An example list of preconditioners you might refer to is:
Jacobi
SOR
ILU
Multigrid
Each preconditioner has some parameters the should be tuned also.
Choice of linear solver has a lot to do with the distribution of eigenvalues/eigenvectors of the matrix. If you have a symmetric positive definite matrix then conjugate-gradient is a good option. Number of iterations depend on the condition number (max eigen value/min eigen value). For a matrix derived from an elliptic operator, condition number increases with the size of the matrix.
Check out this article by Jonathan Shewchuk for a great explanation on CG. (https://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf).
For other matrix types, you can use GMRES etc. based on the eigen properties. Check out this paper http://www.sam.math.ethz.ch/~mhg/pub/biksm.pdf
Hope this helps.

How to numerically compute nonlinear polynomials efficiently and accurately?

(I'm not sure whether I should post this problem on this site or on the math site. Please feel free to migrate this post if necessary.)
My problem at hand is that given a value of k I'd like to numerically compute a rational function of nonlinear polynomials in k which looks like the following: (sorry I don't know how to typeset equations here...)
where {a_0, ..., a_N; b_0, ..., b_N} are complex constants, {u_0, ..., u_N, v_0, ..., v_N} are real constants and i is the imaginary number. I learned from Numerical Recipes that there are whole bunch of ways to compute polynomials quickly, in the meanwhile keeping the rounding error small enough, if all coefficients were constant. But I do not think those ideas are useful in my case since the exponential prefactors also depend on k.
Currently I calculate it in a brute force way in C with complex.h (this is just a pseudo code):
double complex function(double k)
{
return (a_0+a_1*cexp(I*u_1*k)*k+a_2*cexp(I*u_2*k)*k*k+...)/(b_0+b_1*cexp(I*v_1*k)*k+v_2*cexp(I*v_2*k)*k*k+...);
}
However when the number of calls of function increases (because this is just a part of my real calculation), it is very slow and inaccurate (only 6 valid digits). I appreciate any comments and/or suggestions.
I trust that this isn't a homework assignment!
Normally the trick is to use a loop add the next coefficient to the running sum, and multiply by k. However, in your case, I think the "e" term in the coefficient is going to overwhelm any savings by factoring out k. You can still do it, but the savings will probably be small.
Is u_i a constant? Depending on how many times you need to run this formula, maybe you could premultiply u_i * k (unless k changes each run). It's been so many decades since I took a Numerical Analysis course that I have only vague recollections of the tricks of the trade. Let's see... is e^(i*u_i*k) the same as (e^(i*u_i))^k? I don't remember the rules on imaginary numbers, or whether you'll save anything since you've got a real^real (assuming k is real) anyway (internally done using e^power).
If you're getting only 6 digits, that suggests that your math, and maybe your library, is working in single precision (32 bit) reals. Check your library and check your declarations that you are using at least double precision (64 bit) reals everywhere.

Sample uniformly at random from an n-dimensional unit simplex

Sampling uniformly at random from an n-dimensional unit simplex is the fancy way to say that you want n random numbers such that
they are all non-negative,
they sum to one, and
every possible vector of n non-negative numbers that sum to one are equally likely.
In the n=2 case you want to sample uniformly from the segment of the line x+y=1 (ie, y=1-x) that is in the positive quadrant.
In the n=3 case you're sampling from the triangle-shaped part of the plane x+y+z=1 that is in the positive octant of R3:
(Image from http://en.wikipedia.org/wiki/Simplex.)
Note that picking n uniform random numbers and then normalizing them so they sum to one does not work. You end up with a bias towards less extreme numbers.
Similarly, picking n-1 uniform random numbers and then taking the nth to be one minus the sum of them also introduces bias.
Wikipedia gives two algorithms to do this correctly: http://en.wikipedia.org/wiki/Simplex#Random_sampling
(Though the second one currently claims to only be correct in practice, not in theory. I'm hoping to clean that up or clarify it when I understand this better. I initially stuck in a "WARNING: such-and-such paper claims the following is wrong" on that Wikipedia page and someone else turned it into the "works only in practice" caveat.)
Finally, the question:
What do you consider the best implementation of simplex sampling in Mathematica (preferably with empirical confirmation that it's correct)?
Related questions
Generating a probability distribution
java random percentages
This code can work:
samples[n_] := Differences[Join[{0}, Sort[RandomReal[Range[0, 1], n - 1]], {1}]]
Basically you just choose n - 1 places on the interval [0,1] to split it up then take the size of each of the pieces using Differences.
A quick run of Timing on this shows that it's a little faster than Janus's first answer.
After a little digging around, I found this page which gives a nice implementation of the Dirichlet Distribution. From there it seems like it would be pretty simple to follow Wikipedia's method 1. This seems like the best way to do it.
As a test:
In[14]:= RandomReal[DirichletDistribution[{1,1}],WorkingPrecision->25]
Out[14]= {0.8428995243540368880268079,0.1571004756459631119731921}
In[15]:= Total[%]
Out[15]= 1.000000000000000000000000
A plot of 100 samples:
alt text http://www.public.iastate.edu/~zdavkeos/simplex-sample.png
I'm with zdav: the Dirichlet distribution seems to be the easiest way ahead, and the algorithm for sampling the Dirichlet distribution which zdav refers to is also presented on the Wikipedia page on the Dirichlet distribution.
Implementationwise, it is a bit of an overhead to do the full Dirichlet distribution first, as all you really need is n random Gamma[1,1] samples. Compare below
Simple implementation
SimplexSample[n_, opts:OptionsPattern[RandomReal]] :=
(#/Total[#])& # RandomReal[GammaDistribution[1,1],n,opts]
Full Dirichlet implementation
DirichletDistribution/:Random`DistributionVector[
DirichletDistribution[alpha_?(VectorQ[#,Positive]&)],n_Integer,prec_?Positive]:=
Block[{gammas}, gammas =
Map[RandomReal[GammaDistribution[#,1],n,WorkingPrecision->prec]&,alpha];
Transpose[gammas]/Total[gammas]]
SimplexSample2[n_, opts:OptionsPattern[RandomReal]] :=
(#/Total[#])& # RandomReal[DirichletDistribution[ConstantArray[1,{n}]],opts]
Timing
Timing[Table[SimplexSample[10,WorkingPrecision-> 20],{10000}];]
Timing[Table[SimplexSample2[10,WorkingPrecision-> 20],{10000}];]
Out[159]= {1.30249,Null}
Out[160]= {3.52216,Null}
So the full Dirichlet is a factor of 3 slower. If you need m>1 samplepoints at a time, you could probably win further by doing (#/Total[#]&)/#RandomReal[GammaDistribution[1,1],{m,n}].
Here's a nice concise implementation of the second algorithm from Wikipedia:
SimplexSample[n_] := Rest## - Most## &[Sort#Join[{0,1}, RandomReal[{0,1}, n-1]]]
That's adapted from here: http://www.mofeel.net/1164-comp-soft-sys-math-mathematica/14968.aspx
(Originally it had Union instead of Sort#Join -- the latter is slightly faster.)
(See comments for some evidence that this is correct!)
I have created an algorithm for uniform random generation over a simplex. You can find the details in the paper in the following link:
http://www.tandfonline.com/doi/abs/10.1080/03610918.2010.551012#.U5q7inJdVNY
Briefly speaking, you can use following recursion formulas to find the random points over the n-dimensional simplex:
x1=1-R11/n-1
xk=(1-Σi=1kxi)(1-Rk1/n-k), k=2, ..., n-1
xn=1-Σi=1n-1xi
Where R_i's are random number between 0 and 1.
Now I am trying to make an algorithm to generate random uniform samples from constrained simplex.that is intersection between a simplex and a convex body.
Old question, and I'm late to the party, but this method is much faster than the accepted answer if implemented efficiently.
In Mathematica code:
#/Total[#,{2}]&#Log#RandomReal[{0,1},{n,d}]
In plain English, you generate n rows * d columns of randoms uniformly distributed between 0 and 1. Then take the Log of everything. Then normalize each row, dividing each element in the row by the row total. Now you have n samples uniformly distributed over the (d-1) dimensional simplex.
If found this method here: https://mathematica.stackexchange.com/questions/33652/uniformly-distributed-n-dimensional-probability-vectors-over-a-simplex
I'll admit, I'm not sure why it works, but it passes every statistical test I can think of. If anyone has a proof of why this method works, I'd love to see it!

Fastest numerical solution of a real cubic polynomial?

R question: Looking for the fastest way to NUMERICALLY solve a bunch of arbitrary cubics known to have real coeffs and three real roots. The polyroot function in R is reported to use Jenkins-Traub's algorithm 419 for complex polynomials, but for real polynomials the authors refer to their earlier work. What are the faster options for a real cubic, or more generally for a real polynomial?
The numerical solution for doing this many times in a reliable, stable manner, involve: (1) Form the companion matrix, (2) find the eigenvalues of the companion matrix.
You may think this is a harder problem to solve than the original one, but this is how the solution is implemented in most production code (say, Matlab).
For the polynomial:
p(t) = c0 + c1 * t + c2 * t^2 + t^3
the companion matrix is:
[[0 0 -c0],[1 0 -c1],[0 1 -c2]]
Find the eigenvalues of such matrix; they correspond to the roots of the original polynomial.
For doing this very fast, download the singular value subroutines from LAPACK, compile them, and link them to your code. Do this in parallel if you have too many (say, about a million) sets of coefficients.
Notice that the coefficient of t^3 is one, if this is not the case in your polynomials, you will have to divide the whole thing by the coefficient and then proceed.
Good luck.
Edit: Numpy and octave also depend on this methodology for computing the roots of polynomials. See, for instance, this link.
The fastest known way (that I'm aware of) to find the real solutions a system of arbitrary polynomials in n variables is polyhedral homotopy. A detailed explanation is probably beyond a StackOverflow answer, but essentially it's a path algorithm that exploits the structure of each equation using toric geometries. Google will give you a number of papers.
Perhaps this question is better suited for mathoverflow?
Fleshing out Arietta's answer above:
> a <- c(1,3,-4)
> m <- matrix(c(0,0,-a[1],1,0,-a[2],0,1,-a[3]), byrow=T, nrow=3)
> roots <- eigen(m, symm=F, only.values=T)$values
Whether this is faster or slower than using the cubic solver in the GSL package (as suggested by knguyen above) is a matter of benchmarking it on your system.
Do you need all 3 roots or just one? If just one, I would think Newton's Method would work ok. If all 3 then it might be problematic in circumstances where two are close together.
1) Solve for the derivative polynomial P' to locate your three roots. See there to know how to do it properly. Call those roots a and b (with a < b)
2) For the middle root, use a few steps of bisection between a and b, and when you're close enough, finish with Newton's method.
3) For the min and max root, "hunt" the solution. For the max root:
Start with x0 = b, x1 = b + (b - a) * lambda, where lambda is a moderate number (say 1.6)
do x_n = b + (x_{n - 1} - a) * lambda until P(x_n) and P(b) have different signs
Perform bisection + newton between x_{n - 1} and x_n
The common methods are available: Newton's Method, Bisection Method, Secant, Fixed point iteration, etc. Google any one of them.
If you have a non-linear system on the other hand (e.g. a system on N polynomial eqn's in N unknowns), a method such as high-order Newton may be used.
Have you tried looking into the GSL package http://cran.r-project.org/web/packages/gsl/index.html?

How to check if m n-sized vectors are linearly independent?

Disclaimer
This is not strictly a programming question, but most programmers soon or later have to deal with math (especially algebra), so I think that the answer could turn out to be useful to someone else in the future.
Now the problem
I'm trying to check if m vectors of dimension n are linearly independent. If m == n you can just build a matrix using the vectors and check if the determinant is != 0. But what if m < n?
Any hints?
See also this video lecture.
Construct a matrix of the vectors (one row per vector), and perform a Gaussian elimination on this matrix. If any of the matrix rows cancels out, they are not linearly independent.
The trivial case is when m > n, in this case, they cannot be linearly independent.
Construct a matrix M whose rows are the vectors and determine the rank of M. If the rank of M is less than m (the number of vectors) then there is a linear dependence. In the algorithm to determine the rank of M you can stop the procedure as soon as you obtain one row of zeros, but running the algorithm to completion has the added bonanza of providing the dimension of the spanning set of the vectors. Oh, and the algorithm to determine the rank of M is merely Gaussian elimination.
Take care for numerical instability. See the warning at the beginning of chapter two in Numerical Recipes.
If m<n, you will have to do some operation on them (there are multiple possibilities: Gaussian elimination, orthogonalization, etc., almost any transformation which can be used for solving equations will do) and check the result (eg. Gaussian elimination => zero row or column, orthogonalization => zero vector, SVD => zero singular number)
However, note that this question is a bad question for a programmer to ask, and this problem is a bad problem for a program to solve. That's because every linearly dependent set of n<m vectors has a different set of linearly independent vectors nearby (eg. the problem is numerically unstable)
I have been working on this problem these days.
Previously, I have found some algorithms regarding Gaussian or Gaussian-Jordan elimination, but most of those algorithms only apply to square matrix, not general matrix.
To apply for general matrix, one of the best answers might be this:
http://rosettacode.org/wiki/Reduced_row_echelon_form#MATLAB
You can find both pseudo-code and source code in various languages.
As for me, I transformed the Python source code to C++, causes the C++ code provided in the above link is somehow complex and inappropriate to implement in my simulation.
Hope this will help you, and good luck ^^
If computing power is not a problem, probably the best way is to find singular values of the matrix. Basically you need to find eigenvalues of M'*M and look at the ratio of the largest to the smallest. If the ratio is not very big, the vectors are independent.
Another way to check that m row vectors are linearly independent, when put in a matrix M of size mxn, is to compute
det(M * M^T)
i.e. the determinant of a mxm square matrix. It will be zero if and only if M has some dependent rows. However Gaussian elimination should be in general faster.
Sorry man, my mistake...
The source code provided in the above link turns out to be incorrect, at least the python code I have tested and the C++ code I have transformed does not generates the right answer all the time. (while for the exmample in the above link, the result is correct :) -- )
To test the python code, simply replace the mtx with
[30,10,20,0],[60,20,40,0]
and the returned result would be like:
[1,0,0,0],[0,1,2,0]
Nevertheless, I have got a way out of this. It's just this time I transformed the matalb source code of rref function to C++. You can run matlab and use the type rref command to get the source code of rref.
Just notice that if you are working with some really large value or really small value, make sure use the long double datatype in c++. Otherwise, the result will be truncated and inconsistent with the matlab result.
I have been conducting large simulations in ns2, and all the observed results are sound.
hope this will help you and any other who have encontered the problem...
A very simple way, that is not the most computationally efficient, is to simply remove random rows until m=n and then apply the determinant trick.
m < n: remove rows (make the vectors shorter) until the matrix is square, and then
m = n: check if the determinant is 0 (as you said)
m < n (the number of vectors is greater than their length): they are linearly dependent (always).
The reason, in short, is that any solution to the system of m x n equations is also a solution to the n x n system of equations (you're trying to solve Av=0). For a better explanation, see Wikipedia, which explains it better than I can.

Resources