How to check if m n-sized vectors are linearly independent? - math

Disclaimer
This is not strictly a programming question, but most programmers sooner or later have to deal with math (especially algebra), so I think the answer could turn out to be useful to someone else in the future.
Now the problem
I'm trying to check if m vectors of dimension n are linearly independent. If m == n you can just build a matrix using the vectors and check if the determinant is != 0. But what if m < n?
Any hints?
See also this video lecture.

Construct a matrix of the vectors (one row per vector), and perform Gaussian elimination on this matrix. If any of the rows cancels out (reduces to all zeros), the vectors are not linearly independent.
The trivial case is when m > n: in that case they cannot be linearly independent.
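For illustration, here is a minimal Python sketch of that idea (the function name and the tolerance eps are mine, not part of the original answer):

def linearly_independent(vectors, eps=1e-12):
    # Gaussian elimination on a copy; a row of (near-)zeros means dependence.
    m = [list(map(float, v)) for v in vectors]
    n_rows, n_cols = len(m), len(m[0])
    if n_rows > n_cols:               # trivial case mentioned above: m > n
        return False
    row = 0
    for col in range(n_cols):
        # find a pivot for this column at or below `row`
        pivot = next((r for r in range(row, n_rows) if abs(m[r][col]) > eps), None)
        if pivot is None:
            continue
        m[row], m[pivot] = m[pivot], m[row]
        # eliminate the column below the pivot
        for r in range(row + 1, n_rows):
            factor = m[r][col] / m[row][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[row])]
        row += 1
        if row == n_rows:
            return True               # every vector supplied a pivot: independent
    return False                      # some row reduced to (near) zero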

Construct a matrix M whose rows are the vectors and determine the rank of M. If the rank of M is less than m (the number of vectors) then there is a linear dependence. In the algorithm to determine the rank of M you can stop the procedure as soon as you obtain a row of zeros, but running the algorithm to completion has the added bonus of giving the dimension of the span of the vectors. Oh, and the algorithm to determine the rank of M is merely Gaussian elimination.
Take care with numerical instability. See the warning at the beginning of chapter two of Numerical Recipes.
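If NumPy is available, this rank check is a one-liner; matrix_rank uses an SVD with a sensible default tolerance, which also addresses the instability warning. A small sketch with made-up data:

import numpy as np

M = np.array([[1.0, 2.0, 3.0],     # m = 2 vectors of dimension n = 3
              [2.0, 4.0, 6.0]])    # second row is twice the first
print(np.linalg.matrix_rank(M) == M.shape[0])   # False: rank 1 < 2, so dependent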

If m<n, you will have to do some operation on them (there are multiple possibilities: Gaussian elimination, orthogonalization, etc.; almost any transformation that can be used for solving linear systems will do) and check the result (e.g. Gaussian elimination => zero row or column, orthogonalization => zero vector, SVD => zero singular value).
However, note that this question is a bad question for a programmer to ask, and this problem is a bad problem for a program to solve. That's because every linearly dependent set of m<n vectors has a linearly independent set arbitrarily close to it (i.e. the problem is numerically unstable).
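As a sketch of the orthogonalization route in NumPy (a heuristic only: the tolerance is arbitrary, and column-pivoted QR or the SVD is more robust in borderline cases):

import numpy as np

def independent_by_qr(vectors, tol=1e-10):
    A = np.asarray(vectors, dtype=float).T     # one column per vector
    if A.shape[1] > A.shape[0]:                # more vectors than dimensions
        return False
    _, R = np.linalg.qr(A)
    # a (near-)zero diagonal entry of R means some vector was (numerically)
    # a combination of the previous ones
    return bool(np.all(np.abs(np.diag(R)) > tol))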

I have been working on this problem these days.
Previously, I found some algorithms for Gaussian or Gauss-Jordan elimination, but most of them only apply to square matrices, not to general matrices.
For a general matrix, one of the best references might be this:
http://rosettacode.org/wiki/Reduced_row_echelon_form#MATLAB
You can find both pseudo-code and source code in various languages.
As for me, I translated the Python source code to C++, because the C++ code provided at the above link is somewhat complex and ill-suited to my simulation.
Hope this will help you, and good luck ^^

If computing power is not a problem, probably the best way is to find the singular values of the matrix. Basically you need the eigenvalues of M'*M (the singular values of M are their square roots) and the ratio of the largest to the smallest. If the ratio is not very big, the vectors are independent; a huge (or infinite) ratio means they are (numerically) dependent.
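A sketch of that check with NumPy (computing the singular values of M directly is equivalent to, and more stable than, forming M'*M; the cutoff for "very big" is problem-dependent and the value below is only a placeholder):

import numpy as np

def independent_by_svd(vectors, max_ratio=1e12):
    s = np.linalg.svd(np.asarray(vectors, dtype=float), compute_uv=False)
    if s[-1] == 0.0:
        return False                    # exactly rank-deficient
    return s[0] / s[-1] < max_ratio     # huge ratio => (numerically) dependent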

Another way to check that m row vectors are linearly independent, when put in a matrix M of size mxn, is to compute
det(M * M^T)
i.e. the determinant of an mxm square matrix (the Gram matrix of the vectors). It is zero if and only if M has some dependent rows. However, Gaussian elimination should in general be faster.
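For example, with NumPy (in floating point, "zero" again means "smaller than some tolerance"; the matrix here is made up):

import numpy as np

M = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])       # 2 vectors in R^3
print(np.linalg.det(M @ M.T))         # 6.0 here; (near) 0 would mean dependent rows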

Sorry man, my mistake...
The source code provided in the above link turns out to be incorrect: at least the Python code I tested and the C++ code I translated from it do not generate the right answer all the time (though for the example given at the link, the result is correct).
To test the Python code, simply replace mtx with
[30,10,20,0],[60,20,40,0]
and the returned result would be like:
[1,0,0,0],[0,1,2,0]
which is wrong: the second row is just twice the first, so the correct reduced row echelon form is [1,1/3,2/3,0],[0,0,0,0].
Nevertheless, I found a way out of this: this time I translated the MATLAB source code of the rref function to C++. You can run MATLAB and use the command "type rref" to get the source code of rref.
Just note that if you are working with really large or really small values, make sure to use the long double data type in C++. Otherwise the result will be truncated and inconsistent with the MATLAB result.
I have been conducting large simulations in ns2, and all the observed results are sound.
Hope this will help you and anyone else who has encountered the problem...
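For reference, here is a small self-contained Python sketch of rref (my own code, not the MATLAB source) that handles the example above correctly; exact Fraction arithmetic sidesteps the truncation issue for rational input:

from fractions import Fraction

def rref(rows):
    # Reduced row echelon form with partial pivoting.
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # choose the largest entry in this column as the pivot (partial pivoting)
        best = max(range(pivot_row, n_rows), key=lambda r: abs(m[r][col]))
        if m[best][col] == 0:
            continue
        m[pivot_row], m[best] = m[best], m[pivot_row]
        piv = m[pivot_row][col]
        m[pivot_row] = [x / piv for x in m[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

print(rref([[30, 10, 20, 0], [60, 20, 40, 0]]))
# first row reduces to [1, 1/3, 2/3, 0], the second to all zeros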

A very simple way, that is not the most computationally efficient, is to simply remove random rows until m=n and then apply the determinant trick.
m < n: remove rows (make the vectors shorter) until the matrix is square, and then
m = n: check if the determinant is 0 (as you said)
m > n (the number of vectors is greater than their length): they are linearly dependent (always).
The reason, in short, is that any solution to the system of m x n equations is also a solution to the n x n system of equations (you're trying to solve Av=0). For a better explanation, see Wikipedia, which explains it better than I can.

Related

Which sparse linear solver is faster? SparseLU or BiCGSTAB?

I tested Eigen's SparseLU and BiCGSTAB methods on some sparse matrices whose dense counterparts range in size from 3000*3000 to 16000*16000. All cases show that SparseLU is around 13% faster than BiCGSTAB.
I didn't feed BiCGSTAB a RowMajor sparse matrix, or give it any preconditioner. That might be why it was slower.
So I am wondering, if I do both methods well, which one should be faster?
How about if the matrix size goes up to millions*millions?
Thanks a lot!
You already mentioned the main reason for the performance difference.
Iterative methods get much faster when you choose the "right" preconditioner.
An example list of preconditioners you might refer to is:
Jacobi
SOR
ILU
Multigrid
Each preconditioner also has some parameters that should be tuned.
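Your question is about Eigen, but just to illustrate the mechanics of plugging a preconditioner into BiCGSTAB, here is a SciPy sketch (the 1-D Poisson-style tridiagonal matrix is only a stand-in for your problem):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc")  # test matrix
b = np.ones(n)

ilu = spla.spilu(A)                                     # incomplete LU factorization
M = spla.LinearOperator(A.shape, matvec=ilu.solve)      # wrap it as a preconditioner

x_plain, info_plain = spla.bicgstab(A, b)               # no preconditioner
x_prec, info_prec = spla.bicgstab(A, b, M=M)            # ILU-preconditioned
print(info_plain, info_prec)                            # 0 means converged

Eigen's BiCGSTAB takes the preconditioner as a template parameter (e.g. IncompleteLUT) in the same spirit.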
Choice of linear solver has a lot to do with the distribution of eigenvalues/eigenvectors of the matrix. If you have a symmetric positive definite matrix then conjugate gradient is a good option. The number of iterations depends on the condition number (max eigenvalue / min eigenvalue). For a matrix derived from an elliptic operator, the condition number increases with the size of the matrix.
Check out this article by Jonathan Shewchuk for a great explanation on CG. (https://www.cs.cmu.edu/~quake-papers/painless-conjugate-gradient.pdf).
For other matrix types, you can use GMRES etc. based on the eigen properties. Check out this paper http://www.sam.math.ethz.ch/~mhg/pub/biksm.pdf
Hope this helps.

R how to generate random yet easily invertible matrices

I have a difficult R computation to do, and I have a choice of 2 computers, called V and L, to run the code. V is supposed to be faster than L, but I did not experience this. So I decided to test it out.
As a simple test, I decided to ask each of them to invert a 3000*3000 matrix 500 times, and record the time.
set.seed(123)
I <- 500      # number of repetitions
n <- 3000     # matrix dimension
time <- matrix(NA, ncol = 3, nrow = I)
for (i in 1:I) {
  t0 <- proc.time()
  x <- solve(matrix(runif(n^2), n))   # invert a random n x n matrix
  mt1 <- proc.time()
  time[i, ] <- (mt1 - t0)[1:3]
}
The problem is that during a particular iteration, it got stuck. I don't know why but I suspect it is because the matrix generated was near singular. So I would like to improve the code. I can think of 3 ways:
make sure the matrix generated is easily invertible. But how do I enforce this? Of course, any solution needs to be computationally inexpensive, otherwise the exercise becomes meaningless.
ask R to skip that iteration if solve takes too long? But again, how do I do that?
assign them a different computation task instead, any recommendation?
A random matrix is invertible with probability 1, meaning that, in practice, the probability of generating a singular (i.e. non-invertible) matrix is infinitesimally small.
Moreover, from the point of view of the algorithm that R uses to invert matrices, there is no such thing as an "easily invertible" matrix. Either the algorithm succeeds, or it determines that a matrix is singular and fails. But there is no scenario under which it tries "really hard" and takes a long time to invert a matrix. It's a deterministic algorithm which either runs into a 0 (or a value smaller than some given epsilon), in which case it fails, or else it doesn't.
On which iteration do you get stuck? Are you sure you are getting stuck on the inversion of the matrix, and it's not something like garbage collection that is taking a long time?
I can't reproduce the problem you describe. Starting with random seed 123, I can invert 500 random 3000x3000 matrices in a row, using your code, without any significant timing discrepancies. Can you find a random seed that generates a "hard to invert matrix" directly?
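Not R (and not a proof), but the same sanity check is easy to run in NumPy: the condition numbers of such random matrices are typically modest, nowhere near 1/machine-epsilon, which supports the "singular with probability 0" point. The size and repetition count below are scaled down just to keep the check quick:

import numpy as np

rng = np.random.default_rng(123)
conds = [np.linalg.cond(rng.random((1000, 1000))) for _ in range(20)]
print(f"min {min(conds):.3g}, max {max(conds):.3g}")   # modest values, far from ~1e16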

How to numerically compute nonlinear polynomials efficiently and accurately?

(I'm not sure whether I should post this problem on this site or on the math site. Please feel free to migrate this post if necessary.)
My problem at hand is that, given a value of k, I'd like to numerically compute a rational function of nonlinear polynomials in k which looks like the following (written in plain text, since I don't know how to typeset equations here):
f(k) = (sum over j = 0..N of a_j * e^(i*u_j*k) * k^j) / (sum over j = 0..N of b_j * e^(i*v_j*k) * k^j)
where {a_0, ..., a_N; b_0, ..., b_N} are complex constants, {u_0, ..., u_N, v_0, ..., v_N} are real constants and i is the imaginary unit. I learned from Numerical Recipes that there are a whole bunch of ways to compute polynomials quickly while keeping the rounding error small enough, if all coefficients were constant. But I do not think those ideas are useful in my case since the exponential prefactors also depend on k.
Currently I calculate it in a brute force way in C with complex.h (this is just a pseudo code):
double complex function(double k)
{
    return (a_0 + a_1*cexp(I*u_1*k)*k + a_2*cexp(I*u_2*k)*k*k + ...)
         / (b_0 + b_1*cexp(I*v_1*k)*k + b_2*cexp(I*v_2*k)*k*k + ...);
}
However when the number of calls of function increases (because this is just a part of my real calculation), it is very slow and inaccurate (only 6 valid digits). I appreciate any comments and/or suggestions.
I trust that this isn't a homework assignment!
Normally the trick (Horner's rule) is to use a loop: add the next coefficient to the running sum, then multiply by k. However, in your case, I think the "e" term in the coefficient is going to overwhelm any savings by factoring out k. You can still do it, but the savings will probably be small.
Is u_i a constant? Depending on how many times you need to run this formula, maybe you could premultiply u_i * k (unless k changes each run). It's been so many decades since I took a Numerical Analysis course that I have only vague recollections of the tricks of the trade. Let's see... is e^(i*u_i*k) the same as (e^(i*u_i))^k? I don't remember the rules on imaginary numbers, or whether you'll save anything since you've got a real^real (assuming k is real) anyway (internally done using e^power).
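A Python sketch of the brute-force sum that at least keeps a running power of k instead of recomputing k**j for each term (the names a, u, b, v are placeholders for the coefficient arrays; as noted above, the exponential calls still dominate the cost):

import cmath

def f(k, a, u, b, v):
    # numerator and denominator accumulated with a running power of k
    num = 0j
    den = 0j
    k_pow = 1.0
    for a_j, u_j, b_j, v_j in zip(a, u, b, v):
        num += a_j * cmath.exp(1j * u_j * k) * k_pow
        den += b_j * cmath.exp(1j * v_j * k) * k_pow
        k_pow *= k
    return num / den

The same loop structure works in C with complex.h; accumulating in long double complex may recover a few digits.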
If you're getting only 6 digits, that suggests that your math, and maybe your library, is working in single precision (32 bit) reals. Check your library and check your declarations that you are using at least double precision (64 bit) reals everywhere.

Efficient Calculation of an N-Dimensional Cross Product?

As per the title: is the best way to calculate the n-dimensional cross product just to use the determinant definition (evaluated via LU decomposition), or can you suggest a better one?
Thanks
Edit: for clarity I mean http://en.wikipedia.org/wiki/Cross_product and not the Cartesian Product
Edit: It also seems that using the Leibniz Formula might help - though I don't know how that compares to LU Decomp. at the moment.
From your comment, it seems like you are looking for an operation which takes n-1 vectors as input and computes a single vector as its result, which will be orthogonal to all the input vectors and perhaps have a well-defined length as well.
With defined length
You can characterize the 3-dimensional cross product v = a × b using the identity v · w = det(a, b, w). In other words, taking the cross product of the input vectors and then computing the dot product with any other vector w is the same as plugging the input vectors and that other vector into a matrix and computing its determinant.
This definition can be generalized to arbitrary dimensions. Due to the way a determinant can be computed using Laplace expansion along the last column, the resulting coordinates of that cross product will be the values of all (n-1)×(n-1) sub-determinants you can form from the input vectors, with alternating signs. So yes, Leibniz might be useful in theory, although it is hardly suitable for real-world computations. In practice, you'll soon have to figure out ways to avoid repeating computations while computing these n determinants. But wait for the last section of this answer…
Just the direction
Most applications however can do with a weaker requirement. They don't care about the length of the resulting vector, but only about its direction. In that case, what you are asking for is the kernel of the (n-1)×n matrix you can form by taking the input vectors as rows. Any element of that kernel will be orthogonal to the input vectors, and since computing kernels is a common task, you can build on a lot of existing implementations, e.g. Lapack. Details might depend on the language you are using.
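For the direction-only version, a small NumPy sketch (the last right-singular vector spans the kernel when the inputs have full rank n-1; the function name is mine):

import numpy as np

def orthogonal_direction(vectors):
    A = np.asarray(vectors, dtype=float)    # shape (n-1, n), one row per vector
    _, _, vt = np.linalg.svd(A)
    return vt[-1]                           # unit vector orthogonal to every row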
Combining these
You can even combine the two approaches above: compute one element of the kernel, and for a non-zero entry of that vector, also compute the corresponding (n-1)×(n-1) determinant which would give you that single coordinate using the first approach. You can then simply scale the vector so that the selected coordinate reaches the computed value, and all the other coordinates will match that one.
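And a sketch of the determinant-based construction itself, as signed (n-1)×(n-1) minors (my own transcription, correct for the ordinary 3-D cross product, but, as noted above, not the cheapest way to organize the computation):

import numpy as np

def generalized_cross(vectors):
    a = np.asarray(vectors, dtype=float)    # shape (n-1, n), one row per vector
    n = a.shape[1]
    v = np.empty(n)
    for j in range(n):
        minor = np.delete(a, j, axis=1)     # drop column j
        v[j] = (-1) ** (n + j + 1) * np.linalg.det(minor)
    return v

print(generalized_cross([[1, 0, 0], [0, 1, 0]]))   # [0. 0. 1.], the usual e1 x e2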

Sample uniformly at random from an n-dimensional unit simplex

Sampling uniformly at random from an n-dimensional unit simplex is the fancy way to say that you want n random numbers such that
they are all non-negative,
they sum to one, and
every possible vector of n non-negative numbers that sum to one is equally likely.
In the n=2 case you want to sample uniformly from the segment of the line x+y=1 (ie, y=1-x) that is in the positive quadrant.
In the n=3 case you're sampling from the triangle-shaped part of the plane x+y+z=1 that is in the positive octant of R3:
(Image from http://en.wikipedia.org/wiki/Simplex.)
Note that picking n uniform random numbers and then normalizing them so they sum to one does not work. You end up with a bias towards less extreme numbers.
Similarly, picking n-1 uniform random numbers and then taking the nth to be one minus the sum of them also introduces bias.
Wikipedia gives two algorithms to do this correctly: http://en.wikipedia.org/wiki/Simplex#Random_sampling
(Though the second one currently claims to only be correct in practice, not in theory. I'm hoping to clean that up or clarify it when I understand this better. I initially stuck in a "WARNING: such-and-such paper claims the following is wrong" on that Wikipedia page and someone else turned it into the "works only in practice" caveat.)
Finally, the question:
What do you consider the best implementation of simplex sampling in Mathematica (preferably with empirical confirmation that it's correct)?
Related questions
Generating a probability distribution
java random percentages
This code can work:
samples[n_] := Differences[Join[{0}, Sort[RandomReal[Range[0, 1], n - 1]], {1}]]
Basically you just choose n - 1 places on the interval [0,1] to split it up then take the size of each of the pieces using Differences.
A quick run of Timing on this shows that it's a little faster than Janus's first answer.
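Not Mathematica, but the same cut-points idea reads almost identically in NumPy, in case anyone wants to compare implementations across languages:

import numpy as np

def simplex_sample(n, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    cuts = np.sort(rng.random(n - 1))                     # n-1 cut points in [0, 1]
    return np.diff(np.concatenate(([0.0], cuts, [1.0])))  # the n gap lengths sum to 1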
After a little digging around, I found this page which gives a nice implementation of the Dirichlet Distribution. From there it seems like it would be pretty simple to follow Wikipedia's method 1. This seems like the best way to do it.
As a test:
In[14]:= RandomReal[DirichletDistribution[{1,1}],WorkingPrecision->25]
Out[14]= {0.8428995243540368880268079,0.1571004756459631119731921}
In[15]:= Total[%]
Out[15]= 1.000000000000000000000000
A plot of 100 samples: http://www.public.iastate.edu/~zdavkeos/simplex-sample.png
I'm with zdav: the Dirichlet distribution seems to be the easiest way ahead, and the algorithm for sampling the Dirichlet distribution which zdav refers to is also presented on the Wikipedia page on the Dirichlet distribution.
Implementation-wise, it is a bit of an overhead to do the full Dirichlet distribution first, as all you really need is n random Gamma[1,1] samples. Compare below:
Simple implementation
SimplexSample[n_, opts:OptionsPattern[RandomReal]] :=
  (#/Total[#])& @ RandomReal[GammaDistribution[1,1], n, opts]
Full Dirichlet implementation
DirichletDistribution/:Random`DistributionVector[
DirichletDistribution[alpha_?(VectorQ[#,Positive]&)],n_Integer,prec_?Positive]:=
Block[{gammas}, gammas =
Map[RandomReal[GammaDistribution[#,1],n,WorkingPrecision->prec]&,alpha];
Transpose[gammas]/Total[gammas]]
SimplexSample2[n_, opts:OptionsPattern[RandomReal]] :=
  (#/Total[#])& @ RandomReal[DirichletDistribution[ConstantArray[1,{n}]], opts]
Timing
Timing[Table[SimplexSample[10,WorkingPrecision-> 20],{10000}];]
Timing[Table[SimplexSample2[10,WorkingPrecision-> 20],{10000}];]
Out[159]= {1.30249,Null}
Out[160]= {3.52216,Null}
So the full Dirichlet is a factor of 3 slower. If you need m>1 sample points at a time, you could probably win further by doing (#/Total[#]&)/@RandomReal[GammaDistribution[1,1],{m,n}].
Here's a nice concise implementation of the second algorithm from Wikipedia:
SimplexSample[n_] := Rest@# - Most@# &[Sort@Join[{0,1}, RandomReal[{0,1}, n-1]]]
That's adapted from here: http://www.mofeel.net/1164-comp-soft-sys-math-mathematica/14968.aspx
(Originally it had Union instead of Sort@Join -- the latter is slightly faster.)
(See comments for some evidence that this is correct!)
I have created an algorithm for uniform random generation over a simplex. You can find the details in the paper in the following link:
http://www.tandfonline.com/doi/abs/10.1080/03610918.2010.551012#.U5q7inJdVNY
Briefly speaking, you can use the following recursion formulas to find random points over the n-dimensional simplex:
x_1 = 1 - R_1^(1/(n-1))
x_k = (1 - sum_{i=1}^{k-1} x_i) * (1 - R_k^(1/(n-k))), for k = 2, ..., n-1
x_n = 1 - sum_{i=1}^{n-1} x_i
where the R_k are independent random numbers between 0 and 1.
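A NumPy transcription of these formulas, in case it is useful (my own sketch based on the recursion above):

import numpy as np

def simplex_sample_recursive(n, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    x = np.empty(n)
    remaining = 1.0                                   # 1 minus the coordinates so far
    for k in range(1, n):                             # k = 1, ..., n-1
        x[k - 1] = remaining * (1.0 - rng.random() ** (1.0 / (n - k)))
        remaining -= x[k - 1]
    x[n - 1] = remaining                              # x_n = 1 - sum of the others
    return x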
Now I am trying to make an algorithm to generate uniform random samples from a constrained simplex, that is, the intersection of a simplex and a convex body.
Old question, and I'm late to the party, but this method is much faster than the accepted answer if implemented efficiently.
In Mathematica code:
#/Total[#,{2}]& @ Log @ RandomReal[{0,1},{n,d}]
In plain English, you generate n rows * d columns of randoms uniformly distributed between 0 and 1. Then take the Log of everything. Then normalize each row, dividing each element in the row by the row total. Now you have n samples uniformly distributed over the (d-1) dimensional simplex.
I found this method here: https://mathematica.stackexchange.com/questions/33652/uniformly-distributed-n-dimensional-probability-vectors-over-a-simplex
I'll admit, I'm not sure why it works, but it passes every statistical test I can think of. If anyone has a proof of why this method works, I'd love to see it!
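For what it's worth, the reason it works: -Log of a Uniform(0,1) variable is Exponential(1), i.e. Gamma(1,1), so normalizing each row reproduces the Gamma/Dirichlet construction given earlier in this thread, which is uniform on the simplex. A NumPy version of the same one-liner, if anyone wants to test it outside Mathematica:

import numpy as np

rng = np.random.default_rng()
n, d = 1000, 4                                # n samples on the (d-1)-simplex
g = -np.log(rng.random((n, d)))               # Exponential(1) == Gamma(1,1) draws
samples = g / g.sum(axis=1, keepdims=True)    # each row is a point on the simplex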

Resources