I see some similar questions:
Generate permutation matrix from permutation vector
https://math.stackexchange.com/questions/345166/what-is-the-name-for-a-non-square-permutation-matrix
Given elements:
elems = [1,2,3,4] # dimensions 1x4
If I have a vector:
M = [4,2,3,1] # dimensions 1x4
I know there is some permutation matrix P such that elems * P = M, which in this case would be:
P =
[
0 0 0 1
0 1 0 0
0 0 1 0
1 0 0 0
] # dimensions 4x4
# eg:
# elems * P = M
1x4 4x4 = 1x4
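For the record, a minimal R sketch of this known vector case (assuming the entries of elems are distinct) builds P by marking where the entries of elems and M match:
elems <- c(1, 2, 3, 4)
M <- c(4, 2, 3, 1)
P <- 1 * outer(elems, M, "==") # P[i, j] = 1 exactly when elems[i] lands in slot j of M
elems %*% P # reproduces M: 4 2 3 1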
Now, for my question, I am interested in what it would look like in the case when M is a non-vector, non-square matrix, like:
M' = [
4 2 3 1
4 3 2 1
1 2 3 4
] # dimensions 3x4
For the same
elems' = [
1 2 3 4
1 2 3 4
1 2 3 4
] # where elems is now repeated three times so the dimensions are conformable
# dimensions 3x4
#
# meaning P is still 4x4
You can see that each row of M_prime is still just a permutation of the corresponding row of elems_prime, so these are still permutations, but now multi-row rather than a single vector as originally.
I know I am not able to just do the following kind of thing, because the matrix is not square, and thus not invertible:
elems' * P = M'
P = elems'^-1 * M'
# eg:
# elems' * P = M'
3x4 4x4 = 3x4
When I try, in R at least, I see:
> P <- ginv(elems_prime) %*% M_prime
[,1] [,2] [,3] [,4]
[1,] 0.1 0.07777778 0.08888889 0.06666667
[2,] 0.2 0.15555556 0.17777778 0.13333333
[3,] 0.3 0.23333333 0.26666667 0.20000000
[4,] 0.4 0.31111111 0.35555556 0.26666667
Does this give me back M'?
> elems_prime %*% P
[,1] [,2] [,3] [,4]
[1,] 3 2.333333 2.666667 2
[2,] 3 2.333333 2.666667 2
[3,] 3 2.333333 2.666667 2
!= M' # No, does not.
So this is not right.
My questions are:
What is the right P that would correctly permute the elems' matrix into the M' matrix?
What is the name of the algorithm to find it?
(implementation in R, Haskell, or pseudocode is great)
Is there a way to restrict the values of P to be integers, preferably 0 or 1?
For R reproducibility:
> dput(elems_prime)
structure(c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4), .Dim = 3:4)
> dput(M_prime)
structure(c(4, 4, 1, 2, 3, 2, 3, 2, 3, 1, 1, 4), .Dim = 3:4)
Notice that the column space of M' has higher dimension than the column space of elems', i.e. rank(M') > rank(elems'). This implies that there does not exist a linear mapping from elems' to M', because a linear mapping cannot increase the rank (the dimension of the row or column space) of a matrix (it is useful to think about this as a transformation of basis).
It follows that any M' generated by elems' * P can have rank at most 1, since elems' itself has rank 1, leaving only the conventional permutation matrices as candidates for P.
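A minimal check of this rank argument in R, using the dput objects above:
qr(elems_prime)$rank
# [1] 1
qr(M_prime)$rank
# [1] 3
Since rank(elems' %*% P) <= rank(elems') = 1 < 3 = rank(M'), no such P can exist.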
Going from M' back to elems' is an entirely different question, and this asymmetry is also noteworthy.
When M is not a vector, this is not possible.
Here is why. In general, if we multiply an n x m matrix by an m x p matrix, we get an n x p matrix. Here elems is a vector, i.e. a 1x4 matrix, so elems * P has to be a 1xp matrix of some sort. By making P wider you can make M longer, but you would have to change elems itself to make M taller.
Incidentally, in linear algebra it is standard to flip vectors to be columns and put the matrices on their left. The reason is that the matrix represents a linear function, and this puts the matrix in the same place where the linear function goes, which is very convenient when moving between functional notation and matrix notation. Also, if you have to write a square matrix anyway, a vertical vector on the right takes less room on the page than a horizontal one on the left.
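As a small illustration with the question's example (reusing the P from the vector case above): transposing elems %*% P = M gives exactly this column convention, with the matrix on the left.
t(P) %*% matrix(elems) # a 4x1 column: 4, 2, 3, 1, i.e. t(M)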
When applying Matrix::qr() to a sparse matrix in R, the output is quite different from that of base::qr. There are V, beta, p, R, q, but no rank and pivot. Below is a small example. I want to detect the linearly dependent columns of the sparse matrix A, which requires the pivot and rank. How can I get this information?
library(Matrix)
A <- matrix(c(0, -2, 1, 0,
0, -4, 2, 0,
1, -2, 1, 2,
1, -2, 1, 2,
1, -2, 1, 2), nrow = 5, byrow = T)
A.s <- as(A, "dgCMatrix")
qrA.M <- Matrix::qr(A.s)
qrA.R <- base::qr(A)
There is another related but not answered question, Get base::qr pivoting in Matrix::qr method
Let me first reorder the columns of your example matrix A a little bit:
A <- A[, c(1,4,3,2)]
# [,1] [,2] [,3] [,4]
#[1,] 0 0 1 -2
#[2,] 0 0 2 -4
#[3,] 1 2 1 -2
#[4,] 1 2 1 -2
#[5,] 1 2 1 -2
You did not mention in your question why rank and pivot returned by a dense QR factorization are useful. But I think this is what you are looking for:
dQR <- base::qr(A)
with(dQR, pivot[1:rank])
#[1] 1 3
So columns 1 and 3 of A give a basis for A's column space.
I don't really understand the logic of a sparse QR factorization. The 2nd column of A is perfectly linearly dependent on the 1st column, so I expect column pivoting to take place during the factorization. But very much to my surprise, it doesn't!
library(Matrix)
sA <- Matrix(A, sparse = TRUE)
sQR <- Matrix::qr(sA)
sQR@q + 1L
#[1] 1 2 3 4
No column pivoting is done! As a result, there isn't an obvious way to determine the rank of A.
At this moment, I could only think of performing a dense QR factorization on the R factor to get what you are looking for.
R <- as.matrix(Matrix::qrR(sQR))
QRR <- base::qr(R)
with(QRR, pivot[1:rank])
#[1] 1 3
Why does this work? Well, the Q factor has orthogonal, hence linearly independent, columns; thus the columns of R inherit the linear dependence or independence of the columns of A. For a matrix with many more rows than columns, the computational cost of this 2nd QR factorization is negligible.
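As a quick sanity check of that claim on this example (reusing the R computed above): column 2 of the reordered A is exactly 2 * column 1, and the same dependency holds among the columns of R.
max(abs(A[, 2] - 2 * A[, 1])) # 0: the dependency in A
max(abs(R[, 2] - 2 * R[, 1])) # ~0 up to floating-point rounding: the same dependency in R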
I need to figure out the algorithm behind a sparse QR factorization before coming up with a better idea.
I've been looking at a similar problem and I ended up not relying on Matrix::qr() to calculate rank and to detect linear dependency. Instead I programmed the function GaussIndependent in the package SSBtools.
In the package examples I included an example that demonstrates a wrong conclusion from rankMatrix(x, method = "qr"). The input x is a 44x20 dummy matrix.
Starting with your example matrix, A.s:
library(SSBtools)
GaussIndependent(A.s) # List of logical vectors specifying independent rows and columns
# $rows
# [1] TRUE FALSE TRUE FALSE FALSE
#
# $columns
# [1] TRUE TRUE FALSE FALSE
GaussRank(A.s) # the rank
# [1] 2
In this question, a way was found to compute a particular solution to a non-square linear system that has infinitely many solutions. This leads to another question:
How to find all the solutions for a non-square linear system with infinitely many solutions, with R? (see below for a possible description of the infinite set of solutions)
Example: the linear system
x+y+z=1
x-y-2z=2
is equivalent to A X = B with:
A=matrix(c(1,1,1,1,-1,-2),2,3,T)
B=matrix(c(1,2),2,1,T)
A
[,1] [,2] [,3]
[1,] 1 1 1
[2,] 1 -1 -2
B
[,1]
[1,] 1
[2,] 2
We can describe the infinite set of solutions with:
x = 3/2 + (1/2) z
y = -1/2 + (-3/2) z
z in R
Thus, R could describe the set of solutions this way:
> solve2(A,B)
$principal
[1] 1 2 # this means that x and y will be described
$free
[1] 3 # this means that the 3rd variable (i.e. z) is free in the set of real numbers
$P
[1] 1.5 -0.5
$Q
[1] 0.5 -1.5
This means that every solution can be created with:
z = 236782 # any value would be ok
solve2(A,B)$P + z * solve2(A,B)$Q # this gives x and y
About the maths: such a decomposition always exists when the linear system has infinitely many solutions; that part is fine. The question is: is there something in R to do this?
You can solve equations like these using the generalized inverse of A.
library(MASS)
ginv(A) %*% B
# 1.2857143
# 0.1428571
#-0.4285714
A %*% ginv(A) %*% B
# 1
# 2
So, with help from @Bhas:
gen_soln <- function(vec) {
  G <- ginv(A)
  W <- diag(3) - G %*% A  # projector onto the null space of A
  G %*% B + W %*% vec     # particular solution plus a kernel component
}
You can now find many solutions by providing a vector of length 3 to the gen_soln function. For example,
one_from_inf <- gen_soln(1:3)
one_from_inf
#[1,] 1.35714286
#[2,] -0.07142857
#[3,] -0.28571429
# Test the solution.
A %*% one_from_inf
# [,1]
#[1,] 1
#[2,] 2
# Using random number generator
A %*% gen_soln(rnorm(3))
# [,1]
#[1,] 1
#[2,] 2
The general solution to
A*x = b
is
x = x0 + z
where x0 is any solution and z is in the kernel of A
As pointed out above you can find a particular solution x0 by using the generalised inverse. You can also use the SVD to find a basis for the kernel of A:
A = U*S*V'
where U and V are orthogonal and S diagonal, with, say, the last k entries on the diagonal 0 (and the others non-zero).
It follows that the last k columns of V form a basis for the kernel of A, and if we call these z1, ..., zk, then the solutions of the original equation are
x = x0 + c1*z1 + ... + ck*zk
for any real c1, ..., ck.
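Here is a minimal R sketch of this recipe, using the example A and B from above (the tolerance for deciding which singular values count as zero is an assumption):
s <- svd(A, nv = ncol(A)) # request all right singular vectors
r <- sum(s$d > 1e-10 * max(s$d)) # numerical rank of A
Z <- s$v[, -seq_len(r), drop = FALSE] # kernel basis: last n - r columns of V
x0 <- MASS::ginv(A) %*% B # a particular solution
x <- x0 + Z %*% 5 # any coefficient gives another solution
A %*% x # reproduces B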
Say I have a matrix:
A <- matrix(c(2,4,3,1,5,7), nrow=3, ncol=2)
colnames(A) <- c ("x", "y")
A
x y
[1,] 2 1
[2,] 4 5
[3,] 3 7
Is there a way to access each row of the matrix using a for loop?
What I'm trying to do is total the euclidean distance between each successive point (x,y). So in this example, I would find the total distance between:
(2,1) and (4,5)
(4,5) and (3,7)
So first I would find the distance between each of the two points, ie:
(2,1) and (4,5) => (|4-2|,|5-1|) => (2,4)
(4,5) and (3,7) => (|3-4|,|7-5|) => (1,2)
Then I would turn it into euclidean distance:
(2,4) => sqrt(2^2 + 4^2) => 4.47
(1,2) => sqrt(1^2 + 2^2) => 2.24
And total the distance
4.47 + 2.24 = 6.71
I'm quite confident that if I can access each row of the matrix as a vector, I can easily code this. However, I would love to hear any better ways of doing this.
I was also looking into turning the matrix into a list of lists (ie a list of (x,y) points, where each point is a list of the x and y value), or a list of points (x,y).
I'm not very experienced in programming and I've just started using R, so sorry if I'm not making sense.
You can try the following:
for (i in 1:nrow(A)) {
  row <- A[i, ]
  # do something with the row
}
However, I would love to hear any better ways of doing this.
R has built-in functions for distance calculations, such as dist, e.g.:
out <- as.matrix(dist(A))
# 1 2 3
#1 0.000000 4.472136 6.082763
#2 4.472136 0.000000 2.236068
#3 6.082763 2.236068 0.000000
You can extract the first off-diagonal, which holds the values you want, using:
row(out) - col(out)==1
# [,1] [,2] [,3]
#[1,] FALSE FALSE FALSE
#[2,] TRUE FALSE FALSE
#[3,] FALSE TRUE FALSE
Thus:
out[row(out) - col(out)==1]
#[1] 4.472136 2.236068
sum(out[row(out) - col(out)==1])
#[1] 6.708204
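For the specific case of successive rows, a vectorized alternative (a sketch) avoids building the full distance matrix:
# differences between successive rows, then row-wise Euclidean norms
sum(sqrt(rowSums(diff(A)^2)))
#[1] 6.708204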
To run a canonical correspondence analysis (cca, package ade4) I need a positive definite variance matrix. (Which in theory is always the case.)
but:
> a <- matrix(c(2,59,4,7,10,0,7,0,0,0,475,18714,4070,97,298,0,1,0,17,7,4,1,4,18,36),nrow=5)
> a
[,1] [,2] [,3] [,4] [,5]
[1,] 2 0 475 0 4
[2,] 59 7 18714 1 1
[3,] 4 0 4070 0 4
[4,] 7 0 97 17 18
[5,] 10 0 298 7 36
> eigen(var(a))
$values
[1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01
[5] -1.377693e-09
The last eigenvalue is -1.377693e-09, which is < 0, but the theoretical value is > 0.
I can't run the function if one of the eigenvalues is < 0, and I really don't know how to fix this without changing the code of the function cca().
Thanks for any help.
You can change the input, just a little bit, to make the matrix positive definite.
If you have the variance matrix, you can truncate the eigenvalues:
correct_variance <- function(V, minimum_eigenvalue = 0) {
  V <- ( V + t(V) ) / 2   # symmetrize, to guard against rounding asymmetry
  e <- eigen(V)
  # truncate the eigenvalues from below and rebuild the matrix
  e$vectors %*% diag(pmax(minimum_eigenvalue, e$values)) %*% t(e$vectors)
}
v <- correct_variance( var(a) )
eigen(v)$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 1.326768e-08
Using the singular value decomposition, you can do the same thing directly with a.
truncate_singular_values <- function(a, minimum = 0) {
  s <- svd(a)
  # raise small singular values to the floor, then rebuild a
  s$u %*% diag( ifelse( s$d > minimum, s$d, minimum ) ) %*% t(s$v)
}
svd(a)$d
# [1] 1.916001e+04 4.435562e+01 1.196984e+01 8.822299e+00 1.035624e-01
eigen(var( truncate_singular_values(a,.2) ))$values
# [1] 6.380066e+07 1.973680e+02 3.551494e+01 1.033452e+01 6.079487e-09
However, this changes your matrix a by up to 0.1, which is a lot.
(I suspect the change is that large because the matrix a is square: with only 5 observations of 5 variables, var(a) has rank at most 4, so one of its eigenvalues is exactly 0.)
b <- truncate_singular_values(a,.2)
max( abs(b-a) )
# [1] 0.09410187
We can actually do better simply by adding some noise.
b <- a + 1e-6*runif(length(a),-1,1) # Repeat if needed
eigen(var(b))$values
# [1] 6.380066e+07 1.973658e+02 3.551492e+01 1.033096e+01 2.492604e-09
Here are two approaches:
V <- var(a)
# 1
library(Matrix)
nearPD(V)$mat
# 2 perturb diagonals
eps <- 0.01
V + eps * diag(ncol(V))
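A quick check of both (a sketch; nearPD returns an S4 matrix, hence the as.matrix):
min(eigen(as.matrix(nearPD(V)$mat))$values) # >= 0 up to rounding
min(eigen(V + eps * diag(ncol(V)))$values) # the old minimum shifted up by eps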
R has a qr() function, which performs QR decomposition using either LINPACK or LAPACK (in my experience, the latter is 5% faster). The main object returned is a matrix "qr" whose upper triangle contains R (i.e. R = qr[upper.tri(qr)]). So far so good. The lower triangular part of qr contains Q "in compact form". One can extract Q from the qr decomposition by using qr.Q(). I would like to find the inverse of qr.Q(). In other words, I do have Q and R, and would like to put them into a "qr" object. R is trivial but Q is not. The goal is to apply qr.solve() to it, which is much faster than solve() on large systems.
Introduction
R uses the LINPACK dqrdc routine, by default, or the LAPACK DGEQP3 routine, when specified, for computing the QR decomposition. Both routines compute the decomposition using Householder reflections. An m x n matrix A is decomposed into an m x n economy-size orthogonal matrix (Q) and an n x n upper triangular matrix (R) as A = QR, where Q can be computed by the product of t Householder reflection matrices, with t being the lesser of m-1 and n: Q = H1 H2 ... Ht.
Each reflection matrix Hi can be represented by a length-(m-i+1) vector. For example, H1 requires a length-m vector for compact storage. All but one entry of this vector is placed in the first column of the lower triangle of the input matrix (the diagonal is used by the R factor). Therefore, each reflection needs one more scalar of storage, and this is provided by an auxiliary vector (called $qraux in the result from R's qr).
The compact representation used is different between the LINPACK and LAPACK routines.
The LINPACK Way
A Householder reflection is computed as Hi = I - vi vi^T / pi, where I is the identity matrix, pi is the corresponding entry in $qraux, and vi is as follows:
vi[1..i-1] = 0
vi[i] = pi
vi[i+1..m] = A[i+1..m, i] (i.e., a column of the lower triangle of A after calling qr)
LINPACK Example
Let's work through the example from the QR decomposition article at Wikipedia in R.
The matrix being decomposed is
> A <- matrix(c(12, 6, -4, -51, 167, 24, 4, -68, -41), nrow=3)
> A
[,1] [,2] [,3]
[1,] 12 -51 4
[2,] 6 167 -68
[3,] -4 24 -41
We do the decomposition, and the most relevant portions of the result are shown below:
> Aqr = qr(A)
> Aqr
$qr
[,1] [,2] [,3]
[1,] -14.0000000 -21.0000000 14
[2,] 0.4285714 -175.0000000 70
[3,] -0.2857143 0.1107692 -35
[snip...]
$qraux
[1] 1.857143 1.993846 35.000000
[snip...]
This decomposition was done (under the covers) by computing two Householder reflections and multiplying them by A to get R. We will now recreate the reflections from the information in $qr and $qraux.
> p = Aqr$qraux # for convenience
> v1 <- matrix(c(p[1], Aqr$qr[2:3,1]))
> v1
[,1]
[1,] 1.8571429
[2,] 0.4285714
[3,] -0.2857143
> v2 <- matrix(c(0, p[2], Aqr$qr[3,2]))
> v2
[,1]
[1,] 0.0000000
[2,] 1.9938462
[3,] 0.1107692
> I = diag(3) # identity matrix
> H1 = I - v1 %*% t(v1)/p[1] # I - v1*v1^T/p[1]
> H2 = I - v2 %*% t(v2)/p[2] # I - v2*v2^T/p[2]
> Q = H1 %*% H2
> Q
[,1] [,2] [,3]
[1,] -0.8571429 0.3942857 0.33142857
[2,] -0.4285714 -0.9028571 -0.03428571
[3,] 0.2857143 -0.1714286 0.94285714
Now let's verify the Q computed above is correct:
> qr.Q(Aqr)
[,1] [,2] [,3]
[1,] -0.8571429 0.3942857 0.33142857
[2,] -0.4285714 -0.9028571 -0.03428571
[3,] 0.2857143 -0.1714286 0.94285714
Looks good! We can also verify QR is equal to A.
> R = qr.R(Aqr) # extract R from Aqr$qr
> Q %*% R
[,1] [,2] [,3]
[1,] 12 -51 4
[2,] 6 167 -68
[3,] -4 24 -41
The LAPACK Way
A Householder reflection is computed as Hi = I - pi vi vi^T, where I is the identity matrix, pi is the corresponding entry in $qraux, and vi is as follows:
vi[1..i-1] = 0
vi[i] = 1
vi[i+1..m] = A[i+1..m, i] (i.e., a column of the lower triangle of A after calling qr)
There is another twist when using the LAPACK routine in R: column pivoting is used, so the decomposition is solving a different, related problem: AP = QR, where P is a permutation matrix.
LAPACK Example
This section does the same example as before.
> A <- matrix(c(12, 6, -4, -51, 167, 24, 4, -68, -41), nrow=3)
> Bqr = qr(A, LAPACK=TRUE)
> Bqr
$qr
[,1] [,2] [,3]
[1,] 176.2554964 -71.1694118 1.668033
[2,] -0.7348557 35.4388886 -2.180855
[3,] -0.1056080 0.6859203 -13.728129
[snip...]
$qraux
[1] 1.289353 1.360094 0.000000
$pivot
[1] 2 3 1
attr(,"useLAPACK")
[1] TRUE
[snip...]
Notice the $pivot field; we will come back to that. Now we generate Q from the information in Bqr.
> p = Bqr$qraux # for convenience
> v1 = matrix(c(1, Bqr$qr[2:3,1]))
> v1
[,1]
[1,] 1.0000000
[2,] -0.7348557
[3,] -0.1056080
> v2 = matrix(c(0, 1, Bqr$qr[3,2]))
> v2
[,1]
[1,] 0.0000000
[2,] 1.0000000
[3,] 0.6859203
> H1 = I - p[1]*v1 %*% t(v1) # I - p[1]*v1*v1^T
> H2 = I - p[2]*v2 %*% t(v2) # I - p[2]*v2*v2^T
> Q = H1 %*% H2
> Q
[,1] [,2] [,3]
[1,] -0.2893527 -0.46821615 -0.8348944
[2,] 0.9474882 -0.01602261 -0.3193891
[3,] 0.1361660 -0.88346868 0.4482655
Once again, the Q computed above agrees with the R-provided Q.
> qr.Q(Bqr)
[,1] [,2] [,3]
[1,] -0.2893527 -0.46821615 -0.8348944
[2,] 0.9474882 -0.01602261 -0.3193891
[3,] 0.1361660 -0.88346868 0.4482655
Finally, let's compute QR.
> R = qr.R(Bqr)
> Q %*% R
[,1] [,2] [,3]
[1,] -51 4 12
[2,] 167 -68 6
[3,] 24 -41 -4
Notice the difference? QR is A with its columns permuted according to the order in Bqr$pivot above.
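You can confirm this directly:
> all.equal(Q %*% R, A[, Bqr$pivot])
[1] TRUE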
I have researched this same problem as the OP and I don't think it is possible. Basically the OP's question is whether, given the explicitly computed Q, one can recover H1 H2 ... Ht. I do not think this is possible without computing the QR from scratch, but I would also be very interested to know whether such a solution exists.
I have a similar issue to the OP's, but in a different context: my iterative algorithm needs to mutate the matrix A by adding columns and/or rows. The first time, the QR is computed using DGEQRF, and is thus in the compact LAPACK format. After the matrix A is mutated, e.g. with new rows, I can quickly build a new set of reflectors or rotators that annihilate the non-zero elements of the lowest diagonal of my existing R and build a new R, but now I have a set of H1_old H2_old ... Hn_old and H1_new H2_new ... Hn_new (and similarly taus) which can't be mixed up into a single compact QR representation. The two possibilities I have are, and maybe the OP has the same two possibilities:
Always maintain Q and R explicitly separated, whether computed the first time or after every update, at the cost of extra flops but keeping the required memory well bounded.
Stick to the compact LAPACK format, but then every time a new update comes in, keep a list of all these mini sets of update reflectors. At the point of solving the system, one would do a big Q'*c, i.e. H1_u3*H2_u3*...*Hn_u3 * H1_u2*H2_u2*...*Hn_u2 * H1_u1*H2_u1*...*Hn_u1 * H1*H2*...*Hn * c, where ui is the QR update number. This is potentially a lot of multiplications to do and memory to keep track of, but it is definitely the fastest way.
The long answer from David basically explains what the compact QR format is, but not how to get to this compact QR format with the explicitly computed Q and R as input.