How to solve a non-square linear system A X = B with R?
(for the case where the system has no solution or has infinitely many solutions)
Example :
A=matrix(c(0,1,-2,3,5,-3,1,-2,5,-2,-1,1),3,4,T)
B=matrix(c(-17,28,11),3,1,T)
A
[,1] [,2] [,3] [,4]
[1,] 0 1 -2 3
[2,] 5 -3 1 -2
[3,] 5 -2 -1 1
B
[,1]
[1,] -17
[2,] 28
[3,] 11
If the matrix A has more rows than columns (an overdetermined system), you should use a least-squares fit.
If the matrix A has fewer rows than columns (an underdetermined system), you should use the singular value decomposition (SVD). Each approach does the best it can to give you a solution by making assumptions.
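For the overdetermined case, here is a minimal sketch using base R's qr.solve(), which returns the least-squares solution of a tall system (Atall and btall are made-up illustrative data):
Atall <- matrix(c(1, 1, 1, 1, 2, 3), 3, 2)  # 3 x 2: more rows than columns
btall <- c(1, 2, 2)
qr.solve(Atall, btall)                      # least-squares solution via QR
# the same coefficients come out of lm.fit(Atall, btall)$coefficients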
Here's a link that shows how to use SVD as a solver:
http://www.ecse.rpi.edu/~qji/CV/svd_review.pdf
Let's apply it to your problem and see if it works:
Your input matrix A and known RHS vector B:
> A=matrix(c(0,1,-2,3,5,-3,1,-2,5,-2,-1,1),3,4,T)
> B=matrix(c(-17,28,11),3,1,T)
> A
[,1] [,2] [,3] [,4]
[1,] 0 1 -2 3
[2,] 5 -3 1 -2
[3,] 5 -2 -1 1
> B
[,1]
[1,] -17
[2,] 28
[3,] 11
Let's decompose your A matrix:
> asvd = svd(A)
> asvd
$d
[1] 8.007081e+00 4.459446e+00 4.022656e-16
$u
[,1] [,2] [,3]
[1,] -0.1295469 -0.8061540 0.5773503
[2,] 0.7629233 0.2908861 0.5773503
[3,] 0.6333764 -0.5152679 -0.5773503
$v
[,1] [,2] [,3]
[1,] 0.87191556 -0.2515803 -0.1764323
[2,] -0.46022634 -0.1453716 -0.4694190
[3,] 0.04853711 0.5423235 0.6394484
[4,] -0.15999723 -0.7883272 0.5827720
> adiag = diag(1/asvd$d)
> adiag
[,1] [,2] [,3]
[1,] 0.1248895 0.0000000 0.00000e+00
[2,] 0.0000000 0.2242431 0.00000e+00
[3,] 0.0000000 0.0000000 2.48592e+15
Here's the key: the third singular value in d is very small (numerically zero); consequently, the corresponding diagonal element in adiag is huge. Before solving, set it equal to zero:
> adiag[3,3] = 0
> adiag
[,1] [,2] [,3]
[1,] 0.1248895 0.0000000 0
[2,] 0.0000000 0.2242431 0
[3,] 0.0000000 0.0000000 0
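As an aside, rather than eyeballing which entry to zero, you can apply a relative tolerance; this particular cutoff is an assumption on my part, roughly the convention pinv-style routines use:
tol <- max(dim(A)) * .Machine$double.eps * asvd$d[1]  # relative cutoff (assumed convention)
dinv <- ifelse(asvd$d > tol, 1/asvd$d, 0)             # zero the reciprocals of tiny singular values
adiag <- diag(dinv)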
Now let's compute the solution (see slide 16 in the link I gave you above):
> solution = asvd$v %*% adiag %*% t(asvd$u) %*% B
> solution
[,1]
[1,] 2.411765
[2,] -2.282353
[3,] 2.152941
[4,] -3.470588
Now that we have a solution, let's substitute it back to see if it gives us the same B:
> check = A %*% solution
> check
[,1]
[1,] -17
[2,] 28
[3,] 11
That's the B side you started with, so I think we're good.
Here's another nice SVD discussion from AMS:
http://www.ams.org/samplings/feature-column/fcarc-svd
The aim is to solve Ax = b for x, given A and b, where A is p by q, x is q by 1, and b is p by 1.
Approach 1: Generalized Inverse: Moore-Penrose
https://en.wikipedia.org/wiki/Generalized_inverse
Multiplying both sides of the equation by A' (the transpose of A), we get
A'Ax = A'b
Note that A'A is now a q by q matrix. If A'A is invertible, we can multiply both sides by its inverse, which gives
x = (A'A)^{-1} A'b
This is the theory behind the generalized inverse: G = (A'A)^{-1} A' is the pseudo-inverse of A. (When A'A is singular, as it is for this rank-2 example, MASS::ginv falls back on an SVD-based Moore-Penrose inverse.)
library(MASS)
ginv(A) %*% B
# [,1]
#[1,] 2.411765
#[2,] -2.282353
#[3,] 2.152941
#[4,] -3.470588
Approach 2: Generalized inverse using SVD.
@duffymo used SVD to obtain a pseudo-inverse of A.
However, the last elements of svd(A)$d may not always be as small as in this example, so you probably shouldn't use that method as is. Here's an example where none of the singular values of A is close to zero.
A <- as.matrix(iris[11:13, -5])
A
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 11 5.4 3.7 1.5 0.2
# 12 4.8 3.4 1.6 0.2
# 13 4.8 3.0 1.4 0.1
svd(A)$d
# [1] 10.7820526 0.2630862 0.1677126
One option would be to look at the singular values of cor(A):
svd(cor(A))$d
# [1] 2.904194e+00 1.095806e+00 1.876146e-16 1.155796e-17
Now it is clear that only two large singular values are present. So one can apply SVD on A to get the pseudo-inverse as below.
svda <- svd(A)
G = svda$v[, 1:2] %*% diag(1/svda$d[1:2]) %*% t(svda$u[, 1:2])
# to get x
G %*% B
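If I read the tol argument of MASS::ginv correctly (it drops singular values below tol times the largest one), the same rank-2 pseudo-inverse can be obtained directly; the value 0.02 is hand-picked for this example:
library(MASS)
G2 <- ginv(A, tol = 0.02)  # assumption: tol is relative to the largest singular value
all.equal(G, G2)           # should be TRUE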
Related
I am trying to multiply two matrices in R.
I know this multiplication can be done, but I am getting an error. Any idea why?
> d1
[,1]
[1,] -3
[2,] 0
[3,] 3
> t1
[,1] [,2] [,3]
[1,] 2 2 2
> t1 * d1
Error in t1 * d1 : non-conformable arrays
A few more details, starting from @ThomasIsCoding's comment:
d1<-as.matrix(c(-3,0,3))
t1<-t(as.matrix(c(2,2,2)))
d1 %*% t1
[,1] [,2] [,3]
[1,] -6 -6 -6
[2,] 0 0 0
[3,] 6 6 6
From the official R documentation about matrix multiplication:
A * B is the matrix of element-by-element products, and
A %*% B is the matrix product.
If x is a vector, then
x %*% A %*% x is a quadratic form.
You can find this documentation in R via help("%*%").
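As a small illustration of the quadratic form (made-up data):
x <- c(1, 2, 3)
A <- diag(3)     # 3 x 3 identity
x %*% A %*% x    # 1 x 1 matrix holding x'Ax = 1 + 4 + 9 = 14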
I am trying to get the correlation coefficient between the columns of each row of a matrix. I am really new to R and this is a real beginner question, one of the first tasks I have to do for class.
Matrix:
A2
[,1] [,2]
[1,] 4 -2
[2,] 8 -3
[3,] 6 1
[4,] 2 2
[5,] -1 1
I tried to use cor(A2), since I read it will automatically calculate the correlation coefficient for the columns, but it gives me the following result:
cor(A2)
[,1] [,2]
[1,] 1.0000000 -0.6338878
[2,] -0.6338878 1.0000000
When using cor(t(A2)), I get:
cor(t(A2))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 NA -1
[2,] 1 1 1 NA -1
[3,] 1 1 1 NA -1
[4,] NA NA NA 1 NA
[5,] -1 -1 -1 NA 1
But I expected it to have 5 rows, one column with the result in it.
There are several ways to use the cor() function. If you want to calculate the correlation between two columns in a matrix, then you can provide two arguments like this:
> cor(A2[,1], A2[,2])
[1] -0.6338878
If you input a single matrix as an argument, then it will return a correlation matrix.
> cor(A2)
[,1] [,2]
[1,] 1.0000000 -0.6338878
[2,] -0.6338878 1.0000000
In this case, position [1,1] is the correlation between A2[,1] and A2[,1] (which is exactly 1). In position [1,2], you can find the correlation between A2[,1] and A2[,2]. The correlation matrix is symmetric, and the diagonal is always 1, because the correlation of a vector with itself is 1.
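As for the NAs in cor(t(A2)) above: each row of A2 has only two entries, so the row-wise correlations are exactly 1 or -1, and row 4 is constant (2, 2), so its standard deviation is zero and the correlation is undefined:
sd(A2[4, ])   # 0: a constant row has zero variance, hence the NAs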
I've just started learning R and have had trouble finding an (understandable) explanation of what the prop.table() function does. I found the following explanation and example:
prop.table: Express Table Entries as Fraction of Marginal Table
Examples
m <- matrix(1:4, 2)
m
prop.table(m, 1)
But, as a beginner, I do not understand what this explanation means. I've also attempted to discern its functionality from the result of the above example, but I haven't been able to make sense of it.
With reference to the example above, what does the prop.table() function do? Furthermore, what is a "marginal table"?
The values in each cell divided by the sum of the 4 cells:
prop.table(m)
The value of each cell divided by the sum of the row cells:
prop.table(m, 1)
The value of each cell divided by the sum of the column cells:
prop.table(m, 2)
I think this can help. It covers all three cases: prop.table(m), prop.table(m, 1), and prop.table(m, 2).
m <- matrix(1:4, 2)
> m
[,1] [,2]
[1,] 1 3
[2,] 2 4
> prop.table(m) #sum=1+2+3+4=10, 1/10=0.1, 2/10=0.2, 3/10=0.3,4/10=0.4
[,1] [,2]
[1,] 0.1 0.3
[2,] 0.2 0.4
> prop.table(m,1)
[,1] [,2]
[1,] 0.2500000 0.7500000 # row 1: sum = 1+3 = 4; m[1,1] = 1/4 = 0.25, m[1,2] = 3/4 = 0.75
[2,] 0.3333333 0.6666667 # row 2: sum = 2+4 = 6; m[2,1] = 2/6 = 0.33, m[2,2] = 4/6 = 0.67
> prop.table(m,2)
[,1] [,2]
[1,] 0.3333333 0.4285714 # 1/3 = 0.33 (col 1 sum = 3), 3/7 = 0.43 (col 2 sum = 7)
[2,] 0.6666667 0.5714286 # 2/3 = 0.67, 4/7 = 0.57
When m is a 2D matrix, prop.table(m, 1) gives each entry as a fraction of its row marginal (the sum over that row), and prop.table(m, 2) gives each entry as a fraction of its column marginal (the sum over that column). In short, it's just a "% of the row or column total", if you don't want to worry about the term marginal.
Example:
m (a 3 x 2 matrix this time), with its row and column margins written in as ***:
     [,1] [,2]  ***
[1,]    1    4    5
[2,]    2    5    7
[3,]    3    6    9
 ***    6   15
> prop.table(m,1)
[,1] [,2]
[1,] 0.2000000 0.8000000
[2,] 0.2857143 0.7142857
[3,] 0.3333333 0.6666667
> prop.table(m,2)
[,1] [,2]
[1,] 0.1666667 0.2666667
[2,] 0.3333333 0.3333333
[3,] 0.5000000 0.4000000
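For reference, the *** margins in the table above are exactly what base R's margin.table() computes:
m <- matrix(1:6, nrow = 3)   # the 3 x 2 example above
margin.table(m, 1)           # row sums: 5 7 9
margin.table(m, 2)           # column sums: 6 15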
I've been trying to reproduce a cholesky-like covariance decomposition in R - like it is done in Matlab using cholcov(). Example taken from https://uk.mathworks.com/help/stats/cholcov.html.
Result of the original cholcov() function as of their example:
T =
-0.2113 0.7887 -0.5774 0
0.7887 -0.2113 -0.5774 0
1.1547 1.1547 1.1547 1.7321
I am trying to replicate this T in R. I tried:
C1 <- cbind(c(2,1,1,2), c(1,2,1,2), c(1,1,2,2), c(2,2,2,3))
T1 <- chol(C1)
C2 <- t(T1) %*% T1
My result:
[,1] [,2] [,3] [,4]
[1,] 1.414214 0.7071068 0.7071068 1.414214e+00
[2,] 0.000000 1.2247449 0.4082483 8.164966e-01
[3,] 0.000000 0.0000000 1.1547005 5.773503e-01
[4,] 0.000000 0.0000000 0.0000000 1.290478e-08
C2 recovers C1, but T1 is quite different from MATLAB's solution. I then thought maybe it would be a Cholesky decomposition of the covariance matrix:
T1 <- chol(cov(C1))
but I get
[,1] [,2] [,3] [,4]
[1,] 0.5773503 0.0000000 0.0000000 2.886751e-01
[2,] 0.0000000 0.5773503 0.0000000 2.886751e-01
[3,] 0.0000000 0.0000000 0.5773503 2.886751e-01
[4,] 0.0000000 0.0000000 0.0000000 3.725290e-09
which is not right either.
Could anyone give me a hint how cholcov() in Matlab is calculated so that I could replicate it in R?
You are essentially misusing the R function chol in this case. The cholcov function from MATLAB is a composite function:
If the covariance matrix is positive definite, it does a Cholesky factorization, returning a full-rank upper triangular Cholesky factor;
If the covariance matrix is only positive semi-definite, it does an eigendecomposition, returning a rectangular matrix.
R's chol, on the other hand, only does Cholesky factorization. The example you give, C1, falls into the second case, so we should resort to the eigen function in R.
E <- eigen(C1, symmetric = TRUE)
#$values
#[1] 7.000000e+00 1.000000e+00 1.000000e+00 2.975357e-17
#
#$vectors
# [,1] [,2] [,3] [,4]
#[1,] -0.4364358 0.000000e+00 8.164966e-01 -0.3779645
#[2,] -0.4364358 -7.071068e-01 -4.082483e-01 -0.3779645
#[3,] -0.4364358 7.071068e-01 -4.082483e-01 -0.3779645
#[4,] -0.6546537 8.967707e-16 -2.410452e-16 0.7559289
V <- E$vectors
D <- sqrt(E$values) ## square roots of the eigenvalues
Since the numerical rank is 3, we drop the last eigenvalue and eigenvector:
V1 <- V[, 1:3]
D1 <- D[1:3]
Thus the factor you want is:
R <- D1 * t(V1) ## diag(D1) %*% t(V1)
# [,1] [,2] [,3] [,4]
#[1,] -1.1547005 -1.1547005 -1.1547005 -1.732051e+00
#[2,] 0.0000000 -0.7071068 0.7071068 8.967707e-16
#[3,] 0.8164966 -0.4082483 -0.4082483 -2.410452e-16
We can verify that:
crossprod(R) ## t(R) %*% R
# [,1] [,2] [,3] [,4]
#[1,] 2 1 1 2
#[2,] 1 2 1 2
#[3,] 1 1 2 2
#[4,] 2 2 2 3
The R factor above is not the same as the one returned by cholcov, due to the different algorithms used for the eigendecomposition. R uses the LAPACK routine DSYEVR, which arranges the eigenvalues in non-increasing order. MATLAB's cholcov is not open-source, so I'm not sure what algorithm it uses, but it is easy to demonstrate that it does not arrange the eigenvalues in non-increasing order.
Consider the factor T returned by cholcov:
T <- structure(c(-0.2113, 0.7887, 1.1547, 0.7887, -0.2113, 1.1547,
-0.5774, -0.5774, 1.1547, 0, 0, 1.7321), .Dim = 3:4)
We can recover the eigenvalues by summing the squares of each row:
rowSums(T ^ 2)
# [1] 1.000086 1.000086 7.000167
There is some round-off error because T is not given to full precision, but we can see clearly that the eigenvalues are 1, 1, 7. On the other hand, we got 7, 1, 1 from R (recall D1).
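If you want a factor with the eigenvalues in the same (increasing) order that cholcov appears to use, you can simply reorder the columns before forming the factor; the decomposition is equally valid either way. A sketch, reusing E, V1, and D1 from above:
ord <- order(E$values[1:3])    # increasing order: 1, 1, 7
R2 <- D1[ord] * t(V1[, ord])   # reordered factor, rows scaled as before
crossprod(R2)                  # still recovers C1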
I'm pretty new to the R language and trying to find out how you can calculate the inverse of a matrix that isn't square. (non-square? irregular? I'm unsure of the correct terminology).
From my book and a quick Google search (see source), I've found you can use solve(a) to find the inverse of a matrix if a is square.
The matrix I have created is, and from what I understand, not square:
> matX<-matrix(c(rep(1, 8),2,3,4,0,6,4,3,7,-2,-4,3,5,7,8,9,11), nrow=8, ncol=3);
> matX
[,1] [,2] [,3]
[1,] 1 2 -2
[2,] 1 3 -4
[3,] 1 4 3
[4,] 1 0 5
[5,] 1 6 7
[6,] 1 4 8
[7,] 1 3 9
[8,] 1 7 11
>
Is there a function for inverting a matrix of this size, or will I have to do something to each element? The solve() function gives this error:
Error in solve.default(matX) : 'a' (8 x 3) must be square
The calculation I'm trying to achieve from the above matrix is: (matX'matX)^-1
Thanks in advance.
ginv
ginv in the MASS package will give the generalized (Moore-Penrose) inverse of a matrix. Premultiplying the original matrix by it gives the identity:
library(MASS)
inv <- ginv(matX)
# test it out
inv %*% matX
## [,1] [,2] [,3]
## [1,] 1.000000e+00 6.661338e-16 4.440892e-15
## [2,] -8.326673e-17 1.000000e+00 -1.110223e-15
## [3,] 6.938894e-17 8.326673e-17 1.000000e+00
As suggested in the comments this can be displayed in a nicer way using zapsmall:
zapsmall(inv %*% matX)
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1
The inverse of matX'matX is now:
tcrossprod(inv)
## [,1] [,2] [,3]
## [1,] 0.513763935 -0.104219636 -0.002371406
## [2,] -0.104219636 0.038700372 -0.007798748
## [3,] -0.002371406 -0.007798748 0.006625269
solve
However, if your aim is to calculate the inverse of matX'matX, you don't need ginv in the first place. This does not involve MASS:
solve(crossprod(matX))
## [,1] [,2] [,3]
## [1,] 0.513763935 -0.104219636 -0.002371406
## [2,] -0.104219636 0.038700372 -0.007798748
## [3,] -0.002371406 -0.007798748 0.006625269
svd
The SVD could also be used, and similarly does not require MASS:
with(svd(matX), v %*% diag(1/d^2) %*% t(v))
## [,1] [,2] [,3]
## [1,] 0.513763935 -0.104219636 -0.002371406
## [2,] -0.104219636 0.038700372 -0.007798748
## [3,] -0.002371406 -0.007798748 0.006625269
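A quick sanity check that the two routes agree up to floating point:
all.equal(solve(crossprod(matX)),
          with(svd(matX), v %*% diag(1/d^2) %*% t(v)))   # TRUE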
You can compute what's called a "Moore–Penrose pseudoinverse". Here's a function, exp.mat(), that will do this for you; an example of its use follows below.
exp.mat():
#The exp.mat function can calculate the pseudoinverse of a matrix (EXP = -1)
#and other matrix powers, such as the square root (EXP = 0.5) or the square
#root of the inverse (EXP = -0.5).
#The function arguments are a matrix (MAT), an exponent (EXP), and a tolerance
#level for non-zero singular values.
exp.mat <- function(MAT, EXP, tol = NULL) {
  MAT <- as.matrix(MAT)
  matdim <- dim(MAT)
  if (is.null(tol)) {
    tol <- min(1e-7, .Machine$double.eps * max(matdim) * max(MAT))
  }
  if (matdim[1] >= matdim[2]) {          # tall (or square) matrix
    svd1 <- svd(MAT)
    keep <- which(svd1$d > tol)          # keep only non-negligible singular values
    res <- t(svd1$u[, keep] %*% diag(svd1$d[keep]^EXP, nrow = length(keep)) %*%
             t(svd1$v[, keep]))
  }
  if (matdim[1] < matdim[2]) {           # wide matrix: work with the transpose
    svd1 <- svd(t(MAT))
    keep <- which(svd1$d > tol)
    res <- svd1$u[, keep] %*% diag(svd1$d[keep]^EXP, nrow = length(keep)) %*%
           t(svd1$v[, keep])
  }
  res
}
example of use:
source("exp.mat.R")
X <- matrix(c(rep(1, 8),2,3,4,0,6,4,3,7,-2,-4,3,5,7,8,9,11), nrow=8, ncol=3)
iX <- exp.mat(X, -1)
zapsmall(iX %*% X) # results in identity matrix
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
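As a quick check, the EXP = -1 case should agree with MASS::ginv:
library(MASS)
all.equal(exp.mat(X, -1), ginv(X))   # expected TRUE, up to floating point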