I have a variance covariance matrix S:
> S
[,1] [,2]
[1,] 4 -3
[2,] -3 9
I am trying to find an inverse of it.
The code I have is:
>invS <- (1/((S[1,1]*S[2,2])-(S[1,2]*S[2,1])))*S
[,1] [,2]
[1,] 0.1481481 -0.1111111
[2,] -0.1111111 0.3333333
However, if I use solve(), I get this:
>invSalt <- solve(S)
[,1] [,2]
[1,] 0.3333333 0.1111111
[2,] 0.1111111 0.1481481
Why is invS incorrect? What should I change to correct it?
You correctly found the determinant in the denominator, but the rest is wrong.
The off-diagonal elements should have the opposite sign, while the diagonal elements should be swapped. Both of those things are clearly visible when comparing the two matrices.
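For reference, this is just the standard 2x2 inverse formula written out:
$$
\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1}
= \frac{1}{ad - bc}
\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
$$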
That's not the most convenient thing to do by hand, so solve is really better. If you insist on doing it manually, then you could use
matrix(rev(S), 2, 2) / (prod(diag(S)) - S[1, 2] * S[2, 1]) * (2 * diag(1, 2) - 1)
# [,1] [,2]
# [1,] 0.3333333 0.1111111
# [2,] 0.1111111 0.1481481
The correct formula is
(1/((S[1,1]*S[2,2])-(S[1,2]*S[2,1])))* matrix(c(S[2,2], -S[2,1], -S[1,2], S[1,1]),2)
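As a quick sanity check (reusing S and invSalt from above; invS2 is just an illustrative name), the formula agrees with solve():
invS2 <- (1/((S[1,1]*S[2,2])-(S[1,2]*S[2,1]))) * matrix(c(S[2,2], -S[2,1], -S[1,2], S[1,1]), 2)
all.equal(invS2, invSalt)   # TRUE
S %*% invS2                 # (numerically) the 2 x 2 identity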
With a matrix created by the base matrix() function, solve() inverts the negated matrix as expected:
B <- matrix(c(2,1,1,2),nrow=2,ncol=2)
solve(-B)
[,1] [,2]
[1,] -0.6666667 0.3333333
[2,] 0.3333333 -0.6666667
However, it produces the opposite (sign-flipped) matrix when the matrix is defined using the "Matrix" package:
library(Matrix)
C <- Matrix(c(2,1,1,2),nrow=2,ncol=2)
solve(-C)
2 x 2 Matrix of class "dsyMatrix"
[,1] [,2]
[1,] 0.6666667 -0.3333333
[2,] -0.3333333 0.6666667
Now solve(-C) is the same as solve(C). Why?
In R 4.1.0, solve(-C) gives the same output as solve(-B):
> solve(-C)
2 x 2 Matrix of class "dsyMatrix"
[,1] [,2]
[1,] -0.6666667 0.3333333
[2,] 0.3333333 -0.6666667
> solve(-B)
[,1] [,2]
[1,] -0.6666667 0.3333333
[2,] 0.3333333 -0.6666667
There are multiple methods for solve; you can list them with
showMethods(solve)
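To narrow that listing down to the class involved here (dsyMatrix is the class printed above; the classes argument is just one way to filter the output):
showMethods(solve, classes = "dsyMatrix")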
Thanks to everyone for the helpful tips. I updated R from 4.0.3 to 4.1.0 and the issue has disappeared. It seems that with version 4.0.3, Matrix::solve(a + b*M) returned inv(M) when M was defined with Matrix(data, nrow, ncol), in contrast to the standard lower-case base::matrix(data, nrow, ncol).
I've just started learning R and have had trouble finding an (understandable) explanation of what the prop.table() function does. I found the following explanation and example:
prop.table: Express Table Entries as Fraction of Marginal Table
Examples
m <- matrix(1:4, 2)
m
prop.table(m, 1)
But, as a beginner, I do not understand what this explanation means. I've also attempted to discern its functionality from the result of the above example, but I haven't been able to make sense of it.
With reference to the example above, what does the prop.table() function do? Furthermore, what is a "marginal table"?
The value of each cell divided by the sum of all 4 cells:
prop.table(m)
The value of each cell divided by the sum of its row:
prop.table(m, 1)
The value of each cell divided by the sum of its column:
prop.table(m, 2)
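Equivalently, prop.table() just divides by the corresponding marginal sums; the following one-liners (sweep() with rowSums()/colSums()) give the same results:
prop.table(m)      # same as m / sum(m)
prop.table(m, 1)   # same as sweep(m, 1, rowSums(m), "/")
prop.table(m, 2)   # same as sweep(m, 2, colSums(m), "/")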
I think this can help; it covers all of prop.table(m), prop.table(m, 1), and prop.table(m, 2):
m <- matrix(1:4, 2)
> m
[,1] [,2]
[1,] 1 3
[2,] 2 4
> prop.table(m) #sum=1+2+3+4=10, 1/10=0.1, 2/10=0.2, 3/10=0.3,4/10=0.4
[,1] [,2]
[1,] 0.1 0.3
[2,] 0.2 0.4
> prop.table(m,1)
[,1] [,2]
[1,] 0.2500000 0.7500000 #row1: sum=1+3=4, m[1,1]=1/4=0.25, m[1,2]=3/4=0.75
[2,] 0.3333333 0.6666667 #row2: sum=2+4=6, m[2,1]=2/6=0.33, m[2,2]=4/6=0.67
> prop.table(m,2)
[,1] [,2]
[1,] 0.3333333 0.4285714 #col1: sum=1+2=3, m[1,1]=1/3=0.33, m[2,1]=2/3=0.67
[2,] 0.6666667 0.5714286 #col2: sum=3+4=7, m[1,2]=3/7=0.43, m[2,2]=4/7=0.57
When m is a 2D matrix: prop.table(m, 1) gives each cell as a fraction of its row marginal (the sum over that row), and prop.table(m, 2) gives each cell as a fraction of its column marginal (the sum over that column). In short, it is just a "% of the row or column total", if you don't want to worry about the term "marginal".
Example: m <- matrix(1:6, 3), shown with its row and column margins (marked ***):
      [,1] [,2]  ***
[1,]     1    4    5
[2,]     2    5    7
[3,]     3    6    9
 ***     6   15
> prop.table(m,1)
[,1] [,2]
[1,] 0.2000000 0.8000000
[2,] 0.2857143 0.7142857
[3,] 0.3333333 0.6666667
> prop.table(m,2)
[,1] [,2]
[1,] 0.1666667 0.2666667
[2,] 0.3333333 0.3333333
[3,] 0.5000000 0.4000000
I am running into a scaling problem with several matrices I am working with. Here is an example of my matrix:
my_matrix = matrix(data = c(1,2,3,4,5,6,7,8,25), nrow = 3)
My current method of scaling to the 0-1 range uses the equation (value - min) / (max - min). Applying this to the entire matrix gives the following:
mn = min(my_matrix); mx = max(my_matrix);
(my_matrix - mn) / (mx - mn)
[,1] [,2] [,3]
[1,] 0.00000000 0.1250000 0.2500000
[2,] 0.04166667 0.1666667 0.2916667
[3,] 0.08333333 0.2083333 1.0000000
I understand the calculation, as well as why I am receiving this matrix. However, I would much prefer to scale 0 to 1 based on percentiles, and receive this matrix instead:
[,1] [,2] [,3]
[1,] 0.11111111 0.4444444 0.7777778
[2,] 0.22222222 0.5555556 0.8888889
[3,] 0.33333333 0.6666667 1.0000000
Does anybody know an easy way to do this? Thanks!
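A minimal sketch of one percentile-style rescaling, assuming the target above is simply each value's rank divided by the total number of entries (rank() is one way to express that; ties would need a tie-breaking choice):
matrix(rank(my_matrix) / length(my_matrix), nrow = nrow(my_matrix))
# reproduces the target matrix above: ranks 1..9 divided by 9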
I've been trying to reproduce a Cholesky-like covariance decomposition in R, like it is done in MATLAB with cholcov(). Example taken from https://uk.mathworks.com/help/stats/cholcov.html.
Result of the original cholcov() function as of their example:
T =
-0.2113 0.7887 -0.5774 0
0.7887 -0.2113 -0.5774 0
1.1547 1.1547 1.1547 1.7321
I am trying to replicate this T in R. I tried:
C1 <- cbind(c(2,1,1,2), c(1,2,1,2), c(1,1,2,2), c(2,2,2,3))
T1 <- chol(C1)
C2 <- t(T1) %*% T1
My result:
[,1] [,2] [,3] [,4]
[1,] 1.414214 0.7071068 0.7071068 1.414214e+00
[2,] 0.000000 1.2247449 0.4082483 8.164966e-01
[3,] 0.000000 0.0000000 1.1547005 5.773503e-01
[4,] 0.000000 0.0000000 0.0000000 1.290478e-08
C2 recovers C1, but T1 is quite different from MATLAB's solution. I then thought maybe it would be a Cholesky decomposition of the covariance matrix:
T1 <- chol(cov(C1))
but I get
[,1] [,2] [,3] [,4]
[1,] 0.5773503 0.0000000 0.0000000 2.886751e-01
[2,] 0.0000000 0.5773503 0.0000000 2.886751e-01
[3,] 0.0000000 0.0000000 0.5773503 2.886751e-01
[4,] 0.0000000 0.0000000 0.0000000 3.725290e-09
which is not right either.
Could anyone give me a hint how cholcov() in Matlab is calculated so that I could replicate it in R?
You are essentially misusing the R function chol here. MATLAB's cholcov is a composite function:
If the covariance matrix is positive definite, it does a Cholesky factorization, returning a full-rank upper-triangular Cholesky factor;
If the covariance matrix is only positive semi-definite, it does an eigendecomposition instead, returning a rectangular matrix.
R's chol, on the other hand, only does Cholesky factorization. The example you give, C1, falls into the second case, so we should resort to R's eigen function.
E <- eigen(C1, symmetric = TRUE)
#$values
#[1] 7.000000e+00 1.000000e+00 1.000000e+00 2.975357e-17
#
#$vectors
# [,1] [,2] [,3] [,4]
#[1,] -0.4364358 0.000000e+00 8.164966e-01 -0.3779645
#[2,] -0.4364358 -7.071068e-01 -4.082483e-01 -0.3779645
#[3,] -0.4364358 7.071068e-01 -4.082483e-01 -0.3779645
#[4,] -0.6546537 8.967707e-16 -2.410452e-16 0.7559289
V <- E$vectors
D <- sqrt(E$values) ## root eigen values
Since the numerical rank is 3, we drop the last eigenvalue and eigenvector:
V1 <- V[, 1:3]
D1 <- D[1:3]
Thus the factor you want is:
R <- D1 * t(V1) ## diag(D1) %*% t(V1)
# [,1] [,2] [,3] [,4]
#[1,] -1.1547005 -1.1547005 -1.1547005 -1.732051e+00
#[2,] 0.0000000 -0.7071068 0.7071068 8.967707e-16
#[3,] 0.8164966 -0.4082483 -0.4082483 -2.410452e-16
We can verify that:
crossprod(R) ## t(R) %*% R
# [,1] [,2] [,3] [,4]
#[1,] 2 1 1 2
#[2,] 1 2 1 2
#[3,] 1 1 2 2
#[4,] 2 2 2 3
The R factor above is not the same as the one returned by cholcov, due to the different algorithms used for the eigendecomposition. R uses the LAPACK routine DSYEVR, which returns the eigenvalues in decreasing order. MATLAB's cholcov is not open-source, so I'm not sure what algorithm it uses, but it is easy to demonstrate that it does not arrange the eigenvalues in non-increasing order.
Consider the factor T returned by cholcov:
T <- structure(c(-0.2113, 0.7887, 1.1547, 0.7887, -0.2113, 1.1547,
-0.5774, -0.5774, 1.1547, 0, 0, 1.7321), .Dim = 3:4)
We can get eigen values by
rowSums(T ^ 2)
# [1] 1.000086 1.000086 7.000167
There is some round-off error because T is not given to full precision, but we can see clearly that the eigenvalues are 1, 1, 7. From R, on the other hand, we got 7, 1, 1 (recall D1).
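Putting the two cases together, here is a minimal sketch of a cholcov-like helper (my own assumption about the structure, not MATLAB's actual algorithm; the name cholcov_like and the tolerance tol are arbitrary choices):
cholcov_like <- function(S, tol = 1e-8) {
  E <- eigen(S, symmetric = TRUE)
  if (min(E$values) > tol * max(E$values)) {
    chol(S)                                  # positive definite: ordinary Cholesky factor
  } else {
    keep <- E$values > tol * max(E$values)   # drop numerically zero eigenvalues
    sqrt(E$values[keep]) * t(E$vectors[, keep, drop = FALSE])
  }
}
crossprod(cholcov_like(C1))                  # recovers C1 up to round-off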
How do I solve a non-square linear system A X = B with R
(in the case where the system has no solution or infinitely many solutions)?
Example :
A=matrix(c(0,1,-2,3,5,-3,1,-2,5,-2,-1,1),3,4,T)
B=matrix(c(-17,28,11),3,1,T)
A
[,1] [,2] [,3] [,4]
[1,] 0 1 -2 3
[2,] 5 -3 1 -2
[3,] 5 -2 -1 1
B
[,1]
[1,] -17
[2,] 28
[3,] 11
If the matrix A has more rows than columns, you should use a least-squares fit (a quick sketch follows below).
If the matrix A has fewer rows than columns, you should use a singular value decomposition. Each algorithm does the best it can to give you a solution under its own assumptions.
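For the first case (more rows than columns), base R can do the least-squares fit directly. A minimal sketch, with a made-up tall system (A_tall and b_tall are hypothetical, only for illustration):
A_tall <- matrix(c(1, 1, 1, 1, 2, 3), nrow = 3)   # a 3 x 2 (overdetermined) system
b_tall <- c(1, 2, 2)
qr.solve(A_tall, b_tall)                          # least-squares solution via QR
lm.fit(A_tall, b_tall)$coefficients               # the same fit via lm's workhorse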
Here's a link that shows how to use SVD as a solver:
http://www.ecse.rpi.edu/~qji/CV/svd_review.pdf
Let's apply it to your problem and see if it works:
Your input matrix A and known RHS vector B:
> A=matrix(c(0,1,-2,3,5,-3,1,-2,5,-2,-1,1),3,4,T)
> B=matrix(c(-17,28,11),3,1,T)
> A
[,1] [,2] [,3] [,4]
[1,] 0 1 -2 3
[2,] 5 -3 1 -2
[3,] 5 -2 -1 1
> B
[,1]
[1,] -17
[2,] 28
[3,] 11
Let's decompose your A matrix:
> asvd = svd(A)
> asvd
$d
[1] 8.007081e+00 4.459446e+00 4.022656e-16
$u
[,1] [,2] [,3]
[1,] -0.1295469 -0.8061540 0.5773503
[2,] 0.7629233 0.2908861 0.5773503
[3,] 0.6333764 -0.5152679 -0.5773503
$v
[,1] [,2] [,3]
[1,] 0.87191556 -0.2515803 -0.1764323
[2,] -0.46022634 -0.1453716 -0.4694190
[3,] 0.04853711 0.5423235 0.6394484
[4,] -0.15999723 -0.7883272 0.5827720
> adiag = diag(1/asvd$d)
> adiag
[,1] [,2] [,3]
[1,] 0.1248895 0.0000000 0.00000e+00
[2,] 0.0000000 0.2242431 0.00000e+00
[3,] 0.0000000 0.0000000 2.48592e+15
Here's the key: the third singular value in d is very small; correspondingly, its reciprocal on the diagonal of adiag is huge. Before solving, set that entry to zero:
> adiag[3,3] = 0
> adiag
[,1] [,2] [,3]
[1,] 0.1248895 0.0000000 0
[2,] 0.0000000 0.2242431 0
[3,] 0.0000000 0.0000000 0
Now let's compute the solution (see slide 16 in the link I gave you above):
> solution = asvd$v %*% adiag %*% t(asvd$u) %*% B
> solution
[,1]
[1,] 2.411765
[2,] -2.282353
[3,] 2.152941
[4,] -3.470588
Now that we have a solution, let's substitute it back to see if it gives us the same B:
> check = A %*% solution
> check
[,1]
[1,] -17
[2,] 28
[3,] 11
That's the B side you started with, so I think we're good.
Here's another nice SVD discussion from AMS:
http://www.ams.org/samplings/feature-column/fcarc-svd
Aim is to solve Ax = b
where A is p by q, x is q by 1 and b is p by 1 for x given A and b.
Approach 1: Generalized Inverse: Moore-Penrose
https://en.wikipedia.org/wiki/Generalized_inverse
Multiplying both sides of the equation by the transpose A', we get
A'Ax = A'b
Note that A'A is now a q by q matrix. If A'A is invertible (i.e. A has full column rank), we can multiply both sides by its inverse, which gives
x = (A'A)^{-1} A' b
This is the theory behind the generalized inverse: G = (A'A)^{-1} A' is the pseudo-inverse of A. MASS::ginv computes the Moore-Penrose inverse via SVD, so it also handles the rank-deficient case.
library(MASS)
ginv(A) %*% B
# [,1]
#[1,] 2.411765
#[2,] -2.282353
#[3,] 2.152941
#[4,] -3.470588
Approach 2: Generalized Inverse using SVD.
#duffymo used SVD to obtain a pseudoinverse of A.
However, the last elements of svd(A)$d may not always be as small as in this example, so one probably shouldn't use that method as-is. Here's an example where none of the singular values of A is close to zero.
A <- as.matrix(iris[11:13, -5])
A
# Sepal.Length Sepal.Width Petal.Length Petal.Width
# 11 5.4 3.7 1.5 0.2
# 12 4.8 3.4 1.6 0.2
# 13 4.8 3.0 1.4 0.1
svd(A)$d
# [1] 10.7820526 0.2630862 0.1677126
One option would be to look at the singular values of cor(A):
svd(cor(A))$d
# [1] 2.904194e+00 1.095806e+00 1.876146e-16 1.155796e-17
Now it is clear that only two large singular values are present, so one can apply svd to A and keep only those two components to get a pseudo-inverse, as below.
svda <- svd(A)
G = svda$v[, 1:2] %*% diag(1/svda$d[1:2]) %*% t(svda$u[, 1:2])
# to get x
G %*% B