covariance matrix - r

I was wondering if anyone could explain how the geoR package calculates the covariance function, i.e. how you would do it by hand?
library(geoR)
#suppose I have the following coordinates
X = c(60,30,20,40)
Y = c(50,20,50,50)
my_coordinates = cbind(X,Y)
print(my_coordinates)
#computing covariance
my_cov= varcov.spatial(my_coordinates,cov.model="exp", cov.pars=c(0.2,25))
print(my_cov)
And you get:
[,1] [,2] [,3] [,4]
[1,] 0.20000000 0.03664442 0.04037930 0.08986579
[2,] 0.03664442 0.20000000 0.05645288 0.05645288
[3,] 0.04037930 0.05645288 0.20000000 0.08986579
[4,] 0.08986579 0.05645288 0.08986579 0.20000000
However, one might want to do it in Matlab as well.

The best way to find out how a package or function does something is to look at the source code; this is one of the awesome things about open-source projects, you can do exactly that.
Try typing varcov.spatial at the R prompt, or search through the unpacked package tarball for the function definition.
To calculate the covariance (which depends on the distance between points), you need to:
1. calculate the distance between your points (you really only need the lower triangle, as the matrix is symmetric);
2. evaluate the covariance function at each of those distances;
3. form the full symmetric variance-covariance matrix from the calculated covariances.
The covariance functions are defined in ?cov.spatial. You can call cov.spatial to calculate them in R, which is exactly what geoR::varcov.spatial does.
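For the exponential model with cov.pars = c(sigmasq, phi), the covariance at distance d is sigmasq * exp(-d / phi). A minimal by-hand sketch (assuming Euclidean distances and no nugget) that reproduces the matrix above:
d <- as.matrix(dist(my_coordinates))    # Euclidean distances between the four points
sigmasq <- 0.2
phi <- 25
cov_by_hand <- sigmasq * exp(-d / phi)  # exponential covariance function
cov_by_hand                             # matches the varcov.spatial() output above
The same recipe carries over to Matlab: build the pairwise Euclidean distance matrix and apply sigmasq * exp(-d / phi) elementwise.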

Related

R principal component analysis interpreting the princomp() and eigen() functions for a non-square matrix

I'm trying to learn about and implement principal component analysis and study in particular how it relates to eigenvectors and eigenvalues and other things from linear algebra. Cross Validated has been helpful but I do have questions I haven't seen an answer for so far.
I've read online that eigenvalues and eigenvectors are for square matrices, and that singular value decomposition is like an extension of that for non-square matrices. Here is what I find on Google when I search the question:
Note. Eigenvalues and eigenvectors are only for square matrices. Eigenvectors are by definition nonzero. Eigenvalues may be equal to zero.
But if I take, for example, a selection from the mtcars dataset, keeping only the first six variables (columns) but all of the observations, and then ask about the dimensions of this new dataset, I see that I have an m x n matrix that is 32 x 6.
library(dplyr)  # for the %>% pipe
mtcars_selection <- mtcars %>%
  dplyr::select(mpg:wt)
nrow(mtcars_selection)    # 32 observations (rows)
length(mtcars_selection)  # 6 variables (columns)
Now turning to principal component analysis, when I run these lines of code:
prcomp_attempt = stats::prcomp(mtcars_selection, scale = FALSE)
summary(prcomp_attempt)
I get the following as part of the output.
PC1 PC2 PC3 PC4 PC5 PC6
Standard deviation 136.5265 38.11828 3.04062 0.67678 0.36761 0.3076
Proportion of Variance 0.9272 0.07228 0.00046 0.00002 0.00001 0.0000
Cumulative Proportion 0.9272 0.99951 0.99997 0.99999 1.00000 1.0000
Similarly, when I change prcomp() to princomp() I get a similar output.
princomp_attempt = stats::princomp(mtcars_selection, scale = FALSE)
summary(princomp_attempt)
Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6
Standard deviation 134.3763479 37.51795949 2.9927313123 0.66612093849 0.361823645756 0.302784248968
Proportion of Variance 0.9272258 0.07228002 0.0004599126 0.00002278484 0.000006722546 0.000004707674
Cumulative Proportion 0.9272258 0.99950587 0.9999657849 0.99998856978 0.999995292326 1.000000000000
From ?prcomp() I see that the computation is done using singular value decomposition.
The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy.
And from ?princomp() I see:
The calculation is done using eigen on the correlation or covariance matrix, as determined by cor. This is done for compatibility with the S-PLUS result.
But doesn't this all mean that one of the code chunks above should work and one of them should not work? In particular, how did princomp() work if the matrix that went into the princomp() function is a non-square matrix?
Now when I take a look at the eigen() function on the covariance matrix, which is non-square, I get an output that looks like it only printed the first six rows.
eigen(cov(mtcars_selection))
I see in this particular output
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 0.038119604 -0.009201962 0.99282561 -0.057597550 0.094821246 0.021236444
[2,] -0.012035481 0.003373585 -0.06561936 -0.965667568 0.050583003 0.245897079
[3,] -0.899622021 -0.435427435 0.03153733 0.006753430 -0.006294596 -0.001825989
[4,] -0.434782990 0.900148911 0.02503332 0.006406853 0.004534789 -0.002722171
[5,] 0.002660336 0.003898813 0.03993024 0.187172744 -0.494914521 0.847590278
[6,] -0.006239859 -0.004861028 -0.08231475 0.170435844 0.862235306 0.469748461
In the eigen() and princomp() functions, is the data being conformed to a square matrix by slicing off the rows that exceed the number of columns, so that m = n?
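For reference, the SVD route described in ?prcomp and the eigen-on-covariance route described in ?princomp can be related directly. A minimal sketch using the selection above (the names Xc, sv and ev are illustrative):
# centered 32 x 6 data matrix
Xc <- scale(as.matrix(mtcars_selection), center = TRUE, scale = FALSE)
# route documented for prcomp(): SVD of the data matrix itself
sv <- svd(Xc)
# route documented for princomp(): eigen on the covariance matrix,
# which is 6 x 6 (square) because cov() of an n x p matrix is p x p
ev <- eigen(cov(Xc))
# right singular vectors = eigenvectors (up to sign),
# squared singular values / (n - 1) = eigenvalues
all.equal(abs(unname(sv$v)), abs(unname(ev$vectors)))
all.equal(sv$d^2 / (nrow(Xc) - 1), ev$values)
The small difference between the two sets of standard deviations printed above (136.53 vs. 134.38 for the first component, and so on) comes from princomp() using the divisor n rather than n - 1, as noted in its help page.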

compute gram matrix in R

I need some help to compute an n x n Gram matrix K for a given kernel. Here is my R code that generates simulated data. The matrix I below could be any positive definite matrix; I take the identity for simplicity.
set.seed(3)
n=20
x=runif(n)
y=rnorm(n)
df<-cbind(x,y)
I=diag(2)
kernel<-function(x,y) {
t(x)%*%I%*%y
}
# for example
#K[1,1]
t(df[1,])%*%I%*%df[1,]
[,1]
[1,] 0.5829376
#K[1,2]
t(df[1,])%*%I%*%df[2,]
[,1]
[1,] 0.978207
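One direct way to fill the whole n x n Gram matrix with the kernel and data defined above is the following sketch (K and K2 are illustrative names):
K <- matrix(NA_real_, n, n)
for (i in 1:n) {
  for (j in 1:n) {
    K[i, j] <- kernel(df[i, ], df[j, ])   # K[i, j] = t(df[i, ]) %*% I %*% df[j, ]
  }
}
# for this linear kernel the whole matrix is also a single matrix product
K2 <- df %*% I %*% t(df)
all.equal(K, K2)    # TRUE
K[1, 1]; K[1, 2]    # 0.5829376 and 0.978207, matching the entries above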
An example in the case of a linear regression model, for a dataset called data with a response Y and predictors X1 and X2:
#The regression model
model=lm(Y~X1+X2, data)
#Estimating residuals
r=model$res
#Estimating hat values
h=hatvalues(model)
#Computing the Gram matrix
d=r/(1-h)
#Estimating the Gram determinant (which summarizes the information in the Gram matrix)
press=t(d)%*%d
round(press,2)
I hope, despite the delay, this may be useful for someone. Best regards.
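A runnable version of that recipe, with the built-in mtcars data standing in for the hypothetical data, and mpg ~ wt + hp standing in for Y ~ X1 + X2 (illustrative choices, not from the original post):
model <- lm(mpg ~ wt + hp, data = mtcars)
r <- residuals(model)
h <- hatvalues(model)
d <- r / (1 - h)                 # leave-one-out (PRESS) residuals
press <- as.numeric(t(d) %*% d)  # equivalently sum(d^2)
round(press, 2)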

Sign of eigenvectors change depending on specification of the symmetric argument for symmetric matrices

The signs of the eigenvectors in the eigen function change depending on the specification of the symmetric argument. Consider the following example:
set.seed(1234)
data <- matrix(rnorm(200),nrow=100)
cov.matrix <- cov(data)
vectors.1 <- eigen(cov.matrix,symmetric=TRUE)$vectors
vectors.2 <- eigen(cov.matrix,symmetric=FALSE)$vectors
#Some of the eigenvectors have opposite sign
all(vectors.1 == vectors.2)
FALSE
This also has implications for principal component analysis as the princomp function appears to calculate the eigenvectors for the covariance matrix using the eigen function with symmetric set to TRUE.
pca <- princomp(data)
#princomp uses vectors.1
pca$loadings
Loadings:
Comp.1 Comp.2
[1,] -0.366 -0.931
[2,] 0.931 -0.366
Comp.1 Comp.2
SS loadings 1.0 1.0
Proportion Var 0.5 0.5
Cumulative Var 0.5 1.0
vectors.1
[,1] [,2]
[1,] -0.3659208 -0.9306460
[2,] 0.9306460 -0.3659208
Can someone please explain the source or reasoning behind the discrepancy?
Eigenvectors remain eigenvectors after multiplication by a nonzero scalar (including -1).
The proof is simple:
If v is an eigenvector of matrix A with matching eigenvalue c, then by definition Av=cv.
Then, A(-v) = -(Av) = -(cv) = c(-v). So -v is also an eigenvector with the same eigenvalue.
The bottom line is that this does not matter and does not change anything.
If you want a consistent sign for the eigenvector elements, simply ensure $\mathbf{1}^T\mathbf{e}>0$. In other words, sum all the elements in each eigenvector, and if the sum is not positive, flip the sign of every element. This is the trick to get the sign of eigenvector elements, principal components, and loadings in PCA to come out the same as in most statistical software.
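A minimal sketch of that convention applied to the example above (fix_signs is an illustrative helper, not from any package):
fix_signs <- function(V) {
  flip <- colSums(V) < 0    # columns whose elements sum to a negative value
  V[, flip] <- -V[, flip]   # flip their sign
  V
}
all.equal(fix_signs(vectors.1), fix_signs(vectors.2))   # TRUE up to numerical precision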
Linear algebra libraries like LAPACK contain multiple subroutines for carrying out operations like eigendecompositions. The particular subroutine used in any given case may depend on the type of matrix being decomposed, and the pieces of that decomposition needed by the user.
As you can see in this snippet from eigen's code, it dispatches different LAPACK subroutines depending on whether symmetric=TRUE or symmetric=FALSE (and also, on whether the matrix is real or complex).
if (symmetric) {
    z <- if (!complex.x)
        .Internal(La_rs(x, only.values))
    else .Internal(La_rs_cmplx(x, only.values))
    ord <- rev(seq_along(z$values))
}
else {
    z <- if (!complex.x)
        .Internal(La_rg(x, only.values))
    else .Internal(La_rg_cmplx(x, only.values))
    ord <- sort.list(Mod(z$values), decreasing = TRUE)
}
Based on pointers in ?eigen, La_rs() (used when symmetric=TRUE) appears to refer to dsyevr while La_rg() refers to dgeev.
To learn exactly why those two algorithms switch some of the signs of the eigenvectors of the matrix you've handed to eigen(), you'd have to dig into the FORTRAN code used to implement them. (Since, as others have noted, the sign is irrelevant, I'm guessing you won't want to dig quite that deep ;).

normalizing matrices in R

How do I normalize/scale matrices in R by column? For example, when I compute eigenvectors of a matrix, R returns:
> eigen(matrix(c(2,-2,-2,5),2,2))$vectors
[,1] [,2]
[1,] -0.4472136 -0.8944272
[2,] 0.8944272 -0.4472136
# should be normalized to
[,1] [,2]
[1,] -1 -2
[2,] 2 -1
The function "scale" subtracts the column means and divides by the column standard deviations, which does not help in this case. How do I achieve this?
This produces the matrix you say you want:
> a <- eigen(matrix(c(2,-2,-2,5),2,2))$vectors
> a / min(abs(a))
[,1] [,2]
[1,] -1 -2
[2,] 2 -1
But I'm not sure I understand exactly what you want, so this may not do the right thing in general.
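A column-wise variant of the same idea, rescaling each eigenvector by its own smallest nonzero absolute entry, may be safer when the columns don't happen to share the same entries (a sketch reusing a from above):
# scale each column by its own smallest nonzero absolute entry
apply(a, 2, function(v) v / min(abs(v[v != 0])))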
Wolfram Alpha gives the following result:
http://www.wolframalpha.com/input/?i=eigenvalues{{2,-2},{-2,5}}
[Wolfram Alpha output (images): the input matrix {{2,-2},{-2,5}}, its eigenvalues 6 and 1, and integer-valued eigenvectors proportional to (-1, 2) and (-2, -1)]
I'm not sure what you're talking about with means and standard deviations. A good iterative method like QR should get you the eigenvalues and eigenvectors you need. Check out Jacobi or Householder.
You normalize any vector by dividing every component by the square root of the sum of squares of its components. A unit vector will have magnitude equal to one.
In your case this is true: the vectors being presented by R have been normalized. If you normalize the two Wolfram eigenvectors, you'll see that both have a magnitude equal to the square root of 5. Divide each column vector by this value and you'll get the ones given to you by R. Both are correct.
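To check this in R (a minimal sketch; the integer vectors are the ones the question asks for):
a <- eigen(matrix(c(2, -2, -2, 5), 2, 2))$vectors
sqrt(colSums(a^2))                     # each column of R's result is a unit vector: 1 1
cbind(c(-1, 2), c(-2, -1)) / sqrt(5)   # dividing the integer vectors by sqrt(5) recovers R's columns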

(all the) directions perpendicular to hyperplane through p data points

I have a simple question:
given p points (non-collinear) in R^p, I find the hyperplane passing through these points (to help clarify, I type everything in R):
p<-2
x<-matrix(rnorm(p^2),p,p)
b<-solve(crossprod(cbind(1,x[,-2])))%*%crossprod(cbind(1,x[,-2]),x[,2])
then, given a (p+1)-th point not collinear with the first p points, I find the direction perpendicular to b:
x2<-matrix(rnorm(p),p,1)
b2<-solve(c(-b[-1],1)%*%t(c(-b[-1],1))+x2%*%t(x2))%*%x2
That is, b2 defines a (p-1)-dimensional hyperplane perpendicular to b and passing through x2.
Now, my questions are:
The formula comes from my interpretation of this wikipedia entry ("solve(A)" is the R command for A^-1). Why doesn't this work for p > 2? What am I doing wrong?
PS: I have seen this post (on stackoverflow; sorry, I cannot post more than one link) but somehow it doesn't help me.
Thanks in advance,
I have a problem implementing/understanding Victor Liu's solution when p > 2:
shouldn't the dot product between the vectors from the QR decomposition of the swept matrix and the direction of the hyperplane be 0 (i.e., shouldn't the QR vectors be perpendicular to the hyperplane)?
That is, when p = 2 this
c(-b[2:p],1)%*%c(a1)
gives 0. When p>2 it does not.
Here is my attempt to implement Victor Liu's solution.
a) given p linearly independent observations in R^p:
p<-2;x<-matrix(rnorm(p^2),p,p);x
[,1] [,2]
[1,] -0.4634923 -0.2978151
[2,] 1.0284040 -0.3165424
b) stack them in a matrix and subtract the first row:
a0<-sweep(x,2,x[1,],FUN="-");a0
[,1] [,2]
[1,] 0.000000 0.00000000
[2,] 1.491896 -0.01872726
c) perform a QR decomposition of the matrix a0. The vector in the nullspace is the direction I'm looking for:
qr(a0)
[,1] [,2]
[1,] -1.491896 0.01872726
[2,] 1.000000 0.00000000
Indeed, this direction is the same as the one given by applying the formula from wikipedia (using x2 = (0.4965321, 0.6373157)):
[,1]
[1,] 2.04694853
[2,] -0.02569464
...with the advantage that it works in higher dimensions.
I have one last question: what is the meaning of the other p-1 QR vectors (i.e. (1,0) here) when p > 2?
-thanks in advance,
A p-1 dimensional hyperplane is defined by a normal vector and a point that the plane passes through:
n.(x-x0) = 0
where n is the normal vector of length p, x0 is a point through which the hyperplane passes, . is a dot product, and the equation must be satisfied for any point x on the plane. We can also write this as
n.x = c
where c = n.x0 is just a number. This is a more compact representation of a hyperplane, parameterized by (n, c). To find your hyperplane, suppose your points are x1, ..., xp.
Form a matrix A with p-1 rows and p columns as follows: the rows of A are xi - x1, laid out as row vectors, for all i > 1 (there are only p-1 of them). If your p points are not "collinear" as you say (they need to be affinely independent), then A will have rank p-1 and a nullspace of dimension 1. The one vector in the nullspace is the normal vector of the hyperplane. Once you find it (call it n), then c = n.x1. To find the nullspace of a matrix, you can use a QR decomposition (see here for details).
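A minimal sketch of that recipe for general p (hyperplane_normal is an illustrative helper; it assumes the p points are stored as the rows of x and are affinely independent):
# normal to the hyperplane through the p rows of x (p points in R^p),
# found as the 1-dimensional nullspace of A, whose rows are x_i - x_1
hyperplane_normal <- function(x) {
  p <- ncol(x)
  A <- sweep(x[-1, , drop = FALSE], 2, x[1, ], FUN = "-")  # (p-1) x p
  # complete QR of t(A): the last column of Q is orthogonal to every row of A
  qr.Q(qr(t(A)), complete = TRUE)[, p]
}

# example with p = 3
set.seed(1)
p <- 3
x <- matrix(rnorm(p^2), p, p)
n <- hyperplane_normal(x)
sweep(x[-1, , drop = FALSE], 2, x[1, ], FUN = "-") %*% n   # all ~0: n is perpendicular
const <- sum(n * x[1, ])                                   # the hyperplane is n . y = const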
