The population matrix is given below:
If I only want the first two principal components, PC1 and PC2, can I reduce the matrix to the 2x2 matrix below and calculate the eigenvectors from that?
Related
The core of Principal Component Analysis (PCA) is the calculation of eigenvalues and eigenvectors of the variance-covariance matrix of some dataset (for example, a matrix of multivariate data coming from a set of individuals). My textbook knowledge is that:
a) by multiplying the original data matrix by these eigenvectors one can calculate "scores" (as many as the original set of variables) which are uncorrelated with each other
b) the eigenvalues give the amount of variance of each score.
These two properties make this process a very effective data transformation technique to simplify the analysis of multivariate data.
My question is: why is that so? Why does calculating eigenvalues and eigenvectors from a variance-covariance matrix give the scores such unique properties?
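The two textbook properties can at least be checked numerically. The reason they hold is that the eigenvectors V of a symmetric covariance matrix C are orthonormal and satisfy C = V.diag(lambda).V', so the covariance of the scores Z = Xc.V is V'.C.V = diag(lambda): off-diagonal entries are zero (the scores are uncorrelated) and the diagonal entries are the eigenvalues. Below is a minimal sketch in Python with NumPy; the data, the mixing matrix and all variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 500 individuals, 3 correlated variables.
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 0.2]])
X = rng.standard_normal((500, 3)) @ mixing.T

Xc = X - X.mean(axis=0)              # center each variable
C = np.cov(Xc, rowvar=False)         # variance-covariance matrix

# eigh is meant for symmetric matrices; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(C)
order = np.argsort(eigvals)[::-1]    # reorder so PC1 has the largest variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

scores = Xc @ eigvecs                # project the centered data onto the eigenvectors
score_cov = np.cov(scores, rowvar=False)

# Because C = V diag(lambda) V', cov(scores) = V' C V = diag(lambda).
print(np.round(score_cov, 6))
print(np.round(eigvals, 6))
```

The printed covariance matrix of the scores should be diagonal up to round-off, with the eigenvalues on its diagonal. Note that this shows the scores are uncorrelated; they are fully independent only under extra assumptions such as joint Gaussianity.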
1. This is a question about the paper "Fast Generalized Eigenvector Tracking Based on the Power Method".
2. The author wrote: "We generate two zero-mean Gaussian random vectors, which have correlation matrices A and B whose eigenvalues are exponentially distributed".
3. But how does one generate a zero-mean Gaussian random vector whose correlation matrix has exponentially distributed eigenvalues? This has confused me for almost a week.
4. It seems that we can only use randn in MATLAB to generate random vectors, so the problem is how to make sure that the correlation matrices have exponentially distributed eigenvalues at the same time.
Let S be a positive definite matrix. Therefore S has a Cholesky decomposition L.L' = S where L is a lower-triangular matrix and ' denotes the matrix transpose and . denotes matrix multiplication. Let x be drawn from a Gaussian distribution with mean zero and covariance equal to the identity matrix. Then y = L.x has a Gaussian distribution with mean zero and covariance S.
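As a quick illustration of the Cholesky recipe, here is a minimal sketch in Python with NumPy (rng.standard_normal plays the role of MATLAB's randn). The matrix S below is an arbitrary positive definite example chosen for the demonstration, not anything taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical positive definite covariance matrix S
# (symmetric and strictly diagonally dominant with a positive diagonal).
S = np.array([[4.0, 1.2, 0.5],
              [1.2, 2.0, 0.3],
              [0.5, 0.3, 1.0]])

L = np.linalg.cholesky(S)            # lower-triangular factor, L @ L.T == S

n = 200_000
x = rng.standard_normal((3, n))      # each column is a draw from N(0, I)
y = L @ x                            # each column is a draw from N(0, S)

S_hat = np.cov(y)                    # sample covariance (rows are variables)
print(np.round(S_hat, 2))
```

With this many samples the printed sample covariance should be close to S, and the sample mean of y should be close to zero.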
So if you can find suitable covariance matrices A and B, you can use their Cholesky decompositions to generate samples. Now, about constructing a matrix whose eigenvalues follow a given distribution: my advice is to start with a list of samples from an exponential distribution; these will be your eigenvalues. Let E be a matrix with the exponential samples on the diagonal and zeros elsewhere. Let U be any unitary matrix (i.e. columns are orthogonal and the norm of each column is 1). Since exponential samples are positive, U.E.U' is a positive definite matrix with the specified eigenvalues.
U can be any unitary matrix. In particular U can be the identity matrix. That might make everything else simpler; you'll have to verify whether U = identity is workable for the problem you're working on.
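Putting the two steps together, a sketch in Python with NumPy might look as follows. The dimension, the exponential scale of 1.0 and the way U is obtained (QR decomposition of a Gaussian matrix) are all assumptions made for the example; the paper may use a different construction.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5                                # dimension (an assumption)

# Eigenvalues drawn from an exponential distribution (scale 1.0 is an assumption).
eigenvalues = rng.exponential(scale=1.0, size=d)
E = np.diag(eigenvalues)

# One way to get a real orthogonal U: QR decomposition of a Gaussian matrix.
U, _ = np.linalg.qr(rng.standard_normal((d, d)))

A = U @ E @ U.T
A = (A + A.T) / 2                    # remove round-off asymmetry

# Zero-mean Gaussian vectors with covariance A, via the Cholesky factor.
L = np.linalg.cholesky(A)
samples = L @ rng.standard_normal((d, 1000))   # one sample per column

recovered = np.linalg.eigvalsh(A)    # eigenvalues of A, ascending
print(np.sort(eigenvalues))
print(recovered)
```

The two printed lists should agree up to round-off. Replacing U with np.eye(d) gives the U = identity case, where A is simply the diagonal matrix E.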
Given an input matrix and a correlation Rho, I want to generate a random matrix that is correlated to the input matrix with a correlation value of Rho.
I can create random matrices through rnorm, but I'm not sure how to force this new matrix to be correlated to the original input matrix.
I looked through some other posts such as this but couldn't find what I was looking for. For example, the post "Generating random correlation matrix with given average correlation" generates a random matrix that is correlated with itself, not with an input matrix.
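One possible reading of the goal is that the Pearson correlation between corresponding entries of the two matrices should equal Rho. Under that assumption, a standard trick is to mix the (standardized) input with random noise that has been made orthogonal to it. Below is a minimal sketch in Python with NumPy, where rng.standard_normal stands in for rnorm; the function name correlated_matrix is hypothetical and the entrywise interpretation is an assumption.

```python
import numpy as np

def correlated_matrix(X, rho, rng):
    """Return a random matrix whose entries have sample correlation rho with X."""
    x = X.ravel().astype(float)
    u = x - x.mean()
    u /= np.linalg.norm(u)               # centered, unit-length copy of X

    z = rng.standard_normal(x.size)      # fresh random noise
    z -= z.mean()
    z -= (z @ u) * u                     # drop the part of the noise aligned with X
    v = z / np.linalg.norm(z)            # centered, unit length, orthogonal to u

    y = rho * u + np.sqrt(1.0 - rho**2) * v
    # Rescale to the spread and mean of X (a choice, not a requirement).
    y = y * np.linalg.norm(x - x.mean()) + x.mean()
    return y.reshape(X.shape)

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 10))        # stand-in for the input matrix
Y = correlated_matrix(X, rho=0.6, rng=rng)
r = np.corrcoef(X.ravel(), Y.ravel())[0, 1]
print(r)
```

Because the noise is orthogonalized against the input, the sample correlation matches Rho up to round-off rather than only on average. If column-by-column correlation is what is needed instead, the same construction could be applied to each column separately.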
I have a really big similarity matrix with 444 columns. I want to plot a heatmap or corrplot to compare different similarity matrices, but I can't use all the columns. I want to take a random sample of columns and then plot a heatmap, but I don't want to compute the similarities again for these columns, as that takes a lot of time for some of the similarity functions I have. Any ideas how I could take a random sample of columns from a similarity matrix (it has the same structure as a correlation matrix) to plot a heatmap for them?
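Since a similarity matrix is indexed by the same items along rows and columns, one way to avoid recomputation is to sample item indices once and select those rows and columns from the existing matrix. A minimal sketch in Python with NumPy follows; the matrix here is random filler standing in for the precomputed one, and the sample size of 30 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in for a precomputed 444 x 444 similarity matrix (symmetric, unit diagonal).
n = 444
M = rng.random((n, n))
S = (M + M.T) / 2
np.fill_diagonal(S, 1.0)

k = 30                                             # number of items to keep
idx = np.sort(rng.choice(n, size=k, replace=False))

# Index rows and columns with the same set so the result is still a similarity matrix.
S_sub = S[np.ix_(idx, idx)]
print(S_sub.shape)
```

To compare several similarity matrices, the same idx would be reused for each of them so that the heatmaps show the same items. The sub-matrix can then be passed to any heatmap routine.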
I've got a correlation matrix (say 3x3) and I'd like to extract the pairwise correlations and put them into a vector. That is, I'd like to go from the correlation matrix to:
corVec = c(rho_12, rho_13, rho_23)
I'd like to be able to do this for correlation matrices of any dimension.
The reason I'm doing this is because I'd like to construct a multivariate (elliptical) copula using the copula package with a random correlation matrix.
Thanks!
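If the correlation matrix is rho then you can extract the pairwise correlations with the line below. The values come out in column-major order, which matches rho_12, rho_13, rho_23 for a 3x3 matrix but differs from row-by-row order for larger matrices: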
rho[upper.tri(rho)]
Suppose you have a data.frame df1 with 3 columns.
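rho <- cor(df1) would make a 3x3 matrix.
To make a pairwise correlation "list" (data.frame):
library(reshape2)                              # provides melt()
rho[!upper.tri(rho)] <- NA                     # blank out the diagonal and lower triangle
rho <- na.omit(melt(rho, value.name = "cor"))  # long format, one row per variable pair
rho <- rho[order(-rho$cor), ]                  # sort by correlation, largest first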