Replace 0 value in covariance matrix (pmvnorm) - r

I'm using pmvnorm inside a for loop, because the elements of the covariance matrix can change with the values of some parameters:
y<-c(0,0,0,0,0,0,0,0,0,0)
....
library(mvtnorm)
mu=c(18,12.72,(18*(c-d)+12.72*f))
covariance=matrix(c(5.7,0,5.7*(c-d),0,30.38,30.38*f,5.7*(c-d),30.38*f,(5.7*(c-d)^2+30.38*f^2)),3)
H=c(15,-Inf,-Inf)
L=c(Inf,15,g)
for(i in 1:10)
y[i]=pmvnorm(mean=mu,sigma=covariance,lower=H,upper=L)
where c, d, f, etc. were already defined.
It works, but in some cases the third random variable has zero variance and I get an error. Is it possible to replace the 0 values in the covariance matrix with a very small value (such as 1e-06)?
Thank you

If you just want to replace 0s with a very small value (such as 1e-06):
covariance = matrix(c(0,2,3,0), ncol = 2)
covariance[covariance == 0] <- 1e-06
covariance
If this doesn't help, Pascal is right: some details about your parameters would be helpful for looking into the positive-definiteness problem of your covariance matrices.
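Note that the snippet above also overwrites zero covariances off the diagonal. Here is a minimal sketch, assuming that only the zero variances on the diagonal should be nudged. The helper name jitter_zero_variances and the example matrix are made up for illustration. It then checks the eigenvalues to see whether the result is positive definite:

```r
# Hypothetical helper: replace only zero variances (diagonal entries),
# leaving zero covariances (off-diagonal entries) untouched.
jitter_zero_variances <- function(sigma, eps = 1e-06) {
  d <- diag(sigma)
  d[d == 0] <- eps
  diag(sigma) <- d
  sigma
}

# Example matrix (made up): the third variable has zero variance.
covariance <- matrix(c(5.7, 0,     0,
                       0,   30.38, 0,
                       0,   0,     0), nrow = 3, byrow = TRUE)

fixed <- jitter_zero_variances(covariance)

# A covariance matrix is positive definite when it is symmetric
# and all of its eigenvalues are strictly positive.
ev <- eigen(fixed, symmetric = TRUE, only.values = TRUE)$values
is_pd <- isSymmetric(fixed) && all(ev > 0)
is_pd
```

Whether such a nudged matrix is a sensible input for pmvnorm still depends on the actual parameter values, so treat this as a starting point rather than a fix.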

Related

Computing eigenvectors given shrinkage eigenvalues

I used the function linshrink of the nlshrink package to have a shrinkage estimation of the eigenvalues of a symmetric matrix M. Unfortunately the function does not return the eigenvectors, which I also need. How can I manually compute them? I thought about applying the definition and use (M − λI)x = 0 for every eigenvalue λ, but I'm not sure how to properly do it, since computing the matrix A = M − λI and using it as an input in solve(A,b) with b=rep(0,nrow(M)) obviously returns a vector of zero. Can anybody help me? Here are a few lines to provide a working example:
library(nlshrink)
M <- matrix(1:16,4)
M[lower.tri(M)] = t(M)[lower.tri(M)]
M <- M/16.1
shrinkval <- linshrink(M) #eigenvalues
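A possible sketch in base R, assuming the shrunk eigenvalues are meant to be paired with the eigenvectors of M itself (that is, the shrinkage changes only the eigenvalues). It shows eigen(), which returns the eigenvectors directly, and the manual route through (M − λI)x = 0, where a non-zero solution is taken from the null space instead of from solve():

```r
# Same example matrix as in the question.
M <- matrix(1:16, 4)
M[lower.tri(M)] <- t(M)[lower.tri(M)]
M <- M / 16.1

# eigen() returns eigenvalues and eigenvectors together;
# column j of $vectors belongs to $values[j].
e <- eigen(M, symmetric = TRUE)

# Manual route for one eigenvalue: (M - lambda * I) x = 0 always has the
# trivial solution x = 0, which is not useful. A non-zero solution is a
# null-space vector of A, e.g. the right singular vector that belongs to
# the smallest singular value (the last column of svd(A)$v).
lambda <- e$values[1]
A <- M - lambda * diag(nrow(M))
x <- svd(A)$v[, ncol(A)]

# Check the definition M x = lambda x (the difference should be near zero).
max(abs(M %*% x - lambda * x))
```

The vector x is only determined up to sign and scale, so it may differ from e$vectors[, 1] by a factor of -1.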

Invert singular matrices in R

I am trying to grasp the basic concept of invertible and non-invertible matrices.
I created a random non-singular square matrix
S <- matrix(rnorm(100, 0, 1), ncol = 10, nrow = 10)
I know that this matrix is positive definite (thus invertible) because when I decompose the matrix S into its eigenvalues, their product is positive.
eig_S <- eigen(S)
eig_S$values
[1] 3.0883683+0.000000i -2.0577317+1.558181i -2.0577317-1.558181i 1.6884120+1.353997i 1.6884120-1.353997i
[6] -2.1295086+0.000000i 0.1805059+1.942696i 0.1805059-1.942696i -0.8874465+0.000000i 0.8528495+0.000000i
solve(S)
According to this paper, we can compute the inverse of a non-singular matrix by its SVD too.
Where $S = U D V^{T}$
(where U and V are eigenvectors and D eigenvalues, please do correct me if I am wrong).
The inverse then is $S^{-1} = V D^{-1} U^{T}$.
Indeed, I can run the formula in R:
s <- svd(S)
s$v%*%solve(diag(s$d))%*%t(s$u)
Which produces exactly the same result as solve(S).
My first question is:
1) Does s$d indeed represent the eigenvalues of S? I ask because s$d and eig_S$values are quite different.
Now the second part,
If I create a singular matrix
I <- matrix(rnorm(100, 0, 1), ncol = 5, nrow = 20)
I <- I%*%t(I)
eig_I <- eigen(I)
eig_I$values
[1] 3.750029e+01 2.489995e+01 1.554184e+01 1.120580e+01 8.674039e+00 3.082593e-15 5.529794e-16 3.227684e-16
[9] 2.834454e-16 5.876634e-17 -1.139421e-18 -2.304783e-17 -6.636508e-17 -7.309336e-17 -1.744084e-16 -2.561197e-16
[17] -3.075499e-16 -4.150320e-16 -7.164553e-16 -3.727682e-15
The solve function will produce an error
solve(I)
system is computationally singular: reciprocal condition number =
1.61045e-19
So, again according to the same paper we can use the SVD
i <- svd(I)
solve(i$u %*% diag(i$d) %*% t(i$v))
which produces the same error.
Then I tried to use the Cholesky decomposition for matrix inversion
Conj(t(I))%*%solve(I%*%Conj(t(I)))
and again I get the same error.
Could someone please explain where I am using the equations wrong?
I know that for the matrix I%*%Conj(t(I)) the determinant of the eigenvalue matrix is positive, but the matrix is not of full rank because of the initial multiplication that I did.
j <- eigen(I%*%Conj(t(I)))
det(diag(j$values))
[1] 3.17708e-196
qr(I %*% Conj(t(I)))$rank
[1] 5
UPDATE 1: Following the comments below, and after going through the paper/Wikipedia page again, I used these two pieces of code. They produce some results, but I am not sure about their validity. The first example seems more believable. The SVD solution
i$v%*%diag(1/i$d)%*%t(i$u)
and the Cholesky
Conj(t(I))%*%(I%*%Conj(t(I)))^(-1)
I am not sure if I interpreted the two sources correctly though.
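A sketch of the SVD route for a singular matrix, assuming that what is wanted is the Moore-Penrose pseudo-inverse (a singular matrix has no ordinary inverse). Singular values below a tolerance are treated as zero and dropped. The helper name pinv_svd, the tolerance, and the matrix names A and B are choices made for this illustration. Note also that ^(-1) applied to a matrix in R works element by element, so it does not compute a matrix inverse.

```r
# Hypothetical helper: Moore-Penrose pseudo-inverse via the SVD.
pinv_svd <- function(A, tol = sqrt(.Machine$double.eps)) {
  s <- svd(A)
  keep <- s$d > tol * max(s$d)      # drop (numerically) zero singular values
  # V_k %*% diag(1 / d_k) %*% t(U_k), using only the retained components;
  # dividing the k-row matrix t(U_k) by d_k scales row i by 1 / d_k[i].
  s$v[, keep, drop = FALSE] %*%
    (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

set.seed(1)
B <- matrix(rnorm(100), nrow = 20, ncol = 5)
A <- B %*% t(B)                     # 20 x 20, but only of rank 5

A_plus <- pinv_svd(A)

# Numerical rank: number of singular values above the tolerance.
d <- svd(A)$d
num_rank <- sum(d > sqrt(.Machine$double.eps) * max(d))
num_rank

# Defining property of the pseudo-inverse: A %*% A_plus %*% A equals A.
max(abs(A %*% A_plus %*% A - A))
```

Inverting every entry of the singular values with diag(1/d), without a cutoff, divides by values near 1e-16 and amplifies rounding noise, which is why the cutoff is used here.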

Inverse of matrix and numerical integration in R

In R I am trying to
1) get a general form of the inverse of a matrix (I mean a matrix with parameters instead of specific numbers),
2) then use this to compute an integral.
I mean, I have a matrix P with a parameter theta. I need to add and subtract something, take the inverse of the result, and multiply it by a vector, which gives me a vector pil. I then take the vector pil term by term, multiply each term by a function that again depends on theta, and integrate the result from 0 to infinity.
I tried the code below, but it didn't work: I know the result should be pst =
(0.3021034 0.0645126 0.6333840)
c<-0.1
g<-0.15
integrand1 <- function(theta) {
  pil1 <- function(theta) {
    P <- matrix(c(
      1-exp(-theta), 1-exp(-theta), 1-exp(-theta), exp(-theta), 0, 0, 0, exp(-theta), exp(-theta)
    ), 3, 3)
    pil <- (rep(1,3)) %*% solve(diag(1,3) - P + matrix(1,3,3))
    return(pil[[1]])
  }
  q <- pil1(theta) * (c^g/gamma(g) * theta^(g-1) * exp(-c*theta))
  return(q)
}
(pst1<-integrate(integrand1, lower = 0, upper = Inf)$value)
#0.4144018
This is only the first term of the vector pst, because I didn't know how to write a for loop for this.
Please, do you have any idea why it doesn't work and how to make it work?
Functions used in integrate should be vectorized as stated in the help.
At the end of your code add this
integrand2 <- Vectorize(integrand1)
integrate(integrand2, lower = 0, upper = Inf)$value
#[1] 0.3021034
The result is the first element of your expected result.
You will have to present more information about the input to get your expected vector.
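One way the remaining elements might be obtained is sketched below, under the assumption that element k of pst is the integral of element k of pil times the same weight function. The names pil, integrand_k and cc (the question's c, renamed so it does not mask base::c) are choices made for this sketch. The third element is taken as the remainder, because its integrand behaves like theta^(g-1) near 0, which can be harder for integrate().

```r
cc <- 0.1   # called c in the question; renamed to avoid masking base::c
g  <- 0.15

# Full vector pil for a single theta (same formula as in the question).
pil <- function(theta) {
  e <- exp(-theta)
  P <- matrix(c(1 - e, 1 - e, 1 - e,
                e,     0,     0,
                0,     e,     e), 3, 3)
  drop(rep(1, 3) %*% solve(diag(1, 3) - P + matrix(1, 3, 3)))
}

# Integrand for element k, vectorized over theta as integrate() requires.
integrand_k <- function(theta, k) {
  vapply(theta, function(th) pil(th)[k], numeric(1)) *
    (cc^g / gamma(g) * theta^(g - 1) * exp(-cc * theta))
}

pst <- numeric(3)
for (k in 1:2) {
  pst[k] <- integrate(integrand_k, lower = 0, upper = Inf, k = k)$value
}
# The elements of pil sum to 1 and the weight function is a gamma density
# (it integrates to 1), so the last element is obtained as the remainder.
pst[3] <- 1 - sum(pst[1:2])
pst
```

The values in pst can then be compared with the expected vector given in the question.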

How to create a matrix with probability distribution in R

I want to create a matrix in R with elements [-1,0,1] drawn with probabilities [1/6, 2/3, 1/6] respectively. The probabilities may change during runtime. For static probabilities I already get the output; the problem is the dynamic change in the probabilities.
For example, if I create a matrix for the above probabilities with [sqrt(3),0,-sqrt(3)], the required output is.
Note: the probabilities should not be static as mentioned. They may vary during runtime.
Kindly help me solve this.
Supposing you want a 2x3 matrix:
matrix(sample(c(-1,0,1), size=6, replace=TRUE, prob=c(1/6,2/3,1/6)), nrow=2)
So you sample from the values you want, with probabilities defined in prob. This is just a vector, but you can make it into a matrix of the desired shape using matrix afterwards. Replace the probabilities by a variable instead of values to not make it static.
If the numbers should be distributed according to a certain scheme rather than randomly drawn according to a probability, replicate the vector elements and shuffle them:
matrix(sample(rep(c(-1,0,1), times=c(1,4,1))), nrow=2)
You can try this to generate a mxn matrix:
sample.dynamic.matrix <- function(pop.symbols, probs, m, n) {
  samples <- sample(pop.symbols, m*n, prob = probs, replace=TRUE)
  return(matrix(samples, nrow=m))
}
set.seed(123)
sample.dynamic.matrix(-1:1, c(1/6,2/3,1/6), 2, 3)
#     [,1] [,2] [,3]
#[1,]    0    0   -1
#[2,]    1   -1    0

Impossible to create correlated variables from this correlation matrix?

I would like to generate correlated variables specified by a correlation matrix.
First I generate the correlation matrix:
require(psych)
require(Matrix)
cor.table <- matrix( sample( c(0.9,-0.9) , 2500 , prob = c( 0.8 , 0.2 ) , repl = TRUE ) , 50 , 50 )
k=1
while (k<=length(cor.table[1,])){
cor.table[1,k]<-0.55
k=k+1
}
k=1
while (k<=length(cor.table[,1])){
cor.table[k,1]<-0.55
k=k+1
}
ind<-lower.tri(cor.table)
cor.table[ind]<-t(cor.table)[ind]
diag(cor.table) <- 1
This correlation matrix is not consistent; therefore, eigenvalue decomposition is impossible.
To make it consistent I use nearPD:
c<-nearPD(cor.table)
Once this is done I generate the correlated variables:
fit<-principal(c, nfactors=50,rotate="none")
fit$loadings
loadings<-matrix(fit$loadings[1:50, 1:50],nrow=50,ncol=50,byrow=F)
loadings
cases <- t(replicate(50, rnorm(10)) )
multivar <- loadings %*% cases
T_multivar <- t(multivar)
var<-as.data.frame(T_multivar)
cor(var)
However the resulting correlations are far from anything that I specified initially.
Is it not possible to create such correlations or am I doing something wrong?
UPDATE: from Greg Snow's comment it became clear that the problem is that my initial correlation matrix is unreasonable.
The question then is how can I make the matrix reasonable. The goal is:
each of the 49 variables should correlate >.5 with the first variable.
~40 of the variables should have a high >.8 correlation with each other
the remaining ~9 variables should have a low or negative correlation with each other.
Is this whole requirement impossible ?
Try using the mvrnorm function from the MASS package rather than trying to construct the variables yourself.
Edit:
Here is a matrix that is positive definite (so it works as a correlation matrix) and comes close to your criteria. You can tweak the values from there; all the eigenvalues need to be positive, so you can see how changing a number affects things:
cor.mat <- matrix(0.2,nrow=50, ncol=50)
cor.mat[1,] <- cor.mat[,1] <- 0.55
cor.mat[2:41,2:41] <- 0.9
cor.mat[42:50, 42:50] <- 0.25
diag(cor.mat) <- 1
eigen(cor.mat)$values
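As a rough sketch of the mvrnorm suggestion, the matrix above can be passed as Sigma. The sample size of 200, the zero means and the seed are arbitrary choices for this illustration, and empirical = TRUE is used so that the sample correlations reproduce cor.mat rather than only approximating it:

```r
library(MASS)  # provides mvrnorm()

# Correlation matrix from the answer above.
cor.mat <- matrix(0.2, nrow = 50, ncol = 50)
cor.mat[1, ] <- cor.mat[, 1] <- 0.55
cor.mat[2:41, 2:41] <- 0.9
cor.mat[42:50, 42:50] <- 0.25
diag(cor.mat) <- 1

# All eigenvalues must be positive for a positive definite matrix.
stopifnot(all(eigen(cor.mat, symmetric = TRUE, only.values = TRUE)$values > 0))

set.seed(42)
# 200 cases of 50 variables with mean 0 and unit variance. With
# empirical = TRUE the sample is rescaled so that its correlation matrix
# matches cor.mat (up to rounding) instead of only in expectation.
x <- mvrnorm(n = 200, mu = rep(0, 50), Sigma = cor.mat, empirical = TRUE)

max(abs(cor(x) - cor.mat))  # largest deviation from the target
```

With the default empirical = FALSE the sample correlations scatter around the target values, which is usually what is wanted when simulating data.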
Some numerical experimentation based on your specifications above suggests that the generated matrix will never (what never? well, hardly ever ...) be positive definite, but it also doesn't look far from PD with these values (making lcor below negative will almost certainly make things worse ...)
rmat <- function(n=49,nhcor=40,hcor=0.8,lcor=0) {
  m <- matrix(lcor,n,n) ## fill matrix with 'lcor'
  ## select high-cor variables
  hcorpos <- sample(n,size=nhcor,replace=FALSE)
  ## make all of these highly correlated
  m[hcorpos,hcorpos] <- hcor
  ## compute min real part of eigenvalues
  min(Re(eigen(m,only.values=TRUE)$values))
}
set.seed(101)
r <- replicate(1000,rmat())
## NEVER pos definite
max(r)
## [1] -1.069413e-15
png("eighist.png")
par(las=1,bty="l") ## set after png() so the settings apply to the png device
hist(log10(abs(r)),breaks=50,col="gray",main="")
dev.off()
