Log covariance to arithmetic covariance matrix function in R?

Is there a function that can convert a covariance matrix built using log-returns into a covariance matrix based on simple arithmetic returns?
Motivation: We'd like to use a mean-variance utility function where expected returns and variances are specified in arithmetic terms. However, estimating returns and covariances is often performed with log-returns because of their additivity property, and we assume asset prices follow a lognormal stochastic process.
Meucci describes a process for generating an arithmetic-returns-based covariance matrix for a generic/arbitrary distribution of lognormal returns on page 5 of the appendix.
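For reference, these are the standard lognormal moment formulas that the code below implements: if the log-returns are $X \sim N(\mu, \Sigma)$ and the arithmetic returns are $Y_i = e^{X_i} - 1$, then

$$ E[Y_i] = e^{\mu_i + \Sigma_{ii}/2} - 1, \qquad \operatorname{Cov}(Y_i, Y_j) = e^{\mu_i + \mu_j + (\Sigma_{ii}+\Sigma_{jj})/2}\,\bigl(e^{\Sigma_{ij}} - 1\bigr). $$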

Here's my translation of the formulae:
linreturn <- function(mu, Sigma) {
  ## arithmetic expected returns: E[exp(X_i)] - 1
  m <- exp(mu + diag(Sigma)/2) - 1
  ## arithmetic covariance matrix
  x1 <- outer(mu, mu, "+")
  x2 <- outer(diag(Sigma), diag(Sigma), "+")/2
  S <- exp(x1 + x2) * (exp(Sigma) - 1)
  list(mean = m, vcov = S)
}
edit: fixed -1 issue based on comments.
Try an example:
m1 <- c(1,2)
S1 <- matrix(c(1,0.2,0.2,1),nrow=2)
Generate multivariate log-normal returns:
set.seed(1001)
r1 <- exp(MASS::mvrnorm(200000,mu=m1,Sigma=S1))-1
colMeans(r1)
## [1] 3.485976 11.214211
var(r1)
## [,1] [,2]
## [1,] 34.4021 12.4062
## [2,] 12.4062 263.7382
Compare with expected results from formulae:
linreturn(m1,S1)
## $mean
## [1] 3.481689 11.182494
## $vcov
## [,1] [,2]
## [1,] 34.51261 12.08818
## [2,] 12.08818 255.01563

Related

Simulating data from multivariate distribution in R based on Winbugs/JAGS script

I am trying to simulate data based on part of a JAGS/WinBUGS script. The script comes from Eaves & Erkanli (2003; see http://psych.colorado.edu/~carey/pdffiles/mcmc_eaves.pdf, pages 295-296).
The part of the script I want to base my simulations on is as follows (with different variable names than in the original paper):
for (fam in 1:nmz) {
  a2mz[fam, 1:N] ~ dmnorm(mu[1:N], tau.a[1:N, 1:N])
  a1mz[fam, 1:N] ~ dmnorm(a2mz[fam, 1:N], tau.a[1:N, 1:N])
}
# Prior
tau.a[1:N, 1:N] ~ dwish(omega.g[,], N)
I want to simulate data in R for the parameters a2mz and a1mz as given in the script above.
So basically, I want to simulate data from -N- (e.g. 3) multivariate distributions for -fam- (e.g. 10) persons with covariance tau.a.
To make this more illustrative: the purpose is to simulate genetic effects for -fam- (e.g. 10) families. The genetic effect is the same within each family (e.g. monozygotic twins), with a variance of tau.a (e.g. 0.5). Of these genetic effects, 3 'versions' (3 multivariate distributions) have to be simulated.
What I tried in R to simulate the data as given in the JAGS/Winbugs script is as follows:
library(MASS)
nmz <- 10     # number of families, here e.g. 10
var_a <- 0.5  # tau.a in the script
a2_mz <- mvrnorm(3, mu = rep(0, nmz), Sigma = diag(nmz) * var_a)
This simulates data for the a2mz parameter as referred to in the JAGS/Winbugs script above:
> print(t(a2_mz))
[,1] [,2] [,3]
[1,] -1.1563683 -0.4478091 -0.15037563
[2,] 0.5673873 -0.7052487 0.44377336
[3,] 0.2560446 0.9901964 -0.65463341
[4,] -0.8366952 0.4924839 -0.56891991
[5,] 0.7343780 0.5429955 0.87529201
[6,] 0.5592868 -0.3899988 -0.33709105
[7,] -1.8233663 -0.7149141 -0.18153049
[8,] -0.8213804 -1.4397075 -0.09159725
[9,] -0.7002797 -0.3996970 -0.29142215
[10,] 1.1084067 0.3884869 -0.46207940
However, when I then try to use these data to simulate data for the a1mz parameter (third line of the JAGS/WinBUGS script), something goes wrong and I am not sure what:
a1_mz <- mvrnorm(3, mu = t(a2_mz), Sigma = c(diag(nmz)*var_a, diag(nmz)*var_a, diag(nmz)*var_a))
This results in the error:
Error in eigen(Sigma, symmetric = TRUE, EISPACK = EISPACK) :
non-square matrix in 'eigen'
Can anyone give me any hints or tips on what I am doing wrong?
Many thanks,
Best regards,
inga
mvrnorm() takes a mean vector and a variance matrix as input, and that's not what you're feeding it. I'm not sure I understand your question, but if you want to simulate 3 samples from 3 different multivariate normal distributions with the same variance and different means, then just use:
a1_mz <- array(dim = c(dim(a2_mz), 3))
for (i in 1:3) a1_mz[, , i] <- mvrnorm(3, t(a2_mz)[, i], diag(nmz) * var_a)
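One side note that may matter here: dmnorm in JAGS/WinBUGS is parameterized by a precision matrix (tau.a), whereas MASS::mvrnorm() expects a covariance matrix, so the two are inverses of one another. A minimal sketch of a more literal line-by-line translation of the model, assuming the covariance diag(N) * 0.5 stands in for solve(tau.a):
library(MASS)
N <- 3; nmz <- 10
mu <- rep(0, N)
Sigma.a <- diag(N) * 0.5                        # plays the role of solve(tau.a)
a2mz <- mvrnorm(nmz, mu = mu, Sigma = Sigma.a)  # one N-vector per family (nmz x N)
a1mz <- t(apply(a2mz, 1, function(m) mvrnorm(1, mu = m, Sigma = Sigma.a)))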

R eigenvalues/eigenvectors

I have this correlation matrix
A
        [,1]    [,2]    [,3]    [,4]    [,5]
[1,] 1.00000 0.00975 0.97245 0.43887 0.02241
[2,] 0.00975 1.00000 0.15428 0.69141 0.86307
[3,] 0.97245 0.15428 1.00000 0.51472 0.12193
[4,] 0.43887 0.69141 0.51472 1.00000 0.77765
[5,] 0.02241 0.86307 0.12193 0.77765 1.00000
And I need to get the eigenvalues, eigenvectors and loadings in R.
When I use the princomp(A, cor=TRUE) function I get the variances (eigenvalues), but when I use the eigen(A) function I get the eigenvalues and eigenvectors; the eigenvalues in this case are different from those princomp gives me.
Which function is the right one to get the eigenvalues?
I believe you are referring to a PCA when you talk of eigenvalues, eigenvectors and loadings. princomp() is essentially doing the following (when cor=TRUE):
### Step 1: correlation matrix
Acs <- scale(A, center = TRUE, scale = TRUE)
COR <- (t(Acs) %*% Acs) / (nrow(Acs) - 1)
COR; cor(Acs)  # equal

### Step 2: decompose the matrix using eigen() to derive the PC loadings
E <- eigen(COR)
E$vectors  # loadings
E$values   # eigenvalues

### Step 3: project the data onto the loadings to derive the new coordinates (principal components)
B <- Acs %*% E$vectors
eigen(M) gives you the correct eigenvalues and eigenvectors of M.
princomp() is to be handed the data matrix; you are mistakenly feeding it the correlation matrix!
princomp(A) will treat A as the data, compute a correlation matrix from it, and then take that matrix's eigenvectors and eigenvalues. So the eigenvalues of A itself (in case A held the data, as princomp supposes) are not just irrelevant; they are of course different from what princomp() comes up with at the end.
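As an aside, if all you actually have is the correlation matrix (no raw data), princomp() accepts it via its covmat argument, and the squared sdev values should then agree with eigen(A)$values:
pc <- princomp(covmat = A)  # PCA directly from the correlation matrix
pc$sdev^2                   # matches eigen(A)$values
unclass(loadings(pc))       # the eigenvectors, up to sign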
For an illustration of performing a PCA in R see here: http://www.joyofdata.de/blog/illustration-of-principal-component-analysis-pca/

Numerical derivatives of an arbitrarily defined function

I would like to find numerical derivatives of a bivariate function.
The function is one I defined myself.
I need the first derivatives with respect to each argument and the cross second derivative.
Is there a package or built-in function to do this?
Install and load the numDeriv package.
library(numDeriv)
f <- function(x) {
  a <- x[1]; b <- x[2]; c <- x[3]
  sin(a^2 * (abs(cos(b))^c))
}
grad(f,x=1:3)
## [1] 0.14376097 0.47118519 -0.06301885
hessian(f,x=1:3)
## [,1] [,2] [,3]
## [1,] 0.1422651 0.9374675 -0.12538196
## [2,] 0.9374675 1.8274058 -0.25388515
## [3,] -0.1253820 -0.2538852 0.05496226
(My example is trivariate rather than bivariate, but it will obviously work for a bivariate function as well.) See the help pages for more information on how the gradient and especially Hessian computations are done.
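For the bivariate case in the question, the two first derivatives come from grad() and the cross second derivative is just an off-diagonal entry of the Hessian. A minimal sketch with a made-up function g:
g <- function(x) x[1]^2 * sin(x[2])  # hypothetical bivariate function
grad(g, x = c(1, 2))                 # first derivatives w.r.t. each argument
hessian(g, x = c(1, 2))[1, 2]        # cross partial; analytically 2*x[1]*cos(x[2])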

How to compute the power of a matrix in R [duplicate]

This question already has answers here:
A^k for matrix multiplication in R?
(6 answers)
I'm trying to compute the -0.5 power of the following matrix:
S <- matrix(c(0.088150041, 0.001017491 , 0.001017491, 0.084634294),nrow=2)
In Matlab, the result is (S^(-0.5)):
S^(-0.5)
ans =
3.3683 -0.0200
-0.0200 3.4376
> library(expm)
> solve(sqrtm(S))
[,1] [,2]
[1,] 3.36830328 -0.02004191
[2,] -0.02004191 3.43755429
After some time, the following solution came up:
"%^%" <- function(S, power)
with(eigen(S), vectors %*% (values^power * t(vectors)))
S%^%(-0.5)
The result gives the expected answer:
[,1] [,2]
[1,] 3.36830328 -0.02004191
[2,] -0.02004191 3.43755430
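As a quick sanity check (this assumes S is symmetric positive definite, which it is here), S^(-1/2) should reduce S to the identity:
round((S %^% (-0.5)) %*% S %*% (S %^% (-0.5)), 10)  # 2x2 identity, up to rounding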
The square root of a matrix is not necessarily unique (most real numbers have at least two square roots, so it is not just matrices). There are multiple algorithms for generating a square root of a matrix. Others have shown the approach using expm and eigenvalues, but the Cholesky decomposition is another possibility (see the chol function).
To extend this answer beyond square roots, the following function exp.mat() generalizes the Moore-Penrose pseudoinverse of a matrix and allows one to calculate the exponentiation of a matrix via a singular value decomposition (SVD) (it even works for non-square matrices, although I don't know when one would need that).
exp.mat() function:
# The exp.mat() function can calculate the pseudoinverse of a matrix (EXP = -1)
# and other exponents of matrices, such as square roots (EXP = 0.5) or the
# square root of its inverse (EXP = -0.5).
# The function arguments are a matrix (MAT), an exponent (EXP), and a tolerance
# level for non-zero singular values.
exp.mat <- function(MAT, EXP, tol = NULL) {
  MAT <- as.matrix(MAT)
  matdim <- dim(MAT)
  if (is.null(tol)) {
    tol <- min(1e-7, .Machine$double.eps * max(matdim) * max(MAT))
  }
  if (matdim[1] >= matdim[2]) {
    svd1 <- svd(MAT)
    keep <- which(svd1$d > tol)
    res <- t(svd1$u[, keep] %*% diag(svd1$d[keep]^EXP, nrow = length(keep)) %*% t(svd1$v[, keep]))
  }
  if (matdim[1] < matdim[2]) {
    svd1 <- svd(t(MAT))
    keep <- which(svd1$d > tol)
    res <- svd1$u[, keep] %*% diag(svd1$d[keep]^EXP, nrow = length(keep)) %*% t(svd1$v[, keep])
  }
  return(res)
}
Example
S <- matrix(c(0.088150041, 0.001017491 , 0.001017491, 0.084634294),nrow=2)
exp.mat(S, -0.5)
# [,1] [,2]
#[1,] 3.36830328 -0.02004191
#[2,] -0.02004191 3.43755429

Using R to honor correlations for LatinHypercube / Monte Carlo trials

I am currently using python and RPY to use the functionality inside R.
How do I use an R library to generate Monte Carlo samples that honor the correlation between two variables?
e.g.
if variables A and B have a correlation of 85% (0.85), I need to generate all the Monte Carlo samples honoring that correlation between A & B.
Would appreciate if anyone can share ideas / snippets
Thanks
The rank correlation method of Iman and Conover seems to be a widely used and general approach to producing correlated Monte Carlo samples for computer-based experiments, sensitivity analysis, etc. Unfortunately I have only just come across it and don't have access to the PDF, so I don't know how the authors actually implement their method, but you could follow this up.
Their method is more general because each variable can come from a different distribution, unlike the multivariate normal of @Dirk's answer.
Update: I found an R implementation of the above approach in package mc2d, in particular you want the cornode() function.
Here is an example taken from ?cornode
> require(mc2d)
> x1 <- rnorm(1000)
> x2 <- rnorm(1000)
> x3 <- rnorm(1000)
> mat <- cbind(x1, x2, x3)
> ## Target
> (corr <- matrix(c(1, 0.5, 0.2, 0.5, 1, 0.2, 0.2, 0.2, 1), ncol=3))
[,1] [,2] [,3]
[1,] 1.0 0.5 0.2
[2,] 0.5 1.0 0.2
[3,] 0.2 0.2 1.0
> ## Before
> cor(mat, method="spearman")
x1 x2 x3
x1 1.00000000 0.01218894 -0.02203357
x2 0.01218894 1.00000000 0.02298695
x3 -0.02203357 0.02298695 1.00000000
> matc <- cornode(mat, target=corr, result=TRUE)
Spearman Rank Correlation Post Function
x1 x2 x3
x1 1.0000000 0.4515535 0.1739153
x2 0.4515535 1.0000000 0.1646381
x3 0.1739153 0.1646381 1.0000000
The rank correlations in matc are now very close to the target correlations in corr.
The idea here is that you draw the samples separately from the distribution of each variable, and then use the Iman & Conover approach to bring the samples' rank correlations as close to the target correlations as possible.
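To illustrate that generality, the same call works when each column is drawn from a different marginal; a quick sketch (assuming mc2d is loaded and corr is the target matrix from the example above):
mat2 <- cbind(rlnorm(1000), runif(1000), rgamma(1000, shape = 2))
matc2 <- cornode(mat2, target = corr)
cor(matc2, method = "spearman")  # rank correlations close to corr; the marginals are untouched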
That is a FAQ. Here is one answer using a recommended package:
R> library(MASS)
R> example(mvrnorm)
mvrnrmR> Sigma <- matrix(c(10,3,3,2),2,2)
mvrnrmR> Sigma
[,1] [,2]
[1,] 10 3
[2,] 3 2
mvrnrmR> var(mvrnorm(n=1000, rep(0, 2), Sigma))
[,1] [,2]
[1,] 8.82287 2.63987
[2,] 2.63987 1.93637
mvrnrmR> var(mvrnorm(n=1000, rep(0, 2), Sigma, empirical = TRUE))
[,1] [,2]
[1,] 10 3
[2,] 3 2
R>
Switching between correlation and covariance is straightforward (hint: outer product of vector of standard deviations).
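A minimal sketch of that hint, with made-up standard deviations s:
s <- c(2, 0.5)                          # hypothetical standard deviations
R <- matrix(c(1, 0.85, 0.85, 1), 2, 2)  # target correlation matrix
Sigma <- R * outer(s, s)                # correlation -> covariance
cov2cor(Sigma)                          # covariance -> correlation (base R), recovers R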
This question was not tagged as python, but based on your comment it looks like you might be looking for a Python solution as well. The most basic Python (numpy) implementation of Iman-Conover that I can concoct looks like the following:
from numpy import argsort, array, size, sort, take, zeros
from numpy.random import multivariate_normal

def makeCorrelated(y, corMatrix):
    # draw correlated normals; their ranks define the target ordering
    c = multivariate_normal(zeros(size(y, 0)), corMatrix, size(y, 1))
    key = argsort(argsort(c, axis=0), axis=0).T
    # reorder each variable's sorted marginal samples by those ranks
    out = array([take(sort(row), k) for row, k in zip(y, key)])
    return out
where y is an array of samples from the marginal distributions and corMatrix is a positive semi-definite, symmetric correlation matrix. Given that this function uses multivariate_normal() for the c matrix, you can tell it uses an implied Gaussian copula. To use different copula structures you'll need different drivers for the c matrix.
