I found out that boot function of the boot package is not working with complex numbers. I am trying to bootstrap a data by taking the eigenvalue of the bivariate matrix. The problem with the eigenvalue is that, it often returns complex numbers, and by that it (boot) gives error. Is there a way to avoid complex numbers?
Here is my codes,
Data <- read.table('http://ubuntuone.com/6n1igcHXq4EnOm4x2zeqFb', header = FALSE)
Mat <- cbind(Data[["V1"]],Data[["V2"]])
Data.ts <- as.ts(Mat)
Below are some functions needed,
library(mvtnorm)
var1.sim <- function(T, n.start=100, phi1=matrix(c(0.7,0.2,0.2,0.7),nr=2),
err.mu=c(0,0), err.sigma2=matrix(c(1,0.5,0.5,1), nr=2),
errors=NULL) {
e <- rmvnorm(n.start + T, err.mu, err.sigma2) # (n.start+T) x 2 matrix
y <- matrix(0, nrow=n.start+T, ncol=2)
if (!is.null(errors) && is.matrix(errors) && ncol(errors) == 2) {
rows <- nrow(errors)
if (rows < n.start + T) {
# replace last nrow(errors) errors
e[seq.int(n.start+T-rows+1,n.start+T),] <- errors
} else {
e <- errors[seq.int(n.start+T+1, rows)]
}
}
for (t in seq.int(2, n.start + T)) {
y[t,] <- phi1 %*% y[t-1,] + e[t,]
}
return(ts(y[seq.int(n.start+1,n.start+T),]))
}
########
coef.var1 <- function(var.fit) {
k <- coef(var.fit)
rbind(k[[1]][,"Estimate"], k[[2]][,"Estimate"])
}
And here is the main method,
library(vars)
library(boot)
y.var <- VAR(Data.ts, p=1, type="none")
y.resid <- resid(y.var)
rm(y.boot)
y.boot <- boot(y.resid, R=100, statistic=function(x,i) {
resid.boot <- x[i,]
y.boot1 <- var1.sim(T=nrow(x), errors=resid.boot)
min(eigen(coef.var1(VAR(y.boot1, p=1, type="none")))$values)
}, stype="i")
y.boot$t
y.ci <- boot.ci(y.boot, type="norm", conf=0.95)$normal[2:3]
list(t=y.boot$t,ci=y.ci)
The problem occurs in y.boot object, particularly this line
min(eigen(coef.var1(VAR(y.boot1, p=1, type="none")))$values)
When the obtain minimum eigenvalue is complex, then boot will return this error
Error in min(eigen(coef.var1(VAR(y.boot1, p = 1, type = "none")))$values) :
invalid 'type' (complex) of argument
Otherwise, there is no problem. Now, it would be safe if this 100 bootstraps is performed once, but I am going to loop this actually about 100 times too. So, there is a big chance that complex values will occur in these loops. Hence, we will obtain the above error again.
Is there a way to avoid these complex values?
Related
Environments and the like have always confused me incredibly in R. I guess therefore this is more of a reference request, since I've been surfing the site for the last hour in search of an answer to no avail.
I have a simple R function called target defined as follows
target <- function(x,scale,shape){
s <- scale
b <- shape
value <- 0.5*(sin(s*x)^b + x + 1)
return(value)
}
I then define the function AR
AR <- function(n,f,...){
variates <- NULL
for(i in 1:n){
z <- runif(1)
u <- runif(1)
if(u < f(z, scale, shape)/c){
variates[i] <- z
}else{next}
}
variates <- variates[!is.na(variates)]
return(variates)
}
in which the function target is being evaluated. Unfortunately, the call returns the following error
sample <- AR(n = 10000, f = target, shape = 8, scale = 5)
Error in fun(z, scale, shape) : object 'shape' not found
I know this has to do with the function AR not knowing where to look for the objects shape and scale, but I thought that was exactly the job of the ellipsis: allowing me to sort of put argument definition "on hold" until one actually calls the function. Where am I wrong and could anyone give me a lead as to where to look for insight on this specific problem?
You are very close, you just need to make use of your ellipses...
NB: c was not defined in AR so I added it and gave it a value.
NB2: I would refrain from using c and sample in your function as these themselves are functions and could cause some confusion downt he road.
AR <- function(n, f, c, ...){
variates <- NULL
for(i in 1:n){
z <- runif(1)
u <- runif(1)
if(u < f(z, ...)/c){ ##instead of using shape and scale use the ellipses and R will insert any parameters here which were not defined in the function
variates[i] <- z
}else{next}
}
variates <- variates[!is.na(variates)]
return(variates)
}
sample <- AR(n = 10000, f = target, shape = 8, scale = 5, c = 100)
Introduction to the problem
I am trying to write down a code in R so to obtain the weights of an Equally-Weighted Contribution (ERC) Portfolio. As some of you may know, the portfolio construction was presented by Maillard, Roncalli and Teiletche.
Skipping technicalities, in order to find the optimal weights of an ERC portfolio one needs to solve the following Sequential Quadratic Programming problem:
with:
Suppose we are analysing N assets. In the above formulas, we have that x is a (N x 1) vector of portfolio weights and Σ is the (N x N) variance-covariance matrix of asset returns.
What I have done so far
Using the function slsqp of the package nloptr which solves SQP problems, I would like to solve the above minimisation problem. Here is my code. Firstly, the objective function to be minimised:
ObjFuncERC <- function (x, Sigma) {
sum <- 0
R <- Sigma %*% x
for (i in 1:N) {
for (j in 1:N) {
sum <- sum + (x[i]*R[i] - x[j]*R[j])^2
}
}
}
Secondly, the starting point (we start by an equally-weighted portfolio):
x0 <- matrix(1/N, nrow = N, ncol = 1)
Then, the equality constraint (weights must sum to one, that is: sum of the weights minus one equal zero):
heqERC <- function (x) {
h <- numeric(1)
h[1] <- (t(matrix(1, nrow = N, ncol = 1)) %*% x) - 1
return(h)
}
Finally, the lower and upper bounds constraints (weights cannot exceed one and cannot be lower than zero):
lowerERC <- matrix(0, nrow = N, ncol = 1)
upperERC <- matrix(1, nrow = N, ncol = 1)
So that the function which should output optimal weights is:
slsqp(x0 = x0, fn = ObjFuncERC, Sigma = Sigma, lower = lowerERC, upper = upperERC, heq = heqERC)
Unfortunately, I do not know how to share with you my variance-covariance matrix (which takes name Sigma and is a (29 x 29) matrix, so that N = 29) so to reproduce my result, still you can simulate one.
The output error
Running the above code yields the following error:
Error in nl.grad(x, fn) :
Function 'f' must be a univariate function of 2 variables.
I have no idea what to do guys. Probably, I have misunderstood how things must be written down in order for the function slsqp to understand what to do. Can someone help me understand how to fix the problem and get the result I want?
UPDATE ONE: as pointed out by #jogo in the comments, I have updated the code, but it still produces an error. The code and the error above are now updated.
UPDATE 2: as requested by #jaySf, here is the full code that allows you to reproduce my error.
## ERC Portfolio Test
# Preliminary Operations
rm(list=ls())
require(quantmod)
require(nloptr)
# Load Stock Data in R through Yahoo! Finance
stockData <- new.env()
start <- as.Date('2014-12-31')
end <- as.Date('2017-12-31')
tickers <-c('AAPL','AXP','BA','CAT','CSCO','CVX','DIS','GE','GS','HD','IBM','INTC','JNJ','JPM','KO','MCD','MMM','MRK','MSFT','NKE','PFE','PG','TRV','UNH','UTX','V','VZ','WMT','XOM')
getSymbols.yahoo(tickers, env = stockData, from = start, to = end, periodicity = 'monthly')
# Create a matrix containing the price of all assets
prices <- do.call(cbind,eapply(stockData, Op))
prices <- prices[-1, order(colnames(prices))]
colnames(prices) <- tickers
# Compute Returns
returns <- diff(prices)/lag(prices)[-1,]
# Compute variance-covariance matrix
Sigma <- var(returns)
N <- 29
# Set up the minimization problem
ObjFuncERC <- function (x, Sigma) {
sum <- 0
R <- Sigma %*% x
for (i in 1:N) {
for (j in 1:N) {
sum <- sum + (x[i]*R[i] - x[j]*R[j])^2
}
}
}
x0 <- matrix(1/N, nrow = N, ncol = 1)
heqERC <- function (x) {
h <- numeric(1)
h[1] <- t(matrix(1, nrow = N, ncol = 1)) %*% x - 1
}
lowerERC <- matrix(0, nrow = N, ncol = 1)
upperERC <- matrix(1, nrow = N, ncol = 1)
slsqp(x0 = x0, fn = ObjFuncERC, Sigma = Sigma, lower = lowerERC, upper = upperERC, heq = heqERC)
I spotted several mistakes in your code. For instance, ObjFuncERC is not returning any value. You should use the following instead:
# Set up the minimization problem
ObjFuncERC <- function (x, Sigma) {
sum <- 0
R <- Sigma %*% x
for (i in 1:N) {
for (j in 1:N) {
sum <- sum + (x[i]*R[i] - x[j]*R[j])^2
}
}
sum
}
heqERC doesn't return anything too, I also changed your function a bit
heqERC <- function (x) {
sum(x) - 1
}
I made those changes and tried slsqp without lower and upper and it worked. Still, another thing to consider is that you set lowerERC and upperERC as matrices. Use the following instead:
lowerERC <- rep(0,N)
upperERC <- rep(1,N)
Hope this helps.
I have a dataset of climate data in a data.frame (columns are measuring stations, and rows indicate time of measurement), and I'm trying to find the proper lambda values in a Yeo-Johnson transform to limit skewness impact on a principal component analysis.
Obviously, the first step is to get log likelihoods to find the best lambda : I use the following, where i is the index of a column :
getYeoJohsnonLambda <- function(myClimateData,cols,lambda_min, lambda_max,eps)
...
lambda <- seq(lambda_min,lambda_max,eps)
for(i in cols)
{
formula <- as.formula(paste("myClimateData$",colnames(myClimateData)[i],"~1"))
currentModel <- lm(formula,myClimateData)
print(currentModel)
myboxCox <- boxCox(currentModel, lambda = lambda ,family="yjPower", plotit = FALSE)
...
}
When I am trying to call it for a climateData time series which could be, for example :
`climateData <-data.frame(c(8.2,6.83,5.46,4.1,3.73,3.36,3,3,3,3,3.7),c(0,0.66,1.33,2,2,2,2,2,2,2,1.6))`
I get this error : Error in is.data.frame(data) : object 'myClimateData' not found
This is weird, as lm seems to find it and return a correct fit, and myClimateData should be found as it is one of the arguments of the function, right ?
Sadly, it seems that the problem comes from the function boxCox rather than your getYeoJohsnonLambda function. As BrodieG pointed out in a related question, this function uses parent.frame as an argument to eval which is considered as bad practice in the doc.
One way to solve this is to build the models before the call, as suggested in Adam Quek's answer:
library(car)
climateData <- data.frame(c(8.2,6.83,5.46,4.1,3.73,3.36,3,3,3,3,3.7),c(0,0.66,1.33,2,2,2,2,2,2,2,1.6))
names(climateData) <- c("a","b")
modelList <- list()
for(k in 1:ncol(climateData)) {
modelList[[k]] <- lm(as.formula(paste0(names(climateData)[k],"~1")),data=climateData)
}
getYeoJohnsonLambda <- function(myClimateData, cols, lambda_min, lambda_max, eps)
{
#Recommended values for lambda_min = -0.5 and lambda_max = 2.0, eps = 0.1
myboxCox <- list()
lmd <- seq(lambda_min,lambda_max,eps)
for(i in cols)
{
cat("Creating model for column # ",i,"\n")
currentModel <- modelList[[i]]
myboxCox[[i]] <- boxCox(currentModel, lambda = lmd ,family="yjPower", plotit = FALSE)
}
return(myboxCox)
}
test <- getYeoJohnsonLambda(climateData,c(1,2) ,-0.5,2,0.1)
Other solution (arguably cleaner): use yeo.johnson in VGAM
library(VGAM)
getYeoJohnsonLambda_VGAM <- function(myClimateData, cols, lambda_min, lambda_max, eps)
{
#Recommended values for lambda_min = -0.5 and lambda_max = 2.0, eps = 0.1
myboxCox <- list()
lmd <- seq(lambda_min,lambda_max,eps)
return(apply(climateData,2,yeo.johnson,lambda=lmd))
}
test2 <- getYeoJohnsonLambda_VGAM(climateData,c(1,2) ,-0.5,2,0.1)
Here's a solution without troubleshooting the function getYeoJohsnonLambda:
iris.dat <- iris[-5]
vars <- names(iris.dat)
lmd <- seq(.1, 1, .1) #lambda_min, lambda_max, eps
all.form <- lapply(vars, function(x) as.formula(paste0(x, "~ 1")))
all.lm <- lapply(all.form, lm, data=iris.dat)
library(MASS)
all.bcox <- lapply(all.form, boxcox, data=iris.dat,
lambda=lmd, family="yjPower", plotit=FALSE)
I have set up a Metropolis-Hastings algorithm, and now I am trying to run the algorithm using parallel computing. I have set up a single-chain function
library(parallel)
library(foreach)
library(mvtnorm)
library(doParallel)
n<-100
mX <- 1:n
vY <- rnorm(n)
chains <- 4
iter <- n
p <- 2
#Loglikelihood
post <- function(y, theta) dmvnorm(t(y), rep(0,length(y)), theta[1]*exp(- abs(matrix(rep(mX,n),n) - matrix(rep(mX,each=n),n))/theta[2]),log=TRUE)
geninits <- function() list(theta = runif(p, 0, 1))
dist <- 0.01
jump <- function(x, dist) exp(log(x) + rmvnorm(1,rep(0,p),diag(rep(dist,p))))
MCsingle <- function(){ # This is part of a larger function, so no input are needed
inits <- geninits()
theta.post <- matrix(NA,nrow=p,ncol=iter)
for (i in 1:p) theta.post[i,1] <- inits$theta[i]
for (t in 2:iter){
theta_star <- c(jump(theta.post[, t-1],dist))
pstar <- post(vY, theta = theta_star) # post is the loglikelihood using dmvnorm.
pprev <- post(vY, theta = theta.post[,t-1])
r <- min(exp(pstar - pprev) , 1)
accept <- rbinom(1, 1, prob = r)
if (accept == 1){
theta.post[, t] <- theta_star
} else {
theta.post[, t] <- theta.post[, t-1]
}
}
return(theta.post)
}
, which returns an p x iter matrix, with p parameters and iter iterations.
cl<-makeCluster(4)
registerDoParallel(cl)
posterior <- foreach(c = 1:chains) %dopar% {
MCsingle() }
UPDATE: When I tried to simplify the problem the code suddenly seemed to work. Even though I purposely tried to make errors, the code ran perfectly and the results were as wanted. So for others with similar problems unfortunately I cannot give an answer.
A follow-up question:
My initial purpose was to built up an entire function, such that
MCmulti <- function(mX,vY,iter,chains){
posterior <- foreach(c = 1:chains) %dopar% {
MCsingle() }
return(posterior)
}
but the foreach-loop does not seem to read all the required functions like:
Error in FUN() : task 1 failed - "could not find function "geninits""
Can anybody answer how to implement custom functions inside a foreach loop? Am I to input it as MCmulti <- function(FUN,...) FUN() and call MCmulti(MCsingle,...) ?
I am trying to run a VaR model on my time series for different alpha values and always come up with the error Error in[<-(tmp, i + seq[1], j, value = 0.09265792) : subscript out of bounds when i try to get my results in a matrix form with different p values. If i solely run the model with one for and a fixed value for p i receive correct results. Why do i get this error, i have not found a solution on this topic in the many threads read! I simulated my time series in order to be reproducible! Thanks
#HS VaR function
VaRhistorical <- function(returns, prob=.05) {
ans <- -quantile(returns, prob)
signif(ans, digits=7)
}
#Parameter specification
ret <- runif(5000,-0.1,0.1)
p <- c(0.05, 0.025)#, 0.01, 0.005, 0.001)
vseq <- -499:0 #VaR estimated from past 500 ret values change to -1999 for window of 2000 observations
#HS VaR Estimation with specified parameters
estperiod <- length(vseq)
VaRhs <- matrix(nrow=length(estperiod:(length(ret)-1)),ncol=length(p),byrow=T)
for (i in estperiod:(length(ret)-1)) {
seq <- vseq + i
for (j in p) {
VaRhs[i+vseq[1],j] <- VaRhistorical(ret[seq],prob=j)
}
}
act <- ret[(length(vseq)+1):length(ret)]
violationratio <- ifelse(act>=(-VaRhs),0,1)
sum(violationratio)
I changed the code as follows:
I removed seq, and vseq as they were not needed. I redefined estperiod to just be the length of the window. The innner loop (the one with j) was removed and replaced with a sapply that calculates the values for all of your p's at once and puts then in the proper row. You should check the math to make sure I have the right part of ret going into the VaRhistorical function for each prediction.
#HS VaR function
VaRhistorical <- function(returns, prob=.05) {
ans <- -quantile(returns, prob)
signif(ans, digits=7)
}
#Parameter specification
ret <- runif(5000,-0.1,0.1)
p <- c(0.05, 0.025)#, 0.01, 0.005, 0.001)
estperiod<-500 #now its the length of the window
VaRhs <- matrix(nrow=length(ret)-estperiod, ncol=length(p))
for (i in 1:nrow(VaRhs) ) {
VaRhs[i,]<- sapply(X=p, FUN=VaRhistorical,returns=ret[i:(estperiod+i-1)] )
}
act <- ret[(estperiod+1):length(ret)]
violationratio <- ifelse(act>=(-VaRhs),0,1)
sum(violationratio)