DEoptim does not return optimal parameters

I am trying to use DEoptim to optimize the parameters of the Heston pricing model (NMOF package). My goal is to minimize the difference between the real option price and the Heston price. However, when running my code, DEoptim does not save the best result but always displays the value obtained with the initial parameters, not the optimized ones. Unfortunately, I'm totally new to R (and to programming in general), so I cannot seem to fix the problem.
My data, for one exemplary subset of an option, looks like this:
#Load data
#Real option price
C0116_P=as.vector(c(1328.700000, 1316.050000, 1333.050000, 1337.900000, 1344.800000))
#Strike price
C0116_K=as.vector(c(500, 500, 500, 500, 500))
#Time to maturity in years
C0116_T_t=as.vector(c(1.660274, 1.657534, 1.654795, 1.652055, 1.649315))
#Interest rate percentage
C0116_r=as.vector(c(0.080000, 0.080000, 0.090000, 0.090000, 0.090000))
#Dividend yield percentage
C0116_DY=as.vector(c(2.070000, 2.090000, 2.070000, 2.070000,2.060000))
#Price underlying
C0116_SP_500_P=as.vector(c(1885.08, 1872.83, 1888.03, 1892.49, 1900.53))
In the next step, I define the function I want to minimize (the difference between the real and the Heston price) and set some initial parameters. To optimize, I run a loop which, unfortunately, only returns the difference between the real option price and the Heston price obtained with the initial parameters as the best value, and not the actual parameters that minimize the difference.
#Load packages
require(NMOF)
require(DEoptim)
#Initial parameters
v0=0.2
vT=0.2
rho=0.2
k=0.2
sigma=0.2
#Define function
error_heston <- function(x) {
  error <- P - callHestoncf(S, X, tau, r, q, v0, vT, rho, k, sigma)
  return(error)
}
#Run optimization
outDEoptim <- matrix()
for (i in 1:5) {
  # I only want the parameters v0, vT, rho, k and sigma to change. That is why I kept the others constant
  lower <- c(C0116_P[i], C0116_SP_500_P[i], C0116_K[i], C0116_T_t[i], C0116_r[i]/100, C0116_DY[i]/100, 0.0001, 0.0001, -1, 0.0001, 0.0001)
  upper <- c(C0116_P[i], C0116_SP_500_P[i], C0116_K[i], C0116_T_t[i], C0116_r[i]/100, C0116_DY[i]/100, 10, 10, 1, 10, 10)
  outDEoptim <- DEoptim(error_heston, lower, upper, DEoptim.control(VTR = 0, itermax = 100))
  print(outDEoptim$optim$bestval)
  i = i + 1
}
Any help is much appreciated!

One of the first problems is that your objective function only has one argument (the parameters to optimize), so all the other objects used inside the function must be looked up in the enclosing environment. It's better practice to pass them explicitly.
Plus, many of the necessary values aren't defined in your example (e.g. S, X, etc). All the parameters you want to optimize will be passed to your objective function via the first argument. It can help clarify things if you explicitly assign each element inside your objective function. So a more robust objective function definition is:
# Define objective function
error_heston <- function(x, P, S, K, tau, r, q) {
  v0    <- x[1]
  vT    <- x[2]
  rho   <- x[3]
  k     <- x[4]
  sigma <- x[5]
  error <- abs(P - callHestoncf(S, K, tau, r, q, v0, vT, rho, k, sigma))
  return(error)
}
Also note that I took the absolute error. DEoptim is going to minimize the objective function, so it would try to make P - callHestoncf() as negative as possible, when you want it to be close to zero instead.
You specified the box constraints upper and lower even for the parameters that don't vary. It's best to only have DEoptim generate a population for the parameters that do vary, so I removed the non-varying parameters from the box constraints. I also defined them outside the for loop.
# Only need to set bounds for varying parameters
lower <- c(1e-4, 1e-4, -1, 1e-4, 1e-4)
upper <- c( 10, 10, 1, 10, 10)
Now to the actual DEoptim call. Here is where you will pass the values for all the non-varying parameters. You set them as named arguments to the DEoptim call, as I've done below.
i <- 1
outDEoptim <- DEoptim(error_heston, lower, upper,
                      DEoptim.control(VTR = 0, itermax = 100),
                      P = C0116_P[i], S = C0116_SP_500_P[i], K = C0116_K[i],
                      tau = C0116_T_t[i], r = C0116_r[i], q = C0116_DY[i])
I only ran one iteration of the for loop, because callHestoncf() frequently throws an error when its numerical integration routine fails, which stops the optimization. You should look into the cause of that, and ask a new question if you have trouble.
I also noticed you specified one of the non-varying inputs incorrectly. Your dividend yield percentages are 100 times too large. Your non-varying inputs should be:
# Real option price
C0116_P <- c(1328.70, 1316.05, 1333.05, 1337.90, 1344.80)
# Strike price
C0116_K <- c(500, 500, 500, 500, 500)
# Time to maturity in years
C0116_T_t <- c(1.660274, 1.657534, 1.654795, 1.652055, 1.649315)
# Interest rate percentage
C0116_r <- c(0.08, 0.08, 0.09, 0.09, 0.09)
# Dividend yield percentage
C0116_DY <- c(2.07, 2.09, 2.07, 2.07, 2.06) / 100
# Price underlying
C0116_SP_500_P <- c(1885.08, 1872.83, 1888.03, 1892.49, 1900.53)
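With the corrected inputs in place, here is a sketch of how the full five-option loop might look (my own extension of the single call above, not part of the original answer): each DEoptim() call is wrapped in tryCatch() because callHestoncf() can fail, and the optimized parameters are read from the bestmem slot of the result.
# Hedged sketch: loop over all five options and store each result
results <- vector("list", length(C0116_P))
for (i in seq_along(C0116_P)) {
  results[[i]] <- tryCatch(
    DEoptim(error_heston, lower, upper,
            DEoptim.control(VTR = 0, itermax = 100),
            P = C0116_P[i], S = C0116_SP_500_P[i], K = C0116_K[i],
            tau = C0116_T_t[i], r = C0116_r[i], q = C0116_DY[i]),
    error = function(e) NULL)
}
# Optimized parameters (v0, vT, rho, k, sigma) and objective value for option 1
if (!is.null(results[[1]])) {
  print(results[[1]]$optim$bestmem)
  print(results[[1]]$optim$bestval)
}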
As an aside, you should take a little time to format your code better. It makes it more readable, which should help you avoid typo-like errors.

Fitting a physical model to a specific data using nls: over-parameterization or unidentifiable parameters?

I have a somewhat complex physical model with five unknown parameters to fit, but no success so far.
I used nls2 first to get some estimates for the start values, but then nls, nlxb, and nlsLM all threw the famous "singular gradient error at initial parameter estimates" error.
I extracted the start values for nls2 from the literature, so I think that I have good starting values, at least for nls2. The parameter estimates from nls2 make good physical sense as well; however, they don't resolve the singular gradient matrix error.
Since it's a physical model, every coefficient has a physical meaning, and I prefer not to fix any of them.
I should also mention that all five unknown parameters in the model equation are positive and the shape parameter m can go up to 2.
Reading through many posts and trying different suggested solutions, I have come to the conclusion that I have either an over-parameterization or an unidentifiable-parameters problem.
My question is: should I stop trying to use nls with this specific model (with this many unknown parameters), or is there a way out?
I am quite new to the topic, so any help, mathematically or code-wise, is greatly appreciated.
Here is my MWE:
# Data
x <- c(0, 1000, 2000, 2500, 2750, 3000, 3250, 3500, 3750, 4000, 5000)
y <- c(1.0, 0.99, 0.98, 0.95, 0.795, 0.59, 0.35, 0.295, 0.175, 0.14, 0.095)
# Start values for nls2
bounds <- data.frame(a = c(0.8, 1.5), b = c(1e+5, 1e+7), c = c(0.4, 1.4), n = c(0.1, 2), m = c(0.1, 2))
# Model equation function
mod <- function(x, a, b, c, n, m){
  t <- b * 85^n * exp(-c/0.0309)
  1 - exp(-(a/(t*x))^m)
}
# # Model equation
# mod <- y ~ (1 - exp(-(a/(b*85^n*exp(-c/0.0309)*x))^m))
# Model fit with nls2
fit2 <- nls2(y ~ mod(x, a, b, c, n, m), data = data.frame(x, y), start = bounds, algorithm = "brute-force")
# Model fit with nls
fit <- nls(y ~ mod(x, a, b, c, n, m), data = data.frame(x, y), start = coef(fit2))
The more I look at this the more confused I get, but I'm going to try again.
Looking again at your expression, we have the expression inside the exponential
-(a/(b*85^n*exp(-c/0.0309)*x))^m
We can rewrite this as
-( [a/(b*85^n*exp(-c/0.0309))] * 1/x )^m
(please check my algebra!)
If this is correct, then that whole bracketed blob doesn't affect the functional dependence on x: it all collapses to a single constant in the equation. (In other words, {a, b, c, n} are all jointly unidentifiable.) Lumping that stuff into a single parameter phi:
1 - exp(-(phi/x)^m)
phi is a scale parameter (it has the same units as x and should be roughly the same magnitude as a typical value of x): let's try a starting value of 2500 (roughly the average value of x)
m is a shape parameter; we can't go too badly wrong starting from m==1
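Before fitting, we can sanity-check the lumping numerically (this check is my addition; the parameter values below are arbitrary):
# Arbitrary parameter values and the corresponding lumped constant phi
phi <- 1 / (1e6 * 85^1 * exp(-0.9/0.0309))
# The original model and the reparameterised form should agree for all x (returns TRUE)
all.equal(mod(x, a = 1, b = 1e6, c = 0.9, n = 1, m = 1.5),
          1 - exp(-(phi/x)^1.5))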
Now nls works fine without any extra help:
n1 <- nls(y~1 - exp(-(phi/x)^m), start=list(phi=2500,m=1), data=data.frame(x,y))
and gets phi=2935, m=6.49.
Plot predictions:
plot(x,y, ylim=c(0,1))
xvec <- seq(0, 5000, length=101)
lines(xvec, predict(n1, newdata=data.frame(x=xvec)))
Another way to think about what this curve is doing: we can transform the equation to -log(1-y) = phi^m*(1/x)^m: that is, -log(1-y) should follow a power-law curve with respect to 1/x.
Here's what that looks like:
plot(1/x, -log(1-y))
## curve() uses "x" as the current x-axis variable, i.e.
## read "x" as "1/x" below.
with(as.list(coef(n1)), curve(phi^m*x^m, add=TRUE))
In this format, it appears to fit the central data well but fails for large values of 1/x (the x=0 point is missing here because it goes to infinity).
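As an additional check (my addition, not part of the original answer), the power law can be read off from a straight-line fit on the log-log scale; the x = 0 point has to be dropped because both 1/x and -log(1-y) are infinite there.
# Straight-line fit on the log-log scale: slope ~ m, intercept ~ m*log(phi)
keep  <- x > 0 & y < 1
llfit <- lm(log(-log(1 - y[keep])) ~ log(1/x[keep]))
m_hat   <- unname(coef(llfit)[2])
phi_hat <- exp(unname(coef(llfit)[1]) / m_hat)
c(m = m_hat, phi = phi_hat)   # rough values to compare with the nls fit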

Integral and numeric optimization (nlminb) in R

I am having issues with an optimization problem involving numerical estimation of an integral which contains an unknown variable.
Numerically estimating an integral is simple enough: just use the integrate function in R. I am trying to estimate a rather unpleasant integral which requires optimization, since it contains an unknown variable and a constraint. I am using the nlminb function, but the result is highly incorrect. The idea is to evaluate the integral subject to the constraint that it is smaller than or equal to 1 - l, where l is between 0 and 1.
The code is the following:
integrand <- function(x, p) {
  dnorm(x, 0, 1) * (1 - dnorm((qnorm(p) - sqrt(0.12)*x) / sqrt(1 - 0.12), 0, 1))^800
}
It is the variable p that is unknown.
The objective function to be minimised is the following:
objective <- function(p){
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  PoD - 0.5
}
test <- nlminb(0.015, objective = objective, lower = 0, upper = 1)$par*100
Edit: updated to reflect mistakes in the objective function and the integrand. The same issue still remains.
I think my mistake is not specifying which variable to minimise over. The optimisation just returns the starting value given to nlminb, multiplied by 100.
The authors of the paper used dummy variables and showed that l = 0.5 should give p = 0.15%.
Thank you for your time.
Of course it does, since your objective function (as originally written) did not depend on p. Do:
integrand <- function(x, p) {
  dnorm(x, 0, 1) * (1 - dnorm((qnorm(p) - sqrt(0.12)*x) / sqrt(1 - 0.12), 0, 1))^800
}
objective <- function(p){
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  PoD - 0.5
}
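One further note (my own suggestion, not part of the answer above): nlminb() minimizes its objective, and PoD - 0.5 can simply go negative, so a common pattern is to minimize the squared (or absolute) difference, so that the minimum coincides with PoD = 0.5.
# Hedged sketch: minimize the squared residual so the optimum sits where PoD = 0.5
objective_sq <- function(p) {
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  (PoD - 0.5)^2
}
# keep p strictly inside (0, 1) so that qnorm(p) stays finite
test <- nlminb(0.015, objective_sq, lower = 1e-8, upper = 1 - 1e-8)
test$par * 100   # p expressed as a percentage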

Error in optim(): searching for global minimum for a univariate function

I am trying to optimize a function in R.
The function is the likelihood function of the negative binomial distribution when estimating only the mu parameter. This should not be a problem, since the function clearly has just one maximum. But I am not able to reach the desired result.
The function to be optimized is:
EMV <- function(data, par) {
  Mi  <- par
  Phi <- 2
  N   <- NROW(data)
  Resultado <- log(Mi/(Mi + Phi))*sum(data) + N*Phi*log(Phi/(Mi + Phi))
  return(Resultado)
}
data is a vector of negative binomial variables with parameters 2 and 2 (rnegbin() comes from the MASS package):
data <- rnegbin(10000, mu = 2, theta = 2)
When I plot the function with mu as the variable, using the following code:
x <- seq(0.1, 100, 0.02)
z <- EMV(data,0.1)
for (aux in x) {z <- rbind(z, EMV(data,aux))}
z <- z[2:NROW(z)]
plot(x,z)
I get the following curve:
And the maximum value of z is close to the parameter value, 2:
x[which.max(z)]
But the optimization does not work with BFGS:
Error in optim(par = theta, fn = EMV, data = data, method = "BFGS") :
non-finite finite-difference value [1]
And it does not go to the right value using SANN either, for example:
$par
[1] 5.19767e-05
$value
[1] -211981.8
$counts
function gradient
10000 NA
$convergence
[1] 0
$message
NULL
The questions are:
What am I doing wrong?
Is there a way to tell optim that the param should be bigger than 0?
Is there a way to tell optim that I want to maximize the function? (I am afraid optim is trying to minimize it and so heads toward a very small parameter value where the function returns its smallest values.)
Minimization or Maximization?
?optim says it can do maximization, but that is in a bracket, so minimization is the default:
fn: A function to be minimized (or maximized) ...
Thus, if we want to maximize an objective function, we need to multiply it by -1 and then minimize it. This is quite a common situation: in statistics we often want to find the maximum log-likelihood, so to use optim() we minimize the negative log-likelihood.
Which method to use?
If we only do 1D minimization, we should use method "Brent". This method allows us to specify a lower bound and an upper bound for the search region. The search starts from one bound and moves toward the other, until it hits the minimum or reaches the boundary. Such a specification can help you constrain your parameters. For example, if you don't want mu to be smaller than 0, just set lower = 0.
When we move to 2D or higher dimensions, we should resort to "BFGS". In this case, if we want to constrain one of our parameters, say a, to be positive, we can take the log transform log_a = log(a) and reparameterize our objective function in terms of log_a. Now log_a is free of constraints. The same goes when we want to constrain multiple parameters to be positive; see the sketch just below.
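For illustration, here is a toy example of the log-reparameterization trick (my own example, not from the question):
# Minimize f(a, b) = (a - 2)^2 + (b + 1)^2 subject to a > 0,
# by optimizing over log_a = log(a) instead of a.
f <- function(a, b) (a - 2)^2 + (b + 1)^2
g <- function(par) {
  a <- exp(par[1])   # back-transform: a is guaranteed to be positive
  b <- par[2]
  f(a, b)
}
res <- optim(par = c(log_a = 0, b = 0), fn = g, method = "BFGS")
exp(res$par[1])   # estimate of a (should be close to 2)
res$par[2]        # estimate of b (should be close to -1)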
How to change your code?
EMV <- function(data, par) {
  Mi  <- par
  Phi <- 2
  N   <- NROW(data)
  Resultado <- log(Mi/(Mi + Phi))*sum(data) + N*Phi*log(Phi/(Mi + Phi))
  return(-1 * Resultado)
}
optim(par = theta, fn = EMV, data = data, method = "Brent", lower = 0, upper = 1E5)  # theta is your starting value for mu
The help file for optim says: "By default optim performs minimization, but it will maximize if control$fnscale is negative." So if you either multiply your function output by -1 or change the control object input, you should get the right answer.
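For example, a minimal sketch of the fnscale approach (EMV_ll below is just the original, non-negated log-likelihood from the question, renamed to avoid confusion with the version above):
# The original log-likelihood (positive sign)
EMV_ll <- function(data, par) {
  Mi  <- par
  Phi <- 2
  N   <- NROW(data)
  log(Mi/(Mi + Phi))*sum(data) + N*Phi*log(Phi/(Mi + Phi))
}
# control$fnscale = -1 tells optim() to maximize; par = 1 is an arbitrary start
optim(par = 1, fn = EMV_ll, data = data, method = "Brent",
      lower = 0, upper = 1e5, control = list(fnscale = -1))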

confidence interval around predicted value from complex inverse function

I'm trying to get a 95% confidence interval around some predicted values, but I have not been able to achieve this.
Basically, I estimated a growth curve like this:
set.seed(123)
dat=data.frame(size=rnorm(50,10,3),age=rnorm(50,5,2))
S <- function(t,ts,C,K) ((C*K)/(2*pi))*sin(2*pi*(t-ts))
sommers <- function(t,Linf,K,t0,ts,C)
Linf*(1-exp(-K*(t-t0)-S(t,ts,C,K)+S(t0,ts,C,K)))
model <- nls(size~sommers(age,Linf,K,t0,ts,C),data=dat,
start=list(Linf=10,K=4.7,t0=2.2,C=0.9,ts=0.1))
I have independent size measurements for which I would like to predict the age. I therefore calculated the inverse of the function, which is not very straightforward, like this:
model.out=coef(model)
S.out <- function(t)
((model.out[[4]]*model.out[[2]])/(2*pi))*sin(2*pi*(t-model.out[[5]]))
sommers.out <- function(t)
model.out[[1]]*(1-exp(-model.out[[2]]*(t-model.out[[3]])-S.out(t)+S.out(model.out[[3]])))
inverse = function (f, lower = -100, upper = 100) {
function (y) uniroot((function (x) f(x) - y), lower = lower, upper = upper)[1]
}
sommers.inverse = inverse(sommers.out, 0, 25)
x= sommers.inverse(10) #this works with my complete dataset, but not with this fake one
Although this works fine, I need to know the 95% confidence interval around this estimate (x). For linear models there is, for example, predict(..., interval = "confidence"). I could also somehow bootstrap the function to get the quantiles associated with the parameters (I didn't find out how), and then use the extremes of those to calculate the maximum and minimum predictable values. But that doesn't really look like a good way of doing this...
Any help would be greatly appreciated.
EDIT after answer:
So this worked (explained in Ben Bolker's book; see the answer):
library(MASS)   # for mvrnorm()
vmat <- mvrnorm(1000, mu = coef(mfit), Sigma = vcov(mfit))
dist <- numeric(1000)
for (i in 1:1000) {dist[i] <- sommers_inverse(9.938, vmat[i,])}
quantile(dist, c(0.025, 0.975))
On the rather bad fake data I gave, this of course works rather horribly. But on the real data (which I have trouble recreating here), it is OK!
Unless I'm mistaken, you're going to have to use either regular (parametric) bootstrapping or a method called "population predictive intervals" (e.g., see section 5 of chapter 7 of Bolker 2008), which assumes that the sampling distributions of your parameters are multivariate Normal. However, I think you may have bigger problems, unless I've somehow messed up your model in adapting it ...
Generate data (note that random data may actually be bad for testing your model; see below ...):
set.seed(123)
dat <- data.frame(size=rnorm(50,10,3),age=rnorm(50,5,2))
S <- function(t,ts,C,K) ((C*K)/(2*pi))*sin(2*pi*(t-ts))
sommers <- function(t,Linf,K,t0,ts,C)
Linf*(1-exp(-K*(t-t0)-S(t,ts,C,K)+S(t0,ts,C,K)))
Plot the data and the initial curve estimate:
plot(size~age,data=dat,ylim=c(0,16))
agevec <- seq(0,10,length=1001)
lines(agevec,sommers(agevec,Linf=10,K=4.7,t0=2.2,ts=0.1,C=0.9))
I had trouble with nls so I used minpack.lm::nls.lm, which is slightly more robust. (There are other options here, e.g. calculating the derivatives and providing the gradient function, or using AD Model Builder or Template Model Builder, or using the nls2 package.)
For nls.lm we need a function that returns the residuals:
sommers_fn <- function(par,dat) {
with(c(as.list(par),dat),size-sommers(age,Linf,K,t0,ts,C))
}
library(minpack.lm)
mfit <- nls.lm(fn=sommers_fn,
par=list(Linf=10,K=4.7,t0=2.2,C=0.9,ts=0.1),
dat=dat)
coef(mfit)
## Linf K t0 C ts
## 10.6540185 0.3466328 2.1675244 136.7164179 0.3627371
Here's our problem:
plot(size~age,data=dat,ylim=c(0,16))
lines(agevec,sommers(agevec,Linf=10,K=4.7,t0=2.2,ts=0.1,C=0.9))
with(as.list(coef(mfit)), {
lines(agevec,sommers(agevec,Linf,K,t0,ts,C),col=2)
abline(v=t0,lty=2)
abline(h=c(0,Linf),lty=2)
})
With this kind of fit, the results of the inverse function are going to be extremely unstable, as the inverse function is many-to-one, with the number of inverse values depending sensitively on the parameter values ...
sommers_pred <- function(x,pars) {
with(as.list(pars),sommers(x,Linf,K,t0,ts,C))
}
sommers_pred(6,coef(mfit)) ## s(6)=9.93
sommers_inverse <- function (y, pars, lower = -100, upper = 100) {
uniroot(function(x) sommers_pred(x,pars) -y, c(lower, upper))$root
}
sommers_inverse(9.938, coef(mfit)) ## 0.28
If I pick my interval very carefully I can get back the correct answer ...
sommers_inverse(9.938, coef(mfit), 5.5, 6.2)
Maybe your model will be better behaved with more realistic data. I hope so ...
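For completeness, here is a sketch of the population-prediction-interval calculation mentioned at the top, combining the mvrnorm() draw from the question's EDIT with sommers_inverse() as defined above. The tryCatch() wrapper and the na.rm handling are my additions, since uniroot() can fail for some parameter draws.
library(MASS)   # for mvrnorm()
set.seed(101)
# draw parameter sets from the approximate sampling distribution of the fit
vmat <- mvrnorm(1000, mu = coef(mfit), Sigma = vcov(mfit))
inv_dist <- rep(NA_real_, nrow(vmat))
for (i in seq_len(nrow(vmat))) {
  inv_dist[i] <- tryCatch(sommers_inverse(9.938, vmat[i, ], 5.5, 6.2),
                          error = function(e) NA_real_)
}
quantile(inv_dist, c(0.025, 0.975), na.rm = TRUE)   # approximate 95% interval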

Wrong Hessian from optim in R

I am doing some extreme value analysis. I don't want to use the fevd package, for a variety of reasons (the first being that I want to be able to tweak some things that I cannot do otherwise). I wrote my own code. It is mostly very simple, and I thought I had solved everything. But for some parameter combinations, the Hessian coming out of my log-likelihood analysis (based on optim) is not correct.
Going over it one step at a time, my code (or a selected part of it) looks like this:
# routines for non-stationary analysis
Log_lik_GEV <- function(dataIN, scaleIN, shapeIN, locationIN){
  # simply calculate the negative log-likelihood value for a set of X and parameters, for the GEV
  # xi, mu, sigma - xi is the shape parameter, mu the location parameter, and sigma the scale parameter
  # shape    = xi
  # location = mu
  # scale    = beta
  library(fExtremes)
  # dgev: density of the GEV distribution, dgev(x, xi = 1, mu = 0, beta = 1)
  LLvalues <- dgev(dataIN, xi = shapeIN, mu = locationIN, beta = scaleIN)
  NLL <- -sum(log(LLvalues[is.finite(LLvalues)]))
  return(NLL)
}
function_MLE <- function(par, dataIN){
  scoreLL <- 0
  shape_param    <- par[1]
  scale_param    <- par[2]
  location_param <- par[3]
  scoreLL <- Log_lik_GEV(dataIN, scale_param, shape_param, location_param)
  if (abs(shape_param) > 0.3) scoreLL <- scoreLL * 10000000
  if (scale_param <= 0) {
    scale_param <- abs(scale_param)
    par[2] <- abs(scale_param)
    scoreLL <- scoreLL * 1000000000
  }
  sum(scoreLL)
}
kernel_estimation <- function(dati_AM, shape_o, scale_o, location_o) {
  paramOUT <- optim(par = c(shape_o, scale_o, location_o), fn = function_MLE, dataIN = dati_AM,
                    control = list(maxit = 3000, reltol = 0.00000001), hessian = TRUE)
  # calculation of standard errors
  covmat <- solve(paramOUT$hessian)
  stde <- sqrt(diag(covmat))
  print(covmat)
  print('')
  result <- list(shape_gev = paramOUT$par[1], scale_gev = paramOUT$par[2],
                 location_gev = paramOUT$par[3], var_covar = covmat)
  return(result)
}
Everything works great, in some cases. If I run my routines and the fevd routines, I get exactly the same results. In other cases (in my specific case, when shape = -0.29, i.e. strongly negative/Weibull), my routine gives negative variances and funky Hessians. It is not always wrong, but some parameter combinations clearly do not give a valid Hessian. (Note: the parameters are still estimated correctly, meaning they are identical to the fevd results, but the covariance matrix is completely off.)
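As a quick diagnostic (my addition, not part of the original post), the symptom can be confirmed by checking whether the returned Hessian is positive definite before inverting it:
# paramOUT is the optim() result from kernel_estimation() above.
# All eigenvalues should be clearly positive; negative or near-zero
# eigenvalues explain the negative "variances" coming out of solve().
ev <- eigen(paramOUT$hessian, symmetric = TRUE, only.values = TRUE)$values
print(ev)
if (any(ev <= 0)) warning("Hessian is not positive definite at the optimum")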
I found this post that compared the Hessian from two procedures, and indeed optim seems to be flaky. However, if I simply substitute maxLik into my routine, it just doesn't converge at all (even in those cases where the convergence was happening before).
paramOUT <- maxLik(function_MLE, start = c(shape_o, scale_o, location_o),
                   dataIN = dati_AM, method = 'NR')
I tried to give different initial values - even the correct ones - but it just doesn't converge.
I am not supplying data because I think that the optim routine is used correctly in my example; the numerical results are simply not stable for some parameter combinations. My questions are:
1) Am I missing something in the way I use maxLik?
2) Are there other optimization routines, besides maxLik, from which I can extract the Hessian?
Thanks!
