R: Convergence problems with numerical integration

Not sure if this numerical-methods problem should really be here or on Cross Validated, but since I have a nice reproducible example I thought I would start here.
I am going to be estimating and fitting a bunch of distributions, both to some large data sets and to data sets generated randomly from similar distributions. As part of this process I will be generating estimates of the conditional mean over various value ranges, including truncated and non-truncated ranges of the right tail.
The function cr_moment below, given a pdf dfun and parameters for that function in params, calculates the unconditional mean of that distribution. Given an upper bound, a lower bound, or both, it calculates the conditional mean over the range specified by those bounds, i.e. the mean of the correspondingly singly- or doubly-truncated distribution. The function beneath it, cr_gb2, specializes cr_moment to the generalized beta distribution of the second kind (GB2). Finally, the parameter values supplied beneath that approximate the unadjusted current-dollar household income distribution from the US Census/BLS Current Population Survey for the year 2000 (McDonald & Ransom 2008). (Also, kudos to Mikko Marttila on this list for help with coding this function.)
This function gives me a failure-to-converge error, copied below, for various lower bounds and any upper bound of 4.55e8 or higher, but not at 4.54e8. The kth moment of the GB2 exists for k < shape1 * shape3, here about 2.51. This is a nice smooth unimodal function being integrated over a finite interval, and I don't know why it is failing to converge or what to do about it. For other parameter values, but not this one, I have also seen convergence problems at the low end for lower bounds ranging from 6 to a couple of hundred.
Error in integrate(f = prob_interval, lower = lb, upper = ub, subdivisions = 100L):
the integral is probably divergent
455 million will be above the highest observable income level by one or two orders of magnitude, but given a wider range of parameter values, and using hill-climbing algorithms to fit real and simulated data, I think I will hit this wall many times. I know very little about numerical methods in a case like this and don't really know where to start. Help and suggestions greatly appreciated.
cr_moment <- function(lb = -Inf, ub = Inf, dfun, params, v = 1, ...){
  # integrand for the v-th raw moment: x^v * f(x)
  x_pdf <- function(X){
    X^v * do.call(what = dfun, args = c(list(x = X), params))
  }
  # plain density, used for the probability mass of the interval
  prob_interval <- function(X){
    do.call(what = dfun, args = c(list(x = X), params))
  }
  # any extra arguments (e.g. rel.tol, subdivisions) are forwarded to integrate()
  integral_val  <- integrate(f = x_pdf, lower = lb, upper = ub, ...)
  integral_prob <- integrate(f = prob_interval, lower = lb, upper = ub, ...)
  crm <- integral_val[[1]] / integral_prob[[1]]
  out <- list(value = integral_val[[1]], probability = integral_prob[[1]],
              cond_moment = crm)
  out
}
library(GB2)
cr_gb2 <- function(lb = -Inf, ub = Inf, v = 1, params, ...){
  cr_moment(lb, ub, dfun = dgb2, params = params, v = v, ...)
}
GB2_params <- list(shape1 = 2.2474, scale = 58441.5, shape2 = 0.6186, shape3 = 1.118)
cr_gb2(lb=1, ub= 4.55e8, params = GB2_params)
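For what it's worth, integrate() accepts control arguments such as rel.tol and subdivisions, and cr_moment()/cr_gb2() as written above forward their ... to integrate(). The lines below are only a sketch of things to experiment with (looser tolerance, more subdivisions, or splitting the range at an arbitrary, untuned cut point); none of them is guaranteed to cure the "probably divergent" error for these parameters.
# Sketch: pass integrate() control arguments through ...
cr_gb2(lb = 1, ub = 4.55e8, params = GB2_params,
       subdivisions = 1000L, rel.tol = 1e-6)
# Sketch: split the range past the bulk of the mass and recombine the pieces;
# the cut point 1e6 is an arbitrary assumption, not a tuned value
split_point <- 1e6
lower_piece <- cr_gb2(lb = 1, ub = split_point, params = GB2_params)
upper_piece <- cr_gb2(lb = split_point, ub = 4.55e8, params = GB2_params)
cond_mean <- (lower_piece$value + upper_piece$value) /
  (lower_piece$probability + upper_piece$probability)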

Related

Estimating PDF with monotonically declining density at tails

tl;dr: I am numerically estimating a PDF from simulated data and I need the density to monotonically decrease outside of the 'main' density region (as x -> infinity). What I have yields a density close to zero, but one that does not monotonically decrease.
Detailed Problem
I am estimating a simulated maximum likelihood model, which requires me to numerically evaluate the probability density function of some random variable (whose density cannot be derived analytically) at some (observed) value x. The goal is to maximize the log-likelihood of these densities, which requires them not to have spurious local maxima.
Since I do not have an analytic likelihood function, I numerically simulate the random variable by drawing the random component from some known distribution function and applying some non-linear transformation to it. I save the results of this simulation in a vector named Simulated_stats_.
I then use density() to approximate the PDF and approxfun() to evaluate the PDF at x:
#some example simulation
Simulated_stats_ <- runif(n = 500, 10, 15) + rnorm(n = 500, mean = 15, sd = 3)
#approximation for x
approxfun(density(Simulated_stats_))(x)
This works well within the range of Simulated_stats_ (see image: example PDF). The problem is that I need to be able to evaluate the PDF far outside the range of the simulated data.
So in the image above, I would need to evaluate the PDF at, say, x=50:
approxfun(density(Simulated_stats_))(50)
> [1] NA
So instead I use the from and to arguments of the density function, which correctly give near-zero tails, like so:
approxfun(
  density(Simulated_stats_, from = 0, to = max(Simulated_stats_) * 10)
)(50)
[1] 1.924343e-18
Which is great, under one condition: the density has to keep going toward zero the further out from the data range x is. That is, if I evaluate at x = 51, the result must be strictly smaller. (Otherwise my estimator may find local maxima far from the 'true' region, since the likelihood function is not monotonic very far from the 'main' density mass, i.e. in the extrapolated region.)
To test this I evaluated the approximated PDF at fixed intervals, took logs, and plotted. The result is discouraging: far from the main density mass the probability 'jumps' up and down. Always very close to zero, but NOT monotonically decreasing.
a <- sapply(X = seq(from = 0, to = 100, by = 0.5), FUN = function(x){
  approxfun(
    density(Simulated_stats_, from = 0, to = max(Simulated_stats_) * 10)
  )(x)
})
aa <- cbind(seq(from = 0, to = 100, by = 0.5), a)
plot(aa[, 1], log(aa[, 2]))
Result:
Non-monotonic log density far from density mass
My question
Does this happen because of the kernel estimation in density() or is it inaccuracies in approxfun()? (or something else?)
What alternative methods can I use that will deliver a monotonically declining PDF far from the simulated density mass?
Or - how can I manually change the approximated PDF to monotonically decline the further I am from the density mass? I would happily stick some linear trend that goes to zero...
Thanks!
One possibility is to estimate the CDF using a beta regression model; a numerical estimate of the derivative of this model can then be used to estimate the PDF at any point. Here's an example of what I was thinking. I'm not sure if it helps you at all.
Import libraries
library(mgcv)
library(data.table)
library(ggplot2)
Generate your data
set.seed(123)
Simulated_stats_ <- runif(n=5000, 10,15)+ rnorm(n=500,mean = 15,sd = 3)
Function to estimate CDF using gam beta regression model
get_mod <- function(ss, p = seq(0.02, 0.98, 0.02)) {
  qp <- quantile(ss, probs = p)
  betamod <- mgcv::gam(p ~ s(qp, bs = "cs"), family = mgcv::betar())
  return(betamod)
}
betamod <- get_mod(Simulated_stats_)
Very basic estimate of PDF at val given model that estimates CDF
est_pdf <- function(val, betamod, tol = 0.001) {
  xvals <- c(val, val + tol)
  yvals <- predict(betamod, newdata = data.frame(qp = xvals), type = "response")
  as.numeric((yvals[1] - yvals[2]) / (xvals[1] - xvals[2]))
}
Let's check whether the estimated PDF is monotonically increasing below the minimum of Simulated_stats_
test_x = seq(0,min(Simulated_stats_), length.out=1000)
pdf = sapply(test_x, est_pdf, betamod=betamod)
all(pdf == cummax(pdf))
[1] TRUE
Let's check whether it is monotonically decreasing above the maximum of Simulated_stats_
test_x = seq(max(Simulated_stats_), 60, length.out=1000)
pdf = sapply(test_x, est_pdf, betamod=betamod)
all(pdf == cummin(pdf))
[1] TRUE
Additional thoughts 3/5/22
As discussed in comments, using the betamod to predict might slow down the estimator. While this could be resolved to a great extent by writing your own predict function directly, there is another possible shortcut.
Generate estimates from the betamod over the range of X, including the extremes
k <- sapply(seq(0,max(Simulated_stats_)*10, length.out=5000), est_pdf, betamod=betamod)
Use the approach you were initially using, i.e. a linear interpolation, but rather than interpolating the density() output, interpolate k (the estimates from the beta model above):
lin_int = approxfun(x=seq(0,max(Simulated_stats_)*10, length.out=5000),y=k)
You can use the lin_int() function for prediction in the estimator, and it will be lightning fast. Note that it produces virtually the same value for a given x
c(est_pdf(38,betamod), lin_int(38))
[1] 0.001245894 0.001245968
and it is very fast
microbenchmark::microbenchmark(
  list = alist("betamod" = est_pdf(38, betamod), "lin_int" = lin_int(38)),
  times = 100
)
Unit: microseconds
    expr    min      lq     mean  median      uq    max neval
 betamod 1157.0 1170.20 1223.304 1188.25 1211.05 2799.8   100
 lin_int    1.7    2.25    3.503    4.35    4.50   10.5   100
Finally, let's check the same plot you did before, but using lin_int() instead of approxfun(density(....))
a <- sapply(X = seq(from = 0, to = 100, by = 0.5), lin_int)
aa <- cbind( seq(from = 0, to = 100, by = 0.5), a)
plot(aa[,1],log(aa[,2]))

Finding minimum by optimising a vector in R

I need to find the minimum of an objective function by optimising a vector. The problem is finance related, if that helps: the function RC (provided below) computes the sum of squared differences of the risk contributions of different assets, where a risk contribution is the product of an input risk measure (RM, given) and a weight.
The goal is to find weights such that the sum is zero, i.e. all assets have equal risk contributions.
RC = function(RM, w){
  w = w / sum(w)  # normalizing weights so they sum up to 1
  nAssets = length(RM)
  rc_matrix = matrix(nrow = 1, ncol = nAssets)
  rc_matrix = RM * w  # risk contributions: RM (risk measure) multiplied by the
                      # asset's weight in the portfolio
  rc_sum_squares = numeric(length = 1)  # placeholder
  rc_sum_squares = sum(combn(
    seq_along(RM),
    2,
    FUN = function(x)
      (rc_matrix[, x[1]] - rc_matrix[, x[2]])^2
  ))  # this sums the squared differences of the risk contributions
  return(rc_sum_squares)
}
I searched and the solution seems to lie in the "optim" function, so I tried:
out <- optim(
par = rep(1 / length(RM), length(RM)), # initial guess
fn = RC,
RM = RM,
method = "L-BFGS-B",
lower = 0.00001,
upper = 1)
However, this returns an error message: "Error in rc_matrix[, x[1]] : incorrect number of dimensions"
I don't know how the optimization algorithm works, so I can't really wrap my head around it. The RC function works though, here is a sample for replicability:
RM <- c(0.06006928, 0.06823795, 0.05716360, 0.08363529, 0.06491009, 0.06673174, 0.03103578, 0.05741140)
w <- matrix(0.125, nrow = 1, ncol = 8)  # equal weights, one per asset
I also looked at the CVXR package, which crashes my RStudio for some reason, and at nlm(), which is a little more complicated and for which I can't write the function properly.
A solution might be not to do the funky summation of the squared differences, but to find the weights so that the risk contributions (RM * weight) are equal. I will be very glad for your help.
Note: the vector of the weights has to sum up to 1 and the values have to lie between 0 and 1.
Cheers
Daniel
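One way to see where the error comes from (a sketch only; RC_vec and the call below are illustrative, not code from the thread): optim() passes the parameters as a plain numeric vector, so inside RC the line rc_matrix = RM * w produces a vector rather than a matrix, and matrix-style indexing like rc_matrix[, x[1]] then fails with "incorrect number of dimensions". Indexing by position alone avoids that:
# Sketch: same objective, but indexing the risk contributions as a vector so it
# works when optim() supplies w as a plain numeric vector
RC_vec <- function(RM, w){
  w <- w / sum(w)                      # normalize weights to sum to 1
  rc <- RM * w                         # risk contributions as a plain vector
  sum(combn(seq_along(RM), 2,
            FUN = function(x) (rc[x[1]] - rc[x[2]])^2))
}
out <- optim(
  par = rep(1 / length(RM), length(RM)),  # equal weights as initial guess
  fn = RC_vec,
  RM = RM,
  method = "L-BFGS-B",
  lower = 0.00001,
  upper = 1)
round(out$par / sum(out$par), 4)  # candidate equal-risk-contribution weights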

Using mle2 function

I would like to find the MLE for the parameters epsilon, mu1 and mu2 in the following model:
$$X \sim \epsilon\,\frac{1}{\mu_1}e^{-x/\mu_1}+(1-\epsilon)\,\frac{1}{\mu_2}e^{-x/\mu_2}$$
library(Renext)
library(bbmle)
epsilon = 0.01
mu1 = 1; mu2 = 2  # placeholder values for the component means
#the real model
X <- rmixexp2(n = 20, prob1 = epsilon, rate1 = 1/mu1, rate2 = 1/mu2)
LL <- function(mu1,mu2, eps){
R = (1-eps)*dexp(X,rate=1/mu1,log=TRUE)+eps*dexp(X,rate=1/mu2,log=TRUE)
-sum(R)
}
fit_norm <- mle2(LL, start = list(eps = 0,mu1=1, mu2 = 1), lower = c(-Inf, 0),
upper = c(Inf, Inf), method = 'L-BFGS-B')
summary(fit_norm)
But I get the error
Error: method 'L-BFGS-B' requires finite values of fn
There are a bunch of issues here. The primary one is that your likelihood expression is wrong: you can't take the log of the components separately and then add them; you have to add the components and then take the log. Your bounds are also funny: the mixture probability should be in [0, 1] and the means should be in [0, Inf].
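For concreteness, here is a sketch of what a corrected hand-written negative log-likelihood would look like (LL2 is an illustrative name, not code from the answer): mix the component densities on the probability scale first, then take logs and sum.
# Sketch: add the weighted component densities, then log
LL2 <- function(mu1, mu2, eps){
  dens <- eps * dexp(X, rate = 1/mu1) + (1 - eps) * dexp(X, rate = 1/mu2)
  -sum(log(dens))
}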
The other problem you have is that with the current simulation design (n=20, prob=0.01), you have a high probability of getting no points in the first mixture component (the probability of a point being in the second component is 1-0.01=0.99, so the probability that all of the points are in the second component is 0.99^20 = 82%). In this case the MLE will be degenerate (i.e., you're trying to fit a two-component mixture to a data set that essentially only has one component); in this case any of these solutions will give equivalent likelihoods:
prob=0, mu2=mean of the data, mu1=anything
prob=1, mu1=mean of the data, mu2=anything
mu1=mu2=mean of the data, prob=anything
With all these solutions, where you end up will depend very sensitively on starting conditions and optimization algorithm.
For this problem I would encourage you to use the built-in dmixexp2 function from the Renext package (which correctly implements the log-likelihood as log(p*Prob(X|exp1) + (1-p)*Prob(X|exp2))) and the formula interface to mle2:
fit_norm <- mle2(X ~ dmixexp2(rate1 = 1/mu1, rate2 = 1/mu2, prob1 = eps),
                 data = list(X = X),
                 start = list(mu1 = 1, mu2 = 2, eps = 0.4),
                 lower = c(mu1 = 0, mu2 = 0, eps = 0),
                 upper = c(mu1 = Inf, mu2 = Inf, eps = 1),
                 method = 'L-BFGS-B')
This gives me estimates of mu1=1.58, mu2=2.702, eps=0. mean(X) in my case equals the value of mu2, so this is the first case in the bulleted list above. You also get a warning:
some parameters are on the boundary: variance-covariance calculations based on Hessian may be unreliable
There are also a variety of more specialized algorithms for fitting mixture models (especially those based on the expectation-maximization algorithm); you can look for packages on CRAN (flexmix is one of them).
This problem is small enough that you can visualize the whole log-likelihood surface by brute force (code below): the colours represent deviations from the minimum negative log-likelihood (the colour gradient is log-scaled, so there's a small offset to avoid log(0)). Dark blue represents parameters that are the best fit to the data, yellow are the worst.
dd <- expand.grid(mu1 = seq(0.1, 4, length = 51),
                  mu2 = seq(0.1, 4, length = 51),
                  eps = seq(0, 1, length = 9),
                  nll = NA)
for (i in 1:nrow(dd)) {
  dd$nll[i] <- with(dd[i, ],
                    -sum(dmixexp2(X, rate1 = 1/mu1,
                                  rate2 = 1/mu2,
                                  prob1 = eps,
                                  log = TRUE)))
}
library(ggplot2)
ggplot(dd,aes(mu1,mu2,fill=nll-min(nll)+1e-4)) +
facet_wrap(~eps, labeller=label_both) +
geom_raster() +
scale_fill_viridis_c(trans="log10") +
scale_x_continuous(expand=c(0,0)) +
scale_y_continuous(expand=c(0,0)) +
theme(panel.spacing=grid::unit(0.1,"lines"))
ggsave("fit_norm.png", type="cairo-png")

Integral and numeric optimization (nlminb) R

I am having issues with an optimization problem involving numerical estimation of an integral which contains an unknown variable.
Numerically estimating an integral is simple enough: just use the integrate function in R. But I am trying to estimate a rather unpleasant integral that contains an unknown variable and is subject to a constraint, which requires optimization. The idea is to find p such that the value of the integral is constrained to be smaller than or equal to 1 - l, where l is between 0 and 1.
The code is the following:
integrand <- function(x, p) {
  dnorm(x, 0, 1) * (1 - dnorm((qnorm(p) - sqrt(0.12) * x) / sqrt(1 - 0.12), 0, 1))^800
}
and it is the variable p which is unknown.
The objective function to be minimised is the following:
objective <- function(p){
PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
PoD - 0.5
}
test <- nlminb(0.015, objective = objective, lower = 0, upper = 1)$par*100
Edited to reflect mistakes in the objective function and the integral. The same issue still remains.
I think my mistake is not specifying which variable to minimise. The optimisation just returns the starting value passed to nlminb, multiplied by 100.
The authors of the paper used dummy variables and showed that l = 0.5 should give p = 0.15%.
Thank you for your time.
That is expected: as originally written, your objective function did not depend on p, because p was never passed through to the integrand, so nlminb had nothing to optimise. Pass p through to integrate():
integrand <- function(x, p) {
  dnorm(x, 0, 1) * (1 - dnorm((qnorm(p) - sqrt(0.12) * x) / sqrt(1 - 0.12), 0, 1))^800
}
objective <- function(p){
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  PoD - 0.5
}
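A further hedged aside: if the goal is to solve PoD(p) = 1 - l (here 0.5), minimizing the raw residual PoD - 0.5 can simply drive p toward a bound, since nlminb makes the objective as small as possible. Squaring the residual, or treating it as a root-finding problem, is one possible way to pose it; the interval endpoints below are assumptions, and the result is not checked against the paper's p = 0.15%.
# Sketch: minimise the squared residual so the optimum sits at PoD = 0.5
objective_sq <- function(p) {
  PoD <- integrate(integrand, lower = -Inf, upper = Inf, p = p)$value
  (PoD - 0.5)^2
}
p_opt <- nlminb(0.015, objective = objective_sq, lower = 1e-6, upper = 1)$par
# Sketch: or solve PoD(p) - 0.5 = 0 directly on an interval where the residual
# changes sign (the endpoints here are an assumption)
p_root <- uniroot(function(p) {
  integrate(integrand, lower = -Inf, upper = Inf, p = p)$value - 0.5
}, interval = c(1e-6, 0.5))$root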

DEoptim does not return optimal parameters

I am trying to use DEoptim to optimize the parameters of the Heston pricing model (NMOF package). My goal is to minimize the difference between the real option price and the Heston price. However, when running my code, DEoptim does not save the best result but always displays the value obtained with the initial parameters, not the optimized ones. Unfortunately, I'm totally new to R (and any kind of programming), so I cannot seem to fix the problem.
My data, for one exemplary subset of an option, looks like this.
#Load data
#Real option price
C0116_P=as.vector(c(1328.700000, 1316.050000, 1333.050000, 1337.900000, 1344.800000))
#Strike price
C0116_K=as.vector(c(500, 500, 500, 500, 500))
#Time to maturity in years
C0116_T_t=as.vector(c(1.660274, 1.657534, 1.654795, 1.652055, 1.649315))
#Interest rate percentage
C0116_r=as.vector(c(0.080000, 0.080000, 0.090000, 0.090000, 0.090000))
#Dividend yield percentage
C0116_DY=as.vector(c(2.070000, 2.090000, 2.070000, 2.070000,2.060000))
#Price underlying
C0116_SP_500_P=as.vector(c(1885.08, 1872.83, 1888.03, 1892.49, 1900.53))
In the next step, I want to define the function I want to minimize (the difference between the real and the Heston price) and set some initial parameters. To optimize, I run a loop which, unfortunately, only returns the difference between the real option price and the Heston price under the initial parameters as the best value, and not the actual parameters that minimize the difference.
#Load packages
require(NMOF)
require(DEoptim)
#Initial parameters
v0=0.2
vT=0.2
rho=0.2
k=0.2
sigma=0.2
#Define function
error_heston <- function(x) {
  error <- P - callHestoncf(S, X, tau, r, q, v0, vT, rho, k, sigma)
  return(error)
}
#Run optimization
outDEoptim <- matrix()
for (i in 1:5) {
  # I only want the parameters v0, vT, rho, k and sigma to change; that is why I kept the others constant
  lower <- c(C0116_P[i], C0116_SP_500_P[i], C0116_K[i], C0116_T_t[i],
             C0116_r[i]/100, C0116_DY[i]/100, 0.0001, 0.0001, -1, 0.0001, 0.0001)
  upper <- c(C0116_P[i], C0116_SP_500_P[i], C0116_K[i], C0116_T_t[i],
             C0116_r[i]/100, C0116_DY[i]/100, 10, 10, 1, 10, 10)
  outDEoptim <- DEoptim(error_heston, lower, upper, DEoptim.control(VTR = 0, itermax = 100))
  print(outDEoptim$opti$bestval)
  i = i + 1
}
Any help is much appreciated!
One of the first problems is that your objective function only has one argument (the parameters to optimize), so all the other objects used inside the function must be looked up in the enclosing environment. It's better practice to pass them explicitly.
Plus, many of the necessary values aren't defined in your example (e.g. S, X, etc.). All the parameters you want to optimize will be passed to your objective function via its first argument. It can help clarify things if you explicitly assign each element inside your objective function. So a more robust objective function definition is:
# Define objective function
error_heston <- function(x, P, S, K, tau, r, q) {
  v0    <- x[1]
  vT    <- x[2]
  rho   <- x[3]
  k     <- x[4]
  sigma <- x[5]
  error <- abs(P - callHestoncf(S, K, tau, r, q, v0, vT, rho, k, sigma))
  return(error)
}
Also note that I took the absolute error. DEoptim is going to minimize the objective function, so it would try to make P - callHestoncf() as negative as possible, when you want it to be close to zero instead.
You specified the box constraints upper and lower even for the parameters that don't vary. It's best to only have DEoptim generate a population for the parameters that do vary, so I removed the non-varying parameters from the box constraints. I also defined them outside the for loop.
# Only need to set bounds for varying parameters
lower <- c(1e-4, 1e-4, -1, 1e-4, 1e-4)
upper <- c( 10, 10, 1, 10, 10)
Now to the actual DEoptim call. Here is where you will pass the values for all the non-varying parameters. You set them as named arguments to the DEoptim call, as I've done below.
i <- 1
outDEoptim <- DEoptim(error_heston, lower, upper,
DEoptim.control(VTR=0, itermax=100), P = C0116_P[i], S = C0116_SP_500_P[i],
K = C0116_K[i], tau = C0116_T_t[i], r = C0116_r[i], q = C0116_DY[i])
I only ran one iteration of the for loop, because the callHestoncf() function frequently throws an error because the numerical integration routine fails. This stops the optimization. You should look into the cause of that, and ask a new question if you have trouble.
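If you want the DE run to survive those sporadic failures rather than stop, one common pattern, sketched below under the assumption that returning a large penalty for failed evaluations is acceptable (error_heston_safe and the 1e10 penalty are illustrative, not from the original answer), is to wrap the pricing call in tryCatch():
# Sketch: return a large penalty instead of stopping when callHestoncf() errors,
# so DEoptim can keep evolving its population
error_heston_safe <- function(x, P, S, K, tau, r, q) {
  v0 <- x[1]; vT <- x[2]; rho <- x[3]; k <- x[4]; sigma <- x[5]
  tryCatch(
    abs(P - callHestoncf(S, K, tau, r, q, v0, vT, rho, k, sigma)),
    error = function(e) 1e10  # arbitrary large penalty for failed evaluations
  )
}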
I also noticed you specified one of the non-varying inputs incorrectly. Your dividend yield percentages are 100 times too large. Your non-varying inputs should be:
# Real option price
C0116_P <- c(1328.70, 1316.05, 1333.05, 1337.90, 1344.80)
# Strike price
C0116_K <- c(500, 500, 500, 500, 500)
# Time to maturity in years
C0116_T_t <- c(1.660274, 1.657534, 1.654795, 1.652055, 1.649315)
# Interest rate percentage
C0116_r <- c(0.08, 0.08, 0.09, 0.09, 0.09)
# Dividend yield percentage
C0116_DY <- c(2.07, 2.09, 2.07, 2.07, 2.06) / 100
# Price underlying
C0116_SP_500_P <- c(1885.08, 1872.83, 1888.03, 1892.49, 1900.53)
As an aside, you should take a little time to format your code better. It makes it more readable, which should help you avoid typo-like errors.
