Why can't the mle function run with lower and upper bounds? - r

I tried to use mle to estimate the parameters for the negative binomial distribution. Here is my code.
library(stats4)
library(bbmle)
library(MASS)
b=rnbinom(n=1000, size=3, prob=0.1)
LL2 <- function(size, prob) {
  R <- dnbinom(b, size, prob, log = TRUE)
  -sum(R)
}
When I called the mle function with lower and upper bounds, I got
stats4::mle(LL2, start = list(size = 3, prob = 0.1), lower = c(-Inf, -Inf), upper = c(Inf, Inf))
Error in optim(start, f, method = method, hessian = TRUE, lower = lower, :
L-BFGS-B needs finite values of 'fn'
When I removed the bounds
stats4::mle(LL2, start = list(size = 3, prob = 0.1))
Call:
stats4::mle(minuslogl = LL2, start = list(size = 3, prob = 0.1))
Coefficients:
size prob
3.0467857 0.1037522
However, if I change the bounds to finite values, the error is still there.
I was wondering why this happens. Is it because the L-BFGS-B method cannot handle bound constraints?
Any comments will be appreciated.

I ran your setup code with set.seed(101).
Create an instrumented version of the negative log-likelihood function so we can see where the optimizer is going:
LL2 <- function(size, prob) {
  R <- dnbinom(b, size, prob, log = TRUE)
  res <- -sum(R)
  cat(size, prob, res, "\n")
  res
}
stats4::mle(LL2, start = list(size = 3, prob = 0.1), lower = c(-Inf, -Inf), upper = c(Inf, Inf))
## 3 0.1 4085.146
## 3.001 0.1 4085.166
## 2.999 0.1 4085.127
## 3 0.101 4084.767
## 3 0.099 4085.858
## 2.964666 1.099376 NaN
Error in optim(start, f, method = method, hessian = TRUE, lower = lower, :
L-BFGS-B needs finite values of 'fn'
In addition: Warning message:
In dnbinom(b, size, prob, log = TRUE) : NaNs produced
The first five evaluations are the initial value and the finite-difference approximation of the derivatives. The very next optimization step takes us to prob = 1.099, which gives a NaN result (we need 0 < prob < 1). L-BFGS-B is much more finicky than the other optimizers about non-finite values; most of the others treat non-finite results as "bad" and try something more sensible.
You could set the lower bound to 0 for size and bounds (0, 1) for prob ... (I tried it and it seems to work). You do have to be a little careful with L-BFGS-B: it doesn't always respect the bounds when calculating the finite-difference approximation, so if values <= 0 give non-finite results, you may need to set the lower bound slightly above 0 (e.g. 0.002, since the default finite-difference epsilon is 0.001).
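For example, a bounded fit along those lines might look like this (a sketch; the exact bound values are a judgment call, chosen so the finite-difference steps stay inside the valid region):
stats4::mle(LL2, start = list(size = 3, prob = 0.1),
            method = "L-BFGS-B",
            lower = c(0.002, 0.002),   # keep size and prob strictly positive
            upper = c(Inf, 0.998))     # keep prob + ndeps safely below 1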

Related

Integrate function returning roundoff error after working previously

When using integrate to integrate a lognormal density function from 2000 to Inf, I get an error. I had used a very similar expression previously with no problems.
I have tried disabling stop on error and setting rel.tol lower. I am fairly new to and unfamiliar with R, so I apologize if neither of those was expected to do anything.
> integrand = function(x) {(x-2000)*(1/x)*(1/(.99066*((2*pi)^.5)))*exp(-((log(x)-7.641)^2)/((2*(.99066)^2)))}
> integrate(integrand,lower=2000,upper=Inf)
1854.002 with absolute error < 0.018
#returns value fine
> integrand = function(x) {(x-2000)*(1/x)*(1/(1.6247*((2*pi)^.5)))*exp(-((log(x)-9.0167)^2)/((2*(1.6247)^2)))}
> integrate(integrand,lower=2000,upper=Inf)
Error in integrate(integrand, lower = 2000, upper = Inf) :
roundoff error is detected in the extrapolation table
#small change in the mu and sigma in the lognormal density function results in roundoff error
> integrate(integrand,lower=1293,upper=Inf)
29005.08 with absolute error < 2
#integrating on lower bound works fine, but having lower=1294 returns a roundoff error again
> integrate(integrand,lower=1294,upper=Inf)
Error in integrate(integrand, lower = 1294, upper = Inf) :
roundoff error is detected in the extrapolation table
I should be getting a value back, no? I struggle to see how very slightly altering the values causes the function to no longer integrate.
First of all, I believe you are overcomplicating things when you define the integrand by writing out the entire expression; it is better to use the built-in dlnorm function.
g <- function(x, deduce, meanlog, sdlog) {
  (x - deduce) * dlnorm(x, meanlog = meanlog, sdlog = sdlog)
}
curve(g(x, deduce = 2000, meanlog = 9.0167, sdlog = 1.6247),
      from = 1294, to = 1e4)
As for the integration problem, the cubature package generally does a better job when integrate fails. All of the following produce results with no errors.
library(cubature)
cubintegrate(g, lower = 1293, upper = Inf, method = "pcubature",
             deduce = 2000, meanlog = 9.0167, sdlog = 1.6247)
cubintegrate(g, lower = 1294, upper = Inf, method = "pcubature",
             deduce = 2000, meanlog = 9.0167, sdlog = 1.6247)
cubintegrate(g, lower = 2000, upper = Inf, method = "pcubature",
             deduce = 2000, meanlog = 9.0167, sdlog = 1.6247)
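Another option worth sketching (a standard trick, not from the original answer): substitute u = log(x), which turns the lognormal density into a normal one and gives plain integrate() a much better-behaved integrand:
g_log <- function(u, deduce, meanlog, sdlog) {
  # with x = exp(u), (x - deduce)*dlnorm(x) dx becomes (exp(u) - deduce)*dnorm(u) du
  (exp(u) - deduce) * dnorm(u, mean = meanlog, sd = sdlog)
}
integrate(g_log, lower = log(2000), upper = Inf,
          deduce = 2000, meanlog = 9.0167, sdlog = 1.6247)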

R: error in mle initial value in 'vmmin' is not finite

library(mvtnorm)  # provides dmvnorm

nll <- function(lambda, kappa){
  logit <- function(x) log(x/(1 - x))
  a <- c(1 - exp(-(15/lambda)^kappa), 1 - exp(-(25/lambda)^kappa), 1 - exp(-(35/lambda)^kappa))
  a <- logit(a)
  mu <- c(0.1, 0.2, 0.3)
  mu <- logit(mu)
  cov <- matrix(c(0.18830690, 0.00235681, 0.00071954,
                  0.00235681, 0.00736811, 0.00110457,
                  0.00071954, 0.00110457, 0.00423955), nrow = 3)
  L1 <- dmvnorm(a, mu, cov)
  a <- c(1 - exp(-(25/lambda)^kappa), 1 - exp(-(35/lambda)^kappa), 1 - exp(-(45/lambda)^kappa))
  a <- logit(a)
  mu <- c(0.4, 0.1, 0.9)
  mu <- logit(mu)
  cov <- matrix(c(2.7595442, 0.0045178, 0.0010505,
                  0.0045178, 0.00972309, 0.0015120,
                  0.0010505, 0.0015120, 0.0088425), nrow = 3)
  L2 <- dmvnorm(a, mu, cov)
  -sum(log(L1*L2))
}
> mle(nll, start = list(lambda = 1, kappa = 1))
Error in optim(start, f, method = method, hessian = TRUE, ...) :
initial value in 'vmmin' is not finite
I'm trying to find the lambda and kappa values that maximize the function above.
My original likelihood function returns L1*L2, but because the mle function requires the negative log-likelihood to be passed in, I modified the function to return -sum(log(L1*L2)) instead.
However, I ran into the above error. I've also tried specifying dmvnorm(..., log = TRUE), but that didn't solve the problem.
L1 and L2 are both scalars. Assuming we're going to pass log=TRUE to dmvnorm so they are each log-likelihoods, do you mean just -(L1+L2) in the final line?
By calling debug(nll) and then nll(lambda = 1, kappa = 1), stepping through the code until we hit a non-finite value, and then backtracking, we see that 1-exp(-(45/lambda)^kappa) is exactly 1: exp(-45) is around 3e-20, far below roughly 1e-16, the smallest x for which 1-x is distinguishable from 1 in double precision, so the final element of logit(a) is infinite ...
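A quick console check of those claims:
exp(-45)            ## ~2.9e-20, far below .Machine$double.eps (~2.2e-16)
1 - exp(-(45/1)^1)  ## exactly 1 in double precision
log(1/(1 - 1))      ## logit(1): 1/0 is Inf, so the logit is Inf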
So if I make dmvnorm(..., log = TRUE) in both places, change the last line to return(-(L1 + L2)), and change the initial value of lambda to 10, I get a finite value for nll(10, 1) (4474), and stats4::mle(nll, start = list(lambda = 10, kappa = 1)) gives:
Coefficients:
lambda kappa
40.622673 4.883857
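Putting those changes together, the corrected function might look like this (a sketch; it assumes the mvtnorm package for dmvnorm and stats4 for mle):
library(mvtnorm)
library(stats4)
nll <- function(lambda, kappa) {
  logit <- function(x) log(x/(1 - x))
  # first block: probabilities at t = 15, 25, 35 on the logit scale
  a1 <- logit(1 - exp(-(c(15, 25, 35)/lambda)^kappa))
  mu1 <- logit(c(0.1, 0.2, 0.3))
  cov1 <- matrix(c(0.18830690, 0.00235681, 0.00071954,
                   0.00235681, 0.00736811, 0.00110457,
                   0.00071954, 0.00110457, 0.00423955), nrow = 3)
  L1 <- dmvnorm(a1, mu1, cov1, log = TRUE)
  # second block: probabilities at t = 25, 35, 45
  a2 <- logit(1 - exp(-(c(25, 35, 45)/lambda)^kappa))
  mu2 <- logit(c(0.4, 0.1, 0.9))
  cov2 <- matrix(c(2.7595442, 0.0045178, 0.0010505,
                   0.0045178, 0.00972309, 0.0015120,
                   0.0010505, 0.0015120, 0.0088425), nrow = 3)
  L2 <- dmvnorm(a2, mu2, cov2, log = TRUE)
  -(L1 + L2)  # negative joint log-likelihood
}
stats4::mle(nll, start = list(lambda = 10, kappa = 1))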

R optim() {fExtremes} returns an all-zero Hessian matrix

I am using the R package fExtremes to find the best parameters of a GEV distribution for my data (a vector), but I get the following error message:
Error in solve.default(fit$hessian) : Lapack routine dgesv: system is exactly singular: U[1,1] = 0
I traced this back to fit$hessian and found that my Hessian matrix is singular: all of its elements are 0. The source code of gevFit() (https://github.com/cran/fExtremes/blob/master/R/GevFit.R) shows that fit$hessian is calculated by optim(). The output parameters are exactly the same as the initial parameters. I am wondering what about my data could cause this problem. I copied my code here:
> min(sample);
[1] 5.240909
> max(sample)
[1] 175.8677
> length(sample)
[1] 6789
> mean(sample)
[1] 78.04107
> para <- gevFit(sample, type = "mle")
Error in solve.default(fit$hessian) :
Lapack routine dgesv: system is exactly singular: U[1,1] = 0
fit = optim(theta, .gumLLH, hessian = TRUE, ..., tmp = data)
> fit
$par
        xi         mu       beta
-0.3129225 72.5542497 16.4450897

$value
[1] 1e+06

$counts
function gradient
       4       NA

$convergence
[1] 0

$message
NULL

$hessian
     xi mu beta
xi    0  0    0
mu    0  0    0
beta  0  0    0
I updated my dataset on google docs:
https://docs.google.com/spreadsheets/d/1IRRpjmdrrJPhNmfiLism_P0efV_Ot4HlEsa6kwMnljc/edit?usp=sharing
This is going to be a long story, possibly more suited to https://stats.stackexchange.com/.
====== Part 1 -- The problem ======
This is the sequence generating the error:
library(fExtremes)
samp <- read.csv("optimdata.csv")[ ,2]
## does not converge
para <- gevFit(samp, type = "mle")
We are facing the typical cause of non-convergence when using optim() and friends: inadequate starting values for the optimisation.
To see what goes wrong, let us use the PWM estimator (http://arxiv.org/abs/1310.3222); this is an analytical formula, so it does not run into convergence problems, since it makes no use of optim():
para <- gevFit(samp, type = "pwm")
fitpwm<- attr(para, "fit")
fitpwm$par.ests
The estimated tail parameter xi is negative, corresponding to a bounded upper tail; in fact the fitted distribution displays even more "upper tail boundedness" than the sample data, as you can see from the "leveling off" of the quantile-quantile graph at the right:
qqgevplot <- function(samp, params){
  probs <- seq(0.1, 0.99, by = 0.01)
  qqempir <- quantile(samp, probs)
  qqtheor <- qgev(probs, xi = params["xi"], mu = params["mu"], beta = params["beta"])
  rang <- range(qqempir, qqtheor)
  plot(qqempir, qqtheor, xlim = rang, ylim = rang,
       xlab = "empirical", ylab = "theoretical",
       main = "Quantile-quantile plot")
  abline(a = 0, b = 1, col = 2)
}
qqgevplot(samp, fitpwm$par.ests)
For xi < -0.5 the MLE is not regular (http://arxiv.org/abs/1301.5611), and the value of -0.46 estimated by PWM for xi is very close to that threshold. Now, the PWM estimates are used internally by gevFit() as starting values for optim(); you can see this if you print out the code for the function gevFit():
print(gevFit)
print(.gevFit)
print(.gevmleFit)
The starting value for optim is theta, obtained by PWM. For the specific data at hand, this starting value is not adequate, in that it leads to non-convergence of optim().
====== Part 2 -- solutions? ======
Solution 1 is to use para <- gevFit(samp, type = "pwm") as above. If you'd like to use ML, then you have to specify good starting values for optim(). Unfortunately, the fExtremes package does not make it easy to do so. You can instead redefine your own version of .gevmleFit to include them, e.g.
.gevmleFit <- function (data, block = NA, start.param, ...)
{
  data = as.numeric(data)
  n = length(data)
  if (missing(start.param)) {
    theta = .gevpwmFit(data)$par.ests
  } else {
    theta = start.param
  }
  fit = optim(theta, .gevLLH, hessian = TRUE, ..., tmp = data)
  if (fit$convergence)
    warning("optimization may not have succeeded")
  par.ests = fit$par
  varcov = solve(fit$hessian)
  par.ses = sqrt(diag(varcov))
  ans = list(n = n, data = data, par.ests = par.ests, par.ses = par.ses,
             varcov = varcov, converged = fit$convergence, nllh.final = fit$value)
  class(ans) = "gev"
  ans
}
## diverges, just as above
.gevmleFit(samp)
## diverges, just as above
startp <- fitpwm$par.ests
.gevmleFit(samp, start.param=startp)
## converges
startp <- structure(c(-0.1, 1, 1), names=names(fitpwm$par.ests))
.gevmleFit(samp, start.param=startp)$par.ests
Now check this out: the beta estimated by PWM is 0.1245; by changing it by a tiny amount, the MLE is made to converge:
startp <- fitpwm$par.ests
startp["beta"]
startp["beta"] <- 0.13
.gevmleFit(samp, start.param=startp)$par.ests
This hopefully illustrates that blindly optim()ising works until it doesn't, and can then turn into quite a delicate endeavour. For this reason, it might be useful to leave this reply here rather than migrate it to CrossValidated.

Fitting truncnorm using fitdistrplus

I am trying to fit a truncated normal distribution to some data. However, I have been running into the following error:
<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, gr = gradient, ddistnam = ddistname, hessian = TRUE, method = meth, lower = lower, upper = upper, ...): non-finite finite-difference value [1]>
Error in fitdist(testData, "truncnorm", start = list(a = 0, mean = 0.8, :
the function mle failed to estimate the parameters,
with the error code 100
I'm not sure what's going wrong. I've read that in some cases there can be problems fitting if the initial guesses are wrong or higher than the actual values, but I've tried a number of different start values and none seem to work.
Here is a small sample of my data, and the code I used to get the error:
library(fitdistrplus)
library(truncnorm)
testData <- c(3.2725167726, 0.1501345235, 1.5784128343, 1.218953218, 1.1895520932,
2.659871271, 2.8200152609, 0.0497193249, 0.0430677458, 1.6035277181,
0.2003910167, 0.4982836845, 0.9867184303, 3.4082793339, 1.6083770189,
2.9140912221, 0.6486576911, 0.335227878, 0.5088426851, 2.0395797721,
1.5216239237, 2.6116576364, 0.1081283479, 0.4791143698, 0.6388625172,
0.261194346, 0.2300098384, 0.6421213993, 0.2671907741, 0.1388568942,
0.479645736, 0.0726750815, 0.2058983462, 1.0936704833, 0.2874115077,
0.1151566887, 0.0129750118, 0.152288794, 0.1508512023, 0.176000366,
0.2499423442, 0.8463027325, 0.0456045486, 0.7689214668, 0.9332181529,
0.0290242892, 0.0441181842, 0.0759601229, 0.0767983979, 0.1348839304
)
fitdist(testData, "truncnorm", start = list(a = 0, mean = 0.8, sd = 0.9))
The problem is that the ML estimator produces increasingly negative estimates for the parameter mean as the lower bound a tends to zero (note that a must not be specified within start but within fix.arg):
fitdist(testData, "truncnorm", fix.arg=list(a=-.5),
start = list(mean = mean(testData), sd = sd(testData)))
fitdist(testData, "truncnorm", fix.arg=list(a=-.2),
start = list(mean = mean(testData), sd = sd(testData)))
fitdist(testData, "truncnorm", fix.arg=list(a=-.15),
start = list(mean = mean(testData), sd = sd(testData)))
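To see that drift directly, one can loop over the fixed lower bound and print the fitted parameters (a sketch using the same calls as above):
for (a in c(-0.5, -0.2, -0.15)) {
  fit <- fitdist(testData, "truncnorm", fix.arg = list(a = a),
                 start = list(mean = mean(testData), sd = sd(testData)))
  cat("a =", a, "-> mean =", fit$estimate["mean"],
      ", sd =", fit$estimate["sd"], "\n")
}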
One possibility to prevent large negative values for mean is to use a lower bound for the optimisation:
fitdist(testData, "truncnorm", fix.arg = list(a = 0),
        start = list(mean = mean(testData), sd = sd(testData)),
        optim.method = "L-BFGS-B", lower = c(0, 0))
However, this alters the estimation procedure; in fact you are imposing additional constraints on the parameters and might obtain different answers with different lower bounds.
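For instance (a sketch along the same lines), raising the lower bound on mean may give a different answer again:
fitdist(testData, "truncnorm", fix.arg = list(a = 0),
        start = list(mean = mean(testData), sd = sd(testData)),
        optim.method = "L-BFGS-B", lower = c(0.1, 0))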

optim in R: non-finite finite-difference error

I have a simple likelihood function (from a normal distribution with mean = 0) that I want to maximize. optim keeps giving me this error:
Error in optim(par = phi, fn = loglike, estimates = estimates, NULL, hessian = TRUE, : non-finite finite-difference value [1]
Here is my data and likelihood function:
y = [ -0.01472 0.03942 0.03592 0.02776 -0.00090 ]
C = a varcov matrix:
1.66e-03 -0.000120 -6.780e-06 0.000102 -4.000e-05
-1.20e-04 0.001387 7.900e-05 -0.000140 -8.000e-05
-6.78e-06 0.000079 1.416e-03 -0.000070 8.761e-06
1.02e-04 -0.000140 -7.000e-05 0.001339 -6.000e-05
-4.00e-05 -0.000080 8.761e-06 -0.000060 1.291e-03
My log-likelihood function is
lglkl = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
where v = phi*I + C, vi = solve(v), and I is the 5x5 identity matrix.
I am trying to get the mle estimate for "phi". I thought this would be a simple optimization problem but am struggling. Would really appreciate any help. Thanks in advance. My code is below:
loglike <- function(phi, y) {
  v = phi*I + C
  vi = solve(v)
  loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
  return(-loglike)
}
phi = 0
parm <- optim(par = phi, fn = loglike, y = y, NULL, hessian = TRUE,
              method = "L-BFGS-B", lower = 0, upper = 1000)
The error you ran into occurs because phi becomes negative beyond a certain number of iterations (which indicates that the constraints are not being applied correctly by the algorithm). Also, the solution does not converge to a single value but jumps between a few small values before reaching a situation where the updated covariance matrix is no longer positive definite. At that stage det(v) < 0, log(det(v)) is undefined, and the optim algorithm bails out.
To see what's happening, play with the maxit and ndeps parameters in the code below.
require("matrixcalc")
#-------------------------------------------------
# Log-likelihood function
#-------------------------------------------------
loglike <- function(phi, y) {
  # Shift the covariance matrix
  print(paste("phi = ", phi))
  # v = phi*I + (1 - phi)*C
  v = phi*I + C
  stopifnot(is.positive.definite(v))
  # Invert the shifted matrix
  vi = solve(v)
  # Compute the log likelihood
  loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
  print(paste("L = ", loglike))
  return(-loglike)
}
#-------------------------------------------------
# Data
#-------------------------------------------------
y = c(-0.01472, 0.03942, 0.03592, 0.02776, -9e-04)
C = structure(c(0.00166, -0.00012, -6.78e-06, 0.000102, -4e-05, -0.00012,
0.001387, 7.9e-05, -0.00014, -8e-05, -6.78e-06, 7.9e-05,
0.001416, -7e-05, 8.761e-06, 0.000102, -0.00014, -7e-05,
0.001339, -6e-05, -4e-05, -8e-05, 8.761e-06, -6e-05, 0.001291),
.Dim = c(5L, 5L ))
#--------
# Initial parameter
#--------
I = diag(5)
phi = 50
#--------
# Minimize
#--------
parm <- optim(par = phi, fn = loglike, y = y, NULL, hessian = TRUE,
method = "L-BFGS-B", lower = 0.0001, upper = 1000,
control = list(trace = 3,
maxit = 1000,
ndeps = 1e-4) )
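An alternative worth sketching (not part of the original answer): optimise over log(phi), so positivity is enforced by the parameterisation itself and no box constraints are needed:
# Reuses loglike(), y, C and I from above; loglike_log is a hypothetical wrapper name
loglike_log <- function(logphi, y) loglike(exp(logphi), y)
parm <- optim(par = log(50), fn = loglike_log, y = y,
              hessian = TRUE, method = "BFGS")
exp(parm$par)  # back-transform to recover phi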
