nll <- function(lambda, kappa){
logit=function(x) {log(x/(1-x))}
a=c(1-exp(-(15/lambda)^kappa), 1-exp(-(25/lambda)^kappa), 1-exp(-(35/lambda)^kappa))
a=logit(a)
mu = c(0.1, 0.2, 0.3)
mu = logit(mu)
cov = matrix(c(0.18830690, 0.00235681, 0.00071954, 0.00235681, 0.00736811, 0.00110457, 0.00071954, 0.00110457, 0.00423955), nrow =3)
L1 = dmvnorm(a, mu, cov)
a=c(1-exp(-(25/lambda)^kappa), 1-exp(-(35/lambda)^kappa), 1-exp(-(45/lambda)^kappa))
a=logit(a)
mu = c(0.4, 0.1, 0.9)
mu = logit(mu)
cov = matrix(c(2.7595442, 0.0045178, 0.0010505, 0.0045178, 0.00972309, 0.0015120, 0.0010505, 0.0015120, 0.0088425), nrow =3)
L2 = dmvnorm(a, mu, cov)
-sum(log(L1*L2))
}
> mle(nll, start = list(lambda = 1, kappa = 1))
Error in optim(start, f, method = method, hessian = TRUE, ...) :
initial value in 'vmmin' is not finite
I'm trying to find the lambda and kappa values that maximize the function above.
My original likelihood function returns L1*L2, but because the mle function requires the negative log-likelihood to be passed in, I modified the function to return -sum(log(L1*L2)) instead.
However, I ran into the above error. And I've also tried specifying dmvnorm(... ,log = TRUE) but that didn't solve the problem.
L1 and L2 are both scalars. Assuming we're going to pass log=TRUE to dmvnorm so they are each log-likelihoods, do you mean just -(L1+L2) in the final line?
by specifying debug(nll) and nll(lambda=1,kappa=1) , then stepping through the code, waiting til we find an infinite value, and then backtracking, we see that 1-exp(-(45/lambda)^kappa) is exactly 1 (exp(-45) is less than 1e-16, the smallest value for which 1+x is > 1, so that the final element of logit(a) is infinite ...
So if I make dmvnorm(...,log=TRUE) in both places, change the last line to return(-(L1+L2)), and change the initial value of lambda to 10, I get a finite value for nll(10,1) (4474), and stats4::mle(nll,start=list(lambda=10,kappa=1)) gives:
Coefficients:
lambda kappa
40.622673 4.883857
Related
I tried to use mle to estimate the parameters for the negative binomial distribution. Here is my code.
library(stats4)
library(bbmle)
library(MASS)
b=rnbinom(n=1000, size=3, prob=0.1)
LL2 <- function(size, prob) {
R = dnbinom(b, size, prob, log = TRUE)
-sum(R)
}
When I set the mle function with lower and upper bound, I got
stats4::mle(LL2, start = list(size = 3, prob = 0.1),lower = c(-Inf,-Inf),upper = c(Inf,Inf))
Error in optim(start, f, method = method, hessian = TRUE, lower = lower, :
L-BFGS-B needs finite values of 'fn'
When I removed the bounds
stats4::mle(LL2, start = list(size = 3, prob = 0.1))
Call:
stats4::mle(minuslogl = LL2, start = list(size = 3, prob = 0.1))
Coefficients:
size prob
3.0467857 0.1037522
However, if I change the bounds to a finite value, the error is still there.
I was wondering why this happened? Is that because the L-BFGS-B method can not handle with bounds settings?
Any comments will be appreciated.
I ran your setup code with set.seed(101).
Create an instrumented version of the score function so we can see where the optimizer is going:
LL2 <- function(size, prob) {
R = dnbinom(b, size, prob, log = TRUE)
res <- -sum(R)
cat(size,prob,res,"\n")
res
}
stats4::mle(LL2, start = list(size = 3, prob = 0.1),lower = c(-Inf,-Inf),upper = c(Inf,Inf))
## 3 0.1 4085.146
## 3.001 0.1 4085.166
## 2.999 0.1 4085.127
## 3 0.101 4084.767
## 3 0.099 4085.858
## 2.964666 1.099376 NaN
Error in optim(start, f, method = method, hessian = TRUE, lower = lower, :
L-BFGS-B needs finite values of 'fn'
In addition: Warning message:
In dnbinom(b, size, prob, log = TRUE) : NaNs produced
The first 5 steps are the evaluation of initial value and of the finite difference approximation of the derivatives. The very next optimization step takes us to prob = 1.099, which gives us an NaN result (we need 0 < prob < 1). L-BFGS-B is much more finicky than the other optimizers about non-finite values; most of the others treat non-finite results as "bad" and try something sensible.
You could set the lower bound to 0 for size and bounds (0,1) for prob ... (I tried it and it seems to work). You do have to be a little bit careful with L-BFGS-B - it doesn't always respect the bounds when it is calculating the finite-difference approximation, so e.g. if values <= 0 will give non-finite results you may need to set the lower bound slightly above 0 (e.g. 0.002, since the default finite-difference epsilon is 0.001).
I do not even know if any of my equations are right, but I did this and its saying the objective function in optim evaluates to length 2 not 1. I am trying to estimate lambda and alfa using mle and log likelihood.
set.seed(1001)
q <- 1:5000
mean(q)
sd(q)
ll <- function(lambda = 0.5, alpha = 3.5) {
n <- 5000
(-n) * alpha * log(lambda) +
n * log(gamma(alpha)) - ((alpha - 1) * sum(log(q))) +
lambda * sum(q)
}
MLE <- optim(c(0.5, 3.5),
fn = ll,
method = "L-BFGS-B",
lower = 0.00001,
control = list(fnscale = -1),
hessian = T
)
There are several places to make your optim fly:
you should define your ll as a function of one argument (instead of two arguments, otherwise you will have the error "objective function in optim evaluates to length 2 not 1"), since within optim, your initial value x0 will be pass to objective function fn
ll<-function(x){
n<-5000
lambda <- x[1]
alfa <- x[2]
(-n)*alfa*log(lambda)+n*log(gamma(alfa))-((alfa-1)*sum(log(q)))+lambda*sum(q)
}
In optim, the dimension of lower bound lower should be the same as x0, e.g.,
MLE = optim(c(0.5,3.5),
fn = ll,
method = "L-BFGS-B",
lower = c(1e-5,1e-5),
control = list(fnscale = -1),
hessian = FALSE
)
Hi, I want to estimate gamma distribution parameters hand by hand! I know a lot of R functions to estimate shape and scale parameters, but it seems hard to find code about estimating location parameter.
x <- c(108,91,62,59,84,60,71,105,70,69,66,65,78,83,82,68,107,68,68,69,80,
75,89,68,64,68,70,57,62,87,51,55,56,57,75,98,60,68,81,47,76,48,63,
58,40,62,61,58,38,40,45,68,56,64,49,53,50,39,54,47,37,50,54,70,49,
57,52,47,43,52,57,46,63,56,50,51,50,42,46,56,52,59,45,50,59,44,52,
54,53,63,45,56,55,53,56,46,45,49,63,50,41,42,53,50,58,50,37,53,58,
49,53,51,64,44,53,53,55,43,50,60,51,55,56,52,51,45,49,51,63,48,51,
60,45,40,50,66,62,69,53,54,49,47,63,55,62,57,58,51,50,57,62,45,47,
52,35,41,53,48,59,45,41,52,36,84,62,31,41,48,47,50,50,57,53,37,46,
41,56,51,39,59,53,51,49,45,42,32,55,34,43,35,48,33,41,38,57,37,40,
34,44,43,62,36,41,51,48,31,28,33,35,48,31)
# estimate shape and scale parameter
gamma_likelihood <- function(para){
sum ( (para[2] -1)*log(x) - para[2]*log(para[1]) - log(gamma(para[2])) - x/para[1] + 1/para[1])
}
MLE = optim(c(10,10),
fn = gamma_likelihood,
method = "L-BFGS-B",
lower = 0.00001,
control = list(fnscale = -1),
hessian = T
)
MLE$par
# estimate location, shape and scale parameter
gamma_likelihood <- function(para){
x = x[x > para[1]]
sum ( (para[3] -1)*log(x - para[1]) - para[3]*log(para[2]) -
log(gamma(para[3])) - x/para[2] + para[1]/para[2] )
}
MLE = optim(c(23,6,7),
fn = gamma_likelihood,
method = 'L-BFGS-B',
lower = 0.00000001,
control = list(fnscale = -1)
)
MLE$par
This is my code, I can estimate shape and scale parameters.
However, when it comes to add location parameters into log likelihood. The result seems incorrect.The TRUE parameters are c(21.4, 5.47, 6.0).
If you have any observed value less or equal than your location parameter, your whole likelihood for that value of lambda must be 0 (remember it's a function of parameters, not observations).
x = x[x > para[1]] is cutting observations that don't make sense for a specific location parameter, making your function return a valid number, when it should return -Inf if any of the x is "invalid", since you'd have 0 likelihood.
Here's a corrected version of your log-likelihood function:
# estimate location, shape and scale parameter
gamma_likelihood <- function(para){
if(min(x) < para[1]) return(-Inf)
sum ( (para[3] -1)*log(x - para[1]) - para[3]*log(para[2]) -
log(gamma(para[3])) - x/para[2] + para[1]/para[2] )
}
MLE = optim(c(23,6,7),
fn = gamma_likelihood,
method = 'L-BFGS-B',
lower = 0.00000001,
control = list(fnscale = -1)
)
MLE$par
retults in: [1] 21.161109 5.394343 6.136862
I am trying to fit a truncated normal distribution to some data. However, I have been running into the following error:
<simpleError in optim(par = vstart, fn = fnobj, fix.arg = fix.arg, obs = data, gr = gradient, ddistnam = ddistname, hessian = TRUE, method = meth, lower = lower, upper = upper, ...): non-finite finite-difference value [1]>
Error in fitdist(testData, "truncnorm", start = list(a = 0, mean = 0.8, :
the function mle failed to estimate the parameters,
with the error code 100
I'm not sure what's going wrong - I've read that in some cases there can be problems fitting if the initial guesses are wrong or higher than the actual values, but I've tried a number of different start values and none seem to work.
Here is a small sample of my data, and the code I used to get the error:
library(fitdistrplus)
library(truncnorm)
testData <- c(3.2725167726, 0.1501345235, 1.5784128343, 1.218953218, 1.1895520932,
2.659871271, 2.8200152609, 0.0497193249, 0.0430677458, 1.6035277181,
0.2003910167, 0.4982836845, 0.9867184303, 3.4082793339, 1.6083770189,
2.9140912221, 0.6486576911, 0.335227878, 0.5088426851, 2.0395797721,
1.5216239237, 2.6116576364, 0.1081283479, 0.4791143698, 0.6388625172,
0.261194346, 0.2300098384, 0.6421213993, 0.2671907741, 0.1388568942,
0.479645736, 0.0726750815, 0.2058983462, 1.0936704833, 0.2874115077,
0.1151566887, 0.0129750118, 0.152288794, 0.1508512023, 0.176000366,
0.2499423442, 0.8463027325, 0.0456045486, 0.7689214668, 0.9332181529,
0.0290242892, 0.0441181842, 0.0759601229, 0.0767983979, 0.1348839304
)
fitdist(testData, "truncnorm", start = list(a = 0, mean = 0.8, sd = 0.9))
The problem is that the mle estimator provides increasingly negative estimates for the parameter mean as the lower bound a tends to zero (note that the latter must not be specified within the start parameter, but within fix.arg):
fitdist(testData, "truncnorm", fix.arg=list(a=-.5),
start = list(mean = mean(testData), sd = sd(testData)))
fitdist(testData, "truncnorm", fix.arg=list(a=-.2),
start = list(mean = mean(testData), sd = sd(testData)))
fitdist(testData, "truncnorm", fix.arg=list(a=-.15),
start = list(mean = mean(testData), sd = sd(testData)))
One possibility to prevent large negative values for mean is to use a lower bound for the optimisation:
fitdist(testData, "truncnorm", fix.arg=list(a=0),
start = list(mean = mean(testData), sd = sd(testData)),
optim.method="L-BFGS-B", lower=c(0, 0))
However, this alters the estimation procedure; in fact you are imposing additional constraints on the parameters and might obtain different answers with different lower bounds.
I have a simple likelihood function (from a normal dist with mean=0) that I want to maximize. optim keeps giving me this error:
Error in optim(par = phi, fn = loglike, estimates = estimates, NULL, hessian = TRUE, : non-finite finite-difference value [1]
Here is my data and likelihood function:
y = [ -0.01472 0.03942 0.03592 0.02776 -0.00090 ]
C = a varcov matrix:
1.66e-03 -0.000120 -6.780e-06 0.000102 -4.000e-05
-1.20e-04 0.001387 7.900e-05 -0.000140 -8.000e-05
-6.78e-06 0.000079 1.416e-03 -0.000070 8.761e-06
1.02e-04 -0.000140 -7.000e-05 0.001339 -6.000e-05
-4.00e-05 -0.000080 8.761e-06 -0.000060 1.291e-03
my log likelihood function is:
lglkl = -.5*(log(det(v)) + (t(y)%%vi%%y))` where v = phi*I + C and vi=inverse(v) and I= 5*5 Identity matrix.
I am trying to get the mle estimate for "phi". I thought this would be a simple optimization problem but am struggling. Would really appreciate any help. Thanks in advance. My code is below:
loglike <- function(phi,y) {
v = phi*I + C
vi = solve(v)
loglike = -.5*(log(det(v)) + (t(y)%*%vi%*%y))
return(-loglike)
}
phi = 0
parm <- optim(par=phi,fn=loglike,y=y,NULL,hessian = TRUE, method="L-BFGS-B",lower=0,upper=1000)
The error you ran into is because ϕ becomes negative beyond a certain number of iterations (which indicates that the constraints are not being applied correctly by the algorithm). Also, the solution does not converge to a single value but jumps between a few small values before reaching a situation where the updated covariance matrix is no-longer positive definite. At that stage you get det(v) < 0 and log[det(v)] is undefined. The optim algorithm bails out at that stage.
To see what's happening, play with the maxit and ndeps parameters in the code below.
require("matrixcalc")
#-------------------------------------------------
# Log-likelihood function
#-------------------------------------------------
loglike <- function(phi, y) {
# Shift the covariance matrix
print(paste("phi = ", phi))
#v = phi*I + (1 - phi)*C
v = phi*I + C
stopifnot(is.positive.definite(v))
# Invert shifted matrix
vi = solve(v)
# Compute log likelihood
loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
print(paste("L = ", loglike))
return(-loglike)
}
#-------------------------------------------------
# Data
#-------------------------------------------------
y = c(-0.01472, 0.03942, 0.03592, 0.02776, -9e-04)
C = structure(c(0.00166, -0.00012, -6.78e-06, 0.000102, -4e-05, -0.00012,
0.001387, 7.9e-05, -0.00014, -8e-05, -6.78e-06, 7.9e-05,
0.001416, -7e-05, 8.761e-06, 0.000102, -0.00014, -7e-05,
0.001339, -6e-05, -4e-05, -8e-05, 8.761e-06, -6e-05, 0.001291),
.Dim = c(5L, 5L ))
#--------
# Initial parameter
#--------
I = diag(5)
phi = 50
#--------
# Minimize
#--------
parm <- optim(par = phi, fn = loglike, y = y, NULL, hessian = TRUE,
method = "L-BFGS-B", lower = 0.0001, upper = 1000,
control = list(trace = 3,
maxit = 1000,
ndeps = 1e-4) )