R optim() L-BFGS-B needs finite values of 'fn' - Weibull

I am trying to estimate the three parameters a, b0 and b1 with the optim() function, but I always get the error:
Error in optim(par = c(1, 1, 1), fn = logweibull, method = "L-BFGS-B", :
L-BFGS-B needs finite values of 'fn'
t<-c(6,6,6,6,7,9,10,10,11,13,16,17,19,20,22,23,25,32,32,34,35,1,1,2,2,3,4,4,5,5,8,8,8,8,11,11,12,12,15,17,22,23)
d<-c(0,1,1,1,1,0,0,1,0,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
X<-c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0)
logweibull <- function (a,b0,b1) {
  a <- v[1]; b0 <- v[2]; b1 <- v[3]
  sum(d*log(t^a*exp(b0+X*b1) - t^a*exp(b0+X*b1))) + sum(d + log((a*t^(a-1))/t^a))
}
v<-c(1,1,1)
optim( par=c(1,1,1) ,fn = logweibull, method = "L-BFGS-B",lower = c(0.1, 0.1,0.1), upper = c(100, 100,100),control = list(fnscale = -1) )
Can you help me? Do you know what I did wrong?

You may also consider:
(1) passing the additional data variables (t, d, X) to the objective function along with the parameters you want to estimate;
(2) passing an explicit gradient function (added below);
(3) simplifying the original objective function (as below).
logweibull <- function (v,t,d,X) {
  a <- v[1]
  b0 <- v[2]
  b1 <- v[3]
  sum(d*(1+a*log(t)+b0+X*b1) - t^a*exp(b0+X*b1) + log(a/t)) # simplified function
}
grad.logweibull <- function (v,t,d,X) {
  a <- v[1]
  b0 <- v[2]
  b1 <- v[3]
  c(sum(d*log(t) - t^a*log(t)*exp(b0+X*b1) + 1/a),
    sum(d - t^a*exp(b0+X*b1)),
    sum(d*X - t^a*X*exp(b0+X*b1)))
}
optim(par = c(1,1,1), fn = logweibull, gr = grad.logweibull,
      method = "L-BFGS-B",
      lower = c(0.1, 0.1, 0.1),
      upper = c(100, 100, 100),
      control = list(fnscale = -1),
      t = t, d = d, X = X)
with output
$par
[1] 0.2604334 0.1000000 0.1000000
$value
[1] -191.5938
$counts
function gradient
10 10
$convergence
[1] 0
$message
[1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
Also, below is a comparison of convergence with and without an explicit gradient function (the latter approximated by finite differences). With an explicit gradient function it takes 9 iterations to converge to the solution, whereas without one (using finite differences) it takes 126 iterations.
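If you want to reproduce that comparison yourself, a minimal sketch (reusing logweibull, grad.logweibull, and the data above; fit_gr and fit_fd are just illustrative names) is to fit twice and inspect $counts:
fit_gr <- optim(par = c(1,1,1), fn = logweibull, gr = grad.logweibull,
                method = "L-BFGS-B", lower = c(0.1, 0.1, 0.1),
                upper = c(100, 100, 100), control = list(fnscale = -1),
                t = t, d = d, X = X)
fit_fd <- optim(par = c(1,1,1), fn = logweibull,  # no gr: finite differences
                method = "L-BFGS-B", lower = c(0.1, 0.1, 0.1),
                upper = c(100, 100, 100), control = list(fnscale = -1),
                t = t, d = d, X = X)
fit_gr$counts  # function/gradient evaluation counts with the explicit gradient
fit_fd$counts  # counts when the gradient is approximated numerically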

Related

objective function in optim evaluates to length 2 not 1

I do not even know if any of my equations are right, but when I run this it says the objective function in optim evaluates to length 2 not 1. I am trying to estimate lambda and alpha using MLE and the log-likelihood.
set.seed(1001)
q <- 1:5000
mean(q)
sd(q)
ll <- function(lambda = 0.5, alpha = 3.5) {
  n <- 5000
  (-n) * alpha * log(lambda) +
    n * log(gamma(alpha)) - ((alpha - 1) * sum(log(q))) +
    lambda * sum(q)
}
MLE <- optim(c(0.5, 3.5),
             fn = ll,
             method = "L-BFGS-B",
             lower = 0.00001,
             control = list(fnscale = -1),
             hessian = T
)
There are several fixes needed to make your optim call fly:
You should define ll as a function of a single vector argument rather than two separate arguments (otherwise you get the error "objective function in optim evaluates to length 2 not 1"), since optim passes your initial value x0 to the objective function fn as one vector:
ll <- function(x) {
  n <- 5000
  lambda <- x[1]
  alfa <- x[2]
  (-n)*alfa*log(lambda) + n*log(gamma(alfa)) - ((alfa-1)*sum(log(q))) + lambda*sum(q)
}
In optim, the lower bound lower should have the same length as x0, e.g.,
MLE <- optim(c(0.5, 3.5),
             fn = ll,
             method = "L-BFGS-B",
             lower = c(1e-5, 1e-5),
             control = list(fnscale = -1),
             hessian = FALSE
)
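As in the Weibull answer above, you can also pass the data to the objective function explicitly instead of relying on the global q; a minimal sketch of the same fit:
ll <- function(x, q) {
  n <- length(q)
  lambda <- x[1]
  alfa <- x[2]
  (-n)*alfa*log(lambda) + n*log(gamma(alfa)) - (alfa - 1)*sum(log(q)) + lambda*sum(q)
}
MLE <- optim(c(0.5, 3.5),
             fn = ll,
             method = "L-BFGS-B",
             lower = c(1e-5, 1e-5),
             control = list(fnscale = -1),
             q = q)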

Plotting incomplete elliptic integral of 1st kind

I wanted to set up a small data frame in order to plot some points of the incomplete elliptic integral of the 1st kind for different values of the amplitude phi and modulus k. The function to integrate is 1/sqrt(1 - (k*sin(x))^2) between 0 and phi. Here is the code I imagined:
v.phi <- seq(0, 2*pi, 1)
n.phi <- length(v.phi)
v.k <- seq(-1, +1, 0.5)
n.k <- length(v.k)
k <- rep(v.k, each = n.phi, times = 1)
phi <- rep(v.phi, each = 1, times = n.k)
df <- data.frame(k, phi)
func <- function(x, k) 1/sqrt(1 - (k*sin(x))^2)
df$area <- integrate(func,lower=0, upper=df$phi, k=df$k)
But this generates errors and I am obviously making a mistake in constructing the new variable df$area... Could someone point me in the right direction?
You can use mapply:
df$area <- mapply(function(phi, k) {
  integrate(func, lower = 0, upper = phi, k = k)$value
}, df$phi, df$k)
However that generates an error because there are some values of k equal to 1 or -1, while the allowed values are -1 < k < 1. You can't evaluate this integral for k = +/- 1.
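If you only need the defined cases, one way to proceed is simply to drop the boundary values of k before integrating (a sketch, reusing func and df from above):
df <- df[abs(df$k) < 1, ]  # keep only -1 < k < 1
df$area <- mapply(function(phi, k) {
  integrate(func, lower = 0, upper = phi, k = k)$value
}, df$phi, df$k)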
Note that there's a better way to evaluate this integral: the incomplete elliptic function of the first kind is implemented in the gsl package:
> integrate(func, lower=0, upper=6, k=0.5)$value
[1] 6.458877
> gsl::ellint_F(6, 0.5)
[1] 6.458877
As I said, this function is not defined for k=-1 or k=1:
> gsl::ellint_F(6, 1)
[1] NaN
> gsl::ellint_F(6, -1)
[1] NaN
> integrate(func, lower=0, upper=6, k=1)
Error in integrate(func, lower = 0, upper = 6, k = 1) :
non-finite function value
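If instead you want to keep every row, the gsl wrappers are vectorized over their arguments (worth verifying in your version of the package), so the mapply loop can be replaced entirely; the k = ±1 rows then come back as NaN rather than raising an error:
library(gsl)
df$area <- ellint_F(df$phi, df$k)  # NaN where |k| = 1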

Maximum likelihood estimation with uniform distribution in R leads to absurd result

I want to use the mle function to get estimates of a and b in a Unif(a,b) distribution. But I get absurd estimates nowhere close to 1 and 3.
library(stats4)
set.seed(20161208)
N <- 100
c <- runif(N, 1, 3)
LL <- function(min, max) {
  R <- runif(100, min, max)
  suppressWarnings(-sum(log(R)))
}
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS",
    lower = c(-Inf, 0), upper = c(Inf, Inf))
I got:
Call:
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS")
Coefficients:
min max
150.8114 503.6586
Any ideas of what's going on? Thank you in advance!
I would first point out where your code is wrong.
You need dunif not runif. You may define:
LL <- function (a, b) -sum(dunif(x, a, b, log = TRUE))
In my code below I did not use dunif, as the density is just 1 / (b - a) so I wrote it directly.
You are generating samples inside the objective function. For U[a,b] this happens to be OK, since its density does not depend on x, but for other distributions the objective function would change at each iteration.
With box constraints, you need method = "L-BFGS-B", not the ordinary "BFGS". And you are not using the right constraints.
Now in more depth...
For a length-n sample vector x from U[a, b], the likelihood is (b - a) ^ (-n), and the negative log-likelihood is n * log(b - a). Obviously the MLEs are a = min(x) and b = max(x).
Numerical optimization is completely unnecessary, and is in fact impossible without constraints. Look at the gradient vector:
( n / (a - b), n / (b - a) )
The partial derivative w.r.t. a / b is always negative / positive and can't be 0.
A numerical approach becomes feasible when we impose box constraints: -Inf < a <= min(x) and max(x) <= b < Inf. We know for sure that iteration terminates at the boundary.
My code below uses both optim and mle. Note mle will fail when it inverts the Hessian matrix, as the Hessian is singular (up to a positive factor):
-1/(b - a)^2    1/(b - a)^2
 1/(b - a)^2   -1/(b - a)^2
Code:
## 100 samples
set.seed(20161208); x <- runif(100, 1, 3)
# range(x)
# [1] 1.026776 2.984544
## using `optim`
nll <- function (par) log(par[2] - par[1]) ## objective function
gr_nll <- function (par) c(-1, 1) / diff(par) ## gradient function
optim(par = c(0,4), fn = nll, gr = gr_nll, method = "L-BFGS-B",
lower = c(-Inf, max(x)), upper = c(min(x), Inf), hessian = TRUE)
#$par
#[1] 1.026776 2.984544 ## <- reaches boundary!
#
# ...
#
#$hessian ## <- indeed singular!!
# [,1] [,2]
#[1,] -0.2609022 0.2609022
#[2,] 0.2609022 -0.2609022
## using `stats4::mle`
library(stats4)
nll. <- function (a, b) log(b - a)
mle(minuslogl = nll., start = list(a = 0, b = 4), method = "L-BFGS-B",
lower = c(-Inf, max(x)), upper = c(min(x), Inf))
#Error in solve.default(oout$hessian) :
# Lapack routine dgesv: system is exactly singular: U[2,2] = 0
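A side note on why the code above writes the density as 1 / (b - a) directly instead of calling dunif: a dunif-based negative log-likelihood jumps to +Inf as soon as the trial interval fails to cover the sample, which finite-difference steps near the boundary can hit. A quick illustration (nll_dunif is a hypothetical helper):
nll_dunif <- function(par) -sum(dunif(x, par[1], par[2], log = TRUE))
nll_dunif(c(0, 4))    # finite: [0, 4] covers all of x
nll_dunif(c(1.5, 4))  # Inf: some x fall below a = 1.5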

optim in R: non-finite finite-difference error

I have a simple likelihood function (from a normal dist with mean=0) that I want to maximize. optim keeps giving me this error:
Error in optim(par = phi, fn = loglike, estimates = estimates, NULL, hessian = TRUE, : non-finite finite-difference value [1]
Here is my data and likelihood function:
y = [ -0.01472 0.03942 0.03592 0.02776 -0.00090 ]
C = a varcov matrix:
1.66e-03 -0.000120 -6.780e-06 0.000102 -4.000e-05
-1.20e-04 0.001387 7.900e-05 -0.000140 -8.000e-05
-6.78e-06 0.000079 1.416e-03 -0.000070 8.761e-06
1.02e-04 -0.000140 -7.000e-05 0.001339 -6.000e-05
-4.00e-05 -0.000080 8.761e-06 -0.000060 1.291e-03
my log likelihood function is:
lglkl = -.5*(log(det(v)) + (t(y) %*% vi %*% y)), where v = phi*I + C, vi = solve(v), and I is the 5x5 identity matrix.
I am trying to get the mle estimate for "phi". I thought this would be a simple optimization problem but am struggling. Would really appreciate any help. Thanks in advance. My code is below:
loglike <- function(phi, y) {
  v = phi*I + C
  vi = solve(v)
  loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
  return(-loglike)
}
phi = 0
parm <- optim(par=phi,fn=loglike,y=y,NULL,hessian = TRUE, method="L-BFGS-B",lower=0,upper=1000)
The error you ran into is because phi becomes negative beyond a certain number of iterations (which indicates that the constraints are not being applied correctly by the algorithm). Also, the solution does not converge to a single value but jumps between a few small values before reaching a state where the updated covariance matrix is no longer positive definite. At that stage det(v) < 0, so log[det(v)] is undefined and the optim algorithm bails out.
To see what's happening, play with the maxit and ndeps parameters in the code below.
require("matrixcalc")
#-------------------------------------------------
# Log-likelihood function
#-------------------------------------------------
loglike <- function(phi, y) {
  # Shift the covariance matrix
  print(paste("phi = ", phi))
  #v = phi*I + (1 - phi)*C
  v = phi*I + C
  stopifnot(is.positive.definite(v))
  # Invert shifted matrix
  vi = solve(v)
  # Compute log likelihood
  loglike = -.5*(log(det(v)) + (t(y) %*% vi %*% y))
  print(paste("L = ", loglike))
  return(-loglike)
}
#-------------------------------------------------
# Data
#-------------------------------------------------
y = c(-0.01472, 0.03942, 0.03592, 0.02776, -9e-04)
C = structure(c(0.00166, -0.00012, -6.78e-06, 0.000102, -4e-05, -0.00012,
0.001387, 7.9e-05, -0.00014, -8e-05, -6.78e-06, 7.9e-05,
0.001416, -7e-05, 8.761e-06, 0.000102, -0.00014, -7e-05,
0.001339, -6e-05, -4e-05, -8e-05, 8.761e-06, -6e-05, 0.001291),
.Dim = c(5L, 5L ))
#--------
# Initial parameter
#--------
I = diag(5)
phi = 50
#--------
# Minimize
#--------
parm <- optim(par = phi, fn = loglike, y = y, NULL, hessian = TRUE,
method = "L-BFGS-B", lower = 0.0001, upper = 1000,
control = list(trace = 3,
maxit = 1000,
ndeps = 1e-4) )
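An alternative that sidesteps the boundary problem altogether is to reparameterize: optimize over theta = log(phi), so that phi = exp(theta) is positive by construction and no box constraints are needed. A sketch reusing loglike, y, I, and C from above (loglike_theta and parm2 are illustrative names):
loglike_theta <- function(theta, y) loglike(exp(theta), y)
parm2 <- optim(par = log(50), fn = loglike_theta, y = y,
               method = "BFGS", hessian = TRUE)
exp(parm2$par)  # back-transform to phi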

hand-rolled R code for Poisson MLE

I'm attempting to write my own function to understand how the Poisson distribution behaves within a Maximum Likelihood Estimation framework (as it applies to GLM).
I'm familiar with R's handy glm function, but wanted to try and hand-roll some code to understand what's going on:
n <- 10000 # sample size
b0 <- 1.0 # intercept
b1 <- 0.2 # coefficient
x <- runif(n=n, min=0, max=1.5) # generate covariate values
lp <- b0+b1*x # linear predictor
lambda <- exp(lp) # compute lambda
y <- rpois(n=n, lambda=lambda) # generate y-values
dta <- data.frame(y=y, x=x) # generate dataset
negloglike <- function(lambda) {n*lambda-sum(x)*log(lambda) + sum(log(factorial(y)))} # build negative log-likelihood
starting.vals <- c(0,0) # one starting value for each parameter
pars <- c(b0, b1)
maxLike <- optim(par=pars,fn=negloglike, data = dta) # optimize
My R output when I enter maxLike is the following:
Error in fn(par, ...) : unused argument (data = list(y = c(2, 4....
I assume I've specified optim within my function incorrectly, but I'm not familiar enough with the nuts-and-bolts of MLE or constrained optimization to understand what I'm missing.
optim can only use your function in a certain way: it assumes the first parameter of your function takes the parameters to optimize as a vector. If you need to pass other information to this function (in your case the data) you need to have that as a parameter of your function. Your negloglike function doesn't have a data parameter, and that's what it is complaining about. The way you have it coded you don't need one, so you could probably fix your problem by just removing the data = dta part of your call to optim, but I didn't test that. Here is a small example of a simple MLE for just a Poisson (not the GLM):
negloglike_pois <- function(par, data){
  x <- data$x
  lambda <- par[1]
  -sum(dpois(x, lambda, log = TRUE))
}
dat <- data.frame(x = rpois(30, 5))
optim(par = 4, fn = negloglike_pois, data = dat)
mean(dat$x)
> optim(par = 4, fn = negloglike_pois, data = dat)
$par
[1] 4.833594
$value
[1] 65.7394
$counts
function gradient
22 NA
$convergence
[1] 0
$message
NULL
Warning message:
In optim(par = 4, fn = negloglike_pois, data = dat) :
one-dimensional optimization by Nelder-Mead is unreliable:
use "Brent" or optimize() directly
> # The "true" MLE. We didn't hit it exactly but came really close
> mean(dat$x)
[1] 4.833333
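Following that warning message, here is the same one-parameter fit via method = "Brent" (a sketch; the finite bounds Brent requires are arbitrary choices here):
optim(par = 4, fn = negloglike_pois, data = dat,
      method = "Brent", lower = 0.01, upper = 20)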
Implementing the comments from Dason's answer is quite straightforward, but just in case:
library("data.table")
d <- data.table(id = as.character(1:100),
                x1 = runif(100, 0, 1),
                x2 = runif(100, 0, 1))
#' the assumption is that lambda can be written as
#' log(lambda) = b1*x1 + b2*x2
#' (In addition, could add a random component)
d[, mean := exp( 1.57*x1 + 5.86*x2 )]
#' draw a y for each of the observations
#' (rpois is vectorized in lambda, so d[, y := rpois(.N, mean)] would also work)
d[, y := sapply(mean, function(x) rpois(1, x)) ]
negloglike_pois <- function(par, data){
  data <- copy(data)  # copy the data passed in, so the caller's table is not modified
  # update estimate of the mean
  data[, mean_tmp := exp( par[1]*x1 + par[2]*x2 )]
  # calculate the contribution of each observation to the likelihood
  data[, log_p := dpois(y, mean_tmp, log = TRUE)]
  #' Now we can sum up the log-probabilities
  data[, -sum(log_p)]
}
optim(par = c(1,1), fn = negloglike_pois, data = d)
$par
[1] 1.554759 5.872219
$value
[1] 317.8094
$counts
function gradient
95 NA
$convergence
[1] 0
$message
NULL
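Since the question mentions glm, a quick cross-check of the hand-rolled estimates against the built-in fitter (no intercept term, matching the simulation; the log link is the poisson default):
glm(y ~ x1 + x2 - 1, family = poisson, data = d)
# coefficients should land near the par = c(1.554759, 5.872219) found by optim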
