I am working on a paper that requires me to find the MLE of Gumbel’s type I
bivariate exponential distribution. I have proved the likelihood and log-likelihood functions likelihood and log-likelihood but I am struggling to implement it in r to perform optimization with Optim function. My code generates NA values.
Below are my codes.
# likelihood function of x
likelihood.x = function(params, data) {
lambda1 = params[1]
lambda2 = params[2]
theta = params[3]
A = (1 - theta) * (lambda1 * lambda2)
B = theta * (lambda1 ^ 2) * lambda2 * data$X1
C = theta * lambda1 * (lambda2 ^ 2) * data$X2
D = (theta ^ 2) * (lambda1 ^ 2) * (lambda2 ^ 2) * data$X1 * data$X2
E = (lambda1 * data$X1) + (lambda2 * data$X2) + (theta * lambda1 * lambda2 * data$X1 * data$X2)
f = sum(log(A + B + C + D)) - sum(E)
return(exp(f))
}
# Log-likelihood function of x
log.likelihood.x = function(params, data){
lambda1 = params[1]
lambda2 = params[2]
theta = params[3]
A = (1 - theta) * (lambda1 * lambda2)
B = theta * (lambda1 ^ 2) * lambda2 * data$X1
C = theta * lambda1 * (lambda2 ^ 2) * data$X2
D = (theta ^ 2) * (lambda1 ^ 2) * (lambda2 ^ 2) * data$X1 * data$X2
E = (lambda1 * data$X1) + (lambda2 * data$X2) + (theta * lambda1 * lambda2 * data$X1 * data$X2)
f = sum(log(A + B + C + D)) - sum(E)
return(-f)
}
Here's the function for generating the data
# Simulating data
rGBVE = function(n, lambda1, lambda2, theta) {
x1 = rexp(n, lambda1)
lambda12 = lambda1 * lambda2
pprod = lambda12 * theta
C = exp(lambda1 * x1)
A = (lambda12 - pprod + pprod * lambda1 * x1) / C
B = (pprod * lambda2 + pprod ^ 2 * x1) / C
D = lambda2 + pprod * x1
wExp = A / D
wGamma = B / D ^ 2
data.frame(x1, x2 = rgamma(n, (runif(n) > wExp / (wExp + wGamma)) + 1, D))
}
data = rGBVE(n=100, lambda1 = 1.2, lambda2 = 1.4, theta = 0.5)
colnames(data) = c("X1", "X2")
My goal is to find MLE for lambda1, lambda2 and theta using Optim() in r.
Kindly assist me to implement my likelihood and log-likelihood function in r.
Thank you.
Your concern appears to be about the warning message
In log(A+B+C+D): NaNs produced
Such warnings are usually harmless — it just means that the optimization algorithm tried a set of parameters somewhere along the way that violated the condition A+B+C+D ≥ 0. Since these are reasonably complex expressions it would take a little bit of effort to figure out how one might constrain the parameters (or reparameterize the function, e.g. fitting some of the parameters on the log scale) to avoid the warning, but taking a guess that keeping the parameters non-negative will help, we can try using the L-BFGS-B algorithm (which is the only algorithm available in optim() that allows multidimensional bounded optimization).
r1 <- optim(par = c(1,2,1),
fn = log.likelihood.x,
dat = data)
r2 <- optim(par = c(1,2,1),
fn = log.likelihood.x,
lower = rep(0,3),
method = "L-BFGS-B",
dat = data)
The second does not generate warnings, and the results are close (if not identical):
all.equal(r1$par, r2$par)
## "Mean relative difference: 0.0001451953"
You might want to use bbmle, which has some additional features for likelihood modeling:
library(bbmle)
fwrap <- function(x) log.likelihood.x(x, dat = data)
parnames(fwrap) <- c("lambda1", "lambda2", "theta")
m1 <- mle2(fwrap, start = c(lambda1 = 1, lambda2 = 2, theta = 1), vecpar = TRUE,
method = "L-BFGS-B", lower = c(0, 0, -0.5))
pp <- profile(m1)
plot(pp)
confint(pp)
confint(m1, method = "quad")
I am trying to manually optimise a negative binomial regression model using the optim package in R trying to predict a count variable y using a matrix of factors X using the following code:
# generating some fake data
n <- 1000
X <- matrix(NA, ncol = 5, nrow = n)
X[,1] <- 1
X[,2] <- sample(size = n, x = c(0,1), replace = TRUE)
X[,3] <- sample(size = n, x = c(0,1), replace = TRUE)
X[,4] <- sample(size = n, x = c(0,1), replace = TRUE)
X[,5] <- sample(size = n, x = c(0,1), replace = TRUE)
beta0 <- 3
beta1 <- -2
beta2 <- -2
beta3 <- -4
beta4 <- -0.9
k <- 0.9
## draws from negative binomial distribution
mu <- exp(beta0 + beta1 * X[,2] + beta2 * X[,3] + beta3 * X[,4] + beta4 * X[,5])
theta <- mu + mu ^2 / k
# dependent variable
y <- rnegbin(n, mu = mu, theta = theta)
# function to be optimised
negbin_ll <- function(y, X, theta){
beta <- theta[1:ncol(X)]
alpha <- theta[ncol(X) + 1]
logll <- y * log(alpha) + y *( beta %*% t(X) ) - (y + (1 / alpha ) ) * log( 1 + alpha * exp(beta %*% t(X))) + lgamma(y + (1 / alpha)) - lgamma ( y + 1) - lgamma ( 1 / alpha)
logll <- sum( logll )
return(logll)
}
stval <- rep(0, ncol(X) + 1)
res <-
optim(
stval,
negbin_ll,
y = y,
X = X,
control = list(fnscale = -1),
hessian = TRUE,
method = "BFGS"
)
The code should produce point estimates from the optimisation process, but instead fails when executing the optim-function with the error in optim(stval, negbin_ll, y = y, X = X, control = list(fnscale = -1), : initial value in 'vmmin' is not finite.
I already tried to change log(gamma(...)) to lgamma(...) in the likelihood function and tried many other ways, but I fail to get estimates.
Changing the start values of optim also does not help.
Do you have any idea if there is any particularity to the likelihood function that leads to values being treated in any odd fashion?
Help would be much appreciated.
optim tries several points to get to the minimum, in your case it hits some non-positive values in the arguments inside the logs. One way is to discard the values that return any non-positive inside the problematic functions by returning a negative (in your case) large number, like -lenght(series)*10^6. Remade the log-likelihood function, like this it kinda works:
negbin_ll <- function(y, X, theta){
beta <- theta[1:ncol(X)]
alpha <- theta[ncol(X) + 1]
if(any(alpha<=0)) return(-length(y)*10^6)
if(any(1 + alpha * exp(beta %*% t(X))<=0)) return(-length(y)*10^6)
logll <- y * log(alpha) + y *( beta %*% t(X) ) - (y + (1 / alpha ) ) * log( 1 + alpha * exp(beta %*% t(X))) + lgamma(y + (1 / alpha)) - lgamma ( y + 1) - lgamma ( 1 / alpha)
logll <- sum( logll )
return(logll)
}
i'm writing a code to solve Hyperbolic differential equations with different numerica methods such as Lax-Friederichs, Lax-Wendroff and Upwind scheme. During the calculation i often obtain this type of error:
RuntimeWarning: overflow encountered in double_scalars
that seems to disappear when i reduce the dimensions of matrix. Here i attach my code:
for i in range (0,nt):
#inlet
rho[0,i] = P_inlet/(R*T_inlet)
u[0,i] = u_inlet
P[0,i] = P_inlet
T[0,i] = T_inlet
Ac[0,0] = A_var_list[0]
Q1[0,i] = rho[0,i]
Q2[0,i] = rho[0,i] * u[0,i]
Q3[0,i] = (1/2)*(rho[0,i])*(u[0,i]**2) + (P[0,i]/(k-1))
F1[0,i] = rho[0,i] * u[0,i]
F2[0,i] = (1/2)*(rho[0,i])*(u[0,i]**2) + P[0,i]
F3[0,i] = u[0,i] * ((1/2)*(rho[0,i])*(u[0,i]**2) + (k*P[0,i]/(k-1)))
#outlet
rho[nx-1,i] = rho_outlet
P[nx-1,i] = P_outlet
u[nx-1,i] = u_outlet
T[nx-1,i] = T_outlet
Q1[nx-1,i] = rho[nx-1,i]
Q2[nx-1,i] = rho[nx-1,i]*u[nx-1,i]
Q3[nx-1,i] = (1/2)*rho[nx-1,i]*u[nx-1,i] + (P[nx-1,i]/(k-1))
F1[nx-1,i] = rho[nx-1,i] * u[nx-1,i]
F2[nx-1,i] = (1/2)*rho[nx-1,i]*(u[nx-1,i]**2) + P[nx-1,i]
F3[nx-1,i] = u[nx-1,i] * ((1/2)*(rho[nx-1,i])*(u[nx-1,i]**2) + (k*P[nx-1,i]/(k-1)))
#manifold
for i in range (1,nx-1):
rho[i,0] = P_inlet/(R*Tw[i])
u[i,0] = u_inlet
P[i,0] = P_inlet
Ac[i,0] = A_var_list[i]
Q1[i,0] = rho[i,0]
Q2[i,0] = rho[i,0] * u[i,0]
Q3[i,0] = (1 / 2) * (rho[i,0]) * (u[i,0] ** 2) + (P[i,0] / (k - 1))
F1[i, 0] = rho[i, 0] * u[i, 0]
F2[i, 0] = (1 / 2) * (rho[i, 0]) * (u[i, 0] ** 2) + P[i, 0]
F3[i, 0] = u[i, 0] * ((1 / 2) * (rho[i, 0]) * (u[i, 0] ** 2) + (k * P[i, 0] / (k - 1)))
S1[i, 0] = -rho[i, 0] * u[i, 0] * (Ac[i, 0] - Ac[i - 1, 0])
S2[i, 0] = -(rho[i, 0] * ((u[i, 0] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i - 1, 0])) - (
(frict_fact * np.pi * rho[i, 0] * d[i] * u[i, 0] ** 2) / (2 * Ac[i, 0]))
S3[i, 0] = - (u[i, 0] * (rho[i, 0] * ((u[i, 0] ** 2) / 2) + (k * P[i, 0] / (k - 1))) * (
(Ac[i, 0] - Ac[i - 1, 0]) / Ac[i, 0])) + (Lambda * np.pi * d[i] * (Tw[i] - T[i, 0]) / Ac[i, 0])
def Upwind():
for n in range (0,nt-1):
for i in range (1,nx):
Q1[i,n+1] = Q1[i-1,n]-((F1[i,n] - F1[i-1,n])/Dx)*Dt + (S1[i,n]-S1[i-1,n])*Dt
Q2[i, n + 1] = Q2[i-1, n] - ((F2[i, n] - F2[i - 1, n]) / Dx) * Dt + (S2[i, n] - S2[i - 1, n]) * Dt
Q3[i, n + 1] = Q3[i-1, n] - ((F3[i, n] - F3[i - 1, n]) / Dx) * Dt + (S3[i, n] - S3[i - 1, n]) * Dt
rho[i, n+1] = Q1[i, n+1]
u[i, n+1] = Q2[i, n+1] / rho[i, n+1]
P[i, n+1] = (Q3[i, n+1] - 0.5 * rho[i, n+1] * u[i, n+1] ** 2) * (k - 1)
T[i, n+1] = P[i, n+1] / (R * rho[i, n+1])
F1[i,n+1] = Q2[i,n+1]
F2[i,n+1] = rho[i,n+1]*((u[i,n+1]**2)/2) +P[i,n+1]
F3[i, n + 1] = u[i, n + 1] * (
(rho[i, n + 1] * ((u[i, n + 1] ** 2) / 2)) + (k * P[i , n + 1] / (k - 1)))
S1[i, n + 1] = -rho[i, n + 1] * u[i, n + 1] * (Ac[i, 0] - Ac[i-1, 0])
S2[i, n + 1] = - (rho[i, n + 1] * (
(u[i, n + 1] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i-1, 0])) - ((
(frict_fact * np.pi * rho[i, n + 1] * d[i] * (u[i, n + 1] ** 2)) / (2 * Ac[i, 0])))
S3[i, n + 1] = -(u[i, n + 1] * (
rho[i, n + 1] * ((u[i, n + 1] ** 2) / 2) + (k * P[i, n + 1] / (k - 1))) * (
(Ac[i , 0] - Ac[i-1, 0]) / Ac[i, 0])) + (
Lambda * np.pi * d[i ] * (Tw[i] - T[i, 0]) / Ac[i, 0])
plt.figure(1)
plt.plot(P[:, nt - 1])
plt.figure(2)
plt.plot(u[:, nt - 1])
def Lax_Friedrichs():
for n in range (1,nt):
for i in range (1,nx-1):
F1_m1 = 0.5 * (F1[i, n - 1] + F1[i - 1, n - 1])
F2_m1 = 0.5 * (F2[i, n - 1] + F2[i - 1, n - 1])
F3_m1 = 0.5 * (F3[i, n - 1] + F3[i - 1, n - 1])
S1_m1 = 0.5 * (S1[i, 0] + S1[i - 1, 0])
S2_m1 = 0.5 * (S2[i, 0] + S2[i - 1, 0])
S3_m1 = 0.5 * (S3[i, 0] + S3[i - 1, 0])
F1_p1 = 0.5 * (F1[i + 1, n - 1] + F1[i, n - 1])
F2_p1 = 0.5 * (F2[i + 1, n - 1] + F2[i, n - 1])
F3_p1 = 0.5 * (F3[i + 1, n - 1] + F3[i, n - 1])
S1_p1 = 0.5 * (S1[i + 1, n - 1] + S1[i, n - 1])
S2_p1 = 0.5 * (S2[i + 1, n - 1] + S2[i, n - 1])
S3_p1 = 0.5 * (S3[i + 1, n - 1] + S3[i, n - 1])
Q1[i, n] = 0.5 * (Q1[i - 1, n - 1] + Q1[i + 1, n - 1]) - Dt/Dx * (F1_p1 - F1_m1) + (S1_p1 - S1_m1) * Dt
Q2[i, n] = 0.5 * (Q2[i - 1, n - 1] + Q2[i + 1, n - 1]) - Dt/Dx * (F2_p1 - F2_m1) + (S2_p1 - S2_m1) * Dt
Q3[i, n] = 0.5 * (Q3[i - 1, n - 1] + Q3[i + 1, n - 1]) - Dt/Dx * (F3_p1 - F3_m1) + (S3_p1 - S3_m1) * Dt
rho[i, n] = Q1[i, n]
u[i, n] = Q2[i, n] / rho[i, n]
P[i, n] = (Q3[i, n] - 0.5 * rho[i, n] * u[i, n] ** 2) * (k - 1)
T[i, n] = P[i, n] / (R * rho[i, n])
F1[i, n] = Q2[i, n]
F2[i, n] = rho[i, n] * ((u[i, n] ** 2) / 2) + P[i, n]
F3[i, n] = u[i, n] * (
(rho[i, n] * ((u[i, n] ** 2) / 2)) + (k * P[i, n] / (k - 1)))
S1[i, n] = -rho[i, n] * u[i, n] * (Ac[i, 0] - Ac[i - 1, 0])
S2[i, n] = - (rho[i, n] * (
(u[i, n] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i - 1, 0])) - ((
(frict_fact * np.pi * rho[i, n] * d[i] * (u[i, n] ** 2)) / (2 * Ac[i, 0])))
S3[i, n] = -(u[i, n] * (
rho[i, n] * ((u[i, n] ** 2) / 2) + (k * P[i, n] / (k - 1))) * (
(Ac[i, 0] - Ac[i - 1, 0]) / Ac[i, 0])) + (
Lambda * np.pi * d[i] * (Tw[i] - T[i, 0]) / Ac[i, 0])
# Plot
plt.figure(1)
plt.plot(P[:, nt - 1])
plt.figure(2)
plt.plot(u[:, nt - 1])
def Lax_Wendroff():
for n in range (0,nt-1):
for i in range (1,nx-1):
Q1_plus_half = (1 / 2) * (Q1[i, n] + Q1[i + 1, n]) - (Dt / (2 * Dx)) * (F1[i + 1, n] - F1[i, n]) + (
S1[i + 1, n] - S1[i, n]) * Dt
Q1_less_half = (1 / 2) * (Q1[i, n] + Q1[i - 1, n]) - (Dt / (2 * Dx)) * (F1[i, n] - F1[i - 1, n]) + (
S1[i, n] - S1[i - 1, n]) * Dt
Q2_plus_half = (1 / 2) * (Q2[i-1, n] + Q2[i + 1, n]) - (Dt / (2 * Dx)) * (F2[i + 1, n] - F2[i, n]) + (
S2[i + 1, n] - S2[i, n]) * Dt
Q2_less_half = (1 / 2) * (Q2[i, n] + Q2[i - 1, n]) - (Dt / (2 * Dx)) * (F2[i, n] - F2[i - 1, n]) + (
S2[i, n] - S2[i - 1, n]) * Dt
Q3_plus_half = (1 / 2) * (Q3[i, n] + Q3[i + 1, n]) - (Dt / (2 * Dx)) * (F3[i + 1, n] - F3[i, n]) + (
S3[i + 1, n] - S3[i, n]) * Dt
Q3_less_half = (1 / 2) * (Q3[i, n] + Q3[i - 1, n]) - (Dt / (2 * Dx)) * (F3[i, n] - F3[i - 1, n]) + (
S3[i, n] - S3[i - 1, n]) * Dt
rho_less_half = Q1_less_half
u_less_half = Q2_less_half / rho_less_half
P_less_half = (Q3_less_half - ((1 / 2) * rho_less_half * (u_less_half ** 2) / 2)) * (k - 1)
F1_less_half = rho_less_half * u_less_half
F2_less_half = rho_less_half * ((u_less_half ** 2) / 2) + P_less_half
F3_less_half = u_less_half * ((rho_less_half * ((u_less_half ** 2) / 2)) + (k * P_less_half / (k - 1)))
rho_plus_half = Q1_plus_half
u_plus_half = Q2_plus_half / rho_plus_half
P_plus_half = (Q3_plus_half - ((1 / 2) * rho_plus_half * (u_plus_half ** 2) / 2)) * (k - 1)
F1_plus_half = rho_plus_half * u_plus_half
F2_plus_half = rho_plus_half * ((u_plus_half ** 2) / 2) + P_plus_half
F3_plus_half = u_plus_half * ((rho_plus_half * ((u_plus_half ** 2) / 2)) + (k * P_plus_half / (k - 1)))
# I termini sorgente da mettere dentro l'equazione finale di Q li calcolo come medie delle variabili nel condotto
S1_less_half = 0.5 * (S1[i - 1, n] + S1[i, n])
S2_less_half = 0.5 * (S2[i - 1, n] + S2[i, n])
S3_less_half = 0.5 * (S3[i - 1, n] + S3[i, n])
S1_plus_half = 0.5 * (S1[i + 1, n] + S1[i, n])
S2_plus_half = 0.5 * (S2[i + 1, n] + S2[i, n])
S3_plus_half = 0.5 * (S3[i + 1, n] + S3[i, n])
"""S1_less_half = Q1_less_half + F1_less_half
S2_less_half = Q2_less_half + F2_less_half
S3_less_half = Q3_less_half + F3_less_half
S1_plus_half = Q1_plus_half + F1_plus_half
S2_plus_half = Q2_plus_half + F2_plus_half
S3_plus_half = Q3_plus_half + F3_plus_half"""
Q1[i , n + 1] = Q1[i, n] - (Dt / Dx) * (F1_plus_half - F1_less_half) - (S1_plus_half - S1_less_half) * Dt
Q2[i, n + 1] = Q2[i, n] - (Dt / Dx) * (F2_plus_half - F2_less_half) - (S2_plus_half - S2_less_half) * Dt
Q3[i, n + 1] = Q3[i, n] - (Dt / Dx) * (F3_plus_half - F3_less_half) - (S3_plus_half - S3_less_half) * Dt
rho[i, n + 1] = Q1[i, n + 1]
u[i, n + 1] = Q2[i, n + 1] / rho[i, n + 1]
P[i, n + 1] = (Q3[i, n + 1] - 0.5 * rho[i, n + 1] * (u[i, n + 1] ** 2)) * (k - 1)
F1[i, n + 1] = rho[i, n + 1] * u[i, n + 1]
F2[i, n + 1] = rho[i, n + 1] * ((u[i, n + 1] ** 2) / 2) + P[i, n + 1]
F3[i, n+1] = u[i, n+1] * (
(rho[i, n+1] * ((u[i, n+1] ** 2) / 2)) + (k * P[i, n+1] / (k - 1)))
S1[i, n+1] = -rho[i, n+1] * u[i, n+1] * (Ac[i, 0] - Ac[i - 1, 0])
S2[i, n+1] = - (rho[i, n+1] * (
(u[i, n+1] ** 2) / (Ac[i, 0])) * (Ac[i, 0] - Ac[i - 1, 0])) - ((
(frict_fact * np.pi * rho[i, n+1] * d[i] * (u[i, n+1] ** 2)) / (2 * Ac[i, 0])))
S3[i, n+1] = -(u[i, n+1] * (
rho[i, n+1] * ((u[i, n+1] ** 2) / 2) + (k * P[i, n+1] / (k - 1))) * (
(Ac[i, 0] - Ac[i - 1, 0]) / Ac[i, 0])) + (
Lambda * np.pi * d[i] * (Tw[i] - T[i, 0]) / Ac[i, 0])
# Plot
plt.figure(1)
plt.plot(P[:, nt - 1])
plt.figure(2)
plt.plot(u[:, nt - 1])
I'm pretty sure that's a matter of indices but i havent't found the solution yet. Hope you can help me.