I'm trying to fit a differential equation by least squares using the FME package.
However, I keep getting an error that I don't know how to tackle.
A reproducible example:
library(deSolve)
library(FME)

times = seq(0, 4, by = 0.5)
dat = data.frame(time = seq(1, 4),
                 Tick = c(128, 52.5, 28, 121))
N = 10

tick.model <- function(time, y, params, ...) {  # a function of time, state and parameters, as ode() expects
  with(as.list(c(y, params)), {
    dTick <- (30 - s.t*Tick)*Tick*0.3*N - delta.t*Tick
    return(list(c(dTick)))
  })
}

y = c(Tick = 82.375)

cost1 <- function(p) {
  out <- ode(y, times, tick.model, p)
  modCost(out, dat, weight = "none")
}

params <- c(s.t = 0.1, delta.t = 1)
fit = modFit(f = cost1, p = params, lower = rep(0, 2),
             upper = c(10, 5))
summary(fit)
The result comes out like this:
Parameters:
Estimate Std. Error t value Pr(>|t|)
s.t 0.3641876 NA NA NA
delta.t 0.0001417 NA NA NA
Residual standard error: 60.92 on 2 degrees of freedom
Error in cov2cor(x$cov.unscaled) : 'V' is not a square numeric matrix
In addition: Warning message:
In summary.modFit(fit) : Cannot estimate covariance; system is singular
Also, the fitted model doesn't look nice.
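For reference, this is roughly how the fit can be overlaid on the data (a minimal sketch reusing the objects above; fit$par holds the estimated parameters):

out.fit <- ode(y, times, tick.model, fit$par)     # solve the ODE at the fitted parameter values
plot(dat$time, dat$Tick, pch = 16, xlab = "time", ylab = "Tick")
lines(out.fit[, "time"], out.fit[, "Tick"])       # fitted trajectory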
I have no idea what I could have done wrong.
I'm trying to run an MLE for an infectious disease compartmental transmission model (SEIR, in my case SSEIR) with the mle2 command, trying to fit a curve of predicted weekly deaths to the observed weekly deaths, similar to this:
[plot of predicted vs. observed weekly deaths]
However, the parameter estimates always end up on the (sensible) boundaries I provide, and the SEs, z-values, and p-values are NA.
I set up the SEIR model and then solve it with the ode solver. Using that model output and the observed data, I calculate a negative log likelihood, which I then submit to the mle2 function.
When I first set it up, there were multiple errors that stopped the script from running, but now that those are resolved, I cannot seem to find the root of why the fitting doesn't work.
I am certain that the boundaries I set for the parameter estimation are sensible. The parameters are transition rates between compartments and are therefore defined as (for example) delta = 1/duration of infectiousness, so there are very real biological boundaries on what the parameters can be.
I am aware that I am trying to fit a lot of parameters with not that much data, but the same problem persists when I try only fitting one, so that cannot be the root of it.
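For example, with the time unit taken as weeks, a duration of 4.1 days corresponds to a rate of 1/(4.1/7) ≈ 1.71 per week, which is where the starting value for delta used further down comes from.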
library(deSolve)
library(bbmle)
#data
gdta <- c(0, 36.2708172419082, 1.57129615346629, 28.1146409459558, 147.701669719614, 311.876708482584, 512.401145459178, 563.798275104372, 470.731269976821, 292.716043742125, 153.604156195608, 125.760068922451, 198.755685044427, 143.847282793854, 69.2693867232681, 42.2093135487066, 17.0200426587424)
# build seir function
seir <- function(time, state, parameters) {
  with(as.list(c(state, parameters)), {
    dS0 <- -beta0 * S0 * (I/N)
    dS1 <- -beta1 * S1 * (I/N)
    dE  <- beta0 * S0 * (I/N) + beta1 * S1 * (I/N) - delta * E
    dI  <- delta * E - gamma * I
    dR  <- gamma * I
    return(list(c(dS0, dS1, dE, dI, dR)))
  })
}
# build function to run seir, include ode solver
run_seir <- function(time, state, beta0, beta1, delta, gamma, sigma, N, startInf) {
  parameters <- c(beta0, beta1, delta, gamma)
  names(parameters) <- c("beta0", "beta1", "delta", "gamma")
  init <- c(S0 = (N - startInf) * sigma,
            S1 = (N - startInf) * (1 - sigma),
            E = 0,
            I = startInf,
            R = 0)
  state_est <- as.data.frame(ode(y = init, times = times, func = seir, parms = parameters))
  return(state_est)
}
times <- seq(0, 16, by = 1) #sequence
states <- c("S0", "S1", "E", "I", "R")
# run the run_seir function to see if it works
run_seir(time = times, state= states, beta0 = 1/(1.9/7), beta1 = 0.3*(1/(1.9/7)), delta = 1/(4.1/7), gamma = 1/(4.68/7), sigma = 0.7, N = 1114100, startInf = 100)
# build calc likelihood function
calc_likelihood <- function(times, state, beta0, beta1, delta, gamma, sigma, N, startInf, CFR) {
  model.output <- run_seir(time, state, beta0, beta1, delta, gamma, sigma, N, startInf)
  LL <- sum(dpois(round(as.numeric(gdta)), (model.output$I)/(1/delta)*CFR, log = TRUE))
  print(LL)
  return(LL)
}
# run calc_likelihood function
calc_likelihood(time = times, state = states, beta0 = 1/(1.9/7), beta1 = 0.3*(1/(1.9/7)), delta = 1/(4.1/7), gamma = 1/(4.68/7), sigma = 0.7, N = 1114100, startInf = 100, CFR = 0.02)
#MLE
#parameters that are supposed to be fixed
fixed.pars <- c(N=1114100, startInf=100, CFR = 0.02)
#parameters that mle2 is supposed to estimate
free.pars <- c(beta0 = 1/(1.9/7), beta1 = 0.3*(1/(1.9/7)),
delta = 1/(4.1/7), gamma = 1/(4.68/7), sigma = 0.7)
#lower bound
lower_v <- c(beta0 = 0, beta1 = 0, delta = 0, gamma = 0, sigma = 0)
#upper bound
upper_v <- c(beta0 = 15, beta1 = 15, delta = 15, gamma = 15, sigma = 1)
#sigma = 1, this is not a typo
#mle function - need to use L-BFGS-B since we need to include boundaries
test2 <- mle2(calc_likelihood, start = as.list(free.pars), fixed = as.list(fixed.pars),
              method = "L-BFGS-B", lower = lower_v, upper = upper_v)
summary(test2)
After I run mle2, I get a warning saying:
Warning message:
In mle2(calc_likelihood, start = as.list(free.pars), fixed = as.list(fixed.pars), :
some parameters are on the boundary: variance-covariance calculations based on Hessian may be unreliable
and if I look at summary(test2):
Warning message:
In sqrt(diag(object@vcov)) : NaNs produced
Based on the research I've done so far, I understand that the second warning might be due to the estimates being on the boundaries, so my question really is how to address the first one.
If I run mle2 with only lower boundaries, I get parameter estimates in the millions, which cannot be correct.
I am fairly certain that my model specification for the SEIR is correct, but after staring at this code and trying to resolve this issue for a week, I'm open to any input on how to make the fitting work.
Thanks,
JJ
I have a dataset like this
df
x y
7.3006667 -0.14383333
-0.8983333 0.02133333
2.7953333 -0.07466667
and I would like to fit an exponential function of the form y = a*exp(b*x).
This is what I tried, together with the errors I get:
f <- function(x,a,b) {a * exp(b * x)}
st <- coef(nls(log(y) ~ log(f(x, a, b)), df, start = c(a = 1, b = -1)))
Error in qr.qty(QR, resid) : NA/NaN/Inf in foreign function call (arg 5)
In addition: Warning messages:
1: In log(y) : NaNs produced
2: In log(y) : NaNs produced
fit <- nls(y ~ f(x, a, b), data = df, start = list(a = st[1], b = st[2]))
Error in nls(y ~ exp(a + b * x), data = df, start = list(a = st[1], :
singular gradient
I believe it has to do with the fact that the log is not defined for negative numbers but I don't know how to solve this.
I'm having trouble seeing the problem here.
f <- function(x, a, b) {a * exp(b * x)}
fit <- nls(y ~ f(x, a, b), data = df, start = c(a = 1, b = 1))
summary(fit)$coefficients
#      Estimate Std. Error    t value  Pr(>|t|)
# a -0.02285668 0.03155189 -0.7244157 0.6008871
# b  0.25568987 0.19818736  1.2901422 0.4197729
plot(y ~ x, df)
curve(predict(fit, newdata = data.frame(x)), add = TRUE)
The coefficients are very poorly estimated, but that's not surprising: you have two parameters and three data points.
As to why your code fails: the first call to nls(...) generates an error, so st is never set to anything (although it may have a value from some earlier code). Then you try to use that in the second call to nls(...).
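To make the first failure concrete (a minimal sketch using the three points from the question): log() returns NaN for the negative y values, so the log-transformed fit never gets valid residuals to work with.

df <- data.frame(x = c(7.3006667, -0.8983333, 2.7953333),
                 y = c(-0.14383333, 0.02133333, -0.07466667))
log(df$y)   # NaN for the two negative y values -- hence the "NaNs produced" warnings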
When doing constrained optimization using the constrOptim function, I sometimes get the following error message:
Error in optim(theta.old, fun, gradient, control = control, method = method, :
initial value in 'vmmin' is not finite
Example
x <- c(-0.2496881061155757641767394261478330008685588836669921875,
0.0824038146359631351600683046854101121425628662109375,
0.25000000111421105675191256523248739540576934814453125)
nw <- length(x)
ui <- diag(1, nrow = nw)
ui <- rbind(ui, rep(0, nw))
ui[cbind(2:(nw + 1), 1:nw)] <- -1
ci <- rep(-0.8 / (nw + 1), nw + 1)
constrOptim(theta = rep(0, nw), f = function(theta) mean((theta - x)^2),
grad = function(theta) 2 * (theta - x), ui = ui, ci = ci,
method = "BFGS")
What I know
The problem occurs during the iteration inside constrOptim, when the intermediate result comes so close to the boundary that almost all points evaluated by the BFGS optimizer (excluding the initial point) are NaN. In this case, BFGS will sometimes return an optimal value of NaN and a corresponding minimizing parameter outside the constraint set.
In constrOptim, the objective function fed to BFGS is given by
R <- function(theta, theta.old, ...) {
ui.theta <- ui %*% theta
gi <- ui.theta - ci
if (any(gi < 0)) {
return(NaN)
}
gi.old <- ui %*% theta.old - ci
bar <- sum(gi.old * log(gi) - ui.theta)
if (!is.finite(bar))
bar <- -Inf
f(theta, ...) - mu * bar
}
My question
It seems to me that the obvious solution to the problem is to simply return sign(mu) * Inf instead of NaN if there are any gi < 0, but could this fix lead to other problems?
After normalizing the gradient properly
constrOptim(theta = rep(0, nw), f = function(theta) mean((theta - x)^2),
grad = function(theta) 2 / nw * (theta - x), ui = ui, ci = ci,
method = "BFGS")
I can no longer replicate the problem. It seems that the issue was caused by the wrong weighting of the gradient of the objective function and the gradient of the logarithmic barrier term in the internal gradient.
However, I still think that returning Inf outside the boundary would be more robust than returning NaN.
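A quick way to catch this kind of mismatch (a sketch, assuming the numDeriv package is installed) is to compare the analytical gradient with a numerical one at the starting point:

library(numDeriv)                                 # used only for the numerical gradient
f  <- function(theta) mean((theta - x)^2)
gr <- function(theta) 2 / nw * (theta - x)
max(abs(gr(rep(0, nw)) - grad(f, rep(0, nw))))    # ~0 when the analytical gradient is scaled correctly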
Given:
set.seed(1001)
outcome<-rnorm(1000,sd = 1)
covariate<-rnorm(1000,sd = 1)
log-likelihood of normal pdf:
loglike <- function(par, outcome, covariate){
  cov <- as.matrix(cbind(1, covariate))
  xb <- cov * par
  (-1/2 * sum((outcome - xb)^2))
}
optimize:
opt.normal <- optim(par = 0.1,fn = loglike,outcome=outcome,cov=covariate, method = "BFGS", control = list(fnscale = -1),hessian = TRUE)
However, I get different results when running a simple OLS. Maximizing the log-likelihood and minimizing the OLS criterion should give me similar estimates, so I suppose there is something wrong with my optimization.
summary(lm(outcome~covariate))
Umm several things... Here's a proper working likelihood function (with names x and y):
loglike <- function(par, x, y) {
  cov <- cbind(1, x); xb <- cov %*% par; (-1/2) * sum((y - xb)^2)
}
Note the use of the matrix multiplication operator %*%.
You were also running it with only a single value in par, so not only was your loglike broken by the element-by-element multiplication, optim was also estimating just one parameter instead of two.
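To see concretely what goes wrong with the * operator (a minimal sketch using the covariate defined above; b is just an illustrative coefficient vector):

cov <- cbind(1, covariate)
b <- c(0.1, 0.1)
dim(cov * b)     # element-wise: b is recycled down the rows, result is still a 1000 x 2 matrix
dim(cov %*% b)   # matrix product: a 1000 x 1 column of fitted values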
Now compare optimiser parameters with lm coefficients:
opt.normal <- optim(par = c(0.1,0.1),fn = loglike,y=outcome,x=covariate, method = "BFGS", control = list(fnscale = -1),hessian = TRUE)
opt.normal$par
[1] 0.02148234 -0.09124299
summary(lm(outcome~covariate))$coeff
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.02148235 0.03049535 0.7044466 0.481319029
covariate -0.09124299 0.03049819 -2.9917515 0.002842011
shazam.
Helpful hints: create data for which you know the right answer, e.g. x = 1:10; y = rnorm(10) + (1:10), so you know the slope is 1 and the intercept 0. Then you can easily see whether your estimates are in the right ballpark. Also, run your loglike function on its own to see if it behaves as you expect.
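For instance, a quick check along those lines (a small sketch; the seed is arbitrary and exact numbers will vary):

set.seed(42)                       # arbitrary seed, just for reproducibility
x <- 1:10
y <- rnorm(10) + (1:10)            # true intercept ~0, true slope ~1
opt.check <- optim(par = c(0, 1), fn = loglike, y = y, x = x,
                   method = "BFGS", control = list(fnscale = -1))
opt.check$par                      # should be close to...
coef(lm(y ~ x))                    # ...the OLS coefficients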
Maybe you will find it useful to see the difference between these two methods in my code. I programmed it the following way.
# hprice1 is assumed to be Wooldridge's house-price data set (available, e.g., from the wooldridge package)
data.matrix <- as.matrix(hprice1[, c("assess", "bdrms", "lotsize", "sqrft", "colonial")])
loglik <- function(p, z) {
  beta <- p[1:5]
  sigma <- p[6]
  y <- log(data.matrix[, 1])
  eps <- (y - beta[1] - z[, 2:5] %*% beta[2:5])
  -nrow(z)*log(sigma) - 0.5*sum((eps/sigma)^2)
}
p0 <- c(5, 0, 0, 0, 0, 2)
m <- optim(p0, loglik, method = "BFGS", control = list(fnscale = -1, trace = 10),
           hessian = TRUE, z = data.matrix)
rbind(m$par, sqrt(diag(solve(-m$hessian))))
And for the lm() method I find this
m.ols <- lm(log(assess)~bdrms+lotsize+sqrft+colonial,data=hprice1)
summary(m.ols)
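To put the two side by side (a small sketch reusing m and m.ols from above; the standard errors will differ slightly because the MLE variance estimate has no degrees-of-freedom correction):

rbind(mle = m$par[1:5], ols = coef(m.ols))             # point estimates should agree closely
rbind(mle = sqrt(diag(solve(-m$hessian)))[1:5],
      ols = summary(m.ols)$coefficients[, 2])          # standard errors should be similar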
Also, if you would like to estimate the elasticity of the assessed value with respect to lotsize, or calculate a 95% confidence interval for this parameter, you could use the following:
elasticity.at.mean <- mean(hprice1$lotsize) * m$par[3]
var.coefficient <- solve(-m$hessian)[3,3]
var.elasticity <- mean(hprice1$lotsize)^2 * var.coefficient
# upper bound
elasticity.at.mean + qnorm(0.975)* sqrt(var.elasticity)
# lower bound
elasticity.at.mean + qnorm(0.025)* sqrt(var.elasticity)
A simpler example of the optim method is given below, for a binomial distribution.
loglik1 <- function(p, n, n.f){
  n.f*log(p) + (n - n.f)*log(1 - p)
}
m <- optim(c(pi=0.5),loglik1,control=list(fnscale=-1),
n=73,n.f=18)
m
m <- optim(c(pi=0.5),loglik1,method="BFGS",hessian=TRUE,
control=list(fnscale=-1),n=73,n.f=18)
m
pi.hat <- m$par
# numerical calculation of the s.d.
rbind(pi.hat=pi.hat,sd.pi.hat=sqrt(diag(solve(-m$hessian))))
# analytical
rbind(pi.hat=18/73,sd.pi.hat=sqrt((pi.hat*(1-pi.hat))/73))
Or this code for the normal distribution.
loglik1 <- function(p, z){
  mu <- p[1]
  sigma <- p[2]
  -(length(z)/2)*log(sigma^2) - sum(z^2)/(2*sigma^2) +
    (mu*sum(z)/sigma^2) - (length(z)*mu^2)/(2*sigma^2)
}
m <- optim(c(mu=0,sigma2=0.1),loglik1,
control=list(fnscale=-1),z=aex)
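As a sanity check (assuming aex is a numeric data vector and the optimiser has converged), the fitted values should be close to the sample mean and the maximum-likelihood standard deviation:

rbind(optim      = m$par,
      analytical = c(mean(aex), sqrt(mean((aex - mean(aex))^2))))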