I am writing a delay differential equation in deSolve (R), and I am getting an error message I am unsure how to solve.
So for some background. I have a system with 12 differential equations, and 3 of those have a delay. I was successful in writing the system without deSolve, but I want to use deSolve because it gives me opportunity to easily work with other methods that are not a fixed-step euler.
However, now I put it in deSolve, and I am using the general solver for delay differential equations (dede), I get an error.
This is the delay and the differential equation that the error is about:
lag2=ifelse(t-tau2<0, 0, e2(lagvalue(t-tau2,3))*lagvalue(t-tau2,8))
dn9dt=lag2-IP2*ethaP2M*P2-mu2*sf0B(B2)*P2-DNB(N2,B2)*P2+DD*PD
The first and third delay differential equations are done the same as this one and don't seem to have an error. The error is:
Error in lagvalue(t - tau2, 3) : illegal input in lagvalue - lag, 0, too large, at time = 15.945
Important to note is that the delay (tau2) is 16 in this case, and the error occurs sligtly before time = 16.
I have already tried to change t-tau2<0 to t-tau2<=0, but this doesn't help, and I have tried to increase the size of the history array (control=list(mxhist = 1e6)), which also didn't help. I have also tried to rewrite the lag a couple of times, but everytime I get the same error.
I have tried searching online, but I can hardly find anything on dede in deSolve, so I hope someone here might be able to help.
The question did not contain a full reproducible example, but the error can be reproduced if a simulation is run with a large number of time steps and the history array is too small. The following example is an adapted version from the ?dede help page:
library("deSolve")
derivs <- function(t, y, parms) {
#cat(t, "\n") # uncomment this to see when the error occurs
lag1 <- ifelse(t - tau1 < 0, 0.1, lagvalue(t - tau1, 2))
lag2 <- ifelse(t - tau2 < 0, 0.1, lagvalue(t - tau2, 2))
dy1 <- -y[1] * lag1 + lag2
dy2 <- y[1] * lag1 - y[2]
dy3 <- y[2] - lag2
list(c(dy1, dy2, dy3))
}
yinit <- c(x=5, y=0.1, z=1)
times <- seq(0, 40, by = 0.1)
tau1 <- 1
tau2 <- 10
It runs successfuly with:
yout <- dede(y = yinit, times = times, func = derivs, parms = NULL)
but if we increase the number of time steps:
times <- seq(0, 40, by = 1e-3)
yout <- dede(y = yinit, times = times, func = derivs, parms = NULL)
we can get an error:
Error in lagvalue(t - tau2, 2) :
illegal input in lagvalue - lag, 0, too large, at time = 9.99986
It occurs after exceeding the threshold time (uncomment the cat above) when the algorithm starts to interpolate in the history array. As a solution, increase the history buffer:
yout <- dede(y = yinit, times = times, func = derivs,
parms = NULL, control = list(mxhist = 1e5))
It may also help to reduce the number of time steps.
Related
I'm currently developing a model to integrate microbiology and geochemistry that uses lsoda to solve a great number of differential equations. The model is far too large to be posted here because it's made of several modules, but I have some very weird happening.
These are my initial values
enter image description here
I have initialised them as zero because I don't want any kind of microbe, just to check how the chemistry would change without microbes. However after 5 or 6 steps I start seeing values that are different from zero in some of my microbial variables:
enter image description here.
I wonder if maybe lsoda is doing some kind of round and that's why I get these values, because otherwise I cannot explain where these values are popping out from. If this is the case, does anyone know how to stop this kind of round-ups?
Short answer
No, lsoda does not create "fictional" values.
Detailed Answer
As #Ben Bolker stated, "floating point arithmetics is intrinsically imprecise". I want to add that all solvers including lsoda compute approximations, and that nonlinear models can magnify errors.
You can check your model, if all derivatives are exactly zero by calling the model function separately, without the solver, see the following example of a Lorenz system, where all states are set to zero:
library(deSolve)
Lorenz <- function(t, state, parameters) {
with(as.list(c(state, parameters)), {
dX <- a * X + Y * Z
dY <- b * (Y - Z)
dZ <- -X * Y + c * Y - Z
list(c(dX, dY, dZ))
})
}
parameters <- c(a = -8/3, b = -10, c = 28)
state <- c(X = 0, Y = 0, Z = 0)
times <- seq(0, 500, by = 1)
range(Lorenz(0, state, parameters))
that gives
[1] 0 0
Counter example
If we modify the model, so that one derivative is slightly different from zero due to rounding error, e.g. by using Ben's example:
Fancy <- function(t, state, parameters) {
with(as.list(c(state, parameters)), {
dX <- a * X + Y * Z + 27 * (sqrt(X + Y + Z + 2)^2 - 2)
dY <- b * (Y - Z)
dZ <- -X * Y + c * Y - Z
list(c(dX, dY, dZ))
})
}
range(Fancy(0, state, parameters))
then it gives:
[1] 0.000000e+00 1.199041e-14
Depending on tolerances, such a model may then fluctuate or even blow up. Surprisingly, a smaller tolerance may sometimes be worse:
out1 <- ode(y = state, times = times, func = Lorenz, parms = parameters, atol=1e-6)
out2 <- ode(y = state, times = times, func = Fancy, parms = parameters, atol=1e-6)
out3 <- ode(y = state, times = times, func = Fancy, parms = parameters, atol=1e-8)
par(mfrow=c(3,1))
plot(out1[,1:2], type="l", main="Lorenz, derivatives = 0")
plot(out2[,1:2], type="l", main="Fancy, small error in derivatives, atol=1e-6")
plot(out3[,1:2], type="l", main="Fancy, small error in derivatives, atol=1e-8")
Conclusion
lsodaworks well within the limitations of floating point arithmetics.
models should be carefully designed and tested.
I am trying to simulate the transmission of viruses in a population using the function ode from the deSolve package. The basic of my model is a SIR model and I posted a much simpler demo of my model here, which consists of only three states S(susceptible), I(infectious) and R(recovered). Each state is represented by a m*n matrix in my code, since I have m age groups and n subpopulations in my population.
The problem is: during the simulation period, there will be several vaccination activities that transfer people in state S to state I. Each vaccination activity is characterized by a begin date, an end date, its coverage rate and duration. What I want to do is once the time t falls into the interval of begin date and end date of one vaccination activity, the code calculates the effective vaccination rate (also a m*n matrix, based on coverage rate and duration) and times it with S (m*n matrix), to get a matrix of people transited to state I. Right now, I am using if() to decide if time t is between a begin date and a end date:
#initialize the matrix of effective vaccination rate
irrate_matrix = matrix(data = rep(0, m*n), nrow = m, ncol = n)
for (i in 1:length(tbegin)){
if (t>=tbegin[i] & t<=tend[i]){
for (j in 1:n){
irrate_matrix[, j] = -log(1-covir[(j-1)*length(tbegin)+i])/duration[i]
}
}
}
Here, irrate_matrix is the m*n effective vaccination rate matrix, m = 2 is the number of age groups, n = 2 is the number of subpopulations, tbegin = c(5, 20, 35) is the begin date of 3 vaccination activities, tend = c(8, 23, 38) is the end date of 3 vaccination activities, covir = c(0.35, 0.25, 0.25, 0.225, 0.18, 0.13) is the coverage rate of each vaccination for each subpopulation (e.g., covir[1] = 0.35 is the coverage rate of the first vaccination for subpopulation1, while covir[4] = 0.225 is the coverage rate of the first vaccination for subpopulation2) and duration = c(4, 4, 4) is the duration of each vaccination (in days).
After calculating irrate_matrix, I take it into derivatives and therefore I have:
dS = as.matrix(b*N) - as.matrix(irrate_matrix*S) - as.matrix(mu*S)
dI = as.matrix(irrate_matrix*S) - as.matrix(gammaS*I) - as.matrix(mu*I)
dR = as.matrix(gammaS*I) - as.matrix(mu*R)
I want to do a simulation from day 0 to day 50, by 1-day step, thus:
times = seq(0, 50, 1)
The current issue with my code is: every time the time t comes to a time point close to a tbegin[i] or tend[i], the simulation becomes much slower since it iterates at this time point for much more rounds than at any other time point. For example, once the time t comes to tbegin[1] = 5, the model iterates at time point 5 for many rounds. I attached screenshots from printing out those iterations (screenshot1 and screenshot2). I find this is why my bigger model takes a very long running time now.
I have tried using the "events" function of deSolve mentioned by tpetzoldt in this question stackoverflow: change the value of a parameter as a function of time. However, I found it's inconvenient for me to change a matrix of parameters and change it every time there is a vaccination activity.
I am looking for solutions regarding:
How to change my irrate_matrix to non-zero matrix when there is a vaccination activity and let it be zero matrix when there is no vaccination? (it has to be calculated for each vaccination)
At the same time, how to make the code run faster by avoiding iterating at any tbegin[i] or tend[i] for many rounds? (I think I should not use if() but I do not know what I should do with my case)
If I need to use "forcing" or "events" function, could you please also tell me how to have multiple "forcing"/"events" in the model? Right now, I have had an "events" used in my bigger model to introduce a virus to the population, as:
virusevents = data.frame(var = "I1", time = 2, value = 1, method = "add")
Any good idea is welcome and directly providing some codes is much appreciated! Thank you in advance!
For reference, I post the whole demo here:
library(deSolve)
##################################
###(1) define the sir function####
##################################
sir_basic <- function (t, x, params)
{ # retrieve initial states
S = matrix(data = x[(0*m*n+1):(1*m*n)], nrow = m, ncol = n)
I = matrix(data = x[(1*m*n+1):(2*m*n)], nrow = m, ncol = n)
R = matrix(data = x[(2*m*n+1):(3*m*n)], nrow = m, ncol = n)
with(as.list(params), {
N = as.matrix(S + I + R)
# print out current iteration
print(paste0("Total population at time ", t, " is ", sum(N)))
# calculate irrate_matrix by checking time t
irrate_matrix = matrix(data = rep(0, m*n), nrow = m, ncol = n)
for (i in 1:length(tbegin)){
if (t>=tbegin[i] & t<=tend[i]){
for (j in 1:n){
irrate_matrix[, j] = -log(1-covir[(j-1)*length(tbegin)+i])/duration[i]
}
}
}
# derivatives
dS = as.matrix(b*N) - as.matrix(irrate_matrix*S) - as.matrix(mu*S)
dI = as.matrix(irrate_matrix*S) - as.matrix(gammaS*I) - as.matrix(mu*I)
dR = as.matrix(gammaS*I) - as.matrix(mu*R)
derivatives <- c(dS, dI, dR)
list(derivatives)
})
}
##################################
###(2) characterize parameters####
##################################
m = 2 # the number of age groups
n = 2 # the number of sub-populations
tbegin = c(5, 20, 35) # begin dates
tend = c(8, 23, 38) # end dates
duration = c(4, 4, 4) # duration
covir = c(0.35, 0.25, 0.25, 0.225, 0.18, 0.13) # coverage rates
b = 0.0006 # daily birth rate
mu = 0.0006 # daily death rate
gammaS = 0.05 # transition rate from I to R
parameters = c(m = m, n = n,
tbegin = tbegin, tend = tend, duration = duration, covir = covir,
b = b, mu = mu, gammaS = gammaS)
##################################
#######(3) initial states ########
##################################
inits = c(
S = c(20000, 40000, 10000, 20000),
I = rep(0, m*n),
R = rep(0, m*n)
)
##################################
#######(4) run simulations########
##################################
times = seq(0, 50, 1)
traj <- ode(func = sir_basic,
y = inits,
parms = parameters,
times = times)
plot(traj)
Element wise operations are the same for matrices and vectors, so the as.matrix conversions are redundant, as no true matrix multiplication is used. Same with the rep: the zero is recycled anyway.
In effect, CPU time reduces already to 50%. In contrast, use of an external forcing with approxTime instead of the inner if and for made the model slower (not shown).
Simplified code
sir_basic2 <- function (t, x, params)
{ # retrieve initial states
S = x[(0*m*n+1):(1*m*n)]
I = x[(1*m*n+1):(2*m*n)]
R = x[(2*m*n+1):(3*m*n)]
with(as.list(params), {
N = S + I + R
# print out current iteration
#print(paste0("Total population at time ", t, " is ", sum(N)))
# calculate irrate_matrix by checking time t
irrate_matrix = matrix(data = 0, nrow = m, ncol = n)
for (i in 1:length(tbegin)){
if (t >= tbegin[i] & t <= tend[i]){
for (j in 1:n){
irrate_matrix[, j] = -log(1-covir[(j-1) * length(tbegin)+i])/duration[i]
}
}
}
# derivatives
dS = b*N - irrate_matrix*S - mu*S
dI = irrate_matrix*S - gammaS*I - mu*I
dR = gammaS*I - mu*R
list(c(dS, dI, dR))
})
}
Benchmark
Each model version is run 10 times. Model sir_basic is the original implementation, where print line was disabled for a fair comparison.
system.time(
for(i in 1:10)
traj <- ode(func = sir_basic,
y = inits,
parms = parameters,
times = times)
)
system.time(
for(i in 1:10)
traj2 <- ode(func = sir_basic2,
y = inits,
parms = parameters,
times = times)
)
plot(traj, traj2)
summary(traj - traj2)
I observed another considerable speedup, when I use method="adams" instead of the default lsoda solver, but this may differ for your full model.
My question relates to the use of R for the derivation of maximum likelihood estimates of parameters when a probability distributions is expressed in the form of an infinite sum, such as the one below due to Rao, Girija et al.
I wanted to see if I could reproduce the maximum likelihood estimates obtained by these authors (who used Matlab, rather than R) when the model is applied to a given set of data. My attempt is given below, although this throws up several warnings that "longer object length is not a multiple of shorter object length". I know why I am getting this warning, but I do not know how to remedy it. How can I edit my code to overcome this?
Also, is there a better way to handle infinite sums? Here I'm just using an arbitrary large number for n (1000).
library(bbmle)
svec <- list(c=1,lambda=1)
x <- scan(textConnection("0.1396263 0.1570796 0.2268928 0.2268928 0.2443461 0.3141593 0.3839724 0.4712389 0.5235988 0.5934119 0.6632251 0.6632251 0.6981317 0.7679449 0.7853982 0.8203047 0.8377580 0.8377580 0.8377580 0.8377580 0.8726646 0.9250245 0.9773844 0.9948377 1.0122910 1.0122910 1.0646508 1.0995574 1.1170107 1.1170107 1.1170107 1.1344640 1.1344640 1.1868239 1.2217305 1.2740904 1.3613568 1.3613568 1.3613568 1.4486233 1.4486233 1.5358897 1.5358897 1.5358897 1.5707963 1.6057029 1.6057029 1.6231562 1.6580628 1.6755161 1.7104227 1.7453293 1.7976891 1.8500490 1.9722221 2.0594885 2.4085544 2.6703538 2.6703538 2.7052603 3.5604717 3.7524579 3.8920842 3.9444441 4.1364303 4.1538836 4.2411501 4.2586034 4.3633231 4.3807764 4.4854962 4.6774824 4.9741884 5.5676003 5.9864793 6.1086524"))
dL <- function(x, c,lambda,n = 1000,log=TRUE) {
k <- 0:n
r <- log(sum(lambda*c*(x+2*k*pi)^(-c-1)*(exp(-(x+2*k*pi)^(-c))^(lambda))))
if (log) return(r) else return(exp(r))
}
dat <- data.frame(x)
m1 <- mle2( x ~ dL(c,lambda),
data=dat,
start=svec,
control=list(parscale=unlist(svec)),
method="L-BFGS-B",
lower=c(0,0)
)
I suggest starting out with that algorithm and making a density function that can be tested for proper behavior by integrating over its range of definition, (c(0, 2*pi). You are calling it a "probability function" but that is a term that I associate with CDF's rather than density distributions (PDF's):
dL <- function(x, c=1,lambda=1,n = 1000, log=FALSE) {
k <- 0:n
r <- sum(lambda*c*(x+2*k*pi)^(-c-1)*(exp(-(x+2*k*pi)^(-c))^(lambda)))
if (log) {log(r) }
}
vdL <- Vectorize(dL)
integrate(vdL, 0,2*pi)
#0.999841 with absolute error < 9.3e-06
LL <- function(x, c, lambda){ -sum( log( vdL(x, c, lambda))) }
(I think you were trying to pack too much into your log-likelihood function so I decide to break apart the steps.)
When I ran that version I got a warning message from the final mle2 step that I didn't like and I thought it might be the case that this density function was occasionally returning negative values, so this was my final version:
dL <- function(x, c=1,lambda=1,n = 1000) {
k <- 0:n
r <- max( sum(lambda*c*(x+2*k*pi)^(-c-1)*(exp(-(x+2*k*pi)^(-c))^(lambda))), 0.00000001)
}
vdL <- Vectorize(dL)
integrate(vdL, 0,2*pi)
#0.999841 with absolute error < 9.3e-06
LL <- function(x, c, lambda){ -sum( log( vdL(x, c, lambda))) }
(m0 <- mle2(LL,start=list(c=0.2,lambda=1),data=list(x=x)))
#------------------------
Call:
mle2(minuslogl = LL, start = list(c = 0.2, lambda = 1), data = list(x = x))
Coefficients:
c lambda
0.9009665 1.1372237
Log-likelihood: -116.96
(The warning and the warning-free LL numbers were the same.)
So I guess I think you were attempting to pack too much into your definition of a log-likelihood function and got tripped up somewhere. There should have been two summations, one for the density approximation and a second one for the summation of the log-likelihood. The numbers in those summations would have been different, hence the error you were seeing. Unpacking the steps allowed success at least to the extent of not throwing errors. I'm not sure what that density represents and cannot verify correctness.
As far as the question of whether there is a better way to approximate an infinite series, the answer hinges on what is known about the rate of convergence of the partial sums, and whether you can set up a tolerance value to compare successive values and stop calculations after a smaller number of terms.
When I look at the density, it makes me wonder if it applies to some scattering process:
curve(vdL(x, c=.9, lambda=1.137), 0.00001, 2*pi)
You can examine the speed of convergence by looking at the ratios of successive terms. Here's a function that does that for the first 10 terms at an arbitrary x:
> ratios <- function(x, c=1, lambda=1) {lambda*c*(x+2*(1:11)*pi)^(-c-1)*(exp(-(x+2*(1:10)*pi)^(-c))^(lambda))/lambda*c*(x+2*(0:10)*pi)^(-c-1)*(exp(-(x+2*(0:10)*pi)^(-c))^(lambda)) }
> ratios(0.5)
[1] 1.015263e-02 1.017560e-04 1.376150e-05 3.712618e-06 1.392658e-06 6.351874e-07 3.299032e-07 1.880054e-07
[9] 1.148694e-07 7.409595e-08 4.369854e-08
Warning message:
In lambda * c * (x + 2 * (1:11) * pi)^(-c - 1) * (exp(-(x + 2 * :
longer object length is not a multiple of shorter object length
> ratios(0.05)
[1] 1.755301e-08 1.235632e-04 1.541082e-05 4.024074e-06 1.482741e-06 6.686497e-07 3.445688e-07 1.952358e-07
[9] 1.187626e-07 7.634088e-08 4.443193e-08
Warning message:
In lambda * c * (x + 2 * (1:11) * pi)^(-c - 1) * (exp(-(x + 2 * :
longer object length is not a multiple of shorter object length
> ratios(0.5)
[1] 1.015263e-02 1.017560e-04 1.376150e-05 3.712618e-06 1.392658e-06 6.351874e-07 3.299032e-07 1.880054e-07
[9] 1.148694e-07 7.409595e-08 4.369854e-08
Warning message:
In lambda * c * (x + 2 * (1:11) * pi)^(-c - 1) * (exp(-(x + 2 * :
longer object length is not a multiple of shorter object length
That looks like pretty rapid convergence to me, so I'm guessing that you could use only the first 20 terms and get similar results. With 20 terms the results look like:
> integrate(vdL, 0,2*pi)
0.9924498 with absolute error < 9.3e-06
> (m0 <- mle2(LL,start=list(c=0.2,lambda=1),data=list(x=x)))
Call:
mle2(minuslogl = LL, start = list(c = 0.2, lambda = 1), data = list(x = x))
Coefficients:
c lambda
0.9542066 1.1098169
Log-likelihood: -117.83
Since you never attempt to interpret a LL in isolation but rather look at differences, I'm guessing that the minor difference will not affect your inferences adversely.
I'm trying to fit an ODE model to some data and solve for the values of the parameters in the model.
I know there is a package called FME in R which is designed to solve this kind of problem. However, when I tried to write the code like the manual of this package, the program could not run with the following traceback information:
Error in lsoda(y, times, func, parms, ...) : illegal input detected before taking any integration steps - see written message
The code is the following:
x <- c(0.1257,0.2586,0.5091,0.7826,1.311,1.8636,2.7898,3.8773)
y <- c(11.3573,13.0092,15.1907,17.6093,19.7197,22.4207,24.3998,26.2158)
time <- 0:7
# Initial Values of the Parameters
parms <- c(r = 4, b11 = 1, b12 = 0.2, a111 = 0.5, a112 = 0.1, a122 = 0.1)
# Definition of the Derivative Functions
# Parameters in pars; Initial Values in y
derivs <- function(time, y, pars){
with(as.list(c(pars, y)),{
dx <- r + b11*x + b12*y - a111*x^2 - a122*y^2 - a112*x*y
dy <- r + b12*x + b11*y - a122*x^2 - a111*y^2 - a112*x*y
list(c(dx,dy))
})
}
initial <- c(x = x[1], y = y[1])
data <- data.frame(time = time, x = x, y = y)
# Cost Computation, the Object Function to be Minimized
model_cost <- function(pars){
out <- ode(y = initial, time = time, func = derivs, parms = pars)
cost <- modCost(model = out, obs = data, x = "time")
return(cost)
}
# Model Fitting
model_fit <- modFit(f = model_cost, p = parms, lower = c(-Inf,rep(0,5)))
Is there anyone that knows how to use the FME package and fix the problem?
Your code-syntax is right and it works until the last line.
you can check your code with
model_cost(parms)
This works fine and you can see with
model_cost(parms)$model
that your "initial guess" is far away from the observed data (compare "obs" and "mod"). Perhaps here is a failure so that the fitting procedure will not reach the observed data.
So much for the while ... I also checked different methods with parameter "methods = ..." but still does not work.
Best wishes,
Johannes
edit: If you use:
model_fit <- modFit(f = model_cost, p = parms)
without any lower bounds, then you will get an result (even there are warnings), but then a112 is negative what you wanted to omit.
When doing constrained optimization using the constrOptim function, I sometimes get the following error message:
Error in optim(theta.old, fun, gradient, control = control, method = method, :
initial value in 'vmmin' is not finite
Example
x <- c(-0.2496881061155757641767394261478330008685588836669921875,
0.0824038146359631351600683046854101121425628662109375,
0.25000000111421105675191256523248739540576934814453125)
nw <- length(x)
ui <- diag(1, nrow = nw)
ui <- rbind(ui, rep(0, nw))
ui[cbind(2:(nw + 1), 1:nw)] <- -1
ci <- rep(-0.8 / (nw + 1), nw + 1)
constrOptim(theta = rep(0, nw), f = function(theta) mean((theta - x)^2),
grad = function(theta) 2 * (theta - x), ui = ui, ci = ci,
method = "BFGS")
What I know
The problem occurs during the iteration inside constrOptim, when the result comes so close to the boundary that almost all point evaluated by the BFGS optimizer are NaNs (excluding the initial point). In this case, BFGS will sometimes return an optimal value of NaN and a corresponding minimizing parameter outside the constraint set.
In constrOptim, the objective function fed to BFGS is given by
R <- function(theta, theta.old, ...) {
ui.theta <- ui %*% theta
gi <- ui.theta - ci
if (any(gi < 0)) {
return(NaN)
}
gi.old <- ui %*% theta.old - ci
bar <- sum(gi.old * log(gi) - ui.theta)
if (!is.finite(bar))
bar <- -Inf
f(theta, ...) - mu * bar
}
My question
It seems to me that the obvious solution to the problem is to simply return sign(mu) * Inf instead of NaN if there are any gi < 0, but could this fix lead to other problems?
After normalizing the gradient properly
constrOptim(theta = rep(0, nw), f = function(theta) mean((theta - x)^2),
grad = function(theta) 2 / nw * (theta - x), ui = ui, ci = ci,
method = "BFGS")
I can no longer replicate the problem. It seems that the issue was caused by the wrong weighting of the gradient of the objective function and the gradient of the logarithmic barrier term in the internal gradient.
However, I still think that returning Inf outside the boundary would be more robust than returning NaN.