I am trying to estimate parameters for an ODE model in R using nls (and later nlme). My test code gives me error messages.
library(deSolve)
seed=2423
dat2<-data.frame(days=(runif(20)+1)*10, X1=runif(20), X2=runif(20))
dat2$y<-0.4*exp(dat2$X1)+0.6*exp(dat2$X2)+rnorm(20, sd=0.3)
# example intentionally simple; I would usually solve it analytically
#***************************************************
#*Model definition
#***************************************************
decomp<-function(t, state, parameters){
  with(as.list(c(state, parameters)), {
    dX1<-a1*X1
    dX2<-a2*X2#+a1*X1
    list(c(dX1, dX2))
  })
}
# Testing the code to demonstrate that it works
parameters<-c(a1=0.05,a2=0.05)
state<-c(X1=0.5, X2=0.5)
times<-seq(0,100, by=1)
out<-ode(y=state, times=times, func=decomp, parms=parameters)
out[100,]
#*****************************
#* Wrapper function to be passed to nls or nlme
#**************************************
calcdecom<-function(a1,a2,t,x1, x2)
{
  state<-c(X1=x1, X2=x2)
  times<-c(0,t)
  parameters<-c(a1=a1,a2=a2)
  out<-ode(y=state, times=times, func=decomp, parms=parameters)
  return(as.numeric(out[2,2]+out[2,3]))
}
# ******************** test
calcdecom(0.1,0.1,5,0.3,0.3)
test<-nls(y~calcdecom(a1, a2, days, X1, X2 ),
start=list(a1=0.02, a2=0.4), data=dat2)
My error message for the nls call is:
Error in lsoda(y, times, func, parms, ...) :
illegal input detected before taking any integration steps - see written message
Here is a possible approach. One remaining question is whether the data should really come from independent cases (simulations or measurements) rather than from a time series. If the former is intended, then ode must be called for each case separately. This can be done in a for-loop or with an apply function. Note also the correction of the set.seed call, the plot function and the reduced standard deviation in the data-generating process. It seems to me that the ODE model and the data-generating process do not match yet, so the fitted parameters differ. It would be nice if the OP could post a corrected version by editing the question.
library("deSolve")
## use set.seed to make example reproducible
set.seed(2423)
## simulated data
dat2 <- data.frame(
  days = (runif(20) + 1) + 10,
  X1 = runif(20),
  X2 = runif(20)
)
## reduced error for testing
dat2$y <- 0.4 * exp(dat2$X1) + 0.6 * exp(dat2$X2) + rnorm(20, sd = 0.1)
plot(dat2)
decomp <- function(t, state, parameters){
  with(as.list(c(state, parameters)), {
    dX1 <- a1 * X1
    dX2 <- a2 * X2 + a1 * X1
    list(c(dX1, dX2))
  })
}
## test the ode model to demonstrate that it works
parameters <- c(a1 = 0.05, a2 = 0.05)
state <- c(X1 = 0.5, X2 = 0.5)
times <- seq(0, 100, by = 1)
out <- ode(y = state, times = times, func = decomp, parms = parameters)
plot(out)
out[100,]
## Wrapper function to be passed to nls or nlme
calcdecom <- function(a1, a2, t, x1, x2) {
  ret <- numeric(length(t))
  parameters <- c(a1 = a1, a2 = a2)
  for (i in 1:length(t)) {
    times <- c(0, t[i])
    state <- c(X1 = x1[i], X2 = x2[i])
    out <- ode(y = state, times = times, func = decomp, parms = parameters)
    ret[i] <- out[2, 2] + out[2, 3]
  }
  return(ret)
}
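## optional sketch, not part of the original answer: the same wrapper written
## with an apply-type function (mapply) instead of the explicit for-loop,
## as mentioned in the text above; calcdecom_apply is a hypothetical name
calcdecom_apply <- function(a1, a2, t, x1, x2) {
  parameters <- c(a1 = a1, a2 = a2)
  mapply(function(ti, x1i, x2i) {
    out <- ode(y = c(X1 = x1i, X2 = x2i), times = c(0, ti),
               func = decomp, parms = parameters)
    out[2, 2] + out[2, 3]
  }, t, x1, x2)
}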
## test wrapper function
calcdecom(0.1, 0.1, 5, 0.3, 0.3)
## with time and state as vectors
calcdecom(0.1, 0.1, 1:10, dat2$X1, dat2$X2)
test <- nls(y ~ calcdecom(a1, a2, days, X1, X2 ),
start = list(a1 = 0.02, a2 = 0.4), data = dat2)
summary(test)
Instead of nls one may also consider modFit from package FME, which has more flexibility for this kind of model. Details are found in the package vignettes and the following JSS paper: https://doi.org/10.18637/jss.v033.i03
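For illustration, here is a minimal sketch of how modFit could be applied to this example; the setup (a cost function returning a vector of residuals, reusing calcdecom and dat2 from above) is my own assumption and not part of the original answer:
library(FME)
## hypothetical cost function: residuals between the data and the ODE-based prediction
cost_fme <- function(p) {
  dat2$y - calcdecom(p[["a1"]], p[["a2"]], dat2$days, dat2$X1, dat2$X2)
}
fit_fme <- modFit(f = cost_fme, p = c(a1 = 0.02, a2 = 0.4))
summary(fit_fme)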
I am trying to fit an SIRD model in R to real data. However, the observed values lie nowhere near the fitted curve. I can't understand what the error is or how to resolve it, but I have noticed that changing the value of "state" produces the error
DLSODA- Warning..Internal T (=R1) and H (=R2) are
such that in the machine, T + H = T on the next step
(H = step size). Solver will continue anyway.
In above message, R1 = 0.1, R2 = 9.94667e-21
Here is my entire code. Any help is greatly appreciated!
library(deSolve)
state<-c(S=10000,I=1000,R=5000,D=100)
parameters <- c(a=180,b=0.4,g=0.2)
eqn<-function(t, state, parameters) {
  with(as.list(c(state, parameters)),{
    dS <- -a*I*S
    dI <- a*I*S-g*I-b*I
    dR <- g*I
    dD <- b*I
    list(c(dS,dI,dR,dD))
  })
}
times <- seq(0.1,2.6,by=0.1)
out <- ode(y = state, times = times, func = eqn, parms = parameters)
out
plot(out)
library(FME)
data <- data.frame(
time = seq(0.1,2.6,0.1),
S=c(11417747943,11417733626,11417717809,11417702207,11417685587,11417670536,
11417652672,11417629493,11417603660,11417577979,11417550853,11417520318,
11417495673,11417466974,11417435119,11417399167,11417362265,11417326539,
11417286125,11417254482,11417226564,11417187020,11417143837,11417095924,
11417046477,11416989403),
I=c(3686,7062,4415,8040,7706,4316,8266,13947,13593,11207,13446,19114,5121,15400,
16658,15386,19766,21024,22426,10683,3958,15701,10290,23299,11340,29331),
R=c(9913,7193,11344,7467,8861,10671,9510,9138,12174,14400,13588,11314,19463,13165,
15098,20444,17019,14523,17874,20854,23820,23600,32641,24126,37821,27508),
D=c(54,57,56,88,50,48,87,84,58,70,92,99,58,132,95,111,112,166,108,102,139,
227,249,481,277,222)
)
cost <- function(p) {
  out <- ode(state, times, eqn, p)
  modCost(out, data, weight = "none")
}
fit <- modFit(f = cost, p = parameters)
summary(fit)
out1 <- ode(state, times, eqn, parameters)
out2 <- ode(state, times, eqn, coef(fit))
plot(out1, out2, obs=data, obspar=list(pch=16, col="red"))
Your code has several issues:
the order of magnitude of the state variables differs, so you need weight = "std" or weight = "mean"
the initial values of the state variables are far off. This is the most critical error. You may either set them manually to reasonable values (see below) or, even better, fit them; see the FME documentation for how this can be done (a brief sketch follows after the edit below).
the start parameters are far away from the optimum. While it is desirable that the algorithm converges to an optimum from arbitrary naive start values, this is rarely the case. Therefore, some careful consideration or trial and error is unavoidable.
the mass balance is violated, i.e. the sum of all four states changes over time. Check rowSums(data[-1]).
Here is an approach that handles part of the problem. The next step would then be to fix the mass balance and to include the initial states of the ODE model as parameters of the nonlinear optimization.
library(deSolve)
library(FME)
eqn<-function(t, state, parameters) {
  with(as.list(c(state, parameters)),{
    dS <- -a*I*S
    dI <- a*I*S - g*I - b*I
    dR <- g*I
    dD <- b*I
    list(c(dS,dI,dR,dD))
  })
}
data <- data.frame(
time = seq(0.1,2.6,0.1),
S=c(11417747943,11417733626,11417717809,11417702207,11417685587,11417670536,
11417652672,11417629493,11417603660,11417577979,11417550853,11417520318,
11417495673,11417466974,11417435119,11417399167,11417362265,11417326539,
11417286125,11417254482,11417226564,11417187020,11417143837,11417095924,
11417046477,11416989403),
I=c(3686,7062,4415,8040,7706,4316,8266,13947,13593,11207,13446,19114,5121,15400,
16658,15386,19766,21024,22426,10683,3958,15701,10290,23299,11340,29331),
R=c(9913,7193,11344,7467,8861,10671,9510,9138,12174,14400,13588,11314,19463,13165,
15098,20444,17019,14523,17874,20854,23820,23600,32641,24126,37821,27508),
D=c(54,57,56,88,50,48,87,84,58,70,92,99,58,132,95,111,112,166,108,102,139,
227,249,481,277,222)
)
state <- c(S=11417747943, I=5000, R=8000, D=50)
parameters <- c(a=1e-10, b=0.001, g=0.1)
times<-seq(0.1,2.6,by=0.01)
cost <- function(p) {
  out <- ode(state, times, eqn, p)
  modCost(out, data, weight = "mean")
}
fit <- modFit(f = cost, p = parameters)
summary(fit, corr=TRUE)
out2 <- ode(state, times, eqn, coef(fit))
plot(out2, obs=data, obspar=list(pch=16, col="red"), ylim=list(c(0, 2e10), c(0, 50000), c(0, 50000), c(0, 600)))
Edit
The following approach improves the fit by:
fixing the mass balance by setting the total population to be constant over time
rescaling the data to improve the stability of the optimization
guessing initial values from the data
It would (in theory) be even better to include the initial values in the optimization, but this would again lead to non-identifiability of the parameters due to the intrinsic characteristics of the given model and data. See twocomp_final.R for a related tutorial example.
Instead of rescaling the data, one may also consider adapting the control parameters of the optimizer(s) and of the ode function, or rescaling individual state variables differently. However, it is easiest here just to rescale the population to "million people".
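For reference, here is a hedged sketch of that alternative route (my own assumption, not part of the original answer), using the tolerance arguments of ode and the control list that modFit passes on to its default Levenberg-Marquardt optimizer, before continuing with the rescaling approach below:
cost_tol <- function(p) {
  ## tighter integration tolerances; the values are assumptions, adjust as needed
  out <- ode(state, times, eqn, p, atol = 1e-10, rtol = 1e-10)
  modCost(out, data, weight = "mean")
}
fit_tol <- modFit(f = cost_tol, p = parameters,
                  control = list(nprint = 1, maxiter = 300))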
## fix mass balance, i.e. make sum of all states constant
## an alternative would be an additional process in the model
## for migration and / or birth and natural death
Population <- rowSums(data[c("S", "I", "R", "D")])
data$S <- Population[1] - rowSums(data[c("I", "R", "D")])
## rescale state variables to numerically more convenient numbers
## here simply: million people
scaled_data <- cbind(
  time = data$time,
  data[c("S", "I", "R", "D")] * 1e-6
)
## guess initial values from data (of course a little bit subjective)
state <- c(
  S = scaled_data$S[1],
  I = mean(scaled_data$I[1:3]),
  R = mean(scaled_data$R[1:5]),
  D = mean(scaled_data$D[1:3])
)
## use good initial parameters by thinking and some trial and error
parameters <- c(a = 0.0001, b = 0.01, g = 1)
cost2 <- function(p) {
  out <- ode(state, times, eqn, p)
  modCost(out, scaled_data, weight = "mean")
}
## fit model, enable trace with option nprint
fit <- modFit(f = cost2, p = parameters, control = list(nprint = 1))
summary(fit, corr=TRUE)
out2 <- ode(state, times, eqn, coef(fit))
plot(out2, obs = scaled_data, obspar = list(pch = 16, col = "red"))
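As noted above, the initial states could in principle be included in the optimization. The following is only a rough sketch of that idea (my own assumption of the setup, not part of the original answer), and it is subject to the identifiability caveat mentioned earlier:
cost3 <- function(p) {
  ## initial states taken from the parameter vector (S0, I0, R0, D0 are hypothetical names)
  state0 <- c(S = p[["S0"]], I = p[["I0"]], R = p[["R0"]], D = p[["D0"]])
  out <- ode(state0, times, eqn, p[c("a", "b", "g")])
  modCost(out, scaled_data, weight = "mean")
}
p_start <- c(coef(fit), S0 = scaled_data$S[1], I0 = scaled_data$I[1],
             R0 = scaled_data$R[1], D0 = scaled_data$D[1])
fit3 <- modFit(f = cost3, p = p_start)
summary(fit3, corr = TRUE)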
I need to implement a logistic regression manually, using the score/GMM approach, without the use of glm. This is because at later stages the model will be much more complicated. Currently I am running into a problem where, for the logistic regression, the optimization procedure is very dependent on the initial point. To illustrate, here is my code using an online dataset. More details about the procedure are in the comments:
library(data.table)
library(nleqslv)
library(Matrix)
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
data_analysis<-data.table(mydata)
data_analysis[,constant:=1]
#Likelihood function for logit
#The logistic regression will regress the binary variable
#admit on a constant and the variable gpa
LL <- function(beta){
  beta=as.numeric(beta)
  data_temp=data_analysis
  mat_temp2 = cbind(data_temp$constant,
                    data_temp$gpa)
  one = rep(1,dim(mat_temp2)[1])
  h = exp(beta %*% t(mat_temp2))
  choice_prob = h/(1+h)
  llf <- sum(data_temp$admit * log(choice_prob)) + (sum((one-data_temp$admit) * log(one-choice_prob)))
  return(-1*llf)
}
#Score to be used when optimizing using LL
#Identical to the Score function below but returns negative output
Score_LL <- function(beta){
  data_temp=data_analysis
  mat_temp2 = cbind(data_temp$constant,
                    data_temp$gpa)
  one = rep(1,dim(mat_temp2)[1])
  h = exp(beta %*% t(mat_temp2))
  choice_prob = h/(1+h)
  resid = as.numeric(data_temp$admit - choice_prob)
  score_final2 = t(mat_temp2) %*% Diagonal(length(resid), x=resid) %*% one
  return(-1*as.numeric(score_final2))
}
#The Score/Deriv/Jacobian of the Likelihood function
Score <- function(beta){
  data_temp=data_analysis
  mat_temp2 = cbind(data_temp$constant,
                    data_temp$gpa)
  one = rep(1,dim(mat_temp2)[1])
  h = exp(beta %*% t(mat_temp2))
  choice_prob = as.numeric(h/(1+h))
  resid = as.numeric(data_temp$admit - choice_prob)
  score_final2 = t(mat_temp2) %*% Diagonal(length(resid), x=resid) %*% one
  return(as.numeric(score_final2))
}
#Derivative of the Score function
Score_Deriv <- function(beta){
  data_temp=data_analysis
  mat_temp2 = cbind(data_temp$constant,
                    data_temp$gpa)
  one = rep(1,dim(mat_temp2)[1])
  h = exp(beta %*% t(mat_temp2))
  weight = (h/(1+h)) * (1- (h/(1+h)))
  weight_mat = Diagonal(length(weight), x=weight)
  deriv = t(mat_temp2)%*%weight_mat%*%mat_temp2
  return(-1*as.array(deriv))
}
#Quadratic Gain function
#Minimized at Score=0 and so minimizing is equivalent to solving the
#FOC of the Likelihood. This is the GMM approach.
Quad_Gain <- function(beta){
  h = Score(as.numeric(beta))
  return(sum(h*h))
}
#Derivative of the Quadratic Gain function
Quad_Gain_deriv <- function(beta){
  return(2*t(Score_Deriv(beta))%*%Score(beta))
}
sol1=glm(admit ~ gpa, data = data_analysis, family = "binomial")
sol2=optim(c(2,2),Quad_Gain,gr=Quad_Gain_deriv,method="BFGS")
sol3=optim(c(0,0),Quad_Gain,gr=Quad_Gain_deriv,method="BFGS")
When I run this code, I find that sol3 matches what glm produces (sol1), but sol2, started from a different initial point, differs from the glm solution by a lot. The same thing happens in my main code with the actual data. One solution is to create a grid and test multiple starting points. However, my main data set has 10 parameters, which would make the grid very large and the program computationally infeasible. Is there a way around this problem?
Your code seems overly complicated. The following two functions define the negative log-likelihood and negative score vector for a logistic regression with the logit link:
logLik_Bin <- function (betas, y, X) {
  eta <- c(X %*% betas)
  - sum(dbinom(y, size = 1, prob = plogis(eta), log = TRUE))
}
score_Bin <- function (betas, y, X) {
  eta <- c(X %*% betas)
  - crossprod(X, y - plogis(eta))
}
Then you can use it as follows:
# load the data
mydata <- read.csv("https://stats.idre.ucla.edu/stat/data/binary.csv")
# fit with optim()
opt1 <- optim(c(-1, 1, -1), logLik_Bin, score_Bin, method = "BFGS",
y = mydata$admit, X = cbind(1, mydata$gre, mydata$gpa))
opt1$par
# compare with glm()
glm(admit ~ gre + gpa, data = mydata, family = binomial())
Typically, for well-behaved covariates (i.e., when you expect the coefficients to lie roughly in the interval [-4, 4]), starting at 0 is a good idea.
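If you are worried about sensitivity to the starting point, one cheap check (a sketch of the idea rather than a full grid search, added here as my own suggestion) is to rerun optim() from a handful of random starts and verify that they all agree:
set.seed(1)
starts <- matrix(rnorm(5 * 3), ncol = 3)  # 5 random starting vectors
fits <- apply(starts, 1, function(s)
  optim(s, logLik_Bin, score_Bin, method = "BFGS",
        y = mydata$admit, X = cbind(1, mydata$gre, mydata$gpa))$par)
round(t(fits), 3)  # all rows should match opt1$par if the optimum is found reliably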
I am using the 'KFAS' package in R to estimate a state-space model with the Kalman filter. My measurement and transition equations are:
y_t = Z_t * x_t + \eps_t (measurement)
x_t = T_t * x_{t-1} + R_t * \eta_t (transition),
with \eps_t ~ N(0,H_t) and \eta_t ~ N(0,Q_t).
So, I want to estimate the variances H_t and Q_t, but also T_t, the AR(1) coefficient. My code is as follows:
library(KFAS)
set.seed(100)
eps <- rt(200, 4, 1)
meas <- as.matrix((arima.sim(n=200, list(ar=0.6), innov = rnorm(200)*sqrt(0.5)) + eps),
ncol=1)
Zt <- 1
Ht <- matrix(NA)
Tt <- matrix(NA)
Rt <- 1
Qt <- matrix(NA)
ss_model <- SSModel(meas ~ -1 + SSMcustom(Z = Zt, T = Tt, R = Rt,
Q = Qt), H = Ht)
fit <- fitSSM(ss_model, inits = c(0,0.6,0), method = 'L-BFGS-B')
But it returns: "Error in is.SSModel(do.call(updatefn, args = c(list(inits, model), update_args)),: System matrices (excluding Z) contain NA or infinite values, covariance matrices contain values larger than 1e+07"
The NA definitions for the variances work well, as documented in the package's paper. However, it seems this cannot be done for the AR coefficient. Does anyone know how I can do this?
Note that I am aware of the SSMarima function, which eases the definition of the transition equation as an ARIMA model. Although I am able to estimate the AR(1) coefficient and Q_t this way, I still cannot estimate the \eps_t variance (H_t). Moreover, I am migrating my Kalman filter code from EViews to R, so I need to learn SSMcustom for other models that are more complicated.
Thanks!
It seems that you are missing something in your example, as your error message comes from the function fitSSM. If you want to use fitSSM for estimating general state-space models, you need to provide your own model-updating function. The default behaviour can only handle NAs in the covariance matrices H and Q; the main goal of fitSSM is just to get you started with simple cases. For complex models and/or large data, I would recommend writing your own objective function (with help of the logLik method) and calling your favourite numerical optimization routine manually for maximum performance. Something like this:
library(KFAS)
set.seed(100)
eps <- rt(200, 4, 1)
meas <- as.matrix((arima.sim(n=200, list(ar=0.6), innov = rnorm(200)*sqrt(0.5)) + eps),
ncol=1)
Zt <- 1
Ht <- matrix(NA)
Tt <- matrix(NA)
Rt <- 1
Qt <- matrix(NA)
ss_model <- SSModel(meas ~ -1 + SSMcustom(Z = Zt, T = Tt, R = Rt,
Q = Qt), H = Ht)
objf <- function(pars, model, estimate = TRUE) {
  model$H[1] <- pars[1]
  model$T[1] <- pars[2]
  model$Q[1] <- pars[3]
  if (estimate) {
    -logLik(model)
  } else {
    model
  }
}
opt <- optim(c(1, 0.5, 1), objf, method = "L-BFGS-B",
lower = c(0, -0.99, 0), upper = c(100, 0.99, 100), model = ss_model)
ss_model_opt <- objf(opt$par, ss_model, estimate = FALSE)
Same with fitSSM:
updatefn <- function(pars, model) {
  model$H[1] <- pars[1]
  model$T[1] <- pars[2]
  model$Q[1] <- pars[3]
  model
}
fit <- fitSSM(ss_model, c(1, 0.5, 1), updatefn, method = "L-BFGS-B",
lower = c(0, -0.99, 0), upper = c(100, 0.99, 100))
identical(ss_model_opt, fit$model)
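Once the parameters are estimated, the filtered and smoothed states can be extracted with KFS(); a short sketch (the plotting step is my own addition, not part of the original answer):
out_kfs <- KFS(ss_model_opt, filtering = "state", smoothing = "state")
## compare the observations with the smoothed state estimates
ts.plot(cbind(as.ts(meas), out_kfs$alphahat), col = c("grey", "red"))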
I know that the smoothing parameter (lambda) is quite important for fitting a smoothing spline, but I did not see any post here on how to select a reasonable lambda (spar = ?); I was told that spar normally ranges from 0 to 1. Could anyone share their experience when using smooth.spline()? Thanks.
smooth.spline(x, y = NULL, w = NULL, df, spar = NULL,
cv = FALSE, all.knots = FALSE, nknots = NULL,
keep.data = TRUE, df.offset = 0, penalty = 1,
control.spar = list(), tol = 1e-6 * IQR(x))
agstudy provides a visual way to choose spar. What I remember learning in a linear models class (though not exactly) is to use cross-validation to pick the "best" spar. Here is a toy example borrowed from agstudy:
x = seq(1:18)
y = c(1:3,5,4,7:3,2*(2:5),rep(10,4))
splineres <- function(spar){
  res <- rep(0, length(x))
  for (i in 1:length(x)){
    mod <- smooth.spline(x[-i], y[-i], spar = spar)
    res[i] <- predict(mod, x[i])$y - y[i]
  }
  return(sum(res^2))
}
spars <- seq(0, 1.5, by = 0.001)
ss <- rep(0, length(spars))
for (i in 1:length(spars)){
  ss[i] <- splineres(spars[i])
}
plot(spars, ss, 'l', xlab = 'spar', ylab = 'Cross Validation Residual Sum of Squares' , main = 'CV RSS vs Spar')
spars[which.min(ss)]
R > spars[which.min(ss)]
[1] 0.381
The code is not the neatest, but it should be easy to understand. Also, if you specify cv = TRUE in smooth.spline:
R > xyspline <- smooth.spline(x, y, cv=T)
R > xyspline$spar
[1] 0.3881
From the help of smooth.spline you have the following:
The computational λ used (as a function of spar) is λ = r * 256^(3*spar - 1)
spar can be greater than 1 (though I guess not by too much). I think you can vary this parameter and choose it graphically by plotting the fitted values for different values of spar. For example:
library(lattice)  ## xyplot() and the panel functions come from lattice
spars <- seq(0.2,2,length.out=10) ## I will choose between 10 values
dat <- data.frame(
  spar = as.factor(rep(spars,each=18)), ## spar to group data (to get different colors)
  x = seq(1:18), ## recycling here to repeat x and y
  y = c(1:3,5,4,7:3,2*(2:5),rep(10,4)))
xyplot(y~x|spar, data = dat, type = c('p'), pch = 19, groups = spar,
       panel = function(x, y, groups, ...)
       {
         s2 <- smooth.spline(y, spar = spars[panel.number()])
         panel.lines(s2)
         panel.xyplot(x, y, groups, ...)
       })
Here, for example, I get the best results for spar = 0.4.
If you don't have duplicated points at the same x value, then try GCV - the generalized cross-validation (GCV) procedure (the default cv = FALSE in smooth.spline) is a clever way of getting a pretty good value for lambda (spar). One neat detail about GCV is that it doesn't actually have to go to the trouble of doing the calculations for every single set of leave-one-out points, as highlighted in Simon Wood's book. For lots of detail on this, have a look at the notes on Simon Wood's web page on MGCV.
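For smooth.spline specifically, GCV corresponds to the default cv = FALSE, so a quick sketch (using the toy x and y from the earlier answer) would be:
fit_gcv <- smooth.spline(x, y, cv = FALSE)  # smoothing parameter chosen by GCV
fit_gcv$spar
fit_gcv$lambda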
Adrian Bowman's sm R package has a function h.select() which is intended specifically for doing the grunt work of choosing a value of lambda (though I'm not 100% sure that it is compatible with the smooth.spline() function in the base package).