Discrepancies in curve fitting of exponential functions using maximum likelihood in R

I have a dataset and I'm exploring potential exponential fits to it, comparing eyeballed parameter estimates with estimates obtained via maximum likelihood. I'm finding a huge discrepancy in one of the fitted parameters and I'm not sure how to reconcile what I expect with what the statistics say. Is the dnorm in the ML function appropriate, or is my plotting from the estimates incorrect?
The data are fit to an exponential function of the form a * exp(-x * b) + c:
y.size <- c(2.69,4.1,8.04,3.1,5.27,5.033333333,3.2,7.25,6.29,4.55,6.1,2.65,3.145,3.775,3.46,5.73,5.31,4.425,3.725,4.32,5,3.09,5.25,5.65,3.48,6.1,10,9.666666667,6.06,5.9,2.665,4.32,3.816666667,3.69,5.8,5,3.72,3.045,4.485,3.642857143,5.5,6.333333333,4.75,6,7.466666667,5.03,5.23,4.85,5.59,5.96,5.33,4.92,4.255555556,6.346666667,4.13,6.33,4,7.35,6.35,4.63,5.13,7.4,4.28,4.233333333,4.3125,6.18,4.3,4.47,4.88,4.5,2.96,2.1,3.7,3.62,5.42,3.8,5.5,3.27,3.36,3.266666667,2.265,3.1,2.51,2.51,4.4,2.64,4.38,4.53,2.29,2.87,3.395,3.26,2.77,3.22,4.31,4.73,4.05,3.48,4.8,4.7,3.05,4.21,5.95,4.39,4.55,4.27,4.955,4.65,3.32,3.48,3.828571429,4.69,4.68,3.76,3.91,4,4.41,4.19,4.733333333,4.32,2.83,3.41,4.42,3.47,3.84,4.39)
x.number <- c(69,62,8,80,13,12,2,22,19,49,840,44,31,56,33,58,91,8,15,86,11,69,12,24,32,27,1,4,26,4,28,33,1516,41,20,58,44,29,58,14,3,3,6,3,26,52,26,29,92,30,18,11,27,19,38,78,57,52,17,45,56,7,37,7,14,13,164,76,82,14,273,122,662,434,126,374,1017,522,374,602,164,5,191,243,134,70,23,130,306,516,414,236,172,164,92,53,50,17,22,27,92,48,30,55,28,296,35,12,350,17,22,53,97,62,92,272,242,170,37,220,452,270,392,314,150,232)
df <- data.frame(y.size, x.number)
df <- df[df$x.number < 750, ]  # drop a few very large counts (this column is called totalinf in my original data frame)
a <- 9
b <- 0.07
c <- 3
ggplot(data = df, aes(x = x.number, y = y.size)) +
  geom_point() +
  stat_function(fun = function(x) a * exp(-x * b) + c, color = "blue")
I then use an ML estimator, initializing it with those eyeballed estimates:
a = 9; b = 0.07; c = 3; sigma = 1
library(stats4)  # mle() lives in stats4
expreg <- function(a, b, c, sigma) {
  y.pred <- a * exp(-x.number * b) + c
  ll <- -sum(dnorm(y.size, mean = y.pred, sd = sigma, log = TRUE))  # negative log-likelihood
  ll
}
mle2.expreg.model <- mle(expreg, start = list(a = 9, b = 0.07, c = 3, sigma = 1))
summary(mle2.expreg.model)
Coefficients:
        Estimate   Std. Error
a       8.667305 974.16243176
b      12.671940          NaN
c       4.488451   0.12466034
sigma   1.382550   0.08814796
The estimate of b differs from my eyeballed value by more than two orders of magnitude (12.7 vs. 0.07). Needless to say, the curve looks quite flat with the new estimates.
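A sketch of how the two curves can be compared (the red curve uses the ML estimates pulled out with coef(), the blue one the eyeballed values):
est <- coef(mle2.expreg.model)  # named vector with a, b, c, sigma
ggplot(data = df, aes(x = x.number, y = y.size)) +
  geom_point() +
  stat_function(fun = function(x) a * exp(-x * b) + c, color = "blue") +
  stat_function(fun = function(x) est["a"] * exp(-x * est["b"]) + est["c"],
                color = "red")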

Related

MCMCglmm binomial model prior

I want to estimate a binomial model with the R package MCMCglmm. The model should include an intercept and a slope, both as fixed and as random parts. How do I specify a valid prior for this? (Note, here is a similar question, but in a much more complicated setting.)
Assume the data have the following form:
y x cluster
1 0 -0.56047565 1
2 1 -0.23017749 1
3 0 1.55870831 1
4 1 0.07050839 1
5 0 0.12928774 1
6 1 1.71506499 1
In fact, the data have been generated by
set.seed(123)
nj <- 15 # number of individuals per cluster
J <- 30 # number of clusters
n <- nj * J
x <- rnorm(n)
y <- rbinom(n, 1, prob = 0.6)
cluster <- factor(rep(1:J, each = nj))
dat <- data.frame(y = y, x = x, cluster = cluster)
The information in the question about the model suggests specifying fixed = y ~ 1 + x and random = ~ us(1 + x):cluster. With us() you allow the random effects to be correlated (cf. section 3.4 and table 2 in Hadfield's 2010 Journal of Statistical Software article).
First of all, as you only have one response variable (y), the G part of the prior (cf. equation 4 and section 3.6 in Hadfield's 2010 Journal of Statistical Software article) for the random-effects variances only needs one list element, called G1. This list element is not the prior distribution itself - Hadfield fixed that to be an inverse-Wishart distribution. With G1 you specify the parameters of this inverse-Wishart distribution: the scale matrix (Ψ in Wikipedia notation, V in MCMCglmm notation) and the degrees of freedom (ν in Wikipedia notation, nu in MCMCglmm notation). As you have two random effects (the intercept and the slope), V has to be a 2 x 2 matrix. A frequent choice is the two-dimensional identity matrix diag(2). Hadfield often uses nu = 0.002 for the degrees of freedom (cf. his course notes).
Now you also have to specify the R part of the prior, for the residual variance. Here again Hadfield fixed the distribution to be inverse-Wishart, leaving the user to specify its parameters. As there is only one residual variance, V has to be a scalar (let's say V = 0.5). An optional element for R is fix. With this element you specify whether the residual variance should be fixed to a certain value (then you write fix = TRUE or fix = 1) or not (then fix = FALSE or fix = 0). Note that you do not fix the residual variance to 0.5 by writing fix = 0.5! So when you find fix = 1 in Hadfield's course notes, read it as fix = TRUE and look at which value of V it was fixed to.
Altogether, we set up the prior as follows:
prior0 <- list(G = list(G1 = list(V = diag(2), nu = 0.002)),
               R = list(V = 0.5, nu = 0.002, fix = FALSE))
With this prior we can run MCMCglmm:
library("MCMCglmm") # for MCMCglmm()
set.seed(123)
mod0 <- MCMCglmm(fixed = y ~ 1 + x,
                 random = ~ us(1 + x):cluster,
                 data = dat,
                 family = "categorical",
                 prior = prior0)
The draws from the Gibbs-sampler for the fixed effects are found in mod0$Sol, the draws for the variance parameters in mod0$VCV.
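For example, a quick way to inspect them (a sketch; Sol and VCV are coda mcmc objects, so the usual summary and plot methods apply):
summary(mod0$Sol)   # posterior summaries of the fixed effects (intercept and x)
summary(mod0$VCV)   # variance components, including the residual ("units") variance
plot(mod0$Sol)      # trace and density plots for the fixed effects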
Normally a binomial model requires the residual variance to be fixed, so next we fix the residual variance at 0.5:
set.seed(123)
prior1 <- list(G = list(G1 = list(V = diag(2), nu = 0.002)),
               R = list(V = 0.5, nu = 0.002, fix = TRUE))
mod1 <- MCMCglmm(fixed = y ~ 1 + x,
                 random = ~ us(1 + x):cluster,
                 data = dat,
                 family = "categorical",
                 prior = prior1)
The difference can be seen by comparing mod0$VCV[, 5] to mod1$VCV[, 5]. In the latter case, all entries are 0.5, as specified.
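A quick check (sketch; column 5 of VCV is the residual "units" variance):
summary(mod0$VCV[, 5])   # varies across the MCMC draws
unique(mod1$VCV[, 5])    # a single value, 0.5, because it was fixed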

`nls` fails to estimate parameters of my model

I am trying to estimate the constants of Heaps' law.
I have the following dataset, novels_collection:
  Number of novels DistinctWords WordOccurrences
1                1         13575          117795
2                1         34224          947652
3                1         40353         1146953
4                1         55392         1661664
5                1         60656         1968274
Then I built the following function:
# Function for Heaps' law
heaps <- function(K, n, B) {
  K * n^B
}
heaps(2, 117795, .7)  # just to test that it works
So n = WordOccurrences, and K and B are constants that I need in order to predict DistinctWords.
I tried this, but it gives me an error:
fitHeaps <- nls(DistinctWords ~ heaps(K, WordOccurrences, B),
                data = novels_collection[, 2:3],
                start = list(K = .1, B = .1), trace = TRUE)

Error in numericDeriv(form[[3L]], names(ind), env) :
  Missing value or an infinity produced when evaluating the model
Any idea how I could fix this, or another method to fit the function and get values for K and B?
If you take logs on both sides of y = K * n^B, you get log(y) = log(K) + B * log(n). This is a linear relationship between log(y) and log(n), so you can fit a linear regression model to find log(K) and B.
logy <- log(novels_collection$DistinctWords)
logn <- log(novels_collection$WordOccurrences)
fit <- lm(logy ~ logn)
para <- coef(fit)        ## log(K) and B
para[1] <- exp(para[1])  ## now K and B
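To get fitted values back on the original scale from this log-linear fit, you can back-transform; a small sketch (assuming the data frame is called novels_collection as in the question):
K <- para[1]   # K, after the exp() step above
B <- para[2]
pred_lm <- K * novels_collection$WordOccurrences^B
cbind(observed = novels_collection$DistinctWords, predicted = round(pred_lm))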
With minpack.lm we can fit the non-linear model directly. I suspect it will be more prone to overfitting than a linear model on the log-transformed variables (as done by Zheyuan), but we could compare the residuals of the linear and non-linear models on a held-out dataset to see which does better empirically; a rough sketch of that comparison follows the plotting code below.
library(minpack.lm)
fitHeaps <- nlsLM(DistinctWords ~ heaps(K, WordOccurrences, B),
                  data = novels_collection[, 2:3],
                  start = list(K = .01, B = .01))
coef(fitHeaps)
# K B
# 5.0452566 0.6472176
plot(novels_collection$WordOccurrences, novels_collection$DistinctWords, pch=19)
lines(novels_collection$WordOccurrences, predict(fitHeaps, newdata = novels_collection[,2:3]), col='red')
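A rough sketch of that comparison on the five available points (in-sample only, since there is no held-out data here, so treat it as indicative at best; para comes from the log-linear fit in the previous answer):
rss_lm  <- sum((novels_collection$DistinctWords -
                para[1] * novels_collection$WordOccurrences^para[2])^2)
rss_nls <- sum(residuals(fitHeaps)^2)
c(lm = rss_lm, nls = rss_nls)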

How to estimate the Kalman Filter with 'KFAS' R package, with an AR(1) transition equation?

I am using the 'KFAS' package in R to estimate a state-space model with the Kalman filter. My measurement and transition equations are:
y_t = Z_t * x_t + ε_t (measurement)
x_t = T_t * x_{t-1} + R_t * η_t (transition),
with ε_t ~ N(0, H_t) and η_t ~ N(0, Q_t).
So, I want to estimate the variances H_t and Q_t, but also T_t, the AR(1) coefficient. My code is as follows:
library(KFAS)
set.seed(100)
eps <- rt(200, 4, 1)
meas <- as.matrix(arima.sim(n = 200, list(ar = 0.6), innov = rnorm(200) * sqrt(0.5)) + eps,
                  ncol = 1)
Zt <- 1
Ht <- matrix(NA)
Tt <- matrix(NA)
Rt <- 1
Qt <- matrix(NA)
ss_model <- SSModel(meas ~ -1 + SSMcustom(Z = Zt, T = Tt, R = Rt, Q = Qt),
                    H = Ht)
fit <- fitSSM(ss_model, inits = c(0,0.6,0), method = 'L-BFGS-B')
But it returns: "Error in is.SSModel(do.call(updatefn, args = c(list(inits, model), update_args)),: System matrices (excluding Z) contain NA or infinite values, covariance matrices contain values larger than 1e+07"
The NA definitions for the variances work well, as documented in the package's paper. However, it seems this cannot be done for the AR coefficient. Does anyone know how I can do this?
Note that I am aware of the SSMarima function, which eases the definition of ARIMA-type transition equations. Although I am able to estimate the AR(1) coefficient and Q_t that way, I still cannot estimate the ε_t variance (H_t). Moreover, I am migrating my Kalman filter code from EViews to R, so I need to learn SSMcustom for other, more complicated models.
Thanks!
It seems that you are missing something in your example, as your error message comes from the function fitSSM. If you want to use fitSSM for estimating general state space models, you need to provide your own model-updating function. The default behaviour can only handle NAs in the covariance matrices H and Q. The main goal of fitSSM is just to get you started with simple cases. For complex models and/or large data, I would recommend writing your own objective function (with the help of the logLik method) and running your favourite numerical optimization routine manually, for maximum performance. Something like this:
library(KFAS)
set.seed(100)
eps <- rt(200, 4, 1)
meas <- as.matrix(arima.sim(n = 200, list(ar = 0.6), innov = rnorm(200) * sqrt(0.5)) + eps,
                  ncol = 1)
Zt <- 1
Ht <- matrix(NA)
Tt <- matrix(NA)
Rt <- 1
Qt <- matrix(NA)
ss_model <- SSModel(meas ~ -1 + SSMcustom(Z = Zt, T = Tt, R = Rt, Q = Qt),
                    H = Ht)
objf <- function(pars, model, estimate = TRUE) {
  model$H[1] <- pars[1]
  model$T[1] <- pars[2]
  model$Q[1] <- pars[3]
  if (estimate) {
    -logLik(model)
  } else {
    model
  }
}
opt <- optim(c(1, 0.5, 1), objf, method = "L-BFGS-B",
             lower = c(0, -0.99, 0), upper = c(100, 0.99, 100), model = ss_model)
ss_model_opt <- objf(opt$par, ss_model, estimate = FALSE)
Same with fitSSM:
updatefn <- function(pars, model) {
  model$H[1] <- pars[1]
  model$T[1] <- pars[2]
  model$Q[1] <- pars[3]
  model
}
fit <- fitSSM(ss_model, c(1, 0.5, 1), updatefn, method = "L-BFGS-B",
              lower = c(0, -0.99, 0), upper = c(100, 0.99, 100))
identical(ss_model_opt, fit$model)
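Either way, you can then read the estimated parameters off the updated model object and run the Kalman filter and smoother on it with KFS() (a sketch):
c(H = ss_model_opt$H[1], T = ss_model_opt$T[1], Q = ss_model_opt$Q[1])  # estimated H, AR coefficient T, and Q
out <- KFS(ss_model_opt)   # filtering and smoothing with the estimated parameters
head(out$alphahat)         # smoothed estimates of the state x_t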

Does the function y = at^b * exp(-ct) have a name? Can it be linearized? How can I estimate a, b, c?

I am trying to fit a non-linear model, but cannot find any good examples online.
Does this function have a name?
Can it be linearized?
I've attempted to estimate the parameters a, b, and c with a random effect g (as in group) as a function of time t, below. I can fit the model using nls without a random effect, but am having trouble getting the mixed model to converge. Suggestions are welcome (preferably within R, but any suitable package will do).
## time, repeated 16 times for 4 replicates from each of 4 groups
t <- rep(1:20, 16)
## g, group
g <- rep(1:4, each = 80)
## starting to create an example dataset,
## to see if I can recover known parameters
a <- rep(c(3.5, 4, 4.1, 5), each = 80)
b <- rep(c(1.1, 1.4, 1.8, 2.5), each = 80)
c <- rep(c(0.125, 0.25), each = 160)
## error to add to above parameters
set.seed(1)
e_a <- runif(320, -0.5, 0.5)
e_b <- runif(320, -0.1, 0.1)
e_c <- runif(320, -0.02, 0.02)
## this is my function
f <- function(t, a, b, c) a * (t^b) * exp(-c * t)
## simulate y
y <- f(t = t, a + e_a, b + e_b, c + e_c)
mydata <- data.frame(t = t, y = y, g = g)
library(nlme)
## now fit the model to estimate a, b, c
fm1 <- nlme(y ~ a * (t^b) * exp(-c * t),
            data = mydata,
            fixed = a + b + c ~ 1,
            random = a + b + c ~ 1 | g,
            start = c(a = 4, b = 1, c = 0.25),
            method = "REML")
In physics (and some other areas) I've seen this, or variants of it, called a Hoerl curve or Hoerl function (e.g. here), though it goes by other names. If c is negative and a and b are positive, it's a scaled gamma density.
When you ask about linearizing it, you have to be careful; the equation y = a*t^b * exp(c*t) is not actually what you mean - the observations y(i) are not exactly equal to a*t(i)^b * exp(c*t(i)) (otherwise almost any 3 observations would give you the exact parameter values).
So the noise has to enter your model for y somehow. Is it additive? Multiplicative? Something else? (Also important, but for other reasons: does its size change in some way as t changes, or not? Are the noise terms for different observations independent?)
If your actual model is y(i) = a*t(i)^b * exp(c*t(i)) + ε(i), that's not linearizable.
If your actual model is y(i) = a*t(i)^b * exp(c*t(i)) * ε(i), with ε(i) = exp(η(i)) for some (hopefully zero-mean) η(i), that is linearizable.
Taking the second form,
log(y(i)) = log(a) + b*log(t(i)) + c*t(i) + log(ε(i)),
or
y*(i) = a* + b*log(t(i)) + c*t(i) + η(i),
which is linear in the parameters a* = log(a), b and c, and in the error term η(i). So if you're prepared to make that sort of assumption about the error, you should be able to fit it with methods suitable for such linear models; in that case you may also want to ponder the parenthetical questions about the error term above, since they affect how you model it.
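To make that concrete, here is a quick sketch in R using the simulated mydata from the question. It fits the pooled log-linear model and ignores the random effects, so it is only a starting point; also note that the coefficient on t estimates c in the exp(c*t) form used above, i.e. minus the c in the question's exp(-c*t) parameterisation:
fit_log <- lm(log(y) ~ log(t) + t, data = mydata)   # log(y) = log(a) + b*log(t) + c*t + noise
est <- coef(fit_log)
c(a = unname(exp(est["(Intercept)"])),
  b = unname(est["log(t)"]),
  c = unname(est["t"]))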

maximum likelihood estimation

I am a new user of R and hope you will bear with me if my question is silly. I want to estimate the following model using the maximum likelihood estimator in R:
y = a + b*(ln(x) - α)
where a, b, and α are parameters to be estimated, and x and y are my data. I tried the following code that I got from the web:
library(foreign)
maindata <- read.csv("C:/Users/NUNU/Desktop/maindata/output2.csv")
h <- subset(maindata, cropid==10)
library(likelihood)
modelfun <- function(a, b, x) { b * (x - a) }
par <- list(a = 0, b = 0)
var<-list(x = "x")
par_lo <- list(a = 0, b = 0)
par_hi <- list(a = 50, b = 50)
var$y <- "y"
var$mean <- "predicted"
var$sd <- 0.815585
var$log <- TRUE
results <- anneal(model = modelfun, par = par, var = var,
                  source_data = h, par_lo = par_lo, par_hi = par_hi,
                  pdf = dnorm, dep_var = "y", max_iter = 20000)
The results I get are similar even though the data are different, i.e. even when I change the cropid. Also, the predicted values generated are for x rather than y.
I do not know what I missed or where I went wrong. Your help is highly appreciated.
I am not sure whether your model formula leads to a unique solution, but in general you can find the MLE with the optim function.
Here is a simple example for linear regression with optim (it minimizes the residual sum of squares, which for Gaussian errors gives the same estimates as maximum likelihood):
fn <- function(beta, x, y) {
  a <- beta[1]
  b <- beta[2]
  sum((y - (a + b * log(x)))^2)
}
# generate some data for testing
x <- 1:100
# true values: a = 10, b = 3.5
y <- 10 + 3.5 * log(x)
optim(c(0, 0), fn, x = x, y = y, method = "BFGS")
You can change the function fn to reflect your model formula, e.g.
sum((y - (YOUR MODEL FORMULA))^2)
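For the model in the question, y = a + b*(ln(x) - α), such a function might look like the sketch below. Note, however, that a and α cannot both be identified, because a + b*(log(x) - alpha) = (a - b*alpha) + b*log(x), so only the combined intercept and b are estimable and the optimum is not unique in (a, alpha):
# least-squares objective for y = a + b*(log(x) - alpha); a and alpha trade off
fn2 <- function(beta, x, y) {
  a     <- beta[1]
  b     <- beta[2]
  alpha <- beta[3]
  sum((y - (a + b * (log(x) - alpha)))^2)
}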
EDIT
I am just giving a simple example of using optim in case you have a custom model formula to optimize. I did not mean to use it for simple linear regression, since lm is sufficient for that.
I was a bit surprised that iTech used optim for a problem that is linear in its parameters. With his data for x and y:
> lm(y ~ log(x))

Call:
lm(formula = y ~ log(x))

Coefficients:
(Intercept)       log(x)
       10.0          3.5
For linear problems, the least squares solution is the ML solution.
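For what it's worth, running the optim code above on the same data gives essentially the same numbers (a quick check):
fit_ls <- optim(c(0, 0), fn, x = x, y = y, method = "BFGS")
round(fit_ls$par, 4)   # approximately 10.0 and 3.5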
