I am trying to evaluate themodel fit of several regressions in R, and I have run into a problem I have had multiple times now: the log-likelihood of my Poisson regression is infinite.
I'm using a non-integer dependent variable (Note: I know what I'm doing in this regard), and I'm wondering if maybe that's the problem. However, I don't get an infinite log-likelihood when running the regression with glm.nb.
Code to reproduce the issue is below.
Edit: the problem appears to go away when I coerce the DV to integer. Any idea how to get log likelihood from Poissons with non-integer DVs?
# Input Data
so_data <- data.frame(dv = c(21.0552722691125, 24.3061351414885, 7.84658638053276,
25.0294679770848, 15.8064731063311, 10.8171744654056, 31.3008088413026,
2.26643928259238, 18.4261153345417, 5.62915828161753, 17.0691184593063,
1.11959635820499, 30.0154935602592, 23.0000809735738, 28.4389825676123,
27.7678405415711, 23.7108405071757, 23.5070651053276, 14.2534787168392,
15.2058525068363, 19.7449094187771, 2.52384709295823, 29.7081691356397,
32.4723790240354, 19.2147002673637, 61.7911384519901, 10.5687170234821,
23.9047421013736, 18.4889651451222, 13.0360878554798, 15.1752866581849,
11.5205948111817, 31.3539840929108, 31.7255952728076, 25.3034625215724,
5.00013988265465, 30.2037887018226, 1.86123112349445, 3.06932041603219,
22.6739418581257, 6.33738321053804, 24.2933951601142, 14.8634827414491,
31.8302947881089, 34.8361908525564, 1.29606416941288, 13.206844629927,
28.843579313401, 25.8024295609021, 14.4414831628722, 18.2109680632694,
14.7092063453463, 10.0738043919183, 28.4124482962025, 27.1004208775326,
1.31350378236957, 14.3009307888745, 1.32555197766214, 2.70896028922312,
3.88043749517381, 3.79492216916016, 19.4507965653633, 32.1689088941444,
2.61278585713499, 41.6955885902228, 2.13466761675063, 30.4207256294235,
24.8231524369244, 20.7605955978196, 17.2182798298094, 2.11563574288652,
12.290778250655, 0.957467139696772, 16.1775287334746))
# Run Model
p_mod <- glm(dv ~ 1, data = so_data, family = poisson(link = 'log'))
# Be Confused
logLik(p_mod)
Elaborating on #ekstroem's comment: the Poisson distribution is only supported over the non-negative integers (0, 1, ...). So, technically speaking, the probability of any non-integer value is zero -- although R does allow for a little bit of fuzz, to allow for round-off/floating-point representation issues:
> dpois(1,lambda=1)
[1] 0.3678794
> dpois(1.1,lambda=1)
[1] 0
Warning message:
In dpois(1.1, lambda = 1) : non-integer x = 1.100000
> dpois(1+1e-7,lambda=1) ## fuzz
[1] 0.3678794
It is theoretically possible to compute something like a Poisson log-likelihood for non-integer values:
my_dpois <- function(x,lambda,log=FALSE) {
LL <- -lambda+x*log(lambda)-lfactorial(x)
if (log) LL else exp(LL)
}
but I would be very careful - some quick tests with integrate suggest it integrates to 1 (after I fixed the bug in it), but I haven't checked more carefully that this is really a well-posed probability distribution. (On the other hand, some reasonable-seeming posts on CrossValidated suggest that it's not insane ...)
You say "I know what I'm doing in this regard"; can you give some more of the context? Some alternative possibilities (although this is steering into CrossValidated territory) -- the best answer depends on where your data really come from (i.e., why you have "count-like" data that are non-integer but you think should be treated as Poisson).
a quasi-Poisson model (family=quasipoisson). (R will still not give you log-likelihood or AIC values in this case, because technically they don't exist -- you're supposed to do inference on the basis of the Wald statistics of the parameters; see e.g. here for more info.)
a Gamma model (probably with a log link)
if the data started out as count data that you've scaled by some measure of effort or exposure), use an appropriate offset model ...
a generalized least-squares model (nlme::gls) with an appropriate heteroscedasticity specification
Poisson log-likelihood involves calculating log(factorial(x)) (https://www.statlect.com/fundamentals-of-statistics/Poisson-distribution-maximum-likelihood). For values larger than 30 it has to be done using Stirling's approximation formula in order to avoid exceeding the limit of computer arithmetic. Sample code in Python:
# define a likelihood function. https://www.statlect.com/fundamentals-of- statistics/Poisson-distribution-maximum-likelihood
def loglikelihood_f(lmba, x):
#Using Stirling formula to avoid calculation of factorial.
#logfactorial(n) = n*ln(n) - n
n = x.size
logfactorial = x*np.log(x+0.001) - x #np.log(factorial(x))
logfactorial[logfactorial == -inf] = 0
result =\
- np.sum(logfactorial) \
- n * lmba \
+ np.log(lmba) * np.sum(x)
return result
Hopefully someone can help me with this one, because I am really stuck and do not find my coding error!
I am fitting zero-inflated poisson / negative binomial GLMs (no random effects) in JAGS (with R2Jags) and everything is fine with the parameter estimates, priors, initial values and chains convergence. All results are perfectly in line with, e.g., the estimates from the pscl-package, including my calculation of pearson residuals in the model...
The only thing I cannot get to work is to sample from the model a new sample to obtain a bayesian p-value for evaluating the model fit. The "normal" poisson and negative binomial models I fit before all gave the expected replicated samples and no problems occured.
Here's my code so far, but the important part is "#New Samples":
model{
# 1. Priors
beta ~ dmnorm(b0[], B0[,])
aB ~ dnorm(0.001, 1)
#2. Likelihood function
for (i in 1:N){
# Logistic part
W[i] ~ dbern(psi.min1[i])
psi.min1[i] <- 1 - psi[i]
eta.psi[i] <- aB
logit(psi[i]) <- eta.psi[i]
# Poisson part
Y[i] ~ dpois(mu.eff[i])
mu.eff[i] <- W[i] * mu[i]
log(mu[i]) <- max(-20, min(20, eta.mu[i]))
eta.mu[i] <- inprod(beta[], X[i,])
# Discrepancy measures:
ExpY[i] <- mu [i] * (1 - psi[i])
VarY[i] <- (1- psi[i]) * (mu[i] + psi[i] * pow(mu[i], 2))
PRes[i] <- (Y[i] - ExpY[i]) / sqrt(VarY[i])
D[i] <- pow(PRes[i], 2)
# New Samples:
YNew[i] ~ dpois(mu.eff[i])
PResNew[i] <- (YNew[i] - ExpY[i]) / sqrt(VarY[i])
DNew[i] <- pow(PResNew[i], 2)
}
Fit <- sum(D[1:N])
FitNew <- sum(DNew[1:N])
}
The big problem is, that I really tried all combinations and alterations I think could/should work, but when I look at the simulated samples, I get this here:
> all.equal( Jags1$BUGSoutput$sims.list$YNew, Jags1$BUGSoutput$sims.list$Y )
[1] TRUE
And, to make it really weird, when using the means of Fit and FitNew:
> Jags1$BUGSoutput$mean$Fit
[1] 109.7883
> Jags1$BUGSoutput$mean$FitNew
[1] 119.2111
Has anyone a clue what I am doing wrong? Any help would be deeply appreciated!
Kind regards, Ulf
I suspect this isn't the case, but the only obvious reason I can suspect for Y[i] and YNew[i] being always identical is if mu.eff[i] is ~zero, either because W[i] is 0 or mu[i] is close to zero. This implies that Y[] is always zero, which is easy to check from your data, but as I said it does seem odd that you would be trying to model this... Otherwise, I'm not sure what is going on ... try simplifying the code to see if that solves the problem, and then add things back in until it breaks again. Some other suggestions:
It may be helpful for debugging to look at the absolute values of Y and YNew rather than just Y==YNew
If you want a negative binomial (= gamma-Poisson) try sampling mu[i] from a gamma distribution - I have used this formulation for ZINB models extensively, so am sure it works
Your prior for aB looks odd to me - it gives a prior 95% CI for zero inflation around 12-88% - is that what you intended? And why a mean of 0.001 not 0? If you have no predictors then a beta prior for psi.min seems more natural - and if you have no useful prior information a beta(1,1) prior would be an obvious choice.
Minor point but you are calculating a lot of deterministic functions of aB within the for loop - this is going to slow down your model...
Hope that helps,
Matt
So, after getting a nervous breakdown and typing all again and again while searching for my coding error, I found the most stupid error I have ever made - so far:
I just did not specify "Y" as a parameter to save, only "YNew", so when I compared YNew and Y from the sims.list with all.equal, I did not get what I thought I should. I do not know why JAGS gives me the Y at all (from the sims.list of the JAGS-object), but for some reason it is just giving me YNew when asked to give Y. So this part is actually right:
Jags1$BUGSoutput$mean$Fit
[1] 109.7883
Jags1$BUGSoutput$mean$FitNew
[1] 119.2111
So I hope that I did not cause a major confusion for anybody...
I am working with the cumulative emergence of flies over time (taken at irregular intervals) over many summers (though first I am just trying to make one year work). The cumulative emergence follows a sigmoid pattern and I want to create a maximum likelihood estimation of a 3-parameter Weibull cumulative distribution function. The three-parameter models I've been trying to use in the fitdistrplus package keep giving me an error. I think this must have something to do with how my data is structured, but I cannot figure it out. Obviously I want it to read each point as an x (degree days) and a y (emergence) value, but it seems to be unable to read two columns. The main error I'm getting says "Non-numeric argument to mathematical function" or (with slightly different code) "data must be a numeric vector of length greater than 1". Below is my code including added columns in the df_dd_em dataframe for cumulative emergence and percent emergence in case that is useful.
degree_days <- c(998.08,1039.66,1111.29,1165.89,1236.53,1293.71,
1347.66,1387.76,1445.47,1493.44,1553.23,1601.97,
1670.28,1737.29,1791.94,1849.20,1920.91,1967.25,
2036.64,2091.85,2152.89,2199.13,2199.13,2263.09,
2297.94,2352.39,2384.03,2442.44,2541.28,2663.90,
2707.36,2773.82,2816.39,2863.94)
emergence <- c(0,0,0,1,1,0,2,3,17,10,0,0,0,2,0,3,0,0,1,5,0,0,0,0,
0,0,0,0,1,0,0,0,0,0)
cum_em <- cumsum(emergence)
df_dd_em <- data.frame (degree_days, emergence, cum_em)
df_dd_em$percent <- ave(df_dd_em$emergence, FUN = function(df_dd_em) 100*(df_dd_em)/46)
df_dd_em$cum_per <- ave(df_dd_em$cum_em, FUN = function(df_dd_em) 100*(df_dd_em)/46)
x <- pweibull(df_dd_em[c(1,3)],shape=5)
dframe2.mle <- fitdist(x, "weibull",method='mle')
Here's my best guess at what you're after:
Set up data:
dd <- data.frame(degree_days=c(998.08,1039.66,1111.29,1165.89,1236.53,1293.71,
1347.66,1387.76,1445.47,1493.44,1553.23,1601.97,
1670.28,1737.29,1791.94,1849.20,1920.91,1967.25,
2036.64,2091.85,2152.89,2199.13,2199.13,2263.09,
2297.94,2352.39,2384.03,2442.44,2541.28,2663.90,
2707.36,2773.82,2816.39,2863.94),
emergence=c(0,0,0,1,1,0,2,3,17,10,0,0,0,2,0,3,0,0,1,5,0,0,0,0,
0,0,0,0,1,0,0,0,0,0))
dd <- transform(dd,cum_em=cumsum(emergence))
We're actually going to fit to an "interval-censored" distribution (i.e. probability of emergence between successive degree day observations: this version assumes that the first observation refers to observations before the first degree-day observation, you could change it to refer to observations after the last observation).
library(bbmle)
## y*log(p) allowing for 0/0 occurrences:
y_log_p <- function(y,p) ifelse(y==0 & p==0,0,y*log(p))
NLLfun <- function(scale,shape,x=dd$degree_days,y=dd$emergence) {
prob <- pmax(diff(pweibull(c(-Inf,x), ## or (c(x,Inf))
shape=shape,scale=scale)),1e-6)
## multinomial probability
-sum(y_log_p(y,prob))
}
library(bbmle)
I should probably have used something more systematic like the method of moments (i.e. matching the mean and variance of a Weibull distribution with the mean and variance of the data), but I just hacked around a bit to find plausible starting values:
## preliminary look (method of moments would be better)
scvec <- 10^(seq(0,4,length=101))
plot(scvec,sapply(scvec,NLLfun,shape=1))
It's important to use parscale to let R know that the parameters are on very different scales:
startvals <- list(scale=1000,shape=1)
m1 <- mle2(NLLfun,start=startvals,
control=list(parscale=unlist(startvals)))
Now try with a three-parameter Weibull (as originally requested) -- requires only a slight modification of what we already have:
library(FAdist)
NLLfun2 <- function(scale,shape,thres,
x=dd$degree_days,y=dd$emergence) {
prob <- pmax(diff(pweibull3(c(-Inf,x),shape=shape,scale=scale,thres)),
1e-6)
## multinomial probability
-sum(y_log_p(y,prob))
}
startvals2 <- list(scale=1000,shape=1,thres=100)
m2 <- mle2(NLLfun2,start=startvals2,
control=list(parscale=unlist(startvals2)))
Looks like the three-parameter fit is much better:
library(emdbook)
AICtab(m1,m2)
## dAIC df
## m2 0.0 3
## m1 21.7 2
And here's the graphical summary:
with(dd,plot(cum_em~degree_days,cex=3))
with(as.list(coef(m1)),curve(sum(dd$emergence)*
pweibull(x,shape=shape,scale=scale),col=2,
add=TRUE))
with(as.list(coef(m2)),curve(sum(dd$emergence)*
pweibull3(x,shape=shape,
scale=scale,thres=thres),col=4,
add=TRUE))
(could also do this more elegantly with ggplot2 ...)
These don't seem like spectacularly good fits, but they're sane. (You could in principle do a chi-squared goodness-of-fit test based on the expected number of emergences per interval, and accounting for the fact that you've fitted a three-parameter model, although the values might be a bit low ...)
Confidence intervals on the fit are a bit of a nuisance; your choices are (1) bootstrapping; (2) parametric bootstrapping (resample parameters assuming a multivariate normal distribution of the data); (3) delta method.
Using bbmle::mle2 makes it easy to do things like get profile confidence intervals:
confint(m1)
## 2.5 % 97.5 %
## scale 1576.685652 1777.437283
## shape 4.223867 6.318481
dd <- data.frame(degree_days=c(998.08,1039.66,1111.29,1165.89,1236.53,1293.71,
1347.66,1387.76,1445.47,1493.44,1553.23,1601.97,
1670.28,1737.29,1791.94,1849.20,1920.91,1967.25,
2036.64,2091.85,2152.89,2199.13,2199.13,2263.09,
2297.94,2352.39,2384.03,2442.44,2541.28,2663.90,
2707.36,2773.82,2816.39,2863.94),
emergence=c(0,0,0,1,1,0,2,3,17,10,0,0,0,2,0,3,0,0,1,5,0,0,0,0,
0,0,0,0,1,0,0,0,0,0))
dd$cum_em <- cumsum(dd$emergence)
dd$percent <- ave(dd$emergence, FUN = function(dd) 100*(dd)/46)
dd$cum_per <- ave(dd$cum_em, FUN = function(dd) 100*(dd)/46)
dd <- transform(dd)
#start 3 parameter model
library(FAdist)
## y*log(p) allowing for 0/0 occurrences:
y_log_p <- function(y,p) ifelse(y==0 & p==0,0,y*log(p))
NLLfun2 <- function(scale,shape,thres,
x=dd$degree_days,y=dd$percent) {
prob <- pmax(diff(pweibull3(c(-Inf,x),shape=shape,scale=scale,thres)),
1e-6)
## multinomial probability
-sum(y_log_p(y,prob))
}
startvals2 <- list(scale=1000,shape=1,thres=100)
m2 <- mle2(NLLfun2,start=startvals2,
control=list(parscale=unlist(startvals2)))
summary(m2)
#graphical summary
windows(5,5)
with(dd,plot(cum_per~degree_days,cex=3))
with(as.list(coef(m2)),curve(sum(dd$percent)*
pweibull3(x,shape=shape,
scale=scale,thres=thres),col=4,
add=TRUE))
Following some standard textbooks on ARMA(1,1)-GARCH(1,1) (e.g. Ruey Tsay's Analysis of Financial Time Series), I try to write an R program to estimate the key parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. For some random reason, I cannot decipher what is wrong with my R program. The R package fGarch already gives me the answer, but my customized function does not seem to produce the same result.
I would like to build an R program that helps estimate the baseline ARMA(1,1)-GARCH(1,1) model. Then I would like to adapt this baseline script to fit different GARCH variants (e.g. EGARCH, NGARCH, and TGARCH). It would be much appreciated if you could provide some guidance in this case. The code below is the R script for estimating the 6 parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. At any rate, I would be glad to know your thoughts and insights. If you have a similar example, please feel free to share your extant code in R. Many thanks in advance.
Emily
# This R script offers a suite of functions for estimating the volatility dynamics based on the standard ARMA(1,1)-GARCH(1,1) model and its variants.
# The baseline ARMA(1,1) model characterizes the dynamic evolution of the return generating process.
# The baseline GARCH(1,1) model depicts the the return volatility dynamics over time.
# We can extend the GARCH(1,1) volatility model to a variety of alternative specifications to capture the potential asymmetry for a better comparison:
# GARCH(1,1), EGARCH(1,1), NGARCH(1,1), and TGARCH(1,1).
options(scipen=10)
intel= read.csv(file="intel.csv")
summary(intel)
raw_data= as.matrix(intel$logret)
library(fGarch)
garchFit(~arma(1,1)+garch(1,1), data=raw_data, trace=FALSE)
negative_log_likelihood_arma11_garch11=
function(theta, data)
{mean =theta[1]
delta=theta[2]
gamma=theta[3]
omega=theta[4]
alpha=theta[5]
beta= theta[6]
r= ts(data)
n= length(r)
u= vector(length=n)
u= ts(u)
u[1]= r[1]- mean
for (t in 2:n)
{u[t]= r[t]- mean- delta*r[t-1]- gamma*u[t-1]}
h= vector(length=n)
h= ts(h)
h[1]= omega/(1-alpha-beta)
for (t in 2:n)
{h[t]= omega+ alpha*(u[t-1]^2)+ beta*h[t-1]}
#return(-sum(dnorm(u[2:n], mean=mean, sd=sqrt(h[2:n]), log=TRUE)))
pi=3.141592653589793238462643383279502884197169399375105820974944592
return(-sum(-0.5*log(2*pi) -0.5*log(h[2:n]) -0.5*(u[2:n]^2)/h[2:n]))
}
#theta0=c(0, +0.78, -0.79, +0.0000018, +0.06, +0.93, 0.01)
theta0=rep(0.01,6)
negative_log_likelihood_arma11_garch11(theta=theta0, data=raw_data)
alpha= proc.time()
maximum_likelihood_fit_arma11_garch11=
nlm(negative_log_likelihood_arma11_garch11,
p=theta0,
data=raw_data,
hessian=TRUE,
iterlim=500)
#optim(theta0,
# negative_log_likelihood_arma11_garch11,
# data=raw_data,
# method="L-BFGS-B",
# upper=c(+0.999999999999,+0.999999999999,+0.999999999999,0.999999999999,0.999999999999,0.999999999999),
# lower=c(-0.999999999999,-0.999999999999,-0.999999999999,0.000000000001,0.000000000001,0.000000000001),
# hessian=TRUE)
# We record the end time and calculate the total runtime for the above work.
omega= proc.time()
runtime= omega-alpha
zhours = floor(runtime/60/60)
zminutes=floor(runtime/60- zhours*60)
zseconds=floor(runtime- zhours*60*60- zminutes*60)
print(paste("It takes ",zhours,"hour(s)", zminutes," minute(s) ","and ", zseconds,"second(s) to finish running this R program",sep=""))
maximum_likelihood_fit_arma11_garch11
sqrt(diag(solve(maximum_likelihood_fit_arma11_garch11$hessian)))