"convergence" for a derived quantity in JAGS/R2Jags - r

UPDATE: Now with Traceplot example
UPDATE: Now with new traceplot
I am trying to adapt Outhwaite et. als 2018 code for occupancy modelling and have a couple of questions that I just can't seem to find an answer for...
Code used to create model
cat(
"model{
### Model ###
# State model
for (i in 1:nsite){
for (t in 1:nyear){
z[i,t] ~ dbern(psi[i,t])
logit(psi[i,t])<- b[t] + u[i]
}}
# Observation model
for(j in 1:nvisit) {
y[j] ~ dbern(Py[j]+0.0001)
Py[j]<- z[Site[j],Year[j]]*p[j]
logit(p[j]) <- a[Year[j]] + c*logL[j]
}
### Priors ###
# State model priors
for(t in 1:nyear){
b[t] ~ dunif(-10,10) # fixed year effect
}
for (i in 1:nsite) {
u[i] ~ dnorm(0, tau.u) # random site effect
}
tau.u <- 1/(sd.u * sd.u)
sd.u ~ dunif(0, 5) # half-uniform hyperpriors
# Observation model priors
for (t in 1:nyear) {
a[t] ~ dnorm(mu.a, tau.a) # random year effect
}
mu.a ~ dnorm(0, 0.01)
tau.a <- 1 / (sd.a * sd.a)
sd.a ~ dunif(0, 5) # half-uniform hyperpriors
c ~ dunif(-10, 10) # sampling effort effect
### Derived parameters ###
# Finite sample occupancy - proportion of occupied sites
for (t in 1:nyear) {
psi.fs[t] <- sum(z[1:nsite,t])/nsite
}
#data# nyear, nsite, nvisit, y, logL, Site, Year
}", file="bmmodel.txt"
)
Note that dbern(Py[j]+0.0001) includes a correction factor since dbern(0) is not supported in JAGS.
I am running the model on some plant data just basically trying it out to see if it runs and converges and behaves as I would expect it to.
Question number 1(ANSWERED): I am interested in the quantity psi.fs[t]. But since the model calculates this quantity after the actual modelling process, can convergence be assessed for psi.fs[t]?
R code for running model with R2JAGS
jagsrespsi<-jags(data.list, inits=test.inits,
n.chains=2, n.iter=15000, n.thin=3,
DIC=T,
model.file=paste0(modeltype,"model.txt"), parameters.to.save=c("psi.fs"))
Question number 2: When I use traceplot(jagsrespsi) to plot the traceplot seems all over the place but the Rhat for jagsrespsi$BUGSoutput is 1 for all my years? gelman.diag(as.mcmc(jagsrespsi)) also indicates convergence. Same goes for monitoring psi!
I am very astonished by this model behaviour and am suspecting there is something wrong... but no idea where to look

Yes, you can check psi.ft[] for convergence in exactly the same way as you check the convergence of the model's parameters. That's exactly what happens, for example, in a logistic regression, where the fitted probabilities of response are calculated as exp(z)/(1 + exp(z)) for some linear predictor z.
When you say the traceplot is "all over the place", what do you mean? This could be either good or bad. Can you show an example? A "good" traceplot looks like a "fat, hairy caterpillar": consecutive samples taken from all regions of the sample space, a horizontal hair ball. Although written for SAS, this page gives a reasonable high level description of what a good trace plot looks like, and what problems might be indicated by less-than-ideal examples.
In response to your edit to include the trace plot...
That doesn't look like a particularly good traceplot to me: there seems to be some negative autocorrelation between successive samples. Have you calculated the effective sample size [ESS]?
But the plot may look a little odd because your chain is very short, IMHO. You can use the ESS to provide a very rough approximation for the accuracy of an estimated probability: the worst case half width CI of a binomial proportion is +/-2 * sqrt(0.5*0.5/N), where N is the sample size (or ESS in this case). So even if the efficiency of your MCMC process is 1 - so that the ESS is equal to the chain length - then the accuracy of your estimates is only +/-0.02. To estimate a probability to 2 decimal places (so that the half width of the CI is no more than 0.005), you need an ESS of 40,000.
There's nothing wrong with using short chain lengths during testing, but for "production" runs, then I would always use a chan length much greater than 2,500. (And I'd also use multiple chains so that I can use Gelman-Rubin statistics to test for convergence.)

Related

Why is the likelihood/AIC of my poisson regression infinite?

I am trying to evaluate themodel fit of several regressions in R, and I have run into a problem I have had multiple times now: the log-likelihood of my Poisson regression is infinite.
I'm using a non-integer dependent variable (Note: I know what I'm doing in this regard), and I'm wondering if maybe that's the problem. However, I don't get an infinite log-likelihood when running the regression with glm.nb.
Code to reproduce the issue is below.
Edit: the problem appears to go away when I coerce the DV to integer. Any idea how to get log likelihood from Poissons with non-integer DVs?
# Input Data
so_data <- data.frame(dv = c(21.0552722691125, 24.3061351414885, 7.84658638053276,
25.0294679770848, 15.8064731063311, 10.8171744654056, 31.3008088413026,
2.26643928259238, 18.4261153345417, 5.62915828161753, 17.0691184593063,
1.11959635820499, 30.0154935602592, 23.0000809735738, 28.4389825676123,
27.7678405415711, 23.7108405071757, 23.5070651053276, 14.2534787168392,
15.2058525068363, 19.7449094187771, 2.52384709295823, 29.7081691356397,
32.4723790240354, 19.2147002673637, 61.7911384519901, 10.5687170234821,
23.9047421013736, 18.4889651451222, 13.0360878554798, 15.1752866581849,
11.5205948111817, 31.3539840929108, 31.7255952728076, 25.3034625215724,
5.00013988265465, 30.2037887018226, 1.86123112349445, 3.06932041603219,
22.6739418581257, 6.33738321053804, 24.2933951601142, 14.8634827414491,
31.8302947881089, 34.8361908525564, 1.29606416941288, 13.206844629927,
28.843579313401, 25.8024295609021, 14.4414831628722, 18.2109680632694,
14.7092063453463, 10.0738043919183, 28.4124482962025, 27.1004208775326,
1.31350378236957, 14.3009307888745, 1.32555197766214, 2.70896028922312,
3.88043749517381, 3.79492216916016, 19.4507965653633, 32.1689088941444,
2.61278585713499, 41.6955885902228, 2.13466761675063, 30.4207256294235,
24.8231524369244, 20.7605955978196, 17.2182798298094, 2.11563574288652,
12.290778250655, 0.957467139696772, 16.1775287334746))
# Run Model
p_mod <- glm(dv ~ 1, data = so_data, family = poisson(link = 'log'))
# Be Confused
logLik(p_mod)
Elaborating on #ekstroem's comment: the Poisson distribution is only supported over the non-negative integers (0, 1, ...). So, technically speaking, the probability of any non-integer value is zero -- although R does allow for a little bit of fuzz, to allow for round-off/floating-point representation issues:
> dpois(1,lambda=1)
[1] 0.3678794
> dpois(1.1,lambda=1)
[1] 0
Warning message:
In dpois(1.1, lambda = 1) : non-integer x = 1.100000
> dpois(1+1e-7,lambda=1) ## fuzz
[1] 0.3678794
It is theoretically possible to compute something like a Poisson log-likelihood for non-integer values:
my_dpois <- function(x,lambda,log=FALSE) {
LL <- -lambda+x*log(lambda)-lfactorial(x)
if (log) LL else exp(LL)
}
but I would be very careful - some quick tests with integrate suggest it integrates to 1 (after I fixed the bug in it), but I haven't checked more carefully that this is really a well-posed probability distribution. (On the other hand, some reasonable-seeming posts on CrossValidated suggest that it's not insane ...)
You say "I know what I'm doing in this regard"; can you give some more of the context? Some alternative possibilities (although this is steering into CrossValidated territory) -- the best answer depends on where your data really come from (i.e., why you have "count-like" data that are non-integer but you think should be treated as Poisson).
a quasi-Poisson model (family=quasipoisson). (R will still not give you log-likelihood or AIC values in this case, because technically they don't exist -- you're supposed to do inference on the basis of the Wald statistics of the parameters; see e.g. here for more info.)
a Gamma model (probably with a log link)
if the data started out as count data that you've scaled by some measure of effort or exposure), use an appropriate offset model ...
a generalized least-squares model (nlme::gls) with an appropriate heteroscedasticity specification
Poisson log-likelihood involves calculating log(factorial(x)) (https://www.statlect.com/fundamentals-of-statistics/Poisson-distribution-maximum-likelihood). For values larger than 30 it has to be done using Stirling's approximation formula in order to avoid exceeding the limit of computer arithmetic. Sample code in Python:
# define a likelihood function. https://www.statlect.com/fundamentals-of- statistics/Poisson-distribution-maximum-likelihood
def loglikelihood_f(lmba, x):
#Using Stirling formula to avoid calculation of factorial.
#logfactorial(n) = n*ln(n) - n
n = x.size
logfactorial = x*np.log(x+0.001) - x #np.log(factorial(x))
logfactorial[logfactorial == -inf] = 0
result =\
- np.sum(logfactorial) \
- n * lmba \
+ np.log(lmba) * np.sum(x)
return result

How to obtain new samples from ZIP or ZINB-model for bayesian p-value

Hopefully someone can help me with this one, because I am really stuck and do not find my coding error!
I am fitting zero-inflated poisson / negative binomial GLMs (no random effects) in JAGS (with R2Jags) and everything is fine with the parameter estimates, priors, initial values and chains convergence. All results are perfectly in line with, e.g., the estimates from the pscl-package, including my calculation of pearson residuals in the model...
The only thing I cannot get to work is to sample from the model a new sample to obtain a bayesian p-value for evaluating the model fit. The "normal" poisson and negative binomial models I fit before all gave the expected replicated samples and no problems occured.
Here's my code so far, but the important part is "#New Samples":
model{
# 1. Priors
beta ~ dmnorm(b0[], B0[,])
aB ~ dnorm(0.001, 1)
#2. Likelihood function
for (i in 1:N){
# Logistic part
W[i] ~ dbern(psi.min1[i])
psi.min1[i] <- 1 - psi[i]
eta.psi[i] <- aB
logit(psi[i]) <- eta.psi[i]
# Poisson part
Y[i] ~ dpois(mu.eff[i])
mu.eff[i] <- W[i] * mu[i]
log(mu[i]) <- max(-20, min(20, eta.mu[i]))
eta.mu[i] <- inprod(beta[], X[i,])
# Discrepancy measures:
ExpY[i] <- mu [i] * (1 - psi[i])
VarY[i] <- (1- psi[i]) * (mu[i] + psi[i] * pow(mu[i], 2))
PRes[i] <- (Y[i] - ExpY[i]) / sqrt(VarY[i])
D[i] <- pow(PRes[i], 2)
# New Samples:
YNew[i] ~ dpois(mu.eff[i])
PResNew[i] <- (YNew[i] - ExpY[i]) / sqrt(VarY[i])
DNew[i] <- pow(PResNew[i], 2)
}
Fit <- sum(D[1:N])
FitNew <- sum(DNew[1:N])
}
The big problem is, that I really tried all combinations and alterations I think could/should work, but when I look at the simulated samples, I get this here:
> all.equal( Jags1$BUGSoutput$sims.list$YNew, Jags1$BUGSoutput$sims.list$Y )
[1] TRUE
And, to make it really weird, when using the means of Fit and FitNew:
> Jags1$BUGSoutput$mean$Fit
[1] 109.7883
> Jags1$BUGSoutput$mean$FitNew
[1] 119.2111
Has anyone a clue what I am doing wrong? Any help would be deeply appreciated!
Kind regards, Ulf
I suspect this isn't the case, but the only obvious reason I can suspect for Y[i] and YNew[i] being always identical is if mu.eff[i] is ~zero, either because W[i] is 0 or mu[i] is close to zero. This implies that Y[] is always zero, which is easy to check from your data, but as I said it does seem odd that you would be trying to model this... Otherwise, I'm not sure what is going on ... try simplifying the code to see if that solves the problem, and then add things back in until it breaks again. Some other suggestions:
It may be helpful for debugging to look at the absolute values of Y and YNew rather than just Y==YNew
If you want a negative binomial (= gamma-Poisson) try sampling mu[i] from a gamma distribution - I have used this formulation for ZINB models extensively, so am sure it works
Your prior for aB looks odd to me - it gives a prior 95% CI for zero inflation around 12-88% - is that what you intended? And why a mean of 0.001 not 0? If you have no predictors then a beta prior for psi.min seems more natural - and if you have no useful prior information a beta(1,1) prior would be an obvious choice.
Minor point but you are calculating a lot of deterministic functions of aB within the for loop - this is going to slow down your model...
Hope that helps,
Matt
So, after getting a nervous breakdown and typing all again and again while searching for my coding error, I found the most stupid error I have ever made - so far:
I just did not specify "Y" as a parameter to save, only "YNew", so when I compared YNew and Y from the sims.list with all.equal, I did not get what I thought I should. I do not know why JAGS gives me the Y at all (from the sims.list of the JAGS-object), but for some reason it is just giving me YNew when asked to give Y. So this part is actually right:
Jags1$BUGSoutput$mean$Fit
[1] 109.7883
Jags1$BUGSoutput$mean$FitNew
[1] 119.2111
So I hope that I did not cause a major confusion for anybody...

Errors running Maximum Likelihood Estimation on a three parameter Weibull cdf

I am working with the cumulative emergence of flies over time (taken at irregular intervals) over many summers (though first I am just trying to make one year work). The cumulative emergence follows a sigmoid pattern and I want to create a maximum likelihood estimation of a 3-parameter Weibull cumulative distribution function. The three-parameter models I've been trying to use in the fitdistrplus package keep giving me an error. I think this must have something to do with how my data is structured, but I cannot figure it out. Obviously I want it to read each point as an x (degree days) and a y (emergence) value, but it seems to be unable to read two columns. The main error I'm getting says "Non-numeric argument to mathematical function" or (with slightly different code) "data must be a numeric vector of length greater than 1". Below is my code including added columns in the df_dd_em dataframe for cumulative emergence and percent emergence in case that is useful.
degree_days <- c(998.08,1039.66,1111.29,1165.89,1236.53,1293.71,
1347.66,1387.76,1445.47,1493.44,1553.23,1601.97,
1670.28,1737.29,1791.94,1849.20,1920.91,1967.25,
2036.64,2091.85,2152.89,2199.13,2199.13,2263.09,
2297.94,2352.39,2384.03,2442.44,2541.28,2663.90,
2707.36,2773.82,2816.39,2863.94)
emergence <- c(0,0,0,1,1,0,2,3,17,10,0,0,0,2,0,3,0,0,1,5,0,0,0,0,
0,0,0,0,1,0,0,0,0,0)
cum_em <- cumsum(emergence)
df_dd_em <- data.frame (degree_days, emergence, cum_em)
df_dd_em$percent <- ave(df_dd_em$emergence, FUN = function(df_dd_em) 100*(df_dd_em)/46)
df_dd_em$cum_per <- ave(df_dd_em$cum_em, FUN = function(df_dd_em) 100*(df_dd_em)/46)
x <- pweibull(df_dd_em[c(1,3)],shape=5)
dframe2.mle <- fitdist(x, "weibull",method='mle')
Here's my best guess at what you're after:
Set up data:
dd <- data.frame(degree_days=c(998.08,1039.66,1111.29,1165.89,1236.53,1293.71,
1347.66,1387.76,1445.47,1493.44,1553.23,1601.97,
1670.28,1737.29,1791.94,1849.20,1920.91,1967.25,
2036.64,2091.85,2152.89,2199.13,2199.13,2263.09,
2297.94,2352.39,2384.03,2442.44,2541.28,2663.90,
2707.36,2773.82,2816.39,2863.94),
emergence=c(0,0,0,1,1,0,2,3,17,10,0,0,0,2,0,3,0,0,1,5,0,0,0,0,
0,0,0,0,1,0,0,0,0,0))
dd <- transform(dd,cum_em=cumsum(emergence))
We're actually going to fit to an "interval-censored" distribution (i.e. probability of emergence between successive degree day observations: this version assumes that the first observation refers to observations before the first degree-day observation, you could change it to refer to observations after the last observation).
library(bbmle)
## y*log(p) allowing for 0/0 occurrences:
y_log_p <- function(y,p) ifelse(y==0 & p==0,0,y*log(p))
NLLfun <- function(scale,shape,x=dd$degree_days,y=dd$emergence) {
prob <- pmax(diff(pweibull(c(-Inf,x), ## or (c(x,Inf))
shape=shape,scale=scale)),1e-6)
## multinomial probability
-sum(y_log_p(y,prob))
}
library(bbmle)
I should probably have used something more systematic like the method of moments (i.e. matching the mean and variance of a Weibull distribution with the mean and variance of the data), but I just hacked around a bit to find plausible starting values:
## preliminary look (method of moments would be better)
scvec <- 10^(seq(0,4,length=101))
plot(scvec,sapply(scvec,NLLfun,shape=1))
It's important to use parscale to let R know that the parameters are on very different scales:
startvals <- list(scale=1000,shape=1)
m1 <- mle2(NLLfun,start=startvals,
control=list(parscale=unlist(startvals)))
Now try with a three-parameter Weibull (as originally requested) -- requires only a slight modification of what we already have:
library(FAdist)
NLLfun2 <- function(scale,shape,thres,
x=dd$degree_days,y=dd$emergence) {
prob <- pmax(diff(pweibull3(c(-Inf,x),shape=shape,scale=scale,thres)),
1e-6)
## multinomial probability
-sum(y_log_p(y,prob))
}
startvals2 <- list(scale=1000,shape=1,thres=100)
m2 <- mle2(NLLfun2,start=startvals2,
control=list(parscale=unlist(startvals2)))
Looks like the three-parameter fit is much better:
library(emdbook)
AICtab(m1,m2)
## dAIC df
## m2 0.0 3
## m1 21.7 2
And here's the graphical summary:
with(dd,plot(cum_em~degree_days,cex=3))
with(as.list(coef(m1)),curve(sum(dd$emergence)*
pweibull(x,shape=shape,scale=scale),col=2,
add=TRUE))
with(as.list(coef(m2)),curve(sum(dd$emergence)*
pweibull3(x,shape=shape,
scale=scale,thres=thres),col=4,
add=TRUE))
(could also do this more elegantly with ggplot2 ...)
These don't seem like spectacularly good fits, but they're sane. (You could in principle do a chi-squared goodness-of-fit test based on the expected number of emergences per interval, and accounting for the fact that you've fitted a three-parameter model, although the values might be a bit low ...)
Confidence intervals on the fit are a bit of a nuisance; your choices are (1) bootstrapping; (2) parametric bootstrapping (resample parameters assuming a multivariate normal distribution of the data); (3) delta method.
Using bbmle::mle2 makes it easy to do things like get profile confidence intervals:
confint(m1)
## 2.5 % 97.5 %
## scale 1576.685652 1777.437283
## shape 4.223867 6.318481
dd <- data.frame(degree_days=c(998.08,1039.66,1111.29,1165.89,1236.53,1293.71,
1347.66,1387.76,1445.47,1493.44,1553.23,1601.97,
1670.28,1737.29,1791.94,1849.20,1920.91,1967.25,
2036.64,2091.85,2152.89,2199.13,2199.13,2263.09,
2297.94,2352.39,2384.03,2442.44,2541.28,2663.90,
2707.36,2773.82,2816.39,2863.94),
emergence=c(0,0,0,1,1,0,2,3,17,10,0,0,0,2,0,3,0,0,1,5,0,0,0,0,
0,0,0,0,1,0,0,0,0,0))
dd$cum_em <- cumsum(dd$emergence)
dd$percent <- ave(dd$emergence, FUN = function(dd) 100*(dd)/46)
dd$cum_per <- ave(dd$cum_em, FUN = function(dd) 100*(dd)/46)
dd <- transform(dd)
#start 3 parameter model
library(FAdist)
## y*log(p) allowing for 0/0 occurrences:
y_log_p <- function(y,p) ifelse(y==0 & p==0,0,y*log(p))
NLLfun2 <- function(scale,shape,thres,
x=dd$degree_days,y=dd$percent) {
prob <- pmax(diff(pweibull3(c(-Inf,x),shape=shape,scale=scale,thres)),
1e-6)
## multinomial probability
-sum(y_log_p(y,prob))
}
startvals2 <- list(scale=1000,shape=1,thres=100)
m2 <- mle2(NLLfun2,start=startvals2,
control=list(parscale=unlist(startvals2)))
summary(m2)
#graphical summary
windows(5,5)
with(dd,plot(cum_per~degree_days,cex=3))
with(as.list(coef(m2)),curve(sum(dd$percent)*
pweibull3(x,shape=shape,
scale=scale,thres=thres),col=4,
add=TRUE))

Feature selection + cross-validation, but how to make ROC-curves in R

I'm stuck with the next problem. I divide my data into 10 folds. Each time, I use 1 fold as test set and the other 9 as training set (I do this ten times). On each training set, I do feature selection (filter methode with chi.squared) and then I make a SVMmodel with my training set and the selected features.
So at the end, I become 10 different models (because of the feature selection). But now I want to make a ROC-curve in R from this filter methode in general. How can I do this?
Silke
You can indeed store the predictions if they are all on the same scale (be especially careful about this as you perform feature selection... some methods may produce scores that are dependent on the number of features) and use them to build a ROC curve. Here is the code I used for a recent paper:
library(pROC)
data(aSAH)
k <- 10
n <- dim(aSAH)[1]
indices <- sample(rep(1:k, ceiling(n/k))[1:n])
all.response <- all.predictor <- aucs <- c()
for (i in 1:k) {
test = aSAH[indices==i,]
learn = aSAH[indices!=i,]
model <- glm(as.numeric(outcome)-1 ~ s100b + ndka + as.numeric(wfns), data = learn, family=binomial(link = "logit"))
model.pred <- predict(model, newdata=test)
aucs <- c(aucs, roc(test$outcome, model.pred)$auc)
all.response <- c(all.response, test$outcome)
all.predictor <- c(all.predictor, model.pred)
}
roc(all.response, all.predictor)
mean(aucs)
The roc curve is built from all.response and all.predictor that are updated at each step. This code also stores the AUC at each step in auc for comparison. Both results should be quite similar when the sample size is sufficiently large, however small samples within the cross-validation may lead to underestimated AUC as the ROC curve with all data will tend to be smoother and less underestimated by the trapezoidal rule.

Maximum likelihood estimation for ARMA(1,1)-GARCH(1,1)

Following some standard textbooks on ARMA(1,1)-GARCH(1,1) (e.g. Ruey Tsay's Analysis of Financial Time Series), I try to write an R program to estimate the key parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. For some random reason, I cannot decipher what is wrong with my R program. The R package fGarch already gives me the answer, but my customized function does not seem to produce the same result.
I would like to build an R program that helps estimate the baseline ARMA(1,1)-GARCH(1,1) model. Then I would like to adapt this baseline script to fit different GARCH variants (e.g. EGARCH, NGARCH, and TGARCH). It would be much appreciated if you could provide some guidance in this case. The code below is the R script for estimating the 6 parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. At any rate, I would be glad to know your thoughts and insights. If you have a similar example, please feel free to share your extant code in R. Many thanks in advance.
Emily
# This R script offers a suite of functions for estimating the volatility dynamics based on the standard ARMA(1,1)-GARCH(1,1) model and its variants.
# The baseline ARMA(1,1) model characterizes the dynamic evolution of the return generating process.
# The baseline GARCH(1,1) model depicts the the return volatility dynamics over time.
# We can extend the GARCH(1,1) volatility model to a variety of alternative specifications to capture the potential asymmetry for a better comparison:
# GARCH(1,1), EGARCH(1,1), NGARCH(1,1), and TGARCH(1,1).
options(scipen=10)
intel= read.csv(file="intel.csv")
summary(intel)
raw_data= as.matrix(intel$logret)
library(fGarch)
garchFit(~arma(1,1)+garch(1,1), data=raw_data, trace=FALSE)
negative_log_likelihood_arma11_garch11=
function(theta, data)
{mean =theta[1]
delta=theta[2]
gamma=theta[3]
omega=theta[4]
alpha=theta[5]
beta= theta[6]
r= ts(data)
n= length(r)
u= vector(length=n)
u= ts(u)
u[1]= r[1]- mean
for (t in 2:n)
{u[t]= r[t]- mean- delta*r[t-1]- gamma*u[t-1]}
h= vector(length=n)
h= ts(h)
h[1]= omega/(1-alpha-beta)
for (t in 2:n)
{h[t]= omega+ alpha*(u[t-1]^2)+ beta*h[t-1]}
#return(-sum(dnorm(u[2:n], mean=mean, sd=sqrt(h[2:n]), log=TRUE)))
pi=3.141592653589793238462643383279502884197169399375105820974944592
return(-sum(-0.5*log(2*pi) -0.5*log(h[2:n]) -0.5*(u[2:n]^2)/h[2:n]))
}
#theta0=c(0, +0.78, -0.79, +0.0000018, +0.06, +0.93, 0.01)
theta0=rep(0.01,6)
negative_log_likelihood_arma11_garch11(theta=theta0, data=raw_data)
alpha= proc.time()
maximum_likelihood_fit_arma11_garch11=
nlm(negative_log_likelihood_arma11_garch11,
p=theta0,
data=raw_data,
hessian=TRUE,
iterlim=500)
#optim(theta0,
# negative_log_likelihood_arma11_garch11,
# data=raw_data,
# method="L-BFGS-B",
# upper=c(+0.999999999999,+0.999999999999,+0.999999999999,0.999999999999,0.999999999999,0.999999999999),
# lower=c(-0.999999999999,-0.999999999999,-0.999999999999,0.000000000001,0.000000000001,0.000000000001),
# hessian=TRUE)
# We record the end time and calculate the total runtime for the above work.
omega= proc.time()
runtime= omega-alpha
zhours = floor(runtime/60/60)
zminutes=floor(runtime/60- zhours*60)
zseconds=floor(runtime- zhours*60*60- zminutes*60)
print(paste("It takes ",zhours,"hour(s)", zminutes," minute(s) ","and ", zseconds,"second(s) to finish running this R program",sep=""))
maximum_likelihood_fit_arma11_garch11
sqrt(diag(solve(maximum_likelihood_fit_arma11_garch11$hessian)))

Resources