Following some standard textbooks on ARMA(1,1)-GARCH(1,1) (e.g. Ruey Tsay's Analysis of Financial Time Series), I am trying to write an R program to estimate the key parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. For some reason, I cannot figure out what is wrong with my R program. The fGarch package already gives me the answer, but my customized function does not produce the same result.
I would like to build an R program that estimates the baseline ARMA(1,1)-GARCH(1,1) model, and then adapt this baseline script to fit different GARCH variants (e.g. EGARCH, NGARCH, and TGARCH). Any guidance would be much appreciated. The code below is the R script for estimating the 6 parameters of an ARMA(1,1)-GARCH(1,1) model for Intel's stock returns. In any case, I would be glad to hear your thoughts and insights. If you have a similar example, please feel free to share your existing R code. Many thanks in advance.
Emily
# This R script offers a suite of functions for estimating the volatility dynamics based on the standard ARMA(1,1)-GARCH(1,1) model and its variants.
# The baseline ARMA(1,1) model characterizes the dynamic evolution of the return generating process.
# The baseline GARCH(1,1) model depicts the return volatility dynamics over time.
# We can extend the GARCH(1,1) volatility model to a variety of alternative specifications to capture the potential asymmetry for a better comparison:
# GARCH(1,1), EGARCH(1,1), NGARCH(1,1), and TGARCH(1,1).
options(scipen=10)
intel= read.csv(file="intel.csv")
summary(intel)
raw_data= as.matrix(intel$logret)
library(fGarch)
garchFit(~arma(1,1)+garch(1,1), data=raw_data, trace=FALSE)
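To make the later comparison with the hand-coded likelihood explicit, the fGarch benchmark can also be stored and its coefficients printed (a small sketch; fit_fGarch is just a name introduced here):
fit_fGarch <- garchFit(~ arma(1,1) + garch(1,1), data = raw_data, trace = FALSE)
round(coef(fit_fGarch), 6)   # mu, ar1, ma1, omega, alpha1, beta1 for comparison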
negative_log_likelihood_arma11_garch11 <- function(theta, data) {
  # Unpack the six parameters: ARMA(1,1) mean equation and GARCH(1,1) variance equation
  mean  <- theta[1]
  delta <- theta[2]   # AR(1) coefficient
  gamma <- theta[3]   # MA(1) coefficient
  omega <- theta[4]
  alpha <- theta[5]
  beta  <- theta[6]
  r <- ts(data)
  n <- length(r)
  # ARMA(1,1) filter: recover the innovation series u[t]
  u <- ts(vector(length = n))
  u[1] <- r[1] - mean
  for (t in 2:n) {
    u[t] <- r[t] - mean - delta * r[t - 1] - gamma * u[t - 1]
  }
  # GARCH(1,1) recursion for the conditional variance h[t],
  # started at the unconditional variance omega/(1 - alpha - beta)
  h <- ts(vector(length = n))
  h[1] <- omega / (1 - alpha - beta)
  for (t in 2:n) {
    h[t] <- omega + alpha * u[t - 1]^2 + beta * h[t - 1]
  }
  # Gaussian log-likelihood, conditioning on the first observation
  # return(-sum(dnorm(u[2:n], mean = mean, sd = sqrt(h[2:n]), log = TRUE)))
  pi <- 3.141592653589793238462643383279502884197169399375105820974944592
  return(-sum(-0.5 * log(2 * pi) - 0.5 * log(h[2:n]) - 0.5 * u[2:n]^2 / h[2:n]))
}
#theta0=c(0, +0.78, -0.79, +0.0000018, +0.06, +0.93, 0.01)
theta0=rep(0.01,6)
negative_log_likelihood_arma11_garch11(theta=theta0, data=raw_data)
alpha= proc.time()
maximum_likelihood_fit_arma11_garch11=
nlm(negative_log_likelihood_arma11_garch11,
p=theta0,
data=raw_data,
hessian=TRUE,
iterlim=500)
#optim(theta0,
# negative_log_likelihood_arma11_garch11,
# data=raw_data,
# method="L-BFGS-B",
# upper=c(+0.999999999999,+0.999999999999,+0.999999999999,0.999999999999,0.999999999999,0.999999999999),
# lower=c(-0.999999999999,-0.999999999999,-0.999999999999,0.000000000001,0.000000000001,0.000000000001),
# hessian=TRUE)
# We record the end time and calculate the total runtime for the above work.
omega <- proc.time()
runtime <- (omega - alpha)["elapsed"]   # elapsed wall-clock seconds
zhours   <- floor(runtime / 60 / 60)
zminutes <- floor(runtime / 60 - zhours * 60)
zseconds <- floor(runtime - zhours * 60 * 60 - zminutes * 60)
print(paste("It takes ", zhours, " hour(s), ", zminutes, " minute(s), and ",
            zseconds, " second(s) to finish running this R program.", sep = ""))
maximum_likelihood_fit_arma11_garch11
sqrt(diag(solve(maximum_likelihood_fit_arma11_garch11$hessian)))
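For the planned extension to asymmetric variants, here is a minimal sketch of how the variance recursion above could be changed into a GJR/TGARCH(1,1)-type specification. This is only an illustration: the extra asymmetry coefficient eta, the 7-parameter theta, and the variance start-up value are assumptions, not part of the original script or fGarch's parameterization.
negative_log_likelihood_arma11_tgarch11 <- function(theta, data) {
  # theta = (mean, delta, gamma, omega, alpha, eta, beta); eta is the asymmetry term
  mean  <- theta[1]; delta <- theta[2]; gamma <- theta[3]
  omega <- theta[4]; alpha <- theta[5]; eta   <- theta[6]; beta <- theta[7]
  r <- as.numeric(data)
  n <- length(r)
  u <- numeric(n)
  u[1] <- r[1] - mean
  for (t in 2:n) u[t] <- r[t] - mean - delta * r[t - 1] - gamma * u[t - 1]
  h <- numeric(n)
  h[1] <- var(r)   # simple finite start-up value (an arbitrary choice)
  for (t in 2:n) {
    # GJR/TGARCH-style asymmetry: negative shocks get the extra weight eta
    h[t] <- omega + (alpha + eta * (u[t - 1] < 0)) * u[t - 1]^2 + beta * h[t - 1]
  }
  -sum(dnorm(u[2:n], mean = 0, sd = sqrt(h[2:n]), log = TRUE))
}
# e.g. nlm(negative_log_likelihood_arma11_tgarch11, p = c(0, 0.1, 0.1, 1e-6, 0.05, 0.05, 0.9),
#          data = raw_data, hessian = TRUE, iterlim = 500)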
UPDATE: now with a traceplot example.
I am trying to adapt Outhwaite et al.'s (2018) code for occupancy modelling and have a couple of questions that I just can't seem to find an answer for...
Code used to create model
cat(
"model{
  ### Model ###
  # State model
  for (i in 1:nsite){
    for (t in 1:nyear){
      z[i,t] ~ dbern(psi[i,t])
      logit(psi[i,t]) <- b[t] + u[i]
    }
  }
  # Observation model
  for (j in 1:nvisit) {
    y[j] ~ dbern(Py[j]+0.0001)
    Py[j] <- z[Site[j],Year[j]]*p[j]
    logit(p[j]) <- a[Year[j]] + c*logL[j]
  }
  ### Priors ###
  # State model priors
  for (t in 1:nyear){
    b[t] ~ dunif(-10,10)            # fixed year effect
  }
  for (i in 1:nsite) {
    u[i] ~ dnorm(0, tau.u)          # random site effect
  }
  tau.u <- 1/(sd.u * sd.u)
  sd.u ~ dunif(0, 5)                # half-uniform hyperpriors
  # Observation model priors
  for (t in 1:nyear) {
    a[t] ~ dnorm(mu.a, tau.a)       # random year effect
  }
  mu.a ~ dnorm(0, 0.01)
  tau.a <- 1 / (sd.a * sd.a)
  sd.a ~ dunif(0, 5)                # half-uniform hyperpriors
  c ~ dunif(-10, 10)                # sampling effort effect
  ### Derived parameters ###
  # Finite sample occupancy - proportion of occupied sites
  for (t in 1:nyear) {
    psi.fs[t] <- sum(z[1:nsite,t])/nsite
  }
  #data# nyear, nsite, nvisit, y, logL, Site, Year
}", file="bmmodel.txt"
)
Note that dbern(Py[j]+0.0001) includes a correction factor since dbern(0) is not supported in JAGS.
I am running the model on some plant data, basically just trying it out to see whether it runs, converges, and behaves as I would expect.
Question number 1 (ANSWERED): I am interested in the quantity psi.fs[t]. But since the model calculates this quantity after the actual modelling process, can convergence be assessed for psi.fs[t]?
R code for running the model with R2jags
jagsrespsi<-jags(data.list, inits=test.inits,
n.chains=2, n.iter=15000, n.thin=3,
DIC=T,
model.file=paste0(modeltype,"model.txt"), parameters.to.save=c("psi.fs"))
Question number 2: When I use traceplot(jagsrespsi), the traceplot seems all over the place, but the Rhat in jagsrespsi$BUGSoutput is 1 for all my years. gelman.diag(as.mcmc(jagsrespsi)) also indicates convergence. The same goes for monitoring psi!
I am very surprised by this model behaviour and suspect there is something wrong, but I have no idea where to look.
Yes, you can check psi.fs[] for convergence in exactly the same way as you check the convergence of the model's parameters. That's exactly what happens, for example, in a logistic regression, where the fitted probabilities of response are calculated as exp(z)/(1 + exp(z)) for some linear predictor z.
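A minimal sketch of doing that with the R2jags output from the question (assuming the jagsrespsi object above; BUGSoutput$summary reports Rhat and n.eff for every monitored node, including the derived psi.fs[t]):
round(jagsrespsi$BUGSoutput$summary[, c("Rhat", "n.eff")], 3)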
When you say the traceplot is "all over the place", what do you mean? This could be either good or bad. Can you show an example? A "good" traceplot looks like a "fat, hairy caterpillar": consecutive samples taken from all regions of the sample space, a horizontal hair ball. Although written for SAS, this page gives a reasonable high level description of what a good trace plot looks like, and what problems might be indicated by less-than-ideal examples.
In response to your edit to include the trace plot...
That doesn't look like a particularly good traceplot to me: there seems to be some negative autocorrelation between successive samples. Have you calculated the effective sample size [ESS]?
But the plot may look a little odd because your chain is very short, IMHO. You can use the ESS to provide a very rough approximation of the accuracy of an estimated probability: the worst-case half-width of the CI of a binomial proportion is +/-2 * sqrt(0.5*0.5/N), where N is the sample size (or the ESS in this case). So even if the efficiency of your MCMC process is 1, so that the ESS equals the chain length, the accuracy of your estimates is only about +/-0.02. To estimate a probability to 2 decimal places (so that the half-width of the CI is no more than 0.005), you need an ESS of 40,000.
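A rough sketch of that calculation in R, assuming the jagsrespsi object from the question:
library(coda)
ess <- effectiveSize(as.mcmc(jagsrespsi))   # effective sample size per monitored node
round(2 * sqrt(0.5 * 0.5 / ess), 3)         # worst-case CI half-width implied by each ESS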
There's nothing wrong with using short chain lengths during testing, but for "production" runs I would always use a chain length much greater than 2,500. (And I'd also use multiple chains so that I can use Gelman-Rubin statistics to test for convergence.)
I am trying to evaluate the model fit of several regressions in R, and I have run into a problem I have had multiple times now: the log-likelihood of my Poisson regression is infinite.
I'm using a non-integer dependent variable (Note: I know what I'm doing in this regard), and I'm wondering if maybe that's the problem. However, I don't get an infinite log-likelihood when running the regression with glm.nb.
Code to reproduce the issue is below.
Edit: the problem appears to go away when I coerce the DV to integer. Any idea how to get the log-likelihood from a Poisson model with a non-integer DV?
# Input Data
so_data <- data.frame(dv = c(21.0552722691125, 24.3061351414885, 7.84658638053276,
25.0294679770848, 15.8064731063311, 10.8171744654056, 31.3008088413026,
2.26643928259238, 18.4261153345417, 5.62915828161753, 17.0691184593063,
1.11959635820499, 30.0154935602592, 23.0000809735738, 28.4389825676123,
27.7678405415711, 23.7108405071757, 23.5070651053276, 14.2534787168392,
15.2058525068363, 19.7449094187771, 2.52384709295823, 29.7081691356397,
32.4723790240354, 19.2147002673637, 61.7911384519901, 10.5687170234821,
23.9047421013736, 18.4889651451222, 13.0360878554798, 15.1752866581849,
11.5205948111817, 31.3539840929108, 31.7255952728076, 25.3034625215724,
5.00013988265465, 30.2037887018226, 1.86123112349445, 3.06932041603219,
22.6739418581257, 6.33738321053804, 24.2933951601142, 14.8634827414491,
31.8302947881089, 34.8361908525564, 1.29606416941288, 13.206844629927,
28.843579313401, 25.8024295609021, 14.4414831628722, 18.2109680632694,
14.7092063453463, 10.0738043919183, 28.4124482962025, 27.1004208775326,
1.31350378236957, 14.3009307888745, 1.32555197766214, 2.70896028922312,
3.88043749517381, 3.79492216916016, 19.4507965653633, 32.1689088941444,
2.61278585713499, 41.6955885902228, 2.13466761675063, 30.4207256294235,
24.8231524369244, 20.7605955978196, 17.2182798298094, 2.11563574288652,
12.290778250655, 0.957467139696772, 16.1775287334746))
# Run Model
p_mod <- glm(dv ~ 1, data = so_data, family = poisson(link = 'log'))
# Be Confused
logLik(p_mod)
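For reference, the coercion mentioned in the edit looks like this (rounding is just one arbitrary way to force an integer response):
p_mod_int <- glm(round(dv) ~ 1, data = so_data, family = poisson(link = 'log'))
logLik(p_mod_int)   # finite once the response is integer-valued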
Elaborating on @ekstroem's comment: the Poisson distribution is only supported on the non-negative integers (0, 1, ...). So, technically speaking, the probability of any non-integer value is zero -- although R does allow for a little bit of fuzz, to allow for round-off/floating-point representation issues:
> dpois(1,lambda=1)
[1] 0.3678794
> dpois(1.1,lambda=1)
[1] 0
Warning message:
In dpois(1.1, lambda = 1) : non-integer x = 1.100000
> dpois(1+1e-7,lambda=1) ## fuzz
[1] 0.3678794
It is theoretically possible to compute something like a Poisson log-likelihood for non-integer values:
my_dpois <- function(x, lambda, log = FALSE) {
  LL <- -lambda + x * log(lambda) - lfactorial(x)
  if (log) LL else exp(LL)
}
but I would be very careful - some quick tests with integrate suggest it integrates to 1 (after I fixed the bug in it), but I haven't checked more carefully that this is really a well-posed probability distribution. (On the other hand, some reasonable-seeming posts on CrossValidated suggest that it's not insane ...)
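A sketch of the kind of quick integrate() check mentioned above, assuming the my_dpois definition just given (the value is only approximately 1 and depends on lambda):
# numerically integrate the continuous extension over [0, Inf);
# for moderate lambda the result comes out close to 1
integrate(my_dpois, lower = 0, upper = Inf, lambda = 5)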
You say "I know what I'm doing in this regard"; can you give some more of the context? Some alternative possibilities (although this is steering into CrossValidated territory) -- the best answer depends on where your data really come from (i.e., why you have "count-like" data that are non-integer but you think should be treated as Poisson).
a quasi-Poisson model (family=quasipoisson). (R will still not give you log-likelihood or AIC values in this case, because technically they don't exist -- you're supposed to do inference on the basis of the Wald statistics of the parameters; see e.g. here for more info.)
a Gamma model (probably with a log link; see the sketches after this list)
if the data started out as count data that you've scaled by some measure of effort or exposure, use an appropriate offset model ...
a generalized least-squares model (nlme::gls) with an appropriate heteroscedasticity specification
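For the quasi-Poisson and Gamma alternatives above, minimal sketches on the posted so_data (illustrations only, not a recommendation for this particular data set):
# quasi-Poisson: same point estimates as the Poisson fit, but no logLik/AIC
qp_mod <- glm(dv ~ 1, data = so_data, family = quasipoisson(link = 'log'))
summary(qp_mod)$coefficients
# Gamma with a log link: a proper likelihood for positive, continuous responses
g_mod <- glm(dv ~ 1, data = so_data, family = Gamma(link = 'log'))
logLik(g_mod)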
The Poisson log-likelihood involves calculating log(factorial(x)) (https://www.statlect.com/fundamentals-of-statistics/Poisson-distribution-maximum-likelihood). For large values of x, factorial(x) exceeds the limits of computer arithmetic, so the log-factorial has to be computed with Stirling's approximation (or a log-gamma function) rather than directly. Sample code in Python:
# define a likelihood function. https://www.statlect.com/fundamentals-of-statistics/Poisson-distribution-maximum-likelihood
import numpy as np

def loglikelihood_f(lmba, x):
    # Use Stirling's formula to avoid computing factorial(x) directly:
    # log(factorial(n)) ~ n*ln(n) - n
    n = x.size
    logfactorial = x * np.log(x + 0.001) - x   # approximates np.log(factorial(x))
    logfactorial[logfactorial == -np.inf] = 0  # guard against -inf from log(0)
    result = (
        - np.sum(logfactorial)
        - n * lmba
        + np.log(lmba) * np.sum(x)
    )
    return result
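In R, the same overflow issue is usually avoided by computing the log-factorial directly with lfactorial() (i.e. lgamma(x+1)) instead of Stirling's formula; a minimal sketch:
loglik_pois <- function(lambda, x) {
  sum(-lambda + x * log(lambda) - lfactorial(x))   # lfactorial never overflows
}
loglik_pois(5, c(2, 30, 300, 1000))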
So I have 500 Poisson-distributed simulated samples with n=100 each.
1) How can I estimate the lambdas for each of these samples separately in R?
2) How can I draw a kernel estimate of the density function of the estimator for lambda, based on the 500 estimated lambdas? (My guess is that this is done somehow with the "KernSmooth" package and the function "bkfe", but I fail to program it properly anyway.)
taskpois <- function(size, leng){
  taskmlepois <- NULL
  for (i in 1:leng){
    randompois <- rpois(size, 6)
    taskmlepois[i] <- mean(randompois)
  }
  return(taskmlepois)
}
tasksample <- taskpois(size=100, leng=500)
As the comments suggest, it seems you're pretty close already.
ltarget <- 2
set.seed(101)
lambdavec <- replicate(500,mean(rpois(100,lambda=ltarget)))
dd <- density(lambdavec)
plot(dd,main="",las=1,bty="l")
We might as well add the expected result based on asymptotic theory:
curve(dnorm(x,mean=2,sd=sqrt(2/100)),add=TRUE,col=2)
We can add another line that shows that the variation among the densities of different experiments is pretty large relative to the difference between the theoretical and observed density from the first experiment:
lambdavec2 <- replicate(500,mean(rpois(100,lambda=ltarget)))
lines(density(lambdavec2),col=4)
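Since the question mentions the KernSmooth package: bkfe estimates density functionals (used for bandwidth selection) rather than the density itself, but the density can be obtained with bkde() plus a plug-in bandwidth from dpik(). A small sketch, added to the plot above:
library(KernSmooth)
h  <- dpik(lambdavec)                  # plug-in bandwidth
kd <- bkde(lambdavec, bandwidth = h)   # binned kernel density estimate
lines(kd, col = 3)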
When I run the example code for a random walk using the fractal package:
library(fractal)
## create test series
set.seed(100)
x <- rnorm(1024)
walk <- cumsum(x)
## calculate the Hurst coefficient of a random
## walk series using various techniques
methods <- c("aggabs","aggvar","diffvar","higuchi")
z <- lapply(methods, function(method, walk){
hurstBlock(ifelse1(method=="higuchi",diff(walk),walk), method=method)
},walk=walk )
names(z) <- methods
## plot results
old.plt <- splitplot(2,2,1)
for (i in 1:4){
if (i > 1)
splitplot(2,2,i)
plot(z[[i]], key=FALSE)
mtext(paste(attr(z[[i]],"stat.name"), round(as.numeric(z[[i]]),3), sep=", H="),
line=0.5, adj=1)
}
par(old.plt)
I do not get a Hurst exponent of 0.5. I understand that there are different methods of calculation, but I believe aggvar is the closest to the R/S rescaled-range method, and for a Brownian motion it should return 0.5, not the 0.94 it does.
Shouldn't it be 0.5?
Also, if you change
walk <- cumsum(x)
to
walk <- x
then this is white noise and I should get a Hurst exponent of 0, but I don't.
Can you help clarify?
I have done extensive testing of the three most prominent Hurst exponent libraries since my answer above and posted a video on YouTube. The most commonly used estimators are hurstSpec from the fractal package and hurstexp from the pracma library. The other estimator, from the PerformanceAnalytics econometrics/finance library, does not seem to be right no matter which estimate you use. I left an error-correction request (or at least a request to check) on the author's GitHub maintenance site about four months ago, and I have not checked back for a response.
iwm_rt <- dailyReturn(IWM, type ="log")
head(iwm_rt)
library(pracma)
library(fractal)
x <- iwm_rt
hurstBlock(x, method="aggAbs", scale.min=8, scale.max=NULL,
scale.ratio=9, weight=function(x) rep(1,length(x)), fit=lm)
fit_aggab <- hurstBlock(x, method="aggAbs", scale.min=8, scale.max=NULL,
scale.ratio=9, weight=function(x) rep(1,length(x)), fit=lm)
plot(fit_aggab)
RoverS(x, n.block.min=22, scale.ratio=9, scale.min=8)
hurstSpec(x, method="standard", freq.max=0.25, dc=FALSE, n.block=NULL,
weight=function(x) rep(1,length(x)), fit=lm, sdf.method="direct")
hurstexp(x)
The aggAbs method from hurstBlock in fractal is an aggregate estimate for a large-sample time series. The most crucial factor in fitting the aggAbs estimator is the scale.ratio argument; I choose it by trial and error, which is a bit tedious, but I do not know of a more scientific method. scale.ratio is the multiplier for each subsequent series of samples drawn from the time series.
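As a quick sanity check in the spirit of the original question, pracma::hurstexp can be run on simulated white noise, where estimates near 0.5 are expected (a sketch; the exact numbers depend on the seed):
library(pracma)
set.seed(100)
wn <- rnorm(1024)   # white-noise series
hurstexp(wn)        # the R/S-type estimates should come out near 0.5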
I'm stuck on the following problem. I divide my data into 10 folds. Each time, I use 1 fold as the test set and the other 9 as the training set (I do this ten times). On each training set, I do feature selection (a filter method with chi.squared) and then build an SVM model with the training set and the selected features.
So in the end I get 10 different models (because of the feature selection). But now I want to make an overall ROC curve in R for this filter method. How can I do this?
Silke
You can indeed store the predictions if they are all on the same scale (be especially careful about this as you perform feature selection... some methods may produce scores that are dependent on the number of features) and use them to build a ROC curve. Here is the code I used for a recent paper:
library(pROC)
data(aSAH)
k <- 10
n <- dim(aSAH)[1]
indices <- sample(rep(1:k, ceiling(n/k))[1:n])
all.response <- all.predictor <- aucs <- c()
for (i in 1:k) {
test = aSAH[indices==i,]
learn = aSAH[indices!=i,]
model <- glm(as.numeric(outcome)-1 ~ s100b + ndka + as.numeric(wfns), data = learn, family=binomial(link = "logit"))
model.pred <- predict(model, newdata=test)
aucs <- c(aucs, roc(test$outcome, model.pred)$auc)
all.response <- c(all.response, test$outcome)
all.predictor <- c(all.predictor, model.pred)
}
roc(all.response, all.predictor)
mean(aucs)
The ROC curve is built from all.response and all.predictor, which are updated at each step. This code also stores the AUC at each step in aucs for comparison. Both results should be quite similar when the sample size is sufficiently large; however, small samples within the cross-validation may lead to an underestimated AUC, as the ROC curve built from all the data will tend to be smoother and less underestimated by the trapezoidal rule.
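If an interval is wanted for the pooled estimate, pROC's ci() can be applied to the same ROC object (a small sketch using the package defaults):
pooled_roc <- roc(all.response, all.predictor)
ci(pooled_roc)       # confidence interval for the pooled AUC
plot(pooled_roc)     # the cross-validated ROC curve itself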