How can I fit an EVT-Copula-GARCH model via the ugarchroll() function to obtain Value-at-Risk forecasts? Thanks.
I have already obtained standardized residuals from my fitted GARCH; next I would need to apply the kernel and fit a copula. I know how I would do each step individually from a technical perspective, but I cannot wrap my head around how to apply it straightforwardly with the "rugarch" or "rmgarch" package.
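For reference, here is a rough sketch of the individual steps as I understand them, with hypothetical return series ret1 and ret2; the kernel/EVT marginal CDF is replaced by plain pseudo-observations for brevity, and the rolling VaR part is not shown:
library(rugarch)
library(copula)
# Univariate GARCH(1,1) with an AR(1) mean for each margin (assumed spec)
spec <- ugarchspec(variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
                   mean.model = list(armaOrder = c(1, 0)))
fit1 <- ugarchfit(spec, ret1)
fit2 <- ugarchfit(spec, ret2)
# Standardized residuals from each fit
z <- cbind(as.numeric(residuals(fit1, standardize = TRUE)),
           as.numeric(residuals(fit2, standardize = TRUE)))
# Transform to (0,1) margins; replace pobs() with your kernel/EVT CDF
u <- pobs(z)
# Fit a copula to the transformed residuals
cfit <- fitCopula(normalCopula(dim = 2), u, method = "mpl")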
I have been using lppm (point pattern on a linear network) in spatstat with a bunch of covariates, fitting a log-linear model, but I couldn't see how to check for over-fitting. Is there a quick way to do it?
It depends on what you want.
What tool would you use to check overfitting in (say) a linear model?
To identify whether individual observations may have been over-fitted, you could use influence.lppm (from the spatstat.linnet package).
To identify collinearity in the covariates, currently we do not provide a dedicated function in spatstat, but you could use the following trick. If fit is your fitted model of class lppm, first extract the corresponding GLM using
g <- getglmfit(as.ppm(fit))
Next install the package faraway and use the vif function to calculate the variance inflation factors
library(faraway)
vif(g)
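Putting it together in a minimal, hypothetical example (using the built-in simplenet network and the coordinates as covariates):
library(spatstat)
library(faraway)
X <- rpoislpp(20, simplenet)      # simulated point pattern on a linear network
fit <- lppm(X ~ x + y)            # log-linear model in the covariates x and y
g <- getglmfit(as.ppm(fit))       # extract the underlying GLM
vif(g)                            # variance inflation factors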
I would like to perform a nonlinear transformation of a regression coefficient, for example a simple function of a single coefficient, or a ratio of coefficients.
Stata has a convenient implementation of this with nlcom, which employs the delta method to estimate standard errors and corresponding confidence intervals. I understand that a simple transformation like those can be done by directly addressing the coefficient of interest from the model. However, if we are interested in the ratio of several linear and nonlinear combinations, what would be an efficient method to produce confidence bounds on such a transformation, particularly when the coefficients have a full covariance matrix with standard errors estimated along with them?
To answer my own question, I discovered the msm package, which accommodates my request nicely with the function deltamethod(). UCLA has a really nice write-up of this method, so I am providing the link for anyone who might have a similar need.
Using the delta method for nonlinear transformations of regression coefficients.
The deltaMethod() function from package car also accomplishes the same, providing as its output the estimate, its standard error and 95% confidence interval.
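A minimal sketch of both, assuming a toy linear model m and the ratio of its two slope coefficients:
library(msm)   # deltamethod()
library(car)   # deltaMethod()
set.seed(1)
d <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
d$y <- 1 + 2 * d$x1 + 0.5 * d$x2 + rnorm(100)
m <- lm(y ~ x1 + x2, data = d)
# msm: x1, x2, x3 in the formula refer to the 1st, 2nd, 3rd element of coef(m)
deltamethod(~ x2 / x3, coef(m), vcov(m))   # SE of b_x1 / b_x2
# car: estimate, SE and 95% CI in one call, using the coefficient names
deltaMethod(m, "x1 / x2")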
I have a time series mydata.ts of around 200 observations. I used stationarity tests, took differences and examined the ACF and PACF, and decided to try, for instance, ARIMA(1,1,1)(0,1,1).
Which R function should I use to find fitted values and forecasts? Arima, arima or auto.arima?
And can I trust the MAPE, MAD and other error measures from summary(model)? I read an answer saying those results are not the real errors but approximations, or something along those lines.
auto.arima will search for the whole model specification that is 'best' according to AIC or BIC.
If you already know the order, (1,1,1) or (0,1,1), then use Arima from the forecast package (the same as arima, but a little more general).
Arima(your_data, order=c(1,1,1)) will give the basic answer.
The actual out-of-sample forecasts can then be produced with the forecast function; see the documentation for forecast.
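For instance, a minimal sketch assuming mydata.ts is your series and the (1,1,1)(0,1,1) specification from the question:
library(forecast)
fit <- Arima(mydata.ts, order = c(1, 1, 1), seasonal = c(0, 1, 1))
summary(fit)                 # in-sample accuracy measures (MAPE, MAE, ...)
fitted(fit)                  # in-sample fitted values
fc <- forecast(fit, h = 12)  # out-of-sample forecasts, 12 steps ahead
plot(fc)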
If I have some (x,y) data, I can easily draw a straight line through it, e.g.
f=glm(y~x)
plot(x,y)
lines(x,f$fitted.values)
But for curvy data I want a curvy line. It seems loess() can be used:
f=loess(y~x)
plot(x,y)
lines(x,f$fitted)
This question has evolved as I've typed and researched it. I started off wanting a simple function to fit curvy data (where I know nothing about the data), and wanting to understand how to use nls() or optim() to do that. That was what everyone seemed to be suggesting in similar questions I found. But now that I've stumbled upon loess(), I'm happy. So, now my question is: why would someone choose to use nls or optim instead of loess (or smooth.spline)? Using the toolbox analogy, is nls a screwdriver and loess a power screwdriver (meaning I'd almost always choose the latter as it does the same thing but with less of my effort)? Or is nls a flat-head screwdriver and loess a cross-head screwdriver (meaning loess is a better fit for some problems, but for others it simply won't do the job)?
For reference, here is the play data I was using that loess gives satisfactory results for:
x=1:40
y=(sin(x/5)*3)+runif(x)
And:
x=1:40
y=exp(jitter(x,factor=30)^0.5)
Sadly, it does less well on this:
x=1:400
y=(sin(x/20)*3)+runif(x)
Can nls(), or any other function or library, cope with both this and the previous exp example, without being given a hint (i.e. without being told it is a sine wave)?
UPDATE: Some useful pages on the same theme on stackoverflow:
Goodness of fit functions in R
How to fit a smooth curve to my data in R?
smooth.spline "out of the box" gives good results on my 1st and 3rd examples, but terrible (it just joins the dots) on the 2nd example. However f=smooth.spline(x,y,spar=0.5) is good on all three.
UPDATE #2: gam() (from the mgcv package) is great so far: it gives a similar result to loess() when that was better, and a similar result to smooth.spline() when that was better. And all without hints or extra parameters. The docs were so far over my head I felt like I was squinting at a plane flying overhead, but a bit of trial and error found:
#f=gam(y~x) #Works just like glm(). I.e. pointless
f=gam(y~s(x)) #This is what you want
plot(x,y)
lines(x,f$fitted)
Nonlinear least squares is a means of fitting a model that is non-linear in the parameters. By fitting a model, I mean there is some a priori specified form for the relationship between the response and the covariates, with some unknown parameters that are to be estimated. As the model is non-linear in these parameters, NLS is a means to estimate values for those coefficients by minimising a least-squares criterion in an iterative fashion.
LOESS was developed as a means of smoothing scatterplots. It has a much less well-defined concept of a "model" that is fitted (IIRC there is no "model"). LOESS works by trying to identify the pattern in the relationship between response and covariates without the user having to specify what form that relationship takes; LOESS works out the relationship from the data themselves.
These are two fundamentally different ideas. If you know the data should follow a particular model then you should fit that model using NLS. You could always compare the two fits (NLS vs LOESS) to see if there is systematic variation from the presumed model etc - but that would show up in the NLS residuals.
Instead of LOESS, you might consider Generalized Additive Models (GAMs) fitted via gam() in recommended package mgcv. These models can be viewed as a penalised regression problem but allow for the fitted smooth functions to be estimated from the data like they are in LOESS. GAM extends GLM to allow smooth, arbitrary functions of covariates.
loess() is non-parametric, meaning you don't get a set of coefficients you can use later - it's not a model, just a fitted line. nls() will give you coefficients you could use to build an equation and predict values with a different but similar data set - you can create a model with nls().
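A minimal sketch of that difference, using the first example from the question (the model form and starting values have to be supplied to nls()):
set.seed(1)
x <- 1:40
y <- sin(x/5)*3 + runif(x)
f <- nls(y ~ a*sin(x/b) + k, start=list(a=3, b=5, k=0.5))
coef(f)                                  # reusable coefficients a, b, k
newx <- seq(1, 80, by=0.5)
pred <- predict(f, newdata=list(x=newx)) # predictions beyond the observed range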
What functions do you use in R to fit a curve to your data and test how well that curve fits? What results are considered good?
Just the first part of that question can fill entire books. Just some quick choices:
lm() for standard linear models
glm() for generalised linear models (eg for logistic regression)
rlm() from package MASS for robust linear models
lmrob() from package robustbase for robust linear models
loess() for non-linear / non-parametric models
Then there are domain-specific models, e.g. for time series, micro-econometrics, mixed effects and much more. Several of the CRAN Task Views, e.g. Econometrics, discuss this in more detail. As for goodness of fit, that is also something one could easily spend an entire book discussing.
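To make the list above concrete, a quick sketch of the calls on a hypothetical data frame d:
library(MASS)        # rlm()
library(robustbase)  # lmrob()
set.seed(1)
d <- data.frame(x = rnorm(100))
d$y <- 1 + 2 * d$x + rnorm(100)
d$z <- rbinom(100, 1, plogis(d$x))
lm(y ~ x, data = d)                       # standard linear model
glm(z ~ x, family = binomial, data = d)   # logistic regression
rlm(y ~ x, data = d)                      # robust linear model (MASS)
lmrob(y ~ x, data = d)                    # robust linear model (robustbase)
loess(y ~ x, data = d)                    # local (non-parametric) regression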
The workhorses of canonical curve fitting in R are lm(), glm() and nls(). To me, goodness-of-fit is a subproblem in the larger problem of model selection. In fact, using goodness-of-fit incorrectly (e.g., via stepwise regression) can give rise to a seriously misspecified model (see Harrell's book on "Regression Modeling Strategies"). Rather than discussing the issue from scratch, I recommend Harrell's book for lm and glm. Venables and Ripley's bible is terse, but still worth reading. "Extending the Linear Model with R" by Faraway is comprehensive and readable. nls is not covered in these sources, but "Nonlinear Regression with R" by Ritz & Streibig fills the gap and is very hands-on.
The nls() function (http://sekhon.berkeley.edu/stats/html/nls.html) is pretty standard for nonlinear least-squares curve fitting. Chi squared (the sum of the squared residuals) is the metric that is optimized in that case, but it is not normalized so you can't readily use it to determine how good the fit is. The main thing you should ensure is that your residuals are normally distributed. Unfortunately I'm not sure of an automated way to do that.
The Quick-R site has a reasonably good summary of basic functions used for fitting models and testing the fits, along with sample R code:
http://www.statmethods.net/stats/regression.html
"The main thing you should ensure is that your residuals are normally distributed. Unfortunately I'm not sure of an automated way to do that."
qqnorm() could probably be modified to find the correlation between the sample quantiles and the theoretical quantiles. Essentially, this would just be a numerical interpretation of the normal quantile plot. Perhaps providing several values of the correlation coefficient for different ranges of quantiles could be useful. For example, if the correlation coefficient is close to 1 for the middle 97% of the data and much lower at the tails, this tells us the distribution of residuals is approximately normal, with some funniness going on in the tails.
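A rough sketch of that idea, using a hypothetical helper and the built-in cars data:
qq_cor <- function(r) {
  q <- qqnorm(r, plot.it = FALSE)
  cor(q$x, q$y)   # correlation between theoretical and sample quantiles
}
m <- lm(dist ~ speed, data = cars)
qq_cor(residuals(m))
shapiro.test(residuals(m))   # a formal normality test, for comparison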
Best to keep it simple, and see if linear methods work well enough. You can generally judge your goodness of fit by looking at the R-squared and the F statistic together, never separately. Adding variables to your model that have no bearing on your dependent variable can increase R-squared, so you must also consider the F statistic.
You should also compare your model to other nested, simpler models. Do this using a log-likelihood ratio test, so long as the dependent variable is the same.
The Jarque–Bera test is good for testing the normality of the residual distribution.
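A hypothetical sketch of those two checks, with a simpler model m1 nested inside m2:
library(lmtest)    # lrtest()
library(tseries)   # jarque.bera.test()
m1 <- lm(dist ~ speed, data = cars)
m2 <- lm(dist ~ speed + I(speed^2), data = cars)
summary(m2)                      # R-squared and F statistic, read together
lrtest(m1, m2)                   # likelihood ratio test of the nested models
jarque.bera.test(residuals(m2))  # normality of the residual distribution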