what tests to use to validate a dynamic factor model obtained using nowcasting package in r - r

I used the nowcast function from R package to use dynamic factor model to nowcast GDP using the extracted factors. I Have tried multiple combination of the initial variables and finally obtained this model which all variables in it seems significant and teh ales obtained for my variable of interest is acceptable.
enter image description here
But I can't find any reference about what tests on residuals that I need to do in order to validate this model.
I am really struggling and have been stuck in this for a month, I need to submit my graduation project this weekend and I really need this model to work. so any help will be very much appreciated. Thank you.
Update 1:
This is teh acf plot n residuals suggested by the same package nowcasting, I think my model passes that test and therefore I can use it. right?
enter image description here

Related

How can I get My.stepwise.glm to return the model outside the console?

I asked this question on RCommunity but haven't had anyone bite... so I'm here!
My current project involves me predicting whether some trees will survive given future climate change scenarios. Against better judgement (like using Maxent) I've decided to pursue this with a GLM, which requires presence and absence data. Everytime I generate my absence data (as I was only given presence data) using randomPoints from dismo, the resulting GLM model has different significant variables. I found a package called My.stepwise that has a My.stepwise.glm function (here: My.stepwise.glm: Stepwise Variable Selection Procedure for Generalized Linear... in My.stepwise: Stepwise Variable Selection Procedures for Regression Analysis) , and this goes through a forward/backward selection process to find the best variables and returns a model ready for you.
My problem is that I don't want to run My.stepwise.glm just once and use the model it spits out for me. I'd like to run it roughly 100 times with different pseudo-absence data and see which variables it returns, then take the most frequent variables and move forward with building my model using those. The issue is that the My.stepwise.glm function ends by 'print(summary(initial.model))' and I would like to be able to access the output similar to how step() returns a list, where you can then say 'step$coefficients' and have the function coefficients return as numerics. Can anyone help me with this?

Holt-Winters in r with function hw. Question on value of beta parameter and forecasting phase

I have used the function hw to analyze a time series
fcast<-hw(my_series,h=12,level=95,seasonal="multiplicative",damped=TRUE,lambda=NULL)
By looking at the fcast$model$par I observe the values of alpha, beta, gamma, phi, and the initial states.
I've also looked at the contents of fcast$model$states to see the evolution of all the values. I've tried to reproduce the results in Excel in order to understand the whole procedure.
To achieve the same values of b (trend) as in fcast$model$states I observe that I have to use a formula like the one in the bibliography about the Holt-Winters method:
b(t)=beta2*(l(t)-l(t-1)+(1-beta2)*phi*b(t-1)
But, if in fcast$model$par beta=0.08128968, I find that in order to achieve the same results I have to use beta2=0.50593541.
What's the reason for that? I don't see any relationship between beta and beta2.
I have also found that in order to get the same forecast as the one obtained with the hw function I have to use the following formulas once the data are finished:
l(t)=l(t-1)+b(t-1)
b(t)=phi*b(t-1)
^y(t)=(l(t-1)+b(t-1))*s(t-m)
I haven't found any bibliography on this forecasting phase, explaining that some parameters are no longer used. For instance, in this case phi is still used for b(t), but not used anymore for l(t).
Can anyone refer to any bibliography where I can find this part explained?
So in the end I've been able to reproduce the whole set of data in Excel, but there's a couple of steps I would like to understand better.
Thank you!!

How to run a dynamic linear regression in R?

I am new to using R as I usually use Stata. I want to estimate a state space model on some time series data with time varying coefficients. From what I have gathered this is not possible to do in Stata.
I have downloaded the dlm package in R and I am trying to run the dlmModReg command to regress my dependent variable on a single explanatory variable. I would like to allow the intercept and beta coefficient to vary over time.
If anyone could show me an example of the code I want to run I think that would be enough for me to work out how to do this. The examples I have found online are vague or use terminology that I am not familiar with as a new R user. Any help or comments are greatly appreciated.

Clustered robust standard errors on country-year pairs

I want to replicate a Stata do.file (panel model) in R, but unfortunately I'm ending up with the wrong standard error estimates. The data is proprietary, so I can't post it here. The Stata code used looks like:
xtreg Y X, vce(cluster countrycodeid) fe nonest dfadj
With fe for fixed effects, nonest indicating that the panels are not nested within the clusters, and dfadj for the fact that some sort of DF-adjustment takes place - not possible to find out which sort as of now.
My R-Code looks like this and makes me end up with the right coefficient values:
model <- plm(Y~X+as.factor(year),data=panel,model="within",index=c("codeid","year"))
Now comes the difficult part, which I haven't found a solution for so far, even after trying out numerous sorts of standard error robust estimation methods, for example making extensive use of lmtest and various degrees of freedom transformation methods. The standard errors are supposed to follow a country-year pair pattern (captured by the variable countrycodeid in the Stata code, which takes the form codeid-year, as there appears to be missing data for some variables which are not available on a monthly basis.
Does anyone know if there are special tricks to keep in mind when working with unbalanced panels and the plm() package, which sort of DF-adjustment can be used, and if there is a possibility to group data in the coeftest() function on a country-year basis?
This is not a complete answer.
Stata uses a finite sample correction described in this post. I think that may get your standard errors a tad closer.
Moreover, you can learn more about the nonest/dfadj by issuing the help whatsnew9. Stata used to adjust the VCE for the within transformation when the cluster() option was specified. The cluster-robust VCE no longer adjusts unless the dfadj is specified. You may need to use the version control to replicate old estimates.

Arima.sim issues in R

I am working on making a prediction in R using time-series models.
I used the auto.arima function to find a model for my dataset (which is a ts object).
fit<-auto.arima(data)
I can then plot the results of the prediction for the 20 following dates using the forecast function:
plot(forecast(fit,h=20))
However I would like to add external variables and I cannot do it using forecast because it is kind of a black box to me as I am new to R.
So I tried to mimic it by using the arima.sim function and a problem arose:
HOW TO INITIALIZE THIS FUNCTION ?
I got the model by setting model=as.list(coef(fit)) but the other parameters are still obscure to me.
I went through hundreds of page including in stackoverflow but nobody seems to really know what is going on.
How is it calculated ? Like why does n.start (the burn-in period) must have ma+ar length and not only a max(ar,ma) length ? What is exactly start.innov?
I thought I understood when there is only an AR part but I cannot reproduce the results with an AR+MA filter.
My understanding as for the AR is concerned is that start.innov represent the errors between a filtered zero-signal and the true signal, is it true ?
Like if you want to have an ar of order 2 with initial conditions (a1,a2) you need to set
start.innov[1]=a1-ar1*0-ar2*0=a1
start.innov[2]=a2-ar1*start.innov[1]
and innov to rep(0,20) but what to do when facing an arima function how do you set the innov to get exactly the same curbs as forecast does ?
thanks for your help !!!
You seem to be confused between modelling and simulation. You are also wrong about auto.arima().
auto.arima() does allow exogenous variables via the xreg argument. Read the help file. You can include the exogenous variables for future periods using forecast.Arima(). Again, read the help file.
It is not clear at all why you are referring to arima.sim() here. It is for simulating ARIMA processes, not for modelling or forecasting.

Resources