There is something I don't know about this plot

I am looking at this code. Previously, v-transformations were done and VT-ARMA copula models were fitted; now it applies the Shapiro test to the residuals and wants to produce four plots:
https://i.stack.imgur.com/gTtBU.png
These four plots should come from plot(vtcop, plotoption=3) and so on. I have never used this plotoption argument. I think it is supposed to come from the tscopula package, but I have already done the necessary research in the help and read the PDF that documents tscopula, and there is no such "plotoption".
Can anyone tell me why it reports an unused argument at this point?
This code comes from the paper by Alexander McNeil, "Modelling Volatile Time Series with V-Transforms and Copulas".
Thank you very much. Good day.


Holt-Winters in R with the hw function: question on the value of the beta parameter and the forecasting phase

I have used the function hw to analyze a time series
fcast <- hw(my_series, h = 12, level = 95, seasonal = "multiplicative", damped = TRUE, lambda = NULL)
By looking at the fcast$model$par I observe the values of alpha, beta, gamma, phi, and the initial states.
I've also looked at the contents of fcast$model$states to see the evolution of all the values. I've tried to reproduce the results in Excel in order to understand the whole procedure.
To achieve the same values of b (trend) as in fcast$model$states I observe that I have to use a formula like the one in the bibliography about the Holt-Winters method:
b(t) = beta2*(l(t) - l(t-1)) + (1 - beta2)*phi*b(t-1)
But, if in fcast$model$par beta=0.08128968, I find that in order to achieve the same results I have to use beta2=0.50593541.
What's the reason for that? I don't see any relationship between beta and beta2.
I have also found that in order to get the same forecast as the one obtained with the hw function I have to use the following formulas once the data are finished:
l(t)=l(t-1)+b(t-1)
b(t)=phi*b(t-1)
ŷ(t) = (l(t-1) + b(t-1)) * s(t-m)
I haven't found any references for this forecasting phase explaining that some parameters are no longer used. For instance, in this case phi is still used for b(t), but no longer for l(t).
Can anyone refer to any bibliography where I can find this part explained?
So in the end I've been able to reproduce the whole set of data in Excel, but there's a couple of steps I would like to understand better.
Thank you!!
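For reference, the recursions described in the question can be sketched in R. This is only an illustration of the formulas as written above; the function names and the sample parameter values are made up, not the values fitted by hw():

```r
# In-sample damped-trend update, as written in the question:
# b(t) = beta2*(l(t) - l(t-1)) + (1 - beta2)*phi*b(t-1)
update_trend <- function(l_t, l_prev, b_prev, beta2, phi) {
  beta2 * (l_t - l_prev) + (1 - beta2) * phi * b_prev
}

# Beyond the end of the data, the question's forecasting recursions:
# l(t) = l(t-1) + b(t-1);  b(t) = phi*b(t-1);  yhat(t) = (l(t-1) + b(t-1))*s(t-m)
forecast_step <- function(l_prev, b_prev, s_lag, phi) {
  l_t  <- l_prev + b_prev
  b_t  <- phi * b_prev
  yhat <- (l_prev + b_prev) * s_lag
  list(l = l_t, b = b_t, yhat = yhat)
}

# Illustrative values only (not fitted values):
step <- forecast_step(l_prev = 100, b_prev = 2, s_lag = 1.1, phi = 0.9)
```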

R - replicate weight survey

Currently I'm interested in learning how to obtain information from the American Community Survey PUMS files. I have read some of the ACS documentation and found that to use the replicate weights I must apply the following formula:
Thanks to Google I also found that there is the survey package and the svrepdesign() function to help me get this done:
https://www.rdocumentation.org/packages/survey/versions/3.33-2/topics/svrepdesign
Now, even though I'm getting into R and learning statistics and have a SQL background, there are two BIG problems:
1 - I have no idea what that formula means and I would really like to understand it before going any further
2 - I don't understand how the svrepdesign() function works, nor how to use it.
I'm not looking for someone to solve my life/problems, but I would really appreciate if someone points me in the right direction and gives a jump start.
Thank you for your time.
When you use svrepdesign(), you are specifying that the design has replicate weights, and it uses the formula you provided to calculate the standard errors.
The American Community Survey has 80 replicate weights, so it first calculates the statistic you are interested in with the full-sample weights (X), then calculates the same statistic with each of the 80 replicate weights (X_r).
You should read this: https://usa.ipums.org/usa/repwt.shtml
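For ACS PUMS, the formula in question is the successive difference replication estimator, SE(X) = sqrt((4/80) * Σ_r (X_r − X)²). A minimal sketch in R of computing it by hand (the weight column names follow the person-level PUMS convention, PWGTP and PWGTP1..PWGTP80; the exact svrepdesign() call shown in the comment is the commonly used recipe, so check it against the current documentation):

```r
# Standard error via the ACS successive difference replication formula:
# X  is the estimate under the full-sample weight (PWGTP),
# Xr is the same estimate recomputed under each of the 80 replicate weights.
sdr_se <- function(X, Xr) {
  sqrt((4 / 80) * sum((Xr - X)^2))
}

# The survey package wraps the same computation, roughly:
# des <- svrepdesign(variables = dat, weights = ~PWGTP,
#                    repweights = "PWGTP[0-9]+", type = "other",
#                    scale = 4/80, rscales = rep(1, 80),
#                    combined.weights = TRUE)
# svymean(~some_variable, des)
```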

Function to plot model with one variable varying and others constant

It's simple, but I can't remember what this procedure is called, so I was not able to find the function for it. I want to explore the effects and gradients of a simple lm() model by plotting the response to one variable at a time, with the others held constant.
Can anybody tell me which function to use? I seem to remember a function that generates several plots, or something like that. It could be something akin to sensitivity analysis... Sorry for the beginner question.
Thank you in advance!
The car package has a lot of utilities for analyzing regression models. This sounds like a component+residual plot (or partial residual plot).
library(car)  # for crPlots(...)
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
crPlots(fit)
As noted in the comments, termplot(...) does basically the same thing.
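For completeness, the base-R route needs no extra packages; partial.resid and se are real termplot() arguments:

```r
# Base R: one plot per model term, the others held fixed,
# with partial residuals and standard-error bands overlaid.
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)
termplot(fit, partial.resid = TRUE, se = TRUE)
```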

Arima.sim issues in R

I am working on making a prediction in R using time-series models.
I used the auto.arima function to find a model for my dataset (which is a ts object).
fit <- auto.arima(data)
I can then plot the results of the prediction for the 20 following dates using the forecast function:
plot(forecast(fit, h = 20))
However, I would like to add external variables, and I cannot do it using forecast because it is something of a black box to me, as I am new to R.
So I tried to mimic it by using the arima.sim function and a problem arose:
How should this function be initialized?
I got the model by setting model=as.list(coef(fit)) but the other parameters are still obscure to me.
I went through hundreds of pages, including on Stack Overflow, but nobody seems to really know what is going on.
How is it calculated? For example, why must n.start (the burn-in period) have length ar + ma rather than just max(ar, ma)? What exactly is start.innov?
I thought I understood the case where there is only an AR part, but I cannot reproduce the results with an AR+MA filter.
My understanding, as far as the AR part is concerned, is that start.innov represents the errors between a filtered zero signal and the true signal. Is that right?
For instance, if you want an AR of order 2 with initial conditions (a1, a2), you need to set
start.innov[1] = a1 - ar1*0 - ar2*0 = a1
start.innov[2] = a2 - ar1*start.innov[1]
and innov to rep(0, 20). But what do you do when facing a full ARIMA model? How do you set innov to get exactly the same curves as forecast does?
Thanks for your help!
You seem to be confusing modelling with simulation. You are also wrong about auto.arima().
auto.arima() does allow exogenous variables via the xreg argument; read the help file. You can supply the exogenous variables for future periods using forecast.Arima(). Again, read the help file.
It is not clear why you are referring to arima.sim() here: it is for simulating ARIMA processes, not for modelling or forecasting.
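A minimal sketch of the xreg route (the regressor names here are made up for illustration; xreg must be a numeric matrix with one row per observation, and the future matrix must cover the forecast horizon):

```r
library(forecast)

# Hypothetical external regressor aligned with the series `data`,
# plus its assumed values for the next 20 periods.
X    <- matrix(my_regressor,        ncol = 1)  # historical values
Xnew <- matrix(my_regressor_future, ncol = 1)  # 20 future values

fit <- auto.arima(data, xreg = X)
plot(forecast(fit, xreg = Xnew, h = 20))
```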

Phylocom non-ultrametric tree vs ultrametric tree

Once again, I need help.
I am currently getting into the PHYLOCOM software for inferring characteristics of a phylogeny from different samples. This software lets you test whether your species show clustering or overdispersion relative to other populations in your analyses.
As input files you need a phylogenetic tree in NEWICK format and a sample file (.txt).
I have done two tests, one modifying my tree in R with 'ape' package this way:
compute.brlen(tree, main=expression(rho==10))
And the other one by this other 'ape' option:
tree$edge.length = tree$edge.length * 10
The first modification generates an output with an ultrametric tree while the second output is a non-ultrametric tree. If then I run PHYLOCOM itself by
phylocom comstruct
I get different results, not only in the values of the parameters, but also in the signification p-values.
My question is whether anyone knows how I should run the PHYLOCOM 'comstruct' analyses correctly, with an ultrametric or a non-ultrametric input, and also what the differences are between running it one way or the other.
I know this is not a 'classical' Stack Overflow question, but maybe someone who works with phylogenies can help me.
Thanks a lot.
I think I may be able to help, but unfortunately I cannot add comments to ask for more information, so I will have to infer what you mean from what is given. I apologize if it doesn't help.
Firstly, you may want to consult the help for compute.brlen(), as there is no "main" argument in this function. I think you have taken it from the example in the help file, but note that there it is passed to the plot() function, outside compute.brlen(); it just gives your plot a title.
To change the rho value in compute.brlen() you need to change the power argument.
For example:
compute.brlen(tree, power = 10)
This may be why you are getting different results for the two trees: no transformation is actually being performed on your compute.brlen() tree.
I am not familiar with PHYLOCOM, so I can't help on that front. But ultrametric and non-ultrametric trees imply different relationships between the tips, so I would not be surprised that they give different results. I should note that I am not very confident about the differences between analyses of ultrametric and non-ultrametric trees, but from looking at the plotted differences I would assume this is true.
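A quick way to see the difference with ape (assuming tree is an object of class phylo; is.ultrametric() is the relevant check):

```r
library(ape)

# Grafen branch lengths with the power argument: produces an ultrametric tree.
t1 <- compute.brlen(tree, power = 10)

# Pure rescaling of the existing edge lengths: preserves the tree's shape,
# so a non-ultrametric input stays non-ultrametric.
t2 <- tree
t2$edge.length <- t2$edge.length * 10

is.ultrametric(t1)
is.ultrametric(t2)
```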
