What is the best way to store ancillary data with a 2D time series object in R?

I am currently trying to move from Matlab to R.
I have 2D measurements, consisting of irradiance over time and wavelength, together with quality flags and uncertainty/error estimates.
In Matlab I extended the timeseries object to store both the wavelength array and the auxiliary data.
What is the best way to store this data in R?
Ideally I would like the data to be stored together, such that e.g. window(...) keeps everything synchronized.
So far I have looked at the different time series classes like ts, zoo, etc., and at some spatio-temporal classes. However, none of them let me attach auxiliary data to observations, nor do they give me a secondary axis.

Not totally sure what you want, but here is a simple tutorial covering
R's "ts" and "zoo" time series classes:
http://faculty.washington.edu/ezivot/econ424/Working%20with%20Time%20Series%20Data%20in%20R.pdf
and here is a more comprehensive outline of many more classes (see the Time Series Classes section):
http://cran.r-project.org/web/views/TimeSeries.html
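
Not part of the original answer, but here is a minimal sketch of one way to keep irradiance, uncertainty, and quality flags synchronized, assuming they all share the same time index: store each as a zoo matrix with one column per wavelength, bundle them in a plain list together with the wavelength vector, and wrap window() so it is applied to all of them at once. The names spectral_ts and window_spectral are made up for illustration.

library(zoo)

# assumed example dimensions: 100 time steps, 5 wavelengths
times       <- seq(as.POSIXct("2020-01-01"), by = "hour", length.out = 100)
wavelengths <- c(350, 400, 450, 500, 550)   # nm, the secondary axis

irradiance  <- zoo(matrix(rnorm(100 * 5), 100, 5), order.by = times)
uncertainty <- zoo(matrix(runif(100 * 5), 100, 5), order.by = times)
flags       <- zoo(matrix(0L, 100, 5), order.by = times)

# bundle everything; the wavelength vector acts as the second axis
spectral_ts <- list(irradiance  = irradiance,
                    uncertainty = uncertainty,
                    flags       = flags,
                    wavelength  = wavelengths)

# helper that windows all time series components at once
window_spectral <- function(x, ...) {
  x[c("irradiance", "uncertainty", "flags")] <-
    lapply(x[c("irradiance", "uncertainty", "flags")], window, ...)
  x
}

sub <- window_spectral(spectral_ts, start = times[10], end = times[20])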

Related

How to handle a large collection of time series in R?

I have data that represents about 50,000 different 2-year monthly time series. What would be the most convenient and tidyverse-ish way to store that in R? I'll be using R to review each series, trying to extract characteristic features of their shapes.
Somehow a data frame with 50,000 rows and 24 columns (plus a few more for metadata) seems awkward, because the time axis is in the columns. But what else should I be using? A list of xts objects? A data frame with 50,000 x 24 rows? A three-dimensional matrix? I'm not really seeing anything obviously convenient, and my friend Google hasn't found any great examples for me either. I imagine this means I'm overlooking the obvious solution, so maybe someone can suggest it. Any help?
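
Not from the original thread, but one tidyverse-style option is a long-format table keyed by series id, which is what the tsibble package expects; below is a sketch with made-up data (3 toy series instead of 50,000), where the column names series, month, and value are arbitrary.

library(tidyr)
library(tsibble)

# toy data: 3 series, 24 monthly observations each
long <- expand_grid(series = paste0("s", 1:3),
                    month  = yearmonth("2020 Jan") + 0:23)
long$value <- rnorm(nrow(long))

# one row per series-month; the key identifies the series, the index is time
ts_tbl <- as_tsibble(long, key = series, index = month)

# feature extraction can then be done per key, e.g. with feasts::features()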

Decomposition of additive time series

Hey, I am sorry if this question might be too easy for this forum! My task was to decompose a time series, break it into its different components, and then plot it in R.
However, the correct solution included the part that I uploaded as a picture (not reproduced here). We are trying to identify the monthly data, right? But why are we using the cbind and the data.frame functions? Thank you very much in advance.
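
The picture from the question is not reproduced here, but below is a sketch of the kind of code it appears to describe: decompose() splits a monthly series into components, and cbind()/data.frame() are just a convenient way to collect those components into one table for inspection or plotting. The built-in co2 series is used as a stand-in for the actual data.

# co2 is a built-in monthly series (frequency = 12)
dec <- decompose(co2)                 # additive decomposition

# cbind() lines the components up column-wise as one multivariate ts;
# data.frame() then turns that into a plain table for further use
components <- data.frame(cbind(observed = dec$x,
                               trend    = dec$trend,
                               seasonal = dec$seasonal,
                               random   = dec$random))

plot(dec)          # base-R plot of all four components
head(components)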

Interpolate a high frequency time series

I have a physical time series spanning 2 years of sample data with a frequency of 30 minutes, but there are multiple wide intervals of missing data, as shown in the (omitted) plot.
I tried the function na.interp from the forecast package, with poor results (shown above):
sapply(dataframeTS[2:10], na.interp)
I'm looking for a more useful method.
UPDATE:
Here is more info about the pattern I want to capture, specifically the raw data. This subsample belongs to May.
You might want to try the **imputeTS** package. It's an R package dedicated to time series missing value imputation.
The na_seadec(), na_seasplit(), and na_kalman() methods might be interesting here.
There are many more algorithm options - you can find a list in the paper about the package.
In this specific case I would try:
na_seasplit(yourData)
or
na_kalman(yourData)
or
na_seadec(yourData)
Be aware that you may need to supply the seasonality information correctly with the time series (you have to create a ts object and set the frequency parameter).
It still might not work out at all; you will have to try.
(If you can provide the data, I'll also give it a try.)
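
A small sketch of how that might look for 30-minute data (not from the original answer): the frequency of 48 observations per day is an assumption about the seasonality to capture, and the choice of column from dataframeTS is arbitrary.

library(imputeTS)

# assumed: the second column of dataframeTS holds the values with NAs,
# sampled every 30 minutes, so daily seasonality means frequency = 48
x <- ts(dataframeTS[[2]], frequency = 48)

statsNA(x)                  # overview of the NA pattern

filled <- na_seadec(x)      # seasonal decomposition + interpolation
# alternatives: na_seasplit(x), na_kalman(x)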

R gstat spatio-temporal variogram kriging

I am trying to use the function variogramST from the R package gstat to calculate a spatio-temporal variogram.
There are 12 years of data with 20,000 data points at irregular locations in space and time (no full or partial grid). I have to use the STIDF class from the spacetime package for an irregular data set. I would like a temporal semivariogram with reference points at 0, 90, 180, 270 days, up to some years, etc. Unfortunately both computational and memory problems occur. When the command
samplevariogram<-variogramST(formula=formula_gstat,data=STIDF1)
is run without further arguments, the semivariogram takes into account only very short time periods in terms of reference points, which does not seem to capture the inherent data structure appropriately.
There are more arguments for this function at the user's disposal, but I am not sure how to parametrize them correctly: tlag, tunit, twindow. Specifically, I am wondering how they interact and how I achieve my goal as described above. So I tried the following code
samplevariogram<-variogramST(formula=formula_gstat,data=STIDF1,tlag= ...., tunit=... , twindow= ...)
The following code is not working due to memory issues on my computer with 32 GB of RAM:
samplevariogram<-variogramST(formula=formula_gstat,data=STIDF1,tlag=90*(0:20), tunit="days")
and might perhaps be flawed otherwise. Furthermore, the latter line of code also seems infeasible in terms of computation time.
Does someone know how to specify the variogramST function from the gstat package correctly, aiming at the desired time intervals?
Thanks
If I understand correctly, the twindow argument should be the number of observations to include when calculating the space-time variogram. Assuming your 20,000 points are distributed more or less evenly over the 12 years, you have about 1,600 points per year. Again, assuming I understand things correctly, if you wanted to include about two years of data in the temporal autocorrelation calculations, you would do:
samplevariogram<-variogramST(formula=formula_gstat,data=STIDF1,tlag=90*(0:20), tunit="days",twindow=2*1600)

How to interpret the values in auto arima plot and store it in a dataframe

I want to apply forecasting to my data. I used the auto.arima method and got a plot.
The following is my code:
fit <- auto.arima(a)
LH.pred <- forecast(fit,h=30)
plot(LH.pred)
I want to interpret the graph as values and store them in a data frame, so that I can make calculations based on the forecast.
Can anybody let me know how to take the values from the graph and store them in a data frame?
Also, when I used the auto.arima method, the dates got converted to a day count from 1970-01-01. I want to convert them back to normal dates. Can anybody please help with that too?
Thanks,
Observer
Taking the values from the graph is not really necessary. The graph consists of two parts. The first one is the time series 'a' used to build 'fit'; it is still stored in 'fit' as 'fit$x'. The second part is the forecast. You can take it from 'LH.pred' using 'as.data.frame(LH.pred)'.
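
A sketch of both steps (not from the original answer); the assumption is that the series 'a' was built from daily data whose dates were stored as a numeric day count since 1970-01-01.

library(forecast)

fit     <- auto.arima(a)
LH.pred <- forecast(fit, h = 30)

# forecast values (point forecast plus prediction intervals) as a data frame
pred_df <- as.data.frame(LH.pred)

# the row names hold the numeric time index; as a day count since 1970-01-01
# it converts back to dates with:
pred_df$date <- as.Date(as.numeric(rownames(pred_df)), origin = "1970-01-01")

# the original series used for fitting is still available as fit$x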
