Extract time series from standardized precipitation evapotranspiration index (SPEI) in R

Probably a simple question, but I could not figure it out after several attempts. Given the example below on computing SPEI at multiple sites, my questions are:
1) How can one plot the time series for a single selected site, rather than all sites at once?
2) How can one extract the time series for all sites from a SPEI object and form a data frame? The intent is to apply principal component analysis to the time series.
library(SPEI)
# Computing several time series at a time
data(balance)
names(balance)
bal_spei12 <- spei(balance,12)
plot(bal_spei12)
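A minimal sketch of one way to approach both questions, assuming the object returned by spei() stores the index values in its `fitted` component (as described in the SPEI package documentation). The helper names `spei_to_df` and `finite_rows` are hypothetical and not part of the package; the site plotted is simply the first column.

```r
# Hypothetical helper: pull the fitted index out of a 'spei' object as a
# data frame (one column per site, one row per time step).
spei_to_df <- function(spei_obj) {
  as.data.frame(spei_obj$fitted)
}

# Hypothetical helper: keep only rows where every site has a finite value,
# because prcomp() cannot handle NA or Inf (the first months of a 12-month
# SPEI are NA by construction).
finite_rows <- function(df) {
  df[rowSums(!is.finite(as.matrix(df))) == 0, , drop = FALSE]
}

# The demonstration only runs if the SPEI package is installed.
if (requireNamespace("SPEI", quietly = TRUE)) {
  data(balance, package = "SPEI")
  bal_spei12 <- SPEI::spei(balance, 12)

  spei_df <- spei_to_df(bal_spei12)

  # 1) Plot a single site (here simply the first column)
  site <- names(spei_df)[1]
  plot(spei_df[[site]], type = "l",
       xlab = "Month index", ylab = "SPEI-12", main = site)
  abline(h = 0, lty = 2)

  # 2) Principal component analysis on the complete rows
  pca <- prcomp(finite_rows(spei_df), scale. = TRUE)
  print(summary(pca))
}
```

Replacing `names(spei_df)[1]` with any other column name selects a different site.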

Related

time series with multiple observations per unit of time

I have a dataset of the daily spreads of 500 stocks. My eventual goal is to make a model using extreme value theory. However as one of the first steps, I want to check my data for volatility clustering and leptokurticity. So I first want R to see my data as a time series and I want to plot my data. However, I only find examples of time series with only one observation per unit of time. Is there a possibility for R to treat my type of dataset as a time series? And what's the best way to plot it?
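As a rough illustration of the idea, a matrix with one row per day and one column per stock can be turned into a multivariate time series with ts(). The data below are simulated placeholders (5 stocks instead of 500), and `excess_kurtosis` is a hypothetical helper written for this sketch, not a library function.

```r
set.seed(123)
n_days   <- 500
n_stocks <- 5

# Simulated heavy-tailed "spreads": one row per day, one column per stock
spreads <- matrix(rt(n_days * n_stocks, df = 4) * 0.01, nrow = n_days,
                  dimnames = list(NULL, paste0("stock", seq_len(n_stocks))))

# A multivariate time series: R treats each column as its own series
spreads_ts <- ts(spreads)

# Overlay all series in one panel
matplot(spreads, type = "l", lty = 1, xlab = "Day", ylab = "Spread")

# Hypothetical helper: sample excess kurtosis (> 0 suggests fat tails)
excess_kurtosis <- function(x) {
  x <- x - mean(x)
  mean(x^4) / mean(x^2)^2 - 3
}
kurt <- apply(spreads, 2, excess_kurtosis)
print(kurt)

# Autocorrelation of squared values is a common check for volatility clustering
acf(spreads[, 1]^2, main = "ACF of squared spreads, stock1")
```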

Suggestions for clustering methods

I have two time series of meteorological measurements (i.e., X and Y). Both X and Y time series were constructed using daily measurements over a period of one year. By plotting X time series versus Y times series as a scatterplot and connecting all the points by date in ascending order, a closed loop is obtained representing the annual cycle. I have measurements at N locations and thus I have N loops (i.e., annual cycles) which I want to cluster to find those that have similar shapes.
With so many clustering methods, I am not sure which one would be most appropriate for this analysis (initially I was thinking of using self-organizing maps).
Thank you very much for any suggestions.
Unless you have too many time series, I suggest starting with hierarchical clustering. It's easy to interpret because of the dendrogram.
For similarity, a cyclic version of DTW (dynamic time warping) may work well, assuming that there is some delay between different locations.
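A minimal sketch of the hierarchical clustering suggestion, using simulated loops. Note that this does not implement cyclic DTW: as a simpler stand-in, the distance between two loops is taken as the smallest Euclidean distance over all circular shifts of the starting day. The function names `make_loop` and `shift_dist` are hypothetical.

```r
set.seed(1)
n_days <- 60
theta  <- seq(0, 2 * pi, length.out = n_days + 1)[-1]

# Hypothetical generator: a noisy ellipse (semi-axes a, b) whose cycle
# starts 'shift' days late, mimicking a delay between locations.
make_loop <- function(a, b, shift) {
  idx <- ((seq_len(n_days) - 1 + shift) %% n_days) + 1
  cbind(x = a * cos(theta[idx]), y = b * sin(theta[idx])) +
    rnorm(2 * n_days, sd = 0.05)
}

# Three round loops followed by three flat loops
loops <- list(make_loop(1, 1, 0),   make_loop(1, 1, 10),  make_loop(1, 1, 25),
              make_loop(1, 0.3, 5), make_loop(1, 0.3, 20), make_loop(1, 0.3, 40))

# Stand-in for cyclic DTW: minimum Euclidean distance over circular shifts
shift_dist <- function(A, B) {
  n <- nrow(A)
  min(vapply(0:(n - 1), function(s) {
    idx <- ((seq_len(n) - 1 + s) %% n) + 1
    sqrt(sum((A - B[idx, ])^2))
  }, numeric(1)))
}

# Pairwise distance matrix between loops
n_loops <- length(loops)
D <- matrix(0, n_loops, n_loops)
for (i in seq_len(n_loops - 1)) {
  for (k in (i + 1):n_loops) {
    D[i, k] <- D[k, i] <- shift_dist(loops[[i]], loops[[k]])
  }
}

hc <- hclust(as.dist(D), method = "average")
plot(hc, main = "Loops clustered by shift-invariant distance")
groups <- cutree(hc, k = 2)
print(groups)
```

The dendrogram shows which loops merge first; cutree() turns it into group labels once a number of clusters is chosen.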

How do I plot multiple data subset forecast predictions onto a single plot

I am new to R and have found this site extremely helpful, so here is my first posted question. I appreciate your assistance and acknowledge the wisdom on this site.
Background: I start with 5 years of weekly sales data to develop a forecast for future production, based on weekly sales with very strong yearly seasonality. I determined the starting point with:
auto.fit <- auto.arima(arima.ts, stepwise=FALSE, parallel=TRUE, num.cores=6, trace=TRUE )
> ARIMA(2,1,2)(0,0,1)[52] with drift.
Now I wish to verify the accuracy by visually plotting multiple 'windows' into the data and comparing them to the actual values. (This included logging the AIC values.) In other words, the function loops through the data at programmed intervals, recomputing the forecast and plotting it onto the same plot. It plotted correctly when my window started at the head of the data. Now I am looking at a moving 104 week window and the results are all overlaid starting at the 104th observation.
require(forecast) ##[EDITED for simplified clarity]
data <- rep(cos(1:52*(3.1416/26)),5)*100+1000+c(1:26,25:0)
# Create the current fit on data and predict one year out
plot(data, type="l", xlab="weeks", ylab="counts",main="Overlay forecasts & actuals",
sub="green=FIT(1-105,by 16) wks back & PREDICT(26) wks, blue=52 wks")
result <- tryCatch({
arima.fit <- auto.arima(tail(data,156))
arima.pred <- predict(arima.fit, n.ahead=52)
lines(arima.pred$pred, col="blue")
lines(arima.pred$pred+2*arima.pred$se, col="red")
lines(arima.pred$pred-2*arima.pred$se, col="red")
}, error = function(e) {return(e$message)} ) ## Trap error
# Loop and perform comparison plotting of forecast to actuals
for (j in seq(1,105,by=16)) {
result <- tryCatch({
############## This plotted correctly as "Arima(head(data,-j),..."
arima1.fit <- auto.arima(head(tail(data,-j),156))
arima1.pred <- predict(arima1.fit, n.ahead=52)
lines(arima1.pred$pred, col="green", lty=((j %/% 16) %% 6) + 1 ) ## line type cycles with the loop index ('numtests' was undefined)
}, error = function(e) {return(e$message)}) ## Trap errors
}
The plots were accurate when all the forecasting included the head of the file, however, the AIC was not comparable between forecast windows because the sample size kept shrinking.
Question: How do I show the complete 5 years of sales data and overlay forecasts at programmed intervals which are computed from a rolling window of 3 years (156 observations)?
The AIC values logged are comparable using the rolling window approach, but all the forecasts overlay starting at observation 157. I tried making the data into a time series and found the initial data plotted correctly on a time axis, but the forecasts were not time series, so they did not display.
This is answered in another post: "Is there an easy way to revert a forecast back into a time series for plotting?"
This was initially posted as two unique questions, but they have the same answer.
The core question being addressed is "how to restore the original time stamps to the forecast data". What I have learned through trial and error is "configure, then never lose the time series attribute" by applying these steps:
1: Make a time series. Use the ts() command to create a time series.
2: Subset a time series. Use 'window()' to create a subset of the time series in a 'for()' loop. Use 'start()' and 'end()' on the data to show the time axis positions.
3: Forecast a time series. Use 'forecast()' or 'predict()', which operate on time series.
4: Plot a time series. When you plot a time series, the time axis will align correctly for additional data added with the lines() command. {Plotting options are user preference.}
The forecasts will plot over the historical data in the correct time axis location.
The code is here: "Is there an easy way to revert a forecast back into a time series for plotting?"
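The four steps can be sketched roughly as follows. This is not the code from the linked post: to stay self-contained it uses base R's arima() with a fixed, arbitrarily chosen order in place of auto.arima(), adds random noise to the example data so the fit is well behaved, and assumes a start year of 2015 for the time axis.

```r
set.seed(42)

# Step 1: make a time series (weekly data, 5 years; start year is an assumption)
weekly <- rep(cos(1:52 * (pi / 26)), 5) * 100 + 1000 + c(1:26, 25:0) +
  rnorm(260, sd = 10)
sales <- ts(weekly, start = c(2015, 1), frequency = 52)

window_len <- 156   # rolling window of 3 years
horizon    <- 26    # forecast 26 weeks ahead
t_all      <- time(sales)

# Step 4 (first part): plot the full history on a time axis
plot(sales, xlim = c(start(sales)[1], end(sales)[1] + 2),
     xlab = "Year", ylab = "Counts", main = "Rolling-window forecasts")

forecasts   <- list()
window_ends <- numeric(0)
for (j in seq(1, 105, by = 16)) {
  # Step 2: subset with window() so the time attributes are kept
  win <- window(sales, start = t_all[j], end = t_all[j + window_len - 1])

  # Step 3: fit and forecast; predict() returns a ts that starts right
  # after the end of the window (fixed order stands in for auto.arima)
  fit  <- arima(win, order = c(1, 1, 0), method = "ML")
  pred <- predict(fit, n.ahead = horizon)$pred

  # Step 4 (second part): lines() places the forecast at its own time stamps
  lines(pred, col = "green", lty = length(forecasts) %% 6 + 1)

  forecasts[[length(forecasts) + 1]] <- pred
  window_ends <- c(window_ends, tsp(win)[2])
}
```

Because every window has the same length, information criteria logged from the fits refer to the same sample size, and each forecast is drawn where its window ends rather than at a fixed observation.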

How to compare two forecast graphs for two different time series in R?

I want to compare the forecast graphs for two different time series. I have 5 years of monthly rainfall data for two different cities. For both cities, I have plotted the 5 years of observations along with a forecast for 2 more years, using the forecast package. Now I want to compare these two graphs and their 2-year predictions (perhaps in terms of error).
Can anyone help me with this?
You could start with something like this:
f1 <- forecast(series1, h=24)
f2 <- forecast(series2, h=24)
accuracy(f1)
accuracy(f2)
That will give you a lot of error measures on the historical data. Unless you have the actual data for the future periods, you can't do much more than that.
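If some of the observed data can be set aside, one way to get out-of-sample errors is to hold back the last months, forecast them from the earlier data, and compare. The sketch below is an illustration under assumptions: the two rainfall series are simulated, `holdout_errors` is a hypothetical helper, and a seasonal naive forecast (repeating the last observed year) stands in for forecast().

```r
set.seed(7)
months <- 1:60

# Simulated monthly rainfall for two cities (placeholder data)
city1 <- ts(80 + 30 * sin(2 * pi * months / 12) + rnorm(60, sd = 10),
            start = c(2015, 1), frequency = 12)
city2 <- ts(50 + 10 * sin(2 * pi * months / 12) + rnorm(60, sd = 20),
            start = c(2015, 1), frequency = 12)

# Hypothetical helper: hold out the last h observations and score a
# seasonal naive forecast against them.
# Assumption: h is at most one season long (h <= frequency(series)).
holdout_errors <- function(series, h = 12) {
  n     <- length(series)
  train <- window(series, end = time(series)[n - h])
  test  <- window(series, start = time(series)[n - h + 1])

  # Seasonal naive forecast: repeat the last observed season of the training data
  fc  <- as.numeric(tail(train, frequency(series)))[seq_len(h)]
  err <- as.numeric(test) - fc

  c(RMSE = sqrt(mean(err^2)), MAE = mean(abs(err)))
}

errors <- rbind(city1 = holdout_errors(city1),
                city2 = holdout_errors(city2))
print(errors)
```

The same train/test split can be fed to any forecasting method, so the two cities are compared on data the model has not seen.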

Is it possible to arrange a time series in such a way that a specific autocorrelation is created?

I have a file containing 2,500 random numbers. Is it possible to rearrange these saved numbers in such a way that a specific autocorrelation is created? Let's say an autocorrelation at lag 1 of 0.2, an autocorrelation at lag 2 of 0.4, etc.
Any help is greatly appreciated!
To be more specific:
The time series of a daily return in percent of an asset has the following characteristics that I am trying to recreate:
Leptokurtic, symmetric distribution, let's say centered at a daily return of zero
No significant autocorrelations (because the sign of a daily return is not predictable)
Significant autocorrelations if the time series is squared
The aim is to produce a random time series which satisfies all these three characteristics. The only two inputs should be the leptokurtic distribution (this I have already created) and the specific autocorrelation of the squared resulting time series (e.g. the final squared time series should have an autocorrelation at lag 1 of 0.2).
I only know how to produce random numbers from my own mixed distribution. Naturally, if I squared the resulting time series, there would be no autocorrelation. I would like to find an approach that takes this into account.
Generally, the most straightforward way to create autocorrelated data is to generate the data so that it's autocorrelated. For example, you could create an autocorrelated path by always using the value at p-1 as the mean for the random draw at time period p.
Rearranging is not only hard, but sort of odd conceptually. What are you really trying to do in the end? Giving some context might allow better answers.
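The same "generate it autocorrelated" idea can be applied to the variance instead of the mean. The sketch below simulates an ARCH(1) process, a standard model that is not mentioned in this thread and is swapped in here for illustration: each return is a random draw scaled by a variance that depends on the previous squared return. The parameter values, the sample size, and the use of standard normal draws are all assumptions.

```r
set.seed(2024)
n       <- 10000   # sample size (arbitrary for this sketch)
burn_in <- 500     # discarded start-up values
omega   <- 1       # baseline variance level (assumed)
alpha   <- 0.2     # weight on the previous squared return (assumed)

z <- rnorm(n + burn_in)          # independent shocks, unit variance
r <- numeric(n + burn_in)
sigma2 <- omega / (1 - alpha)    # start at the unconditional variance

for (t in seq_along(r)) {
  r[t]   <- sqrt(sigma2) * z[t]          # today's return
  sigma2 <- omega + alpha * r[t]^2       # tomorrow's variance
}
returns <- tail(r, n)

# Lag-1 sample autocorrelations of the returns and of the squared returns
acf_ret <- acf(returns,   plot = FALSE)$acf[2]
acf_sq  <- acf(returns^2, plot = FALSE)$acf[2]
print(c(returns = acf_ret, squared = acf_sq))
```

In this model the returns themselves are uncorrelated while their squares are positively correlated, and with normal shocks the theoretical lag-1 autocorrelation of the squares equals alpha. Shocks from another unit-variance distribution could be substituted for `z`, though the autocorrelation of the squares would then need to be checked empirically.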
There are functions for simulating correlated data: arima.sim() from the stats package and simulate.Arima() from the forecast package.
simulate.Arima() has two advantages: (1) it can simulate seasonal ARIMA models (sometimes called "SARIMA"), and (2) it can simulate a continuation of an existing time series to which you have already fit an ARIMA model. To use simulate.Arima(), you need to already have an Arima object.
UPDATE:
type ?arima.sim then scroll down to "examples".
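For instance, a minimal arima.sim() call might look like the sketch below. The AR(1) coefficient of 0.2 is an assumption chosen to give a lag-1 autocorrelation near 0.2; in an AR(1) model the autocorrelation decays geometrically at higher lags, so this does not reproduce an arbitrary pattern across lags.

```r
set.seed(1)

# Simulate 2,500 values from an AR(1) process with coefficient 0.2
x <- arima.sim(model = list(ar = 0.2), n = 2500)

# Sample autocorrelation at lag 1 (element 1 of $acf is lag 0)
lag1 <- acf(x, plot = FALSE)$acf[2]
print(lag1)

# Full correlogram
acf(x, main = "ACF of simulated AR(1) series")
```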
Alternatively:
install.packages("forecast")
library(forecast)
fit <- auto.arima(USAccDeaths)
plot(USAccDeaths,xlim=c(1973,1982))
lines(simulate(fit, 36),col="red")
