Dual seasonal cycles in ts object - r

I want to strip out seasonality from a ts. This particular ts is daily, and has both yearly and weekly seasonal cycles (frequency 365 and 7).
In order to remove both, I have tried conducting stl() on the ts with frequency set to 365, before extracting trend and remainders, and setting the frequency of the new ts to 7, and repeat.
This doesn't seem to be working very well and I am wondering whether it's my approach, or something inherent to the ts which is causing me problems. Can anyone critique my methodology, and perhaps recommend an alternate approach?

There is a very easy way to do it using a TBATS model implemented in the forecast package. Here is an example assuming your data are stored as x:
library(forecast)
x2 <- msts(x, seasonal.periods=c(7,365))
fit <- tbats(x2)
x.sa <- seasadj(fit)
Details of the model are described in De Livera, Hyndman and Snyder (JASA, 2011).

An approach that handles not only seasonal components (cyclically recurring events) but also trends (slow shifts in the norm) is stl(), specifically as used in Rob J Hyndman's decomp function.
That function (reproduced below) is very helpful for checking for seasonality and then decomposing a time series into seasonal (if one exists), trend, and remainder components.
decomp <- function(x,transform=TRUE)
{
  #decomposes time series into seasonal and trend components
  #from http://robjhyndman.com/researchtips/tscharacteristics/
  require(forecast)
  # Transform series
  if(transform & min(x,na.rm=TRUE) >= 0)
  {
    lambda <- BoxCox.lambda(na.contiguous(x))
    x <- BoxCox(x,lambda)
  }
  else
  {
    lambda <- NULL
    transform <- FALSE
  }
  # Seasonal data
  if(frequency(x)>1)
  {
    x.stl <- stl(x,s.window="periodic",na.action=na.contiguous)
    trend <- x.stl$time.series[,2]
    season <- x.stl$time.series[,1]
    remainder <- x - trend - season
  }
  else #Nonseasonal data
  {
    require(mgcv)
    tt <- 1:length(x)
    trend <- rep(NA,length(x))
    trend[!is.na(x)] <- fitted(gam(x ~ s(tt)))
    season <- NULL
    remainder <- x - trend
  }
  return(list(x=x,trend=trend,season=season,remainder=remainder,
              transform=transform,lambda=lambda))
}
As you can see, it uses stl() (which uses loess) if there is seasonality and penalized regression splines if there is no seasonality.

Check if this is useful:
The start and end values depend on your data; change the frequency accordingly.
splot <- ts(Data1, start=c(2010, 2), end=c(2013, 9), frequency=12)
Additive trend, seasonal, and irregular components can be extracted with the stl() function:
fit <- stl(splot, s.window="periodic")
monthplot(splot)
library(forecast)
vi <- seasonplot(splot)
vi should give separate values for the seasonal indices.
Also check the one below:
splot.stl <- stl(splot,s.window="periodic",na.action=na.contiguous)
trend <- splot.stl$time.series[,2]
season <- splot.stl$time.series[,1]
remainder <- splot - trend - season

Related

Reconstruct seasonally (and non seasonally) differenced data in R

I've achieved stationary data for use in ARIMA forecasts using seasonal and non-seasonal differencing. Now how do I revert back to the original data using the differenced data?
raw <- read.csv("https://raw.githubusercontent.com/thistleknot/Python-Stock/master/data/combined_set.csv",row.names=1,header=TRUE)
temp <- raw$CSUSHPINSA
library(forecast)
#quarterly data, so the seasonal lag is 4
season <- 4
#tells me to seasonally difference 1 time
print(nsdiffs(ts(temp,frequency=season)))
temp_1 <- temp-dplyr::lag(temp,1*season)
#tells me I need to difference it once more
print(ndiffs(temp_1))
temp_2 <- temp_1-dplyr::lag(temp_1,1)
#shows data is somewhat stationary
plot(temp_2)
#gives me back the original dataset if I only had seasonal differencing
na.omit(dplyr::lag(raw$CSUSHPINSA ,4)+temp_1)
#how to do this with temp_2?
Some references
Pandas reverse of diff()
Reverse Diff function in R
Nevermind, I got it
dplyr::lag(raw$CSUSHPINSA ,4) + dplyr::lag(temp_1,1)+temp_2
More complete examples
temp <- raw$MSPUS
#print(nsdiffs(ts(temp,frequency=4)))
#temp_1 <- temp-dplyr::lag(temp,1*season)
print(ndiffs(temp))
temp_1 <- temp-dplyr::lag(temp,1)
temp_2 <- temp_1-dplyr::lag(temp_1,1)
#forecast values of temp_2
temp_3 <- dplyr::lag(temp_1,1)+temp_2
temp_4 = (dplyr::lag(raw$MSPUS ,1) + temp_3)
new_temp_2_values = c(8000,10000)
extended <- c(temp_4,tail(c(c(temp_3),tail(temp_4,1)+cumsum(tail(temp_3,1)+cumsum(new_temp_2_values))),length(new_temp_2_values)))
print(extended)
Wrote a more involved version here
https://gist.github.com/thistleknot/eeaf1631f736e20806c37107f344d50e
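For what it's worth, base R's diffinv() undoes diff() directly, so the lagged sums do not have to be assembled by hand. A minimal sketch on a made-up quarterly vector (the values of x are illustrative, not the CSUSHPINSA series), assuming one seasonal difference at lag 4 followed by one regular difference:

```r
# Made-up quarterly series standing in for the real data (assumption)
x <- c(100, 104, 103, 108, 112, 117, 115, 121, 126, 132, 129, 137)

d_seas <- diff(x, lag = 4)       # seasonal difference
d_both <- diff(d_seas, lag = 1)  # regular difference on top of it

# Undo the regular difference; xi is the first seasonally differenced value
rec_seas <- diffinv(d_both, lag = 1, xi = d_seas[1])
# Undo the seasonal difference; xi holds the first 4 original values
rec_x <- diffinv(rec_seas, lag = 4, xi = x[1:4])

# TRUE when the reconstruction matches the original series
all.equal(rec_x, x)
```

To extend the series with forecasted differences, append them to d_both before inverting; the same two diffinv() calls then return the original values followed by the forecasted levels.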

Implementation of time series cross-validation

I am working with time series 551 of the monthly data of the M3 competition.
So, my data is :
library(forecast)
library(Mcomp)
# Time Series
# Subset the M3 data to contain the relevant series
ts.data<- subset(M3, 12)[[551]]
print(ts.data)
I want to implement time series cross-validation for the last 18 observations of the in-sample interval.
Some people would normally call this “forecast evaluation with a rolling origin” or something similar.
How can I achieve that? What does the in-sample interval mean? Which time series must I evaluate?
I'm quite confused, so any help clearing this up would be welcome.
The tsCV function of the forecast package is a good place to start.
From its documentation,
tsCV(y, forecastfunction, h = 1, window = NULL, xreg = NULL, initial = 0, ...)
Let ‘y’ contain the time series y[1:T]. Then ‘forecastfunction’ is
applied successively to the time series y[1:t], for t=1,...,T-h,
making predictions f[t+h]. The errors are given by e[t+h] =
y[t+h]-f[t+h].
That is, tsCV first fits a model to y[1] and forecasts y[1 + h], then fits a model to y[1:2] and forecasts y[2 + h], and so on for T-h steps.
The tsCV function returns the forecast errors.
Applying this to the training data of the ts.data
# function to fit a model and forecast
fmodel <- function(x, h){
forecast(Arima(x, order=c(1,1,1), seasonal = c(0, 0, 2)), h=h)
}
# time-series CV
cv_errs <- tsCV(ts.data$x, fmodel, h = 1)
# RMSE of the time-series CV
sqrt(mean(cv_errs^2, na.rm=TRUE))
# [1] 778.7898
In your case, it may be that you are supposed to
fit a model to ts.data$x and then forecast ts.data$xx[1],
fit a model to c(ts.data$x, ts.data$xx[1]) and forecast ts.data$xx[2],
and so on.
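If the goal is a rolling origin over only the last 18 in-sample observations, an explicit loop may make the mechanics clearer. The following is a rough sketch under stated assumptions: it uses the built-in AirPassengers series and base R's arima() as stand-ins for M3 series 551 and forecast::Arima, and the model orders are arbitrary.

```r
# Stand-in monthly series (assumption: replace with ts.data$x)
y <- log(AirPassengers)
n <- length(y)
k <- 18                      # number of rolling origins at the end of the sample
errs <- numeric(k)

for (i in seq_len(k)) {
  origin <- n - k + i - 1    # last observation used for fitting
  train <- ts(y[1:origin], start = start(y), frequency = frequency(y))
  # Illustrative model choice, not a recommendation
  fit <- arima(train, order = c(0, 1, 1), seasonal = c(0, 1, 1))
  fc <- predict(fit, n.ahead = 1)$pred
  errs[i] <- y[origin + 1] - fc[1]   # one-step-ahead forecast error
}

rmse <- sqrt(mean(errs^2))
rmse
```

Each pass refits on everything up to the origin and scores the next observation, so every one of the last 18 in-sample points is forecast exactly once.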

Adding two forecast objects in R

I have two forecast objects: one obtained with ARIMA, where I forecast a deseasonalized time series, and another for the seasonal component of the same ts, forecast with the seasonal naive method, so it repeats the last year. I'd like to combine those forecasts in one object by adding their values. How can I do it?
Here's the code
ts.comp <- stl(ts, s.window="periodic")
deseasonal_ts <- seasadj(ts.comp)
fit <- auto.arima(deseasonal_ts, seasonal=FALSE)
prediction <- forecast(fit, h=30)
seasonal_ts <- ts.comp$time.series[,1]
seasonal_ts_prediction<- snaive(seasonal_ts, 30)
I'd like to combine prediction and seasonal_ts_prediction. Is this possible?
This can be done in the following way:
predComb <- prediction$mean + seasonal_ts_prediction$mean
You can see the outcome:
foo <- ts.union(nottem, predComb) # for the nottem time-series
plot(foo)
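The same idea can be sketched end to end in base R, which makes the time alignment of the two pieces explicit. This is only an illustration under assumptions: it uses the nottem series, an arbitrary AR(1) fitted with arima() in place of auto.arima(), and a hand-rolled seasonal naive forecast in place of snaive().

```r
h <- 24
comp <- stl(nottem, s.window = "periodic")
seas <- comp$time.series[, "seasonal"]
deseason <- nottem - seas                 # seasonally adjusted series

# Forecast the adjusted series (illustrative AR(1), an assumption)
fit <- arima(deseason, order = c(1, 0, 0))
pred <- predict(fit, n.ahead = h)$pred

# Seasonal naive: repeat the last full year of the seasonal component
last_year <- as.numeric(tail(seas, 12))
seas_pred <- ts(rep(last_year, length.out = h),
                start = start(pred), frequency = frequency(pred))

# Both pieces share the same time index, so they add element-wise
pred_comb <- pred + seas_pred
plot(ts.union(nottem, pred_comb))
```

Adding only the point forecasts says nothing about the combined prediction intervals, which would need separate treatment.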

Weekly and Yearly Seasonality in R

I have daily electric load data from 1-1-2007 till 31-12-2016. I use ts() function to load the data like so
ts_load <- ts(data, start = c(2007,1), end = c(2016,12),frequency = 365)
I want to remove the yearly and weekly seasonality from my data, to decompose the data and remove the seasonality, I use the following code
decompose_load = decompose(ts_load, "additive")
deseasonalized = ts_load - decompose_load$seasonal
My question is, am I doing it right? is this the right way to remove the yearly seasonality? and what is the right way to remove the weekly seasonality?
A few points:
a ts series must have regularly spaced points and the same number of points in each cycle. In the question a frequency of 365 is specified but some years, i.e. leap years, would have 366 points. In particular, if you want the frequency to be a year then you can't use daily or weekly data without adjustment since different years have different numbers of days and the number of weeks in a year is not integer.
decompose does not handle multiple seasonalities. If by weekly you mean remove the effect of Monday, of Tuesday, etc. and if by yearly you mean remove the effect of being 1st of the year, 2nd of the year, etc. then you are asking for multiple seasonalities.
end = c(2016, 12) means the 12th day of 2016 since frequency is 365.
The msts function in the forecast package can handle multiple and non-integer seasonalities.
Staying with base R, another approach is to approximate it by a linear model avoiding all the above problems (but ignoring correlations) and we will discuss that.
Assuming the data shown reproducibly in the Note at the end we define the day of week, dow, and day of year, doy, variables and regress on those with an intercept and trend and then construct just the intercept plus trend plus residuals in the last line of code to deseasonalize. This isn't absolutely necessary but we have used scale to remove the mean of trend in order that the three terms defining data.ds are mutually orthogonal -- Whether or not we do this the third term will be orthogonal to the other 2 by the properties of linear models.
trend <- scale(seq_along(d), TRUE, FALSE)
dow <- format(d, "%a")
doy <- format(d, "%j")
fm <- lm(data ~ trend + dow + doy)
data.ds <- coef(fm)[1] + coef(fm)[2] * trend + resid(fm)
Note
Test data used in reproducible form:
set.seed(123)
d <- seq(as.Date("2007-01-01"), as.Date("2016-12-31"), "day")
n <- length(d)
trend <- 1:n
seas_week <- rep(1:7, length = n)
seas_year <- rep(1:365, length = n)
noise <- rnorm(n)
data <- trend + seas_week + seas_year + noise
You can use the dsa function in the dsa package to adjust a daily time series. Its advantage over the regression solution is that it takes into account that the impact of the season can change over time, which is usually the case.
To use that function, your data should be in the xts format (from the xts package), because in that case the leap year is not ignored.
The code will then look something like this:
install.packages(c("xts", "dsa"))
data = rnorm(365.25*10, 100, 1)
data_xts <- xts::xts(data, seq.Date(as.Date("2007-01-01"), by="days", length.out = length(data)))
sa = dsa::dsa(data_xts, fourier_number = 24)
# the fourier_number is used to model monthly recurring seasonal patterns in the regARIMA part
data_adjusted <- sa$output[,1]

Time Series Decomposition on a few months of data?

I'm trying to decompose my data to see what the trend and seasonality effects are. I have 4 months of data, recorded daily. Data looks like:
date amount
11/1/2000 1700
11/2/2000 11087
11/3/2000 11248
11/4/2000 13336
11/5/2000 18815
11/6/2000 8820
11/7/2000 7687
11/8/2000 5514
11/9/2000 9591
11/10/2000 9676
11/11/2000 14782
11/12/2000 18554
And so forth to the end of Feb 2001. I read in the data like so and generate a timeseries object:
myvector <- read.table("clipboard", sep="\t", header=T)
myts <- ts(myvector$amount, start=c(2000,11), frequency=52)
I'm very confused as to how to read this data in as a time series object. The data is recorded daily, but if I use frequency=365, then try
fit <- stl(myts2, s.window="periodic")
I get:
Error in stl(myts2, s.window = "periodic") :
series is not periodic or has less than two periods
Every example I find does the object casting with multiple years worth of data. Is this not possible in my case?
I know the next steps for plotting the trend and decomposition are:
fit <- stl(myts, s.window="periodic")
plot(fit)
Try seasonal differencing, which is similar to regular differencing except is applied over different periods:
An example:
data(austres)
plot(austres)
# austres is quarterly, so the seasonal lag is frequency(austres) = 4
seasonal <- diff(austres, lag = frequency(austres), differences = 1)
plot(seasonal)
d.seasonal <- diff(seasonal, differences = 2)
plot(d.seasonal)
Now the seasonal component of the time series has been differenced out.
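As for the stl() error itself: four months of daily data cover less than one yearly period, but they cover many weekly ones, so declaring a weekly cycle (frequency = 7) should let stl() run. A small sketch on simulated data (the numbers and the day-of-week pattern are made up for illustration):

```r
set.seed(1)
n <- 120                                   # roughly four months of daily data
# Hypothetical day-of-week effects, repeated over the sample
weekly <- rep(c(0, 500, 800, 3000, 8000, -1500, -2500), length.out = n)
amount <- 10000 + 20 * seq_len(n) + weekly + rnorm(n, sd = 500)

# One "period" is a week, so the series spans about 17 periods
myts <- ts(amount, frequency = 7)
fit <- stl(myts, s.window = "periodic")
seasonal <- fit$time.series[, "seasonal"]
plot(fit)
```

With s.window = "periodic" the estimated weekly pattern is the same in every week; any yearly effect cannot be identified from four months and ends up in the trend or remainder.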
