I'm fitting a VAR model and trying to forecast with a rolling window, but I can't get it to work. I have written a loop for my ARIMA model, which works (although it gives the warning "In window.default(x, ...) : 'end' value not changed", which I don't understand).
This is my code for the rolling window with ARIMA:
library(forecast)
h <- 1 # h-step ahead
train <- window(Y, start=2002, end=c(2017,12),frequency = 12) #in-sample
test <- window(Y, start=2018, end=c(2021,12),frequency = 12) #out-of-sample
n <- length(test) - h+1
fit <- arima(Y, order=c(1,1,1), seasonal = list(order = c(1, 0, 1), period = 12))
fc <- ts(numeric(n), start=2018+(h-1)/12, freq=12)
for(i in 1:n) {
  # extend the estimation sample by one month at each step
  x <- window(Y, end=c(2017,12) + (i-1)/12)
  refit <- Arima(x, model=fit) # apply the estimated model to the extended data
  fc[i] <- forecast(refit, h=h)$mean[h]
}
However, when forecasting with VAR I have to use predict, and I can't get that to work with the rolling window. Does anyone have any suggestions?
My training set contains 192 values and my test set 48.
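A note on the window arithmetic in the ARIMA loop above, which may explain the warning: adding a scalar to c(2017, 12) shifts both the year and the period, so the end point drifts ahead of the intended month as i grows. Once it passes the end of the series, window() would warn that 'end' was not changed. The sketch below uses a simulated stand-in for Y (an assumption, as are the names n_train, bad_end and good_end) and takes the end point from the series' own time index instead.

```r
# Simulated monthly series, Jan 2002 to Dec 2021 (240 points); a stand-in for Y
set.seed(1)
Y <- ts(cumsum(rnorm(240)), start = c(2002, 1), frequency = 12)

n_train <- 192   # Jan 2002 to Dec 2017
i <- 13          # thirteenth rolling step; the sample should end in Dec 2018

# Adding a scalar to c(year, period) shifts BOTH elements
bad_end <- c(2017, 12) + (i - 1) / 12
bad_end                           # c(2018, 13), i.e. period 13 of 2018 = Jan 2019
x_bad <- window(Y, end = bad_end)
length(x_bad)                     # one observation more than intended

# Safer: read the end point off the series' own time index
good_end <- time(Y)[n_train + i - 1]
x_good <- window(Y, end = good_end)
length(x_good)                    # n_train + i - 1 observations
```

With good_end in place of the c(2017,12) + (i-1)/12 expression, each refit would use exactly one more observation than the previous one.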
The data are too large to dput here, but suppose realmatrix is an "mts" object with non-trivial values:
realmatrix <- matrix(NA, ncol = 100, nrow = 138)
In fact it stores 100 time series with length (rows) = 138 (from Jan 2005 to June 2016).
I want to store the Arima forecasts (12 months ahead: that is, from July 2016 to June 2017) in another matrix farimamatrix (which should have 12 rows and 100 columns), via the following loop:
farimamatrix <- matrix(NA, nrow = 12, ncol = 100)
m <- k <- list()
for (i in 1:100) {
try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
k[[i]] <- forecast.Arima(m[[i]], h=12)
farimamatrix[,i] <- fitted(k[[i]])
}
But I am getting the following message:
Error in farimamatrix[, i] <- fitted(k[[i]]) :
incorrect number of subscripts on matrix
What's wrong? Thanks in advance.
Edited (24/10): updated / corrected under Zheyuan's answer and previous problem gone
Original data:
tsdata <-
structure(c(28220L, 27699L, 28445L, 29207L, 28482L, 28326L, 28322L,
28611L, 29187L, 29145L, 29288L, 29352L, 28881L, 29383L, 29898L,
29888L, 28925L, 29069L, 29114L, 29886L, 29917L, 30144L, 30531L,
30494L, 30700L, 30325L, 31313L, 32031L, 31383L, 30767L, 30500L,
31181L, 31736L, 32136L, 32654L, 32305L, 31856L, 31731L, 32119L,
31953L, 32300L, 31743L, 32150L, 33014L, 32964L, 33674L, 33410L,
31559L, 30667L, 30495L, 31978L, 32043L, 30945L, 30715L, 31325L,
32262L, 32717L, 33420L, 33617L, 34123L, 33362L, 33731L, 35118L,
35027L, 34298L, 34171L, 33851L, 34715L, 35184L, 35190L, 35079L,
35958L, 35875L, 35446L, 36352L, 36050L, 35567L, 35161L, 35419L,
36337L, 36967L, 36745L, 36370L, 36744L, 36303L, 36899L, 38621L,
37994L, 36809L, 36527L, 35916L, 37178L, 37661L, 37794L, 38642L,
37763L, 38367L, 38006L, 38442L, 38654L, 38345L, 37628L, 37698L,
38613L, 38525L, 39389L, 39920L, 39556L, 40280L, 41653L, 40269L,
39592L, 39100L, 37726L, 37867L, 38551L, 38895L, 40100L, 40950L,
39838L, 40643L, 40611L, 39611L, 39445L, 38059L, 37131L, 36697L,
37746L, 37733L, 39188L, 39127L, 38554L, 38219L, 38497L, 39165L,
40077L, 38370L, 37174L), .Dim = c(138L, 1L), .Dimnames = list(
NULL, "Data"), .Tsp = c(2005, 2016.41666666667, 12), class = "ts")
Code
library("forecast")
z <- stl(tsdata[, "Data"], s.window="periodic")
t <- z$time.series[,"trend"]
s <- z$time.series[,"seasonal"]
e <- z$time.series[,"remainder"]
# error matrix
ematrix <- matrix(rnorm(138 * 100, sd = 100), nrow = 138)
# generating a ts class error matrix
ematrixts <- ts(ematrix, start=c(2005,1), freq=12)
# combining the trend + season + error matrix into a real matrix
realmatrix <- t + s + ematrixts
# creating a (forecast) arima matrix
farimamatrix <- matrix(NA, ncol = 100, nrow = 12)
m <- k <- vector("list", length = 100)
for (i in 1:100) {
try(m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(1,0,1)))
print(i)
k[[i]] <- forecast.Arima(m[[i]], h = 12)
farimamatrix[,i] <- k[[i]]$mean
}
# ts.plot(farimamatrix[,1:100],col = c(rep("gray",100),rep("red",1)))
The loop seems to work, but breaks down after a few iterations due to failure of Arima:
Error in stats::arima(x = x, order = order, seasonal = seasonal, include.mean = include.mean, : non-stationary seasonal AR part from CSS
Yep, the previous problem is gone, and now you have a new problem, regarding the failure of Arima. Strictly speaking you should raise a new question on this. But I will answer it here anyway.
The error message is quite informative. When you fit an ARIMA(0,1,0)(1,0,1) model, the estimated seasonal AR part is sometimes non-stationary, so a further seasonal differencing is needed.
Looking at ts.plot(realmatrix), I see that all 100 columns of realmatrix are very similar. I will therefore take the first column for some analysis.
x <- realmatrix[,1]
The non-seasonal differencing is clearly a must, but do we need a seasonal differencing as well? Check the ACF:
acf(diff(x))
The ACF shows strong evidence of a seasonal pattern. So yes, a seasonal differencing is needed.
Now let's check the ACF after both differences:
acf(diff(diff(x, lag = 12))) ## first do seasonal diff, then non-seasonal diff
There appears to be a negative spike at the seasonal lag, suggesting a seasonal MA process. So ARIMA(0,1,0)(0,1,1)[12] would be a good bet.
fit <- arima(x, order = c(0,1,0), seasonal = c(0,1,1))
Check the residuals:
acf(fit$residuals)
I am happy with this result: there is no autocorrelation at lag 1 or lag 2, and no seasonal autocorrelation either. You can try adding a seasonal and/or non-seasonal AR(1) term, but it brings no improvement, so this is the final model.
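As a numerical complement to reading the residual ACF by eye, a Ljung-Box test can be run on the residuals. This is only a sketch using base R: the built-in log(AirPassengers) series is a stand-in for realmatrix[,1] (an assumption, since the original data are simulated elsewhere), and lag = 24 is an arbitrary choice covering two seasonal cycles.

```r
# Stand-in monthly series with trend and seasonality (not the data from the question)
x_demo <- log(AirPassengers)

# Same model structure as above: ARIMA(0,1,0)(0,1,1)[12]
fit_demo <- arima(x_demo, order = c(0, 1, 0),
                  seasonal = list(order = c(0, 1, 1), period = 12))

# Ljung-Box test on the residuals; fitdf = 1 accounts for the one estimated
# coefficient (the seasonal MA term)
lb <- Box.test(fit_demo$residuals, lag = 24, type = "Ljung-Box", fitdf = 1)
lb$p.value   # a small p-value would point to leftover autocorrelation
```

If the p-value is small for a given series, that would be a reason to revisit the model orders for that series.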
So use the following loop:
farimamatrix <- matrix(NA, ncol = 100, nrow = 12)
m <- k <- vector("list", length = 100)
for (i in 1:100) {
m[[i]] <- Arima(realmatrix[,i], order = c(0,1,0), seasonal = c(0,1,1))
print(i)
k[[i]] <- forecast(m[[i]], h = 12) # generic forecast(); forecast.Arima is not exported in recent versions
farimamatrix[,i] <- k[[i]]$mean
}
Now all 100 model fits succeed.
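If some fits might still fail on other data, the loop can be made robust so that one failure does not stop the remaining series. The following is a minimal sketch under stated assumptions: it uses base stats::arima() and predict() rather than the forecast package, a small simulated matrix sim in place of realmatrix, and the hypothetical names fc_mat and failed.

```r
set.seed(42)
n_series <- 5

# Hypothetical stand-in for realmatrix: fixed seasonal pattern plus a random walk
season <- rep_len(10 * sin((1:12) * pi / 6), 138)
sim <- ts(sapply(seq_len(n_series), function(j) season + cumsum(rnorm(138))),
          start = c(2005, 1), frequency = 12)

fc_mat <- matrix(NA_real_, nrow = 12, ncol = n_series)  # 12-step forecasts per series
failed <- logical(n_series)                             # records which fits failed

for (i in seq_len(n_series)) {
  fit_i <- tryCatch(
    arima(sim[, i], order = c(0, 1, 0),
          seasonal = list(order = c(0, 1, 1), period = 12)),
    error = function(e) NULL   # return NULL instead of stopping the loop
  )
  if (is.null(fit_i)) {
    failed[i] <- TRUE          # leave this column as NA and move on
    next
  }
  fc_mat[, i] <- predict(fit_i, n.ahead = 12)$pred
}

which(failed)   # indices of series whose fit failed, if any
```

Unlike wrapping only the fit in try(), this skips the forecast step for a failed fit, so a stale or missing model is never passed on.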
---------
A retrospective reflection
Perhaps I should explain why the ARIMA(0,1,0)(1,0,1)[12] model worked for my simulated data in the initial answer. Note how I simulated the data:
seasonal <- rep_len(sin((1:12) * pi / 6), 138)
The underlying seasonal pattern is an exact replication, so it is of course stationary.
I am new to R programming. I've generated a hierarchical time series using the hts package. I need to plot the time series at each level of the hierarchy separately using dygraphs.
library(hts)
abc <- ts(5 + matrix(sort(rnorm(1000)), ncol = 10, nrow = 100))
colnames(abc) <- c("A10A", "A10B", "A10C", "A20A", "A20B",
"B30A", "B30B", "B30C", "B40A", "B40B")
y <- hts(abc, characters = c(1, 2, 1))
fcasts1 <- forecast(y, method = "bu" ,h=4, fmethod = "arima",
parallel = TRUE)
dygraph(fcasts1,y)
I keep getting this error message:
Error in UseMethod("as.xts") :
no applicable method for 'as.xts' applied to an object of class "c('gts', 'hts')"
Is there a solution to this issue? Maybe someone could tell me how to pass the variables to dygraph correctly.
It is not possible to plot hts objects directly with dygraph. Instead, convert the bts component of the forecast object to a matrix, then turn that into an ordinary time series with the ts() function.
Here is an example I've worked out.
library(hts)
library(dygraphs)
abc <- ts(5 + matrix(sort(rnorm(1000)), ncol = 10, nrow = 100))
colnames(abc) <- c("A10A", "A10B", "A10C", "A20A", "A20B",
"B30A", "B30B", "B30C", "B40A", "B40B")
y <- hts(abc, characters = c(1, 2, 1))
# forecast() dispatches to the gts/hts method
fcasts1 <- forecast(y, method = "bu" ,h=4, fmethod = "arima",
parallel = TRUE)
# bottom-level forecasts as a plain matrix, then as a regular ts
ts1 <- as.matrix(fcasts1$bts)
ts1 <- ts(ts1,start = c(2016,3), frequency = 12)
dygraph(ts1[,"A10A"],main='Sample dygraph ',ylab = 'Demand')
In the following example, I am trying to use Holt-Winters smoothing on daily data, but I run into a couple of issues:
# generate some dummy daily data
mData = cbind(seq.Date(from = as.Date('2011-12-01'),
to = as.Date('2013-11-30'), by = 'day'), rnorm(731))
# convert to a zoo object
zooData = as.zoo(mData[, 2, drop = FALSE],
order.by = as.Date(mData[, 1, drop = FALSE], format = '%Y-%m-%d'),
frequency = 7)
# attempt Holt-Winters smoothing
hw(x = zooData, h = 10, seasonal = 'additive', damped = FALSE,
initial = 'optimal', exponential = FALSE, fan = FALSE)
# no missing values in the data
sum(is.na(zooData))
This leads to the following error:
Error in ets(x, "AAA", alpha = alpha, beta = beta, gamma = gamma,
damped = damped, : You've got to be joking. I need more data! In
addition: Warning message: In ets(x, "AAA", alpha = alpha, beta =
beta, gamma = gamma, damped = damped, : Missing values encountered.
Using longest contiguous portion of time series
Emphasis mine.
Couple of questions:
1. Where are the missing values coming from?
2. I am assuming that the "need more data" arises from attempting to estimate 365 seasonal parameters?
Update 1:
Based on Gabor's suggestion, I have recreated a fractional index for the data where whole numbers are weeks.
I have a couple of questions.
1. Is this an appropriate way of handling daily data when the periodicity is assumed to be weekly?
2. Is there a more elegant way of handling the dates when working with daily data?
library(zoo)
library(forecast)
# generate some dummy daily data
mData = cbind(seq.Date(from = as.Date('2011-12-01'),
to = as.Date('2013-11-30'), by = 'day'), rnorm(731))
# convert to a zoo object with weekly frequency
zooDataWeekly = as.zoo(mData[, 2, drop = FALSE],
order.by = seq(from = 0, by = 1/7, length.out = 731))
# attempt Holt-Winters smoothing
hwData = hw(x = zooDataWeekly, h = 10, seasonal = 'additive', damped = FALSE,
initial = 'optimal', exponential = FALSE, fan = FALSE)
plot(zooDataWeekly, col = 'red')
lines(fitted(hwData))
hw requires a ts object not a zoo object. Use
zooDataWeekly <- ts(mData[,2], frequency=7)
Unless there is a good reason for specifying the model exactly, it is usually better to let R select the best model for you:
fit <- ets(zooDataWeekly)
fc <- forecast(fit)
plot(fc)
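To see what ts(..., frequency = 7) does to daily data, the sketch below inspects the resulting time index using base R only. The simulated values vals and the names daily_ts and dec are assumptions for illustration, and decompose() stands in as a simple way to show the weekly structure (it is not the hw or ets method used above).

```r
set.seed(1)
# Hypothetical daily values with a weekly pattern, 731 days as in the question
vals <- 3 * sin(2 * pi * (1:731) / 7) + rnorm(731)
daily_ts <- ts(vals, frequency = 7)

frequency(daily_ts)       # 7 observations per cycle
time(daily_ts)[c(1, 8)]   # whole numbers mark the start of each week
cycle(daily_ts)[1:8]      # position within the week: 1..7, then 1 again

# A classical decomposition gives one seasonal effect per day of the week
dec <- decompose(daily_ts)
length(dec$figure)
```

The time index here counts weeks rather than calendar dates, so the original dates need to be kept separately if they are wanted for plotting.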