Hierarchical forecasting of time series including missing values (R)

I am trying to forecast a hierarchical time series that includes missing values.
I expect the same behavior as auto.arima shows for a single time series:
the missing values should not influence the result and should not be displayed.
fit = ts %>% auto.arima()
forecast(fit, h=20) %>% autoplot()
But when I try to forecast the hierarchical time series, the NAs are automatically replaced by 0.
This changes the results dramatically.
Both of the following calls produce the same output:
hts_fc <- forecast(object = hts
, h = 20
, fmethod = "arima"
)
hts_fc <- forecast(object = hts
, h = 20
, FUN = auto.arima
)
plot(hts_fc)
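For comparison, the single-series behaviour described above can be sketched with base R's arima(), used here as a stand-in for auto.arima(). That substitution is an assumption made so the example needs no extra packages, and the model orders are fixed by hand purely for illustration. When NAs are present, arima() fits by maximum likelihood and handles the missing observations in the likelihood instead of replacing them by 0.

```r
# Single series with missing values: the NAs are kept as NAs, not set to 0.
y <- AirPassengers
y[c(20, 21, 50)] <- NA                     # knock out a few observations

# Orders chosen by hand for illustration (assumption, not a tuned model).
fit <- arima(y, order = c(0, 1, 1), seasonal = c(0, 1, 1))

# 20-step-ahead forecast; fc$pred holds the point forecasts, fc$se the errors.
fc <- predict(fit, n.ahead = 20)
fc$pred
```

This only illustrates the single-series case; it does not change how the hierarchical forecast above treats NAs.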

Related

Grey-Markov method in R

In R, I have loaded the built-in time series AirPassengers and split it into training and test data like this:
rm(list = ls())
data = AirPassengers
traindata = ts(data[1:(0.75*length(data))], frequency = 12)
testdata = ts(data[((0.75*length(data))+1):length(data)], frequency = 12)
From here I want to estimate future values of the time series from the training data using the Grey-Markov method. I know the Grey-Markov method consists of a Grey GM(1, 1) forecasting model followed by a Markov chain refinement of the forecast. But is there a function in R that performs the Grey-Markov method on its own, just like, for example, the auto.arima function?
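As a minimal sketch of the first stage only, the GM(1, 1) model can be written in a few lines of base R. The function name gm11_forecast and its return structure are hypothetical, chosen for this illustration. The Markov chain refinement is not implemented here, and the model assumes a positive, roughly exponential series, so it will not capture the seasonality of AirPassengers.

```r
# GM(1,1) grey model sketch (base R only; Markov refinement NOT included).
gm11_forecast <- function(x0, h = 1) {
  n  <- length(x0)
  x1 <- cumsum(x0)                          # accumulated generating series
  z1 <- (x1[-1] + x1[-n]) / 2               # background values (neighbour means)
  # Least-squares fit of x0[k] = -a * z1[k] + b for k = 2..n
  B   <- cbind(-z1, 1)
  est <- solve(t(B) %*% B, t(B) %*% x0[-1])
  a <- est[1]
  b <- est[2]
  # Time-response function of the accumulated series for k = 0, 1, ...
  k      <- 0:(n + h - 1)
  x1_hat <- (x0[1] - b / a) * exp(-a * k) + b / a
  x0_hat <- c(x0[1], diff(x1_hat))          # difference back to original scale
  list(a = a, b = b,
       fitted   = x0_hat[1:n],
       forecast = x0_hat[(n + 1):(n + h)])
}

# Usage on the training part of AirPassengers (first 75% of the series).
train <- as.numeric(AirPassengers)[1:108]
gm_fit <- gm11_forecast(train, h = 36)
head(gm_fit$forecast)
```

The residuals train - gm_fit$fitted would be the input to a Markov chain state classification in the second stage.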

forecast() function in R: how does it work?

I have a question about the forecast() function from the forecast package.
I am using this function to forecast the closing price of a stock with an ARIMAX model (with xreg). My question is: when forecasting, does the closing price at time t depend on the external regressors at time t-1, or on the external regressors at time t?
In other words, today I still don't know the high price (for example), so today's closing price cannot depend on today's high price, only on yesterday's.
Does the function work like that, or in a different way?
I hope I have been clear. Thanks!
Yes, you can set up the function to work like this, though there are some steps to take:
lag the regressor, since you want yesterday's value to explain today's
drop the observations without a regressor (the first value of the time series has no lagged regressor, because its own regressor value is used for the second value of the series)
build the regressor for the prediction
fit the model and predict
Below I put together something from a few links that shows how it can be done, and thus should explain how prediction with a regressor works with forecast in your case:
library(quantmod)
library(forecast)
library(dplyr)
# get some finance data to play with
quantmod::getSymbols("AAPL", from = '2017-01-01',
to = "2018-03-01",warnings = FALSE,
auto.assign = TRUE)
# I prefer working with df and then convert to ts objects later
new_AAPL <- as.data.frame(AAPL) %>%
# select close values and lag high values (dplyr::lag, not stats::lag)
dplyr::transmute(AAPL.Close,
AAPL.High = dplyr::lag(AAPL.High)) %>%
# keep only complete rows
dplyr::filter(dplyr::if_all(dplyr::everything(), ~ !is.na(.x)))
# set up new time series, regressor (watch the starting points)
AAPL.Close <- ts(new_AAPL$AAPL.Close, start = as.Date("2017-01-04"), frequency = 365)
AAPL.High <- ts(new_AAPL$AAPL.High, start = as.Date("2017-01-04"), frequency = 365)
# set up the future regressor (last value of the original high values)
AAPL.futureg <- ts(tail(as.data.frame(AAPL)$AAPL.High, 1), start = as.Date("2018-03-02"), frequency = 365)
# I will use an ARIMA model here
modArima <- forecast::auto.arima(AAPL.Close, xreg=AAPL.High)
# forecast with regressor
forecast::forecast(modArima, h = 1, xreg = AAPL.futureg)
Here is where I got the information from:
https://www.codingfinance.com/post/2018-03-27-download-price/
https://stats.stackexchange.com/questions/41070/how-to-setup-xreg-argument-in-auto-arima-in-r
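The same alignment can be sketched without downloading data, using simulated series and base R's arima() in place of auto.arima(). Everything in this example is an assumption made for illustration: the variable names, the simulated relationship (today's close driven by yesterday's high), and the fixed AR(1) error structure.

```r
set.seed(42)
n <- 200
high  <- 100 + cumsum(rnorm(n))            # simulated daily high prices
close <- rep(NA_real_, n)
# Today's close depends on YESTERDAY's high plus a little noise.
close[2:n] <- 0.8 * high[1:(n - 1)] + rnorm(n - 1, sd = 0.1)

# Align: drop the first close (no lagged regressor) and the last high.
y         <- close[-1]
high_lag1 <- high[-n]

# Regression with AR(1) errors on the lagged regressor.
fit <- arima(y, order = c(1, 0, 0), xreg = high_lag1)

# Forecast tomorrow's close from the last observed high, which is known today.
fc <- predict(fit, n.ahead = 1, newxreg = high[n])
fc$pred
```

The point of the sketch is the indexing: the regressor passed at fit time is shifted by one step, so the value needed for a one-step forecast is already observed.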

Forecasting of multivariate data through Vector Autoregression model

I am working on functional time series using multivariate (hourly) time series data. I am using a FAR model of order higher than one, for which no statistical package is available in R. So I convert my data into functional form, obtain the functional principal components, and from this FPCA extract the corresponding FPC scores. Now I fit a VAR model to those FPC scores to forecast each of the 24 hours. The VAR gives me forecast values for 23 hours when I set phat=23, but whenever I set phat=24, i.e. when I want to predict all 24 hours, the results are NA. The code is given below.
library(vars)
library(fda)
fdata<- function(mat){
nb = 27 # number of basis functions for the data
fbf = create.fourier.basis(rangeval=c(0,1), nbasis=nb) # basis for data
args=seq(0,1,length=24)
fdata1=Data2fd(args,y=t(mat),fbf) # functions generated from discretized y
return(fdata1)
}
prediction.ffpe = function(fdata1){
n = ncol(fdata1$coef)
D = nrow(fdata1$coef)
#center the data
#mu = mean.fd(fdata1)
data = center.fd(fdata1)
#ffpe = fFPE(fdata1, Pmax=10)
#p.hat = ffpe[2] #order of the model
d.hat=23
p.hat=6
#fPCA
fpca = pca.fd(data,nharm=D, centerfns=TRUE)
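scores = fpca$scores[, 1:d.hat] # R indexes from 1; a 0 index is silently dropped
# to avoid warnings from vars predict function below
colnames(scores) <- as.character(seq_len(d.hat))
# type = "const" is an argument of VAR(), not of predict()
VAR.pre = predict(VAR(scores, p = p.hat, type = "const"), n.ahead = 1)$fcst
return(VAR.pre)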
}
Kindly guide me on how I can solve this problem, or what error I am making. Thanks.

Auto.Arima incorrectly predicts first point

I'm trying to complete a time series analysis of some reservoir data and am using auto.arima with a Fourier component to account for seasonality, as described here: https://otexts.com/fpp2/dhr.html#dhr
The code I have used is shown below, and the dataset I used can be found here: https://www.dropbox.com/sh/563nu3daeid0agb/AAB6NSddVUKgBCCbQtuqXPsZa?dl=0
Reservoir = read.csv("Reservoir1.csv",TRUE,",")
#impute missing data from data set
Reservoir = imputeTS::na_interpolation(Reservoir)
#Create Time Series
Reservoir = ts(Reservoir[,2],frequency = (365.25),start = c(2013,116))
plots = list()
for (i in seq (10)) {
fit = auto.arima(Reservoir, xreg = fourier(Reservoir, K = i), seasonal = FALSE)
plots[[i]] = autoplot(forecast(fit, xreg = fourier(Reservoir, K = i, h=10))) +
xlab(paste("K=",i,"AICC=",round(fit[["aicc"]],2))) + ylab("")
}
gridExtra::grid.arrange(plots[[1]],plots[[2]],plots[[3]],plots[[4]],plots[[5]],
plots[[6]],plots[[7]],plots[[8]],plots[[9]],plots[[10]],
nrow=5)
bestfit = auto.arima(Reservoir, xreg=fourier(Reservoir, K=9), seasonal=FALSE)
summary(bestfit)
checkresiduals(bestfit)
plot(Reservoir,col="red")
lines(fitted(bestfit),col="blue")
The model fits well, except for the incorrect first prediction. I'm lost as to why only this value would be so far off. Or, is this an acceptable error?
The residuals are the one-step forecast errors using all previous observations. At time 1, the residual is the forecast error with no previous observations, so it is simply based on the fitted model. In fact, it is an artificially "good" forecast because the differencing means there is no way for the model to know the location of the data until there is an observation. But the way ARIMA models are implemented in R makes the first prediction use a little more information than it should.
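One way to inspect this is to put the first few observations next to their one-step fitted values. The sketch below uses base R's arima() on AirPassengers with a hand-picked differenced model; both are assumptions made to keep the example package-free, and they stand in for the auto.arima fit on the reservoir data.

```r
# Differenced model, so the level of the series is unknown before time 1.
y   <- as.numeric(AirPassengers)
fit <- arima(y, order = c(0, 1, 1))

res         <- residuals(fit)              # one-step forecast errors
fitted_vals <- y - res                     # one-step fitted values

# Compare the first rows, paying attention to time 1.
head(cbind(observed = y, fitted = fitted_vals, residual = res), 3)
```

Comparing the residual at time 1 with the later ones shows how the implementation treats the first observation.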

ARFIMA model and accuracy function

I am forecasting with data sets from the fpp2 and forecast packages. My intention is to automate forecasting for several time series, so I wrap the forecasting in a function. You can see the code below:
# CODE
library(fpp2)
library(dplyr)
library(forecast)
df<-qauselec
# Forecasting function
fct_fun <- function(Z, hrz = forecast_horizon) {
timeseries <- msts(Z, start = 1956, seasonal.periods = 4)
forecast <- arfima(timeseries)
}
acc_list <- lapply(X = df, fct_fun)
The next step is to check the accuracy of the model. For that I am trying the line of code below:
accurancy_arfima <- lapply(acc_list, accuracy)
Until now the accuracy function worked perfectly with other models like snaive, ets etc., but with arfima it does not work properly.
Can anybody help me resolve this problem with the accuracy function?
According to the R documentation, accuracy() returns a range of summary measures of the forecast accuracy. If x is provided, the function measures test set forecast accuracy based on x-f. If x is not provided, the function only produces training set accuracy measures of the forecasts based on f["x"]-fitted(f).
The usage is:
accuracy(f, x, test = NULL, d = NULL, D = NULL,
...)
So:
accuracy(acc_list[[1]]$fitted, df)
If you want to evaluate the accuracy separately, this will work:
a <- c()
for (i in 1:4) {
b <- accuracy(df[i], acc_list[[1]]$fitted[i])
a <- rbind(a,b)
}
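If the accuracy() method is the sticking point, the basic training-set measures can also be computed by hand from the observed and fitted values. The helper below is a hypothetical stand-in written for this illustration, not the forecast package's implementation, and it covers only ME, RMSE and MAE.

```r
# Hand-rolled training-set error measures (hypothetical helper).
train_accuracy <- function(actual, fitted) {
  err <- as.numeric(actual) - as.numeric(fitted)   # errors: actual minus fitted
  c(ME   = mean(err, na.rm = TRUE),
    RMSE = sqrt(mean(err^2, na.rm = TRUE)),
    MAE  = mean(abs(err), na.rm = TRUE))
}

# Toy usage with made-up numbers; with a fitted model this would be
# something like train_accuracy(series, fitted(model)).
acc_toy <- train_accuracy(c(1, 2, 3), c(1, 2, 5))
acc_toy
```

Because it only needs two numeric vectors, the same helper can be applied to each fitted model in a list with lapply().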
