I would love to be able to use the exogenous variables to help in the ARIMA forecast, but I run into issues every way I try to use the variables other than the one I am trying to forecast.
I would also like the resulting plot to look better than the default R output.
Error in auto.arima(datats1$Slots, seasonal = TRUE, xreg = datats1ts) :
xreg is rank deficient
None of the usual problems associated with a rank-deficient data frame or matrix seem to apply here: there are no linear combinations in the dataset.
#Load Data
datats1 <- read.csv("ProjectTS2.CSV") # Time Series I want to forecast
xreg <- read.csv("ProjectTS4.CSV") # Data I want to use as exogenous regressors
datats1$Slots <- ts(datats1$slots, start=2015, frequency=365)
dfTS <- as.matrix(ts(xreg))
new <- auto.arima(datats1$Slots, seasonal=TRUE, xreg=dfTS)
seas_fcast <- forecast(new, h=30)
ts.plot(seas_fcast, xlim=c(2018, 2018.2))
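One way to check whether the regressor matrix really is full rank before handing it to auto.arima (a diagnostic sketch, assuming every column of ProjectTS4.CSV is numeric; a full set of dummy variables that sums to a constant is collinear with a constant/drift term and is a classic cause of this error even when no column is an exact copy of another):
# compare the numerical rank of the regressor matrix with its column count;
# "xreg is rank deficient" means these two numbers differ
qr(dfTS)$rank
ncol(dfTS)
# constant (zero-variance) columns are a common culprit
which(apply(dfTS, 2, sd) == 0)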
I want to model and make predictions for hierarchical time series data in R, but I want to be able to see what the performance/plots look like for multiple types of ARIMA and exponential smoothing models, not just the 'best' one. For a functional example, below I create a hierarchical time series with one top node, 2 middle nodes, and 5 bottom nodes.
library(forecast)
library(hts)
#create the bottom level time series
bts <- ts(5 + matrix(sort(rnorm(500)), ncol=5, nrow=100))
#create the hierarchical time series
y <- hts(bts, nodes=list(2, c(3, 2)))
I know I can predict the hierarchical time series like so
#create forecast object
y_for <- forecast(y, h=10, fmethod='arima', method='mo', level=1)
#pull predictions from forecast object
y_pred <- aggts(y_for, forecasts=TRUE)
and this gives me predictions for each series using middle-out aggregation and the auto.arima() function. But I want to be able to specify which model makes the predictions before they are aggregated/disaggregated to the top and bottom levels. Is there an elegant solution for this?
Thanks in advance!
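As far as I know forecast.gts has no argument for this, but one workable pattern is to forecast every aggregated series yourself with whatever model you like and then reconcile the results. The sketch below swaps the middle-out step for optimal-combination reconciliation via combinef() from hts; the AR(1) specification is just a placeholder for the model you actually want.
library(forecast)
library(hts)
# forecast every series in the hierarchy with a hand-picked model,
# then reconcile so the forecasts add up across the hierarchy
ally <- aggts(y)                      # all levels: total, middle, bottom
h <- 10
allf <- matrix(NA, nrow = h, ncol = ncol(ally))
for (i in seq_len(ncol(ally))) {
  fit <- Arima(ally[, i], order = c(1, 0, 0))  # placeholder model spec
  allf[, i] <- forecast(fit, h = h)$mean
}
allf <- ts(allf, start = tsp(ally)[2] + 1/frequency(ally))
# optimal-combination reconciliation of the individual forecasts
y_for_custom <- combinef(allf, nodes = y$nodes, keep = "gts")
y_pred_custom <- aggts(y_for_custom)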
My goal: I want to understand a time series, a strongly autoregressive one (the ACF and PACF output told me that), and make a forecast.
So what I did was first transform my data into a ts object, then decompose the time series and check its stationarity (the series wasn't stationary). Then I applied a log transformation and looked for the ARIMA model that fits the data best; I checked the accuracy with accuracy(x) and selected the model whose accuracy measures were closest to 0.
Was this the correct procedure? I'm new to statistics and R and would appreciate some criticism if that wasn't correct.
When building the Arima model I used the following code:
ARIMA <- Arima(log(mydata2), order=c(2,1,2), seasonal=list(order=c(0,1,1), period=12))
The result I received was on the log scale, and the past data (the data I used to build the model) wasn't displayed in the diagram. So, to transform the forecast back to the original scale, I used the following code:
ARIMA_FORECAST <- forecast(ARIMA, h=24, lambda=0)
Is that correct? I found it somewhere on the web and don't really understand it.
Now my main question: how can I plot the original data and ARIMA_FORECAST in one diagram? I mean displaying it the way forecasts are displayed when no log transformation is used: the forecast should appear as an extension of the past data, with confidence intervals shown as well.
The simplest approach is to set the Box-Cox transformation parameter $\lambda=0$ within the modelling function, rather than taking explicit logarithms (see https://otexts.org/fpp2/transformations.html). The transformation is then automatically reversed when the forecasts are produced. This is simpler than the approach described by @markus. For example:
library(forecast)
# estimate an ARIMA model to log data
ARIMA <- auto.arima(AirPassengers, lambda=0)
# make a forecast
ARIMA_forecast <- forecast(ARIMA)
# Plot forecasts and data
plot(ARIMA_forecast)
Or if you prefer ggplot graphics:
library(ggplot2)
autoplot(ARIMA_forecast)
The forecast package provides the functions autolayer and geom_forecast, which might help you draw the desired plot. Here is an example using the AirPassengers data. I use the function auto.arima to estimate the model.
library(ggplot2)
library(forecast)
# log-transform data
dat <- log(AirPassengers)
# estimate an ARIMA model
ARIMA <- auto.arima(dat)
# make a forecast
ARIMA_forecast <- forecast(ARIMA, h = 24, lambda = 0)
Since your data is of class ts you can use the autoplot function from ggplot2 to plot your original data and add the forecast with the autolayer function from forecast.
autoplot(AirPassengers) + forecast::autolayer(ARIMA_forecast)
The resulting plot shows the original AirPassengers series with the back-transformed forecast and its prediction intervals appended.
I'm trying to fit a model to my data set using the auto.arima function, but I get an error message of "No suitable ARIMA model found", which I suspect can be attributed to what I'm passing for the xreg argument. My data set contains 1176 total observations, including the one variable I'm trying to forecast, with the rest being dummy variables (holidays, days of the week, etc.) that I'm trying to pass into auto.arima as regressors.
library(forecast)
data <- read.csv(...)
#extract variable to be forecast and extract regressors
forecast.var <- data[, 29]
regressors <- data[, 2:27]
#split forecast variable and regressors into train and test sets
train.r <- regressors[1:1000, ]
test.r <- regressors[1001:1176, ]
train.f <- forecast.var[1:1000]
test.f <- forecast.var[1001:1176]
#fit the data, pass 'train.r' into data.matrix and into 'xreg' since
#documentation for this function says it must be a vector or matrix
fit <- auto.arima(train.f, stepwise = FALSE, approximation = FALSE
, xreg = data.matrix(train.r))
If I attempt to run this, I get the aforementioned error message. I do get a fitted model if I don't pass anything for xreg, but the fitted values are nowhere near the actuals. I should mention that train.r already has column names. So what is it that I'm doing wrong? How do I successfully pass the regressors so that my model comes out more accurate?
I managed to fix this by excluding one of the dummy variables. That is, for days of the week the dummies package created 7 variables for me; however, if you have 7 categories, only 6 dummies are needed. I excluded one and then auto.arima worked fine.
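A minimal sketch of that fix, assuming train.r from the question holds a full set of seven day-of-week dummies and using "Sunday" as a hypothetical name for the reference column to drop:
library(forecast)
# all 7 day-of-week dummies sum to 1, which is collinear with a constant/drift
# term and can make every candidate model fail, so drop one reference level
train.r.reduced <- train.r[, setdiff(colnames(train.r), "Sunday")]
fit <- auto.arima(train.f, stepwise = FALSE, approximation = FALSE,
                  xreg = data.matrix(train.r.reduced))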
I have a time series of revenue and need to forecast revenue for 3 years.
My dependent variable is Revenue and my independent variables are GDP, Company Wealth, and the S&P 500 Index.
How should I go about it?
Can a simple linear regression model work?
Look into the "forecast" package in R. If your time series is x and your independent variables are in a matrix mat, you can use the auto.arima() function to automatically fit an ARIMA model with covariates (a regression with ARIMA errors).
library(forecast)
mod <- auto.arima(x, xreg = mat)
# Forecast the next 12 periods; future values of the covariates must be
# supplied via 'xreg' (here 'mat_future' is a hypothetical 12-row matrix
# of the regressors for the forecast horizon)
forecast(mod, xreg = mat_future)
Is there a way to create a holdout/backtest sample for the following ARIMA model with exogenous regressors? Let's say I want to estimate the model using the first 50 observations and then evaluate its performance on the remaining 20 observations, where the x-variables are pre-populated for all 70 observations. What I really want at the end is a graph that plots actual and fitted values in the development period and in the validation/holdout period (also known as back testing in time series).
library(TSA)
xreg <- cbind(GNP, Time_Scaled_CO) # two time series objects
fit_A <- arima(Charge_Off, order=c(1,1,0), xreg=xreg) # Charge_Off is another ts object
plot(Charge_Off,col="red")
lines(predict(fit_A, Data),col="green") #Data contains Charge_Off, GNP, Time_Scaled_CO
You don't seem to be using the TSA package at all, and you don't need to for this problem. Here is some code that should do what you want.
library(forecast)
xreg <- cbind(GNP, Time_Scaled_CO)
training <- window(Charge_Off, end=50)
test <- window(Charge_Off, start=51)
fit_A <- Arima(training,order=c(1,1,0),xreg=xreg[1:50,])
fc <- forecast(fit_A, h=20, xreg=xreg[51:70,])
plot(fc)
lines(test, col="red")
accuracy(fc, test)
See http://otexts.com/fpp/9/1 for an intro to using R with these models.
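To also see the in-sample fitted values over the development period (the actual-vs-fitted picture mentioned in the question), one small addition after plot(fc) could be:
# overlay the one-step-ahead fitted values on the training period
lines(fitted(fit_A), col = "green", lty = 2)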