I am using the autoplot function from the forecast package to show the out-of-sample one-step-ahead forecast
# fit ARIMA model
model1 <- auto.arima(y, seasonal=TRUE, stationary=TRUE)
where y is a ts object at a monthly frequency from 1960-01-01 to 2017-12-01.
Then I use the autoplot function to see the model's forecast, which in my case I set to be the next month (that is, 2018-01-01).
I use the following command:
autoplot(forecast(model1, h=1))
Which gives me the following plot:
Because of the large number of observations before the forecast period, my forecast is hard to see.
How should I adjust my autoplot call to make the forecast visible? I am thinking of showing only the last twelve months, but I don't know how to modify the autoplot call.
Can someone help me?
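You just need to add scale_x_continuous and restrict its limits to the range you want to show. Here is an example using the goog200 dataset (from the fpp2 package) and a naive forecast.
library(forecast); library(tidyverse); library(fpp2)  # goog200 comes from fpp2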
autoplot(naive(goog200, h = 20)) +
scale_x_continuous(limits = c(150,300))
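An alternative sketch, assuming a version of forecast whose autoplot method for forecast objects accepts the include argument (the number of most recent observations to draw before the forecast). AirPassengers is only a stand-in for the monthly series y from the question.

```r
library(forecast)
library(ggplot2)

# AirPassengers stands in for the asker's monthly series y (an assumption)
fit <- auto.arima(AirPassengers)
fc <- forecast(fit, h = 1)

# include = 12 keeps only the last twelve observations before the forecast
p <- autoplot(fc, include = 12)
print(p)
```

This avoids working out the x-axis limits by hand, which for a monthly ts object are expressed in fractional years rather than dates.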
I am using the tbats function from the forecast package in R. The tbats.components function lets us extract the different components from a model object. However, I could not find a way to extract the trend component after making the forecast. That is, I am looking for a way to get the forecast of the trend component.
Below is example code:
library(forecast)
fit <- tbats(USAccDeaths)
The following is used to extract the components:
comp = tbats.components(fit);
trend_comp = comp[, 'level']
Making a prediction gives the forecast of the target variable:
pred = forecast(fit, h = 10)
My question would be: is it possible to get trend component from pred? i.e. trend forecast.
We can use tbats.components
library(forecast)
out <- tbats.components(fit)
out[, "slope"]
It also depends on how the model was created.
data
fit <- tbats(USAccDeaths, use.trend = TRUE)
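A minimal sketch of why the model specification matters: the "slope" column is only present when the fitted model contains a trend, so it can help to inspect the column names before indexing. Note that these are the in-sample components of the fitted model, not forecasts of the components.

```r
library(forecast)

# use.trend = TRUE forces a trend term into the model
fit <- tbats(USAccDeaths, use.trend = TRUE)
out <- tbats.components(fit)

# Inspect which components this particular model contains
print(colnames(out))

# Guard the lookup: "slope" only exists when the model has a trend
if ("slope" %in% colnames(out)) {
  slope_comp <- out[, "slope"]
  print(head(slope_comp))
}
```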
I am trying to forecast next-day hourly electricity prices for 2016 using the exponential smoothing method. The data-set that I am using contains hourly price data for the period 2014-01-01 00:00 to 2016-12-31 23:00. My goal is to reproduce the results in Beigaitė & Krilavičius (2018)
As electricity price data exhibits multiple seasonalities (daily, weekly, and yearly), I have defined a msts object for the period 2014-01-01 to 2015-12-31
msts.elspot.prices.2014_2015 <- msts(df.elspot.prices.2014_2015$Price, seasonal.periods = c(24, 168, 8760), ts.frequency = 8760, start = 2014)
I want to use this msts object to forecast the next day (2016-01-01) hourly electricity prices using the hw() function from the forecast package and store the point forecasts in the data frame containing the actual hourly electricity prices for the year 2016.
df.elspot.prices.2016$pred.hw <- hw(msts.elspot.prices.2014_2015, h = 24)$mean
However, I am unable to use the hw() function as I get the following error message:
Error in ets(x, "AAA", alpha = alpha, beta = beta, gamma = gamma, phi = phi, :
Frequency too high
After looking online, it appears that the ets() function only accepts a frequency of at most 24. As I am working with hourly data with a yearly period of 8760, this is far below the frequency of my data.
Is there a way I can achieve my desired results using the hw() function? Are there any other packages/functions that could help me achieve my desired results?
I appreciate your help!
After looking a bit more, I came across this question, where a user wanted to use the hw method to forecast half-hourly electricity demand using the taylor dataset available in the forecast package.
As Professor Rob Hyndman suggests in his response to the linked question, the double-seasonal Holt-Winters method dshw from the forecast package can be used to deal with half-hourly data.
After removing the yearly seasonality (the 8760 entry in seasonal.periods) from the definition of my msts object, I ran the model and it gave a fairly accurate result.
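A minimal sketch of the dshw approach, using the taylor dataset from the forecast package because the electricity price data from the question is not available here. For hourly data the two periods would presumably be 24 and 168 rather than 48 and 336.

```r
library(forecast)

# taylor: half-hourly electricity demand shipped with forecast
# daily period = 48 half-hours, weekly period = 336 half-hours
fc <- dshw(taylor, period1 = 48, period2 = 336, h = 48)

# Point forecasts for the next day (48 half-hours)
print(head(fc$mean))
```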
My goal: I want to understand a time series, a strongly auto-regressive one (ACF and PACF output told me that) and make a forecast.
So what I did was I first transformed my data into a ts, then decomposed the time series, checked its stationarity (the series wasn't stationary). Then I conducted a log transformation and found an Arima model that fits the data best - I checked the accuracy with accuracy(x) - I selected the model with the accuracy output closest to 0.
Was this the correct procedure? I'm new to statistics and R and would appreciate some criticism if that wasn't correct.
When building the Arima model I used the following code:
ARIMA <- Arima(log(mydata2), order=c(2,1,2), list(order=c(0,1,1), period=12))
The result I received was a log function and the data from the past (the data I used to build the model) wasn't displayed in the diagram. So then to transform the log into the original scale I used the following code:
ARIMA_FORECAST <- forecast(ARIMA, h=24, lambda=0)
Is that correct? I found it somewhere on the web and don't really understand it.
Now my main question: How can I plot the original data and the ARIMA_FORECAST in one diagram? I mean displaying it the way the forecasts are displayed if no log transformation is undertaken - the forecast should be displayed as the extension of the data from the past, confidence intervals should be there too.
The simplest approach is to set the Box-Cox transformation parameter $\lambda=0$ within the modelling function, rather than take explicit logarithms (see https://otexts.org/fpp2/transformations.html). Then the transformation will be automatically reversed when the forecasts are produced. This is simpler than the approach described by @markus. For example:
library(forecast)
# estimate an ARIMA model to log data
ARIMA <- auto.arima(AirPassengers, lambda=0)
# make a forecast
ARIMA_forecast <- forecast(ARIMA)
# Plot forecasts and data
plot(ARIMA_forecast)
Or if you prefer ggplot graphics:
library(ggplot2)
autoplot(ARIMA_forecast)
The package forecast provides the functions autolayer and geom_forecast that might help you to draw the desired plot. Here is an example using the AirPassengers data. I use the function auto.arima to estimate the model.
library(ggplot2)
library(forecast)
# log-transform data
dat <- log(AirPassengers)
# estimate an ARIMA model
ARIMA <- auto.arima(dat)
# make a forecast
ARIMA_forecast <- forecast(ARIMA, h = 24, lambda = 0)
Since your data is of class ts you can use the autoplot function from ggplot2 to plot your original data and add the forecast with the autolayer function from forecast.
autoplot(AirPassengers) + forecast::autolayer(ARIMA_forecast)
The result is shown below.
I'm working with a data set from 2017-01-01 to 2017-10-27. However, auto.arima says it can only handle univariate time series, even though the data contain only one daily value.
What am I missing?
Reproducible example:
set.seed(25)
datelist<-seq(as.Date("2016-01-01"),as.Date("2017-10-27"),by="day")
salesvals<-round(abs(rnorm(length(datelist)))*1000,digits=2)
salestbl<-data.frame(datelist,salesvals)
salesTS<-ts(salestbl,
start=c(2016,as.numeric(format(salestbl$datelist, "%j"))),
frequency=7)
fit <- auto.arima(salesTS)
Error:
Error in auto.arima(salesTS) :
auto.arima can only handle univariate time series
Overall, I know there's a weekly seasonality, hence the seven days. I know there's also a quarterly seasonality, but I can tackle that another time.
Overall I'm trying to get a forecast for 2017-12-31, using an arima forecast.
The problem is that you are passing the whole data.frame to ts(), so the date column becomes a second series and the result is multivariate. You don't need ts() here at all; just pass the sales column to auto.arima, like this:
set.seed(25)
datelist<-seq(as.Date("2016-01-01"),as.Date("2017-10-27"),by="day")
salesvals<-round(abs(rnorm(length(datelist)))*1000,digits=2)
salestbl<-data.frame(datelist,salesvals)
fit <- auto.arima(salestbl[,2])
Run head(salesTS) and you will see why you get the error: the object has two columns, datelist and salesvals.
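If the weekly seasonality mentioned in the question should be kept, one possible sketch is to build the ts object from the sales column alone with frequency = 7. The horizon h below is an assumption derived from the target date 2017-12-31 in the question.

```r
library(forecast)

set.seed(25)
datelist <- seq(as.Date("2016-01-01"), as.Date("2017-10-27"), by = "day")
salesvals <- round(abs(rnorm(length(datelist))) * 1000, digits = 2)
salestbl <- data.frame(datelist, salesvals)

# Pass only the numeric sales column, so the series is univariate
salesTS <- ts(salestbl$salesvals, frequency = 7)

fit <- auto.arima(salesTS)

# Number of days from the last observation to 2017-12-31
h <- as.integer(as.Date("2017-12-31") - as.Date("2017-10-27"))
fc <- forecast(fit, h = h)
```

The last element of fc$mean is then the point forecast for 2017-12-31.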
I'm trying to use an ARIMA model in R to forecast data. A slice of my time series looks like this:
This is just a slice of time to give you a sense of it. I have daily data from 2010 to 2015.
I want to forecast this into the future. I'm using the forecast library, and my code looks like this:
dt = msts(data$val, seasonal.periods=c(7, 30))
fit = auto.arima(dt)
plot(forecast(fit, 300))
This results in:
This model isn't good or interesting. My seasonal.periods were defined by me because I expect to see weekly and monthly seasonality, but the result looks the same with no seasonal periods defined.
Am I missing something? Very quickly the forecast predictions change very, very little from point to point.
Edit:
To further show what I'm talking about, here's a concrete example. Let's say I have the following fake dataset:
x = 1:500
y = 0.5*c(NA, head(x, -1)) - 0.4*c(NA, NA, head(x, -2)) + rnorm(500, 0, 5)
This is meant to be an AR(2) model with coefficients 0.5 and -0.4. Plotting this time series yields:
So I create an ARIMA model of this and plot the forecast results:
plot(forecast(auto.arima(y), 300))
And the results are:
Why can't the ARIMA function learn this obvious model? I don't get any better results if I use the arima function and force it to try an AR(2) model.
auto.arima does not handle multiple seasonal periods. Use tbats for that.
dt = msts(data$val, seasonal.periods=c(7, 30))
fit = tbats(dt)
plot(forecast(fit, 300))
auto.arima will just use the largest seasonal period and try to do the best it can with that.
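Regarding the fake dataset in the question's edit, here is a sketch that assumes the intent was to simulate an AR(2) process. The series in the question is built from lagged values of the index x rather than of y, so it behaves like a linear trend plus noise, not an autoregression. The code below uses arima.sim from stats to generate an actual AR(2) series and fits it with Arima. The seed and noise level are arbitrary choices.

```r
library(forecast)

set.seed(1)
# Simulate a stationary AR(2): y_t = 0.5 * y_{t-1} - 0.4 * y_{t-2} + e_t
y <- arima.sim(model = list(ar = c(0.5, -0.4)), n = 500, sd = 5)

# Fit an AR(2) and inspect the estimated coefficients
fit <- Arima(y, order = c(2, 0, 0))
print(coef(fit))

# Long-horizon forecasts of a stationary AR model settle at the series mean
fc <- forecast(fit, h = 300)
plot(fc)
```

The forecasts of a stationary AR model decay toward the series mean as the horizon grows, so point forecasts that barely change after the first few steps are expected and do not mean the fit failed.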