Frequency in xts vs ts for auto.arima - r

Q: What is the right way to set the frequency in an xts object given a set of dates? Ideally, auto.arima() called on this xts object would yield the same results as when called on an analogous ts object.
Detail: I was surprised to get different results from an auto.arima() fit depending on whether I passed a ts or an xts object. The difference comes down to the frequency, which for xts was being reset to 1 even though I set it to 12 in the constructor. Below, setting up sim_ts_12 and estimating the intended model was relatively straightforward. But in my initial attempts at working with xts (sim_xts and sim_xts_12_not) I estimated the wrong model. I finally estimated the right model using xts (sim_xts_12, sim_ts2xts_12), but both of those approaches seem wrong in some way. I'd expect working with xts to be simpler than ts, but that doesn't seem to be the case here. Am I missing something?
library(xts)       # xts(), as.xts()
library(forecast)  # auto.arima()
sim <- scan(file="./sim.dat")
sim_ts_12 <- ts(sim, start=c(2016,1), frequency=12)
sim_ts2xts_12 <- as.xts(sim_ts_12)
sim_xts <- xts(x=sim, order.by=seq.Date(from=as.Date("2016-01-01"), by="month", length.out = length(sim)))
sim_xts_12_not <- xts(x=sim, order.by=seq.Date(from=as.Date("2016-01-01"), by="month", length.out = length(sim)), frequency=12)
sim_xts_12 <- sim_xts
attr(sim_xts_12, 'frequency') <- 12
auto.arima(sim_ts_12) # ARIMA(0,1,1)(0,1,0)[12]
auto.arima(sim_ts2xts_12) # ARIMA(0,1,1)(0,1,0)[12]
auto.arima(sim_xts) # ARIMA(0,1,1) with drift
auto.arima(sim_xts_12_not) # ARIMA(0,1,1) with drift
auto.arima(sim_xts_12) # ARIMA(0,1,1)(0,1,0)[12]
txt <- "0.04767597 0.07217235 0.03954613 0.03698637 0.04283896
0.03534811 0.04198519 0.04129214 0.04576022 0.03966146
0.03656881 0.04396736 0.04459328 0.07062732 0.03477407
0.0340033 0.039136 0.0347761 0.03819997 0.03634627
0.03966617 0.03455635 0.03009606 0.03927688 0.03959629
0.06554147 0.02908742 0.02619443 0.03179742 0.02468108
0.02612955 0.02300656 0.02988827 0.01878513 0.01399028
0.02601922 0.0250159 0.05610426 0.01537538 0.01231939
0.01330564 0.008744173 0.01296571 0.005741129 0.01674992
0.003210812 -0.007936987 0.01018758"
sim.dat <- scan(text=txt, what=numeric() )
UPDATE, NOT A DUPLICATE: The possible duplicate question/answer does not address the best practice method for handling frequency in an xts. The question does not ask for it, nor does the answer address it. The answer handles ts.
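One hedged workaround, not a confirmed best practice for xts: keep the dates wherever convenient, but build a ts with an explicit frequency just before modelling. The sketch below uses only base R. `monthly_ts_from_dates` is a hypothetical helper name, and the values are placeholders rather than the sim series.

```r
# Hypothetical helper: build a monthly ts from values plus a Date index,
# so the frequency (12) is stated explicitly instead of being inferred.
monthly_ts_from_dates <- function(values, dates) {
  stopifnot(length(values) == length(dates))
  first <- min(dates)
  ts(values,
     start = c(as.integer(format(first, "%Y")),
               as.integer(format(first, "%m"))),
     frequency = 12)
}

dates  <- seq.Date(from = as.Date("2016-01-01"), by = "month", length.out = 48)
values <- seq_along(dates) / 100   # placeholder data, not the sim series
y <- monthly_ts_from_dates(values, dates)

frequency(y)  # 12
start(y)      # 2016 1
```

For a single-column xts object with a Date index, the same helper could presumably be called as monthly_ts_from_dates(as.numeric(sim_xts), index(sim_xts)).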

Related

Error while refitting ARIMA model

I am getting the following error when I try to refit the ARIMA model.
new_model <- Arima(data,model=old_model)
Error in Ops.Date(driftmod$coeff[2], time(x)) :
* not defined for "Date" objects
Note: The class of data is zoo. I also tried using xts, but I got the same error.
Edit: As suggested by Joshua, here is the reproducible example.
library('zoo')
library('forecast')
#Creating sample data
sample_range <- seq(from=1, to=10, by=1)
x<-sample(sample_range, size=61, replace=TRUE)
ts<-seq.Date(as.Date('2017-03-01'),as.Date('2017-04-30'), by='day')
dt<-data.frame(ts=ts,data=x)
#Split the data to training set and testing set
noOfRows<-NROW(dt)
trainDataLength=floor(noOfRows*0.70)
trainData<-dt[1:trainDataLength,]
testData<-dt[(trainDataLength+1):noOfRows,]
# Use zoo, so that we get dates as index of dataframe
trainData.zoo<-zoo(trainData[,2:ncol(trainData)], order.by=as.Date((trainData$ts), format='%Y-%m-%d'))
testData.zoo<-zoo(testData[,2:ncol(testData)], order.by=as.Date((testData$ts), format='%Y-%m-%d'))
#Create Arima Model Using Forecast package
old_model<-Arima(trainData.zoo,order=c(2,1,2),include.drift=TRUE)
# Refit the old model with testData
new_model<-Arima(testData.zoo,model=old_model)
The ?Arima page says that y (the first argument) should be a ts object. My guess is that the first call to Arima coerces your zoo object to ts, but the second call does not.
An easy way to work around this is to explicitly coerce to ts:
# Refit the old model with testData
new_model <- Arima(as.ts(testData.zoo), model = old_model)

ets: Error in ets(timeseries, model = "MAM") : Nonseasonal data

I'm trying to create a forecast using an exponential smoothing method, but get the error "nonseasonal data". This is clearly not true - see code below.
Why am I getting this error? Should I use a different function (it should be able to perform simple, double, damped trend, seasonal, Winters method)?
library(forecast)
timelen<-48 # use 48 months
dates<-seq(from=as.Date("2008/1/1"), by="month", length.out=timelen)
# create seasonal data
time<-seq(1,timelen)
season<-sin(2*pi*time/12)
constant<-40
noise<-rnorm(timelen,mean=0,sd=0.1)
trend<-time*0.01
values<-constant+season+trend+noise
# create time series object
timeseries<-as.ts(x=values,start=min(dates),end=max(dates),frequency=1)
plot(timeseries)
# forecast MAM
ets<-ets(timeseries,model="MAM") # ANN works, why MAM not?
ets.forecast<-forecast(ets,h=24,level=0.9)
plot(ets.forecast)
Thanks & kind regards
You should use ts(), not as.ts(), to create a time series from a numeric vector. See ?ts for more details.
Your start and end values aren't correctly specified: ts() expects a single number or a c(year, period) pair, not a Date.
And setting the frequency to 1 does not define a seasonality; it is the same as no seasonality at all.
Try:
timeseries <- ts(data=values, frequency=12)
ets <- ets(timeseries, model="MAM")
print(ets)
#### ETS(M,A,M)
#### Call:
#### ets(y = timeseries, model = "MAM")
#### ...
As for the question in your comments: ANN works because the third N means no seasonality, so the model can be computed even for a non-seasonal time series.
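To see why the original call produced non-seasonal data, here is a small base-R comparison. It is a sketch using simulated values similar to the question's; the names `via_as_ts` and `via_ts` are illustrative only.

```r
set.seed(1)
timelen  <- 48
time_idx <- seq_len(timelen)
values <- 40 + sin(2 * pi * time_idx / 12) + 0.01 * time_idx +
  rnorm(timelen, mean = 0, sd = 0.1)

# as.ts() on a plain vector does not use the start/frequency arguments
via_as_ts <- as.ts(values, start = c(2008, 1), frequency = 12)

# ts() does use them
via_ts <- ts(values, start = c(2008, 1), frequency = 12)

frequency(via_as_ts)  # 1
frequency(via_ts)     # 12
end(via_ts)           # 2011 12
```

Checking frequency() before calling ets() with a seasonal model is a cheap way to catch this kind of mistake.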

How to Create a R TimeSeries for Hourly data

I have hourly snapshot of an event starting from 2012-05-15-0700 to 2013-05-17-1800. How can I create a Timeseries on this data and perform HoltWinters to it?
I tried the following
EventData<-ts(Eventmatrix$X20030,start=c(2012,5,15),frequency=8000)
HoltWinters(EventData)
But I got Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) : time series has no or less than 2 periods
What value should I use for frequency?
I think you should consider using ets from the forecast package to perform exponential smoothing. Read this post for a comparison between HoltWinters and ets.
require(xts)
require(forecast)
time_index <- seq(from = as.POSIXct("2012-05-15 07:00"),
to = as.POSIXct("2012-05-17 18:00"), by = "hour")
set.seed(1)
value <- rnorm(n = length(time_index))
eventdata <- xts(value, order.by = time_index)
ets(eventdata)
Now if you want to know more about the syntax of ets check the help of this function and the online book of Rob Hyndman (Chap 7 section 6)
Please take a look at the following post which might answer the question:
Decompose xts hourly time series
It explains how you can create an xts object using POSIXct objects. This xts object can have its frequency attribute set manually, and you will probably then be able to use HoltWinters.
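As a rough base-R illustration of the frequency question: if the hourly data is assumed to have a daily cycle, the frequency is 24 (observations per period), and HoltWinters needs at least two full periods. The data below is simulated, not the original event data.

```r
set.seed(1)
n_hours     <- 24 * 14                        # two weeks of hourly values
hour_of_day <- (seq_len(n_hours) - 1) %% 24
values <- 10 + 3 * sin(2 * pi * hour_of_day / 24) + rnorm(n_hours, sd = 0.5)

# frequency = 24: one seasonal period is one day of hourly observations
event_ts <- ts(values, frequency = 24)

fit <- HoltWinters(event_ts)       # additive seasonal model by default
fc  <- predict(fit, n.ahead = 24)  # forecast the next day, hour by hour
```

With frequency=8000 and roughly one year of hourly data (about 8800 points), there are fewer than two periods, which matches the error reported in the question.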

Dynamic time-series prediction and rollapply

I am trying to get a rolling prediction of a dynamic timeseries in R (and then work out squared errors of the forecast). I based a lot of this code on this StackOverflow question, but I am very new to R so I am struggling quite a bit. Any help would be much appreciated.
require(zoo)
require(dynlm)
set.seed(12345)
#create variables
x<-rnorm(mean=3,sd=2,100)
y<-rep(NA,100)
y[1]<-x[1]
for(i in 2:100) y[i]=1+x[i-1]+0.5*y[i-1]+rnorm(1,0,0.5)
int<-1:100
dummydata<-data.frame(int=int,x=x,y=y)
zoodata<-as.zoo(dummydata)
prediction<-function(series)
{
mod<-dynlm(formula = y ~ L(y) + L(x), data = series) #get model
nextOb<-nrow(series)+1
#make forecast
predicted<-coef(mod)[1]+coef(mod)[2]*zoodata$y[nextOb-1]+coef(mod)[3]*zoodata$x[nextOb-1]
#strip timeseries information
attributes(predicted)<-NULL
return(predicted)
}
rolling<-rollapply(zoodata,width=40,FUN=prediction,by.column=FALSE)
This returns:
20 21 ..... 80
10.18676 10.18676 10.18676
Which has two problems I was not expecting:
Runs from 20->80, not 40->100 as I would expect (as the width is 40)
The forecasts it gives out are constant: 10.18676
What am I doing wrong? And is there an easier way to do the prediction than to write it all out? Thanks!
The main problem with your function is the data argument to dynlm. If you look in ?dynlm you will see that the data argument must be a data.frame or a zoo object. Unfortunately, as I just learned, rollapply passes each window to your function as a plain array rather than a zoo object. This means that dynlm, finding that the data argument was not of the right form, looked up x and y in your global environment, where they were defined at the top of your code, so every window fit the same model. The solution is to convert series into a zoo object. There were a couple of other issues with your code, so I post a corrected version here:
prediction<-function(series) {
mod <- dynlm(formula = y ~ L(y) + L(x), data = as.zoo(series)) # get model
# nextOb <- nrow(series)+1 # This will always be 21. I think you mean:
nextOb <- max(series[,'int'])+1 # To get the first row that follows the window
if (nextOb<=nrow(zoodata)) { # You won't predict the last one
# make forecast
# predicted<-coef(mod)[1]+coef(mod)[2]*zoodata$y[nextOb-1]+coef(mod)[3]*zoodata$x[nextOb-1]
# That would work, but there is a very nice function called predict
predicted=predict(mod,newdata=data.frame(x=zoodata[nextOb,'x'],y=zoodata[nextOb,'y']))
# I'm not sure why you used nextOb-1
attributes(predicted)<-NULL
# I added the square error as well as the prediction.
c(predicted=predicted,square.res=(predicted-zoodata[nextOb,'y'])^2)
}
}
rollapply(zoodata,width=20,FUN=prediction,by.column=FALSE,align='right')
Your second question, about the numbering of your results, can be controlled by the align argument of rollapply. With your width of 40, left would give you 1..61, center (the default) would give you 20..80 and right gets you 40..100.
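The index ranges for a width of 40 over 100 observations can be checked with plain arithmetic. This is a base-R sketch that does not call rollapply itself; it assumes the center label for an even width is the lower of the two middle positions, which is consistent with the 20..80 output shown in the question.

```r
n     <- 100
width <- 40
n_windows <- n - width + 1            # 61 complete windows

first_obs    <- seq_len(n_windows)    # first observation in each window
label_left   <- first_obs
label_center <- first_obs + floor((width - 1) / 2)
label_right  <- first_obs + width - 1

range(label_left)    # 1 61
range(label_center)  # 20 80
range(label_right)   # 40 100
```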

Creating ts objects in R

I have only started playing around with time series in R so I have fallen at the first hurdle! I have a vector of daily temperature readings (with no date stamp) and I am having problems creating such an object.
data<-rnorm(3650, m=10, sd=2)
data_ts<-as.ts(data, frequency=365, start=c(1919, 1))
attributes(data_ts)
dcomp<-decompose(data_ts, type=c("additive"))
I think this code should be instructing R to make a ts object with daily measurements (frequency=365) starting at 1-1-1919. I don't understand the error message in the decompose command. I have a feeling I have not created the ts object correctly, because data_ts$tsp does not look correct!
data <- rnorm(3650, m=10, sd=2)
# change is below, use ts() to create time series
data_ts <- ts(data, frequency=365, start=c(1919, 1))
attributes(data_ts)
dcomp<-decompose(data_ts, type=c("additive"))
plot(dcomp)
This produces a four-panel plot of the decomposition (observed, trend, seasonal and random).
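To check that the object was built as intended, note that tsp is an attribute, so it is read with tsp() (or attr()) rather than with $. A short sketch with simulated data:

```r
set.seed(42)
data    <- rnorm(3650, mean = 10, sd = 2)
data_ts <- ts(data, frequency = 365, start = c(1919, 1))

tsp(data_ts)   # start time, end time, frequency

dcomp <- decompose(data_ts, type = "additive")
names(dcomp)           # components of the decomposition
length(dcomp$figure)   # 365 seasonal effects, one per position in the period
```

If the third element of tsp() is 1 instead of 365, the series was created without a frequency and decompose() will refuse it.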
