Creating ts objects in R - r

I have only just started playing around with time series in R, and I have fallen at the first hurdle! I have a vector of daily temperature readings (with no date stamp) and I am having problems creating a ts object from it.
data<-rnorm(3650, m=10, sd=2)
data_ts<-as.ts(data, frequency=365, start=c(1919, 1))
attributes(data_ts)
dcomp<-decompose(data_ts, type=c("additive"))
I think this code should instruct R to make a ts object with daily measurements (frequency=365) starting at 1-1-1919. I don't understand the error message from the decompose command. I have a feeling I have not created the ts object correctly, because the tsp attribute of data_ts does not look correct!

data <- rnorm(3650, m=10, sd=2)
# the change is below: use ts() rather than as.ts() to create the time series
data_ts <- ts(data, frequency=365, start=c(1919, 1))
attributes(data_ts)
dcomp<-decompose(data_ts, type=c("additive"))
plot(dcomp)
This produces a plot of the observed series together with its trend, seasonal, and random components.
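To see why the original call failed, it can help to compare the tsp attribute (start, end, frequency) of the two objects. The sketch below assumes only base R; the variable names are illustrative. For a plain numeric vector, as.ts() accepts the extra arguments without complaint but does not apply them, so the series keeps a frequency of 1 and decompose() has no seasonal period to work with.

```r
# Illustrative data, same shape as in the question
set.seed(1)
x <- rnorm(3650, mean = 10, sd = 2)

# as.ts() on a plain vector does not apply frequency/start
x_as <- as.ts(x, frequency = 365, start = c(1919, 1))
print(tsp(x_as))   # start, end, frequency

# ts() applies them
x_ts <- ts(x, frequency = 365, start = c(1919, 1))
print(tsp(x_ts))
```

Checking tsp() or frequency() right after building the object is a cheap way to catch this before calling decompose().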

Related

How to complete the forecast line using plot?

I have been working on a script in R that will predict a number.
# Load Forecast library
library(forecast)
# Load dataset
bwi <- read.csv(file="C:/Users/nsoria/Downloads/AMS Globales/TEC_BWI.csv", header=TRUE, sep=';', dec=",")
# Create time series starting in January 2015
ts_bwi <- ts(bwi$BWI, frequency = 12, start = c(2015,1))
# Pull out the seasonal, trend, and irregular components from the time series
model <- stl(ts_bwi, s.window = "periodic")
# Predict the next 5 months of SLA
pred <- forecast(model, h = 5)
# Plot the results
plot(pred)
In the resulting plot, the forecast line is not connected to the actual values.
Question: How can I make the line linked from the last known value to the first forecasted value?
Edit 01/01: Here is the link where the CSV is located to reproduce the case.
You need to append the predicted values to your real time series, as in the code below:
pred_mod<-pred
ts_real<-pred$x
pred_mod$x<-ts(c(ts_real,pred$mean),frequency=12,start=c(2015,1))
plot(pred_mod)

Frequency in xts vs ts for auto.arima

Q: What is the right way to set the frequency in an xts object given a set of dates? Ideally, auto.arima() called on this xts object would yield the same results as when called on an analogous ts object.
Detail: I was surprised to find different results from an auto.arima() fit depending on whether I passed a ts or an xts object. I found the difference had to do with the frequency (which, in the case of xts, was being reset to 1 despite my setting it to 12 in the constructor). Below, setting up sim_ts_12 and estimating the intended model was relatively straightforward. But in my initial attempts at working with xts (sim_xts and sim_xts_12_not) I estimated the wrong model. I finally estimated the right model using xts (sim_xts_12, sim_ts2xts_12), but both of those approaches seem wrong in some way. I'd expect working with xts to be simpler than ts, but that doesn't seem to be the case here. Am I missing something?
sim <- scan(file="./sim.dat")
sim_ts_12 <- ts(sim, start=c(2016,1), frequency=12)
sim_ts2xts_12 <- as.xts(sim_ts_12)
sim_xts <- xts(x=sim, order.by=seq.Date(from=as.Date("2016-01-01"), by="month", length.out = length(sim)))
sim_xts_12_not <- xts(x=sim, order.by=seq.Date(from=as.Date("2016-01-01"), by="month", length.out = length(sim)), frequency=12)
sim_xts_12 <- sim_xts
attr(sim_xts_12, 'frequency') <- 12
auto.arima(sim_ts_12) # ARIMA(0,1,1)(0,1,0)[12]
auto.arima(sim_ts2xts_12) # ARIMA(0,1,1)(0,1,0)[12]
auto.arima(sim_xts) # ARIMA(0,1,1) with drift
auto.arima(sim_xts_12_not) # ARIMA(0,1,1) with drift
auto.arima(sim_xts_12) # ARIMA(0,1,1)(0,1,0)[12]
txt <- "0.04767597 0.07217235 0.03954613 0.03698637 0.04283896
0.03534811 0.04198519 0.04129214 0.04576022 0.03966146
0.03656881 0.04396736 0.04459328 0.07062732 0.03477407
0.0340033 0.039136 0.0347761 0.03819997 0.03634627
0.03966617 0.03455635 0.03009606 0.03927688 0.03959629
0.06554147 0.02908742 0.02619443 0.03179742 0.02468108
0.02612955 0.02300656 0.02988827 0.01878513 0.01399028
0.02601922 0.0250159 0.05610426 0.01537538 0.01231939
0.01330564 0.008744173 0.01296571 0.005741129 0.01674992
0.003210812 -0.007936987 0.01018758"
sim.dat <- scan(text=txt, what=numeric() )
UPDATE, NOT A DUPLICATE: The possible duplicate question/answer does not address the best practice method for handling frequency in an xts. The question does not ask for it, nor does the answer address it. The answer handles ts.

Create Forecast and Check Accuracy

I have data of the form SaleDateTime = '2015-01-02 23:00:00.000' SaleCount=4.
I'm trying to create an hourly forecast for the next 12 hours, using the code below.
I'm new to forecasting and could definitely appreciate some advice.
I'm trying to partition the data, train a model, plot the forecast with x axis of the form '2015-01-02 23:00:00.000', and test the accuracy of the model on a test time series.
I'm getting the error message below, when I try to run the accuracy as shown. Does anyone know why I'm getting the error message below?
When I run the plot as shown below it has an x axis from 0 to 400, does anyone know how to show that as something like '2015-01-02 23:00:00.000'? I would also like to narrow the plot to the last say 3 months.
My understanding is that if you don't specify a model for forecast, then it tries to fit the best model it can to the data for the forecast. Is that correct?
How do I filter for the same timeseries range with the forecast as the ts1Test that I'm trying to run accuracy on, is it something like ts(fcast2, start=2001, end = 8567) ?
Since I'm using the zoo package is the as.POSIXct step unnecessary, could I just do eventdata <- zoo(Value, order.by = SaleDateTime) instead?
library("forecast")
library("zoo")
SampleData<-SampleData
Value<-SampleData[,c("SaleDateTime","SaleCount")]
rDateTime<-as.POSIXct(SampleData$SaleDateTime, format="%Y-%m-%d %H:%M:%S")
eventdata <- zoo(Value, order.by = rDateTime)
##Partitioning data Training/Testing
ts1SampleTrain<-eventdata[1:2000,]
ts1Train<-ts(ts1SampleTrain$SaleCount, frequency=24)
ts1SampleTest<-eventdata[2001:28567,]
ts1Test<-ts(ts1SampleTest$SaleCount, frequency=24)
#Training Model
fcast2<-forecast(ts1Train,h=8567)
plot(fcast2)
accuracy(fcast2,ts1Test)
New Error:
Error in -.default(xx, ff[1:n]) : non-numeric argument to binary operator
To make your accuracy test run, you should ensure that your test data ts1Test has the same length as your forecasting horizon, h in fcast2<-forecast(ts1Train,h=8567). Right now you have 26567 data points vs a horizon of 8567.
Following your approach, this toy example works:
library(forecast)
library(zoo)
Value <- rnorm(1100)
rDateTime <- seq(as.POSIXct('2012-01-01 00:00:00'), along.with=Value, by='hour')
eventDate <- ts(zoo(Value, order.by=rDateTime), frequency = 24)
tsTrain <-eventDate[1:1000]
tsTest <- eventDate[1001:1100]
fcast<-forecast(tsTrain,h=100)
accuracy(fcast, tsTest)
              ME            RMSE          MAE           MPE           MAPE          MASE          ACF1
Training set  -2.821378e-04 9.932745e-01  7.990188e-01  1.003861e+02  1.007542e+02  7.230356e-01  4.638487e-02
Test set      0.02515008    1.02271839    0.86072703    99.79208174   100.14023919  NA            NA
Concerning your other two questions:
Use of POSIX timestamps and the zoo package: you don't need them to use forecast. ts(Value, frequency) would suffice.
Plotting a time series object with datetimes as your labels: the following code snippet should get you started in this direction. Look at the axis function, which provides the desired behavior:
par(mar=c(6,2,1,1)) # bottom, left, top, right margins
plot(tsTrain, type="l", xlab="", xaxt="n")
axis(side=1, at=seq(1,1000,100), labels=format(rDateTime[seq(1,1000,100)], "%Y-%m-%d"), las=2)
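The length-matching advice above can be sketched in base R alone. This is an illustration under stated assumptions, not the forecast package's method: a hand-rolled seasonal-naive forecast (repeat the last observed day) stands in for forecast(), and the data, split point, and variable names are made up. The point is only that the horizon h is derived from the test set, so the two always have the same length.

```r
# Illustrative hourly data: 1100 observations, 24 per seasonal period
set.seed(42)
value <- rnorm(1100)
n_train <- 1000

train <- ts(value[1:n_train], frequency = 24)
# Start the test series one step after the training series ends
test <- ts(value[(n_train + 1):length(value)],
           start = tsp(train)[2] + 1 / 24, frequency = 24)

# Derive the horizon from the test set instead of hard-coding it
h <- length(test)

# Seasonal-naive stand-in for a real model: repeat the last observed cycle
last_cycle <- tail(as.numeric(train), frequency(train))
fc <- rep(last_cycle, length.out = h)

# Out-of-sample error on equally long vectors
rmse <- sqrt(mean((as.numeric(test) - fc)^2))
print(rmse)
```

With the forecast package, the same idea would be to pass h = length(test) to forecast() before calling accuracy().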

How to Create a R TimeSeries for Hourly data

I have hourly snapshot of an event starting from 2012-05-15-0700 to 2013-05-17-1800. How can I create a Timeseries on this data and perform HoltWinters to it?
I tried the following
EventData<-ts(Eventmatrix$X20030,start=c(2012,5,15),frequency=8000)
HoltWinters(EventData)
But I got this error: Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) : time series has no or less than 2 periods
What value should I use for frequency?
I think you should consider using ets from the forecast package to perform exponential smoothing. Read this post for a comparison between HoltWinters and ets.
require(xts)
require(forecast)
time_index <- seq(from = as.POSIXct("2012-05-15 07:00"),
                  to = as.POSIXct("2012-05-17 18:00"), by = "hour")
set.seed(1)
value <- rnorm(n = length(time_index))
eventdata <- xts(value, order.by = time_index)
ets(eventdata)
Now, if you want to know more about the syntax of ets, check the help for this function and Rob Hyndman's online book (chapter 7, section 6).
Please take a look at the following post which might answer the question:
Decompose xts hourly time series
It explains how you can create an xts object using POSIXct objects. This xts object can have its frequency attribute set manually, and you will then probably be able to use HoltWinters.
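For comparison, here is a minimal base R sketch of the ts route, assuming the seasonal pattern of interest is daily, so that frequency = 24 for hourly data. The data are synthetic, and the smoothing parameters are fixed by hand purely for illustration (an assumption, not tuned values). The error in the question arises because a frequency of 8000 leaves fewer than two full periods in roughly a year of hourly observations.

```r
# Synthetic hourly data: a daily cycle plus noise (illustrative values only)
set.seed(1)
n_hours <- 24 * 14                           # two weeks of hourly observations
hour_of_day <- (seq_len(n_hours) - 1) %% 24
value <- 10 + 3 * sin(2 * pi * hour_of_day / 24) + rnorm(n_hours, sd = 0.5)

# frequency = 24: one seasonal period is one day of hourly readings
event_ts <- ts(value, frequency = 24)

# Smoothing parameters are fixed here (assumed values) so no optimisation runs;
# beta = FALSE drops the trend term and keeps the seasonal component
fit <- HoltWinters(event_ts, alpha = 0.3, beta = FALSE, gamma = 0.3)

# Forecast the next 24 hours
next_day <- predict(fit, n.ahead = 24)
print(next_day)
```

Leaving alpha and gamma unset lets HoltWinters() estimate them, at the cost of a numerical optimisation that may not always converge.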

In R, what is the difference between class ts and class timeSeries?

In R, what is the difference between class ts and class timeSeries? I think I am getting a problem in HoltWinters because of that. I'm getting:
data(LakeHuron)
x <- LakeHuron
before <- window(x, end=1935)
after <- window(x, start=1935)
a <- .2
b <- 0
g <- 0
model <- HoltWinters(before, alpha=a, beta=b, gamma=g)
"Error in decompose(ts(x[1L:wind], start = start(x), frequency = f), seasonal) :
time series has no or less than 2 periods"
even though gamma=0.
Running R 2.11.1 (win32 x86) on a Windows 7 x64 machine.
ts comes from the stats package included with base R. It is useful for regular time series such as the monthly, quarterly, or annual series common in government statistics. ts is used by arima() and other time series methods provided by base R and its stats package. HoltWinters, which you used here, is one such example.
timeSeries is one of many add-on time series classes; this one comes from Rmetrics. Several CRAN Task Views discuss these more: TimeSeries, Econometrics as well as Finance.
Try the documentation on ts and/or HoltWinters to come to grips with the required format. ts uses either a fixed delta (e.g. 1/12 for monthly data) or a frequency.
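As a small illustration of that last point, the two ts() calls below describe the same monthly series, once through frequency and once through deltat. The values and start date are made up.

```r
# Same monthly series specified two ways (illustrative data)
monthly_by_freq  <- ts(1:24, start = c(2020, 1), frequency = 12)
monthly_by_delta <- ts(1:24, start = c(2020, 1), deltat = 1 / 12)

# tsp holds start, end, and frequency for each object
print(tsp(monthly_by_freq))
print(tsp(monthly_by_delta))
```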
I found the problem by studying the HoltWinters source code.
It turns out that, when there is no seasonal component, the HoltWinters function expects gamma to be logical (FALSE) rather than a numeric zero.
So passing gamma = as.logical(0), that is gamma = FALSE, solves the problem.
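A minimal sketch of that fix on the same data, using base R only. One change from the question is an assumption on my part: beta is also passed as FALSE rather than 0, which the HoltWinters help page describes as plain exponential smoothing.

```r
# LakeHuron ships with base R (datasets package)
before <- window(LakeHuron, end = 1935)

# gamma = FALSE requests a non-seasonal fit, so no decomposition is attempted;
# beta = FALSE (assumed here) also drops the trend term
model <- HoltWinters(before, alpha = 0.2, beta = FALSE, gamma = FALSE)

# Forecast five years ahead
print(predict(model, n.ahead = 5))
```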
Joris: thank you for the answer, that was illuminating.
They are two separate classes. ts is contained in the basic R installation, and the function HoltWinters() demands a ts time series.
timeSeries has a completely different structure. It is also specifically aimed at finance. The big difference from ts is that it allows for irregular time series. The class ts can only hold equispaced series.
Internally, ts has a slot "tsp" which contains the start, end and frequency of the timeseries.
> test <- ts(1:10, frequency = 4, start = c(1959, 2))
> slotNames(test)
[1] ".Data" "tsp" ".S3Class"
> slot(test,"tsp")
[1] 1959.25 1961.50 4.00
It's this slot that HoltWinters() needs but lacks in timeSeries. There the information on the times is contained in two slots, a position slot and a format slot. Together they define the times as a timeDate object.
> data = as.matrix(MSFT[, 4])
> charvec = rownames(MSFT)
> Close = timeSeries(data, charvec, units = "Close")
> slotNames(Close)
[1] ".Data" "units" "positions" "format" "FinCenter" "recordIDs" "title" "documentation"
> head(slot(Close,"positions"))
[1] 970012800 970099200 970185600 970444800 970531200 970617600
> slot(Close,"format")
[1] "%Y-%m-%d"
