How do I add periods to time series in R after aggregation - r

I have a two-variable data frame (df) in R of daily sales for a ten-year period, from 2004-07-09 through 2014-12-31. Not every date in that period is represented, but most weekdays (Monday through Friday) are.
My objective is to aggregate sales by quarter, convert to a time series object, and run a seasonal decomposition and other time series forecasting.
I am having trouble with the conversion, as I ultimately receive an error:
time series has no or less than 2 periods
Here's the structure of my code.
# create a time series object
library(xts)
x <- xts(df$amount, df$date)
# create a time series object aggregated by quarter
q.x <- apply.quarterly(x, sum)
When I try to run
fit <- stl(q.x, s.window = "periodic")
I get the error message
series is not periodic or has less than two periods
When I try to run
q.x.components <- decompose(q.x)
# or
decompose(x)
I get the error message
time series has no or less than 2 periods
So, how do I take my original dataframe, with a date variable and an amount variable (sales), aggregate that quarterly as a time series object, and then run a time series analysis?

I think I was able to answer my own question. I did this. Can anyone confirm if this structure makes sense?
library(lubridate)
# add a new variable indicating the calendar year.quarter (i.e. 2004.3) of each observation
df$year.quarter <- quarter(df$date, with_year = TRUE)
library(plyr)
# summarize gift amount by year.quarter
new.data <- ddply(df, .(year.quarter), summarize,
sum = round(sum(amount), 2))
# convert the new data to a quarterly time series object beginning
# in July 2004 (2004, Q3) and ending in December 2014 (2014, Q4)
nd.ts <- ts(new.data$sum, start = c(2004,3), end = c(2014,4), frequency = 4)
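For comparison, the same idea can be sketched in base R alone. This is a minimal sketch on simulated weekday sales; the simulated data, the helper columns year and qtr, and the object names are assumptions made for illustration and are not part of the original post. It shows the point that seems to matter here: stl() and decompose() want a ts object whose frequency is greater than 1 (4 for quarterly data).

```r
# Simulated stand-in for the original df: weekday sales, 2004-07-09 to 2014-12-31
set.seed(42)
dates <- seq(as.Date("2004-07-09"), as.Date("2014-12-31"), by = "day")
wday <- as.POSIXlt(dates)$wday              # 0 = Sunday, 6 = Saturday
dates <- dates[!(wday %in% c(0, 6))]
df <- data.frame(date = dates,
                 amount = round(runif(length(dates), 100, 1000), 2))

# Label each row with its calendar year and quarter (hypothetical helper columns)
lt <- as.POSIXlt(df$date)
df$year <- lt$year + 1900
df$qtr <- lt$mon %/% 3 + 1

# Sum sales per quarter and put the quarters in chronological order
quarterly <- aggregate(amount ~ year + qtr, data = df, FUN = sum)
quarterly <- quarterly[order(quarterly$year, quarterly$qtr), ]

# A ts with frequency = 4 tells stl() there are four periods per year
q_ts <- ts(quarterly$amount, start = c(2004, 3), frequency = 4)
fit <- stl(q_ts, s.window = "periodic")
```

Calling plot(fit) afterwards should show the seasonal, trend and remainder components, and end(q_ts) is a quick check that the series really stops at 2014 Q4.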

Related

Date Formatting in Time Series Codes

I have a .csv file that looks like this:
Date       Time   Demand
01-Jan-05  6:30   6
01-Jan-05  6:45   3
...
23-Jan-05  21:45  0
23-Jan-05  22:00  1
The days are broken into 15 minute increments from 6:30 - 22:00.
Now, I am trying to do a time series on this, but I am a little lost on the notation of this.
I have the following so far:
library(tidyverse)
library(forecast)
library(zoo)
tp <- read.csv(".csv")
tp.ts <- ts(tp$DEMAND, start = c(), end = c(), frequency = 63)
The frequency I am after is an entire day, which I believe makes the number 63.***
However, I am unsure as to how to notate the dates in c().
***Edit
If the frequency is meant to be the number of observations per unit of time, and I am trying to observe just Demand by the 15-minute time slots (Time) in each day (Date), maybe my frequency is 1?
***Edit 2
So I think I am struggling with doing the time series because I have a Date column (which is characters) and a Time column.
Since I need the data for Demand at the given times on the given dates, maybe I need to convert the dates to be used in ts() and combine the Date and Time data into a new column?
If I do this, I am assuming this should give me the times I need (6:30 to 22:00) but with the addition of having the date?
However, the data is to be used to predict the Demand for the rest of the month. So maybe the Date is an important variable if the day of the week impacts Demand?
We assume you are starting with tp, shown reproducibly in the Note at the end. A complete daily cycle of 24 * 4 = 96 fifteen-minute points should be represented by one unit of time internally. The chron class does that, so read the data in as a zoo series z with a chron time index and then convert it to ts, giving ts_ser, or leave it as a zoo series, depending on what you are going to do next.
library(zoo)
library(chron)
to_chron <- function(date, time) as.chron(paste(date, time), "%d-%b-%y %H:%M")
z <- read.zoo(tp, index = 1:2, FUN = to_chron, frequency = 4 * 24)
ts_ser <- as.ts(z)
Note
tp <- structure(list(Date = c("01-Jan-05", "01-Jan-05"), Time = c("6:30",
"6:45"), Demand = c(6L, 3L)), row.names = 1:2, class = "data.frame")
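If only the observed slots matter, the asker's original idea of frequency = 63 can also be sketched with plain ts(). This is a minimal sketch, assuming that every day has all 63 slots with no gaps and that the rows are ordered by date and then time; the simulated demand vector and the object names are stand-ins, not the real data. It captures the within-day pattern only, so any day-of-week effect would need to be modelled separately.

```r
# Number of 15-minute slots from 6:30 to 22:00 inclusive
slots_per_day <- (22 * 60 - (6 * 60 + 30)) / 15 + 1   # 63
n_days <- 23                                          # 01-Jan-05 to 23-Jan-05

# Simulated stand-in for tp$Demand, ordered by date and then time
set.seed(1)
demand <- rpois(slots_per_day * n_days, lambda = 3)

# start = c(1, 1) means "day 1, slot 1"; one unit of time is one day
demand_ts <- ts(demand, start = c(1, 1), frequency = slots_per_day)

# cycle() gives the slot within the day (1 = 6:30, 63 = 22:00)
slot <- cycle(demand_ts)
```

Here the two numbers in start are (day, slot within the day) rather than a calendar date, which is how ts() interprets c() whenever frequency is greater than 1.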

Understand frequency parameter while converting xts to ts object in R

What is the meaning of the frequency shown below? After converting my xts object to a ts object and printing it, I got the following information.
My data is hourly, but I could not understand how this frequency is calculated. I want to make sure my ts object treats my data as hourly.
Time Series:
Start = 1
End = 15548401
Frequency = 0.000277777777777778 (how this is equivalent to hourly frequency?)
So, my data frame initially looks like this:
y
1484337600 19.22819
1484341200 19.28906
1484344800 19.28228
1484348400 19.21669
1484352000 19.32759
1484355600 19.21833
1484359200 19.20626
1484362800 19.28737
1484366400 19.20651
1484370000 19.18424
It has epoch times and values. Epoch times are row.names in this dataframe.
I then converted it into an xts object using:
xts_dataframe <- xts(x = dataframe$y,
order.by = as.POSIXct(as.numeric(row.names(dataframe)), origin="1970-01-01"))
ts_dataframe <- as.ts(xts_dataframe)
Please suggest what I am doing wrong. Basically, I want to convert my initial data frame to a ts object because I need to fit an ARIMA model to it. The data is hourly, and I am having a hard time working with it.
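The frequency is equal to 1/deltat, where deltat is the fraction of the sampling period between successive observations. ?frequency gives the example that deltat would be "1/12 for monthly data", where the unit of time is one year.
In your case the time index is in seconds (epoch times), so for hourly data deltat is 3600, since there are 3600 seconds in an hour. Since frequency = 1 / deltat, that means frequency = 1 / 3600, or 0.0002777778. The printed value is therefore consistent with hourly data; it just uses one second as the unit of time.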
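The relationship between deltat and frequency can be sketched with base R's ts(). This is a minimal sketch using the ten values from the question; the object names are hypothetical, and choosing frequency = 24 assumes that the seasonal cycle of interest is one day.

```r
# Ten hourly values from the question
y <- c(19.22819, 19.28906, 19.28228, 19.21669, 19.32759,
       19.21833, 19.20626, 19.28737, 19.20651, 19.18424)

# Time measured in seconds: successive observations are 3600 apart
ts_seconds <- ts(y, start = 1484337600, deltat = 3600)
frequency(ts_seconds)    # 1/3600, i.e. about 0.000278

# Time measured in days: 24 observations per unit of time
ts_hourly <- ts(y, frequency = 24)
frequency(ts_hourly)     # 24
```

For seasonal models, the second form is usually the more useful one, because the frequency then states how many observations make up one seasonal cycle.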

R: What does the frequency argument to xts do? [duplicate]

I'm creating an xts object with a weekly (7 day) frequency to use in forecasting. However, even when using the frequency=7 argument in the xts call, the resulting xts object has a frequency of 1.
Here's an example with random data:
> values <- rnorm(364, 10)
> days <- seq.Date(from=as.Date("2014-01-01"), to=as.Date("2014-12-30"), by='days')
> x <- xts(values, order.by=days, frequency=7)
> frequency(x)
[1] 1
I have also tried, after using the above code, frequency(x) <- 7. However, this changes the class of x to only zooreg and zoo, losing the xts class and messing with the time stamp formats.
Does xts automatically choose a frequency based on analyzing the data in some way? If so, how can you override this to set a specific frequency for forecasting purposes (in this case, passing a seasonal time series to ets from the forecast package)?
I understand that xts may not allow frequencies that don't make sense, but a frequency of 7 with daily time stamps seems pretty logical.
Consecutive Date class dates always have a frequency of 1 since consecutive dates are 1 apart. Use ts or zooreg to get a frequency of 7:
tt <- ts(values, frequency = 7)
library(zoo)
zr <- as.zooreg(tt)
# or
zr <- zooreg(values, frequency = 7)
These will create a series whose times are 1, 1+1/7, 1+2/7, ...
If we have some index values of zr
zrdates <- index(zr)[5:12]
we can recover the dates from zrdates like this:
days[match(zrdates, index(zr))]
As pointed out in the comments xts does not support this type of series.
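A base R variant of the same idea, keeping the calendar dates alongside the series, might look like the sketch below. The seed and the names day_in_week and lookup are assumptions made for illustration.

```r
set.seed(123)
values <- rnorm(364, 10)
days <- seq(as.Date("2014-01-01"), as.Date("2014-12-30"), by = "day")

# Seven observations per unit of time: one unit is one week
tt <- ts(values, frequency = 7)

# Position within the week (1..7) for each observation
day_in_week <- cycle(tt)

# The ts index is 1, 1 + 1/7, ...; observation i still corresponds to days[i]
lookup <- data.frame(date = days, time = as.numeric(time(tt)), value = values)
```

The series tt carries the weekly seasonality, while lookup keeps the mapping back to real dates that the ts index itself no longer stores.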

Convert a time series from minutes to Day period

I've got an R time series object that is measured in 1 hour intervals.
library(lubridate)
library(timeSeries)
set.seed(100)
c <- Sys.time()
d <- c + hours(1:200)
e <- rnorm(200)
f <- data.frame(d,e)
g <- as.timeSeries(f)
I would like to convert this to a daily time series. I am fine with using the average value of the data column for this conversion.
The outcome would be a time series object with one entry per day whose value is the average of all the hourly values of that particular day.
How can this be done?
First, use the lubridate package to compute the calendar day of each timestamp (f$d is already a date-time, so it does not need to be parsed again):
library(lubridate)
f$date <- floor_date(f$d, "day")
Then calculate the average for each day:
library(dplyr)
daily <- dplyr::group_by(f, date) %>%
  dplyr::summarise(avg = mean(e))
Finally, build the daily time series from the date and avg columns of daily.
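The same aggregation can be sketched in base R without extra packages. This is a minimal sketch on simulated data; the fixed UTC start time is an assumption chosen so the result is reproducible, whereas the question uses Sys.time().

```r
# Simulated hourly data on a fixed UTC start so the result is reproducible
set.seed(100)
d <- as.POSIXct("2020-01-01 00:00:00", tz = "UTC") + 3600 * (0:199)
f <- data.frame(d = d, e = rnorm(200))

# Calendar day of each timestamp (time zone stated explicitly)
f$day <- as.Date(f$d, tz = "UTC")

# One row per day holding the mean of that day's hourly values
daily <- aggregate(e ~ day, data = f, FUN = mean)
```

Stating the time zone explicitly matters because the day boundary, and therefore which hours are averaged together, depends on it.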

R - Daily data and Time Series by year and week

Hi, I am new to R and to time series forecasting.
I have sample data of daily sales for the past 3 years, and I would like to use it to produce a plot that shows seasonality and patterns.
My daily data looks like this:
Date, Sales
2010-01-01, 5
2010-01-03, 3
2010-01-04, 2
..
2011-12-01, 4
..
2014-11-01, 1
What I want to see is similar to the plot below, but by week and year, using the ts function. Also, due to leap years, some years have 53 weeks and some have 52; any idea how to take this into account when plotting?
Working with the ts function is not easy for me, so it would be great if someone could help with this.
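You should start by creating a ts object. Check ?ts for the syntax, but assuming your data above are stored in a data frame called data, it is basically
tsData <- ts(data$Sales, start = c(2010, 1), frequency = 365)
where start is c(year, position within the year), so with frequency = 365 the second element is the day of the year rather than the month, and frequency is the number of samples per year. Note that ts() assumes equally spaced observations, so days missing from the data (such as 2010-01-02 above) need to be filled in first. Then you can use plot.ts() to plot the entire time series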
plot.ts(tsData)
To extract seasonal patterns or trends, you can use the decompose() function.
decompose(tsData)
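Since the question asks for a view by week, one possible sketch is to aggregate the daily values into weekly totals before building the ts object. The simulated sales, the 7-day blocks counted from the first date, and the object names are all assumptions for illustration. Using frequency = 52 is an approximation, since a year holds slightly more than 52 weeks.

```r
# Simulated daily sales for three years (stand-in for the real data)
set.seed(1)
dates <- seq(as.Date("2010-01-01"), as.Date("2012-12-31"), by = "day")
sales <- rpois(length(dates), lambda = 5)

# Group days into consecutive 7-day blocks counted from the first date
week_index <- as.integer(dates - dates[1]) %/% 7
weekly <- tapply(sales, week_index, sum)

# Weekly series; 52 approximates the number of weeks per year
weekly_ts <- ts(as.numeric(weekly), start = c(2010, 1), frequency = 52)
parts <- decompose(weekly_ts)
```

Because week_index is computed from the dates themselves, days missing from the real data simply contribute nothing to their week, although a week with no rows at all would drop out and would need to be filled in first.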
Here is a sample:
x <- 1:10
y <- 11:20
plot(x, y)
lines(x,y)
Apply this to your sales and date data: replace x with date and y with sales, so that time runs along the horizontal axis. If you still have issues, please post them.
