Converting character variable into date variable with only years in R

I have a character variable with n observations per year, like this:
Years <- c("2010","2010","2011", "2011", "2012", "2012")
I would like R to read these characters as actual years, therefore I tried with
dates <- as.Date(Years, format = "%Y")
But the output obtained is this:
[1] "2010-09-24" "2010-09-24" "2011-09-24" "2011-09-24" "2012-09-24" "2012-09-24"
while I would like to keep them only with the year.
Is it possible to obtain a Date variable with only the years?
Thanks

Date class objects require a year, month and day. If you only have a year then you either have to add a month and day or not use Date class.
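As a minimal base R sketch of those two options (the choice of 1 January as the filler day is an assumption made only for illustration), the years can either stay as plain integers or be pinned to a fixed day and formatted back to a year when displayed:

```r
Years <- c("2010", "2010", "2011", "2011", "2012", "2012")

# Option 1: do not use Date at all, keep the years as integers
yrs <- as.integer(Years)

# Option 2: supply a month and day (assumed here: 1 January) to get a valid Date
dates <- as.Date(paste0(Years, "-01-01"))

# The year can be recovered for display at any time
year_labels <- format(dates, "%Y")
```

Either way the year is unambiguous, whereas as.Date(Years, format = "%Y") silently fills in the current month and day.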
Also a time series should have one point for each index value. Aggregate the values corresponding to each year using mean, tail1 <- function(x) tail(x, 1), or another aggregation function so that there is only one point per year.
xts does not support just a year as the index but zoo does and if the series is regularly spaced a ts series could be used as well.
library(zoo)
# note that we are assuming a numeric year
DF <- data.frame(year = c(2010, 2010, 2011, 2012, 2012), value = 1:5)
z <- read.zoo(DF, aggregate = mean)
tt <- as.ts(z)
Another possibility is to use the yearmon or yearqtr class. These hold a year and a month, or a year and a quarter, so no day is needed. Internally, January or Q1 is stored as a number equal to the year.
library(xts)
zm <- read.zoo(DF, FUN = as.yearmon, aggregate = mean)
xm <- as.xts(zm)
zq <- read.zoo(DF, FUN = as.yearqtr, aggregate = mean)
xq <- as.xts(zq)
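The internal representation mentioned above can be checked directly. This is a small sketch that assumes only the zoo package; the specific months and quarters are arbitrary examples:

```r
library(zoo)

# January / Q1 are stored as the year itself
jan <- as.numeric(as.yearmon(2010))        # 2010
q1  <- as.numeric(as.yearqtr("2010 Q1"))   # 2010

# Later months and quarters add a fraction of a year
feb <- as.numeric(as.yearmon("2010-02"))   # 2010 + 1/12
q3  <- as.numeric(as.yearqtr("2010 Q3"))   # 2010 + 2/4
```

This is why a numeric year column can be passed straight to as.yearmon or as.yearqtr in the read.zoo calls.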

Related

Time difference between quarters

I am currently working with dates in R and need to calculate the time difference between two quarters. I have used the zoo library to transform my dates into quarterly format, but I am struggling to calculate the difference between my dates.
Here is some sample code for reproducibility:
sample_dataframe <- data.frame(First_Purchase_date = c(as.Date("2020-01-15"), as.Date("2019-02-10"),as.Date("2018-12-24")),Recent_Purchase_date = c(as.Date("2020-06-20"), as.Date("2020-10-10"), as.Date("2019-05-26")))
library(zoo)
#using zoo library to transform my dates into quarters
sample_dataframe$First_purchase_quarter <- as.yearqtr((sample_dataframe$First_Purchase_date), "%Y-%m-%d")
sample_dataframe$Recent_Purchase_quarter <- as.yearqtr((sample_dataframe$Recent_Purchase_date), "%Y-%m-%d")
What I want to achieve is to subtract First_purchase_quarter from Recent_Purchase_quarter to get a time difference in quarters.
So if Recent_Purchase_quarter is 2019 Q2 and First_purchase_quarter is 2018 Q4 the result should be 2.
What would be the easiest way to get the time difference in quarters as described above?
#using zoo library to transform my dates into quarters
sample_dataframe$First_purchase_quarter <- as.yearqtr((sample_dataframe$First_Purchase_date), "%Y-%m-%d")
sample_dataframe$Recent_Purchase_quarter <- as.yearqtr((sample_dataframe$Recent_Purchase_date), "%Y-%m-%d")
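# yearqtr differences are measured in years, so multiply by 4 to get quarters
sample_dataframe$diff <- (sample_dataframe$Recent_Purchase_quarter - sample_dataframe$First_purchase_quarter) * 4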
head(sample_dataframe$diff)
[1] 1 7 2
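The same calculation can be wrapped in a small helper. This is only a sketch: quarters_between is a hypothetical name, not a zoo function, and the rounding step is an added precaution against floating-point error in the year fractions.

```r
library(zoo)

first  <- as.Date(c("2020-01-15", "2019-02-10", "2018-12-24"))
recent <- as.Date(c("2020-06-20", "2020-10-10", "2019-05-26"))

# Hypothetical helper: whole quarters from `from` to `to`
quarters_between <- function(from, to) {
  years_apart <- as.numeric(as.yearqtr(to)) - as.numeric(as.yearqtr(from))
  as.integer(round(years_apart * 4))
}

qdiff <- quarters_between(first, recent)
qdiff
```

For the sample dates this reproduces the differences 1, 7 and 2 shown above.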

R Convert number to month

I'm trying to build a time series. My data frame has each month listed as a number. When I use as.Date() I get NA. How do I convert a number to its respective month, as a date.
Example
Base R has built-in month name constants. Make sure your numbers are actually numeric by using as.numeric(), and then you can just use month.name[1], which returns "January".
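A short base R sketch of that lookup follows. The input vector is made up, and turning the result into a Date requires assuming a year and a day (2020 and the 1st are placeholders here):

```r
# Month numbers that arrived as text (illustrative input)
month_num <- c("1", "2", "12")
idx <- as.integer(month_num)

full_names  <- month.name[idx]   # "January" "February" "December"
short_names <- month.abb[idx]    # "Jan" "Feb" "Dec"

# A Date needs a year and a day as well; both are assumptions here
month_dates <- as.Date(sprintf("2020-%02d-01", idx))
```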
Below we assume that the month numbers given are the number of months relative to a base of the first month, so for example month 13 would represent 12 months after month 1. We also assume that the months are unique, since that is the case in the question and since it is stated there that the data represents a time series.
1) Let base be a yearmon class object identifying the base year/month, and assume months is a vector of month numbers such that 1 is the base, 2 is one month later and so on. Since the yearmon class represents a year and month as year + 0 for Jan, year + 1/12 for Feb, ..., year + 11/12 for Dec, the code below gives a Date vector. Alternately, use ym directly since it already models a year and month.
library(zoo)
# inputs
base <- as.yearmon("2020-01")
months <- 1:9
ym <- base + (months-1)/12
as.Date(ym)
## [1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01"
## [6] "2020-06-01" "2020-07-01" "2020-08-01" "2020-09-01"
For example, if we have this data.frame we can convert that to a zoo series or a ts series like this using base from above:
library(zoo)
DF <- data.frame(month = 1:9, value = 11:19) # input
z <- with(DF, zoo(value, base + (month-1)/12)) # zoo series
tt <- as.ts(z) # ts series
2) Alternately, if it were known that the series is consecutive months starting in January 2020 then we could ignore the month column and do this (where DF and base were shown above):
library(zoo)
zz <- zooreg(DF$value, base, freq = 12) # zooreg series
as.ts(zz) # ts series
3) This would also work to create a ts series if we can make the same assumptions as in (2). This uses only base R.
ts(DF$value, start = 2020, freq = 12)

Convert day of year to date assuming all years are non-leap years

I have a df with year and day of year as columns:
dat <- data.frame(year = rep(1980:2015, each = 365), day = rep(1:365,times = 36))
Please note that I am assuming 365 days in a year even if it is a leap year. I need to generate two things:
1) month
2) date
I did this:
# this lists which days of the year fall in each month
months <- list(1:31, 32:59, 60:90, 91:120, 121:151, 152:181, 182:212, 213:243, 244:273, 274:304, 305:334, 335:365)
library(dplyr)
# this assigns each day to a month
dat1 <- dat %>% mutate(month = sapply(day, function(x) which(sapply(months, function(y) x %in% y))))
I want to produce a third column which is a date in the format year,month,day.
However, since I am assuming all years are non-leap years, I need to ensure that my dates also reflect this i.e. there should be no date as 29th Feb.
The reason I need to generate the date is that I want to number the 15-day periods of a year. A year will have 24 such periods:
1st Jan - 15th Jan: period 1
16th Jan - 31st Jan: period 2
1st Feb - 15th Feb: period 3 ...
16th Dec - 31st Dec: period 24
I need dates to specify whether a day in a month falls in the first
half (i.e. day <= 15) or the second half (day > 15). I use the following
script to do this:
dat2 <- dat1 %>% mutate(twowk = month*2 - (as.numeric(format(date,"%d")) <= 15))
In order for me to run this above line, I need to generate date and hence my question.
A possible solution:
dat$dates <- as.Date(paste0(dat$year, '-',
                            format(strptime(paste0('1981-', dat$day), '%Y-%j'),
                                   '%m-%d')))
What this does:
With strptime(paste0('1981-',dat$day), '%Y-%j') you get the dates of a non-leap year.
By embedding that in format with '%m-%d' you extract the month and the day in the month.
Paste that together with the year from the year column and wrap the result in as.Date to get a date that never falls on 29 February.
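For completeness, here is a sketch that goes from day-of-year to the 15-day period number the question asks for. It is not the answer's exact code: it uses date arithmetic from a non-leap reference year (1981, as in the answer) instead of strptime, shortens the data to two years, and the mday column name is an assumption.

```r
# Two years are enough to illustrate; 1980 is a leap year, 1981 is not
dat <- data.frame(year = rep(1980:1981, each = 365),
                  day  = rep(1:365, times = 2))

# Map day-of-year onto a non-leap reference year to get month and day of month
ref <- as.Date(dat$day - 1, origin = "1981-01-01")
dat$month <- as.integer(format(ref, "%m"))
dat$mday  <- as.integer(format(ref, "%d"))

# Rebuild the date with the real year; 29 February can never appear
dat$date <- as.Date(sprintf("%d-%02d-%02d", dat$year, dat$month, dat$mday))

# Period 1 = 1-15 Jan, period 2 = 16-31 Jan, ..., period 24 = 16-31 Dec
dat$twowk <- dat$month * 2 - (dat$mday <= 15)
```

Day 60 of 1980 becomes 1 March 1980 under this scheme, which matches the "every year has 365 days" assumption in the question.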

R How to use a complex function at seasonal period under hydroTSM and xts packages?

I want to calculate the seasonal mean of my parameter values (when x > 0.002). To do this, I use xts::period.apply() to separate the values seasonally. I use the "quarter" period in endpoints(), but the "quarter" period divides the year into four seasons as follows:
"January+February+March",
"April+May+June",
"July+August+September",
"October+November+December"
For example:
library(xts)
library(PerformanceAnalytics)
data(edhec)
head(edhec)
edhec_4yr <- edhec["1997/2001"]
ep <- endpoints(edhec_4yr, "quarter")
# mean
period.apply(edhec_4yr, INDEX = ep,
function(x) apply(x,2, function(y) mean(y[y>0.002])))
But for my study, I want my seasonal periods divided as follows:
"December+January+February",
"March+April+May",
"June+July+August",
"September+October+November"
Can you help me change which months make up each "quarter" period?
I can apply simple functions (mean, max, min) with the hydroTSM package as follows:
dm2seasonal(edhec_4yr, FUN=mean, season="DJF")
Where:
DJF : December, January, February
MAM : March, April, May
JJA : June, July, August
SON : September, October, November
But I cannot apply a more complex function (a mean with a condition) in the same way:
dm2seasonal(edhec_4yr, season="DJF",
function(x) apply(x,2, function(y) mean(y[y>0.002])))
Can you help me improve this function so that it calculates the mean value (when x > 0.002) for DJF, for example?
The xts::endpoints() function always returns the last observation in a "standard" period, starting from the origin (midnight, 1970-01-01). So it can't easily do what you want.
You can calculate your own period end points by finding the observation on the last day of the last month in each 3-month window. Here's one way to do that with monthly data:
# .indexmon() returns a zero-based month
ep <- which((.indexmon(edhec_4yr) + 1) %in% c(2, 5, 8, 11))
aggfn <- function(x, bound = 0.002, ...) {
  apply(x, 2, function(y) mean(y[y > bound], ...))
}
period.apply(edhec_4yr, ep, aggfn)
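To make the grouping easier to inspect, here is the same idea on a small synthetic monthly series. The data (values 1 to 24), the column name v and the bound of 0 are assumptions chosen so the result can be checked by hand; the end points are also written out with an explicit leading 0 and trailing nrow(x).

```r
library(xts)

# Synthetic monthly series for 1997-1998 with values 1..24 (illustration only)
idx <- seq(as.Date("1997-01-01"), by = "month", length.out = 24)
x <- xts(matrix(1:24, ncol = 1, dimnames = list(NULL, "v")), order.by = idx)

# Rows whose month closes a season: Feb (DJF), May (MAM), Aug (JJA), Nov (SON)
season_end <- which((.indexmon(x) + 1) %in% c(2, 5, 8, 11))

# End points spelled out in full: start at 0, finish at the last row
ep <- unique(c(0, season_end, nrow(x)))

aggfn <- function(x, bound = 0, ...) {
  apply(x, 2, function(y) mean(y[y > bound], ...))
}

res <- period.apply(x, ep, aggfn)
res
```

Note that the first and last groups are partial seasons (January and February 1997, then December 1998 alone), so in real data those rows may need to be dropped or interpreted with care.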
If you have daily data, you need to find the last day of each month your periods end in. You can do that by using .indexmon() to find all months that end each season, then construct an xts object with the locations of all those observations in the original daily data object. Then you can use apply.monthly() and last() to extract the location of the last day of each season-ending month. The resulting object contains the end points you need to pass to period.apply().
data(prices)
prices <- as.xts(prices) # 'prices' is zoo; convert to xts
season_months <- (.indexmon(prices)+1) %in% c(2, 5, 8, 11)
ep_months <- xts(which(season_months), index(prices)[season_months])
ep_seasons <- as.numeric(apply.monthly(ep_months, last))
period.apply(prices, ep_seasons, aggfn)
And I should note that I'm thinking about how to specify end points in a more flexible manner, and I'll make sure to include a way to specify seasons.

Adding quarters to R date

I have R time series data, where I am calculating the means of all values up to a particular date and storing these means at the date + 4 quarters. The dates are all month ends. To achieve this, I am looking to add 4 quarters to a date. My question is: how can I add 4 quarters to an R Date? An illustration:
library(lubridate) # provides quarter()
a <- as.Date("2006-01-01")
b <- as.Date("2011-01-01")
date_range <- quarter(seq.Date(a, b, by = "quarter"), with_year = TRUE)
> date_range[1] + 1
[1] 2007.1
> date_range[1] + quarter(1)
[1] 2007.1
> date_range[1] + 0.25
[1] 2006.35
One possible way I am thinking of is to get year-quarter dates and then add 4 to them, but I am not sure what the best way to do this is.
The problem is that quarters have different lengths. Q1 is shortest because it includes February (though it ties with Q2 in leap years). Things like this make "adding a quarter to a date" poorly defined. Even adding months to a date can be tricky at the ends of months: what is 1 month after January 31?
Beginnings of months are more straightforward, and I would recommend you use the 1st day of quarters rather than the last (if you must use a specific date). lubridate provides functions like floor_date() and ceiling_date() to which you can pass unit = "quarter" and they will return the first day of the current or subsequent quarter, respectively. You can also always add months(3) to a day at the beginning of a month, though of course if your intention is to add 4 quarters you may as well just add 1 year.
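A brief sketch of those lubridate calls, using an arbitrary mid-quarter date as the example input:

```r
library(lubridate)

d <- as.Date("2006-05-17")                    # arbitrary example date in Q2

q_start <- floor_date(d, unit = "quarter")    # first day of the current quarter
q_next  <- ceiling_date(d, unit = "quarter")  # first day of the next quarter

# From the first of a month, adding months or years is unambiguous
plus_one_quarter   <- q_start + months(3)
plus_four_quarters <- q_start + years(1)
```

Here q_start is 2006-04-01 and q_next is 2006-07-01, so adding months(3) to the quarter start lands on the same date as ceiling_date().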
Just add 12 months or a year instead?
Or if it must be quarters, define yourself a function, like so:
library(lubridate)
# note: this definition masks base::quarters() for the session
quarters <- function(x) {
  months(3 * x)
}
and then use it to add to the date sequence:
date_range <- seq.Date(a, b, by = "quarter")
date_range + quarters(4)
Lubridate has a function for quarters already included. This is a much better solution than creating your own function.
https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/quarter
Old answer, but for those arriving here: lubridate has an operator, %m+%, that adds months and preserves month ends.
library(lubridate)
a <- as.Date("2006-01-01")
Add future months worth of dates:
The original poster wanted 4 quarters in future so that will be 12 months.
future_date <- a %m+% months(12)
future_date
[1] "2007-01-01"
You could also do years as the period:
future_date <- a %m+% years(1)
Remove months from date:
Subtract dates with %m-%
If you wanted a date 3 months ago from 1/1/2006:
past_date <- a %m-% months(3)
past_date
[1] "2005-10-01"
Example with a date that is not at the end of a month. %m+% and %m-% preserve the day of the month:
as.Date("2022-10-10") %m-% months(3)
[1] "2022-07-10"
For more, see documentation on "Add and subtract months to a date without exceeding the last day of the new month"
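The month-end behaviour is easiest to see by comparing ordinary period addition with %m+% on a date where the target day does not exist. A small sketch, with 31 January 2006 chosen as the example:

```r
library(lubridate)

jan31 <- as.Date("2006-01-31")

# Plain addition asks for 31 February, which does not exist, so the result is NA
plain <- jan31 + months(1)

# %m+% rolls back to the last valid day of the target month
rolled <- jan31 %m+% months(1)

# Four quarters (12 months) ahead from a month end
next_year <- jan31 %m+% months(12)
```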
Note that other answers that use Date class will give irregularly spaced series and so are unsuitable for time series analysis.
To do this in such a way that time series analyses can be performed and noting the zoo tag on the question, the yearmon class represents year/month as year + fraction where fraction is 0 for Jan, 1/12 for Feb, 2/12 for Mar, ..., 11/12 for Dec. Thus adding 4 quarters is just a matter of adding 1. (Adding x quarters is done by adding x/4.)
library(zoo)
ym <- yearmon(2006) + 0:11/12 # months in 2006
ym + 1 # one year later
In the code below, the first line converts yearmon objects to end-of-month Dates and the second line converts Dates back to yearmon. Using frac = 0, or omitting frac, in the first line would give beginning-of-month dates instead.
d <- as.Date(ym, frac = 1) # d is Date vector of end-of-months
as.yearmon(d) # convert Date vector to yearmon
If your input dates represent quarters then there is also the yearqtr class which represents a year/quarter as year + fraction where fraction is 0, 1/4, 2/4, 3/4 for the 4 quarters of a year. Adding 4 quarters is done by adding 1 (or to add x quarters add x/4).
yq <- as.yearqtr(2006) + 0:3/4 # all quarters in 2006
yq + 1 # one year later
Conversions work similarly to yearmon:
d <- as.Date(yq, frac = 1) # d is Date vector of end-of-quarters
as.yearqtr(d) # convert Date vector to yearqtr
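Putting the pieces together for the original question, the following sketch starts from a month-end Date, moves 4 quarters ahead and converts back to a month-end Date. The input date is an arbitrary example:

```r
library(zoo)

d0 <- as.Date("2006-03-31")          # a quarter-end date (example input)

q  <- as.yearqtr(d0)                 # 2006 Q1
q4 <- q + 4/4                        # four quarters later

label     <- format(q4)              # "2007 Q1"
start_day <- as.Date(q4)             # first day of that quarter
end_day   <- as.Date(q4, frac = 1)   # last day of that quarter
```

Because the arithmetic happens on the yearqtr value rather than on the Date, the unequal lengths of quarters never come into play.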
