How to use cut function on dates - r

I have the following two dates:
dates <- c("2019-02-01", "2019-06-30")
I want to create the following bins from the two dates above:
2019-05-31, 2019-04-30, 2019-03-31, 2019-02-28
I used the cut function along with seq,
dt <- as.Date(dates)
cut(seq(dt[1], dt[2], by = "month"), "month")
but this does not produce correct results.
Could you please shed some light on the use of cut function on dates?

We assume that what is wanted is every end-of-month date strictly between the two dates in dates. In the question, dates[1] is the first day of a month and dates[2] is the last day of a month, but we do not rely on that, although the code could be simplified if we did. The series below are descending, as in the question, although ascending order is more usual in R.
The first approach below uses a monthly sequence and cut and the second approach below uses a daily sequence.
No packages are used.
1) We define a first-of-the-month function, fom, which takes a Date or character date and uses cut to return the Date of the first of that month. We then generate monthly dates between the firsts of the months of the two dates, convert each to its end of month, and drop any that are not strictly between the dates in dates.
fom <- function(x) as.Date(cut(as.Date(x), "month"))
s <- seq(fom(dates[2]), fom(dates[1]), "-1 month")
ss <- fom(fom(s) + 32) - 1
ss[ss > dates[1] & ss < dates[2]]
## [1] "2019-05-31" "2019-04-30" "2019-03-31" "2019-02-28"
2) Another approach is to compute a daily sequence between the two elements of dates after converting to Date class, and then keep only those days for which the next day falls in a different month and which lie strictly between the two dates. This does not use cut.
dt <- as.Date(dates)
s <- seq(dt[2], dt[1], "-1 day")
s[as.POSIXlt(s)$mon != as.POSIXlt(s+1)$mon & s > dt[1] & s < dt[2]]
## [1] "2019-05-31" "2019-04-30" "2019-03-31" "2019-02-28"
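To shed some light on what cut does with dates: given a Date vector and "month", it returns a factor whose labels are the first day of the month each date falls in, which is why fom can turn the result back into a Date. A minimal sketch, using made-up sample dates purely for illustration:

```r
# Made-up sample dates, for illustration only
d <- as.Date(c("2019-02-01", "2019-02-17", "2019-03-05", "2019-05-31"))

# cut() on a Date vector returns a factor; each label is the first day
# of the month that the corresponding date falls in
bins <- cut(d, "month")

# Going through character turns the labels back into Date values
month_starts <- as.Date(as.character(bins))

print(bins)
print(month_starts)
```

So cut bins dates by the start of their period; it does not by itself produce end-of-month dates, which is why the extra arithmetic is needed.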

There is no need for cut here:
library(lubridate)
dates <- c("2019-02-01", "2019-06-30")
seq(min(ymd(dates)), max(ymd(dates)), by = "months") - 1
#> [1] "2019-01-31" "2019-02-28" "2019-03-31" "2019-04-30" "2019-05-31"
Created on 2021-11-25 by the reprex package (v2.0.1)
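The output above still includes 2019-01-31, which lies before the first date. A base R sketch of the same idea, assuming (as in the question, but not in general) that the first date is the first day of a month, drops that leading element:

```r
dates <- c("2019-02-01", "2019-06-30")
dt <- as.Date(dates)

# Assumption: dt[1] is the first day of a month, so every element of the
# monthly sequence is also a first-of-month date
firsts <- seq(dt[1], dt[2], by = "month")

# Subtracting one day gives the last day of the previous month; the first
# element (2019-01-31) falls before dt[1], so it is dropped
month_ends <- (firsts - 1)[-1]
print(month_ends)
```

If the first date may fall in the middle of a month, this shortcut no longer applies and a filter on the result is needed instead.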


How to round the date to the first day of the month when subtracting dates in R

I need to calculate the difference between the current month and the last.
I do so
library(lubridate)
last_month <- Sys.Date() - months(1)
and the result
> last_month
[1] "2022-12-04"
This is the correct answer, but I need the resulting date to always be the first day of the month,
like this: "2022-12-01".
That is, how do I round it down to the first day of the month?
For example, when I run this on February 4, the result will be
"2023-01-04", but I need it to be "2023-01-01".
How to round the date to the first day of the month when subtracting dates like this?
thanks for your help
You can use format, which returns a character string (wrap it in as.Date() if you need a Date), i.e.
format(Sys.Date() - months(1), '%Y-%m-01')
#[1] "2022-12-01"
You could use floor_date from lubridate based on month like this:
library(lubridate)
last_month <- Sys.Date() - months(1)
floor_date(last_month, "month")
#> [1] "2022-12-01"
Created on 2023-01-04 with reprex v2.0.2
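One caveat with subtracting first and rounding afterwards: with lubridate, subtracting months(1) from a date such as March 31 gives NA, because February 31 does not exist. Rounding down first and then stepping back avoids that. A minimal base R sketch, where first_of_prev_month is a hypothetical helper name and the input is assumed to be a single Date:

```r
# Hypothetical helper (not part of lubridate): first day of the month
# before the month that d falls in. Assumes d is a single Date.
first_of_prev_month <- function(d) {
  first_this <- as.Date(format(d, "%Y-%m-01"))         # round down first
  seq(first_this, by = "-1 month", length.out = 2)[2]  # then step back a month
}

jan_case <- first_of_prev_month(as.Date("2023-01-04"))
mar_case <- first_of_prev_month(as.Date("2023-03-31"))
print(jan_case)
print(mar_case)
```

Because the subtraction always starts from day 1, the intermediate date is valid in every month.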

R Convert number to month

I'm trying to build a time series. My data frame has each month listed as a number. When I use as.Date() I get NA. How do I convert a number to its respective month, as a date?
Example
Base R has a built-in constant, month.name. Make sure your numbers are actually numeric by using as.numeric(), and then you can index it: month.name[1] returns "January".
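A short sketch of that lookup, using made-up month numbers stored as text; month.abb is the companion constant with three-letter abbreviations:

```r
# Made-up input: month numbers that were read in as text
month_txt <- c("1", "3", "12")
month_no <- as.numeric(month_txt)

# Index the built-in constants with the numeric month
full_names <- month.name[month_no]
short_names <- month.abb[month_no]

print(full_names)
print(short_names)
```

Note that this yields month names as character strings, not Date values.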
Below we assume that the month numbers given are the number of months relative to a base of the first month, so, for example, month 13 would represent 12 months after month 1. We also assume that the months are unique, since that is the case in the question and since it is stated there that the data represents a time series.
1) Let base be the year and month as a yearmon class object identifying the base year/month, and assume months is a vector of month numbers such that 1 is the base, 2 is one month later and so on. Since the yearmon class represents a year and month as year + 0 for Jan, year + 1/12 for Feb, ..., year + 11/12 for Dec, we have the code below to get a Date vector. Alternatively, use ym directly, since it already models a year and month.
library(zoo)
# inputs
base <- as.yearmon("2020-01")
months <- 1:9
ym <- base + (months-1)/12
as.Date(ym)
## [1] "2020-01-01" "2020-02-01" "2020-03-01" "2020-04-01" "2020-05-01"
## [6] "2020-06-01" "2020-07-01" "2020-08-01" "2020-09-01"
For example, if we have this data.frame we can convert that to a zoo series or a ts series like this using base from above:
library(zoo)
DF <- data.frame(month = 1:9, value = 11:19) # input
z <- with(DF, zoo(value, base + (month-1)/12)) # zoo series
tt <- as.ts(z) # ts series
2) Alternately, if it were known that the series is consecutive months starting in January 2020 then we could ignore the month column and do this (where DF and base were shown above):
library(zoo)
zz <- zooreg(DF$value, base, freq = 12) # zooreg series
as.ts(zz) # ts series
3) This would also work to create a ts series if we can make the same assumptions as in (2). This uses only base R.
ts(DF$value, start = 2020, freq = 12)
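If Date values are wanted without zoo, the same relative-month idea can be sketched in base R. This assumes, as above, a base month of January 2020 (an assumption, since the question does not state it) and uses made-up month numbers:

```r
# Assumption: month 1 corresponds to January 2020
base_date <- as.Date("2020-01-01")

# Made-up month numbers; 13 means 12 months after the base month
month_no <- c(1, 2, 5, 13)

# Build every first-of-month date up to the largest month number,
# then pick out the ones that are needed
all_months <- seq(base_date, by = "month", length.out = max(month_no))
month_dates <- all_months[month_no]
print(month_dates)
```

Starting the sequence on the first of a month keeps every step valid, so no month is skipped.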

Adding quarters to R date

I have time series data in R, where I calculate the mean of all values up to a particular date and store that mean at the date + 4 quarters. The dates are all month ends. To achieve this, I want to add 4 quarters to a date. My question is: how can I add 4 quarters to an R Date? An illustration:
a <- as.Date("2006-01-01")
b <- as.Date("2011-01-01")
date_range <- quarter(seq.Date(a, b, by = "quarter"), with_year = TRUE)
> date_range[1] + 1
[1] 2007.1
> date_range[1] + quarter(1)
[1] 2007.1
> date_range[1] + 0.25
[1] 2006.35
One possible way I am thinking is to get year-quarter dates, and then adding 4 to it. But wasn't sure what is the best way to do this?
The problem is that quarters have different lengths. Q1 is shortest because it includes February (though it ties with Q2 in leap years). Things like this make "adding a quarter to a date" poorly defined. Even adding months to a date can be tricky at the ends of months: what is 1 month after January 31?
Beginnings of months are more straightforward, and I would recommend you use the 1st day of quarters rather than the last (if you must use a specific date). lubridate provides functions like floor_date() and ceiling_date() to which you can pass unit = "quarter" and they will return the first day of the current or subsequent quarter, respectively. You can also always add months(3) to a day at the beginning of a month, though of course if your intention is to add 4 quarters you may as well just add 1 year.
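The differing quarter lengths can be checked directly in base R. In this sketch, quarter_lengths is a hypothetical helper name; it measures the gaps between consecutive quarter start dates:

```r
# Hypothetical helper: number of days in each quarter of a given year
quarter_lengths <- function(year) {
  # Five quarter starts: Jan 1, Apr 1, Jul 1, Oct 1, and Jan 1 of next year
  starts <- seq(as.Date(sprintf("%d-01-01", year)), by = "quarter", length.out = 5)
  as.integer(diff(starts))
}

q2019 <- quarter_lengths(2019)  # non-leap year
q2020 <- quarter_lengths(2020)  # leap year
print(q2019)
print(q2020)
```

The results are 90, 91, 92, 92 days for 2019 and 91, 91, 92, 92 days for 2020, so a "quarter" is not a fixed number of days.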
Just add 12 months or a year instead?
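Or if it must be quarters, define yourself a helper (named qtrs here so that it does not mask base R's quarters()), like so:
library(lubridate)
# a period of x quarters, expressed as 3*x months
qtrs <- function(x) {
  months(3*x)
}
and then use it to add to the date sequence:
date_range <- seq.Date(a, b, by = "quarter")
date_range + qtrs(4)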
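For a single start date, base R can also step forward by quarters without any package, since seq accepts by = "quarter". A minimal sketch, assuming a start date on the first of a month as in the question's illustration:

```r
a <- as.Date("2006-01-01")

# Five points spaced one quarter apart; the fifth is 4 quarters after a
plus_4q <- seq(a, by = "quarter", length.out = 5)[5]

# Equivalent for this case: one step of a year
plus_1y <- seq(a, by = "year", length.out = 2)[2]

print(plus_4q)
print(plus_1y)
```

This works on one start date at a time, and for start dates near the end of a month seq may roll forward into the following month, so it is best kept to first-of-month dates.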
lubridate already includes a quarter() function, which may be preferable to writing your own. Note, however, that it returns the quarter a date falls in rather than a period that can be added to a date.
https://www.rdocumentation.org/packages/lubridate/versions/1.7.4/topics/quarter
This is an old question, but for those arriving here: lubridate has an operator, %m+%, that adds months and preserves month ends.
a <- as.Date("2006-01-01")
Add future months worth of dates:
The original poster wanted 4 quarters in future so that will be 12 months.
future_date <- a %m+% months(12)
future_date
[1] "2007-01-01"
You could also do years as the period:
future_date <- a %m+% years(1)
Subtract months from a date with %m-%.
If you wanted the date 3 months before 2006-01-01:
past_date <- a %m-% months(3)
past_date
[1] "2005-10-01"
Example with a date that is not at the end of a month; %m-% preserves the day of the month:
as.Date("2022-10-10") %m-% months(3)
[1] "2022-07-10"
For more, see documentation on "Add and subtract months to a date without exceeding the last day of the new month"
Note that other answers that use Date class will give irregularly spaced series and so are unsuitable for time series analysis.
To do this in such a way that time series analyses can be performed and noting the zoo tag on the question, the yearmon class represents year/month as year + fraction where fraction is 0 for Jan, 1/12 for Feb, 2/12 for Mar, ..., 11/12 for Dec. Thus adding 4 quarters is just a matter of adding 1. (Adding x quarters is done by adding x/4.)
library(zoo)
ym <- yearmon(2006) + 0:11/12 # months in 2006
ym + 1 # one year later
Also this converts yearmon objects to end-of-month Date and in the second line Date to yearmon. Using frac = 0 or omitting frac in the first line would convert to beginning of month dates.
d <- as.Date(ym, frac = 1) # d is Date vector of end-of-months
as.yearmon(d) # convert Date vector to yearmon
If your input dates represent quarters then there is also the yearqtr class which represents a year/quarter as year + fraction where fraction is 0, 1/4, 2/4, 3/4 for the 4 quarters of a year. Adding 4 quarters is done by adding 1 (or to add x quarters add x/4).
yq <- as.yearqtr(2006) + 0:3/4 # all quarters in 2006
yq + 1 # one year later
Conversions work similarly to yearmon:
d <- as.Date(yq, frac = 1) # d is Date vector of end-of-quarters
as.yearqtr(d) # convert Date vector to yearqtr
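The year-plus-fraction idea can be illustrated in base R with plain numbers. This is only a sketch of the representation, not zoo's implementation, and label_quarter is a hypothetical helper name:

```r
# Quarters of 2006 as year + fraction: 2006.00, 2006.25, 2006.50, 2006.75
yq_num <- 2006 + 0:3 / 4

# Adding 4 quarters is adding 1; adding x quarters would be adding x / 4
later <- yq_num + 1

# Hypothetical helper: turn year + fraction back into a "YYYY Qn" label
label_quarter <- function(x) {
  yr <- floor(x)
  qtr <- round((x - yr) * 4) + 1
  sprintf("%d Q%d", as.integer(yr), as.integer(qtr))
}

labels_later <- label_quarter(later)
print(labels_later)
```

Because the values are evenly spaced numbers, they form a regular series, which is the property the answer relies on.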

sequence of monthly dates making sure it's the same day, or the last day of month in case of invalid

Given an initial date, I want to generate a sequence of dates with monthly intervals, ensuring every element has the same day as the initial date or the last day of the month in case the same day would yield an invalid date.
Sounds pretty standard, right?
Using difftime is not possible. Here's what the help file of difftime says:
Units such as "months" are not possible as they are not of constant
length. To create intervals of months, quarters or years use seq.Date
or seq.POSIXt.
But then looking at the help file of seq.POSIXt I find that:
Using "month" first advances the month without changing the day: if
this results in an invalid day of the month, it is counted forward
into the next month: see the examples.
This is the example in the help file:
> seq(ISOdate(2000,1,31), by = "month", length.out = 4)
[1] "2000-01-31 12:00:00 GMT" "2000-03-02 12:00:00 GMT"
"2000-03-31 12:00:00 GMT" "2000-05-01 12:00:00 GMT"
So, given that the initial date is on day 31, this would yield invalid dates in February, April, etc. The sequence therefore ends up skipping those months, because it "counts forward" and lands on March-02 instead of February-29.
If I start on 2000-01-31, I would like the sequence as follows:
2000-01-31
2000-02-29
2000-03-31
2000-04-30
...
And it should properly handle leap-years, so if the initial date is 2015-01-31 the sequence should be:
2015-01-31
2015-02-28
2015-03-31
2015-04-30
...
These are just examples to illustrate the problem and I do not know the initial date in advance, nor can I assume anything about it. The initial date may well be in the middle of the month (2015-01-15) in which case seq works fine. But it can also be, as in the examples, towards the end of the month on dates that using seq alone would be problematic (days 29, 30 and 31). I cannot assume either that the initial date is the last day of the month.
I have looked around trying to find a solution. In some questions here in SO (e.g. here) there is a "trick" to get the last day of a month, by getting the first day of the next month and simply subtract 1. And finding the first day is "easy" because it is just day 1.
So my solution so far is:
# Given an initial date for my sequence
initial_date <- as.Date("2015-01-31")
# Find the first day of the month
library(magrittr) # to use pipes and make the code more readable
first_day_of_month <- initial_date %>%
  format("%Y-%m") %>%
  paste0("-01") %>%
  as.Date()
# Generate a sequence from the initial date, using seq
# This is the sequence that will have incorrect values in months that would
# have invalid dates
given_date_seq <- seq(initial_date, by = "month", length.out = 4)
# And then generate an auxiliary sequence for the last day of the month
# I do this by generating a sequence that starts on the first day of the
# same month as the initial date and goes one month further
# (length 5 instead of 4), and subtracting 1 from all the elements
last_day_seq <- seq(first_day_of_month, by = "month", length.out = 5) - 1
# And finally, for each pair of elements, I take the min date of both
pmin(given_date_seq, last_day_seq[2:5])
It works, but it is, at the same time, kinda dumb, hacky and convoluted. So I do not like it. And most importantly, I cannot believe there is no easier way to do this in R.
Can someone please point me to a simpler solution? (I guess it should have been as simple as seq(initial_date, "month", 4), but apparently it is not). I've googled it and looked here in SO and R mailing lists, but apart from the tricks I mentioned above, I couldn't find a solution.
The simplest solution is %m+% from lubridate, which solves this exact problem. So:
library(lubridate)
seq_monthly <- function(from, length.out) {
  # add 0, 1, ..., length.out - 1 months, rolling back to the last valid day
  from %m+% months(0:(length.out - 1))
}
Output:
> seq_monthly(as.Date("2015-01-31"),length.out=4)
[1] "2015-01-31" "2015-02-28" "2015-03-31" "2015-04-30"
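For comparison, the same clamping rule can be sketched in base R without any package. This is not how lubridate implements %m+%; add_months_clamped is a hypothetical helper that assumes a single start Date and whole numbers of months:

```r
# Hypothetical helper: add n months to a single Date d, falling back to the
# last day of the target month when the original day does not exist there
add_months_clamped <- function(d, n) {
  y <- as.integer(format(d, "%Y"))
  m <- as.integer(format(d, "%m"))
  day <- as.integer(format(d, "%d"))

  # Count months on a single axis so that year rollover is handled by division
  total <- y * 12L + (m - 1L) + n
  first <- as.Date(sprintf("%04d-%02d-01", total %/% 12L, total %% 12L + 1L))
  next_first <- as.Date(sprintf("%04d-%02d-01",
                                (total + 1L) %/% 12L, (total + 1L) %% 12L + 1L))
  last <- next_first - 1  # last day of the target month

  # Same day of month if it exists, otherwise the month end
  pmin(first + (day - 1L), last)
}

s2015 <- add_months_clamped(as.Date("2015-01-31"), 0:3)
s2000 <- add_months_clamped(as.Date("2000-01-31"), 0:3)
print(s2015)
print(s2000)
```

The second call lands on 2000-02-29, so leap years are handled by the calendar arithmetic rather than by a special case.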
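Similar to the lubridate answer, here is one using RcppBDT (which wraps the Boost Date.Time library from C++). Note that each call to addMonths(i) advances the already-updated date, so the results below are 1, 2, 3, ... months apart rather than one month apart; the point is that every result is clamped to a valid end of month.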
R> dt <- new(bdtDt, 2010, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2010-02-28"
[1] "2010-04-30"
[1] "2010-07-31"
[1] "2010-11-30"
[1] "2011-04-30"
R> dt <- new(bdtDt, 2000, 1, 31); for (i in 1:5) { dt$addMonths(i); print(dt) }
[1] "2000-02-29"
[1] "2000-04-30"
[1] "2000-07-31"
[1] "2000-11-30"
[1] "2001-04-30"
R>

Get beginning of next quarter from current Date in R

I am trying to get beginning of next quarter from the current date.
library(lubridate)
current_date = Sys.Date()
quarter(Sys.Date(), with_year= TRUE)
or from the function quarters.Date(Sys.Date()) I can get the quarter, but I could not work out how to add a quarter to the result.
Are there any other packages I can use, or any function in the default packages, to do this?
1) zoo Convert to "yearqtr" class, add 1/4 and if you want the date at the end of the quarter apply as.Date using frac = 1
library(zoo)
today <- Sys.Date() # 2016-01-27
as.Date(as.yearqtr(today) + 1/4, frac = 1)
## [1] "2016-06-30"
Omit frac=1 if you want the start of the quarter. Omit as.Date if you want the "yearqtr" object:
as.yearqtr(today) + 1/4
[1] "2016 Q2"
2) base R. This will give the beginning date of the next quarter with no packages. We use cut to get the beginning of the current quarter, convert to "Date" class, add enough days to get into the next quarter, and apply cut and as.Date again:
as.Date(cut(as.Date(cut(today, "quarter")) + 100, "quarter"))
## [1] "2016-04-01"
If you want the end of the quarter, add enough days to get into the second next quarter and subtract 1 day to get to the end of the prior quarter:
as.Date(cut(as.Date(cut(today, "quarter")) + 200, "quarter")) - 1
## [1] "2016-06-30"
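The base R expression can be wrapped in a small function and checked against fixed dates, including a year rollover. In this sketch, next_quarter_start is a hypothetical name, and the offset of 100 days is taken from the answer above; it works because no quarter is longer than 92 days:

```r
# Hypothetical wrapper around the base R expression for the start of the next quarter
next_quarter_start <- function(d) {
  this_start <- as.Date(cut(d, "quarter"))    # first day of the current quarter
  as.Date(cut(this_start + 100, "quarter"))   # 100 days later is in the next quarter
}

from_jan <- next_quarter_start(as.Date("2016-01-27"))
from_boundary <- next_quarter_start(as.Date("2016-04-01"))
from_dec <- next_quarter_start(as.Date("2016-12-31"))
print(from_jan)
print(from_boundary)
print(from_dec)
```

A date that is itself a quarter start, such as 2016-04-01, is moved on to the following quarter start.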
The above answer does indeed work. If you would like to leverage the lubridate library, you can use the following:
ceiling_date(current_date, "quarter")
This will return the first date of the next quarter. Other helpful functions that are similar are floor_date and round_date. Both are described in the lubridate documentation:
https://cran.r-project.org/web/packages/lubridate/lubridate.pdf.
