Using lubridate, how to calculate the last day of the previous quarter for a given date?
The formula below doesn't seem to work for Nov 3rd, 2014 (other dates work):
library(lubridate)
date = as.POSIXct("2014-11-03")
date - days(day(date)) - months(month(date) %% 3 - 1)
# NA
Interestingly enough, changing the order works:
date - months(month(date) %% 3 - 1) - days(day(date))
# "2014-09-30 UTC"
Here are some possibilities with functions from the packages zoo and timeDate, and from base R. The zoo code was improved by @G.Grothendieck, who also suggested the base alternative (thanks a lot!). I leave the lubridate solution(s) to someone else.
First, use class yearqtr in package zoo to represent the quarterly data. You may then use as.Date.yearqtr and the frac argument "which is a number between 0 and 1 inclusive that indicates the fraction of the way through the period that the result represents. The default is 0 which means the beginning of the period" (see ?yearqtr, and ?yearmon for frac).
Step by step:
library(zoo)
date <- as.Date("2014-11-03")
# current quarter
current_q <- as.yearqtr(date)
current_q
# [1] "2014 Q4"
# first date in current quarter
first_date_current_q <- as.Date(current_q, frac = 0)
first_date_current_q
# [1] "2014-10-01"
# last date in previous quarter
last_date_prev_q <- first_date_current_q - 1
last_date_prev_q
# [1] "2014-09-30"
And a short version by @G.Grothendieck (thanks!):
as.Date(as.yearqtr(date)) - 1
# [1] "2014-09-30"
A nice base R solution by @G.Grothendieck:
as.Date(cut(date, "quarter")) - 1
# [1] "2014-09-30"
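As a rough sketch, the same idea (first day of the current quarter, minus one day) can be applied to a whole vector of dates with base R date-time fields only. The helper name last_day_prev_quarter is made up for illustration and is not part of any package:

```r
# Hypothetical helper (not from the answers above): last day of the previous
# quarter for each element of a Date vector, using base R only.
last_day_prev_quarter <- function(d) {
  lt <- as.POSIXlt(as.Date(d))
  lt$mon <- (lt$mon %/% 3L) * 3L   # first month of the current quarter (0-based)
  lt$mday[] <- 1L                  # first day of that month
  as.Date(lt) - 1                  # step back one day
}

last_day_prev_quarter(c("2014-11-03", "2014-02-15", "2015-07-01"))
# [1] "2014-09-30" "2013-12-31" "2015-06-30"
```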
Another possibility is to use timeFirstDayInQuarter and timeLastDayInQuarter functions in package timeDate:
library(timeDate)
timeLastDayInQuarter(timeFirstDayInQuarter(date) - 1)
# GMT
# [1] [2014-09-30]
I know this is old, but since this question specifically asks for a lubridate solution and I couldn't find working code for it elsewhere, I figured I'd post, keeping in mind that the base R solution is likely the most succinct:
> sampleDate <- ymd('2015-11-11')
> yearsToSample <- years(year(sampleDate) - year(origin))
> yearsToSample
[1] "45y 0m 0d 0H 0M 0S"
> additionalMonths <- months((quarter(sampleDate) - 1) * 3)
> additionalMonths
[1] "9m 0d 0H 0M 0S"
> startOfQuarter <- ymd(origin) + yearsToSample + additionalMonths - days(1)
> startOfQuarter
[1] "2015-09-30 UTC"
Confirmed that this works for dates previous to the origin ('1970-01-01') as well.
Here is a pure lubridate solution for 2021:
> date <- ymd("2014-11-03")
> yq(quarter(date, with_year = TRUE)) - days(1)
[1] "2014-09-30"
Breaking it down:
> # The year and quarter
> quarter(date, with_year = TRUE)
[1] 2014.4
> # The first day of the quarter as a date
> yq(quarter(date, with_year = TRUE))
[1] "2014-10-01"
> # The day before the first day of the quarter as a date
> yq(quarter(date, with_year = TRUE)) - days(1)
[1] "2014-09-30"
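A possible alternative, sketched here on the assumption that your installed lubridate accepts unit = "quarter" in floor_date() (recent releases do). It also shows why the formula in the question returns NA: subtracting the days first lands on 2014-10-31, and one month before that would be the non-existent 2014-09-31.

```r
library(lubridate)

date <- as.Date("2014-11-03")

# The original attempt fails because an intermediate date does not exist:
# 2014-11-03 - days(3) is 2014-10-31, and "2014-09-31" is not a real date.
as.Date("2014-10-31") - months(1)
# [1] NA

# Rounding down to the start of the quarter avoids month arithmetic entirely.
floor_date(date, unit = "quarter") - days(1)
# [1] "2014-09-30"
```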
I'm trying to subtract n months from a date as follows:
maturity <- as.Date("2012/12/31")
m <- as.POSIXlt(maturity)
m$mon <- m$mon - 6
but the resulting date is 01-Jul-2012, and not 30-Jun-2012, as I would expect.
Is there any short way to get such result?
1) seq.Date. Note that June has only 30 days so it cannot give June 31st thus instead it gives July 1st.
seq(as.Date("2012/12/31"), length = 2, by = "-6 months")[2]
## [1] "2012-07-01"
If we knew it was at month end we could do this:
seq(as.Date(cut(as.Date("2012/12/31"), "month")), length=2, by="-5 month")[2]-1
## "2012-06-30"
2) yearmon. Also if we knew it was month end then we could use the "yearmon" class of the zoo package like this:
library(zoo)
as.Date(as.yearmon(as.Date("2012/12/31")) -.5, frac = 1)
## [1] "2012-06-30"
This converts the date to "yearmon" subtracts 6 months (.5 of a year) and then converts it back to "Date" using frac=1 which means the end of the month (frac=0 would mean the beginning of the month). This also has the advantage over the previous solution that it is vectorized automatically, i.e. as.Date(...) could have been a vector of dates.
Note that if "Date" class is only being used as a way of representing months then we can get rid of it altogether and directly use "yearmon" since that models what we want in the first place:
as.yearmon("2012-12") - .5
## [1] "Jun 2012"
3) mondate. A third solution is the mondate package which has the advantage here that it returns the end of the month 6 months ago without having to know that we are month end:
library(mondate)
mondate("2011/12/31") - 6
## mondate: timeunits="months"
## [1] 2011/06/30
This is also vectorized.
4) lubridate. This lubridate answer has been changed in line with changes in the package:
library(lubridate)
as.Date("2012/12/31") %m-% months(6)
## [1] "2012-06-30"
lubridate is also vectorized.
5) sqldf/SQLite
library(sqldf)
sqldf("select date('2012-12-31', '-6 months') as date")
## date
## 1 2012-07-01
or if we knew we were at month end:
sqldf("select date('2012-12-31', '+1 day', '-6 months', '-1 day') as date")
## date
## 1 2012-06-30
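If you want the month-end clamping without any package, a minimal base R sketch could look like the following. The function name subtract_months_clamped is hypothetical, and the approach (compute the first and last day of the target month, then cap the day) is one assumption about what "six months earlier" should mean:

```r
# Hypothetical helper: subtract n months, clamping to the last day of the
# target month when the original day-of-month does not exist there.
subtract_months_clamped <- function(d, n) {
  d  <- as.Date(d)
  lt <- as.POSIXlt(d)

  first <- lt
  first$mday[] <- 1L
  first$mon <- first$mon - as.integer(n)  # out-of-range months are normalized by as.Date()
  first_target <- as.Date(first)          # first day of the target month

  nxt <- as.POSIXlt(first_target)
  nxt$mon <- nxt$mon + 1L
  last_target <- as.Date(nxt) - 1         # last day of the target month

  # Keep the original day-of-month unless it overshoots the month end.
  pmin(first_target + (lt$mday - 1L), last_target)
}

subtract_months_clamped(c("2012-12-31", "2012-03-15"), 6)
# [1] "2012-06-30" "2011-09-15"
```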
You can use the lubridate package for this:
library(lubridate)
maturity <- maturity %m-% months(6)
There is no reason to change the day field, but you can set it back to the last day of that month with:
day(maturity) <- days_in_month(maturity)
lubridate works correctly with such calculations:
library(lubridate)
as.Date("2000-01-01") - days(1) # 1999-12-31
as.Date("2000-03-31") %m-% months(1) # 2000-02-29
but plain period arithmetic returns NA when the resulting date does not exist:
as.Date("2000-02-29") - years(1) # NA, should be 1999-02-28
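For the leap-day case, lubridate's rollback operator %m-% gives the clamped result. A small sketch for comparison:

```r
library(lubridate)

leap_day <- as.Date("2000-02-29")

leap_day - years(1)      # NA: 1999-02-29 does not exist
leap_day %m-% years(1)   # rolls back to the last valid day of the month
# [1] "1999-02-28"
```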
In addition to lubridate, the tidyverse team has released the clock package, which has nice functionality for this:
library(clock)
# sequence of dates
date_build(2018, 1:5, 31, invalid = "previous")
[1] "2018-01-31" "2018-02-28" "2018-03-31" "2018-04-30" "2018-05-31"
When the dates are built, 2018-02-31 is not a valid date. The invalid argument makes explicit what to do in this case: "previous" moves to the last valid day before it.
There is also a series of add_*() functions; in your case you would use add_months(). Again, it has the invalid argument that you can specify:
x <- as.Date("2022-03-31")
# The previous valid moment in time
add_months(x, -1, invalid = "previous")
[1] "2022-02-28"
# The next valid moment in time, 2022-02-31 is not a valid date
add_months(x, -1, invalid = "next")
[1] "2022-03-01"
# Overflow the days. There were 28 days in February 2022, but we
# specified 31. So this overflows 3 days past day 28.
add_months(x, -1, invalid = "overflow")
[1] "2022-03-03"
You can also set invalid to "NA"; if you leave this argument off, an invalid date raises an error.
Technically, you cannot add or subtract one month to every date (you can add or subtract 30 days to any date, but I suppose that's not what you want). I think this is what you are looking for:
> lubridate::ceiling_date(as.Date("2020-01-31"), unit = "month")
[1] "2020-02-01"
> lubridate::floor_date(as.Date("2020-01-31"), unit = "month")
[1] "2020-01-01"
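Building on floor_date(), one way to reach a month end is to round down to the first of a month and step back one day. This is only a sketch of that idea, and the second expression assumes the starting date is itself a month end:

```r
library(lubridate)

x <- as.Date("2020-01-31")
floor_date(x, unit = "month") - days(1)   # last day of the previous month
# [1] "2019-12-31"

# Month end six months before a month-end date: go back five months,
# round down to the first of that month, then step back one day.
maturity <- as.Date("2012-12-31")
floor_date(maturity %m-% months(5), unit = "month") - days(1)
# [1] "2012-06-30"
```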
UPDATE, I just realised that Tung-nguyen also wrote the same method and has a two line version here https://stackoverflow.com/a/44690219/19563460
Keeping this answer here so newbies can see different ways of doing it
In recent versions of R, you can do this easily in base R using seq.Date(). Here are some examples that should work without additional packages:
ANSWER 1: typing directly
maturity <- as.Date("2012/12/31")
seq(maturity, length.out=2, by="-3 months")[2]
# see here for more help
?seq.Date
ANSWER 2: Adding in some flexibility, e.g. 'n' months
maturity <- as.Date("2012/12/31")
n <- 3
bytime <- paste("-",n," months",sep="")
seq(maturity,length.out=2,by=bytime)[2]
ANSWER 3: Make a function
# Here's a little function that will let you add X days/months/weeks
# to any base R date. Commented for new users
#---------------------------------------------------------
# MyFunction
# DateIn, either a date or a string that as.Date can convert into one
# TimeBack, number of units back/forward
# TimeUnit, unit of time e.g. "weeks"/"month"/"days"
# Direction can be "back" or "forward", not case sensitive
#---------------------------------------------------------
MyFunction <- function(DateIn,TimeBack,TimeUnit,Direction="back"){
  #--- Set up the by string
  if(tolower(Direction)=="back"){
    bystring <- paste("-",TimeBack," ",tolower(TimeUnit),sep="")
  }else{
    bystring <- paste(TimeBack," ",tolower(TimeUnit),sep="")
  }
  #--- Return the new date using seq in the base package
  output <- seq(as.Date(DateIn),length.out=2,by=bystring)[2]
  return(output)
}
# EXAMPLES
MyFunction("2000-02-29",3,"months","forward")
Answer <- MyFunction(DateIn="2002-01-01",TimeBack=14,
TimeUnit="weeks",Direction="back")
print(Answer)
maturity <- as.Date("2012/12/31")
n <- 3
MyFunction(DateIn=maturity,TimeBack=n,TimeUnit="months",Direction="back")
ANSWER 4: I quite like my little function, so I just uploaded it to my mini personal R package.
This is freely available, so now technically the answer is use the JumpDate function from the Greatrex.Functions package
Can't guarantee it'll work forever and no support available, but you're welcome to use it.
# Install/load my package
install.packages("remotes")
remotes::install_github('hgreatrex/Greatrex.Functions',force=TRUE)
library(Greatrex.Functions)
# run it
maturity <- as.Date("2012/12/31")
n <- 3
Answer <- JumpDate(DateIn=maturity,TimeBack=n,TimeUnit="months",
Direction="back",verbose=TRUE)
print(Answer)
JumpDate("2000-02-29",3,"months","forward")
# Help file here
?Greatrex.Functions::JumpDate
You can see how I made the function/package here:
https://github.com/hgreatrex/Greatrex.Functions/blob/master/R/JumpDate.r
With nice instructions here on making your own mini compilation of functions.
http://web.mit.edu/insong/www/pdf/rpackage_instructions.pdf
and here
How do I insert a new function into my R package?
Hope that helps! I hope it's also useful to see the different levels of designing an answer to a coding problem, depending on how often you need it and the level of flexibility you need.
I have a set of times in milliseconds that I want to convert to hh:mm. An example dataset would be:
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
I get this with manual calculation but it does not provide me the correct value for the minutes.
> data/1000/60/60
[1] 1.649028 1.510694 1.068167 2.066639 1.522278 1.917750
I tried this
format(as.POSIXct(Sys.Date())+data, "%H:%M")
[1] "12:01" "17:41" "07:10" "21:38" "05:16" "16:45"
but that is not even close. Any thoughts on that?
Thanks!
hrs = data/(60 * 60 * 1000)
mins = (hrs %% 1) * 60
secs = (mins %% 1) * 60
paste(trunc(hrs), trunc(mins), round(secs, 2), sep = ":")
#[1] "1:38:56.5" "1:30:38.5" "1:4:5.4" "2:3:59.9" "1:31:20.2" "1:55:3.9"
Also,
library(lubridate)
seconds_to_period(data/1000)
#[1] "1H 38M 56.5S" "1H 30M 38.5S" "1H 4M 5.40000000000009S"
#[4] "2H 3M 59.8999999999996S" "1H 31M 20.1999999999998S" "1H 55M 3.89999999999964S"
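If only a zero-padded hh:mm string is needed, integer division avoids both the floating-point noise and any time-zone handling. A minimal sketch (the function name ms_to_hhmm is made up here, and seconds are truncated rather than rounded):

```r
# Hypothetical helper: milliseconds -> "hh:mm", truncating leftover seconds.
ms_to_hhmm <- function(ms) {
  total_min <- ms %/% 60000                      # whole minutes
  sprintf("%02d:%02d",
          as.integer(total_min %/% 60),          # hours
          as.integer(total_min %% 60))           # minutes within the hour
}

data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
ms_to_hhmm(data)
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
```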
The zero point we can get by doing:
strftime(as.POSIXlt.numeric(0, format="%OS", origin="1970-01-01") - 7200, format="%R")
# [1] "00:00"
Accordingly:
t.adj <- 0
res <- strftime(as.POSIXlt.numeric(v/1000, format="%OS", origin="1970-01-01") - t.adj*3600,
format="%R", tz="GMT")
res
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
The date doesn't matter, since:
class(res)
# [1] "character"
Note that this solution might depend on your Sys.getlocale("LC_TIME"). The solution above includes an optional hour adjustment, t.adj*; in my personal locale it is set to zero to yield the right values.
Data
v <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
*To automate the localization you may want to look into the answers to this question.
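One way to sidestep the manual hour adjustment is to pin both the conversion and the formatting to UTC. This is a sketch that assumes all durations are shorter than 24 hours, since the date part is discarded:

```r
v <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)

# Treat the duration as seconds since the epoch in UTC, then print only
# hours and minutes, also in UTC, so the local time zone never enters.
res_utc <- format(as.POSIXct(v / 1000, origin = "1970-01-01", tz = "UTC"),
                  "%H:%M", tz = "UTC")
res_utc
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
```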
I tried to generate a sequence of dates between two dates. By searching old posts, I found a very nice solution using seq.Date.
For example:
> seq.Date(as.Date("2016/1/15"), as.Date("2016/5/1"), by = "month")
[1] "2016-01-15" "2016-02-15" "2016-03-15" "2016-04-15"
The above function yields a very nice solution. However, it doesn't work when the start date is the 30th or 31st of January.
> seq.Date(as.Date("2016/1/30"), as.Date("2016/5/1"), by = "month")
[1] "2016-01-30" "2016-03-01" "2016-03-30" "2016-04-30"
The second date jumps to March instead of being capped at 29 February. I couldn't find a workaround for this.
Here's an approach that also works in other cases:
library(lubridate)
fun <- function(from, to, by) {
  mySeq <- seq.Date(as.Date(from), as.Date(to), by = by)
  as.Date(sapply(mySeq, function(d) d + 1 - which.max(day(d - 0:3))), origin = "1970-01-01")
}
fun("2016/1/30", "2016/5/1", "month")
# [1] "2016-01-30" "2016-02-29" "2016-03-30" "2016-04-30"
fun("2017/1/31", "2017/5/1", "month")
# [1] "2017-01-31" "2017-02-28" "2017-03-31" "2017-04-30"
fun("2017/1/29", "2017/5/1", "month")
# [1] "2017-01-29" "2017-02-28" "2017-03-29" "2017-04-29"
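For each date in the sequence, fun looks at that date and the three days before it and keeps the one with the largest day of the month. A date that overflowed into the start of the next month is therefore pulled back to the last day of the intended month.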
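The same capping can be sketched in base R without correcting overflowed dates afterwards: build the sequence from the first of each month, then add the day offset and cap it at the month end. The name month_seq_clamped and its (from, n) interface are assumptions made for this illustration:

```r
# Hypothetical helper: n monthly dates starting at `from`, each capped at the
# last day of its month.
month_seq_clamped <- function(from, n) {
  from <- as.Date(from)
  lt <- as.POSIXlt(from)

  first <- lt
  first$mday[] <- 1L
  # First day of each month, plus one extra month to derive the month ends.
  firsts <- seq(as.Date(first), by = "month", length.out = n + 1)
  month_ends <- firsts[-1] - 1

  pmin(firsts[seq_len(n)] + (lt$mday - 1L), month_ends)
}

month_seq_clamped("2016-01-30", 4)
# [1] "2016-01-30" "2016-02-29" "2016-03-30" "2016-04-30"
```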
With lubridate package
library('lubridate')
pmin(
  ymd('2018-01-30') + months(0:11), # NA where month goes over
  ymd('2018-01-01') + months(1:12) - days(1), # last day of month
  na.rm = TRUE
)
[1] "2018-01-30" "2018-02-28" "2018-03-30"
[4] "2018-04-30" "2018-05-30" "2018-06-30"
[7] "2018-07-30" "2018-08-30" "2018-09-30"
[10] "2018-10-30" "2018-11-30" "2018-12-30"
I am trying to convert a column of dates into Date objects in R, but I can't seem to get the desired results. These individuals have birth dates before January 1, 1970, so when I use as.Date R converts a date like 1/12/54, for example, to 2054-01-12. How can I work around this? Thanks so much.
No need for add-on packages, base R is fine. But you need to specify the century:
R> as.Date("1954-01-12")
[1] "1954-01-12"
R>
If you need non-default formats, just specify them:
R> as.Date("19540112", "%Y%m%d")
[1] "1954-01-12"
R>
Edit: In case your data really comes in using the %y format, and you happen to make the policy decision that the 1900s are needed, here is one base R way of doing it:
R> d <- as.Date("540112", "%y%m%d")
R> dlt <- as.POSIXlt(d)
R> dlt$year <- dlt$year - 100
R> as.Date(dlt)
[1] "1954-01-12"
R>
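The same shift can be applied to a whole column, touching only the values that were parsed into the 2000s. This is a sketch under the assumption that every two-digit year belongs to the 1900s; the function name parse_two_digit_1900s is hypothetical:

```r
# Hypothetical helper: parse two-digit years and force them into the 1900s.
parse_two_digit_1900s <- function(s, format = "%m/%d/%y") {
  d  <- as.Date(s, format = format)
  lt <- as.POSIXlt(d)
  # POSIXlt stores the year as "years since 1900", so 100 and above means 2000+.
  too_late <- !is.na(d) & lt$year >= 100
  lt$year[too_late] <- lt$year[too_late] - 100L
  as.Date(lt)
}

parse_two_digit_1900s(c("1/12/54", "1/12/74"))
# [1] "1954-01-12" "1974-01-12"
```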
If everything is in the 1900s, it's a one-liner: format the date with a two-digit year at the start, slap a 19 on the front, and convert to a date again. Man, this would look cool with some %>% stuff:
s = c("1/12/54","1/12/74")
as.Date(format(as.Date(s,format="%d/%m/%y"), "19%y%m%d"), "%Y%m%d")
# [1] "1954-12-01" "1974-12-01"
If years from "69" to "99" are 1800s, then here's another one-liner:
library(dplyr) # for pipe operator:
s %>% as.Date(format="%d/%m/%y") %>%
format("%y%m%d") %>%
(function(d){
paste0(ifelse(d>700101,"18","19"),d)
}) %>%
as.Date("%Y%m%d")
## [1] "1954-12-01" "1874-12-01"
Note: this is not thoroughly tested, so there might be some off-by-one errors, or I may have mixed up months and days (which is why you should be ISO 8601 compliant).
I would do:
library(lubridate)
x <- as.Date("1/12/54", format = "%m/%d/%y")
year(x) <- 1900 + year(x) %% 100
> x
[1] "1954-01-12"
Are there any built-in functions that can be used on a data frame to generate variables from a Date column, such as day of the week, month, year, week of the year, etc. in R?
The weekdays, months, and quarters functions in the base package generate text output. I am looking for numerical output to denote that 3/5/2012, for example, is a Friday, the 3rd day of the month, in the 1st week of the month, and the 63rd day of the year, etc.
You get a few of those just from POSIXlt, with its weird conventions: the year needs 1900 added, and the month is on the 0 to 11 range, but you do get weekday and day of the year.
R> dd <- as.Date("2012-05-03")
R> as.POSIXlt(dd)
[1] "2012-05-03 UTC"
Then
R> unclass(as.POSIXlt(dd))
$sec
[1] 0
$min
[1] 0
$hour
[1] 0
$mday
[1] 3
$mon
[1] 4
$year
[1] 112
$wday
[1] 4
$yday
[1] 123
$isdst
[1] 0
attr(,"tzone")
[1] "UTC"
R>
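Those fields can be turned into ordinary numeric columns of a data frame. A minimal sketch, in which the data frame df and its column names are invented for illustration:

```r
df <- data.frame(date = as.Date(c("2012-05-03", "2012-03-05")))

lt <- as.POSIXlt(df$date)
df$year  <- lt$year + 1900L   # stored as years since 1900
df$month <- lt$mon + 1L       # stored as 0-11
df$mday  <- lt$mday           # day of the month, 1-31
df$wday  <- lt$wday           # 0 = Sunday, ..., 6 = Saturday
df$yday  <- lt$yday + 1L      # stored as 0-365

df$wday
# [1] 4 1
df$yday
# [1] 124  65
```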
You can use the lubridate package to do a lot with dates.
From the help file: Lubridate provides tools that make it easier to parse and manipulate dates.
For example:
> library(lubridate)
> d <- today()
> d
[1] "2014-04-29"
> day(d)
[1] 29
> month(d)
[1] 4
> year(d)
[1] 2014
> week(d)
[1] 18
> weekdays(d)
[1] "Tuesday"
> days_in_month(d)
Apr
30
I prefer it to the built-in functions because it has a lot of date splicing, casting and arithmetic functions.
There are a couple of options that I can think of.
First, you could convert the column to class POSIXlt with as.POSIXlt() so that you can subset with things like df$date$yday. A POSIXlt object stores the elements of each date as a list underneath, and they can be accessed that way.
Also, the package lubridate has functions like
yday(x)
wday(x)
mday(x)