Why does subtracting months from a lubridate date give inconsistent results? [duplicate] - r

I'm trying to subtract n months from a date as follows:
maturity <- as.Date("2012/12/31")
m <- as.POSIXlt(maturity)
m$mon <- m$mon - 6
but the resulting date is 01-Jul-2012, and not 30-Jun-2012, as I would expect.
Is there a short way to get that result?

1) seq.Date. Note that June has only 30 days, so seq cannot return June 31st and rolls over to July 1st instead.
seq(as.Date("2012/12/31"), length = 2, by = "-6 months")[2]
## [1] "2012-07-01"
If we knew it was at month end we could do this:
seq(as.Date(cut(as.Date("2012/12/31"), "month")), length=2, by="-5 month")[2]-1
## "2012-06-30"
2) yearmon. Also if we knew it was month end then we could use the "yearmon" class of the zoo package like this:
library(zoo)
as.Date(as.yearmon(as.Date("2012/12/31")) -.5, frac = 1)
## [1] "2012-06-30"
This converts the date to "yearmon", subtracts 6 months (.5 of a year), and then converts it back to "Date" using frac=1, which means the end of the month (frac=0 would mean the beginning of the month). This also has the advantage over the previous solution that it is vectorized automatically, i.e. as.Date(...) could have been a vector of dates.
Note that if "Date" class is only being used as a way of representing months then we can get rid of it altogether and directly use "yearmon" since that models what we want in the first place:
as.yearmon("2012-12") - .5
## [1] "Jun 2012"
3) mondate. A third solution is the mondate package which has the advantage here that it returns the end of the month 6 months ago without having to know that we are month end:
library(mondate)
mondate("2011/12/31") - 6
## mondate: timeunits="months"
## [1] 2011/06/30
This is also vectorized.
4) lubridate. This lubridate answer has been changed in line with changes in the package:
library(lubridate)
as.Date("2012/12/31") %m-% months(6)
## [1] "2012-06-30"
lubridate is also vectorized.
5) sqldf/SQLite
library(sqldf)
sqldf("select date('2012-12-31', '-6 months') as date")
## date
## 1 2012-07-01
or if we knew we were at month end:
sqldf("select date('2012-12-31', '+1 day', '-6 months', '-1 day') as date")
## date
## 1 2012-06-30
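For completeness, the month-end trick from 1) can be vectorized in base R without seq(). This is only a sketch: month_end_back is a made-up name, not a function from any package, and it assumes every input really is the last day of its month.

```r
# Hypothetical helper: move month-end dates back n months,
# landing on the last day of the target month.
month_end_back <- function(d, n) {
  d <- as.Date(d)
  # Assumption: every input is a month end, i.e. the next day is the 1st
  stopifnot(all(format(d + 1, "%d") == "01"))
  lt <- as.POSIXlt(d)
  lt$mday <- rep_len(1L, length(d))  # first day of each input month
  lt$mon <- lt$mon - n + 1           # first day of the month AFTER the target month
  as.Date(lt) - 1                    # one day earlier is the target month's last day
}

month_end_back(c("2012-12-31", "2012-08-31", "2012-02-29"), 6)
## [1] "2012-06-30" "2012-02-29" "2011-08-31"
```

The helper stops with an error if any input is not a month end, so it cannot silently return a shifted day.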

You can use the lubridate package for this:
library(lubridate)
maturity <- maturity %m-% months(6)
%m-% leaves the day field alone unless that day does not exist in the target month, in which case it rolls back to the month's last day.
If you always want the last day of the month, you can set the day field afterwards:
day(maturity) <- days_in_month(maturity)

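lubridate handles day arithmetic as expected:
library(lubridate)
as.Date("2000-01-01") - days(1) # 1999-12-31
but plain period arithmetic returns NA when the resulting date does not exist (this answer originally reported 2000-02-29 for the first line below; current lubridate versions return NA):
as.Date("2000-03-31") - months(1) # NA, since 2000-02-31 does not exist
as.Date("2000-02-29") - years(1) # NA, since 1999-02-29 does not exist
Use %m-% to roll back to the last valid day instead:
as.Date("2000-03-31") %m-% months(1) # 2000-02-29
as.Date("2000-02-29") %m-% years(1) # 1999-02-28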
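If you want to stay in base R and still end up on February 28th for the leap-day case, one option is to let POSIXlt roll the nonexistent date over and then step back one day. A rough sketch; shift_years is a name I made up for illustration:

```r
# Hypothetical helper: shift dates by n years (n may be negative),
# mapping Feb 29 to Feb 28 when the target year is not a leap year.
shift_years <- function(d, n) {
  lt <- as.POSIXlt(as.Date(d))
  target_mon <- lt$mon
  lt$year <- lt$year + n
  out <- as.Date(lt)                  # Feb 29 rolls over to Mar 1 in a non-leap year
  rolled <- which(as.POSIXlt(out)$mon != target_mon)
  out[rolled] <- out[rolled] - 1      # step back to Feb 28
  out
}

shift_years(c("2000-02-29", "2010-03-17"), -1)
## [1] "1999-02-28" "2009-03-17"
```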

tidyverse has added the clock package in addition to the lubridate package that has nice functionality for this:
library(clock)
# sequence of dates
date_build(2018, 1:5, 31, invalid = "previous")
[1] "2018-01-31" "2018-02-28" "2018-03-31" "2018-04-30" "2018-05-31"
When the date is sequenced, 2018-02-31 is not a valid date. The invalid argument makes explicit what to do in this case: with "previous", go back to the last valid date before it.
There is also a series of add functions; in your case you would use add_months. Again, it has the invalid argument that you can specify:
x <- as.Date("2022-03-31")
# The previous valid moment in time
add_months(x, -1, invalid = "previous")
[1] "2022-02-28"
# The next valid moment in time, 2022-02-31 is not a valid date
add_months(x, -1, invalid = "next")
[1] "2022-03-01"
# Overflow the days. There were 28 days in February 2022, but we
# specified 31. So this overflows 3 days past day 28.
add_months(x, -1, invalid = "overflow")
[1] "2022-03-03"
You can also set invalid to "NA"; if you leave the argument off, an invalid date raises an error.
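To make the three policies concrete, here is how they could be emulated in base R. This is a sketch of the idea only, not how clock implements it; add_months_base and its arguments are hypothetical names.

```r
# Hypothetical base R emulation of clock's "previous", "next" and "overflow"
add_months_base <- function(x, n, invalid = c("previous", "next", "overflow")) {
  invalid <- match.arg(invalid)
  lt <- as.POSIXlt(as.Date(x))
  day <- lt$mday
  lt$mday <- rep_len(1L, length(day))
  lt$mon <- lt$mon + n
  first <- as.Date(lt)                      # first day of the target month
  nx <- as.POSIXlt(first)
  nx$mon <- nx$mon + 1
  next_first <- as.Date(nx)                 # first day of the following month
  n_days <- as.integer(next_first - first)  # number of days in the target month
  if (invalid == "previous") {
    first + pmin(day, n_days) - 1           # clamp to the last valid day
  } else if (invalid == "next") {
    out <- first + day - 1
    bad <- which(day > n_days)
    out[bad] <- next_first[bad]             # jump to the first day after the gap
    out
  } else {
    first + day - 1                         # let the extra days spill over
  }
}

add_months_base("2022-03-31", -1, invalid = "previous")  # "2022-02-28"
add_months_base("2022-03-31", -1, invalid = "next")      # "2022-03-01"
add_months_base("2022-03-31", -1, invalid = "overflow")  # "2022-03-03"
```

For dates whose day exists in the target month, all three settings give the same result.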

Technically, you cannot add or subtract 1 month for every date (you can always add or subtract 30 days, but I suppose that's not what you want). I think this is what you are looking for:
> lubridate::ceiling_date(as.Date("2020-01-31"), unit = "month")
[1] "2020-02-01"
> lubridate::floor_date(as.Date("2020-01-31"), unit = "month")
[1] "2020-01-01"
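If you would rather not load lubridate just for this, the same month boundaries can be computed in base R. A small sketch; first_of_month and last_of_month are names I picked, not existing functions:

```r
# Hypothetical helpers for month boundaries in base R
first_of_month <- function(d) as.Date(format(as.Date(d), "%Y-%m-01"))
last_of_month <- function(d) {
  lt <- as.POSIXlt(first_of_month(d))
  lt$mon <- lt$mon + 1          # first day of the following month
  as.Date(lt) - 1               # one day earlier is the month's last day
}

d <- as.Date("2020-01-31")
first_of_month(d)                      # "2020-01-01"
last_of_month(d)                       # "2020-01-31"
last_of_month(first_of_month(d) - 1)   # "2019-12-31", the previous month's end
```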

UPDATE, I just realised that Tung-nguyen also wrote the same method and has a two line version here https://stackoverflow.com/a/44690219/19563460
Keeping this answer here so newbies can see different ways of doing it
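With the R updates, you can now do this easily in base R using seq.Date(). Here are some examples that should work without additional packages. Note that, as in answer 1) above, seq() rolls a nonexistent day over into the next month, so the 2012/12/31 examples below return 2012-10-01 rather than 2012-09-30.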
ANSWER 1: typing directly
maturity <- as.Date("2012/12/31")
seq(maturity, length.out=2, by="-3 months")[2]
# see here for more help
?seq.Date
ANSWER 2: Adding in some flexibility, e.g. 'n' months
maturity <- as.Date("2012/12/31")
n <- 3
bytime <- paste("-",n," months",sep="")
seq(maturity,length.out=2,by=bytime)[2]
ANSWER 3: Make a function
# Here's a little function that will let you add X days/months/weeks
# to any base R date. Commented for new users
#---------------------------------------------------------
# MyFunction
# DateIn, either a date or a string that as.Date can convert into one
# TimeBack, number of units back/forward
# TimeUnit, unit of time e.g. "weeks"/"month"/"days"
# Direction can be "back" or "forward", not case sensitive
#---------------------------------------------------------
MyFunction <- function(DateIn,TimeBack,TimeUnit,Direction="back"){
  #--- Set up the by string
  if(tolower(Direction)=="back"){
    bystring <- paste("-",TimeBack," ",tolower(TimeUnit),sep="")
  }else{
    bystring <- paste(TimeBack," ",tolower(TimeUnit),sep="")
  }
  #--- Return the new date using seq in the base package
  output <- seq(as.Date(DateIn),length.out=2,by=bystring)[2]
  return(output)
}
# EXAMPLES
MyFunction("2000-02-29",3,"months","forward")
Answer <- MyFunction(DateIn="2002-01-01",TimeBack=14,
TimeUnit="weeks",Direction="back")
print(Answer)
maturity <- as.Date("2012/12/31")
n <- 3
MyFunction(DateIn=maturity,TimeBack=n,TimeUnit="months",Direction="back")
ANSWER 4: I quite like my little function, so I just uploaded it to my mini personal R package.
This is freely available, so now technically the answer is use the JumpDate function from the Greatrex.Functions package
Can't guarantee it'll work forever and no support available, but you're welcome to use it.
# Install/load my package
install.packages("remotes")
remotes::install_github('hgreatrex/Greatrex.Functions',force=TRUE)
library(Greatrex.Functions)
# run it
maturity <- as.Date("2012/12/31")
n <- 3
Answer <- JumpDate(DateIn=maturity,TimeBack=n,TimeUnit="months",
Direction="back",verbose=TRUE)
print(Answer)
JumpDate("2000-02-29",3,"months","forward")
# Help file here
?Greatrex.Functions::JumpDate
You can see how I made the function/package here:
https://github.com/hgreatrex/Greatrex.Functions/blob/master/R/JumpDate.r
With nice instructions here on making your own mini compilation of functions.
http://web.mit.edu/insong/www/pdf/rpackage_instructions.pdf
and here
How do I insert a new function into my R package?
Hope that helps! I hope it's also useful to see the different levels of designing an answer to a coding problem, depending on how often you need it and the level of flexibility you need.
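One more variation on MyFunction above, in case it is useful: match.arg() can validate the unit and direction so that a typo fails loudly instead of producing a strange by string. This is just a sketch with a made-up name (jump_date), and unlike MyFunction it is case sensitive.

```r
# Hypothetical variant of MyFunction with argument validation
jump_date <- function(date_in, n,
                      unit = c("days", "weeks", "months", "years"),
                      direction = c("back", "forward")) {
  unit <- match.arg(unit)
  direction <- match.arg(direction)
  step <- if (direction == "back") -n else n
  # seq.Date accepts strings such as "-3 months" or "14 weeks"
  seq(as.Date(date_in), length.out = 2, by = paste(step, unit))[2]
}

jump_date("2000-02-29", 3, "months", "forward")   # "2000-05-29"
jump_date("2002-01-01", 14, "weeks", "back")      # "2001-09-25"
jump_date("2012-12-31", 3, "months")              # "2012-10-01": Sep 31 rolls over
```

The last call shows that this inherits the rollover behaviour of seq(), so it does not by itself give month-end to month-end results.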

Related

Difference of Dates in R [duplicate]

I have a date in R, e.g.:
dt = as.Date('2010/03/17')
I would like to subtract 2 years from this date, without worrying about leap years and such issues, getting as.Date('2008-03-17').
How would I do that?
With lubridate
library(lubridate)
ymd("2010/03/17") - years(2)
The easiest thing to do is to convert it into POSIXlt and subtract 2 from the year component.
> d <- as.POSIXlt(as.Date('2010/03/17'))
> d$year <- d$year-2
> as.Date(d)
[1] "2008-03-17"
See this related question: How to subtract days in R?.
You could use seq:
R> dt = as.Date('2010/03/17')
R> seq(dt, length=2, by="-2 years")[2]
[1] "2008-03-17"
If leap days are to be taken into account then I'd recommend using this lubridate function to subtract months, as other methods will return either March 1st or NA:
> library(lubridate)
> dt %m-% months(12*2)
[1] "2008-03-17"
# Try with leap day
> leapdt <- as.Date('2016/02/29')
> leapdt %m-% months(12*2)
[1] "2014-02-28"
Same answer as the one by rcs, but able to operate on a vector (in reply to MichaelChirico; I can't comment because I don't have enough rep):
R> unlist(lapply(c("2015-12-01", "2016-12-01"),
function(x) { return(as.character(seq(as.Date(x), length=2, by="-1 years")[2])) }))
[1] "2014-12-01" "2015-12-01"
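The version above returns character strings. If you want to keep the Date class, a possible variation (only a sketch, using nothing beyond base R) is to combine the pieces with c() instead of unlist():

```r
dates <- as.Date(c("2015-12-01", "2016-12-01"))

# One seq() call per element, then combine the Date results with c()
shifted <- do.call(c, lapply(seq_along(dates), function(i) {
  seq(dates[i], length.out = 2, by = "-1 year")[2]
}))

shifted
## [1] "2014-12-01" "2015-12-01"
class(shifted)
## [1] "Date"
```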
This way seems to do the job as well, but only when no leap day gets in the way; note the second result below:
dt = as.Date("2010/03/17")
dt-365*2
[1] "2008-03-17"
as.Date("2008/02/29")-365*2
## [1] "2006-03-01"
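To see why fixed 365-day years are fragile, compare the two approaches on a span that contains a leap day (here 2012-02-29). A short base R illustration:

```r
d <- as.Date("2012-03-17")

# Day arithmetic: 730 days back is one day short, because the span
# from 2010-03-17 to 2012-03-17 is 731 days long
d - 365 * 2
## [1] "2010-03-18"

# Calendar arithmetic keeps the month and day
seq(d, length.out = 2, by = "-2 years")[2]
## [1] "2010-03-17"
```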
library(stringr) # for str_split
cur_date <- str_split(as.character(Sys.Date()), pattern = "-")
cur_yr <- cur_date[[1]][1]
cur_month <- cur_date[[1]][2]
cur_day <- cur_date[[1]][3]
new_year <- as.integer(cur_yr) - 2 # two years before the current year
new_date <- paste(new_year, cur_month, cur_day, sep="-")
Using base R, you can do the following without installing any package.
1) Convert your character string to Date, specifying the input format in the second argument so R can interpret it correctly.
dt = as.Date('2010/03/17',"%Y/%m/%d")
NOTE: If you now look at your environment tab, you will see dt with the value "2010-03-17" (year-month-day separated by "-", not by "/").
2) Specify how many years to subtract.
years_substract=2
3) Use paste() combined with format() to keep the month and day and subtract 2 years from the year of your original date. format() extracts the part of the date named by its second argument.
dt_substract_2years<-
as.Date(paste(as.numeric(format(dt,"%Y"))-years_substract,format(dt,"%m"),format(dt,"%d"),sep = "-"))
NOTE1: We used paste() to concatenate the date components with "-" as the separator (sep = "-"), since that is R's default separator for dates.
NOTE2: We also used as.numeric() to convert the year from character to numeric.

Converting non-standard date format strings ("April-20") to date objects R

I have a vector of date strings in the form month_name-2_digit_year i.e.
a = rbind("April-21", "March-21", "February-21", "January-21")
I'm trying to convert that vector into a vector of date objects. I'm aware this question is very similar to Convert non-standard date format to date in R, posted some years ago, but unfortunately that has not answered my question.
I have tried the following as.Date() calls, but each just returns a vector of NA:
b = as.Date(a, format = "%B-%y")
b = as.Date(a, format = "%B%y")
b = as.Date(a, "%B-%y")
b = as.Date(a, "%B%y")
I also attempted to do it using the convertToDate function from the openxlsx package:
b = convertToDate(a, format = "%B-%y")
I have also tried all the above but using a single character string rather than a vector, but that produced the same issue.
I'm a little lost as to why this isn't working, as this format has worked in reverse earlier in my script (that is, I had a date object already in dd-mm-yyyy format and converted it to month_name-yy using %B-%y). Is there another way to go from string to date when the string is in a non-standard date format (anything other than dd-mm-yyyy, or mm-dd-yy if you're in the US)?
For the record my R locales are all UK and english.
Thanks in advance.
A Date must have all three of day, month and year. Convert to yearmon class which requires only month and year and then to Date as in (1) and (2) below or add the day as in (3).
(1) and (3) give first of month and (2) gives the end of the month.
(3) uses only functions from base R.
Also consider not converting to Date at all but just use yearmon objects instead since they directly represent a year and month which is what the input represents.
library(zoo)
# test input
a <- c("April-21", "March-21", "February-21", "January-21")
# 1
as.Date(as.yearmon(a, "%B-%y"))
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# 2
as.Date(as.yearmon(a, "%B-%y"), frac = 1)
## [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
# 3
as.Date(paste(1, a), "%d %B-%y")
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
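One caveat with %B is that it matches month names in the current locale. If the code may run under a non-English locale, a locale-independent alternative is to look the month up in R's built-in month.name constant. A sketch, assuming every string has the form MonthName-yy with an English month name and a year in the 2000s; parse_month_year and its century argument are made-up names:

```r
# Hypothetical helper: parse "April-21" style strings without relying on %B
parse_month_year <- function(x, century = 2000) {
  parts <- strsplit(x, "-", fixed = TRUE)
  mon <- match(vapply(parts, `[`, character(1), 1), month.name)  # 1..12, NA if unknown
  yr <- century + as.integer(vapply(parts, `[`, character(1), 2))
  as.Date(ISOdate(yr, mon, 1))   # first of the month; NA for unknown month names
}

a <- c("April-21", "March-21", "February-21", "January-21")
parse_month_year(a)
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
```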
In addition to zoo, which @G. Grothendieck mentioned, you can also use clock or lubridate.
clock supports a variable precision calendar type called year_month_day. In this case you'd want "month" precision, then you can set the day to whatever you'd like and convert back to Date.
library(clock)
x <- c("April-21", "March-21", "February-21", "January-21")
ymd <- year_month_day_parse(x, format = "%B-%y", precision = "month")
ymd
#> <year_month_day<month>[4]>
#> [1] "2021-04" "2021-03" "2021-02" "2021-01"
# First of month
as.Date(set_day(ymd, 1))
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# End of month
as.Date(set_day(ymd, "last"))
#> [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
The simplest solution may be to use lubridate::my(), which parses strings in the order of "month then year". That assumes that you want the first day of the month, which may or may not be correct for you.
library(lubridate)
x <- c("April-21", "March-21", "February-21", "January-21")
# Assumes first of month
my(x)
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"

How to convert a character column with nonexistent dates to Date in R? (similar to Converting character column to Date in r)

ymd("2011-11-31")
All formats failed to parse. No formats found.
[1] NA
November 2011 has 30 days, not 31, so ymd fails to parse it.
My data has some nonexistent dates like this in its date column, and I want to learn an elegant way to handle them.
Is there a package or function that turns such a value into "2011-12-01"?
Not that I know of, but you could define your own function to handle it.
Here I take the year-month portion of the date, start from the first of that month, and add the day number so that any excess wraps into the next month (or even year) if required.
# three invalid dates, one valid date
x <- c("2011-11-31", "2000-04-31", "2010-01-10", "2011-12-32")
parse_bad_dates <- function(x) {
as.Date(paste(substr(x, 1, 7), "1"), format="%Y-%m %d") +
as.numeric(substr(x, 9, 10)) - 1
}
parse_bad_dates(x)
#[1] "2011-12-01" "2000-05-01" "2010-01-10" "2012-01-01"
A similar answer, but one that also rolls invalid months over into the next year:
library(lubridate)
d <- c("2011-11-31",'2011-13-04','2011-12-32')
parse_false_date <- function(d) {
x <- strcapture("(\\d{4})-(\\d{2})-(\\d{2})", d,
data.frame(y=integer(),m=integer(),d=integer()))
make_date(x$y)+months(x$m-1)+days(x$d-1)
}
parse_false_date(d)
#> [1] "2011-12-01" "2012-01-04" "2012-01-01"
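Before repairing anything, it can help to know which entries are affected. One way to flag them in base R is a round trip: roll the date as above, format it back, and compare with the input. A sketch under the assumption that all strings are in YYYY-MM-DD form with a valid month; is_real_date is a hypothetical name:

```r
# Hypothetical helper: TRUE where the string names a date that actually exists
is_real_date <- function(x) {
  first <- as.Date(paste0(substr(x, 1, 7), "-01"))       # first of the stated month
  rolled <- first + as.numeric(substr(x, 9, 10)) - 1     # wraps if the day is too large
  format(rolled, "%Y-%m-%d") == x                        # unchanged only for real dates
}

x <- c("2011-11-31", "2000-04-31", "2010-01-10", "2011-12-32")
is_real_date(x)
## [1] FALSE FALSE  TRUE FALSE
```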

Converting dates before January 1, 1970 in R

I am trying to convert a column of dates into Date objects in R, but I can't seem to get the desired results. These individuals have birth dates before January 1, 1970, so when I use as.Date R converts a date like 1/12/54, for example, to 2054-01-12. How can I work around this? Thanks so much.
No need for add-on packages, base R is fine. But you need to specify the century:
R> as.Date("1954-01-12")
[1] "1954-01-12"
R>
If you need non-default formats, just specify them:
R> as.Date("19540112", "%Y%m%d")
[1] "1954-01-12"
R>
Edit: In case your data really comes in using the %y format, and you make the policy decision that the years belong in the 1900s, here is one base R way of doing it:
R> d <- as.Date("540112", "%y%m%d")
R> dlt <- as.POSIXlt(d)
R> dlt$year <- dlt$year - 100
R> as.Date(dlt)
[1] "1954-01-12"
R>
If everything is in the 1900s, it's a one-liner: format the parsed date with a two-digit year, put a 19 on the front, and convert back to a date. Man, this would look cool with some %>% stuff:
s = c("1/12/54","1/12/74")
as.Date(format(as.Date(s,format="%d/%m/%y"), "19%y%m%d"), "%Y%m%d")
# [1] "1954-12-01" "1974-12-01"
If years from "70" to "99" are 1800s, then here's another one-liner:
library(dplyr) # for pipe operator:
s %>% as.Date(format="%d/%m/%y") %>%
format("%y%m%d") %>%
(function(d){
paste0(ifelse(d>700101,"18","19"),d)
}) %>%
as.Date("%Y%m%d")
## [1] "1954-12-01" "1874-12-01"
Note: not thoroughly tested, so there might be some off-by-one errors, or I may have mixed up months and days, which is why you should be ISO 8601 compliant.
I would do:
library(lubridate)
x <- as.Date("1/12/54", format = "%m/%d/%y")
year(x) <- 1900 + year(x) %% 100
> x
[1] "1954-01-12"
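A more general take on the same idea: %y maps 00-68 to 20xx and 69-99 to 19xx, so any parsed date that lands in the future can be pushed back a century, which suits birth dates. This is a sketch only; fix_century and its cutoff argument are names invented here, and the rule assumes nobody in the data is over 100 years old.

```r
# Hypothetical helper: move dates later than `cutoff` back by 100 years
fix_century <- function(d, cutoff = Sys.Date()) {
  lt <- as.POSIXlt(d)
  future <- which(d > cutoff)                 # NA dates are skipped by which()
  lt$year[future] <- lt$year[future] - 100
  as.Date(lt)
}

x <- as.Date(c("1/12/54", "3/4/05"), format = "%m/%d/%y")
x
## [1] "2054-01-12" "2005-03-04"
fix_century(x, cutoff = as.Date("2024-01-01"))
## [1] "1954-01-12" "2005-03-04"
```

A fixed cutoff is passed in the example so the result does not depend on the day the code is run.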
