I am trying to add a month to a date I have, but so far I have not found a straightforward way to do it. Here is what I tried.
d <- as.Date("2004-01-31")
d + 60
# [1] "2004-03-31"
Adding a fixed number of days won't help, because months have different lengths.
seq(as.Date("2004-01-31"), by = "month", length = 2)
# [1] "2004-01-31" "2004-03-02"
The above might work, but again it's not straightforward.
It also seems to add 30 days or so to the date, which causes problems like the one below:
seq(as.Date("2004-01-31"), by = "month", length = 10)
# [1] "2004-01-31" "2004-03-02" "2004-03-31" "2004-05-01" "2004-05-31" "2004-07-01" "2004-07-31" "2004-08-31" "2004-10-01" "2004-10-31"
In the output above, February is skipped entirely, and the second and third dates both fall in March.
The following approach also failed for months, although it worked for years:
d <- as.POSIXlt(as.Date("2010-01-01"))
d$year <- d$year +1
d
# [1] "2011-01-01 UTC"
d <- as.POSIXlt(as.Date("2010-01-01"))
d$month <- d$month +1
d
Error in format.POSIXlt(x, usetz = TRUE) : invalid 'x' argument
What is the right way to do this?
The %m+% operator from lubridate adds months without exceeding the last day of the new month.
library(lubridate)
(d <- ymd("2012-01-31"))
1 parsed with %Y-%m-%d
[1] "2012-01-31 UTC"
d %m+% months(1)
[1] "2012-02-29 UTC"
It is ambiguous when you say "add a month to a date".
Do you mean
add 30 days?
increase the month part of the date by 1?
In both cases a whole package for a simple addition seems a bit exaggerated.
For the first point, of course, the simple + operator will do:
d=as.Date('2010-01-01')
d + 30
#[1] "2010-01-31"
For the second, I would just write a one-line function like this (with a more general scope):
add.months= function(date,n) seq(date, by = paste (n, "months"), length = 2)[2]
You can use it with arbitrary months, including negative:
add.months(d, 3)
#[1] "2010-04-01"
add.months(d, -3)
#[1] "2009-10-01"
Of course, if you often need to add just a single month:
add.month=function(date) add.months(date,1)
add.month(d)
#[1] "2010-02-01"
If you add one month to January 31st, you get February 31st, which does not exist, so the sensible outcome is to carry the 3 missing days over into the following month, March. So, correctly:
add.month(as.Date("2010-01-31"))
#[1] "2010-03-03"
If, for some special reason, you need to cap the result at the last available day of the month, it takes a bit longer:
add.months.ceil=function (date, n){
  #no ceiling
  nC=add.months(date, n)
  #ceiling: move to the first of the month (base R, so no package is needed)
  date=as.Date(format(date, "%Y-%m-01"))
  C=add.months(date, n+1)-1
  #use ceiling in case of overlapping
  if(nC>C) return(C)
  return(nC)
}
As usual you could add a single month version:
add.month.ceil=function(date) add.months.ceil(date,1)
So:
d=as.Date('2010-01-31')
add.month.ceil(d)
#[1] "2010-02-28"
d=as.Date('2010-01-21')
add.month.ceil(d)
#[1] "2010-02-21"
And with decrements:
d=as.Date('2010-03-31')
add.months.ceil(d, -1)
#[1] "2010-02-28"
d=as.Date('2010-03-21')
add.months.ceil(d, -1)
#[1] "2010-02-21"
Besides, you didn't say whether you wanted a scalar or a vector solution. For the latter:
add.months.v= function(date,n) as.Date(sapply(date, add.months, n), origin="1970-01-01")
Note: the *apply family drops the Date class, which is why it has to be rebuilt.
The vector version gives:
d=c(as.Date('2010/01/01'), as.Date('2010/01/31'))
add.months.v(d,1)
[1] "2010-02-01" "2010-03-03"
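The sapply version above carries the overflow into the next month. If a vectorized version that caps at the month end is wanted instead, one possible sketch in base R is shown here. The helper name add_months_clamped and the POSIXlt-based approach are assumptions for illustration, not part of the functions above:

```r
# Hypothetical helper (name and approach are assumptions for illustration):
# add n months to a Date vector, capping at the last day of the target month.
add_months_clamped <- function(date, n) {
  lt <- as.POSIXlt(date)
  day <- lt$mday                                # remember the original day of month
  lt$mday <- rep(1L, length(day))               # go to the 1st so month arithmetic cannot overflow
  lt$mon <- lt$mon + n
  first <- as.Date(lt)                          # first day of the target month
  lt$mon <- lt$mon + 1
  last_day <- as.POSIXlt(as.Date(lt) - 1)$mday  # number of days in the target month
  first + pmin(day, last_day) - 1
}

add_months_clamped(as.Date(c("2010-01-01", "2010-01-31")), 1)
# [1] "2010-02-01" "2010-02-28"
add_months_clamped(as.Date("2010-03-31"), -1)
# [1] "2010-02-28"
```

Moving to the first of the month before changing the month is what avoids the rollover; the original day is restored afterwards, limited by the length of the target month.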
Hope you liked it))
Vanilla R has a naive difftime class, but the lubridate CRAN package lets you do what you ask:
require(lubridate)
d <- ymd(as.Date('2004-01-01')) %m+% months(1)
d
[1] "2004-02-01"
Hope that helps.
The simplest way is to convert the Date to POSIXlt format
and then perform the arithmetic as follows:
date_1m_fwd <- as.POSIXlt("2010-01-01")
date_1m_fwd$mon <- date_1m_fwd$mon +1
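To get a Date back, the modified POSIXlt object can be passed to as.Date. The short sketch below (variable names d and e are only illustrative) also shows the caveat that an end-of-month start rolls over into the following month, just as seq does:

```r
# Increment the month field, then convert back to Date
d <- as.POSIXlt(as.Date("2010-01-01"))
d$mon <- d$mon + 1
as.Date(d)
# [1] "2010-02-01"

# An end-of-month date rolls over, since 2010-02-31 does not exist
e <- as.POSIXlt(as.Date("2010-01-31"))
e$mon <- e$mon + 1
as.Date(e)
# [1] "2010-03-03"
```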
Moreover, in case you want to work with Date columns in data.table, POSIXlt format is unfortunately not supported.
You can still shift the months using base R code as follows:
library(data.table)
dt <- as.data.table(seq(as.Date("2010-01-01"), length.out=5, by="month"))
dt[,shifted_month:=tail(seq(V1[1], length.out=length(V1)+3, by="month"),length(V1))]
Hope it helps.
"mondate" is somewhat similar to "Date" except that adding n adds n months rather than n days:
> library(mondate)
> d <- as.Date("2004-01-31")
> as.mondate(d) + 1
mondate: timeunits="months"
[1] 2004-02-29
Here's a function that doesn't require any packages to be installed. You give it a Date object (or a character that it can convert into a Date), and it adds n months to that date without changing the day of the month (unless the month you land on doesn't have enough days in it, in which case it defaults to the last day of the returned month). Just in case it doesn't make sense reading it, there are some examples below.
Function definition
addMonth <- function(date, n = 1){
  if (n == 0){return(date)}
  if (n %% 1 != 0){stop("Input Error: argument 'n' must be an integer.")}
  # Check to make sure we have a standard Date format
  if (class(date) == "character"){date = as.Date(date)}
  # Turn the year, month, and day into numbers so we can play with them
  y = as.numeric(substr(as.character(date),1,4))
  m = as.numeric(substr(as.character(date),6,7))
  d = as.numeric(substr(as.character(date),9,10))
  # Run through the computation
  i = 0
  if (n > 0){
    # Adding months
    while (i < n){
      m = m + 1
      if (m == 13){
        m = 1
        y = y + 1
      }
      i = i + 1
    }
  } else if (n < 0){
    # Subtracting months
    while (i > n){
      m = m - 1
      if (m == 0){
        m = 12
        y = y - 1
      }
      i = i - 1
    }
  }
  if (d > 28 & m == 2){
    # If past 28th day in base month, make adjustments for February
    if ((y %% 4 == 0 & y %% 100 != 0) | y %% 400 == 0){
      # If it's a leap year, return the 29th day
      d = 29
    } else {
      # Otherwise, return the 28th day
      d = 28
    }
  } else if (d == 31){
    # If 31st day in base month but only 30 days in end month, return 30th day
    if (m %in% c(1, 3, 5, 7, 8, 10, 12) == FALSE){d = 30}
  }
  # Turn year, month, and day into strings and put them together to make a Date
  y = as.character(y)
  # If month is single digit, add a leading 0, otherwise leave it alone
  if (m < 10){m = paste('0', as.character(m), sep = '')} else {m = as.character(m)}
  # If day is single digit, add a leading 0, otherwise leave it alone
  if (d < 10){d = paste('0', as.character(d), sep = '')} else {d = as.character(d)}
  # Put them together and return the result as a Date
  return(as.Date(paste(y,'-',m,'-',d, sep = '')))
}
Some examples
Adding months
> addMonth('2014-01-31', n = 1)
[1] "2014-02-28" # February, non-leap year
> addMonth('2014-01-31', n = 5)
[1] "2014-06-30" # June only has 30 days, so day of month dropped to 30
> addMonth('2014-01-31', n = 24)
[1] "2016-01-31" # Increments years when n is a multiple of 12
> addMonth('2014-01-31', n = 25)
[1] "2016-02-29" # February, leap year
Subtracting months
> addMonth('2014-01-31', n = -1)
[1] "2013-12-31"
> addMonth('2014-01-31', n = -7)
[1] "2013-06-30"
> addMonth('2014-01-31', n = -12)
[1] "2013-01-31"
> addMonth('2014-01-31', n = -23)
[1] "2012-02-29"
addedMonth <- seq(as.Date('2004-01-01'), length=2, by='1 month')[2]
addedQuarter <- seq(as.Date('2004-01-01'), length=2, by='1 quarter')[2]
I turned antonio's thoughts into a specific function:
library(DescTools)
> AddMonths(as.Date('2004-01-01'), 1)
[1] "2004-02-01"
> AddMonths(as.Date('2004-01-31'), 1)
[1] "2004-02-29"
> AddMonths(as.Date('2004-03-30'), -1)
[1] "2004-02-29"
Related
I have a starting time specified as a year-month character, e.g. "2020-12". From the start, for each of T consecutive months, I need to generate n different dates (year-month-day), where the day is random.
Any help will be useful!
The data I'm working on:
data <- data.frame(
data = sample(seq(as.Date('2000/01/01'), as.Date('2020/01/01'), by="day"), 500),
price = round(runif(500, min = 10, max = 20),2),
quantity = round(rnorm(500,30),0)
)
func <- function(start, months, n) {
  # First day of each of the `months` consecutive months
  startdate <- as.Date(paste0(start, "-01"))
  enddate <- seq(startdate, by = "month", length.out = months)
  months <- seq_len(months)
  # Roll forward one month and back one day to get the last day of each month
  enddate_lt <- as.POSIXlt(enddate)
  enddate_lt$mon <- enddate_lt$mon + 1
  enddate_lt$mday <- enddate_lt$mday - 1
  days_per_month <- as.integer(format(enddate_lt, format = "%d"))
  # Sample n distinct day numbers (1..days in month) for each month
  days <- lapply(days_per_month, sample, size = n)
  # Subtract 1 so that day number 1 maps to the first of the month
  # (adding the day number directly could spill into the next month)
  dates <- Map(function(first, d) first + d - 1, enddate, days)
  do.call(c, dates)
}
set.seed(2021)
func("2020-12", 4, 3)
# [1] "2020-12-07" "2020-12-06" "2020-12-14" "2021-01-26" "2021-01-07" "2021-01-12" "2021-02-20" "2021-02-06" "2021-02-27"
# [10] "2021-03-27" "2021-03-06" "2021-03-14"
func("2020-12", 5, 2)
# [1] "2020-12-05" "2020-12-15" "2021-01-07" "2021-01-09" "2021-02-23" "2021-02-12" "2021-03-19" "2021-03-28" "2021-04-18"
# [10] "2021-04-27"
func("2020-12", 2, 10)
# [1] "2020-12-28" "2020-12-29" "2020-12-03" "2020-12-14" "2020-12-08" "2020-12-26" "2020-12-04" "2020-12-05" "2020-12-22"
# [10] "2020-12-16" "2021-01-02" "2021-01-19" "2021-01-04" "2021-01-21" "2021-01-22" "2021-01-05" "2021-01-09" "2021-01-06"
# [19] "2021-01-18" "2021-01-11"
Most of the dancing with POSIXlt objects is because it gives us clean (base R) access to the number of days in a month, which makes sampling the days in a month rather simple. It can also be done (code-golf shorter) using the lubridate package, but I don't know that that is any more correct than this code is.
This just outputs a sequence of random dates, with n days per month. It does not sort within each month, though it does output the months in order. (That's not a difficult extension; there just wasn't a requirement for it.) It doesn't return a data frame, but you can easily extend it to do so by calling data.frame(date = do.call(c, dates)) on the last line, depending on what you need to do with the output.
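As a sketch of the two extensions mentioned (sorting within each month and returning a data frame), the variant below samples directly from the sequence of days in each month. The function name random_dates and the column name date are assumptions chosen for illustration:

```r
# Hypothetical variant: n random dates per month, sorted within each month,
# returned as a one-column data frame.
random_dates <- function(start, months, n) {
  # First day of each of the `months` consecutive months
  firsts <- seq(as.Date(paste0(start, "-01")), by = "month", length.out = months)
  out <- lapply(firsts, function(first) {
    # Last day of the month: first of the next month minus one day
    last <- seq(first, by = "month", length.out = 2)[2] - 1
    # Sample n distinct days from the month and sort them
    sort(sample(seq(first, last, by = "day"), n))
  })
  data.frame(date = do.call(c, out))
}

set.seed(2021)
random_dates("2020-12", 4, 3)
```

Sampling from the day sequence itself means no day-count arithmetic is needed, at the cost of building a short vector per month.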
You could convert the start time to a class for monthly data, zoo::yearmon. Then use as.Date.yearmon and its frac argument ("a number between 0 and 1 inclusive that indicates the fraction of the way through the period that the result represents") with random values from runif (uniform between 0 and 1) to convert to a random date within each year-month.
start = "2020-12"
T = 3
n = 2
library(zoo)
set.seed(1)
as.Date(as.yearmon(start) + rep((1:T)/12, each = n), frac = runif(T * n))
# [1] "2021-01-08" "2021-01-12" "2021-02-16" "2021-02-25" "2021-03-07" "2021-03-27"
I have estimated a variable age.first.union as a time difference using lubridate by subtracting the date of birth wdob from the date of wedding wdow. I got the following numeric vector:
head(wm$age.first.union, 3)
[1] 15.43014 12.67123 17.34247
I would like to have the decimals converted into months (and possibly also into days, but that's a minor detail), so the first value would be 15 years and 5 months. What I did was to create a series of new variables and then perform some calculations. To get the number of months, first, I duplicated and truncated the age.first.union variable. Then I estimated the difference between the two to get only the decimal part and then used proportions (e.g. 0.43 : 10 = x : 12 ) to get the months.
I looked into the lubridate documentation but I could not find much on this. I tried the following
years(floor(dseconds(15.43014)))
but I got only the years
[1] "15y 0m 0d 0H 0M 0S"
One idea would be to get the durations in seconds
seconds(floor(dyears(15.43014)))
[1] "486604895S"
but then the challenge would be that months have different lengths. Even an approximation of years = 365 days and months = 30 days would be more than good enough, but I do not know how to do it apart from lengthy calculations.
One final idea would be to have years and month using the calculation as described at the beginning of this post, and then merge the two variables into the final one using something similar to make_date (but it looks like a make_duration does not seem to exist yet).
The whole process looks quite cumbersome to me. Does anyone have a different take?
Many thanks
Manolo
While lubridate provides the function date_decimal to convert a decimal year to a date, you seem to be dealing with durations, so this won't work.
However, you can quite easily define a custom function to extract the integer year, month and fractional day (based on an average 30.42 days per month in a regular year):
age <- c(15.43014, 12.67123, 17.34247)
f <- function(x) {
  year <- floor(x)
  month <- floor((x - year) * 12)
  day <- ((x - year) * 12 - month) * 30.42
  return(sprintf("%i years, %i months, %3.2f days", year, month, day))
}
lapply(age, f)
#[[1]]
#[1] "15 years, 5 months, 4.92 days"
#
#[[2]]
#[1] "12 years, 8 months, 1.67 days"
#
#[[3]]
#[1] "17 years, 4 months, 3.34 days"
Update
If you want to return the integer year, month and fractional day you can define f as
f <- function(x) {
  year <- floor(x)
  month <- floor((x - year) * 12)
  day <- ((x - year) * 12 - month) * 30.42
  return(list(year = year, month = month, day = day))
}
which gives you e.g.
sapply(age, f)
#      [,1]     [,2]     [,3]
#year  15       12       17
#month 5        8        4
#day   4.918306 1.665799 3.335249
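Since floor and the arithmetic operators are already vectorized, the same calculation can also be applied to the whole vector at once and returned as a data frame. The sketch below is only an illustration: the name split_age is hypothetical, and it assumes a 365-day year split into 12 equal months, in line with the approximation the question says is acceptable:

```r
age <- c(15.43014, 12.67123, 17.34247)

# Hypothetical helper: split decimal years into whole years, whole months,
# and leftover days, assuming 12 equal months in a 365-day year.
split_age <- function(x) {
  years  <- floor(x)
  rem    <- (x - years) * 12        # fractional year expressed in months
  months <- floor(rem)
  days   <- (rem - months) * 365 / 12
  data.frame(years = years, months = months, days = round(days, 2))
}

split_age(age)
```

This gives one row per input value, which is usually easier to bind back onto the original data than the list matrix returned by sapply.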
We can define our own ym S3 class to represent year/month objects. Here we define several ym methods as well as extractor functions for years and months. The as.data.frame.ym method is a partial implementation. We have defined a month to be 1/12th of a year.
as.ym <- function(x, ...) structure(x, class = "ym")
as.data.frame.ym <- function(x, ...)
structure(list(x), row.names = seq_along(x), class = "data.frame")
years.ym <- as.integer
months.ym <- function(x) 12 * as.numeric(x) %% 1
format.ym <- function(x, ...) paste0(years.ym(x), "Y ", round(months.ym(x)), "M")
print.ym <- function(x, ...) print(format(x), ...)
# test
x <- c(15.43014, 12.67123, 17.34247) # test input
xx <- as.ym(x)
xx
## [1] "15Y 5M" "12Y 8M" "17Y 4M"
DF <- data.frame(x, xx)
DF
         x     xx
1 15.43014 15Y 5M
2 12.67123 12Y 8M
3 17.34247 17Y 4M
years.ym(xx)
## [1] 15 12 17
months.ym(xx)
## [1] 5.16168 8.05476 4.10964
class(xx)
## [1] "ym"
Days
To extend this to include days, as well, we assume that there are 365.25 days in a year and, again, we use 12 months in a year. We create a ymd S3 class for this.
as.ymd <- function(x, ...) structure(x, class = "ymd")
as.data.frame.ymd <- function(x, ...)
structure(list(x), row.names = seq_along(x), class = "data.frame")
years.ymd <- as.integer
months.ymd <- function(x) as.integer(12 * as.numeric(x) %% 1)
days.ymd <- function(x) (365.25 * as.numeric(x)) %% (365.25 / 12)
format.ymd <- function(x, ...)
paste0(years.ymd(x), "Y ", as.integer(months.ymd(x)), "M ", round(days.ymd(x), 1), "D")
print.ymd <- function(x, ...) print(format(x), ...)
xx <- as.ymd(x)
xx
## [1] "15Y 5M 4.9D" "12Y 8M 1.7D" "17Y 4M 3.3D"
DF <- data.frame(x, xx)
DF
         x          xx
1 15.43014 15Y 5M 4.9D
2 12.67123 12Y 8M 1.7D
3 17.34247 17Y 4M 3.3D
years.ymd(xx)
## [1] 15 12 17
months.ymd(xx)
## [1] 5 8 4
days.ymd(xx)
## [1] 4.921135 1.666758 3.337167
class(xx)
## [1] "ymd"
I have the number 20101213, which represents the date December 13th, 2010. I want to extract the year, month and day numbers from it, so that I end up with three variables containing those values.
What I have tried:
value = 20101213
as.numeric(strsplit(as.character(value), "")[[1]])
The result is [1] 2 0 1 0 1 2 1 3
but I didn't know how to continue. Could you help me, please?
You probably want to get this into a date-time format anyways for future computing, so how about:
(x <- strptime(20101213, "%Y%m%d"))
# [1] "2010-12-13 EST"
This will enable you to do computations that you wouldn't have been able to with just the year, month number, and day number, such as grabbing the day of the week (0=Sunday, 1=Monday, ...) or day of the year:
x$wday
# [1] 1
x$yday
# [1] 346
Further, you could easily extract the year, month number, and day of month number:
c(x$year+1900, x$mon+1, x$mday)
# [1] 2010 12 13
Edit: As pointed out by @thelatemail, an alternative that doesn't involve remembering offsets is:
as.numeric(c(format(x, "%Y"), format(x, "%m"), format(x, "%d")))
# [1] 2010 12 13
year <- as.numeric(substr(as.character(value),start = 1,stop = 4))
month <- as.numeric(substr(as.character(value),start = 5,stop = 6))
day <- as.numeric(substr(as.character(value),start = 7,stop = 8))
If you don't want to deal with a string representation, you could also just use the modulo operator like this:
# using mod
year = floor(value/10000)
month = floor((value %% 10000)/100)
day = value %% 100
Which will then extract the relevant parts of the number as expected.
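The same arithmetic works on a whole vector, and the pieces can be reassembled into a Date if needed. A small sketch, using a made-up second value (19991231) purely for illustration and base R's integer division operator %/% in place of floor:

```r
value <- c(20101213, 19991231)  # second value is an illustrative assumption

year  <- value %/% 10000        # leading four digits
month <- (value %/% 100) %% 100 # middle two digits
day   <- value %% 100           # last two digits

# Rebuild a Date from the numeric parts
as.Date(ISOdate(year, month, day))
# [1] "2010-12-13" "1999-12-31"
```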
I have a vector of dates of the form BW01.68, BW02.68, ... , BW26.10. BW stands for "bi-week", so for example, "BW01.68" represents the first bi-week of the year 1968, and "BW26.10" represents the 26th (and final) bi-week of the year 2010. Using R, how could I convert this vector into actual dates, say, of the form 01-01-1968, 01-15-1968, ... , 12-16-2010? Is there a way for R to know exactly which dates correspond to each bi-week? Thanks for any help!
An alternative solution.
biwks <- c("BW01.68", "BW02.68", "BW26.10")
bw <- substr(biwks,3,4)
yr <- substr(biwks,6,7)
yr <- paste0(ifelse(as.numeric(yr) > 15,"19","20"),yr)
# the %j in the date format is the number of days into the year
as.Date(paste(((as.numeric(bw)-1) * 14) + 1,yr,sep="-"),format="%j-%Y")
#[1] "1968-01-01" "1968-01-15" "2010-12-17"
Though I will note that a 'bi-week' seems a strange measure and I can't be sure that just using 14 day blocks is what is intended in your work.
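If the bi-weeks really are consecutive 14-day blocks that restart on January 1st of each year (an assumption, as noted above), the start dates for a given year can also be generated directly with seq, which accepts steps such as "2 weeks". A minimal sketch for 1968, with bw_starts as an illustrative name:

```r
# Start dates of the 26 bi-weeks of 1968, assuming 14-day blocks from Jan 1st
bw_starts <- seq(as.Date("1968-01-01"), by = "2 weeks", length.out = 26)

bw_starts[c(1, 2, 26)]
# [1] "1968-01-01" "1968-01-15" "1968-12-16"
```

Indexing this vector by the bi-week number then gives the start date of that bi-week.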
You can make this code a lot shorter. I have spaced out each step to help understanding but you could finish it off in one (long) line of code.
bw <- c('BW01.68', 'BW02.68','BW26.10','BW22.13')
# the gsub calls ensure that bw01.1 is treated the same as bw01.01, bw1.01, or bw1.1
# isolating the year number
yearno <- as.numeric(
  gsub(
    x = bw,
    pattern = "BW.*\\.",
    replacement = ""
  )
)
# isolating the bi-week number and converting it to a number of days
dayno <- 14 * as.numeric(
  gsub(
    x = bw,
    pattern = "BW|\\.[[:digit:]]{1,2}",
    replacement = ""
  )
)
# cutoff year chosen as 15
yearno <- yearno + 1900
yearno[yearno < 1915] <- yearno[yearno < 1915] + 100
# identifying dates
dates <- as.Date(paste0('01/01/',yearno),"%d/%m/%Y") + dayno
# specifically identifying the Monday of that week
mondaydates <- dates - as.numeric(strftime(dates,'%w')) + 1
Output -
> bw
[1] "BW01.68" "BW02.68" "BW26.10" "BW22.13"
> dates
[1] "1968-01-15" "1968-01-29" "2010-12-31" "2013-11-05"
> mondaydates
[1] "1968-01-15" "1968-01-29" "2010-12-27" "2013-11-04"
PS: Just be careful that you're aligned with how bw is measured in your data and whether you're translating it correctly. You should be able to manipulate this to get it to work, for instance you might encounter a bw 27.
I would like a function that counts how many times a specific weekday occurs in a month,
e.g. Nov '13 -> 5 Fridays, while Dec '13 would return 4 Fridays.
Is there an elegant function that would return this?
library(lubridate)
num_days <- function(date){
x <- as.Date(date)
start = floor_date(x, "month")
count = days_in_month(x)
d = wday(start)
sol = ifelse(d > 4, 5, 4) #estimate that is the first day of the month is after Thu or Fri then the week will have 5 Fridays
sol
}
num_days("2013-08-01")
num_days(today())
What would be a better way to do this?
1) Here d is the input, a Date class object, e.g. d <- Sys.Date(). The result gives the number of Fridays in the year/month that contains d. Replace 5 with 1 to get the number of Mondays:
first <- as.Date(cut(d, "month"))
last <- as.Date(cut(first + 31, "month")) - 1
sum(format(seq(first, last, "day"), "%w") == 5)
2) Alternately replace the last line with the following line. Here, the first term is the number of Fridays from the Epoch to the next Friday on or after the first of the next month and the second term is the number of Fridays from the Epoch to the next Friday on or after the first of d's month. Again, we replace all 5's with 1's to get the count of Mondays.
ceiling(as.numeric(last + 1 - 5 + 4) / 7) - ceiling(as.numeric(first - 5 + 4) / 7)
The second solution is slightly longer (although it has the same number of lines) but it has the advantage of being vectorized, i.e. d could be a vector of dates.
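The vectorized formula can be wrapped in a function that takes the weekday as a parameter. The sketch below is an illustration only: the name count_weekday is hypothetical, and it derives the first of the month with format rather than cut, which is an assumption made to keep it simple for vector input:

```r
# Hypothetical wrapper around the epoch-based formula.
# wday follows the "%w" convention: 0 = Sunday, ..., 5 = Friday, 6 = Saturday.
count_weekday <- function(d, wday = 5) {
  first <- as.Date(format(d, "%Y-%m-01"))               # first day of d's month
  next_first <- as.Date(format(first + 31, "%Y-%m-01")) # first day of the next month
  # 1970-01-01 (day 0) was a Thursday (%w = 4), hence the "+ 4" offset
  ceiling((as.numeric(next_first) - wday + 4) / 7) -
    ceiling((as.numeric(first) - wday + 4) / 7)
}

count_weekday(as.Date(c("2013-11-15", "2013-12-15")))
# [1] 5 4
```

Passing wday = 1 counts Mondays instead, matching the substitution described above.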
UPDATE: Added second solution.
There are a number of ways to do it. Here is one:
countFridays <- function(y, m) {
fr <- as.Date(paste(y, m, "01", sep="-"))
to <- fr + 31
dt <- seq(fr, to, by="1 day")
df <- data.frame(date=dt, mon=as.POSIXlt(dt)$mon, wday=as.POSIXlt(dt)$wday)
df <- subset(df, df$wday==5 & df$mon==df[1,"mon"])
return(nrow(df))
}
It creates the first of the month and a day in the next month.
It then creates a data frame of month index (on a 0 to 11 range, but we only use this for comparison) and weekday.
We then subset to rows that are a) in the same month and b) on a Friday. That is your result set, and we return the number of rows as your answer.
Note that this only uses base R code.
Without using lubridate -
#arguments to pass to function:
whichweekday <- 5
whichmonth <- 11
whichyear <- 2013
#function code:
firstday <- as.Date(paste('01',whichmonth,whichyear,sep="-"),'%d-%m-%Y')
# first day of the next month minus one day; this also works for December
lastday <- seq(firstday, length=2, by="1 month")[2]-1
sum(
strftime(
seq.Date(
from = firstday,
to = lastday,
by = "day"),
'%w'
) == whichweekday)