For statistical reasons, I would like to calculate the following:
date2 - date1 + 1 (day), expressed in months.
With this formula, the result should be 1 day if date2 == date1.
The data only contains year, month, and day (no hours or minutes).
The dates below can be used for a demo.
date <- as.POSIXct("2009-03-08")
date2 <- as.POSIXct("2009-03-09")
I would like to get
( (date %--% date2)+1 )/ months(1)
(But this doesn’t work)
( (date %--% date2)+1 )/ days(1)
gives me 2 (days).
Now, I would like to express this value in months.
How can I achieve this?
Or can I compute it directly,
with an expression like the one below?
( (date %--% date2)+1 )/ months(1)
First edition (Deprecated)
date %--% date2 is an <Interval>. You cannot add a numeric value to it. Instead, you need to convert it into a <Period>.
(as.period(date %--% date2) + days(1)) / months(1)
# [1] 0.06570842
Update
The above method is not precise because dividing a <Period> by months(1) uses an average month length (365.25/12 = 30.4375 days) rather than the length of the actual month. The ideal output should be
(1 + 1) / 31
# [1] 0.06451613
because March has 31 days. The following approach accounts for the different number of days in each month.
(date %--% (date2 + days(1))) / months(1)
# [1] 0.06451613
For comparison, we change the dates to February and see the output:
date <- as.POSIXct("2009-02-08")
date2 <- as.POSIXct("2009-02-09")
(date %--% (date2 + days(1))) / months(1)
# [1] 0.07142857
which is equal to (1 + 1)/28.
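For two dates that fall in the same calendar month, the same arithmetic can be checked in base R without lubridate. This is only a sketch: days_in_month_of and inclusive_month_fraction are hypothetical helper names (not from any package), and the code assumes single Date inputs that both lie in the same month.

```r
# Hypothetical helper: number of days in the month that contains `d`
# (assumes `d` is a single Date)
days_in_month_of <- function(d) {
  first <- as.Date(format(d, "%Y-%m-01"))
  next_first <- seq(first, by = "month", length.out = 2)[2]
  as.numeric(next_first - first)
}

# Hypothetical helper: inclusive day count divided by the length of the month
# (assumes d1 and d2 are in the same calendar month)
inclusive_month_fraction <- function(d1, d2) {
  (as.numeric(d2 - d1) + 1) / days_in_month_of(d1)
}

inclusive_month_fraction(as.Date("2009-03-08"), as.Date("2009-03-09"))  # 2/31
inclusive_month_fraction(as.Date("2009-02-08"), as.Date("2009-02-09"))  # 2/28
```

If the two dates span several months, this simple division no longer applies and the interval-based approach above is the safer choice.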
The difftime function in base R subtracts one date from another. It has no option to return the result in months, but if we request days we can convert manually by dividing by the average month length, 365.25/12 = 30.4375.
Base
date1 <- as.POSIXct("2009-03-08")
date2 <- as.POSIXct("2009-03-09")
(1 + as.numeric(difftime(time1 = date2,time2 = date1,units = "days")))/30.4375
lubridate
library(lubridate)
date1 <- ymd("2009-03-08")
date2 <- ymd("2009-03-09")
(1 + as.numeric(date2 - date1))/30.4375
Output
[1] 0.06570842
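Divide the difference between the two dates by 30.4375, which is 365.25/12.
Code
library(dplyr)
data.frame(date1 = date, date2 = date2) %>%
  mutate(date1 = as.Date(date1),
         date2 = as.Date(date2),
         # later date minus earlier date, plus 1 so that equal dates count as one day
         diff = (as.numeric(date2 - date1) + 1) / 30.4375)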
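As a rough sketch, the same average-month conversion can be applied to whole columns in base R, without dplyr. The data frame df and the column name months are made up for illustration.

```r
# Made-up example data: two pairs of dates
df <- data.frame(
  date1 = as.Date(c("2009-03-08", "2009-02-08")),
  date2 = as.Date(c("2009-03-09", "2009-03-08"))
)

avg_days_per_month <- 365.25 / 12  # 30.4375

# Inclusive day count (date2 - date1 + 1), converted to average months
df$months <- (as.numeric(df$date2 - df$date1) + 1) / avg_days_per_month
df
```

Because this uses an average month length, the result does not depend on which month the dates fall in.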
Let's say we have this:
ex <- c('2012-41')
This represents week 41 of the year 2012. How would I get the month from this?
Since a week can span two months, I am interested in the month in which that week started (here, October).
This is not a duplicate of How to extract Month from date in R, because the input is not in a standard date format like %Y-%m-%d.
You could try:
ex <- c('2019-10')
splitDate <- strsplit(ex, "-")
dateNew <- as.Date(paste(splitDate[[1]][1], splitDate[[1]][2], 1, sep="-"), "%Y-%U-%u")
monthSelected <- lubridate::month(dateNew)
monthSelected
# [1] 3
I hope this helps!
This depends on the definition of week. See the discussion of %V and %W in ?strptime for two possible definitions. We use %V below, but the function lets you specify another format if desired. The function performs an sapply over the elements of x. For each element it extracts the year into yr and builds the sequence of all dates in that year, sq. It then formats those dates as year-week using fmt, finds the first date whose year-week matches the current element of x, and extracts that date's month.
yw2m <- function(x, fmt = "%Y-%V") {
sapply(x, function(x) {
yr <- as.numeric(substr(x, 1, 4))
sq <- seq(as.Date(paste0(yr, "-01-01")), as.Date(paste0(yr, "-12-31")), "day")
as.numeric(format(sq[which.max(format(sq, fmt) == x)], "%m"))
})
}
yw2m('2012-41')
## [1] 10
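If ISO 8601 weeks are what you want, the Monday that starts a given week can also be computed directly, using the ISO rule that January 4 always falls in week 1. The sketch below assumes well-formed "YYYY-WW" strings; iso_week_start is a hypothetical name, not an existing function.

```r
# Hypothetical helper: Monday that starts the given ISO 8601 week
iso_week_start <- function(x) {
  parts <- strsplit(x, "-", fixed = TRUE)
  yr <- as.integer(vapply(parts, `[`, character(1), 1))
  wk <- as.integer(vapply(parts, `[`, character(1), 2))
  jan4 <- as.Date(sprintf("%d-01-04", yr))
  # %u is the weekday number with Monday = 1 ... Sunday = 7
  week1_monday <- jan4 - (as.integer(format(jan4, "%u")) - 1)
  week1_monday + 7 * (wk - 1)
}

start <- iso_week_start("2012-41")
format(start)                       # "2012-10-08"
as.integer(format(start, "%m"))     # 10
```

Note that for week 1 the Monday can fall in December of the previous year, so the month reported this way may differ from what yw2m returns for that case.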
The following parses year-week strings, adds the given number of weeks to January 1 of that year, and returns the resulting dates as a character vector. Adding lubridate's weeks() to January 1 gives a date at the end of the relevant week. For example, I've added a case for the 52nd week to your 'ex' variable, and it returns December 31.
library(lubridate)
ex <- c('2012-41','2016-4','2018-52')
dates <- strsplit(ex,"-")
dates <- sapply(dates,function(x) {
year_week <- unlist(x)
year <- year_week[1]
week <- as.numeric(year_week[2])  # convert to a number before passing to weeks()
start_date <- as.Date(paste0(year,'-01-01'))
date <- start_date+weeks(week)
# Note: the OP asked for the beginning of the week.
# There is some ambiguity here; the line above gives the end of the week.
# Uncomment the last line below for the beginning of the week (6 days earlier).
# I think this might yield inconsistent results, especially at year boundaries,
# hence the suggestion to use the end of the week. See below for a possible solution.
#date <- start_date+weeks(week)-days(6)
return (as.character(date))
})
Yields:
> dates
[1] "2012-10-14" "2016-01-29" "2018-12-31"
And to simply get the month from these full dates:
month(dates)
Yields:
> month(dates)
[1] 10 1 12
If I have a given date, how do I find the first and last days of the next month?
For example,
today <- as.Date("2009-04-04")
I want to find
# first date in next month
"2009-05-01"
# last date in next month
"2009-05-31"
You can do this with base R:
today <- as.Date("2009-04-04")
first <- function(x) {
x <- as.POSIXlt(x)
x$mon[] <- x$mon + 1
x$mday[] <- 1
x$isdst[] <- -1L
as.Date(x)
}
first(today)
#[1] "2009-05-01"
first(first(today)) - 1
#[1] "2009-05-31"
lubridate has some useful tools for this purpose.
library(lubridate)
today <- ymd("2009-04-12")
# First day of next month
first <- ceiling_date(today, unit = "month")
# Last day of next month
last <- ceiling_date(first, unit= "month") -1
first
#"2009-05-01"
last
#"2009-05-31"
Here are some solutions. We use today from the question to test. In both cases the input may be a Date class vector.
1) Base R Define a function fom that gives the first of the month of its Date argument. Using it, we can get the first and last dates of the next month as follows. This relies on the fact that 31 days after the first of a month is always a date in the next month, and 62 days after it is always a date in the month after that.
fom <- function(x) as.Date(cut(x, "month"))
fom(fom(today) + 31)
## [1] "2009-05-01"
fom(fom(today) + 62) - 1
## [1] "2009-05-31"
2) yearmon yearmon class objects internally represent a year and month as the year plus 0 for January, 1/12 for February, 2/12 for March, and so on. In as.Date.yearmon, the frac argument specifies how far through the month the output date should be. The default, frac = 0, gives the first of the month, and frac = 1 gives the end of the month.
library(zoo)
as.Date(as.yearmon(today) + 1/12)
## [1] "2009-05-01"
as.Date(as.yearmon(today) + 1/12, frac = 1)
## [1] "2009-05-31"
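Another base R variant, shown here only as a sketch, steps through month starts with seq(). The name next_month_bounds is hypothetical, and the function assumes a single Date as input.

```r
# Hypothetical helper: first and last day of the month after `x`
# (assumes `x` is a single Date)
next_month_bounds <- function(x) {
  this_first <- as.Date(format(x, "%Y-%m-01"))
  # first of this month, of next month, and of the month after that
  firsts <- seq(this_first, by = "month", length.out = 3)
  list(first = firsts[2], last = firsts[3] - 1)
}

today <- as.Date("2009-04-04")
b <- next_month_bounds(today)
format(b$first)  # "2009-05-01"
format(b$last)   # "2009-05-31"
```

Starting the sequence from the first of the month avoids the overflow that can occur when stepping by month from a day such as the 31st.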
I have a data.table dt with many records.
It has two columns. One is datetime1, of class POSIXct, with values like "2017-04-19 09:54:00".
The other is of class numeric and holds time values like 7.97 for the same date.
I want to calculate the difference between the two times in minutes. How can I do this in R?
Try this
time1 <- as.POSIXct('2017-04-19 09:54:00')
time2 <- as.POSIXct('2017-04-19 00:00:00') + 3600*7.97
# request minutes explicitly; plain subtraction chooses its own units
as.numeric(difftime(time1, time2, units = "mins"))
You can use functions of lubridate to extract hour, minute, and second of the POSIXct and then calculate the difference.
library(lubridate)
x = as.POSIXct("2017-04-19 09:54:00", tz = "UTC")
hour(x) * 60 + minute(x) + second(x)/60 - 7.97 * 60
#[1] 115.8
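For a whole table, the same calculation can be vectorised. The sketch below uses a plain data.frame with made-up column names (hours2, diff_mins) and assumes the numeric column holds decimal hours since midnight in the same time zone as datetime1; the same expressions should carry over to a data.table.

```r
# Made-up example data; hours2 is assumed to be decimal hours since midnight
dt <- data.frame(
  datetime1 = as.POSIXct(c("2017-04-19 09:54:00", "2017-04-19 08:00:00"), tz = "UTC"),
  hours2 = c(7.97, 7.97)
)

# Midnight of each row's date, in the same time zone
midnight <- as.POSIXct(format(dt$datetime1, "%Y-%m-%d"), tz = "UTC")
time2 <- midnight + dt$hours2 * 3600

# Ask difftime for minutes explicitly so the units never change silently
dt$diff_mins <- as.numeric(difftime(dt$datetime1, time2, units = "mins"))
dt
```

The second row is included because its difference is under an hour, which is the case where automatic unit selection would otherwise switch from hours to minutes.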
I'm interested in generating a sequence of month ends or starts for a range of time. Even better if the dates are business days where they won't fall on Sunday or Saturday. How do you do this in R?
Here are two functions that I use for that task:
library(lubridate)
# returns a sequence of Business Month Ends between two inputs (inclusive)
# or for the one input
GetBizMonthEndFor <- function(dateChar1, dateChar2 = dateChar1){
# generate the sequence of month starts
dateChar <- seq(floor_date(as.Date(dateChar1), unit = "month"),
floor_date(as.Date(dateChar2), unit = "month"),
by = "month")
# add a month to each sequence element and subtract a day
dateChar <- dateChar + months(1) - days(1)
# if the day is saturday or sunday, subtract a day or two to hit the
# previous friday
dateChar[wday(dateChar) == 1] <- dateChar[wday(dateChar) == 1] - days(2)
dateChar[wday(dateChar) == 7] <- dateChar[wday(dateChar) == 7] - days(1)
dateChar
}
# returns a sequence of Business Month Starts between two inputs (inclusive)
# or for the one input
GetBizMonthStartFor <- function(dateChar1, dateChar2 = dateChar1){
# generate the sequence of month starts
dateChar <- seq(floor_date(as.Date(dateChar1), unit = "month"),
floor_date(as.Date(dateChar2), unit = "month"),
by = "month")
# January 1 is a holiday, so if the month start is january 1, make
# January 2 the business month start
dateChar[month(dateChar) == 1 & day(dateChar) == 1] <-
dateChar[month(dateChar) == 1 & day(dateChar) == 1] + days(1)
# If the day is a saturday or sunday, add a day or two to hit the next
# monday
dateChar[wday(dateChar) == 1] <- dateChar[wday(dateChar) == 1] + days(1)
dateChar[wday(dateChar) == 7] <- dateChar[wday(dateChar) == 7] + days(2)
dateChar
}
Note that the business month start has some 'Global Business Day' logic in it: New Year's Day is not treated as a business day. You can remove that check so that business days depend only on weekdays. Here is a demonstration that the functions work:
> GetBizMonthEndFor("2015-01-12", "2015-10-12")
[1] "2015-01-30" "2015-02-27" "2015-03-31" "2015-04-30" "2015-05-29" "2015-06-30"
[7] "2015-07-31" "2015-08-31" "2015-09-30" "2015-10-30"
Note that the day of each input date doesn't matter: only the months of the inputs are used, and they determine the first and last months of the sequence. I haven't yet added logic to include only month end/start dates that fall between the input values, but that is not a hard fix. What has been added, though, is the ability to check a single date's month end/start by passing one input instead of two.
> GetBizMonthStartFor("2015-11-01")
[1] "2015-11-02"
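For comparison, here is a base R sketch of the business month end logic that also shows one way to keep only the dates between the two inputs. The function name biz_month_ends and the within_range argument are hypothetical, and holidays are ignored entirely.

```r
# Hypothetical base R variant: business month ends, weekends only (no holidays)
biz_month_ends <- function(from, to = from, within_range = FALSE) {
  from <- as.Date(from)
  to <- as.Date(to)
  first_start <- as.Date(format(from, "%Y-%m-01"))
  last_start <- as.Date(format(to, "%Y-%m-01"))
  starts <- seq(first_start, last_start, by = "month")
  # each month end is the following month's start minus one day
  next_starts <- seq(first_start, by = "month", length.out = length(starts) + 1)[-1]
  ends <- next_starts - 1
  wd <- as.integer(format(ends, "%u"))  # 1 = Monday ... 7 = Sunday
  # move Sunday back two days and Saturday back one day, to the previous Friday
  out <- ends - ifelse(wd == 7, 2, ifelse(wd == 6, 1, 0))
  if (within_range) out <- out[out >= from & out <= to]
  out
}

biz_month_ends("2015-01-12", "2015-10-12")
biz_month_ends("2015-01-12", "2015-10-12", within_range = TRUE)
```

With within_range = TRUE, the October entry is dropped because 2015-10-30 falls after the second input.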
I want to correct source activity based on the difference between reference and measurement date and source half life (measured in years). Say I have
ref_date <- as.Date('06/01/08',format='%d/%m/%y')
and a column in my data.frame with the same date format, e.g.,
today <- as.Date(Sys.Date(), format='%d/%m/%y')
I can find the number of years between these dates using the lubridate package
year(today)-year(ref_date)
[1] 5
Is there a function I can use to get a floating point answer, for example today - ref_date = 5.2 years?
Yes, of course: use difftime() and convert the result with as.numeric():
R> as.numeric(difftime(as.Date("2003-04-05"), as.Date("2001-01-01"),
+ unit="weeks"))/52.25
[1] 2.2529
R>
Note that we have to work in weeks and scale by 52.25 (an approximation; 365.25/7 is closer to 52.18), because there is some ambiguity in counting years: a February 29 comes around every 4 years, but not in every 100th year, and so on.
So you have to define what a year means. difftime() handles all time units up to weeks. Months are not supported for the same reason: their length is not constant.
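One common convention, shown here as a sketch rather than as the definitive answer, is to take the difference in days and divide by 365.25, the average length of a year in the Julian calendar. The variable names are made up for illustration.

```r
# Difference in days, then scaled by an average year of 365.25 days
days_between <- as.numeric(
  difftime(as.Date("2003-04-05"), as.Date("2001-01-01"), units = "days")
)
days_between            # 824
years_between <- days_between / 365.25
years_between
```

Dividing days by 365.25 is equivalent to dividing weeks by 365.25/7, so it avoids the rounding in the 52.25 factor.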
The lubridate package contains a built-in function, time_length, which can help perform this task.
time_length(difftime(as.Date("2003-04-05"), as.Date("2001-01-01")), "years")
[1] 2.257534
time_length(difftime(as.Date("2017-03-01"), as.Date("2012-03-01")),"years")
[1] 5.00274
See the lubridate package documentation for details.
Inspired by Bryan F: time_length() works better when given an interval object.
time_length(interval(as.Date("2003-04-05"), as.Date("2001-01-01")), "years")
[1] -2.257534
time_length(difftime(as.Date("2017-03-01"), as.Date("2012-03-01")),"years")
[1] 5.00274
time_length(interval(as.Date("2017-03-01"), as.Date("2012-03-01")),"years")
[1] -5
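As you can see, if you use interval() to get the time difference and pass it to time_length(), the result takes into account the fact that not all months and years have the same number of days (leap years, for example). The interval results above are negative only because interval() takes the start date first, and the later date was passed first here.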
Not an exact answer to your question, but in some situations the answer from Dirk Eddelbuettel can produce small errors.
Consider the following example:
as.numeric(difftime(as.Date("2012-03-01"), as.Date("2017-03-01"), unit="weeks"))/52.25
[1] -4.992481
The correct answer here should be 5 years in magnitude (the sign is negative only because the earlier date comes first).
The following function (using the lubridate package) calculates the number of full years between two dates:
# Function to calculate the number of full years between two dates
# (expects one pair of dates, not vectors)
library(lubridate)
year.diff <- function(firstDate, secondDate) {
  yearsdiff <- year(secondDate) - year(firstDate)
  monthsdiff <- month(secondDate) - month(firstDate)
  daysdiff <- day(secondDate) - day(firstDate)
  # subtract a year if the anniversary has not been reached yet
  if (monthsdiff < 0 || (monthsdiff == 0 && daysdiff < 0)) {
    yearsdiff <- yearsdiff - 1
  }
  yearsdiff
}
You can modify it to calculate a fractional part depending on how you define the number of days in the last (not finished) year.
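One possible way to add the fractional part, sketched in base R: count whole years up to the last anniversary, then divide the remaining days by the length of the year that runs from that anniversary to the next one. The name year_diff_frac is hypothetical, the function assumes a single pair of dates with second >= first, and a February 29 start date is not handled specially.

```r
# Hypothetical helper: whole years plus fraction of the unfinished year
year_diff_frac <- function(first, second) {
  first <- as.Date(first)
  second <- as.Date(second)
  stopifnot(length(first) == 1, length(second) == 1, second >= first)
  # n-th anniversary of d (n must be a positive whole number)
  add_years <- function(d, n) seq(d, by = paste(n, "years"), length.out = 2)[2]
  whole <- as.integer(format(second, "%Y")) - as.integer(format(first, "%Y"))
  # step back one year if the last anniversary has not been reached yet
  if (whole > 0 && add_years(first, whole) > second) whole <- whole - 1
  last_anniv <- if (whole > 0) add_years(first, whole) else first
  next_anniv <- add_years(first, whole + 1)
  whole + as.numeric(second - last_anniv) / as.numeric(next_anniv - last_anniv)
}

year_diff_frac("2012-03-01", "2017-03-01")  # 5
year_diff_frac("2001-01-01", "2003-04-05")  # 2 + 94/365
```

With this definition, a difference between two identical calendar days in different years is always a whole number, regardless of leap years in between.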
You can use the function AnnivDates() of the package BondValuation:
R> library('BondValuation')
R> DateIndexes <- unlist(
+ suppressWarnings(
+ AnnivDates("2001-01-01", "2003-04-05", CpY=1)$DateVectors[2]
+ )
+ )
R> names(DateIndexes) <- NULL
R> DateIndexes[length(DateIndexes)] - DateIndexes[1]
[1] 2.257534
See the documentation of the BondValuation package for details.
To get the date difference in years (floating point), you can convert the dates to decimal years and then take their difference.
#Example Dates
x <- as.Date(c("2001-01-01", "2003-04-05"))
#Convert Date to decimal year:
date2DYear <- function(x) {
  as.numeric(format(x,"%Y")) + #Get the year and add
  (as.numeric(format(x,"%j")) - 0.5) / #the day of the year divided by
  as.numeric(format(as.Date(paste0(format(x,"%Y"), "-12-31")),"%j")) #the number of days in that year
}
diff(date2DYear(x)) #Get the difference in years
#[1] 2.257534
I subtract 0.5 from the day of the year because it is not known whether you are at the beginning or the end of the day, and %j starts at 1.
I think the difference between 2012-03-01 and 2017-03-01 need not be exactly 5 years, as 2012 has 366 days and 2017 has 365, so 2012-03-01 is the 61st day of its year and 2017-03-01 is the 60th.
x <- as.Date(c("2012-03-01", "2017-03-01"))
diff(date2DYear(x))
#[1] 4.997713
Note that time_length and interval from lubridate need not give the same result when you add up time differences over consecutive sub-intervals.
library(lubridate)
x <- as.Date(c("2012-01-01", "2012-03-01", "2012-12-31"))
time_length(interval(x[1], x[3]), "years")
#[1] 0.9972678
time_length(interval(x[1], x[2]), "years") +
time_length(interval(x[2], x[3]), "years")
#[1] 0.9995509 #!
diff(date2DYear(x[c(1,3)]))
#[1] 0.9972678
diff(date2DYear(x[c(1,2)])) + diff(date2DYear(x[c(2,3)]))
#[1] 0.9972678
x <- as.Date(c("2013-01-01", "2013-03-01", "2013-12-31"))
time_length(interval(x[1], x[3]), "years")
#[1] 0.9972603
time_length(interval(x[1], x[2]), "years") +
time_length(interval(x[2], x[3]), "years")
#[1] 0.9972603
diff(date2DYear(x[c(1,3)]))
#[1] 0.9972603
diff(date2DYear(x[c(1,2)])) + diff(date2DYear(x[c(2,3)]))
#[1] 0.9972603
Since you are already using the lubridate package, you can obtain the number of years as a floating point value with a simple trick:
find the number of seconds in one year (difftime with units = "secs" is used so that the value really is in seconds):
seconds_in_a_year <- as.numeric(difftime(ymd("2010-01-01"), ymd("2009-01-01"), units = "secs"))
now obtain the number of seconds between the 2 dates you desire
seconds_between_dates <- as.numeric(difftime(date1, date2, units = "secs"))
your final answer for the number of years as a floating point value will be
years_between_dates <- seconds_between_dates / seconds_in_a_year