I am trying to determine the absolute number of days between two dates using lubridate.
library(lubridate)
dates <- data.frame(
time1 = date(c("2011-01-01", "2012-01-01", "2013-01-01")),
time2 = date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)
dates$diff <- days(dates$time1 - dates$time2)
dates$diff
[1] "-1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
abs(dates$diff)
[1] "-1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
I would have expected all of the values to be positive. Furthermore, min and max do not return the smallest and largest values.
min(dates$diff)
[1] 0
max(dates$diff)
[1] 0
Why do these functions behave differently on lubridate periods than on numeric/integer objects?
The simple answer is that Period objects from lubridate are not plain numeric vectors. They are S4 objects. Their main data member is the numeric vector of seconds, while the minutes, hours, days, months and years are stored in separate slots (which R keeps as attributes). When you apply mathematical functions such as abs to a Period, they operate only on the main numeric vector, which is the seconds part, and leave the other slots untouched.
We can see this if we create a period of -1 seconds:
library(lubridate)
p <- as.period(diff(as.POSIXct(c("2020-09-24 21:00:01", "2020-09-24 21:00:00"))))
p
#> [1] "-1S"
abs(p)
#> [1] "1S"
Now let us examine our object's attributes:
attributes(p)
#> $year
#> [1] 0
#>
#> $month
#> [1] 0
#>
#> $day
#> [1] 0
#>
#> $hour
#> [1] 0
#>
#> $minute
#> [1] 0
#>
#> $class
#> [1] "Period"
#> attr(,"package")
#> [1] "lubridate"
With S4 objects, you need to define what functions like abs and min will do by writing methods for the "Math" and "Summary" group generics. However, these have not been defined for class "Period", so the functions are instead called on the main data vector (which is just the vector of seconds). The Ops group generic has been defined though, which is why you can do things like dates$diff / 2 and get a sensible answer.
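To illustrate the mechanism, here is a minimal sketch using a made-up S4 class, ToyPeriod (hypothetical, not part of lubridate), that extends numeric and carries one extra slot. Defining a "Math" group method makes abs act on the slot as well as on the data part. It only demonstrates how group generics work and makes no claim about lubridate's internals.

```r
library(methods)

# Hypothetical toy class: the data part holds seconds, one slot holds minutes
setClass("ToyPeriod", contains = "numeric", slots = c(minute = "numeric"))

# One method covers every member of the Math group (abs, sqrt, floor, ...);
# callGeneric() re-dispatches to whichever function was actually called
setMethod("Math", "ToyPeriod", function(x) {
  new("ToyPeriod", callGeneric(x@.Data), minute = callGeneric(x@minute))
})

p <- new("ToyPeriod", c(-30, 15), minute = c(-2, 1))
res <- abs(p)
res@.Data   # seconds part: 30 15
res@minute  # minutes slot: 2 1
```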
Why have they not been defined? That's one for the authors to answer. In the meantime, you could get the functionality you want by making abs an S3 generic and writing an abs.Period method, like this:
# Mask base::abs with an S3 generic so that a Period method can be added
abs <- function(x) UseMethod("abs")
abs.default <- function(x) base::abs(x)
# Take the absolute value of the seconds and of every other slot
abs.Period <- function(x)
{
  new("Period", abs(x$second),
      year = abs(x$year),
      month = abs(x$month),
      day = abs(x$day),
      hour = abs(x$hour),
      minute = abs(x$minute))
}
Which would give your expected behaviour:
dates <- data.frame(
time1 = date(c("2011-01-01", "2012-01-01", "2013-01-01")),
time2 = date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)
dates$diff <- days(dates$time1 - dates$time2)
abs(dates$diff)
#> [1] "1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
However, this is probably not a great idea. It is best to work with difftimes for arithmetic and convert to periods if needed.
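A minimal sketch of the difftime route in base R, using the dates from the question. The variable names d and abs_days are only illustrative. Once the differences are plain numbers, abs, min and max behave as expected.

```r
time1 <- as.Date(c("2011-01-01", "2012-01-01", "2013-01-01"))
time2 <- as.Date(c("2011-01-02", "2011-12-31", "2013-01-01"))

# Signed differences as a difftime, with the unit fixed to days
d <- difftime(time1, time2, units = "days")

# Drop the class to get plain numbers, then take absolute values
abs_days <- abs(as.numeric(d, units = "days"))
abs_days       # 1 1 0
min(abs_days)  # 0
max(abs_days)  # 1
```

If a Period is needed afterwards, the numeric result can be passed to lubridate's days().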
I hope that clarifies things a bit.
Related
In the following data frame the 'time' column is character in the format hour:minute:second
id <- c(1, 2, 3, 4)
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
df <- data.frame(id, time)
How can I convert 'time' column to a dedicated time class, so that I can perform arithmetic calculations on it?
Use the function chron in package chron:
time<-c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
library(chron)
x <- chron(times=time)
x
[1] 00:00:01 01:02:00 09:30:01 14:15:25
Do some useful things, like calculating the difference between successive elements:
diff(x)
[1] 01:01:59 08:28:01 04:45:24
chron objects store times internally as a fraction of a day. Thus 1 second is equivalent to 1/(60*60*24), or 1/86400, i.e. 1.157407e-05.
So, to add times, one simple option is this:
x + 1/86400
[1] 00:00:02 01:02:01 09:30:02 14:15:26
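The fraction-of-a-day arithmetic can be checked without chron. The following base R sketch uses a hypothetical helper, to_day_fraction, that parses "HH:MM:SS" strings by hand; it only illustrates the representation and is not how chron parses times.

```r
# Hypothetical helper: "HH:MM:SS" -> fraction of a 24-hour day
to_day_fraction <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  secs <- vapply(parts,
                 function(p) sum(as.numeric(p) * c(3600, 60, 1)),
                 numeric(1))
  secs / 86400
}

frac <- to_day_fraction(c("00:00:01", "01:02:00"))
frac                 # the first element is 1/86400, about 1.157407e-05
round(frac * 86400)  # back to whole seconds: 1 and 3720
```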
Using base R you could convert it to an object of class POSIXct, but this does add a date to the time:
id<-c(1,2,3,4)
time<-c("00:00:01","01:02:00","09:30:01","14:15:25")
df<-data.frame(id,time,stringsAsFactors=FALSE)
as.POSIXct(df$time,format="%H:%M:%S")
[1] "2012-08-20 00:00:01 CEST" "2012-08-20 01:02:00 CEST"
[3] "2012-08-20 09:30:01 CEST" "2012-08-20 14:15:25 CEST"
But that does allow you to perform arithmetic calculations on them.
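As a hedged sketch of that arithmetic: pinning the date and the time zone yourself (here 1970-01-01 in UTC, an arbitrary choice made for this example) keeps the result independent of the day the code is run and of daylight-saving changes.

```r
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")

# Attach a fixed dummy date and parse in UTC
ct <- as.POSIXct(paste("1970-01-01", time),
                 format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# With this dummy date the numeric value is seconds since midnight
secs <- as.numeric(ct)
secs        # 1 3720 34201 51325

# Differences between successive times, in seconds
diff(secs)  # 3719 30481 17124
```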
Another possible alternative could be:
time <- c("00:00:01","01:02:00","09:30:01","14:15:25")
converted.time <- as.difftime(time, units = "mins") #"difftime" class
secss <- as.numeric(converted.time, units = "secs")
hourss <- as.numeric(converted.time, units = "hours")
dayss <- as.numeric(converted.time, units="days")
Or even:
w <- strptime(x = time, format = "%H:%M:%S") #"POSIXlt" "POSIXt" class
Using the ITime class in data.table package:
ITime is a time-of-day class stored as the integer number of seconds in the day.
library(data.table)
(it <- as.ITime(time))
# [1] "00:00:01" "01:02:00" "09:30:01" "14:15:25"
it + 10
# [1] "00:00:11" "01:02:10" "09:30:11" "14:15:35"
diff(it)
# [1] "01:01:59" "08:28:01" "04:45:24"
lubridate allows good flexibility in the time format:
library(lubridate)
time_hms_1<-c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
hms(time_hms_1)
#> [1] "1S" "1H 2M 0S" "9H 30M 1S" "14H 15M 25S"
time_hms_2<-c("0:00:01", "1:02:00", "9:30:01", "14:15:25")
hms(time_hms_2)
#> [1] "1S" "1H 2M 0S" "9H 30M 1S" "14H 15M 25S"
time_hm_1<-c("00:00", "01:02", "09:30", "14:15")
hm(time_hm_1)
#> [1] "0S" "1H 2M 0S" "9H 30M 0S" "14H 15M 0S"
time_hm_2<-c("0:00", "1:02", "9:30", "14:15")
hm(time_hm_2)
#> [1] "0S" "1H 2M 0S" "9H 30M 0S" "14H 15M 0S"
Created on 2020-07-03 by the reprex package (v0.3.0)
Yet another alternative using the hms package.
id <- c(1, 2, 3, 4)
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
df <- data.frame(id, time, stringsAsFactors = FALSE)
Convert column time to class hms
# install.packages("hms")
library(hms)
df$time <- as_hms(df$time)  # as.hms() is deprecated in recent versions of hms
Perform arithmetic calculations
diff(df$time)
#01:01:59
#08:28:01
#04:45:24
I have a set of times in milliseconds that I want to convert to hh:mm. An example dataset would be:
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
I get this with manual calculation but it does not provide me the correct value for the minutes.
> data/1000/60/60
[1] 1.649028 1.510694 1.068167 2.066639 1.522278 1.917750
I tried this
format(as.POSIXct(Sys.Date())+data, "%H:%M")
[1] "12:01" "17:41" "07:10" "21:38" "05:16" "16:45"
but that is not even close. Any thoughts on that?
Thanks!
hrs = data/(60 * 60 * 1000)
mins = (hrs %% 1) * 60
secs = (mins %% 1) * 60
paste(trunc(hrs), trunc(mins), round(secs, 2), sep = ":")
#[1] "1:38:56.5" "1:30:38.5" "1:4:5.4" "2:3:59.9" "1:31:20.2" "1:55:3.9"
Also,
library(lubridate)
seconds_to_period(data/1000)
#[1] "1H 38M 56.5S" "1H 30M 38.5S" "1H 4M 5.40000000000009S"
#[4] "2H 3M 59.8999999999996S" "1H 31M 20.1999999999998S" "1H 55M 3.89999999999964S"
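If zero-padded hh:mm strings are the goal, integer division avoids both the floating-point noise and any time zone handling. This is a minimal base R sketch; it truncates the seconds rather than rounding them, which is an assumption about what is wanted.

```r
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)

total_secs <- data %/% 1000           # whole seconds
hh <- total_secs %/% 3600             # whole hours
mm <- (total_secs %% 3600) %/% 60     # whole minutes within the hour

res <- sprintf("%02d:%02d", as.integer(hh), as.integer(mm))
res
#> [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
```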
The zero point we can get by doing:
strftime(as.POSIXlt.numeric(0, format="%OS", origin="1970-01-01") - 7200, format="%R")
# [1] "00:00"
Accordingly:
t.adj <- 0
res <- strftime(as.POSIXlt.numeric(v/1000, format="%OS", origin="1970-01-01") - t.adj*3600,
format="%R", tz="GMT")
res
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
The date doesn't matter, since the result is a plain character vector:
class(res)
# [1] "character"
Note that this solution might depend on your Sys.getlocale("LC_TIME"). The solution above includes an optional hour adjustment t.adj*; in my personal locale it is set to zero to yield the right values.
Data
v <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
*To automate the localization you may want to look into the answers to this question.
Using lubridate, how to calculate the last day of the previous quarter for a given date?
The below formula doesn't seem to work for Nov 3rd, 2014 (other dates work)
library(lubridate)
date = as.POSIXct("2014-11-03")
date - days(day(date)) - months(month(date) %% 3 - 1)
# NA
Interestingly enough, changing the order works:
date - months(month(date) %% 3 - 1) - days(day(date))
# "2014-09-30 UTC"
Here are some possibilities with functions from packages zoo and timeDate, and base R. The zoo code was improved by @G.Grothendieck, and he also suggested the base alternative (thanks a lot!). I leave the lubridate solution(s) to someone else.
First, use class yearqtr in package zoo to represent the quarterly data. You may then use as.Date.yearqtr and the frac argument "which is a number between 0 and 1 inclusive that indicates the fraction of the way through the period that the result represents. The default is 0 which means the beginning of the period" (see ?yearqtr, and ?yearmon for frac).
Step by step:
library(zoo)
date <- as.Date("2014-11-03")
# current quarter
current_q <- as.yearqtr(date)
current_q
# [1] "2014 Q4"
# first date in current quarter
first_date_current_q <- as.Date(current_q, frac = 0)
first_date_current_q
# [1] "2014-10-01"
# last date in previous quarter
last_date_prev_q <- first_date_current_q - 1
last_date_prev_q
# [1] "2014-09-30"
And a short version by @G.Grothendieck (thanks!)
as.Date(as.yearqtr(date)) - 1
# [1] "2014-09-30"
A nice base R solution by @G.Grothendieck
as.Date(cut(date, "quarter")) - 1
# [1] "2014-09-30"
Another possibility is to use timeFirstDayInQuarter and timeLastDayInQuarter functions in package timeDate:
library(timeDate)
timeLastDayInQuarter(timeFirstDayInQuarter(date) - 1)
# GMT
# [1] [2014-09-30]
I know this is old, but since this question specifically asks for a lubridate solution and I couldn't find working code for it elsewhere, I figured I'd post, keeping in mind that the base R solution is likely the most succinct:
> sampleDate <- ymd('2015-11-11')
> yearsToSample <- years(year(sampleDate) - year(origin))
> yearsToSample
[1] "45y 0m 0d 0H 0M 0S"
> additionalMonths <- months((quarter(sampleDate) - 1) * 3)
> additionalMonths
[1] "9m 0d 0H 0M 0S"
> startOfQuarter <- ymd(origin) + yearsToSample + additionalMonths - days(1)
> startOfQuarter
[1] "2015-09-30 UTC"
Confirmed that this works for dates previous to the origin ('1970-01-01') as well.
Here is a pure lubridate solution for 2021:
> date <- ymd("2014-11-03")
> yq(quarter(date, with_year = TRUE)) - days(1)
[1] "2014-09-30"
Breaking it down:
> # The year and quarter
> quarter(date, with_year = TRUE)
[1] 2014.4
> # The first day of the quarter as a date
> yq(quarter(date, with_year = TRUE))
[1] "2014-10-01"
> # The day before the first day of the quarter as a date
> yq(quarter(date, with_year = TRUE)) - days(1)
[1] "2014-09-30"
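As for why the formula in the question returns NA for 2014-11-03: the sketch below replays it step by step. Subtracting the days first lands on October 31, and lubridate's month arithmetic returns NA when the target date (September 31) does not exist. The variable names d and step1 are only illustrative, and the %m-% line shows lubridate's roll-back operator for this one date, not a general fix for the formula.

```r
library(lubridate)

d <- as.POSIXct("2014-11-03", tz = "UTC")

step1 <- d - days(day(d))  # 2014-10-31
month(d) %% 3 - 1          # 1, so one month is subtracted next

step1 - months(1)          # NA, because 2014-09-31 is not a real date
step1 %m-% months(1)       # rolls back to the last valid day, 2014-09-30

# In the other order every intermediate date exists
d - months(1) - days(day(d))  # 2014-10-03, then 2014-09-30
```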
I have a time column in a database, where hours and minutes are separated by a ":". I would like to remove the ":" so that time field becomes numeric as I will use numeric time for some calculation.
Input:
X
00:00
01:15
02:30
Output:
X
0000
0115
0230
I am new to R. My apologies if this is a silly question. Greatly appreciate any help. Thank you.
> x <- c("00:00", "01:15", "02:30")
> gsub(":", "", x)
[1] "0000" "0115" "0230"
If you really want numbers you can coerce to numeric or integer
> as.numeric(gsub(":", "", x))
[1] 0 115 230
GSee's answer does what you ask. However, if you're doing arithmetic with units of time, you might think about some easier ways.
library(lubridate)
X <- hm(c("00:00", "01:15", "02:30")) # Converts to lubridate time objects
X + minutes(1)
# [1] "1M 0S" "1H 16M 0S" "2H 31M 0S"
X + weeks(2)
# [1] "14d 0H 0M 0S" "14d 1H 15M 0S" "14d 2H 30M 0S"
It might be more sensible to use the R facilities for time parsing to convert first to date-time. At the moment you have no way to capture the non-decimal (base-60) character of your "time" values: 300 - 259 should be 1, not 41. This set of commands illustrates some of R's date-time and Date functions:
> X <- c('00:00', '01:15', '02:30')
> as.POSIXct(X, format="%H:%M")
[1] "2013-08-05 00:00:00 PDT" "2013-08-05 01:15:00 PDT" "2013-08-05 02:30:00 PDT"
This will give the results in differences in seconds from midnight today:
> as.numeric(as.POSIXct(X, format="%H:%M") - as.POSIXct("2013-08-05 00:00:00 PDT"))
[1] 0 4500 9000
Now try to use today's date, but notice that there is an offset of 7 hours, because as.POSIXct treats a Date as midnight UTC (GMT) rather than local midnight:
> as.numeric(as.POSIXct(X, format="%H:%M") - as.POSIXct(Sys.Date()))
[1] 7.00 8.25 9.50
> as.numeric(as.POSIXct(X, format="%H:%M") - (as.POSIXct(Sys.Date())+7*3600))
[1] 0 4500 9000
So finish off the process by shifting 7 hours (=7*3600 seconds) and then converting to minutes:
> as.numeric(as.POSIXct(X, format="%H:%M") - (as.POSIXct(Sys.Date())+7*3600))/60
[1] 0 75 150
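If only minutes since midnight are needed, the time zone offset can be sidestepped by splitting the strings directly. This is a minimal base R sketch; hm_to_minutes is a hypothetical helper name, and it assumes well-formed "HH:MM" input.

```r
# Hypothetical helper: "HH:MM" -> minutes since midnight
hm_to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
         numeric(1))
}

mins <- hm_to_minutes(c("00:00", "01:15", "02:30"))
mins  # 0 75 150

# The base-60 problem disappears: 03:00 minus 02:59 is one minute
gap <- hm_to_minutes("03:00") - hm_to_minutes("02:59")
gap   # 1
```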