I have a set of times in milliseconds that I want to convert to hh:mm. An example dataset would be:
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
A manual calculation gives the following, but it does not give me the correct value for the minutes:
> data/1000/60/60
[1] 1.649028 1.510694 1.068167 2.066639 1.522278 1.917750
I tried this
format(as.POSIXct(Sys.Date())+data, "%H:%M")
[1] "12:01" "17:41" "07:10" "21:38" "05:16" "16:45"
but that is not even close. Any thoughts on that?
Thanks!
hrs = data/(60 * 60 * 1000)
mins = (hrs %% 1) * 60
secs = (mins %% 1) * 60
paste(trunc(hrs), trunc(mins), round(secs, 2), sep = ":")
#[1] "1:38:56.5" "1:30:38.5" "1:4:5.4" "2:3:59.9" "1:31:20.2" "1:55:3.9"
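If only zero-padded hh:mm is needed, integer arithmetic plus sprintf() is one option. This is a minimal sketch that truncates (rather than rounds) the seconds; the names total_min and hhmm are illustrative and not part of the answer above.

```r
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)

# Whole minutes contained in each value (seconds and milliseconds are dropped)
total_min <- data %/% 60000

# Split into hours and minutes and zero-pad both to two digits
hhmm <- sprintf("%02d:%02d",
                as.integer(total_min %/% 60),
                as.integer(total_min %% 60))
hhmm
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
```

Unlike a date-time based approach, this does not wrap around after 24 hours; the hour field simply grows.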
Also,
library(lubridate)
seconds_to_period(data/1000)
#[1] "1H 38M 56.5S" "1H 30M 38.5S" "1H 4M 5.40000000000009S"
#[4] "2H 3M 59.8999999999996S" "1H 31M 20.1999999999998S" "1H 55M 3.89999999999964S"
We can get the zero point as follows (the 7200-second shift appears to compensate for a two-hour local time zone offset):
strftime(as.POSIXlt.numeric(0, format="%OS", origin="1970-01-01") - 7200, format="%R")
# [1] "00:00"
Accordingly:
t.adj <- 0
res <- strftime(as.POSIXlt.numeric(v/1000, format="%OS", origin="1970-01-01") - t.adj*3600,
format="%R", tz="GMT")
res
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
The date doesn't matter, since:
class(res)
# [1] "character"
Note that this solution might depend on your Sys.getlocale("LC_TIME"). The solution above includes an optional hour adjustment, t.adj*; in my personal locale, setting it to zero yields the right values.
Data
v <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
*To automate the localization you may want to look into the answers to this question.
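One way to sidestep the locale and time zone adjustment altogether is to fix the time zone to UTC for both the conversion and the formatting. A minimal sketch, assuming every value is shorter than 24 hours (longer values would wrap around into the next day); res_utc is an illustrative name:

```r
v <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)

# Interpret the values as seconds since the epoch in UTC,
# then print only the hh:mm part, also in UTC
res_utc <- format(as.POSIXct(v / 1000, origin = "1970-01-01", tz = "UTC"),
                  "%H:%M", tz = "UTC")
res_utc
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
```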
Related
Does anybody have a neater way of rounding lubridate period and duration objects to minutes instead of seconds?
For example, I have the following pipe of code:
seconds(x = 3600115) %>% as.duration() %>% as.period()
This results in: 41d 16H 1M 55S. I would like to round it so it becomes: 41d 16H 2M 0S.
Is anybody aware of a better way than:
(seconds(x = 3600115) / 60) %>% as.numeric() %>% round() %>% dminutes() %>% as.period()
This results in: 41d 16H 2M 0S
A duration object is stored numerically as a number of seconds, and period objects can conveniently be converted into seconds using period_to_seconds(), so you could use a simple function for this:
library(lubridate)
# Create period object
p <- seconds_to_period(3600115)
# Create duration object
d <- as.duration(p)
minround <- function(x) {
  stopifnot(is.period(x) || is.duration(x))
  if (is.duration(x))
    round(x / 60) * 60
  else
    seconds_to_period(round(period_to_seconds(x) / 60) * 60)
}
minround(p)
# [1] "41d 16H 2M 0S"
minround(d)
# [1] "3600120s (~5.95 weeks)"
A base R option:
as.POSIXct(3600115 - 86400, origin = '1970-01-01', tz = 'UTC') %>%
round('mins') %>%
format('%jd %HH %MM 0S')
#[1] "041d 16H 02M 0S"
A few things to note:
- The pipe is used for readability.
- The output is a character string, not a period object.
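The format() approach relies on %j (day of the year), so it is only meaningful for spans from one day up to about a year. A plain-arithmetic sketch avoids that restriction; like the option above it returns a string, and the helper name round_to_min_string is hypothetical:

```r
round_to_min_string <- function(secs) {
  mins <- round(secs / 60)   # total minutes, rounded (R rounds exact halves to even)
  d <- mins %/% 1440         # 1440 minutes per day
  h <- (mins %% 1440) %/% 60 # hours left over after whole days
  m <- mins %% 60            # minutes left over after whole hours
  sprintf("%dd %dH %dM 0S", as.integer(d), as.integer(h), as.integer(m))
}

round_to_min_string(3600115)
# [1] "41d 16H 2M 0S"
```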
I am trying to determine the absolute number of days between two dates using lubridate.
library(lubridate)
dates <- data.frame(
time1 = date(c("2011-01-01", "2012-01-01", "2013-01-01")),
time2 = date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)
dates$diff <- days(dates$time1 - dates$time2)
dates$diff
[1] "-1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
abs(dates$diff)
[1] "-1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
I would have expected all of the values to be positive. Furthermore, min and max do not return the smallest and largest values.
min(dates$diff)
[1] 0
max(dates$diff)
[1] 0
Why do these functions behave differently on lubridate periods than on numeric/integer objects?
The simple answer is that period class objects from lubridate are not simple numeric objects. They are S4 objects. Their main data member is the numeric vector of seconds, with the minutes, hours, days and years all stored as attributes. When you try to apply mathematical operators on period objects, the operators don't apply to the attributes, only to the main numerical vector, which is the seconds part.
We can see this if we create a period of -1 seconds:
library(lubridate)
p <- as.period(diff(as.POSIXct(c("2020-09-24 21:00:01", "2020-09-24 21:00:00"))))
p
#> [1] "-1S"
abs(p)
#> [1] "1S"
Now let us examine our object's attributes:
attributes(p)
#> $year
#> [1] 0
#>
#> $month
#> [1] 0
#>
#> $day
#> [1] 0
#>
#> $hour
#> [1] 0
#>
#> $minute
#> [1] 0
#>
#> $class
#> [1] "Period"
#> attr(,"package")
#> [1] "lubridate"
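The same kind of inspection on a period built with days(), as in the question, shows where the sign actually lives. A small sketch using S4 slot access on lubridate's Period class; p_day is an illustrative name:

```r
library(lubridate)

p_day <- days(-1)

# The seconds vector, which per the explanation above is all that abs() sees
p_day@.Data
# [1] 0

# The day slot, where the -1 is actually stored
p_day@day
# [1] -1
```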
With S4 objects, you need to define what functions like abs and min will do by writing the "Math" and "Summary" group generics. However, these have not been defined for class "Period", so they are instead called on the main data vector (which is just the vector of seconds). The Ops group generic has been defined, though, which is why you can do things like dates$diff / 2 and get a sensible answer.
Why have they not been defined? That's one for the authors to answer. In the meantime, you could get the functionality you want by making abs an S3 generic and writing a specific abs.Period method, like this:
# Turn abs into an S3 generic that falls back to base::abs
abs <- function(x) UseMethod("abs")
abs.default <- function(x) base::abs(x)
# Period method: take the absolute value of every component
abs.Period <- function(x) {
  new("Period", abs(x$second),
      year = abs(x$year),
      month = abs(x$month),
      day = abs(x$day),
      hour = abs(x$hour),
      minute = abs(x$minute))
}
Which would give your expected behaviour:
dates <- data.frame(
time1 = date(c("2011-01-01", "2012-01-01", "2013-01-01")),
time2 = date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)
dates$diff <- days(dates$time1 - dates$time2)
abs(dates$diff)
#> [1] "1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
However, this is probably not a great idea. It is best to work with difftimes for arithmetic and convert to periods if needed.
I hope that clarifies things a bit.
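The difftime route recommended above might look like the following base R sketch. It uses the dates from the question; diff_days is an illustrative name, and the result is a plain numeric vector, so abs, min and max behave as usual:

```r
time1 <- as.Date(c("2011-01-01", "2012-01-01", "2013-01-01"))
time2 <- as.Date(c("2011-01-02", "2011-12-31", "2013-01-01"))

# Difference in days as a plain numeric vector
diff_days <- as.numeric(difftime(time1, time2, units = "days"))

abs(diff_days)
# [1] 1 1 0
min(diff_days)
# [1] -1
max(diff_days)
# [1] 1
```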
Using lubridate, how to calculate the last day of the previous quarter for a given date?
The formula below doesn't seem to work for Nov 3rd, 2014 (other dates work):
library(lubridate)
date = as.POSIXct("2014-11-03")
date - days(day(date)) - months(month(date) %% 3 - 1)
# NA
Interestingly enough, changing the order works:
date - months(month(date) %% 3 - 1) - days(day(date))
# "2014-09-30 UTC"
Here are some possibilities with functions from packages zoo and timeDate, and base R. The zoo code was improved by @G.Grothendieck, who also suggested the base alternative (thanks a lot!). I leave the lubridate solution(s) to someone else.
First, use class yearqtr in package zoo to represent the quarterly data. You may then use as.Date.yearqtr and the frac argument "which is a number between 0 and 1 inclusive that indicates the fraction of the way through the period that the result represents. The default is 0 which means the beginning of the period" (see ?yearqtr, and ?yearmon for frac).
Step by step:
library(zoo)
date <- as.Date("2014-11-03")
# current quarter
current_q <- as.yearqtr(date)
current_q
# [1] "2014 Q4"
# first date in current quarter
first_date_current_q <- as.Date(current_q, frac = 0)
first_date_current_q
# [1] "2014-10-01"
# last date in previous quarter
last_date_prev_q <- first_date_current_q - 1
last_date_prev_q
# [1] "2014-09-30"
And a short version by @G.Grothendieck (thanks!):
as.Date(as.yearqtr(date)) - 1
# [1] "2014-09-30"
A nice base R solution by @G.Grothendieck:
as.Date(cut(date, "quarter")) - 1
# [1] "2014-09-30"
Another possibility is to use timeFirstDayInQuarter and timeLastDayInQuarter functions in package timeDate:
library(timeDate)
timeLastDayInQuarter(timeFirstDayInQuarter(date) - 1)
# GMT
# [1] [2014-09-30]
I know this is old, but since this question specifically asks for a lubridate solution and I couldn't find working code for it elsewhere, I figured I'd post, keeping in mind that the base R solution is likely the most succinct:
> sampleDate <- ymd('2015-11-11')
> yearsToSample <- years(year(sampleDate) - year(origin))
> yearsToSample
[1] "45y 0m 0d 0H 0M 0S"
> additionalMonths <- months((quarter(sampleDate) - 1) * 3)
> additionalMonths
[1] "9m 0d 0H 0M 0S"
> startOfQuarter <- ymd(origin) + yearsToSample + additionalMonths - days(1)
> startOfQuarter
[1] "2015-09-30 UTC"
Confirmed that this works for dates previous to the origin ('1970-01-01') as well.
Here is a pure lubridate solution for 2021:
> date <- ymd("2014-11-03")
> yq(quarter(date, with_year = TRUE)) - days(1)
[1] "2014-09-30"
Breaking it down:
> # The year and quarter
> quarter(date, with_year = TRUE)
[1] 2014.4
> # The first day of the quarter as a date
> yq(quarter(date, with_year = TRUE))
[1] "2014-10-01"
> # The day before the first day of the quarter as a date
> yq(quarter(date, with_year = TRUE)) - days(1)
[1] "2014-09-30"
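Two lubridate points left open in this thread can be sketched as follows. The first part is a likely explanation for the NA in the question: for November, month(date) %% 3 - 1 is 1, and subtracting one month from October 31 lands on the non-existent September 31. The second part is a shorter alternative via floor_date(). The variable step1 is illustrative, and a Date is used instead of POSIXct to keep time zones out of the picture.

```r
library(lubridate)

date <- as.Date("2014-11-03")

# Step 1 of the question's first formula: back to the last day of October
step1 <- date - days(day(date))
step1
# [1] "2014-10-31"

# Step 2: one month earlier would be "2014-09-31", which does not exist, hence NA
is.na(step1 - months(1))
# [1] TRUE

# %m-% rolls back to the last valid day of the month instead
step1 %m-% months(1)
# [1] "2014-09-30"

# Shorter alternative: start of the current quarter, minus one day
floor_date(date, "quarter") - days(1)
# [1] "2014-09-30"
```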
I've read the lubridate package manual and have queried Stack Overflow with a variety of permutations of my question but have come up with no answer to my specific problem.
What I'm trying to do is calculate age in months at time of event as the difference between date of birth and some specific event date.
As such, I imported a SAS dataset using the sas7bdat package and converted my SAS date variables (DOB and Event) to R objects using the following code:
df$DOB <- as.Date(df$DOB, origin="1960-01-01")
df$DOB1 <- ymd(df$DOB)
And same thing for the Event variable:
df$Event <- as.Date(df$Event, origin="1960-01-01")
df$Event1 <- ymd(df$Event)
However, there are some NA values for DOB. This is the code I want to use to calculate age (in months):
df$interval <- new_interval(df$DOB1,df$Event1)
df$Age1 <- df$interval %/% months(1)
I'm receiving the error:
Error in est[start + est * per < end] <- est[start + est * per < end] + : NAs are not allowed in subscripted assignments
What am I doing wrong? I've tried an if/else function but perhaps used it incorrectly.
Note: for the SAS programmers out there, I'm trying to produce the same results as the following statement:
IF DOB ne . THEN Tage=Floor(intck('month',DOB,Event)-(Day(Event)<Day(DOB)));
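A simple example using the lubridate package:
library(lubridate)
date1 <- "20160101"
date2 <- "20160501"
x <- interval(ymd(date1), ymd(date2))
n_months <- x %/% months(1)
print(n_months)
# answer: 4
For this interval, the following gives the same result (month() extracts only the months component of the period, so the two differ for spans of a year or more):
n_months <- month(as.period(x))
print(n_months)
# answer: 4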
Well so I give all credit for this answer to my talented work colleague. I neglected to include a reproducible example because whenever I would write a simple approximation of my problem, the df$Age1 <- df$interval %/% months(1) always worked! This left me totally stumped. It wasn't until I actually ran the code on my dataframe of 650,000+ birthdates and event dates that the error message...
Error in est[start + est * per < end] <- est[start + est * per < end] + :
NAs are not allowed in subscripted assignments
... would even come up! My colleague had the idea to process this calculation iteratively with the following function:
df$Age1 = rep(NA, nrow(df))
for (i in 1:nrow(df)) {
df$Age1[i]<- df$interval[i] %/% months(1)
}
df$Age1[1:15]
Using my dataframe, it became plain to see that this calculation got hung up on row 13!
> df$interval[13]
[1] 1995-10-31 19:00:00 EST--1996-05-26 20:00:00 EDT
So we aren't certain, but maybe the fact that the df$DOB[13] is 10/31 is screwing it up. This sort of problem with the lubridate package has been reported before (i.e., lubridate not being able to divide intervals by a period when one of the dates is at the end of the month):
https://github.com/hadley/lubridate/issues/235
The way we came to a solution was by using as.period and then converting it to months:
df$Age1<- as.period(df$interval)
head(df$Age1)
[1] "1y 2m 26d 0H 0M 0S" "6m 15d 23H 0M 0S"
[3] "4m 9d 23H 0M 0S" "3m 19d 23H 0M 0S"
[5] "3y 0m 25d 0H 0M 0S" "1y 1m 29d 1H 0M 0S"
df$Age1 <- df$Age1 %/% months(1)
head(df$Age1)
[1] 14 6 4 3 36 13
Here is another example of this reported issue with lubridate (1.3.3). Note that there may be different error messages depending on what else is in the dataset, and the issue seems to be dependent on the unit of measure (in my case months worked whereas years did not).
dat <- as.data.frame(list(Start = as.Date(c("1942-08-09", "1956-02-29")),
End = as.Date(c("2007-07-31", "2007-09-13"))))
int0 <- with(dat, new_interval(Start, End))
as.period(int0, unit = "years")
"Error in est[start + est * per > end] <- est[start + est * per > end] - :
NAs are not allowed in subscripted assignments"
int1 <- with(dat[1,], new_interval(Start, End))
as.period(int1, unit = "years")
[1] "64y 11m 22d 0H 0M 0S"
int2 <- with(dat[2,], new_interval(Start, End))
as.period(int2, unit = "years")
"Error in while (any(start + est * per > end)) est[start + est * per > :
missing value where TRUE/FALSE needed"
as.period(int0) %/% years(1)
[1] 64 51
as.period(int0, unit = "months")
[1] "779m 22d 0H 0M 0S" "618m 15d 0H 0M 0S"
Instead of
df$Age1 <- df$interval %/% months(1)
you can try:
df$Age1 <- NA
df$Age1[!is.na(df$DOB)] <- df$interval[!is.na(df$DOB)] %/% months(1)
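The SAS expression quoted in the question can also be reproduced without intervals at all, which sidesteps both the NA problem and the end-of-month issue. A base R sketch, assuming both inputs are Date vectors; the function name age_in_months and the sample dates are made up for illustration:

```r
age_in_months <- function(dob, event) {
  d <- as.POSIXlt(dob)
  e <- as.POSIXlt(event)
  # Calendar month boundaries crossed (the role intck('month', ...) plays in SAS)
  crossed <- 12 * (e$year - d$year) + (e$mon - d$mon)
  # Subtract one when the event's day of month precedes the birth day of month
  crossed - (e$mday < d$mday)
}

dob   <- as.Date(c("2016-01-01", NA, "1995-10-31"))
event <- as.Date(c("2016-05-01", "2016-05-01", "1996-05-26"))
age_in_months(dob, event)
# [1]  4 NA  6
```

Missing birth dates simply propagate as NA, so no separate subsetting step is needed.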
I have a time column in a database, where hours and minutes are separated by a ":". I would like to remove the ":" so that time field becomes numeric as I will use numeric time for some calculation.
Input:
X
00:00
01:15
02:30
Output:
X
0000
0115
0230
I am new to R. My apologies if this is a silly question. Greatly appreciate any help. Thank you.
> x <- c("00:00", "01:15", "02:30")
> gsub(":", "", x)
[1] "0000" "0115" "0230"
If you really want numbers you can coerce to numeric or integer
> as.numeric(gsub(":", "", x))
[1] 0 115 230
GSee's answer does what you ask. However, if you're doing arithmetic with units of time, you might think about some easier ways.
library(lubridate)
X <- hm(c("00:00", "01:15", "02:30")) # Converts to lubridate time objects
X + minutes(1)
# [1] "1M 0S" "1H 16M 0S" "2H 31M 0S"
X + weeks(2)
# [1] "14d 0H 0M 0S" "14d 1H 15M 0S" "14d 2H 30M 0S"
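If a plain number is still needed for calculations, the periods can be turned into minutes since midnight. A short sketch using period_to_seconds() from lubridate:

```r
library(lubridate)

X <- hm(c("00:00", "01:15", "02:30"))

# Periods -> seconds -> minutes since midnight
period_to_seconds(X) / 60
# [1]   0  75 150
```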
It might be more sensible to use R's facilities for time parsing to convert to date-time first. As it stands, you have no way to capture the non-decimal (base-60) character of your "time" values: 300 - 259 (that is, 3:00 minus 2:59) should be 1, not 41. This set of commands illustrates some of R's date-time and Date functions:
> X <- c('00:00', '01:15', '02:30')
> as.POSIXct(X, format="%H:%M")
[1] "2013-08-05 00:00:00 PDT" "2013-08-05 01:15:00 PDT" "2013-08-05 02:30:00 PDT"
This will give the results in differences in seconds from midnight today:
> as.numeric(as.POSIXct(X, format="%H:%M") - as.POSIXct("2013-08-05 00:00:00 PDT"))
[1] 0 4500 9000
Now try using today's date instead. Notice the offset of 7 hours (the difference is reported in hours here), which arises because as.POSIXct treats the Date as midnight GMT/UTC:
> as.numeric(as.POSIXct(X, format="%H:%M") - as.POSIXct(Sys.Date()))
[1] 7.00 8.25 9.50
> as.numeric(as.POSIXct(X, format="%H:%M") - (as.POSIXct(Sys.Date())+7*3600))
[1] 0 4500 9000
So finish off the process by shifting 7 hours (=7*3600 seconds) and then converting to minutes:
> as.numeric(as.POSIXct(X, format="%H:%M") - (as.POSIXct(Sys.Date())+7*3600))/60
[1] 0 75 150
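The same minutes-since-midnight values can be obtained without any date-time classes, which avoids the time zone offset entirely. A base R sketch, assuming every string is a well-formed "HH:MM"; parts and mins are illustrative names:

```r
X <- c("00:00", "01:15", "02:30")

# Split each "HH:MM" string into its hour and minute parts
parts <- strsplit(X, ":", fixed = TRUE)

# Minutes since midnight: hours * 60 + minutes
mins <- vapply(parts,
               function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
               numeric(1))
mins
# [1]   0  75 150
```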