Does anybody have a neater way of rounding lubridate period and duration objects to minutes instead of seconds?
For example, I have the following pipe of code:
seconds(x = 3600115) %>% as.duration() %>% as.period()
This results in: 41d 16H 1M 55S. I would like to round it so it becomes: 41d 16H 2M 0S.
Is anybody aware of a better way than:
(seconds(x = 3600115) / 60) %>% as.numeric() %>% round() %>% dminutes() %>% as.period()
This results in: 41d 16H 2M 0S
A duration object is stored numerically as a number of seconds, and a period object can conveniently be converted into seconds using period_to_seconds(), so you could use a simple function for this:
library(lubridate)
# Create period object
p <- seconds_to_period(3600115)
# Create duration object
d <- as.duration(p)
minround <- function(x) {
  stopifnot(is.period(x) || is.duration(x))
  if (is.duration(x)) {
    round(x / 60) * 60
  } else {
    seconds_to_period(round(period_to_seconds(x) / 60) * 60)
  }
}
minround(p)
# [1] "41d 16H 2M 0S"
minround(d)
# [1] "3600120s (~5.95 weeks)"
A base R option:
as.POSIXct(3600115 - 86400, origin = '1970-01-01', tz = 'UTC') %>%
round('mins') %>%
format('%jd %HH %MM 0S')
#[1] "041d 16H 02M 0S"
A few things to note:
The pipe is used only for readability.
The output is a character string, not a period object.
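If all you have is a plain number of seconds, the same rounding can be sketched with arithmetic alone. This is a minimal sketch, not a lubridate feature: the helper name round_to_minute_label is hypothetical, and the result is a character string rather than a period object.

```r
# Hypothetical helper (not part of lubridate): round a number of seconds to
# the nearest minute and format it as days / hours / minutes.
round_to_minute_label <- function(secs) {
  rounded <- round(secs / 60) * 60        # nearest whole minute, in seconds
  days  <- rounded %/% 86400              # whole days
  hours <- (rounded %% 86400) %/% 3600    # remaining whole hours
  mins  <- (rounded %% 3600) %/% 60       # remaining whole minutes
  sprintf("%dd %dH %dM 0S",
          as.integer(days), as.integer(hours), as.integer(mins))
}

label <- round_to_minute_label(3600115)
print(label)
# [1] "41d 16H 2M 0S"
```

Ties of exactly 30 seconds follow R's round(), which rounds half to even, so swap in floor(secs / 60 + 0.5) if you always want to round up.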
I have two data frames with time data every 15 minutes, but one starts precisely on time (0:00, 0:15, 0:30, 0:45, etc.) and one starts slightly off (0:03, 0:18, 0:33, 0:48, etc.). I would like to round the slightly-off one to the nearest 15-minute interval so that I can later merge the data frames, with the data for matching times in the same rows. The data is in 24-hour time and, as an example, has the format:
Time
0:00
0:15
0:30
0:45
1:00
I have tried the code below, but R returns an error:
library(lubridate)
p_data <- read.csv("Filter12.csv", header = TRUE)
p_data$Time <- round_date(p_data$Time, "15 mins")
Error in UseMethod("reclass_date", orig) :
no applicable method for 'reclass_date' applied to an object of class "character"
In addition: Warning message: All formats failed to parse. No formats found.
I then tried converting the time column from character to numeric but received the warning:
p_data$Time <- as.numeric(p_data$Time)
Warning message:
NAs introduced by coercion
I am very new to R (just started learning this week) so I apologize if this is due to a lack of common knowledge.
Here is an approach with lubridate. We first generate an arbitrary data set and then round it to 15 minutes:
library(lubridate)
x <- seq(as.POSIXct("2021-05-19 10:00"), as.POSIXct("2021-05-19 11:00"), 240)
x
round_date(x, unit="15 mins")
Edit: here is the same idea with times only. We use a fake date, append the time, round it to 15 minutes and extract the minutes only:
library(lubridate)
x <- c("0:03", "0:18", "0:33", "0:48")
format(round_date(as.POSIXct(paste("1900-01-01 ", x)), unit="15 mins"), "%M")
1) times Convert to times class, round it and then convert back to character. The hour must be less than 24 but that seems to be the case in the question.
library(chron)
x <- c("0:03", "0:18", "0:33", "0:48") # input
sub(":..$", "", round(times(paste0(x, ":00")), "00:15:00"))
## [1] "00:00" "00:15" "00:30" "00:45"
2) Base R Convert to difftime and then numeric minutes, round it and finally display in required form.
mins <- 15 * round(as.double(as.difftime(x, format = "%H:%M"), "mins") / 15)
format(as.POSIXct(60 * mins, origin = "1970-01-01", tz = "GMT"), "%H:%M") ###
## [1] "00:00" "00:15" "00:30" "00:45"
2a) A base R solution not using difftime or POSIXct is:
mins <- with(read.table(text = x, sep = ":"), 15 * round((60 * V1 + V2) / 15))
sprintf("%02d:%02d", mins %/% 60, mins %% 60)
## [1] "00:00" "00:15" "00:30" "00:45"
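Since the goal in the question is to merge the two data frames afterwards, the rounding and the merge can be sketched together in base R. This is only a sketch under assumptions: the data frames on_time and off, their columns a and b, and the helper round15 are made up for illustration, and times are assumed to be "H:MM" strings within a single day.

```r
# Made-up example data: one series on the quarter hour, one slightly off.
on_time <- data.frame(Time = c("0:00", "0:15", "0:30", "0:45"), a = 1:4)
off     <- data.frame(Time = c("0:03", "0:18", "0:33", "0:48"), b = 5:8)

# Hypothetical helper: round "H:MM" strings to the nearest 15 minutes.
round15 <- function(x) {
  hm   <- do.call(rbind, strsplit(x, ":", fixed = TRUE))  # hours / minutes
  mins <- 15 * round((60 * as.numeric(hm[, 1]) + as.numeric(hm[, 2])) / 15)
  mins <- mins %% 1440                                    # wrap 24:00 to 0:00
  sprintf("%d:%02d", as.integer(mins %/% 60), as.integer(mins %% 60))
}

off$Time <- round15(off$Time)
merged <- merge(on_time, off, by = "Time")
print(merged)
```

The helper returns the same unpadded "H:MM" form as the on-time data frame, so the character keys match exactly in merge().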
I am trying to manipulate a date inside a datetime vector depending on time of day.
Each item in the vector newmagic looks something like this "2020-03-05 02:03:54 UTC"
For all the items that have a time between 19:00 and 23:59 I want to go back one day.
I tried writing an if statement:
if(hour(newmagic)>=19&hour(newmagic)<=23){
date(newmagic)<-date(newmagic)-1
}
giving me no output but
Warning message: In if (hour(newmagic) >= 19 & hour(newmagic) <= 23) {
: the condition has length > 1 and only the first element will be
used
When I limit the data to the condition and simply execute date() - 1:
newmagic[hour(newmagic)>=19&hour(newmagic)<=23&!is.na(newmagic)] <- date(newmagic[hour(newmagic)>=19&hour(newmagic)<=23&!is.na(newmagic)])-1
The output does remove 1 day but also sets the time to 0
Original:
"2020-03-07 20:58:00 UTC"
After date()-1
"2020-03-06 00:00:00 UTC"
I don't really know how to go on.
How can I adapt the if statement so that it will actually do what I intend to?
How can I rewrite the limitation in the second approach so that the time itself will stay intact?
Thank you for the help
You can try this on your original data set. I have used the lubridate and tidyverse packages. First I split the date-time column into a date and a time. Then I converted those variables into date and time formats and applied an ifelse condition.
The code and the output are as follows:
library(tidyverse)
library(lubridate)
ab <- data.frame(ymd_hms(c("2000-11-01 2:23:15", "2028-03-25 20:47:51",
"1990-05-14 22:45:30")))
colnames(ab) <- paste(c("Date_time"))
ab <- ab %>% separate(Date_time, into = c("Date", "Time"),
sep = " ", remove = FALSE)
ab$Date <- as.Date(ab$Date)
ab$Time <- hms(ab$Time)
# ifelse() drops the Date class, so convert the numeric result back to a Date
ab$date_condition <- as.Date(ifelse(hour(ab$Time) %in% 19:23,
                                    ab$Date - 1,
                                    ab$Date),
                             origin = "1970-01-01")
ab
#             Date_time       Date        Time date_condition
# 1 2000-11-01 02:23:15 2000-11-01  2H 23M 15S     2000-11-01
# 2 2028-03-25 20:47:51 2028-03-25 20H 47M 51S     2028-03-24
# 3 1990-05-14 22:45:30 1990-05-14 22H 45M 30S     1990-05-13
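To keep the time of day intact, as the question asks, one option is to subtract a day from the date-time values themselves instead of from date(). The following is a minimal base R sketch; the vector newmagic is rebuilt here with made-up values and is assumed to be POSIXct in UTC, where one day is always 86400 seconds.

```r
# Made-up stand-in for the vector in the question, including an NA.
newmagic <- as.POSIXct(c("2020-03-05 02:03:54", "2020-03-07 20:58:00", NA),
                       tz = "UTC")

# Vectorised condition: TRUE where the hour is 19..23, FALSE for NA.
hr   <- as.integer(format(newmagic, "%H"))
late <- !is.na(hr) & hr >= 19

# Subtracting seconds from POSIXct shifts the day and leaves the time alone.
newmagic[late] <- newmagic[late] - 86400
print(newmagic)
# [1] "2020-03-05 02:03:54 UTC" "2020-03-06 20:58:00 UTC" NA
```

A logical index replaces the if statement because if() only inspects a single value, which is what the warning in the question is reporting.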
I have a set of times in milliseconds that I want to convert to hh:mm. An example dataset would be:
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
I get this with a manual calculation, but it gives me decimal hours rather than the correct value for the minutes.
> data/1000/60/60
[1] 1.649028 1.510694 1.068167 2.066639 1.522278 1.917750
I tried this
format(as.POSIXct(Sys.Date())+data, "%H:%M")
[1] "12:01" "17:41" "07:10" "21:38" "05:16" "16:45"
but that is not even close. Any thoughts on that?
Thanks!
hrs = data/(60 * 60 * 1000)
mins = (hrs %% 1) * 60
secs = (mins %% 1) * 60
paste(trunc(hrs), trunc(mins), round(secs, 2), sep = ":")
#[1] "1:38:56.5" "1:30:38.5" "1:4:5.4" "2:3:59.9" "1:31:20.2" "1:55:3.9"
Also,
library(lubridate)
seconds_to_period(data/1000)
#[1] "1H 38M 56.5S" "1H 30M 38.5S" "1H 4M 5.40000000000009S"
#[4] "2H 3M 59.8999999999996S" "1H 31M 20.1999999999998S" "1H 55M 3.89999999999964S"
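If the target is a zero-padded hh:mm string, integer division avoids both time zones and floating-point seconds. This is a small sketch, assuming the values are elapsed milliseconds and that seconds should be truncated rather than rounded.

```r
data <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)

total_min <- data %/% 60000   # whole minutes; leftover seconds are dropped
hhmm <- sprintf("%02d:%02d",
                as.integer(total_min %/% 60),   # hours
                as.integer(total_min %% 60))    # minutes within the hour
print(hhmm)
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
```

Hours are not wrapped at 24 here, so a value above one day would print as, for example, "25:10".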
The zero point we can get by doing:
strftime(as.POSIXlt.numeric(0, format="%OS", origin="1970-01-01") - 7200, format="%R")
# [1] "00:00"
Accordingly:
t.adj <- 0
res <- strftime(as.POSIXlt.numeric(v/1000, format="%OS", origin="1970-01-01") - t.adj*3600,
format="%R", tz="GMT")
res
# [1] "01:38" "01:30" "01:04" "02:03" "01:31" "01:55"
The date doesn't matter, since the result is a character vector:
class(res)
# [1] "character"
Note that this solution might depend on your Sys.getlocale("LC_TIME"). The solution above includes an optional hour adjustment, t.adj*; in my personal locale it is set to zero to yield the right values.
Data
v <- c(5936500, 5438500, 3845400, 7439900, 5480200, 6903900)
*To automate the localization you may want to look into the answers to this question.
The result of the following instruction:
as.difftime("5 11:04:36", "%d %H:%M:%S", units =("mins"))
is
Time difference of -7975.4 mins
It seems that this function is calculating the time difference between Sys.time() and the given value.
I actually need an object to store a time span value extracted from a string (). Am I using the wrong function or is it not the right way of using it?
Look into lubridate
I think you want something like Duration-class, Period-class, or Timespan-class
# Duration
dur <- duration(hours = 10, minutes = 6)
# [1] "36360s (~10.1 hours)"
# Period
per <- period(hours = 10, minutes = 6)
# [1] "10H 6M 0S"
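If you would rather stay in base R, the string from the question can also be split by hand and turned into a difftime. This is a sketch that assumes the input always has the form "D HH:MM:SS"; the variable names are illustrative.

```r
span <- "5 11:04:36"

# Split on the space and the colons: days, hours, minutes, seconds.
parts <- as.numeric(strsplit(span, "[ :]")[[1]])
secs  <- sum(parts * c(86400, 3600, 60, 1))   # total length in seconds

d <- as.difftime(secs, units = "secs")
units(d) <- "mins"                            # change the display unit
print(d)
# Time difference of 7864.6 mins
```

Passing a plain number to as.difftime() avoids the format-based parsing, so no reference date is involved.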
I've read the lubridate package manual and have queried Stack Overflow with a variety of permutations of my question but have come up with no answer to my specific problem.
What I'm trying to do is calculate age in months at time of event as the difference between date of birth and some specific event date.
As such, I imported a SAS dataset using the sas7bdat package and converted my SAS date variables (DOB and Event) to R objects using the following code:
df$DOB <- as.Date(df$DOB, origin="1960-01-01")
df$DOB1 <- ymd(df$DOB)
And same thing for the Event variable:
df$Event <- as.Date(df$Event, origin="1960-01-01")
df$Event1 <- ymd(df$Event)
However, there are some NA values for DOB. So, for the following code which I want to use to calculate age (in months).
df$interval <- new_interval(df$DOB1,df$Event1)
df$Age1 <- df$interval %/% months(1)
I'm receiving the error:
Error in est[start + est * per < end] <- est[start + est * per < end] + : NAs are not allowed in subscripted assignments
What am I doing wrong? I've tried an if/else function but perhaps used it incorrectly.
(Note: For the SAS programmers out there, I'm trying to produce the same results as the following function:
IF DOB ne . THEN Tage=Floor(intck('month',DOB,Event)-(Day(Event)<Day(DOB)));)
Simple example using lubridate package
library(lubridate)
date1='20160101'
date2='20160501'
x=interval(ymd(date1),ymd(date2))
x= x %/% months(1)
print(x)
# answer : 4
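or, for intervals shorter than a year, the same result comes from the months component of the period (the interval has to be rebuilt because x was overwritten above):
x=interval(ymd(date1),ymd(date2))
x=month(as.period(x))
print(x)
# answer : 4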
Well so I give all credit for this answer to my talented work colleague. I neglected to include a reproducible example because whenever I would write a simple approximation of my problem, the df$Age1 <- df$interval %/% months(1) always worked! This left me totally stumped. It wasn't until I actually ran the code on my dataframe of 650,000+ birthdates and event dates that the error message...
Error in est[start + est * per < end] <- est[start + est * per < end] + :
NAs are not allowed in subscripted assignments
... would even come up! My colleague had the idea to process this calculation iteratively with the following loop:
df$Age1 = rep(NA, nrow(df))
for (i in seq_len(nrow(df))) {
  df$Age1[i] <- df$interval[i] %/% months(1)
}
df$Age1[1:15]
Using my dataframe, it became plain to see that this calculation got hung up on row 13!
> df$interval[13]
[1] 1995-10-31 19:00:00 EST--1996-05-26 20:00:00 EDT
So we aren't certain, but maybe the fact that the df$DOB[13] is 10/31 is screwing it up. This sort of problem with the lubridate package has been reported before (i.e., lubridate not being able to divide intervals by a period when one of the dates is at the end of the month):
https://github.com/hadley/lubridate/issues/235
The way we came to a solution was by using as.period and then converting it to months:
df$Age1<- as.period(df$interval)
head(df$Age1)
[1] "1y 2m 26d 0H 0M 0S" "6m 15d 23H 0M 0S"
[3] "4m 9d 23H 0M 0S" "3m 19d 23H 0M 0S"
[5] "3y 0m 25d 0H 0M 0S" "1y 1m 29d 1H 0M 0S"
df$Age1 <- df$Age1 %/% months(1)
head(df$Age1)
[1] 14 6 4 3 36 13
Here is another example of this reported issue with lubridate (1.3.3). Note that there may be different error messages depending on what else is in the dataset, and the issue seems to be dependent on the unit of measure (in my case months worked whereas years did not).
dat <- as.data.frame(list(Start = as.Date(c("1942-08-09", "1956-02-29")),
End = as.Date(c("2007-07-31", "2007-09-13"))))
int0 <- with(dat, new_interval(Start, End))
as.period(int0, unit = "years")
"Error in est[start + est * per > end] <- est[start + est * per > end] - :
NAs are not allowed in subscripted assignments"
int1 <- with(dat[1,], new_interval(Start, End))
as.period(int1, unit = "years")
[1] "64y 11m 22d 0H 0M 0S"
int2 <- with(dat[2,], new_interval(Start, End))
as.period(int2, unit = "years")
"Error in while (any(start + est * per > end)) est[start + est * per > :
missing value where TRUE/FALSE needed"
as.period(int0) %/% years(1)
[1] 64 51
as.period(int0, unit = "months")
[1] "779m 22d 0H 0M 0S" "618m 15d 0H 0M 0S"
Instead of
df$Age1 <- df$interval %/% months(1)
you can try:
df$Age1 <- NA
df$Age1[!is.na(df$DOB)] <- df$interval[!is.na(df$DOB)] %/% months(1)
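For comparison, the SAS expression from the question can also be mirrored with plain calendar arithmetic, which passes NA through without any interval division. This is a sketch under assumptions: the function name age_in_months and the sample dates are made up, and intck('month', ...) is assumed to count calendar-month boundaries.

```r
# Hypothetical helper mirroring intck('month', DOB, Event) - (Day(Event) < Day(DOB)).
age_in_months <- function(dob, event) {
  b <- as.POSIXlt(dob)
  e <- as.POSIXlt(event)
  # month boundaries crossed, minus one if the day of month is not yet reached
  (e$year - b$year) * 12 + (e$mon - b$mon) - (e$mday < b$mday)
}

dob   <- as.Date(c("1995-10-31", NA, "2016-01-01"))
event <- as.Date(c("1996-05-26", "2000-01-01", "2016-05-01"))
ages  <- age_in_months(dob, event)
print(ages)
# [1]  6 NA  4
```

A missing DOB simply yields NA for that row, so no subsetting or loop is needed.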