In the following data frame the 'time' column is character in the format hour:minute:second
id <- c(1, 2, 3, 4)
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
df <- data.frame(id, time)
How can I convert the 'time' column to a dedicated time class, so that I can perform arithmetic calculations on it?
Use the function chron in package chron:
time<-c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
library(chron)
x <- chron(times=time)
x
[1] 00:00:01 01:02:00 09:30:01 14:15:25
Do some useful things, like calculating the difference between successive elements:
diff(x)
[1] 01:01:59 08:28:01 04:45:24
chron objects store times internally as fractions of a day. Thus 1 second is equivalent to 1/(60*60*24), or 1/86400, i.e. about 1.157407e-05.
So, to add one second to each time, one simple option is this:
x + 1/86400
[1] 00:00:02 01:02:01 09:30:02 14:15:26
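As a minimal sketch of the day-fraction representation described above (assuming the chron package is installed; the names `secs` and `later` are only illustrative), the stored values can be turned into seconds by multiplying by 86400, and whole seconds can be added by dividing by the same constant:

```r
library(chron)

time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
x <- chron(times = time)

# The underlying numbers are fractions of a day, so scale by 86400
# (seconds per day) to get seconds since midnight. round() guards
# against floating-point error in the stored fractions.
secs <- round(as.numeric(x) * 86400)

# Adding 90 seconds means adding 90/86400 of a day.
later <- x + 90 / 86400
```

Working in seconds and converting back only for display keeps the arithmetic easy to read.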
Using base R you could convert it to an object of class POSIXct, but this does add a date to the time:
id<-c(1,2,3,4)
time<-c("00:00:01","01:02:00","09:30:01","14:15:25")
df<-data.frame(id,time,stringsAsFactors=FALSE)
as.POSIXct(df$time,format="%H:%M:%S")
[1] "2012-08-20 00:00:01 CEST" "2012-08-20 01:02:00 CEST"
[3] "2012-08-20 09:30:01 CEST" "2012-08-20 14:15:25 CEST"
But that does allow you to perform arithmetic calculations on them.
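If the attached date is a nuisance, one possible workaround (a sketch, not the only approach) is to pin every value to a fixed dummy date and an explicit time zone, so results do not depend on today's date or the local zone. The date 1970-01-01 and the variable names below are arbitrary choices for illustration:

```r
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")

# Attach a fixed dummy date and UTC so that neither "today" nor
# daylight-saving shifts can influence the result.
tm <- as.POSIXct(paste("1970-01-01", time), tz = "UTC")

# Differences between successive times, in seconds.
gaps <- as.numeric(diff(tm), units = "secs")

# Adding a number adds seconds; format() drops the dummy date again.
plus_one_min <- format(tm + 60, "%H:%M:%S")
```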
Another possible alternative could be:
time <- c("00:00:01","01:02:00","09:30:01","14:15:25")
converted.time <- as.difftime(time, units = "mins") #"difftime" class
secss <- as.numeric(converted.time, units = "secs")
hourss <- as.numeric(converted.time, units = "hours")
dayss <- as.numeric(converted.time, units="days")
Or even:
w <- strptime(x = time, format = "%H:%M:%S") #"POSIXlt" "POSIXt" class
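Because a difftime is just a number with units attached, ordinary arithmetic works once it is converted to numeric. A small sketch, assuming the strings are always in HH:MM:SS form (the format is passed explicitly here rather than relying on the default):

```r
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")

# Parse as elapsed time since midnight, measured in seconds.
d <- as.difftime(time, format = "%H:%M:%S", units = "secs")

secs <- as.numeric(d, units = "secs")  # plain numbers: seconds since midnight
gaps <- diff(secs)                      # seconds between successive times
total_hours <- sum(secs) / 3600         # total duration expressed in hours
```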
Using the ITime class in data.table package:
ITime is a time-of-day class stored as the integer number of seconds in the day.
library(data.table)
(it <- as.ITime(time))
# [1] "00:00:01" "01:02:00" "09:30:01" "14:15:25"
it + 10
# [1] "00:00:11" "01:02:10" "09:30:11" "14:15:35"
diff(it)
# [1] "01:01:59" "08:28:01" "04:45:24"
lubridate allows good flexibility in the time format:
library(lubridate)
time_hms_1<-c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
hms(time_hms_1)
#> [1] "1S" "1H 2M 0S" "9H 30M 1S" "14H 15M 25S"
time_hms_2<-c("0:00:01", "1:02:00", "9:30:01", "14:15:25")
hms(time_hms_2)
#> [1] "1S" "1H 2M 0S" "9H 30M 1S" "14H 15M 25S"
time_hm_1<-c("00:00", "01:02", "09:30", "14:15")
hm(time_hm_1)
#> [1] "0S" "1H 2M 0S" "9H 30M 0S" "14H 15M 0S"
time_hm_2<-c("0:00", "1:02", "9:30", "14:15")
hm(time_hm_2)
#> [1] "0S" "1H 2M 0S" "9H 30M 0S" "14H 15M 0S"
Created on 2020-07-03 by the reprex package (v0.3.0)
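Periods are convenient for display, but for plain arithmetic it can help to turn them into seconds. A short sketch, assuming lubridate is installed, using its period_to_seconds() and seconds_to_period() helpers (the variable names are illustrative):

```r
library(lubridate)

p <- hms(c("00:00:01", "01:02:00", "09:30:01", "14:15:25"))

# Collapse each period to a single number of seconds since midnight.
secs <- period_to_seconds(p)

# Differences between successive times, converted back to periods for display.
gaps <- seconds_to_period(diff(secs))
```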
Yet another alternative using the hms package.
id <- c(1, 2, 3, 4)
time <- c("00:00:01", "01:02:00", "09:30:01", "14:15:25")
df <- data.frame(id, time, stringsAsFactors = FALSE)
Convert column time to class hms
# install.packages("hms")
library(hms)
df$time <- as_hms(df$time)  # as_hms() replaces as.hms(), which is deprecated in recent hms versions
Perform arithmetic calculations
diff(df$time)
#01:01:59
#08:28:01
#04:45:24
Related
I am trying to determine the absolute number of days between two dates using lubridate.
library(lubridate)
dates <- data.frame(
time1 = date(c("2011-01-01", "2012-01-01", "2013-01-01")),
time2 = date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)
dates$diff <- days(dates$time1 - dates$time2)
dates$diff
[1] "-1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
abs(dates$diff)
[1] "-1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
I would have expected all of the values to be positive. Furthermore, min and max do not return the smallest and largest values.
min(dates$diff)
[1] 0
max(dates$diff)
[1] 0
Why do these functions behave differently on lubridate periods than on numeric/integer objects?
The simple answer is that period class objects from lubridate are not simple numeric objects. They are S4 objects. Their main data member is the numeric vector of seconds, with the minutes, hours, days and years all stored as attributes. When you try to apply mathematical operators on period objects, the operators don't apply to the attributes, only to the main numerical vector, which is the seconds part.
We can see this if we create a period of -1 seconds:
library(lubridate)
p <- as.period(diff(as.POSIXct(c("2020-09-24 21:00:01", "2020-09-24 21:00:00"))))
p
#> [1] "-1S"
abs(p)
#> [1] "1S"
Now let us examine our object's attributes:
attributes(p)
#> $year
#> [1] 0
#>
#> $month
#> [1] 0
#>
#> $day
#> [1] 0
#>
#> $hour
#> [1] 0
#>
#> $minute
#> [1] 0
#>
#> $class
#> [1] "Period"
#> attr(,"package")
#> [1] "lubridate"
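The same split between the seconds vector and the other components can be seen by reading the slots directly. A minimal sketch, assuming lubridate is installed: a period of minus one day carries its sign in the day slot, while the underlying data vector of seconds is zero.

```r
library(lubridate)

p <- days(-1)

secs_part <- p@.Data  # the seconds vector: 0
day_part  <- p@day    # the day slot: -1

# Taking the absolute value of the seconds alone cannot remove the sign,
# because the sign lives in the day slot.
base::abs(secs_part)
```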
With S4 objects, you need to define what functions like abs and min will do by writing methods for the "Math" and "Summary" group generics. However, these have not been defined for class "Period", so the functions are instead called on the main data vector (which is just the vector of seconds). The Ops group generic has been defined, though, which is why you can do things like dates$diff / 2 and get a sensible answer.
Why have they not been defined? That's one for the authors to answer. In the meantime, you could get the functionality you want by making abs an S3 generic and writing an abs.Period method, like this:
abs <- function(x) UseMethod("abs")
abs.default <- function(x) base::abs(x)
abs.Period <- function(out)
{
new("Period", abs(out$second),
year = abs(out$year),
month = abs(out$month),
day = abs(out$day), hour = abs(out$hour),
minute = abs(out$minute))
}
Which would give your expected behaviour:
dates <- data.frame(
time1 = date(c("2011-01-01", "2012-01-01", "2013-01-01")),
time2 = date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)
dates$diff <- days(dates$time1 - dates$time2)
abs(dates$diff)
#> [1] "1d 0H 0M 0S" "1d 0H 0M 0S" "0S"
However, this is probably not a great idea. It is best to work with difftimes for arithmetic and convert to periods only if needed.
I hope that clarifies things a bit.
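The difftime-based approach suggested above might look like the following sketch in base R. It uses as.Date rather than lubridate's date(), and the column name diff_days is an illustrative choice:

```r
dates <- data.frame(
  time1 = as.Date(c("2011-01-01", "2012-01-01", "2013-01-01")),
  time2 = as.Date(c("2011-01-02", "2011-12-31", "2013-01-01"))
)

# difftime yields one plain number per row, so abs(), min() and max()
# behave as they do on ordinary numeric vectors.
dates$diff_days <- as.numeric(difftime(dates$time1, dates$time2, units = "days"))

abs(dates$diff_days)  # 1 1 0
min(dates$diff_days)  # -1
max(dates$diff_days)  # 1
```

If a Period is needed for display afterwards, the numeric result can be converted at the very end.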
I have a column of "times" in string format in hour and minute (no seconds)
time ...
<char>
18:40
12:20
23:59
2:15
...
Is there a way to convert these into times and then round them down such that my data will look like this
time ...
<time>
18:00
12:00
23:00
2:00
...
The POSIXct class needs both a date and a time, so if no date is provided it takes today's date by default. You can then use floor_date to round down to the nearest hour.
library(lubridate)
floor_date(as.POSIXct(df$time, 'UTC', format = '%H:%M'), 'hour')
#[1] "2020-07-06 18:00:00 UTC" "2020-07-06 12:00:00 UTC" "2020-07-06 23:00:00 UTC"
#[4] "2020-07-06 02:00:00 UTC"
You can then use format to keep part that you are interested in.
format(floor_date(as.POSIXct(df$time, 'UTC', format = '%H:%M'), 'hour'), '%H:%M')
#[1] "18:00" "12:00" "23:00" "02:00"
A solution without date-time manipulation, using regex:
sub(':.*', ':00', df$time)
#[1] "18:00" "12:00" "23:00" "2:00"
However, note that manipulating dates and times with regex is probably not the best option.
data
df <- structure(list(time = c("18:40", "12:20", "23:59", "2:15")),
class = "data.frame", row.names = c(NA, -4L))
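For completeness, the same flooring can be sketched in base R without lubridate, using trunc() on the parsed date-times. This is an alternative to floor_date, not what the answer above uses, and the names `parsed`, `floored` and `result` are illustrative:

```r
df <- data.frame(time = c("18:40", "12:20", "23:59", "2:15"),
                 stringsAsFactors = FALSE)

# Parse as date-times (today's date is filled in), then drop minutes and seconds.
parsed  <- as.POSIXct(df$time, format = "%H:%M", tz = "UTC")
floored <- trunc(parsed, units = "hours")

# Keep only the time-of-day part.
result <- format(floored, "%H:%M")
result
# "18:00" "12:00" "23:00" "02:00"
```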
Maybe Period class in lubridate is what you need:
library(lubridate)
Parse periods with hour and minute
hm(df$time)
# [1] "18H 40M 0S" "12H 20M 0S" "23H 59M 0S" "2H 15M 0S"
Extract hours component
hour(hm(df$time))
# [1] 18 12 23 2
Create a new period object
hours(hour(hm(df$time)))
# [1] "18H 0M 0S" "12H 0M 0S" "23H 0M 0S" "2H 0M 0S"
What I'm trying to do is get lubridate to make the minute portion of the conversion always have a length of 3, so that the values are of uniform length for further processing. E.g. the time 12:00 should be converted to 12H 00M 0S instead of the default output of 12H 0M 0S. I can't see anything in the help for hm(), so it may be necessary to use something outside of the lubridate package.
Below is some sample code of this idea:
library (lubridate)
df <- data.frame("Time" = c("13:04", "13:55", "8:00", "8:45"))
df$Time <- hm(df$Time)
# Desired output
# 13H 04M 0S
# 13H 55M 0S
# 8H 00M 0S
# 8H 45M 0S
This code will get your desired output:
library (lubridate)
df <- data.frame("Time" = c("13:04", "13:55", "8:00", "8:45"))
paste0(hour(hm(df$Time)), "H ", sprintf("%02d", minute(hm(df$Time))), "M 0S")
"13H 04M 0S" "13H 55M 0S" "8H 00M 0S" "8H 45M 0S"
However, I don't know what kind of "further processing" you aim to do, but this format won't allow for subsequent time/numeric computations.
hour(hm(df$Time)) #gives the hours
minute(hm(df$Time)) #gives the minutes
sprintf("%02d", ...argument2...) # forces the 2nd argument call (here the minutes) of that function to consist of 2 characters, which adds the zeroes when needed
paste0() #pastes the hours and minutes together as 1 string, with the string "H " inbetween and the string "M 0S" at the end
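If lubridate is not otherwise needed, the same padded strings can be built from the raw text. A sketch in base R, assuming every input has the form H:MM or HH:MM (the names `times`, `h`, `m` and `padded` are illustrative):

```r
times <- c("13:04", "13:55", "8:00", "8:45")

# Split each string at the colon and pull out hours and minutes as integers.
parts <- strsplit(times, ":", fixed = TRUE)
h <- vapply(parts, function(p) as.integer(p[1]), integer(1))
m <- vapply(parts, function(p) as.integer(p[2]), integer(1))

# %02d pads the minutes with a leading zero to two digits.
padded <- sprintf("%dH %02dM 0S", h, m)
padded
# "13H 04M 0S" "13H 55M 0S" "8H 00M 0S" "8H 45M 0S"
```

As with the paste0() version, the result is a character vector, so it is for display only.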
I have a time column in a database, where hours and minutes are separated by a ":". I would like to remove the ":" so that time field becomes numeric as I will use numeric time for some calculation.
Input:
X
00:00
01:15
02:30
Output:
X
0000
0115
0230
I am new to R. My apologies if this is a silly question. Greatly appreciate any help. Thank you.
> x <- c("00:00", "01:15", "02:30")
> gsub(":", "", x)
[1] "0000" "0115" "0230"
If you really want numbers you can coerce to numeric or integer
> as.numeric(gsub(":", "", x))
[1] 0 115 230
GSee's answer does what you ask. However, if you're doing arithmetic with units of time, you might think about some easier ways.
library(lubridate)
X <- hm(c("00:00", "01:15", "02:30")) # Converts to lubridate time objects
X + minutes(1)
# [1] "1M 0S" "1H 16M 0S" "2H 31M 0S"
X + weeks(2)
# [1] "14d 0H 0M 0S" "14d 1H 15M 0S" "14d 2H 30M 0S"
It might be more sensible to use R's time-parsing facilities and convert to date-time first. As it stands, you have no way to capture the non-decimal character of your "time" values: 300 - 259 (that is, 3:00 minus 2:59) should be 1, not 41. This set of commands illustrates some of R's date-time and Date functions:
> X <- c('00:00', '01:15', '02:30')
> as.POSIXct(X, format="%H:%M")
[1] "2013-08-05 00:00:00 PDT" "2013-08-05 01:15:00 PDT" "2013-08-05 02:30:00 PDT"
This will give the results in differences in seconds from midnight today:
> as.numeric(as.POSIXct(X, format="%H:%M") - as.POSIXct("2013-08-05 00:00:00 PDT"))
[1] 0 4500 9000
Now try to use today's date, and notice that there is an offset of 7 hours (the difference is now reported in hours), because as.POSIXct treats Sys.Date() as midnight GMT/UTC:
> as.numeric(as.POSIXct(X, format="%H:%M") - as.POSIXct(Sys.Date()))
[1] 7.00 8.25 9.50
> as.numeric(as.POSIXct(X, format="%H:%M") - (as.POSIXct(Sys.Date())+7*3600))
[1] 0 4500 9000
So finish off the process by shifting 7 hours (=7*3600 seconds) and then converting to minutes:
> as.numeric(as.POSIXct(X, format="%H:%M") - (as.POSIXct(Sys.Date())+7*3600))/60
[1] 0 75 150
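A way to get minutes since midnight without any time-zone offset is to parse the strings by hand. This is a sketch in base R, assuming every input has the form HH:MM; the helper name to_minutes is hypothetical:

```r
# Convert "HH:MM" strings to whole minutes since midnight.
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.integer(p[1]) * 60L + as.integer(p[2]),
         integer(1))
}

X <- c("00:00", "01:15", "02:30")
mins <- to_minutes(X)
mins
# 0 75 150

# The 3:00 minus 2:59 case now gives 1, not 41.
to_minutes("03:00") - to_minutes("02:59")
# 1
```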