Converting to POSIXct a 24h+ hour - r

I can convert to POSIXct most of the time like for instance:
as.POSIXct( "20:16:32", format = "%H:%M:%S" )
[1] "2017-06-23 20:16:32 EDT"
But once the time goes beyond 24h, it fails:
as.POSIXct( "24:16:32", format = "%H:%M:%S" )
[1] NA
Which makes some sense, as 24:16:32 should rather be read as 00:16:32.
Such 24+ hour notations are, however, widespread in public transportation schedules. I could of course replace all "24:" by "00:", but I am sure there is a more elegant way out.

Read the time string into a data frame dd and set next_day to 1 if the hour is 24 or more, or 0 if not. Subtract 24 from the hour if next_day is 1 and add 1 day's worth of seconds to the result. Given that today is June 23, 2017, this works for hours between 0 and 47.
x <- "24:16:32" # test input
dd <- read.table(text = x, sep = ":", col.names = c("hh", "mm", "ss"))
next_day <- dd$hh >= 24
s <- sprintf("%s %0d:%0d:%0d", Sys.Date(), dd$hh - 24 * next_day, dd$mm, dd$ss)
as.POSIXct(s) + next_day * 24 * 60 * 60
## "2017-06-24 00:16:32 EDT"
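For a whole vector of such times, one possible variation is to skip the hour adjustment and simply add the elapsed seconds to midnight of the service day. This is only a sketch: parse_service_time and its service_date argument are made-up names, and UTC is assumed so that daylight saving changes cannot shift the result.

```r
# Hypothetical helper: "HH:MM:SS" strings (hours may be 24 or more) -> POSIXct.
# service_date is the day the schedule belongs to; tz = "UTC" is an assumption.
parse_service_time <- function(x, service_date, tz = "UTC") {
  # Split each string into hour, minute, second and build a numeric matrix
  parts <- do.call(rbind, strsplit(x, ":", fixed = TRUE))
  storage.mode(parts) <- "numeric"
  # Seconds elapsed since midnight of the service day
  secs <- parts[, 1] * 3600 + parts[, 2] * 60 + parts[, 3]
  as.POSIXct(service_date, tz = tz) + secs
}

parse_service_time(c("20:16:32", "24:16:32"), "2017-06-23")
# [1] "2017-06-23 20:16:32 UTC" "2017-06-24 00:16:32 UTC"
```

Because the hours are never passed through "%H", there is no upper limit of 47 here; any number of hours just rolls over into later days.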

Related

Parse time data with subseconds

I have some time data
01:09:00
00:14:00
00:00:00
11:47:00
10:34:00
08:15:00
The data are measured in %M:%S:00 (so the first numbers are the minutes and the second numbers are the seconds). I would like to convert this into a total number of seconds. This is easy to do with lubridate, but R keeps thinking the format is %H:%M:%S.
Can lubridate calculate the total number of seconds elapsed in the format my data are in? If not, what is the best way to transform the data into an appropriate format?
I've thought about converting to character and just splicing out the minutes and seconds.
library(lubridate)
foo = function(x){
hms(sapply(strsplit(x, ":"), function(xx) paste("01", xx[1], xx[2], sep = ":")))
}
a = "01:09:00"
b = "00:14:00"
foo(a) - foo(b)
#[1] "1M -5S"
#OR
as.period(foo(a) - foo(b), unit = "secs")
#[1] "55S"
Maybe the following will do it.
NumSeconds <- function(x){
f <- function(y)
sum(sapply(strsplit(y, split=":"), as.numeric) * c(60, 1, 0))
unname(sapply(x, f))
}
x <- scan(what = "character", text = "
01:09:00
00:14:00
00:00:00
11:47:00
10:34:00
08:15:00")
NumSeconds(x)
[1] 69 14 0 707 634 495
You may use data.table::as.ITime and specify format as "%M:%S"*:
x <- c("01:09:00", "10:34:00")
as.integer(as.ITime(x, format = "%M:%S"))
# [1] 69 634
*The format argument is passed to strptime and...
Each input string is processed as far as necessary for the format specified: any trailing characters are ignored.
[...]
Note that %S does not read fractional parts on output.
Or, most likely faster, substr:
as.integer(substr(x, 1, 2)) * 60 + as.integer(substr(x, 4, 5))
# [1] 69 634
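If the minutes field is not always two digits wide, the fixed positions used by substr no longer line up. A minimal base R sketch that splits on the colon instead (mmss_to_seconds is a made-up name, and the input is assumed to be "minutes:seconds:00" as in the question):

```r
# Hypothetical helper: "MM:SS:00" strings -> total seconds.
# The third field is ignored, as in the question's data.
mmss_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts,
         function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]),
         numeric(1))
}

mmss_to_seconds(c("01:09:00", "00:14:00", "11:47:00", "120:05:00"))
# [1] 69 14 707 7205
```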

I want to convert a number to time in R

Say I have the number 1234 and need to convert it to 12:34, i.e. 12:34 pm, and eventually convert that to minutes of the day counted from 0000.
A bit of integer division and modulo should work:
x <- c(1234,830)
(x %/% 100) * 60 + x %% 100
#[1] 754 510
If you absolutely need a time representation first:
tmp <- as.POSIXct(sprintf("%04d", x), format="%H%M")
tmp - trunc(tmp, "day")
#Time differences in mins
#[1] 754 510
We can do this with sub and times from chron
library(chron)
times(sub("(.{2})", "\\1:", sprintf("%04d:00", x)))
#[1] 12:34:00 08:30:00
If we need to convert to 'minute' then
library(lubridate)
minute(as.period(hms(sub("(.{2})", "\\1:", sprintf("%04d:00", x))), unit = "minute"))
#[1] 754 510
data
x <- c(1234,830)
Assuming your time integer is based on a 24-hour format (which it should be, otherwise you can't distinguish between am and pm):
time <- 1234
time_converted <- sub("(\\d+)(\\d{2})", "\\1:\\2", time)
> time_converted
[1] "12:34"
minutes <- as.POSIXlt(time_converted, format="%H:%M")$hour *60 + as.POSIXlt(time_converted, format="%H:%M")$min
> minutes
[1] 754
You could use:
library(lubridate)
x <- as.POSIXct(x = "1234", format = "%H%M", tz = "UTC")
minute(x) + hour(x) * 60
Result:
[1] 754

How do I assign a value in R if within a certain range of time?

I have a large data set that collects multiple data points each day from people over multiple days. My R dataset has participants' responses and the timestamp for their response. I want to recode the timestamp to reflect which order prompt they responded to. So basically, I want to assign a value to the timestamp based on a range of time. So if on Monday, a response falls between 10:00 and 10:30, I want the value to be 1. If a response falls between 12:15 and 12:45, I want the value to be 2. If a response falls between 2:20 and 2:50, I want the value to be 3.
BUT I need that code to work only for Monday's data. For Tuesday's data, the timestamp ranges changes. For example, if a Tuesday response falls between 9:10 and 9:40, that value should be 1. And so on.
I can't for the life of me figure out how to do this with an if else statement. When I write a time into R, it thinks I'm writing code for a series of values (10 through 30) rather than a time (10:30).
Example of what I have:
Example of what I want: (see the new Prompt column)
So for 10/11/15 I want Prompt 1 to fall between 11:15:00 and 11:45:00, but for 11/11/15 I want Prompt 1 to be different--between 12:00:00 and 12:30:00
If you want to work with times and dates, the POSIXlt class is helpful. If your timestamps are
stored as strings, the first step is to convert them into POSIXlt. You can use "strptime" for this, e.g.
> t <- strptime("2015-01-01 12:18",format="%Y-%m-%d %H:%M")
> t
[1] "2015-01-01 12:18:00 CET"
> class(t)
[1] "POSIXlt" "POSIXt"
>
The following function "timerange" assigns a time range number to such a POSIXlt object:
R <- list( Sun = list(),
Mon = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),
Tue = list( c( "9:10", "9:40"), c("11:00","11:30"), c("13:15","13:40") ),
Wed = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),
Thu = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),
Fri = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ),
Sat = list( c("10:00","10:30"), c("12:15","12:40"), c("13:15","13:40") ) )
timerange <- function(t)
{
s <- unlist(strsplit(strftime(t,format="%Y-%m-%d %H:%M:%S %w")," "))
w <- as.numeric(s[3]) + 1
n <- sapply(R[[w]], function(x){ strptime(paste(s[1]," ",x,":00",sep=""),
format="%Y-%m-%d %H:%M:%S")})
return( which(sapply(n,function(x){ t-x[1]>=0 & t-x[2]<=0})) )
}
"R" is the list of all time ranges. You can change it as you like.
"strftime" is the counterpart to "strptime", i.e. it converts the POSIXlt object "t" into
a string of a desired format. This string is then split into the date part, the time part,
and the day of the week. The latter is used to pick the appropriate sublist in "R".
Then "strptime" is used to create a list of pairs of POSIXlt objects. The time part comes from the
appropriate sublist of "R", and the date part comes from "t". Each such pair represents a time interval.
Then the time range number is the index of the time interval which contains "t".
Some examples:
> t <- strptime("2015-01-01 12:18",format="%Y-%m-%d %H:%M")
> timerange(t)
[1] 2
> t <- strptime("2015-01-05 10:01",format="%Y-%m-%d %H:%M")
> timerange(t)
[1] 1
> t <- strptime("05.01.2015 13:25",format="%d.%m.%Y %H:%M")
> timerange(t)
[1] 3
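The same interval lookup can also be sketched with plain numbers, comparing minutes since midnight instead of building POSIXlt pairs. This is only an illustration under assumptions: the windows table, to_minutes and prompt_for are made-up names, only Monday and Tuesday are filled in, and the start and end times must be zero-padded ("09:10", not "9:10").

```r
# Hypothetical schedule: wday follows POSIXlt (0 = Sunday, 1 = Monday, ...)
windows <- data.frame(
  wday   = c(1, 1, 1, 2, 2, 2),
  prompt = c(1, 2, 3, 1, 2, 3),
  start  = c("10:00", "12:15", "13:15", "09:10", "11:00", "13:15"),
  end    = c("10:30", "12:40", "13:40", "09:40", "11:30", "13:40"),
  stringsAsFactors = FALSE
)

# "HH:MM" -> minutes since midnight
to_minutes <- function(hm) {
  as.integer(substr(hm, 1, 2)) * 60L + as.integer(substr(hm, 4, 5))
}

# Prompt number for one timestamp, or NA if it falls in no window
prompt_for <- function(t, windows) {
  lt <- as.POSIXlt(t)
  now <- lt$hour * 60L + lt$min + lt$sec / 60
  hit <- windows$wday == lt$wday &
    now >= to_minutes(windows$start) &
    now <= to_minutes(windows$end)
  if (any(hit)) windows$prompt[which(hit)[1]] else NA_real_
}

prompt_for(strptime("2015-01-05 10:01", "%Y-%m-%d %H:%M", tz = "UTC"), windows)
# [1] 1
```

Keeping the schedule in a data frame means a new day or window is just another row, at the cost of handling one timestamp per call.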
I have a simpler solution using days, hours and minutes and your (manual) filters which you can use as a function.
Check my simple example:
library(lubridate)
# example dataset
dt = data.frame(responce = 1:3,
date = c("2015-08-10 10:15:34","2015-08-10 12:29:14","2015-08-11 09:12:18"),
stringsAsFactors = F)
dt
# responce date
# 1 1 2015-08-10 10:15:34
# 2 2 2015-08-10 12:29:14
# 3 3 2015-08-11 09:12:18
# transform to date and obtain day, hour and minutes
dt$date = ymd_hms(dt$date)
dt$day = wday(dt$date, label=T)
dt$hour = hour(dt$date)
dt$minute = minute(dt$date)
dt
# responce date day hour minute
# 1 1 2015-08-10 10:15:34 Mon 10 15
# 2 2 2015-08-10 12:29:14 Mon 12 29
# 3 3 2015-08-11 09:12:18 Tues 9 12
# create a column with an arbitrary value to start with and also double check in the end
dt$value = -1
# conditions for Monday
dt$value[dt$day=="Mon" & dt$hour==10 & dt$minute >= 0 & dt$minute <=30] = 1
dt$value[dt$day=="Mon" & dt$hour==12 & dt$minute >= 15 & dt$minute <=45] = 2
dt$value[dt$day=="Mon" & dt$hour==14 & dt$minute >= 20 & dt$minute <=50] = 3
# conditions for Tuesday
dt$value[dt$day=="Tues" & dt$hour==9 & dt$minute >= 10 & dt$minute <=40] = 1
dt
# responce date day hour minute value
# 1 1 2015-08-10 10:15:34 Mon 10 15 1
# 2 2 2015-08-10 12:29:14 Mon 12 29 2
# 3 3 2015-08-11 09:12:18 Tues 9 12 1
# double check all your rows matched (you have no -1 values)
dt[dt$value == -1, ]
# [1] responce date day hour minute value
# <0 rows> (or 0-length row.names)
I ended up using some of both of those answers.
library(lubridate)
#change data to POSIXct class
data$StartDate <- dmy(as.character(data$StartDate))
data$EndDate <- dmy(as.character(data$EndDate))
data$StartTime2 <- hms(as.character(data$StartTime))
data$EndTime2 <- hms(as.character(data$EndTime))
I didn't have to do both, but I did anyway. I created an additional variable because changing it makes it look funny.
#check me out
class(data$StartDate)
#[1] "POSIXct" "POSIXt"
class(data$StartTime2)
#[1] "Period"
#attr(,"package")
#[1] "lubridate"
Based off the second comment I then did:
data$day = wday(data$StartDate, label=T)
data$hour = hour(data$StartTime2)
data$minute = minute(data$StartTime2)
# create a column with an arbitrary value to start with and also double check in the end
data$prompt = -1
# conditions for Tuesday (10/11/2015)
data$prompt[data$day=="Tues" & data$hour==11 & data$minute >= 10 & data$minute <=40] = 1
data$prompt[data$day=="Tues" & data$hour==13 & data$minute >= 35 & data$minute <=59] = 2
data$prompt[data$day=="Tues" & data$hour==16 & data$minute >= 15 & data$minute <=45] = 3
And so on. I know I have to fix the prompt 2 for this day because it goes into hour 14, but that's to play with next. Thanks for your help!
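For a window that runs past the end of an hour, one option is to compare minutes since midnight rather than hour and minute separately. A rough sketch, assuming a small made-up data frame in place of the real one and a guessed end time of 14:15 for prompt 2 (the day condition is left out to keep it short):

```r
# Made-up sample standing in for the hour/minute columns built earlier
data <- data.frame(hour = c(13, 14, 14, 16), minute = c(40, 5, 30, 20))

# Minutes since midnight: a window crossing an hour boundary is one range
mins <- data$hour * 60 + data$minute

data$prompt <- -1
# Prompt 2: assumed window 13:35 to 14:15 (end time is only an illustration)
data$prompt[mins >= 13 * 60 + 35 & mins <= 14 * 60 + 15] <- 2
# Prompt 3: 16:15 to 16:45
data$prompt[mins >= 16 * 60 + 15 & mins <= 16 * 60 + 45] <- 3

data$prompt
# [1]  2  2 -1  3
```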

Convert hours:minutes:seconds to minutes

I have a vector "Time.Training" in the format hours:minutes:seconds, e.g.
Time.Training <- c("1:00:00", "0:45:00", "0:30:00", "1:30:00")
I would like to convert this into minutes in the format:
Time.Training.Minutes <- c(60, 45, 30, 90)
I'm wondering if someone has a straightforward method of doing this in R.
Many thanks.
Matt
Using lubridate:
Time.Training<- c("1:00:00", "0:45:00", "0:30:00", "1:30:00")
library(lubridate)
res <- hms(Time.Training) # format to 'hours:minutes:seconds'
hour(res)*60 + minute(res) # convert hours to minutes, and add minutes
## [1] 60 45 30 90
Try this. We first convert to the POSIXlt class by pasting a real date onto the vector using the Sys.Date() function (because there is no hour class in base R), and then use the hour and min components to get the output.
Res <- as.POSIXlt(paste(Sys.Date(), Time.Training))
Res$hour*60 + Res$min
## [1] 60 45 30 90
Use as.difftime:
> Time.Training<- c("1:00:00", "0:45:00", "0:30:00", "1:30:00")
> strtoi(as.difftime(Time.Training, format = "%H:%M:%S", units = "mins"))
[1] 60 45 30 90
Here are some alternatives:
1) The chron package has a "times" class in which 1 unit is a day and there are 60 * 24 minutes in a day so:
library(chron)
60 * 24 * as.numeric(times(Time.Training))
giving:
[1] 60 45 30 90
1a) Another approach using chron is the following (giving the same answer):
library(chron)
ch <- times(Time.Training)
60 * hours(ch) + minutes(ch)
2) Here is an approach using read.table and matrix/vector multiplication. No packages are needed:
c(as.matrix(read.table(text = Time.Training, sep = ":")) %*% c(60, 1, 1/60))
(Using "POSIXlt" is probably the most straight-forward approach without packages but another answer already provides that.)
To take the hour from a date-time column (values like 2011-01-01 00:00:01) and store only the hour in a new column hour:
bikeshare$hour<-sapply(bikeshare$datetime,function(x){format(x,"%H")})

R if statement across columns of data table to populate new column

data <- data.frame(dates = c("2014-10-28 00:01:59.526","2014-10-27 13:30:01.526"),
times = c("23:59:59","13:29:55"),
hour = c(23,13),
minute = c(59,29),
second = c(59,55))
data[,1] <- as.POSIXct(data[,1])
data[,2] <- as.factor(data[,2])
class(data[,1])
class(data[,2])
class(data[,3])
class(data[,4])
class(data[,5])
data
dates times hour minute second
1 2014-10-28 00:01:59.526 23:59:59 23 59 59
2 2014-10-27 13:30:01.526 13:29:55 13 29 55
I need to populate a new column "NewDate" with POSIXct values that combine the date and time columns, BUT if the hour column shows 23, then the date for "NewDate" should be the date from the "dates" column MINUS 1 day; otherwise it should be the date from the "dates" column.
So the final output should be:
date time hour minute second NewDate
1 2014-10-28 00:01:59.526 23:59:59 23 59 59 2014-10-27 23:59:59 #NewDate = date-1 + time
2 2014-10-27 13:30:01.526 13:29:55 13 29 55 2014-10-27 13:29:55 #NewDate = date + time
(NewDate has to be a POSIXct)
What is the best way to do this WITHOUT looping down the data frame and doing something like:
library(lubridate) #lubridate contains hour(), minute(), second()
CorrectTIME <- function(date, hour, minute, second)
{
NewDate<- vector("numeric",length(date))
for(i in 1:length(date))
{
if(hour[i] > hour(date[i]) )
{
NewDate[i] =ISOdatetime(year(date[i]), month(date[i]), day(date[i])-1, hour[i], minute[i], second[i], tz="GMT")
}else
{
NewDate[i] =ISOdatetime(year(date[i]), month(date[i]), day(date[i]), hour[i], minute[i], second[i], tz="GMT")
}
}
as.POSIXct(NewDate, origin = "1970-01-01", tz = "GMT") # return the result as POSIXct
}
with(data, paste ( format( as.Date(dates) - (hour == 23) , "%Y-%m-%d"),
paste( hour, minute, second, sep=":")))
#[1] "2014-10-27 23:59:59" "2014-10-27 13:29:55"
Ooops, forgot the as.POSIXct:
as.POSIXct( with(data, paste ( format( as.Date(dates) - (hour==23) ,
"%Y-%m-%d"), paste( hour, minute, second, sep=":"))) )
[1] "2014-10-27 23:59:59 PDT" "2014-10-27 13:29:55 PDT"
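The loop in the question steps back a day whenever the recorded hour is later than the hour of the timestamp, not only when it is 23. A vectorised sketch of that more general rule in base R, assuming UTC throughout and a trimmed copy of the sample data (time_hour, date_hour and rollback are made-up names):

```r
# Trimmed copy of the sample data; tz = "UTC" is an assumption
data <- data.frame(
  dates = as.POSIXct(c("2014-10-28 00:01:59", "2014-10-27 13:30:01"), tz = "UTC"),
  times = c("23:59:59", "13:29:55"),
  stringsAsFactors = FALSE
)

time_hour <- as.integer(substr(data$times, 1, 2))  # hour of the time string
date_hour <- as.integer(format(data$dates, "%H"))  # hour of the timestamp

# TRUE where the time belongs to the previous day
rollback <- time_hour > date_hour

# Subtracting a logical from a Date subtracts 0 or 1 day
day <- as.Date(data$dates, tz = "UTC") - rollback
data$NewDate <- as.POSIXct(paste(day, data$times), tz = "UTC")

data$NewDate
# [1] "2014-10-27 23:59:59 UTC" "2014-10-27 13:29:55 UTC"
```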
