I would like to find the mid time between time 1 and time 2 in R. I already have the duration in hours. Sometimes it's overnight (row 1) and sometimes not (row 2).
Here are the first two rows in the data frame
dat <- data.frame(id=1:2, tm1=c("23:00","01:00"), tm2=c("07:00","06:00"), dur=c("8.0","5.0"))
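So in row 1 the mid time should be 03:00 and in row 2 it should be 03:30.
You can just add half of the duration to the start time. Because dur is stored as character in the sample data, convert it to numeric first:
format(strptime(dat$tm1, format = '%H:%M') + as.numeric(dat$dur)*3600/2, format = '%H:%M')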
Here is one way -
library(dplyr)
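dat %>%
  # parse both time columns; as.POSIXct attaches today's date
  mutate(across(starts_with('tm'), ~ as.POSIXct(.x, tz = 'UTC', format = '%H:%M')),
         # an end time earlier than the start time belongs to the next day
         tm2 = if_else(tm1 > tm2, tm2 + 86400, tm2),
         midtime = tm1 + difftime(tm2, tm1, units = 'secs')/2,
         # back to plain 'HH:MM' strings
         across(c(starts_with('tm'), midtime), ~ format(.x, '%H:%M')))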
# id tm1 tm2 dur midtime
#1 1 23:00 07:00 8.0 03:00
#2 2 01:00 06:00 5.0 03:30
The logic is -
Convert tm1 and tm2 to POSIXct class. This adds today's date to both columns.
Now if tm1 > tm2 (like in row 1), add 1 day to tm2.
Subtract the two times, divide the difference by 2, and add the result to tm1 to get midtime.
Finally, format all the time columns as '%H:%M'.
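The steps above can also be sketched in base R without dplyr. This is only an illustration under the same assumptions (times given as "HH:MM" strings, an end time earlier than the start time meaning the next day); mid_time is a hypothetical helper name, not something from the original answer.

```r
# Hypothetical helper: midpoint between two "HH:MM" strings.
# An end time earlier than the start time is assumed to fall on the next day.
mid_time <- function(start, end) {
  t1 <- as.POSIXct(start, format = "%H:%M", tz = "UTC")
  t2 <- as.POSIXct(end, format = "%H:%M", tz = "UTC")
  # shift the end time by one day (86400 seconds) when it wraps past midnight
  t2 <- t2 + ifelse(t1 > t2, 86400, 0)
  # add half of the elapsed seconds to the start time
  mid <- t1 + as.numeric(difftime(t2, t1, units = "secs")) / 2
  format(mid, "%H:%M", tz = "UTC")
}

mid_time(c("23:00", "01:00"), c("07:00", "06:00"))
# [1] "03:00" "03:30"
```

Using a fixed tz such as "UTC" keeps daylight-saving changes from shifting the result.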
I would like to find the mid time between two times in column SLQ300 (sleep time) and SLQ310 (wake up time) in my data frame of 6000 participants.
Added: I already have the duration in column SLD012, so adding half of the duration to the sleep time would also work.
Eg. in the first row it should be the midpoint between 23:00 and 07:00, which is 03:00.
And in the 9th row, it should be between 01:00 and 06:00, which is 03:30.
Thank you in advance!
Data frame
Try this:
time2num <- function(x) {
vapply(strsplit(x, ':'), function(y) sum(as.numeric(y) * c(60, 1)),
numeric(1), USE.NAMES=FALSE)
}
# sample data
dat <- data.frame(id=1:2, tm1=c("23:00","01:00"), tm2=c("07:00","06:00"))
# the code
dat[,c("tm1n","tm2n")] <- lapply(dat[,c("tm1","tm2")], time2num)
dat
# id tm1 tm2 tm1n tm2n
# 1 1 23:00 07:00 1380 420
# 2 2 01:00 06:00 60 360
with(dat, ifelse(tm1n > tm2n, 24*60, 0) + tm2n - tm1n)
# [1] 480 300 ### minutes
Or you can use the modulus, taken over the number of minutes in a day:
with(dat, tm2n - tm1n) %% (24*60)
(though I haven't tested it in all sorts of combinations).
And the mid time:
format(as.POSIXct(paste(Sys.Date(), dat$tm1)) +
60*with(dat, ifelse(tm1n > tm2n, 24*60, 0) + tm2n - tm1n)/2,
format="%H:%M")
# [1] "03:00" "03:30"
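The modulus idea can be carried through to the mid time itself, staying in minutes since midnight and never attaching a date. The following is a rough sketch, not the answer's own code: to_minutes and mid_minutes are hypothetical names, and it assumes every duration is shorter than 24 hours.

```r
# Hypothetical helpers, assuming "HH:MM" input and durations under 24 hours.
to_minutes <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(60, 1)), numeric(1))
}

mid_minutes <- function(start, end) {
  s <- to_minutes(start)
  # %% 1440 wraps a negative difference into the next day
  dur <- (to_minutes(end) - s) %% 1440
  # wrap the midpoint as well, and drop any half minute
  mid <- floor((s + dur / 2) %% 1440)
  sprintf("%02d:%02d", as.integer(mid %/% 60), as.integer(mid %% 60))
}

mid_minutes(c("23:00", "01:00"), c("07:00", "06:00"))
# [1] "03:00" "03:30"
```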
So I have a dataset in R:
IncidentID Time Vehicle
19002 4:48 Car
19003 12:30 Motorcycle
19004 14:00 Car
19005 9:30 Bicycle
And I'm trying to filter out some data, since it's quite a large dataset. The above is just a few example rows.
I want to filter the data by time. Say I want the rows where Time is between 12pm and 6pm (18:00 in 24-hour format); then I would have:
IncidentID Time Vehicle
19003 12:30 Motorcycle
19004 14:00 Car
I did:
incident <- read.csv("incident.csv")
afternoon_incident <- incident[which(incident$Time >= 12 && incident$Time <= 18),]
But I'm getting the error saying:
1: In Ops.factor(web$Time, 6:0) : ‘>=’ not meaningful for factors
2: In Ops.factor(web$Time, 12:0) : ‘<=’ not meaningful for factors
You can use lubridate to convert the Time field into a period object and then extract the hour for filtering:
library(lubridate)
incident$Time <- hm(as.character(incident$Time))
incident[which(hour(incident$Time) >= 12 & hour(incident$Time) <= 18), ]
You need to first convert the Time into actual date-time object using as.POSIXct and then compare.
As you want to subset based on hour, we can extract only hour part of the data using format and keep rows which are in between 12 and 18 hour. Using base R, we can do
df$hour <- as.numeric(format(as.POSIXct(df$Time, format = "%H:%M"), "%H"))
subset(df, hour >= 12 & hour <= 18)
# IncidentID Time Vehicle hour
#2 19003 12:30 Motorcycle 12
#3 19004 14:00 Car 14
You can remove the hour column later if not needed.
For a general solution, we can create a date-time column and then compare
df$datetime <- as.POSIXct(df$Time, format = "%H:%M")
subset(df, datetime >= as.POSIXct("12:30:00", format = "%T") &
datetime <= as.POSIXct("18:30:00", format = "%T"))
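As a further illustration, the same filter can be done on minutes since midnight, which avoids attaching a date at all. This sketch rebuilds the sample rows from the question and assumes a half-open window (12:00 inclusive, 18:00 exclusive); the variable names mins and afternoon are made up for the example. Note the vectorised & rather than the && used in the question, and the as.character() call in case Time was read in as a factor.

```r
# Sample rows from the question
incident <- data.frame(
  IncidentID = c(19002, 19003, 19004, 19005),
  Time = c("4:48", "12:30", "14:00", "9:30"),
  Vehicle = c("Car", "Motorcycle", "Car", "Bicycle"),
  stringsAsFactors = FALSE
)

# Minutes since midnight for each "H:MM" string
parts <- strsplit(as.character(incident$Time), ":", fixed = TRUE)
mins <- vapply(parts, function(p) as.numeric(p[1]) * 60 + as.numeric(p[2]), numeric(1))

# Assumed window: from 12:00 up to, but not including, 18:00
afternoon <- incident[mins >= 12 * 60 & mins < 18 * 60, ]
afternoon$IncidentID
# [1] 19003 19004
```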
How do I get a time difference in R (in minutes) when the day, month and year are not provided?
For instance, the minutes between "23:14:01" and "00:02:01".
You can use difftime:
a <- strptime("23:14:01", format = "%H:%M:%S")
b <- strptime("00:02:01", format = "%H:%M:%S")
difftime_res <- difftime(a, b, units = "mins")
difftime_res
# Time difference of 1392 mins
difftime_res_2 <- 1440 - difftime_res # in case the second time is on the following day
difftime_res_2
# Time difference of 48 mins
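If the second time is always meant to be the later one, the two cases can be folded into one step with the modulus operator. This is a small sketch under that assumption; elapsed is just an illustrative variable name.

```r
a <- as.POSIXct("23:14:01", format = "%H:%M:%S", tz = "UTC")
b <- as.POSIXct("00:02:01", format = "%H:%M:%S", tz = "UTC")

# Minutes from a to b; %% 1440 wraps a negative value into the next day
elapsed <- as.numeric(difftime(b, a, units = "mins")) %% 1440
elapsed
# [1] 48
```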
I have a data frame with hour stamp and corresponding temperature measured. The measurements are taken at random intervals over time continuously. I would like to convert the hours to respective date-time and temperature measured. My data frame looks like this: (The measurement started at 20/05/2016)
Time, Temp
09.25,28
10.35,28.2
18.25,29
23.50,30
01.10,31
12.00,36
02.00,25
I would like to create a data.frame with respective date-time and Temp like below:
Time, Temp
2016-05-20 09:25,28
2016-05-20 10:35,28.2
2016-05-20 18:25,29
2016-05-20 23:50,30
2016-05-21 01:10,31
2016-05-21 12:00,36
2016-05-22 02:00,25
I am thankful for any comments and tips on the packages or functions in R, I can have a look to do this. Thanks for your time.
A possible solution in base R:
df$Time <- as.POSIXct(strptime(paste('2016-05-20', sprintf('%05.2f',df$Time)), format = '%Y-%m-%d %H.%M', tz = 'GMT'))
df$Time <- df$Time + cumsum(c(0,diff(df$Time)) < 0) * 86400 # 86400 = 60 * 60 * 24
which gives:
> df
Time Temp
1 2016-05-20 09:25:00 28.0
2 2016-05-20 10:35:00 28.2
3 2016-05-20 18:25:00 29.0
4 2016-05-20 23:50:00 30.0
5 2016-05-21 01:10:00 31.0
6 2016-05-21 12:00:00 36.0
7 2016-05-22 02:00:00 25.0
An alternative with data.table, using shift to flag each row whose time is earlier than the previous one and cumsum to count the day changes:
library(data.table)
setDT(df)[, Time := as.POSIXct(strptime(paste('2016-05-20', sprintf('%05.2f',Time)), format = '%Y-%m-%d %H.%M', tz = 'GMT')) +
cumsum(Time < shift(Time, fill = Time[1])) * 86400]
Or with dplyr:
library(dplyr)
df %>%
mutate(Time = as.POSIXct(strptime(paste('2016-05-20',
sprintf('%05.2f',Time)),
format = '%Y-%m-%d %H.%M', tz = 'GMT')) +
cumsum(c(0,diff(Time)) < 0)*86400)
which will both give the same result.
Used data:
df <- read.table(text='Time, Temp
09.25,28
10.35,28.2
18.25,29
23.50,30
01.10,31
12.00,36
02.00,25', header=TRUE, sep=',')
You can use a custom date format combined with some code that detects when a new day begins (assuming the first measurement takes place earlier in the day than the last measurement of the previous day).
# starting day
start_date = "2016-05-20"
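# read Time as character so values such as "09.25" keep their leading zero
values <- read.csv('values.txt', colClasses = c("character", NA))
# previous row's time; "0" is a placeholder so the first row never counts as a new day
last <- c("0", head(values$Time, -1))
# the day index goes up by one each time the clock goes backwards
day <- cumsum(values$Time < last)
Time <- strptime(paste(start_date, values$Time), "%Y-%m-%d %H.%M")
Time <- Time + day * 86400
values$Time <- Time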
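To see the day-detection step in isolation, here is a small sketch on the times from the question. The names hours, day_offset and stamps are made up for the example, and it relies on the same assumption as above: a drop in clock time always means exactly one new day.

```r
hours <- c(9.25, 10.35, 18.25, 23.50, 1.10, 12.00, 2.00)

# TRUE wherever the clock goes backwards; the running sum is the day offset
day_offset <- cumsum(c(FALSE, diff(hours) < 0))
day_offset
# [1] 0 0 0 0 1 1 2

# Attach the assumed start date and shift each row by its day offset
stamps <- as.POSIXct(paste("2016-05-20", sprintf("%05.2f", hours)),
                     format = "%Y-%m-%d %H.%M", tz = "GMT") + day_offset * 86400
format(stamps[7], "%Y-%m-%d %H:%M")
# [1] "2016-05-22 02:00"
```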
I have a raw dataset of observations taken at 5 minute intervals between 6am and 9pm during weekdays only. These do not come with date-time information for plotting etc., so I am attempting to create a vector of date-times to add to my data, i.e. to go from this:
X425 X432 X448
1 0.07994814 0.1513559 0.1293103
2 0.08102852 0.1436480 0.1259074
to this
X425 X432 X448
2010-05-24 06:00 0.07994814 0.1513559 0.1293103
2010-05-24 06:05 0.08102852 0.1436480 0.1259074
I have gone about this as follows:
# using lubridate and xts
library(xts)
library(lubridate)
# sequence of 5 min intervals from 06:00 to 21:00
sttime <- hms("06:00:00")
intervals <- sttime + c(0:180) * minutes(5)
# sequence of days from 2010-05-24 to 2010-11-05
dayseq <- timeBasedSeq("2010-05-24/2010-11-05/d")
# add intervals to dayseq
dayPlusTime <- function(days, times) {
dd <- NULL
for (i in 1:2) {
dd <- c(dd,(days[i] + times))}
return(dd)
}
obstime <- dayPlusTime(dayseq, intervals)
But obstime is coming out as a list. days[1] + times works, so I guess it's something to do with the way the POSIXct objects are concatenated together to make dd, but I can't figure out what I am doing wrong or where to go next.
Any help appreciated
A base alternative:
# create some dummy dates
dates <- Sys.Date() + 0:14
# select non-weekend days
wd <- dates[as.integer(format(dates, format = "%u")) %in% 1:5]
# create times from 06:00 to 21:00 by 5 min interval
times <- format(seq(from = as.POSIXct("2015-02-18 06:00"),
to = as.POSIXct("2015-02-18 21:00"),
by = "5 min"),
format = "%H:%M")
# create all date-time combinations, paste, convert to as.POSIXct and sort
wd_times <- sort(as.POSIXct(do.call(paste, expand.grid(wd, times))))
One of the issues is that your interval vector does not change the hour when the minutes go over 60.
Here is one way you could do this:
#create the interval vector
intervals<-c()
for(p in 6:20){
for(j in seq(0,55,by=5)){
intervals<-c(intervals,paste(p,j,sep=":"))
}
}
intervals<-c(intervals,"21:0")
#get the days
dayseq <- timeBasedSeq("2010-05-24/2010-11-05/d")
#concatenate everything and format to POSIXct at the end
obstime<-strptime(unlist(lapply(dayseq,function(x){paste(x,intervals)})),format="%Y-%m-%d %H:%M", tz="GMT")