How to use military time to compute time interval - r

I'm using R and am trying to compute the number of hours someone slept. Currently, the bed times and wake times are reported in military (24-hour) time, so when I use difftime(), the interval between going to sleep at 9PM (21:00) and waking up at 7AM (07:00) ends up being -14 hours instead of 10 hours. Can someone help me figure out what I need to do so that it gives me the correct time difference?
Example data:
bedtime waketime
1 1899-12-31 01:00:00 1899-12-31 06:00:00
2 1899-12-31 21:00:00 1899-12-31 07:00:00
3 1899-12-31 22:00:00 1899-12-31 06:00:00
Script used:
difftime(PSQI$wakeup_3, PSQI$bedtime_1, units = "hours")
[1] 5.00 -14.00 -16.00
What I am looking for is
[1] 5.00 10.00 8.00
Thank you to anyone who can help!

Combining the comments above from @thelatemail and @Dave2e, we can do
with(df, ifelse(bedtime > waketime, waketime + 86400 - bedtime, waketime - bedtime))
#[1] 5 10 8
We add 86400 seconds (1 day) to waketime only if bedtime > waketime, and then take the difference. Make sure the columns bedtime and waketime are actually of class POSIXct.
data
df <- structure(list(bedtime = structure(c(-2209096006, -2209024006,
-2209020406), class = c("POSIXct", "POSIXt"), tzone = ""), waketime =
structure(c(-2209078006,
-2209074406, -2209078006), class = c("POSIXct", "POSIXt"), tzone =
"")), row.names = c(NA,
-3L), class = "data.frame")
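A possible alternative to the ifelse() approach, sketched under the assumption that every sleep period is shorter than 24 hours: take the raw difference in hours and wrap negative values around with the modulo operator. The vectors below are rebuilt from the example data in the question; tz = "UTC" is an assumption made only to keep the sketch reproducible.

```r
# Example bed and wake times from the question (time zone is an assumption)
fmt <- "%Y-%m-%d %H:%M:%S"
bedtime <- as.POSIXct(c("1899-12-31 01:00:00", "1899-12-31 21:00:00",
                        "1899-12-31 22:00:00"), format = fmt, tz = "UTC")
waketime <- as.POSIXct(c("1899-12-31 06:00:00", "1899-12-31 07:00:00",
                         "1899-12-31 06:00:00"), format = fmt, tz = "UTC")

# Raw difference in hours: negative when the wake time falls on the "next day"
raw_hours <- as.numeric(difftime(waketime, bedtime, units = "hours"))

# %% 24 maps negative differences into the range [0, 24)
hours_slept <- raw_hours %% 24
hours_slept
```

Fixing units = "hours" avoids relying on the unit that difftime picks automatically. Note that identical bed and wake times give 0 here, not 24.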

Related

Calculate time difference between 2 timestamps in hours using R

I'm trying to obtain the time difference between 2 timestamps in hours.
I have the data:
ID Lat Long Traffic Start_Time End_Time
1 -80.424 40.4242 54 2018-01-01 01:00 2018-01-01 01:10
2 -80.114 40.4131 30 2018-01-01 02:30 2018-01-01 02:40
3 -80.784 40.1142 12 2018-01-01 06:15 2018-01-01 07:20
I want to get the data like this
ID Lat Long Traffic Start_Time End_Time differ_hrs
1 -80.424 40.4242 54 2018-01-01 01:00 2018-01-01 01:10 00:50
2 -80.114 40.4131 30 2018-01-02 08:30 2018-01-02 08:40 01:10
3 -80.784 40.1142 12 2018-01-04 19:26 2018-01-04 20:11 01:15
I tried this code to capture the difference in hours:
df$differ_hrs<- difftime(df$End_Time, df$Start_Time, units = "hours")
However, it captures the difference like this:
ID Lat Long Traffic Start_Time End_Time differ_hrs
1 -80.424 40.4242 54 2018-01-01 01:00 2018-01-01 01:10 0.5
2 -80.114 40.4131 30 2018-01-02 08:30 2018-01-02 08:40 0.70
3 -80.784 40.1142 12 2018-01-04 19:26 2018-01-04 20:11 0.75
then I tried to set the difference in hours into format="%H%M" using the code:
df$differ_HHMM<- format(strptime(df$differ_hrs, format="%H%M"), format = "%H:%M")
But it produces all NAs.
So I decided to try a different way where I calculate the difference and set the format in the command itself adding "%H%M" like this:
df$differ_HHMM<- as.numeric(difftime(strptime(paste(df[,6]),"%Y-%m-%d %H:%M:%S"), strptime(paste(df[,5]),"%Y-%m-%d %H:%M:%S"),format="%H%M", units = "hours"))
but I keep getting the error message:
Error in difftime(strptime(paste(df[, 6]), "%Y-%m-%d %H:%M:%S"), strptime(paste(df[, :
unused argument (format = "%H:%M:%S")
Is there any way to calculate the time difference in %H:%M format?
I really appreciate your suggestions
The difference is a difftime class built on top of numeric. We could specify the units in difftime as seconds and use seconds_to_period from lubridate
library(lubridate)
df$differ_hrs<- as.numeric(difftime(df$End_Time, df$Start_Time,
units = "secs"))
out <- seconds_to_period(df$differ_hrs)
df$differ_HHMM <- sprintf('%02d:%02d', out@hour, out@minute)
NOTE: format works only on Date or Datetime class i.e. POSIXct, POSIXlt and not on numeric/difftime objects
data
df <- structure(list(ID = 1:3, Lat = c(-80.424, -80.114, -80.784),
Long = c(40.4242, 40.4131, 40.1142), Traffic = c(54L, 30L,
12L), Start_Time = structure(c(1514786400, 1514791800, 1514805300
), class = c("POSIXct", "POSIXt"), tzone = ""), End_Time = structure(c(1514787000,
1514792400, 1514809200), class = c("POSIXct", "POSIXt"), tzone = "")), row.names = c(NA,
-3L), class = "data.frame")
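If lubridate is not available, the same HH:MM string can be sketched in base R with integer division on the number of seconds. The names start_time and end_time are placeholders for the question's columns, the values come from the question's input table, and tz = "UTC" is an assumption. The sketch assumes non-negative differences.

```r
# Start and end times from the question's input table (time zone is an assumption)
fmt <- "%Y-%m-%d %H:%M"
start_time <- as.POSIXct(c("2018-01-01 01:00", "2018-01-01 02:30",
                           "2018-01-01 06:15"), format = fmt, tz = "UTC")
end_time <- as.POSIXct(c("2018-01-01 01:10", "2018-01-01 02:40",
                         "2018-01-01 07:20"), format = fmt, tz = "UTC")

# Difference as a plain number of seconds
secs <- as.numeric(difftime(end_time, start_time, units = "secs"))

# Whole hours, then the remaining whole minutes
hrs <- as.integer(secs %/% 3600)
mins <- as.integer((secs %% 3600) %/% 60)

differ_HHMM <- sprintf("%02d:%02d", hrs, mins)
differ_HHMM
```

Unlike the period-based version, hours here are not split into days, so a 30-hour difference would print as 30:00.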

diff time r studio is not giving the outcome properly

I'm trying to calculate the difftime in minutes between two columns, and the results do not seem right.
BLOCK_DATE_TIME.x a1_0 diff
2019-04-26 19:07:00 2019-04-27 09:00:00 773
2019-08-27 08:30:00 2019-08-27 09:00:00 -30
the code to calculate diff is the following:
test$diff <- difftime(test$a1_0,test$BLOCK_DATE_TIME.x,units = "mins")
The desired outcome is 833 for the first row and 30 for the second.
Thanks in advance
It is possible that the columns are not Datetime class. If we convert to POSIXct, it works as expected
library(dplyr)
library(lubridate)
test1 <- test %>%
mutate(across(c(BLOCK_DATE_TIME.x, a1_0), ymd_hms),
diff = difftime(a1_0, BLOCK_DATE_TIME.x, units = "mins"))
test1
# BLOCK_DATE_TIME.x a1_0 diff
#1 2019-04-26 19:07:00 2019-04-27 09:00:00 833 mins
#2 2019-08-27 08:30:00 2019-08-27 09:00:00 30 mins
data
test <- structure(list(BLOCK_DATE_TIME.x = c("2019-04-26 19:07:00",
"2019-08-27 08:30:00"
), a1_0 = c("2019-04-27 09:00:00", "2019-08-27 09:00:00")), row.names = c(NA,
-2L), class = "data.frame")
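The same conversion can be sketched in base R without dplyr or lubridate. This assumes the columns are character strings in the "%Y-%m-%d %H:%M:%S" layout, as in the data above; tz = "UTC" is an assumption and should be replaced by the time zone in which the values were recorded.

```r
# Character columns, as in the example data above
test <- data.frame(
  BLOCK_DATE_TIME.x = c("2019-04-26 19:07:00", "2019-08-27 08:30:00"),
  a1_0 = c("2019-04-27 09:00:00", "2019-08-27 09:00:00"),
  stringsAsFactors = FALSE
)

# Parse both columns explicitly so difftime works on real date-times
fmt <- "%Y-%m-%d %H:%M:%S"
test$BLOCK_DATE_TIME.x <- as.POSIXct(test$BLOCK_DATE_TIME.x, format = fmt, tz = "UTC")
test$a1_0 <- as.POSIXct(test$a1_0, format = fmt, tz = "UTC")

# Confirm the class before computing the difference
stopifnot(inherits(test$a1_0, "POSIXct"))

test$diff <- difftime(test$a1_0, test$BLOCK_DATE_TIME.x, units = "mins")
test
```

Passing an explicit format and tz makes the parsing independent of the session's locale and time zone settings.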

How can I add 1 to a column in R when A conditional is met?

I am trying to fill a new column in a data frame (in R) based on the following conditional:
df$B<- ifelse(difftime(df$A,lag(df$A))>minutes(30), increment(1), increment(0))
Here, the A column is time. So in A, every time the time difference between row i and row i-1 is greater than 30 minutes, I increment the new column B by one.
A B
1:00 1
1:31 2
1:40 2
2:30 3
Any help is greatly appreciated, thank you.
In base R, you can use cumsum with difftime :
df$B <- cumsum(c(TRUE, difftime(df$A[-1], df$A[-nrow(df)], units = 'mins') > 30))
df
# A B
#1 2020-02-03 01:00:00 1
#2 2020-02-03 02:00:00 2
#3 2020-02-03 02:15:00 2
#4 2020-02-03 03:00:00 3
data
Make sure df$A is of class POSIXct, i.e. inherits(df$A, "POSIXct") returns TRUE:
df <- structure(list(A = structure(c(1580691600, 1580695200, 1580696100,
1580698800), class = c("POSIXct", "POSIXt"), tzone = "UTC")),
class = "data.frame", row.names = c(NA, -4L))
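To see why the cumsum one-liner works, the computation can be broken into steps. The sketch below uses the clock times from the question (1:00, 1:31, 1:40, 2:30); the date and the "UTC" time zone are assumptions, since the question only gives times of day.

```r
# Times from the question, attached to an arbitrary date (assumption)
A <- as.POSIXct(c("2020-02-03 01:00", "2020-02-03 01:31",
                  "2020-02-03 01:40", "2020-02-03 02:30"),
                format = "%Y-%m-%d %H:%M", tz = "UTC")

# Gap between each row and the previous one, in minutes
gap_mins <- as.numeric(diff(A), units = "mins")

# The first row always starts a group; later rows start one when the gap exceeds 30
new_group <- c(TRUE, gap_mins > 30)

# A running count of group starts gives the incrementing label
B <- cumsum(new_group)
data.frame(A, B)
```

Each TRUE counts as 1 in cumsum, so B increases exactly at the rows where the gap to the previous row is more than 30 minutes.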

Finding gaps between intervals using data.table

I have the following problem: given a set of non-overlapping intervals in a data.table, report the gaps between the intervals.
I have implemented this once in SQL, but I am struggling with data.table due to the lack of a lead or lag function. For completeness, I have the SQL code here. According to the changelog, this functionality has been implemented in data.table version 1.9.5. So is this possible with data.table without doing a lot of merges and without a lag or lead function?
In principle, I am not fully against using merges (aka joins) as long as performance does not suffer. I think this has an easy implementation, but I can't figure out how to "get" the previous end time to be the starting time of my gap table.
For example:
# The numbers represent seconds from 1970-01-01 01:00:01
dat <- structure(
list(ID = c(1L, 1L, 1L, 2L, 2L, 2L),
stime = structure(c(as.POSIXct("2014-01-15 08:00:00"),
as.POSIXct("2014-01-15 11:00:00"),
as.POSIXct("2014-01-16 11:30:00"),
as.POSIXct("2014-01-15 09:30:00"),
as.POSIXct("2014-01-15 12:30:00"),
as.POSIXct("2014-01-15 13:30:00")
),
class = c("POSIXct", "POSIXt"), tzone = ""),
etime = structure(c(as.POSIXct("2014-01-15 10:30:00"),
as.POSIXct("2014-01-15 12:00:00"),
as.POSIXct("2014-01-16 13:00:00"),
as.POSIXct("2014-01-15 11:00:00"),
as.POSIXct("2014-01-15 12:45:00"),
as.POSIXct("2014-01-15 14:30:00")
),
class = c("POSIXct", "POSIXt"), tzone = "")
),
.Names = c("ID", "stime", "etime"),
sorted = c("ID", "stime", "etime"),
class = c("data.table", "data.frame"),
row.names = c(NA,-6L)
)
dat <- data.table(dat)
The desired result is:
ID stime etime
1 2014-01-15 10:30:00 2014-01-15 11:00:00
1 2014-01-15 12:00:00 2014-01-16 11:30:00
2 2014-01-15 11:00:00 2014-01-15 12:30:00
2 2014-01-15 12:45:00 2014-01-15 13:30:00
Note: the gaps are reported even when they span across days.
A variation on David's answer, likely a little less efficient, but simpler to type out:
setkey(dat, stime)[, .(stime=etime[-.N], etime=stime[-1]), by=ID]
Produces:
ID stime etime
1: 1 2014-01-15 10:30:00 2014-01-15 11:00:00
2: 1 2014-01-15 12:00:00 2014-01-16 11:30:00
3: 2 2014-01-15 11:00:00 2014-01-15 12:30:00
4: 2 2014-01-15 12:45:00 2014-01-15 13:30:00
setkey is just to make sure table is sorted by time.
If I'm not missing something, you are missing a row in your desired output, so here's my attempt using shift from the devel version as you mentioned.
library(data.table) ## v >= 1.9.5
indx <- dat[, .I[-.N], by = ID]$V1
res <- dat[, .(ID, stime = etime, etime = shift(stime, type = "lead"))][indx]
res
# ID stime etime
# 1: 1 2014-01-15 10:30:00 2014-01-15 11:00:00
# 2: 1 2014-01-15 12:00:00 2014-01-16 11:30:00
# 3: 2 2014-01-15 11:00:00 2014-01-15 12:30:00
# 4: 2 2014-01-15 12:45:00 2014-01-15 13:30:00
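A grouped variant of the shift idea, sketched under the assumption that a data.table version providing shift() is installed: compute the lead of stime within each ID, then drop the last row of each group, where the lead is NA. The gap_mins column is an illustrative addition that is not part of the question, and tz = "UTC" is an assumption.

```r
library(data.table)

# Intervals from the question (time zone is an assumption)
fmt <- "%Y-%m-%d %H:%M:%S"
dat <- data.table(
  ID = c(1L, 1L, 1L, 2L, 2L, 2L),
  stime = as.POSIXct(c("2014-01-15 08:00:00", "2014-01-15 11:00:00",
                       "2014-01-16 11:30:00", "2014-01-15 09:30:00",
                       "2014-01-15 12:30:00", "2014-01-15 13:30:00"),
                     format = fmt, tz = "UTC"),
  etime = as.POSIXct(c("2014-01-15 10:30:00", "2014-01-15 12:00:00",
                       "2014-01-16 13:00:00", "2014-01-15 11:00:00",
                       "2014-01-15 12:45:00", "2014-01-15 14:30:00"),
                     format = fmt, tz = "UTC")
)

# Sort so that "next row" means "next interval" within each ID
setorder(dat, ID, stime)

# A gap starts at the current end time and ends at the next start time
gaps <- dat[, .(stime = etime, etime = shift(stime, type = "lead")), by = ID]
gaps <- gaps[!is.na(etime)]  # the last interval per ID has no following gap

# Illustrative extra column: gap length in minutes
gaps[, gap_mins := as.numeric(difftime(etime, stime, units = "mins"))]
gaps
```

Grouping with by = ID keeps the lead from crossing from one ID into the next, so no separate row index is needed.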

Combining time series data with different resolution in R

I have read in and formatted my data set as shown below.
library(xts)
#Read data from file
x <- read.csv("data.dat", header=F)
x[is.na(x)] <- c(0) #If empty fill in zero
#Construct data frames
rawdata.h <- data.frame(x[,2],x[,3],x[,4],x[,5],x[,6],x[,7],x[,8]) #Hourly data
rawdata.15min <- data.frame(x[,10]) #15 min data
#Convert time index to proper format
index.h <- as.POSIXct(strptime(x[,1], "%d.%m.%Y %H:%M"))
index.15min <- as.POSIXct(strptime(x[,9], "%d.%m.%Y %H:%M"))
#Set column names
names(rawdata.h) <- c("spot","RKup", "RKdown","RKcon","anm", "pp.stat","prod.h")
names(rawdata.15min) <- c("prod.15min")
#Convert data frames to time series objects
data.htemp <- xts(rawdata.h,order.by=index.h)
data.15mintemp <- xts(rawdata.15min,order.by=index.15min)
#Select desired subset period
data.h <- data.htemp["2013"]
data.15min <- data.15mintemp["2013"]
I want to be able to combine hourly data from data.h$prod.h with data, with 15 min resolution, from data.15min$prod.15min corresponding to the same hour.
An example would be to take the average of the hourly value at time 2013-12-01 00:00-01:00 with the last 15 minute value in that same hour, i.e. the 15 minute value from time 2013-12-01 00:45-01:00. I'm looking for a flexible way to do this with an arbitrary hour.
Any suggestions?
Edit: Just to clarify further: I want to do something like this:
N <- NROW(data.h$prod.h)
for (i in 1:N){
prod.average[i] <- mean(data.h$prod.h[i] + #INSERT CODE THAT FINDS LAST 15 MIN IN HOUR i )
}
I found a solution to my problem by converting the 15 minute data into hourly data using the very useful .index* functions from the xts package, as shown below.
prod.new <- data.15min$prod.15min[.indexmin(data.15min$prod.15min) %in% c(45:59)]
This creates a new time series with only the values occurring in the 45-59 minute interval of each hour.
For those curious my data looked like this:
Original hourly series:
> data.h$prod.h[1:4]
2013-01-01 00:00:00 19.744
2013-01-01 01:00:00 27.866
2013-01-01 02:00:00 26.227
2013-01-01 03:00:00 16.013
Original 15 minute series:
> data.15min$prod.15min[1:4]
2013-09-30 00:00:00 16.4251
2013-09-30 00:15:00 18.4495
2013-09-30 00:30:00 7.2125
2013-09-30 00:45:00 12.1913
2013-09-30 01:00:00 12.4606
2013-09-30 01:15:00 12.7299
2013-09-30 01:30:00 12.9992
2013-09-30 01:45:00 26.7522
New series with only the last 15 minutes in each hour:
> prod.new[1:4]
2013-09-30 00:45:00 12.1913
2013-09-30 01:45:00 26.7522
2013-09-30 02:45:00 5.0332
2013-09-30 03:45:00 2.6974
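The filtered series still has to be lined up with the hourly series before averaging. A minimal base R sketch of that last step follows, assuming plain data frames instead of xts objects. The values are the first rows of the two series printed above, placed on the same date (2013-01-01) as an assumption so that the hours match; the column name hour is a hypothetical join key.

```r
# First two hourly values from the series above
hourly <- data.frame(
  time = as.POSIXct(c("2013-01-01 00:00:00", "2013-01-01 01:00:00"),
                    format = "%Y-%m-%d %H:%M:%S", tz = "UTC"),
  prod.h = c(19.744, 27.866)
)

# First eight 15 minute values, moved to the same date (assumption)
min15 <- data.frame(
  time = seq(as.POSIXct("2013-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S",
                        tz = "UTC"),
             by = "15 min", length.out = 8),
  prod.15min = c(16.4251, 18.4495, 7.2125, 12.1913,
                 12.4606, 12.7299, 12.9992, 26.7522)
)

# Keep only the last quarter of each hour (the value stamped at minute 45)
last_q <- min15[format(min15$time, "%M") == "45", ]

# Build an hour key on both sides and join on it
hourly$hour <- format(hourly$time, "%Y-%m-%d %H")
last_q$hour <- format(last_q$time, "%Y-%m-%d %H")
combined <- merge(hourly[, c("hour", "prod.h")],
                  last_q[, c("hour", "prod.15min")], by = "hour")

# Average of the hourly value and the last 15 minute value in that hour
combined$prod.average <- rowMeans(combined[, c("prod.h", "prod.15min")])
combined
```

Changing the "45" filter to another minute stamp selects a different quarter of the hour.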
Short answer
library(dplyr)
df %>%
group_by(t = cut(time, "30 min")) %>%
summarise(v = mean(value))
Long answer
Since you want to aggregate the 15 minute time series to a coarser resolution (30 minutes), you can use the dplyr package or any other package that supports "group by" operations.
For instance:
s = seq(as.POSIXct("2017-01-01"), as.POSIXct("2017-01-02"), "15 min")
df = data.frame(time = s, value=1:97)
df is a time series with 97 rows and two columns.
head(df)
time value
1 2017-01-01 00:00:00 1
2 2017-01-01 00:15:00 2
3 2017-01-01 00:30:00 3
4 2017-01-01 00:45:00 4
5 2017-01-01 01:00:00 5
6 2017-01-01 01:15:00 6
The cut.POSIXt, group_by and summarise functions do the work:
df %>%
group_by(t = cut(time, "30 min")) %>%
summarise(v = mean(value))
t v
1 2017-01-01 00:00:00 1.5
2 2017-01-01 00:30:00 3.5
3 2017-01-01 01:00:00 5.5
4 2017-01-01 01:30:00 7.5
5 2017-01-01 02:00:00 9.5
6 2017-01-01 02:30:00 11.5
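For readers without dplyr, the same grouping can be sketched in base R with cut() and aggregate(). The column names t and v mirror the dplyr output above; tz = "UTC" is an assumption added so that the sequence always has 97 rows regardless of local daylight saving rules.

```r
# Same 15 minute series as above, with an explicit time zone (assumption)
s <- seq(as.POSIXct("2017-01-01", tz = "UTC"),
         as.POSIXct("2017-01-02", tz = "UTC"), by = "15 min")
df <- data.frame(time = s, value = 1:97)

# Assign each timestamp to its 30 minute bin
df$t <- cut(df$time, "30 min")

# Mean of value within each bin
agg <- aggregate(value ~ t, data = df, FUN = mean)
names(agg)[2] <- "v"
head(agg, 3)
```

The final bin contains only the single value at midnight of the second day, so its mean is that value itself.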
A more robust way is to convert the 15 minute values into hourly values by averaging them, and then perform whatever operation you want on the result.
### 15 Minutes Data
min15 <- structure(list(V1 = structure(1:8, .Label = c("2013-01-01 00:00:00",
"2013-01-01 00:15:00", "2013-01-01 00:30:00", "2013-01-01 00:45:00",
"2013-01-01 01:00:00", "2013-01-01 01:15:00", "2013-01-01 01:30:00",
"2013-01-01 01:45:00"), class = "factor"), V2 = c(16.4251, 18.4495,
7.2125, 12.1913, 12.4606, 12.7299, 12.9992, 26.7522)), .Names = c("V1",
"V2"), class = "data.frame", row.names = c(NA, -8L))
min15
### Hourly Data
hourly <- structure(list(V1 = structure(1:4, .Label = c("2013-01-01 00:00:00",
"2013-01-01 01:00:00", "2013-01-01 02:00:00", "2013-01-01 03:00:00"
), class = "factor"), V2 = c(19.744, 27.866, 26.227, 16.013)), .Names = c("V1",
"V2"), class = "data.frame", row.names = c(NA, -4L))
hourly
### Convert 15min data into hourly data by taking average of 4 values
min15$V1 <- as.POSIXct(min15$V1,origin="1970-01-01 0:0:0")
min15 <- aggregate(. ~ cut(min15$V1,"60 min"),min15[setdiff(names(min15), "V1")],mean)
min15
names(min15) <- c("time","min15")
names(hourly) <- c("time","hourly")
### merge the corresponding values
combined <- merge(hourly,min15)
### average of hourly and 15min values
rowMeans(combined[,2:3])
