R - Calculate mean of a time variable, mean(DateTime)

This is an example of my dataset:
> head(daily[,c(6,7)])->test
> head(test)
timeMin min
316 2013-05-02 13:45:00 3239
317 2013-05-03 12:30:00 3260
318 2013-05-04 12:30:00 3165
319 2013-05-05 12:30:00 3404
320 2013-05-06 12:30:00 3514
321 2013-05-07 13:15:00 3626
I need mean(timeMin) in order to know the time of day (hour:minute) at which the event usually happens. I have tried this:
library(lubridate)
> test$hourMin <- paste(hour(test$timeMin), minute(test$timeMin), sep=":")
> test$hourMin <- hm(test$hourMin)
And I got this:
> head(test)
timeMin min hourMin
316 2013-05-02 13:45:00 3239 13H 45M 0S
317 2013-05-03 12:30:00 3260 12H 30M 0S
318 2013-05-04 12:30:00 3165 12H 30M 0S
319 2013-05-05 12:30:00 3404 12H 30M 0S
320 2013-05-06 12:30:00 3514 12H 30M 0S
321 2013-05-07 13:15:00 3626 13H 15M 0S
However, when I try to calculate the mean, I get no result:
> mean(test$hourMin)
[1] 0
It should be straightforward, but I don't know how to do it, since I am a beginner. I would appreciate any help. Thanks.

It's really not elegant, but the only way I've found for now is to set the date components to the same day and compute the mean of the result. With lubridate:
time <- test$timeMin
time <- update(time, year=2000, month=1, mday=1)
mean(time)
# [1] "2000-01-01 12:50:00 CET"
Hopefully someone will provide something better...
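Alternatively, if the goal is just the mean clock time (hour:minute) rather than a mean timestamp, the hourMin periods built in the question can be averaged via seconds; a minimal sketch, assuming lubridate and the test data above:
library(lubridate)
# convert each Period to seconds past midnight, average, and convert back
secs <- period_to_seconds(test$hourMin)
seconds_to_period(round(mean(secs)))
# e.g. "12H 50M 0S" for the six rows shown above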

I'm calculating seconds past midnight on 1 Jan 2013, taking the mean of that, and adding it back to midnight on 1 Jan 2013.
I guess there are packages that can do this with a single command, but if you, like me, don't want to rely too much on packages, this solution should work for you.
library(data.table)
timetable <- data.table(TimeMin = c("2013-05-02 13:45:00",
                                    "2013-05-03 12:30:00",
                                    "2013-05-04 12:30:00",
                                    "2013-05-05 12:30:00",
                                    "2013-05-06 12:30:00",
                                    "2013-05-07 13:15:00"))
# seconds elapsed since midnight on 1 Jan 2013
timetable[, TimePastMin := difftime(TimeMin,
                                    "2013-01-01 00:00:00",
                                    units = "secs")]
meanTimePastMin <- mean(timetable[, TimePastMin])
# add the mean offset back to the reference midnight
meanTimeMin <- strptime("2013-01-01 00:00:00", "%Y-%m-%d %H:%M:%S") + meanTimePastMin
meanTimeMin
# "2013-05-05 00:50:00 IST"

Related

Calculating mean and sd of bedtime (hh:mm) in R - the problem is times before/after midnight

I got the following dataset:
data <- read.table(text="
wake_time sleep_time
08:38:00 23:05:00
09:30:00 00:50:00
06:45:00 22:15:00
07:27:00 23:34:00
09:00:00 23:00:00
09:05:00 00:10:00
06:40:00 23:28:00
10:00:00 23:30:00
08:10:00 00:10:00
08:07:00 00:38:00", header=T)
I used the chron package to calculate the average wake_time:
> mean(times(data$wake_time))
[1] 08:20:12
But when I do the same for the variable sleep_time, this happens:
> mean(times(data$sleep_time))
[1] 14:04:00
I guess the result is distorted because the sleep_time contains times before and after midnight.
But how can I solve this problem?
Additionally:
How can I calculate the sd of the times? I want to report it like "mean wake-up time 08:20 ± 44 min", for example.
The times values are stored as numbers between 0 and 1, representing fractions of a day. If the sleep time is earlier than the wake time, you can "add a day" before taking the mean. For example:
library(chron)
wake <- times(data$wake_time)
sleep <- times(data$sleep_time)
# treat bedtimes after midnight as belonging to the next day before averaging
times(mean(ifelse(sleep < wake, sleep + 1, sleep)))
# [1] 23:40:00
And since the values are fractions of a day, if you want the sd in minutes, take the same adjusted values and convert them to minutes:
sd(ifelse(sleep < wake, sleep+1, sleep) * 24*60)
# [1] 47.60252
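To report it in the asked-for "mean ± sd" form, a minimal sketch building on the wake/sleep vectors above (the %% 1 is only needed in case the adjusted mean lands past midnight):
adj_sleep <- ifelse(sleep < wake, sleep + 1, sleep)   # shift pre-midnight bedtimes by one day
mean_bed <- times(mean(adj_sleep) %% 1)               # wrap the mean back into a clock time
sd_min <- sd(adj_sleep) * 24 * 60                     # fraction of a day -> minutes
paste0("mean bedtime ", format(mean_bed), " ± ", round(sd_min), " min")
# e.g. "mean bedtime 23:40:00 ± 48 min"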

subtract periods and DST changes

I have data
library(data.table); library(lubridate)
dat <- data.table(t=as.POSIXct(c("2014-09-26 01:01:00","2014-09-26 02:01:00","2014-09-26 03:01:00"), tz="CET"))
> dat
t
1: 2014-09-26 01:01:00
2: 2014-09-26 02:01:00
3: 2014-09-26 03:01:00
and I would like to subtract 180 days. Because of the DST change, the result using lubridate's days() is
> dat$t - days(180)
[1] "2014-03-30 01:01:00 CET" NA "2014-03-30 03:01:00 CEST"
and I wonder whether there is a way of subtracting days that accounts for DST changes.
Subtract the number of seconds in 180 days:
dat$t - 180*60*24*60
[1] "2014-03-30 00:01:00 CET" "2014-03-30 01:01:00 CET" "2014-03-30 03:01:00 CEST"

Create a time series with a row every 15 minutes

I'm having trouble creating a time series (POSIXct or dttm column) with a row every 15 minutes.
Something that will look like this for every 15 minutes between Jan 1st 2015 and Dec 31st 2016 (here as month/day/year hour:minutes):
1/15/2015 0:00
1/15/2015 0:15
1/15/2015 0:30
1/15/2015 0:45
1/15/2015 1:00
A loop starting at 01/01/2015 0:00 and then adding 15 minutes until 12/31/2016 23:45?
Does anyone have an idea of how this can be done easily?
A little bit easier to read:
library(lubridate)
seq(ymd_hm('2015-01-01 00:00'),ymd_hm('2016-12-31 23:45'), by = '15 mins')
# number of 15-minute steps in 366 days (a full year with a one-day margin)
intervals.15.min <- 0 : (366 * 24 * 60 * 60 / 15 / 60)
res <- as.POSIXct("2015-01-01", "GMT") + intervals.15.min * 15 * 60
res <- res[res < as.POSIXct("2016-01-01 00:00:00", tz = "GMT")]  # keep 2015 only
head(res)
# "2015-01-01 00:00:00 GMT" "2015-01-01 00:15:00 GMT" "2015-01-01 00:30:00 GMT"
tail(res)
# "2015-12-31 23:15:00 GMT" "2015-12-31 23:30:00 GMT" "2015-12-31 23:45:00 GMT"

Finding date based on time

I was wondering how to find the date (which does not exist in the table) based on the time.
Example: Remember, I only have the time
time = c("9:44","15:30","23:48","00:30","05:30", "15:30", "22:00", "00:45")
I know for a fact that the start date is 2014-08-28, but how do I get the date, which changes after midnight?
Expected outcome would be
9:44 2014-08-28
15:30 2014-08-28
23:48 2014-08-28
00:30 2014-08-29
05:30 2014-08-29
15:30 2014-08-29
22:00 2014-08-29
00:45 2014-08-30
Here's an example using the data.table package's ITime class, which makes times easy to manipulate (once converted to this class you can subtract/add minutes/hours/etc.):
library(data.table)
time <- as.ITime(time)
# start a new day whenever the clock time decreases relative to the previous entry
Date <- as.IDate("2014-08-28") + c(0, cumsum(diff(time) < 0))
data.table(time, Date)
# time Date
# 1: 09:44:00 2014-08-28
# 2: 15:30:00 2014-08-28
# 3: 23:48:00 2014-08-28
# 4: 00:30:00 2014-08-29
# 5: 05:30:00 2014-08-29
# 6: 15:30:00 2014-08-29
# 7: 22:00:00 2014-08-29
# 8: 00:45:00 2014-08-30
Using the chron package we assume that a later time is on the same day and an earlier time is on the next day:
library(chron)
date <- as.Date("2014-08-28") + cumsum(c(0, diff(times(paste0(time, ":00"))) < 0))
data.frame(time, date)
giving:
time date
1 9:44 2014-08-28
2 15:30 2014-08-28
3 23:48 2014-08-28
4 00:30 2014-08-29
5 05:30 2014-08-29
6 15:30 2014-08-29
7 22:00 2014-08-29
8 00:45 2014-08-30
Here's one way to do it in base R (using dplyr only for lag()):
library(dplyr)  # for lag()
time <- c("9:44","15:30","23:48","00:30","05:30", "15:30", "22:00", "00:45")
# convert "H:M" strings to seconds past midnight
times <- sapply(strsplit(time, ":", TRUE), function(x) Reduce("+", as.numeric(x) * c(3600, 60)))
# add a day whenever the clock time decreases relative to the previous entry
as.POSIXct("2014-08-28") + times + 60*60*24*cumsum(c(FALSE, tail(times < lag(times), -1)))
# [1] "2014-08-28 09:44:00 CEST" "2014-08-28 15:30:00 CEST" "2014-08-28 23:48:00 CEST" "2014-08-29 00:30:00 CEST" "2014-08-29 05:30:00 CEST" "2014-08-29 15:30:00 CEST" "2014-08-29 22:00:00 CEST" "2014-08-30 00:45:00 CEST"
You can concatenate the system date with the time and get the result. For example, in Oracle we can get the date with the time as:
to_char(sysdate,'DD-MM-RRRR')|| ' ' || To_char(sysdate,'HH:MIAM')
This will give a result like 12-09-2015 09:50 AM.
For your requirement, use this as:
to_char(sysdate,'DD-MM-RRRR')|| ' 00:45' and so on.

subset by vector in r

I am trying to subset an xts object of OHLC hourly data with a vector.
If I create the vector myself with the following command:
lookup = c("2012-01-12", "2012-01-31", "2012-03-05", "2012-03-19")
testdfx[lookup]
I get the correct data displayed, which shows all the hours that match the dates in the vector (00:00 to 23:00).
> head(testdfx[lookup])
open high low close
2012-01-12 00:00:00 1.27081 1.27217 1.27063 1.27211
2012-01-12 01:00:00 1.27212 1.27216 1.27089 1.27119
2012-01-12 02:00:00 1.27118 1.27166 1.27017 1.27133
2012-01-12 03:00:00 1.27134 1.27272 1.27133 1.27261
2012-01-12 04:00:00 1.27260 1.27262 1.27141 1.27183
2012-01-12 05:00:00 1.27183 1.27230 1.27145 1.27165
> tail(testdfx[lookup])
open high low close
2012-03-19 18:00:00 1.32451 1.32554 1.32386 1.32414
2012-03-19 19:00:00 1.32417 1.32465 1.32331 1.32372
2012-03-19 20:00:00 1.32373 1.32415 1.32340 1.32372
2012-03-19 21:00:00 1.32373 1.32461 1.32366 1.32376
2012-03-19 22:00:00 1.32377 1.32424 1.32359 1.32366
2012-03-19 23:00:00 1.32364 1.32406 1.32333 1.32336
However, when I extract the dates from another object and create a vector to use for subsetting, I only get the hours 00:00-19:00 displayed in my subset.
> head(testdfx[dates])
open high low close
2007-01-05 00:00:00 1.3092 1.3093 1.3085 1.3088
2007-01-05 01:00:00 1.3087 1.3092 1.3075 1.3078
2007-01-05 02:00:00 1.3079 1.3091 1.3078 1.3084
2007-01-05 03:00:00 1.3083 1.3084 1.3073 1.3074
2007-01-05 04:00:00 1.3073 1.3080 1.3061 1.3071
2007-01-05 05:00:00 1.3070 1.3072 1.3064 1.3069
> tail(euro[nfp.releases])
open high low close
2014-01-10 14:00:00 1.35892 1.36625 1.35728 1.36366
2014-01-10 15:00:00 1.36365 1.36784 1.36241 1.36743
2014-01-10 16:00:00 1.36742 1.36866 1.36693 1.36719
2014-01-10 17:00:00 1.36720 1.36752 1.36579 1.36617
2014-01-10 18:00:00 1.36617 1.36663 1.36559 1.36624
2014-01-10 19:00:00 1.36630 1.36717 1.36585 1.36702
I have compared both objects containing the required dates and they appear to be the same.
> class(lookup)
[1] "character"
> class(nfp.releases)
[1] "character"
> str(lookup)
chr [1:4] "2012-01-12" "2012-01-31" "2012-03-05" "2012-03-19"
> str(nfp.releases)
chr [1:86] "2014-02-07" "2014-01-10" "2013-12-06" "2013-11-08" ..
I am new to R but have tried everything over the past 3 days to get this to work. If I can't do it this way I will end up having to create the variable by hand, but as it's got 86 dates this may take some time.
Thanks in advance.
I cannot reproduce your problem:
library(xts)
lookup <- c("2012-01-12", "2012-01-31", "2012-03-05", "2012-03-19")
time_index <- seq(from = as.POSIXct("2012-01-01 07:00"),
                  to = as.POSIXct("2012-05-17 18:00"), by = "hour")
set.seed(1)
value <- matrix(rnorm(n = 4 * length(time_index)), length(time_index), 4)
testdfx <- xts(value, order.by = time_index)
testdfx[lookup[1]]
testdfx["2012-01-12"]
Thanks for the responses, guys. I actually thought I had deleted this thread, but obviously not.
The problem in the case above was to be found about three feet from the computer: when looking through the data I was only interested in Fridays, which is also when the FX market is shutting down for the weekend.
Sorry to have wasted your time; admin, please remove.
