I have the following time intervals that I would like to split into 10 equally spaced instances.
head(data)
stoptime starttime
1 2014-08-19 14:52:04 2014-08-19 15:22:04
2 2014-08-19 16:27:14 2014-08-19 17:17:33
3 2014-08-19 18:05:59 2014-08-19 18:09:12
4 2014-08-19 17:25:35 2014-08-19 17:29:06
5 2014-08-19 18:23:29 2014-08-19 18:57:34
6 2014-08-19 07:39:15 2014-08-19 07:48:49
I am able to take the midpoint using this code:
one_day$midtime = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) /2 , origin = '1970-01-01')
However, when I try to extend this code to ten equally spaced instances, it goes completely wrong. Why is this happening, and how can I fix this code?
one_day$first = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .1 , origin = '1970-01-01')
one_day$second = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .2, origin = '1970-01-01')
one_day$thrid = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .3, origin = '1970-01-01')
one_day$fourth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .4, origin = '1970-01-01')
one_day$fifth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .5, origin = '1970-01-01')
one_day$sixth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .6, origin = '1970-01-01')
one_day$seventh = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .7, origin = '1970-01-01')
one_day$eighth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .8, origin = '1970-01-01')
one_day$ninth = as.POSIXct((as.numeric(one_day$stoptime) + as.numeric(one_day$starttime)) * .9, origin = '1970-01-01')
head(one_day)
diff.time stoptime starttime midtime first
1 1800 2014-08-19 14:52:04 2014-08-19 15:22:04 2014-08-19 15:07:04 1978-12-05 03:49:24
2 3019 2014-08-19 16:27:14 2014-08-19 17:17:33 2014-08-19 16:52:23 1978-12-05 04:10:28
3 193 2014-08-19 18:05:59 2014-08-19 18:09:12 2014-08-19 18:07:35 1978-12-05 04:25:31
4 211 2014-08-19 17:25:35 2014-08-19 17:29:06 2014-08-19 17:27:20 1978-12-05 04:17:28
5 2045 2014-08-19 18:23:29 2014-08-19 18:57:34 2014-08-19 18:40:31 1978-12-05 04:32:06
6 574 2014-08-19 07:39:15 2014-08-19 07:48:49 2014-08-19 07:44:02 1978-12-05 02:20:48
second thrid fourth fifth sixth
1 1987-11-08 12:38:49 1996-10-11 21:28:14 2005-09-15 06:17:39 2014-08-19 15:07:04 2023-07-23 23:56:28
2 1987-11-08 13:20:57 1996-10-11 22:31:26 2005-09-15 07:41:54 2014-08-19 16:52:23 2023-07-24 02:02:52
3 1987-11-08 13:51:02 1996-10-11 23:16:33 2005-09-15 08:42:04 2014-08-19 18:07:35 2023-07-24 03:33:06
4 1987-11-08 13:34:56 1996-10-11 22:52:24 2005-09-15 08:09:52 2014-08-19 17:27:20 2023-07-24 02:44:48
5 1987-11-08 14:04:12 1996-10-11 23:36:18 2005-09-15 09:08:25 2014-08-19 18:40:31 2023-07-24 04:12:37
6 1987-11-08 09:41:36 1996-10-11 17:02:25 2005-09-15 00:23:13 2014-08-19 07:44:02 2023-07-23 15:04:50
seventh eighth ninth
1 2032-06-26 08:45:53 2041-05-30 17:35:18 2050-05-04 02:24:43
2 2032-06-26 11:13:20 2041-05-30 20:23:49 2050-05-04 05:34:18
3 2032-06-26 12:58:37 2041-05-30 22:24:08 2050-05-04 07:49:39
4 2032-06-26 12:02:16 2041-05-30 21:19:44 2050-05-04 06:37:12
5 2032-06-26 13:44:44 2041-05-30 23:16:50 2050-05-04 08:48:56
6 2032-06-25 22:25:38 2041-05-30 05:46:27 2050-05-03 13:07:15
dput(data1)
structure(list(stoptime = structure(c(1408477924, 1408483634,
1408489559, 1408487135, 1408490609, 1408451955, 1408452727, 1408498708,
1408486644, 1408454996), class = c("POSIXct", "POSIXt"), tzone = "EST"),
starttime = structure(c(1408479724, 1408486653, 1408489752,
1408487346, 1408492654, 1408452529, 1408455826, 1408501153,
1408488389, 1408458514), class = c("POSIXct", "POSIXt"), tzone = "EST")), .Names = c("stoptime",
"starttime"), row.names = c(NA, 10L), class = "data.frame")
1: Seq
First, make sure the columns of your data frame are of class POSIXct (or POSIXlt), because the base R function seq has a method for objects of that class.
Here is a simplified example:
library(lubridate)
a <- "2014-08-19 14:52:04"
b <- "2014-08-19 15:22:04"
a <- ymd_hms(a)
b <- ymd_hms(b)
a
[1] "2014-08-19 14:52:04 UTC"
b
[1] "2014-08-19 15:22:04 UTC"
Then call seq and set the length.out parameter to the number of points you want. It returns that many equally spaced values, running from the start to the end inclusive.
seq(a, b, length.out = 10)
[1] "2014-08-19 14:52:04 UTC" "2014-08-19 14:55:24 UTC"
[3] "2014-08-19 14:58:44 UTC" "2014-08-19 15:02:04 UTC"
[5] "2014-08-19 15:05:24 UTC" "2014-08-19 15:08:44 UTC"
[7] "2014-08-19 15:12:04 UTC" "2014-08-19 15:15:24 UTC"
[9] "2014-08-19 15:18:44 UTC" "2014-08-19 15:22:04 UTC"
2: Vectorize step 1
Now that you know how to do it for one interval, it is just a matter of vectorizing it over all rows.
There are probably several approaches; here is one. With the mapply function you can loop through the elements in parallel, pairing the first element of the first object with the first element of the second object, and so on. Keep in mind that arguments that stay fixed across calls go in the MoreArgs argument.
Here is the code:
mapply(seq,
       to = data1$starttime,
       from = data1$stoptime,
       MoreArgs = list(length.out = 10),
       SIMPLIFY = FALSE)
This produces a list containing your desired values, though sadly not in the desired format:
[[1]]
[1] "2014-08-19 14:52:04 UTC" "2014-08-19 14:55:24 UTC"
[3] "2014-08-19 14:58:44 UTC" "2014-08-19 15:02:04 UTC"
[5] "2014-08-19 15:05:24 UTC" "2014-08-19 15:08:44 UTC"
[7] "2014-08-19 15:12:04 UTC" "2014-08-19 15:15:24 UTC"
[9] "2014-08-19 15:18:44 UTC" "2014-08-19 15:22:04 UTC"
[[2]]
[1] "2014-08-19 16:27:14 UTC" "2014-08-19 16:32:49 UTC"
[3] "2014-08-19 16:38:24 UTC" "2014-08-19 16:44:00 UTC"
[5] "2014-08-19 16:49:35 UTC" "2014-08-19 16:55:11 UTC"
[7] "2014-08-19 17:00:46 UTC" "2014-08-19 17:06:22 UTC"
[9] "2014-08-19 17:11:57 UTC" "2014-08-19 17:17:33 UTC"
[[3]]
[1] "2014-08-19 18:05:59 UTC" "2014-08-19 18:06:20 UTC"
[3] "2014-08-19 18:06:41 UTC" "2014-08-19 18:07:03 UTC"
[5] "2014-08-19 18:07:24 UTC" "2014-08-19 18:07:46 UTC"
[7] "2014-08-19 18:08:07 UTC" "2014-08-19 18:08:29 UTC"
[9] "2014-08-19 18:08:50 UTC" "2014-08-19 18:09:12 UTC"
[[4]]
[1] "2014-08-19 17:25:35 UTC" "2014-08-19 17:25:58 UTC"
[3] "2014-08-19 17:26:21 UTC" "2014-08-19 17:26:45 UTC"
[5] "2014-08-19 17:27:08 UTC" "2014-08-19 17:27:32 UTC"
[7] "2014-08-19 17:27:55 UTC" "2014-08-19 17:28:19 UTC"
[9] "2014-08-19 17:28:42 UTC" "2014-08-19 17:29:06 UTC"
[[5]]
[1] "2014-08-19 18:23:29 UTC" "2014-08-19 18:27:16 UTC"
[3] "2014-08-19 18:31:03 UTC" "2014-08-19 18:34:50 UTC"
[5] "2014-08-19 18:38:37 UTC" "2014-08-19 18:42:25 UTC"
[7] "2014-08-19 18:46:12 UTC" "2014-08-19 18:49:59 UTC"
[9] "2014-08-19 18:53:46 UTC" "2014-08-19 18:57:34 UTC"
[[6]]
[1] "2014-08-19 07:39:15 UTC" "2014-08-19 07:40:18 UTC"
[3] "2014-08-19 07:41:22 UTC" "2014-08-19 07:42:26 UTC"
[5] "2014-08-19 07:43:30 UTC" "2014-08-19 07:44:33 UTC"
[7] "2014-08-19 07:45:37 UTC" "2014-08-19 07:46:41 UTC"
[9] "2014-08-19 07:47:45 UTC" "2014-08-19 07:48:49 UTC"
[[7]]
[1] "2014-08-19 07:52:07 UTC" "2014-08-19 07:57:51 UTC"
[3] "2014-08-19 08:03:35 UTC" "2014-08-19 08:09:20 UTC"
[5] "2014-08-19 08:15:04 UTC" "2014-08-19 08:20:48 UTC"
[7] "2014-08-19 08:26:33 UTC" "2014-08-19 08:32:17 UTC"
[9] "2014-08-19 08:38:01 UTC" "2014-08-19 08:43:46 UTC"
[[8]]
[1] "2014-08-19 20:38:28 UTC" "2014-08-19 20:42:59 UTC"
[3] "2014-08-19 20:47:31 UTC" "2014-08-19 20:52:03 UTC"
[5] "2014-08-19 20:56:34 UTC" "2014-08-19 21:01:06 UTC"
[7] "2014-08-19 21:05:38 UTC" "2014-08-19 21:10:09 UTC"
[9] "2014-08-19 21:14:41 UTC" "2014-08-19 21:19:13 UTC"
[[9]]
[1] "2014-08-19 17:17:24 UTC" "2014-08-19 17:20:37 UTC"
[3] "2014-08-19 17:23:51 UTC" "2014-08-19 17:27:05 UTC"
[5] "2014-08-19 17:30:19 UTC" "2014-08-19 17:33:33 UTC"
[7] "2014-08-19 17:36:47 UTC" "2014-08-19 17:40:01 UTC"
[9] "2014-08-19 17:43:15 UTC" "2014-08-19 17:46:29 UTC"
[[10]]
[1] "2014-08-19 08:29:56 UTC" "2014-08-19 08:36:26 UTC"
[3] "2014-08-19 08:42:57 UTC" "2014-08-19 08:49:28 UTC"
[5] "2014-08-19 08:55:59 UTC" "2014-08-19 09:02:30 UTC"
[7] "2014-08-19 09:09:01 UTC" "2014-08-19 09:15:32 UTC"
[9] "2014-08-19 09:22:03 UTC" "2014-08-19 09:28:34 UTC"
At this point I guess it is just a matter of some data manipulation, but I can't figure out a way (for now).
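One possible way to finish the reshaping, offered as a sketch rather than the answerer's own method: run seq on the underlying seconds so that mapply can simplify the result to a matrix, then convert each column back to POSIXct. The column names t1..t10 and the three-row data1 subset are assumptions made for illustration.

```r
# First three rows of data1 from the question (tz set to UTC for the sketch)
data1 <- data.frame(
  stoptime  = as.POSIXct(c(1408477924, 1408483634, 1408489559),
                         origin = "1970-01-01", tz = "UTC"),
  starttime = as.POSIXct(c(1408479724, 1408486653, 1408489752),
                         origin = "1970-01-01", tz = "UTC")
)

# seq on plain seconds: mapply simplifies to a 10 x n matrix, t() makes it n x 10
pts <- t(mapply(seq,
                from = as.numeric(data1$stoptime),
                to   = as.numeric(data1$starttime),
                MoreArgs = list(length.out = 10)))

# One column per point, converted back to date-times
wide <- as.data.frame(pts)
names(wide) <- paste0("t", 1:10)   # hypothetical column names
wide[] <- lapply(wide, as.POSIXct, origin = "1970-01-01", tz = "UTC")

result <- cbind(data1, wide)
```

Here t1 repeats stoptime and t10 repeats starttime, so drop those two columns if only the interior points are wanted.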
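You can't just multiply the sum of the two timestamps by 0.1. The midpoint code only works because (a + b)/2 happens to equal a + (b - a)/2; for any other fraction you have to add that fraction of the time interval to the earlier time. For example: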
one_day$firstexample = one_day$stoptime + 0.1*difftime(one_day$starttime, one_day$stoptime, units = "mins")
As a side note, if you find yourself typing out very similar things multiple times, that's usually a sign that you should turn it into a function.
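As a rough illustration of that advice, the nine near-identical assignments could be collapsed into one small helper and a loop. This is only a sketch: the helper name split_interval and the column names p1..p9 are made up for the example, and one_day is rebuilt here from the first two rows of the question's dput output.

```r
# Two rows from the question's data (tz set to UTC for the sketch)
one_day <- data.frame(
  stoptime  = as.POSIXct(c(1408477924, 1408483634),
                         origin = "1970-01-01", tz = "UTC"),
  starttime = as.POSIXct(c(1408479724, 1408486653),
                         origin = "1970-01-01", tz = "UTC")
)

# Hypothetical helper: the point lying `fraction` of the way from `from` to `to`
split_interval <- function(from, to, fraction) {
  from + fraction * as.numeric(difftime(to, from, units = "secs"))
}

# Add columns p1..p9 at 10%, 20%, ..., 90% of each interval
for (k in 1:9) {
  one_day[[paste0("p", k)]] <-
    split_interval(one_day$stoptime, one_day$starttime, k / 10)
}
```

With this layout p5 is the midpoint, so it should agree with the midtime column computed in the question.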
Related
I have converted my dates from chr to POSIXct using the code below.
crime2$Date = parse_date_time(crime2$Date, orders = c('dmy_HM'),tz="UTC")
So my dates are now in this format.
> head(crime2$Date, 10)
[1] "2015-03-18 19:44:00 UTC" "2015-03-18 22:45:00 UTC"
[3] "2015-03-18 22:30:00 UTC" "2015-03-18 22:00:00 UTC"
[5] "2015-03-18 23:00:00 UTC" "2015-03-18 21:35:00 UTC"
[7] "2015-03-18 22:50:00 UTC" "2015-03-18 23:40:00 UTC"
[9] "2015-03-18 23:30:00 UTC" "2015-03-18 22:45:00 UTC"
However, if I want to remove the time and keep only the date, how can I do that?
For example, they should look like this:
" 2015-03-18 " "2015-03-18 "
I'm trying to floor continuous timestamps to 'every x hours' with lubridate::floor_date. However, when my time interval is greater than the hour of the first timestamp, it floors relative to midnight instead of my first timestamp. I have not found a way to set a reference timestamp for my start time. I have timestamps in UTC but need to floor them relative to, for example, 6:00 and 18:00 local time. These would be 12-hour intervals when referenced to local midnight, but that doesn't work for UTC timestamps because the flooring keeps referencing (UTC) midnight.
I know I could convert my timestamps to local time, but that is less than ideal. Is there a way to define the reference timestamps for floor_date that I'm missing?
Basically, what I'd like to do is floor the timestamps "every hour" relative to the start of my timeseries instead of each timestamp individually flooring relative to its midnight.
timestamps<-structure(c(1578628800, 1578632400, 1578636000, 1578639600, 1578643200,
1578646800, 1578650400, 1578654000, 1578657600, 1578661200), class = c("POSIXct",
"POSIXt"), tzone = "UTC")
floor_date(timestamps, '4 hours')
[1] "2020-01-10 04:00:00 UTC" "2020-01-10 04:00:00 UTC" "2020-01-10 04:00:00 UTC"
[4] "2020-01-10 04:00:00 UTC" "2020-01-10 08:00:00 UTC" "2020-01-10 08:00:00 UTC"
[7] "2020-01-10 08:00:00 UTC" "2020-01-10 08:00:00 UTC" "2020-01-10 12:00:00 UTC"
[10] "2020-01-10 12:00:00 UTC"
floor_date(timestamps, '5 hours')
[1] "2020-01-10 00:00:00 UTC" "2020-01-10 05:00:00 UTC" "2020-01-10 05:00:00 UTC"
[4] "2020-01-10 05:00:00 UTC" "2020-01-10 05:00:00 UTC" "2020-01-10 05:00:00 UTC"
[7] "2020-01-10 10:00:00 UTC" "2020-01-10 10:00:00 UTC" "2020-01-10 10:00:00 UTC"
[10] "2020-01-10 10:00:00 UTC"
Try the clock package:
clock::date_floor(timestamps, 'hour', n = 4)
[1] "2020-01-10 04:00:00 UTC" "2020-01-10 04:00:00 UTC"
[3] "2020-01-10 04:00:00 UTC" "2020-01-10 04:00:00 UTC"
[5] "2020-01-10 08:00:00 UTC" "2020-01-10 08:00:00 UTC"
[7] "2020-01-10 08:00:00 UTC" "2020-01-10 08:00:00 UTC"
[9] "2020-01-10 12:00:00 UTC" "2020-01-10 12:00:00 UTC"
clock::date_floor(timestamps, 'hour', n = 5)
[1] "2020-01-10 01:00:00 UTC" "2020-01-10 01:00:00 UTC"
[3] "2020-01-10 06:00:00 UTC" "2020-01-10 06:00:00 UTC"
[5] "2020-01-10 06:00:00 UTC" "2020-01-10 06:00:00 UTC"
[7] "2020-01-10 06:00:00 UTC" "2020-01-10 11:00:00 UTC"
[9] "2020-01-10 11:00:00 UTC" "2020-01-10 11:00:00 UTC"
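If the reference point has to be the first timestamp itself, the flooring can also be done with plain arithmetic on the underlying seconds, without any package. The following is a minimal base R sketch of that idea; floor_from is a hypothetical helper name, not a function from lubridate or clock.

```r
# The ten hourly timestamps from the question, starting 2020-01-10 04:00 UTC
timestamps <- as.POSIXct(1578628800 + 3600 * 0:9,
                         origin = "1970-01-01", tz = "UTC")

# Hypothetical helper: floor x to multiples of `hours`, counted from `origin`
floor_from <- function(x, origin, hours) {
  width   <- hours * 3600                       # bin width in seconds
  elapsed <- as.numeric(x) - as.numeric(origin) # seconds since the reference
  origin + floor(elapsed / width) * width
}

# 5-hour bins anchored at the first timestamp instead of midnight
floored <- floor_from(timestamps, origin = timestamps[1], hours = 5)
```

With this anchor the first five timestamps fall into the bin starting at 04:00 and the remaining five into the bin starting at 09:00. Passing a local-time reference such as 06:00 as origin should anchor the bins there in the same way.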
In my dataframe, I have dates of the form
YYYY-MM-DD
YYYY.MM.DD
YYYY-MM-DD HH:MM
I want to standardise it into the form:
YYYY-MM-DD
in R.
I've tried the parse_date_time() function in R, but not all the columns are parsed. Why is that so? Any help would be appreciated :-)
Edit
An example for the usage of
parse_date_time(emails$mail_sent_date)
is this
[1] NA "2010-04-17 UTC" "2012-01-26 UTC" "2014-11-15 UTC" "2014-07-17 UTC" "2010-02-22 UTC" "2010-07-17 UTC" "2012-10-27 UTC" "2014-01-18 UTC"
[10] "2010-01-11 UTC" NA "2010-11-28 UTC" "2012-09-24 UTC" "2014-05-30 UTC" "2014-05-30 UTC" "2010-07-31 UTC" "2007-07-28 UTC" NA
[19] "2014-08-29 UTC" "2015-06-05 UTC" "2008-11-03 UTC" "2018-03-18 UTC" "2019-01-12 UTC" "2011-07-23 UTC" NA "2007-11-19 UTC" "2019-04-07 UTC"
[28] "2010-11-28 UTC" "2019-11-22 UTC" "2019-03-28 UTC" "2013-06-22 UTC" "2013-12-08 UTC" "2012-06-08 UTC" "2011-12-09 UTC" "2017-10-23 UTC" "2017-03-26 UTC"
[37] "2019-01-31 UTC" "2020-03-14 UTC" "2014-05-30 UTC" "2011-12-31 UTC" "2015-05-14 UTC" "2010-03-27 UTC" "2014-12-08 UTC" "2015-05-24 UTC" "2014-11-15 UTC"
[46] NA "2018-05-26 UTC" "2019-02-28 UTC" NA "2015-06-11 UTC" "2012-06-09 UTC" "2013-06-16 UTC" NA "2014-07-12 UTC"
[55] "2012-09-20 UTC" "2010-05-22 UTC" "2019-11-07 UTC" "2011-03-07 UTC" "2007-10-05 UTC" "2018-03-17 UTC" "2007-06-22 UTC" "2007-02-01 UTC" "2020-03-29 UTC"
[64] "2010-03-21 UTC" "2019-02-28 UTC" NA "2008-03-17 UTC" "2013-03-14 UTC" "2014-05-12 UTC" "2015-12-19 UTC" "2010-04-05 UTC" NA
[73] "2008-02-07 UTC" "2007-08-12 UTC" "2011-12-02 UTC" "2014-02-02 UTC" "2011-07-25 UTC" "2014-06-12 UTC" NA NA "2013-10-06 UTC"
[82] "2019-05-18 UTC" "2011-12-19 UTC" NA "2012-03-18 UTC" "2013-07-22 UTC" "2017-01-21 UTC" "2013-09-26 UTC" "2019-04-18 UTC" "2012-10-01 UTC"
[91] "2018-09-01 UTC" "2019-11-22 UTC" "2013-07-05 UTC" "2013-07-22 UTC" "2008-10-11 UTC" "2018-04-29 UTC" NA "2019-06-24 UTC" "2018-04-19 UTC"
[100] "2015-08-21 UTC" NA NA "2015-04-09 UTC" "2012-02-11 UTC" "2011-11-13 UTC" "2013-04-11 UTC" "2007-10-07 UTC" "2007-10-08 UTC"
[109] "2012-01-14 UTC" "2012-06-02 UTC" "2011-07-04 UTC" "2019-05-17 UTC" "2012-09-09 UTC" NA "2018-09-29 UTC" "2015-06-04 UTC" "2014-01-13 UTC"
[118] "2014-01-13 UTC" "2012-09-24 UTC" "2018-05-28 UTC" "2018-07-21 UTC" "2010-04-26 UTC" "2011-02-20 UTC" "2013-06-21 UTC" "2008-12-14 UTC" "2011-04-25 UTC"
[127] "2014-07-31 UTC" "2015-06-08 UTC" "2015-10-25 UTC" "2019-06-29 UTC" "2011-02-21 UTC" "2017-01-09 UTC" NA "2015-06-21 UTC" "2014-07-28 UTC"
[136] "2013-11-04 UTC" "2014-07-24 UTC" NA "2019-09-13 UTC" "2007-06-09 UTC" "2014-12-13 UTC" "2015-10-16 UTC" "2010-06-19 UTC" "2015-05-14 UTC"
[145] "2011-07-29 UTC" "2007-10-01 UTC" NA NA "2010-09-25 UTC" "2010-04-15 UTC" "2020-03-05 UTC" "2017-06-30 UTC" NA
[154] "2019-06-10 UTC" "2018-10-04 UTC" "2015-05-11 UTC" "2010-05-22 UTC" "2014-07-26 UTC" "2015-01-25 UTC" "2015-07-04 UTC" "2015-07-04 UTC" "2014-07-17 UTC"
[163] "2010-09-18 UTC" "2007-01-08 UTC" "2019-10-21 UTC" "2014-06-30 UTC" "2008-08-01 UTC" NA "2010-08-13 UTC" NA NA
[172] "2012-11-24 UTC" "2014-11-20 UTC" "2018-05-14 UTC" "2015-10-05 UTC" "2020-01-26 UTC" "2018-04-21 UTC" "2011-07-04 UTC" "2015-02-22 UTC" "2015-02-22 UTC"
[181] "2008-10-11 UTC" "2017-01-05 UTC" "2011-05-21 UTC" NA "2015-09-27 UTC" "2011-08-28 UTC" "2019-03-09 UTC" "2018-11-29 UTC" "2014-07-11 UTC"
[190] "2013-06-14 UTC" "2018-06-04 UTC" "2014-11-03 UTC" "2019-03-01 UTC" "2007-10-12 UTC" "2018-01-06 UTC" NA "2010-11-28 UTC" "2017-10-23 UTC"
[199] "2014-03-23 UTC" "2018-11-11 UTC" "2019-05-18 UTC" "2014-10-02 UTC" NA NA "2011-07-31 UTC" "2010-07-16 UTC" "2015-04-09 UTC"
[208] "2015-10-01 UTC" "2015-10-09 UTC" "2011-04-01 UTC" "2018-11-11 UTC" "2018-11-11 UTC" "2011-08-28 UTC" "2018-07-21 UTC" NA "2011-02-21 UTC"
[217] "2018-03-17 UTC" NA "2014-05-11 UTC" "2012-03-23 UTC" "2014-05-25 UTC" "2014-03-23 UTC" "2013-01-20 UTC" NA "2014-07-11 UTC"
[226] "2014-09-08 UTC" "2013-05-24 UTC" NA "2010-07-17 UTC" NA "2019-01-01 UTC" NA "2013-06-15 UTC" "2019-01-19 UTC"
[235] "2020-02-02 UTC" "2013-03-14 UTC" "2012-08-04 UTC" "2015-02-13 UTC" "2010-06-18 UTC" NA "2013-10-20 UTC" "2015-12-17 UTC" "2017-09-01 UTC"
[244] "2013-03-28 UTC" "2010-04-01 UTC" "2017-07-24 UTC" "2007-09-30 UTC" "2017-05-27 UTC" NA "2006-11-17 UTC" "2007-11-18 UTC" "2019-12-01 UTC"
[253] "2015-10-12 UTC" "2015-03-27 UTC" "2017-12-02 UTC" "2018-09-03 UTC" "2018-03-04 UTC" "2015-03-14 UTC" NA "2010-01-25 UTC" "2008-07-04 UTC"
[262] "2015-04-29 UTC" "2013-04-05 UTC" NA "2007-11-02 UTC" "2010-06-13 UTC" "2019-02-16 UTC" "2015-04-09 UTC" "2013-07-27 UTC" NA
[271] "2018-08-25 UTC" "2019-06-14 UTC"
Warning message:
39 failed to parse.
A similar warning message was returned when I used ymd().
1) Assuming that the formats are precisely the ones shown in the question (if not please fix the question) then this uses only base R. This makes use of the fact that as.Date will ignore junk at the end.
x <- c("2000-10-01", "2000.10.01", "2000-10-01 03:04")
as.Date(chartr(".", "-", x))
## [1] "2000-10-01" "2000-10-01" "2000-10-01"
2) Another approach is the anytime package:
library(anytime)
anydate(x)
## [1] "2000-10-01" "2000-10-01" "2000-10-01"
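When some entries still fail to parse, it can help to look at exactly which raw strings come back as NA. A small base R sketch, assuming the same three formats plus one deliberately malformed entry added purely for illustration:

```r
# The three formats from the question plus one made-up bad value
x <- c("2000-10-01", "2000.10.01", "2000-10-01 03:04", "not a date")

# Normalise the separator, then parse with an explicit format;
# anything after the date part is ignored, non-matching strings become NA
parsed <- as.Date(chartr(".", "-", x), format = "%Y-%m-%d")

# Raw strings that could not be parsed
failed <- x[is.na(parsed)]
```

Inspecting failed on the real column should reveal whether the unparsed values are empty strings, a fourth format, or something else.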
Use the ymd function from the lubridate package:
library(lubridate)
ymd(column_of_your_dataframe)
I am working in R.
I have to generate a series of dates and times. In particular, I would like to have two data points per day, hence to assign twice each date with a different time, for instance:
"2001-05-13 00:00:00"
"2001-05-13 12:00:00"
"2001-05-14 00:00:00"
"2001-05-14 12:00:00"
I found the following code to produce a series of dates:
seq(as.Date("2000/1/1"), as.Date("2003/1/1"), by = 0.5)
Nevertheless, even if I set by = 0.5, the code returns only dates, not datetimes.
Any idea how to produce a series of datetimes?
as.Date produces only dates; use as.POSIXct to produce date-times.
seq(as.POSIXct("2000-01-01 00:00:00", tz = 'UTC'),
as.POSIXct("2003-01-01 00:00:00", tz = 'UTC'), by = '12 hours')
# [1] "2000-01-01 00:00:00 UTC" "2000-01-01 12:00:00 UTC"
# [3] "2000-01-02 00:00:00 UTC" "2000-01-02 12:00:00 UTC"
# [5] "2000-01-03 00:00:00 UTC" "2000-01-03 12:00:00 UTC"
# [7] "2000-01-04 00:00:00 UTC" "2000-01-04 12:00:00 UTC"
# [9] "2000-01-05 00:00:00 UTC" "2000-01-05 12:00:00 UTC"
#[11] "2000-01-06 00:00:00 UTC" "2000-01-06 12:00:00 UTC"
#[13] "2000-01-07 00:00:00 UTC" "2000-01-07 12:00:00 UTC"
#...
#...
I want to generate a working week / working day sequence (Monday-Friday; 8am - 5pm) in R. However I only figured out how to extract a working week (Monday-Friday) with 24 hours.
library(timeDate)
start <- as.POSIXct("2010-01-01")
interval <- 60
seq_1 <- as.timeDate(seq(from=start, by=interval*60, length.out = 200))
seq_2 <- seq_1[isWeekday(seq_1)]; seq_2
dayOfWeek(seq_2)
Is there a similar function which can extract only working hours? Thanks
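You can use the format function to extract the hour of each timestamp and keep only the hours you want (adjust the range to match your working day):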
seq_2[as.numeric(format(seq_2,'%H')) %in% 8:15 ]
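The same format-based filtering can handle the weekday and the hour in one step using only base R. This is a sketch under a few assumptions: times are taken to be in UTC, the working day is read as 08:00 through 17:00 inclusive, and `%u` (ISO weekday, Monday = 1) is assumed to be supported by the platform's date formatting.

```r
# 200 hourly timestamps starting Friday 2010-01-01 00:00 UTC
start  <- as.POSIXct("2010-01-01 00:00:00", tz = "UTC")
hourly <- seq(from = start, by = "hour", length.out = 200)

# ISO weekday (1 = Monday ... 7 = Sunday) and hour of day for each timestamp
wday <- as.integer(format(hourly, "%u"))
hr   <- as.integer(format(hourly, "%H"))

# Keep Monday-Friday, 08:00 through 17:00 inclusive
working <- hourly[wday %in% 1:5 & hr >= 8 & hr <= 17]
```

Change the upper bound to 16 if the 17:00 timestamp itself should be excluded.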
Select weekdays and then repeat each one as many times as there are working hours. I'm afraid I missed your 8 o'clock start and used the phrase "9 to 5" as my guide:
twoyears <- seq.Date(as.Date("2010-01-01"), by='day', length.out=365*2)
twoworkyrs <- twoyears[isWeekday(twoyears, wday = 1:5)]
twoworkyrs[ 1:10]
# [1] "2010-01-01" "2010-01-04" "2010-01-05" "2010-01-06" "2010-01-07" "2010-01-08"
# [7] "2010-01-11" "2010-01-12" "2010-01-13" "2010-01-14"
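workhours <- as.POSIXct( as.numeric(rep(twoworkyrs, each=9))*24*3600 +  # weekdays, as seconds since the epoch
                         (9:17)*3600 ,                                   # working hours 09:00-17:00, recycled per day
                         origin="1970-01-01", tz="UTC")  # was "America/LosAngeles", which is not a valid zone name; UTC matches the output below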
#----- First two weeks ----------------
> workhours[1:90]
[1] "2010-01-01 09:00:00 UTC" "2010-01-01 10:00:00 UTC" "2010-01-01 11:00:00 UTC"
[4] "2010-01-01 12:00:00 UTC" "2010-01-01 13:00:00 UTC" "2010-01-01 14:00:00 UTC"
[7] "2010-01-01 15:00:00 UTC" "2010-01-01 16:00:00 UTC" "2010-01-01 17:00:00 UTC"
[10] "2010-01-04 09:00:00 UTC" "2010-01-04 10:00:00 UTC" "2010-01-04 11:00:00 UTC"
[13] "2010-01-04 12:00:00 UTC" "2010-01-04 13:00:00 UTC" "2010-01-04 14:00:00 UTC"
[16] "2010-01-04 15:00:00 UTC" "2010-01-04 16:00:00 UTC" "2010-01-04 17:00:00 UTC"
[19] "2010-01-05 09:00:00 UTC" "2010-01-05 10:00:00 UTC" "2010-01-05 11:00:00 UTC"
[22] "2010-01-05 12:00:00 UTC" "2010-01-05 13:00:00 UTC" "2010-01-05 14:00:00 UTC"
[25] "2010-01-05 15:00:00 UTC" "2010-01-05 16:00:00 UTC" "2010-01-05 17:00:00 UTC"
[snipped]
I must admit that timezone conversions are one of my weakest suits.