How to unify the time format in R?
I have a dataset, test. In Excel, the Time column contained values like 30-06-22 23:55:00 and 1/7/2022 0:00 AM. I have no idea why there are two different formats in one column, and I can't change the format in Excel. Oddly, the timestamps for the 1st to the 12th of each month are formatted differently from those for the rest of the days. So I imported the data into R and tried to unify the formats with the parse_date_time() function. But the second format now shows up as numbers such as "44568" in R, and I got Warning message: 20737 failed to parse. after running this code:
test$Time<- parse_date_time(test$Time, orders= c("%d-%m-%y %H%M%S","%d/%m/%Y %I:%M:%S %p" ))
I am confused about why the formats differ and how to unify them into a single format like 1/7/2022 0:00 AM (d/m/Y H:M AM/PM).
test<- c("30-06-22 20:35:00", "30-06-22 20:40:00", "30-06-22 20:45:00",
"30-06-22 20:50:00", "30-06-22 20:55:00", "30-06-22 21:00:00",
"30-06-22 21:05:00", "30-06-22 21:10:00", "30-06-22 21:15:00",
"30-06-22 21:20:00", "30-06-22 21:25:00", "30-06-22 21:30:00",
"30-06-22 21:35:00", "30-06-22 21:40:00", "30-06-22 21:45:00",
"30-06-22 21:50:00", "30-06-22 21:55:00", "30-06-22 22:00:00",
"30-06-22 22:05:00", "30-06-22 22:10:00", "30-06-22 22:15:00",
"30-06-22 22:20:00", "30-06-22 22:25:00", "30-06-22 22:30:00",
"30-06-22 22:35:00", "30-06-22 22:40:00", "30-06-22 22:45:00",
"30-06-22 22:50:00", "30-06-22 22:55:00", "30-06-22 23:00:00",
"30-06-22 23:05:00", "30-06-22 23:10:00", "30-06-22 23:15:00",
"30-06-22 23:20:00", "30-06-22 23:25:00", "30-06-22 23:30:00",
"30-06-22 23:35:00", "30-06-22 23:40:00", "30-06-22 23:45:00",
"30-06-22 23:50:00", "30-06-22 23:55:00", "44568", "44568.003472222219",
"44568.006944444445", "44568.010416666664", "44568.013888888891",
"44568.017361111109", "44568.020833333336", "44568.024305555555",
"44568.027777777781", "44568.03125", "44568.034722222219", "44568.038194444445",
"44568.041666666664", "44568.045138888891", "44568.048611111109",
"44568.052083333336", "44568.055555555555", "44568.059027777781",
"44568.0625", "44568.065972222219", "44568.069444444445", "44568.072916666664",
"44568.076388888891", "44568.079861111109", "44568.083333333336",
"44568.086805555555", "44568.090277777781", "44568.09375", "44568.097222222219",
"44568.100694444445", "44568.104166666664", "44568.107638888891",
"44568.111111111109", "44568.114583333336", "44568.118055555555",
"44568.121527777781", "44568.125", "44568.128472222219", "44568.131944444445",
"44568.135416666664", "44568.138888888891", "44568.142361111109",
"44568.145833333336", "44568.149305555555", "44568.152777777781",
"44568.15625", "44568.159722222219", "44568.163194444445", "44568.166666666664",
"44568.170138888891", "44568.173611111109", "44568.177083333336",
"44568.180555555555", "44568.184027777781", "44568.1875", "44568.190972222219",
"44568.194444444445", "44568.197916666664", "44568.201388888891",
"44568.204861111109")
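We may convert the numeric entries (Excel serial date-times) with openxlsx::convertToDateTime(), parse the remaining text entries with parsedate::parse_date(), and combine the two with coalesce():
library(parsedate)
library(dplyr)
# text dates become NA here (with a coercion warning); only the serial numbers survive
v1 <- as.numeric(test)
# use the converted serial where available, otherwise the parsed text date
v1 <- coalesce(openxlsx::convertToDateTime(v1), parse_date(test))
v1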
-output
[1] "2022-06-30 20:35:00 UTC" "2022-06-30 20:40:00 UTC" "2022-06-30 20:45:00 UTC" "2022-06-30 20:50:00 UTC" "2022-06-30 20:55:00 UTC" "2022-06-30 21:00:00 UTC"
[7] "2022-06-30 21:05:00 UTC" "2022-06-30 21:10:00 UTC" "2022-06-30 21:15:00 UTC" "2022-06-30 21:20:00 UTC" "2022-06-30 21:25:00 UTC" "2022-06-30 21:30:00 UTC"
[13] "2022-06-30 21:35:00 UTC" "2022-06-30 21:40:00 UTC" "2022-06-30 21:45:00 UTC" "2022-06-30 21:50:00 UTC" "2022-06-30 21:55:00 UTC" "2022-06-30 22:00:00 UTC"
[19] "2022-06-30 22:05:00 UTC" "2022-06-30 22:10:00 UTC" "2022-06-30 22:15:00 UTC" "2022-06-30 22:20:00 UTC" "2022-06-30 22:25:00 UTC" "2022-06-30 22:30:00 UTC"
[25] "2022-06-30 22:35:00 UTC" "2022-06-30 22:40:00 UTC" "2022-06-30 22:45:00 UTC" "2022-06-30 22:50:00 UTC" "2022-06-30 22:55:00 UTC" "2022-06-30 23:00:00 UTC"
[31] "2022-06-30 23:05:00 UTC" "2022-06-30 23:10:00 UTC" "2022-06-30 23:15:00 UTC" "2022-06-30 23:20:00 UTC" "2022-06-30 23:25:00 UTC" "2022-06-30 23:30:00 UTC"
[37] "2022-06-30 23:35:00 UTC" "2022-06-30 23:40:00 UTC" "2022-06-30 23:45:00 UTC" "2022-06-30 23:50:00 UTC" "2022-06-30 23:55:00 UTC" "2022-01-07 05:00:00 UTC"
[43] "2022-01-07 05:05:00 UTC" "2022-01-07 05:10:00 UTC" "2022-01-07 05:15:00 UTC" "2022-01-07 05:20:00 UTC" "2022-01-07 05:25:00 UTC" "2022-01-07 05:30:00 UTC"
[49] "2022-01-07 05:35:00 UTC" "2022-01-07 05:40:00 UTC" "2022-01-07 05:45:00 UTC" "2022-01-07 05:50:00 UTC" "2022-01-07 05:55:00 UTC" "2022-01-07 06:00:00 UTC"
[55] "2022-01-07 06:05:00 UTC" "2022-01-07 06:10:00 UTC" "2022-01-07 06:15:00 UTC" "2022-01-07 06:20:00 UTC" "2022-01-07 06:25:00 UTC" "2022-01-07 06:30:00 UTC"
[61] "2022-01-07 06:35:00 UTC" "2022-01-07 06:40:00 UTC" "2022-01-07 06:45:00 UTC" "2022-01-07 06:50:00 UTC" "2022-01-07 06:55:00 UTC" "2022-01-07 07:00:00 UTC"
[67] "2022-01-07 07:05:00 UTC" "2022-01-07 07:10:00 UTC" "2022-01-07 07:15:00 UTC" "2022-01-07 07:20:00 UTC" "2022-01-07 07:25:00 UTC" "2022-01-07 07:30:00 UTC"
[73] "2022-01-07 07:35:00 UTC" "2022-01-07 07:40:00 UTC" "2022-01-07 07:45:00 UTC" "2022-01-07 07:50:00 UTC" "2022-01-07 07:55:00 UTC" "2022-01-07 08:00:00 UTC"
[79] "2022-01-07 08:05:00 UTC" "2022-01-07 08:10:00 UTC" "2022-01-07 08:15:00 UTC" "2022-01-07 08:20:00 UTC" "2022-01-07 08:25:00 UTC" "2022-01-07 08:30:00 UTC"
[85] "2022-01-07 08:35:00 UTC" "2022-01-07 08:40:00 UTC" "2022-01-07 08:45:00 UTC" "2022-01-07 08:50:00 UTC" "2022-01-07 08:55:00 UTC" "2022-01-07 09:00:00 UTC"
[91] "2022-01-07 09:05:00 UTC" "2022-01-07 09:10:00 UTC" "2022-01-07 09:15:00 UTC" "2022-01-07 09:20:00 UTC" "2022-01-07 09:25:00 UTC" "2022-01-07 09:30:00 UTC"
[97] "2022-01-07 09:35:00 UTC" "2022-01-07 09:40:00 UTC" "2022-01-07 09:45:00 UTC" "2022-01-07 09:50:00 UTC" "2022-01-07 09:55:00 UTC"
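The same idea can also be sketched in base R. This is a minimal sketch, not the answer's method: it assumes the numeric strings are Excel serial date-times in the 1900 date system (where serial 25569 corresponds to 1970-01-01) and that every text entry is d-m-y H:M:S. The names x, is_serial, secs, unified and swapped are illustrative. The last step swaps day and month for the serial-derived values, on the further assumption that Excel read entries such as 1/7/2022 as month/day when day/month was meant; skip it if that does not hold for your data.

```r
# Three sample values taken from the question's data
x <- c("30-06-22 23:55:00", "44568", "44568.003472222219")

# Entries made only of digits (with an optional fraction) are treated as Excel serials
is_serial <- grepl("^[0-9]+(\\.[0-9]+)?$", x)

# Work in seconds since 1970-01-01 UTC so both kinds of entry end up on one scale
secs <- rep(NA_real_, length(x))
# Assumption: 1900 date system, so serial 25569 is 1970-01-01; round to whole seconds
secs[is_serial] <- round((as.numeric(x[is_serial]) - 25569) * 86400)
secs[!is_serial] <- as.numeric(
  as.POSIXct(x[!is_serial], format = "%d-%m-%y %H:%M:%S", tz = "UTC")
)
unified <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Optional (assumption: Excel swapped day and month for the serial entries):
# print as year-day-month, then read that text back as year-month-day
swapped <- unified
swapped[is_serial] <- as.POSIXct(
  format(unified[is_serial], "%Y-%d-%m %H:%M:%S", tz = "UTC"),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# One display format for the whole column (the AM/PM text depends on the locale)
format(swapped, "%d/%m/%Y %I:%M %p", tz = "UTC")
```

Keeping the column as POSIXct and formatting only for display means the values still sort and subtract correctly.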
Related
remove time from POSIXct Date
I have converted my date column from chr to POSIXct using the call below.
crime2$Date = parse_date_time(crime2$Date, orders = c('dmy_HM'),tz="UTC")
So my dates are now in this format:
> head(crime2$Date, 10)
[1] "2015-03-18 19:44:00 UTC" "2015-03-18 22:45:00 UTC"
[3] "2015-03-18 22:30:00 UTC" "2015-03-18 22:00:00 UTC"
[5] "2015-03-18 23:00:00 UTC" "2015-03-18 21:35:00 UTC"
[7] "2015-03-18 22:50:00 UTC" "2015-03-18 23:40:00 UTC"
[9] "2015-03-18 23:30:00 UTC" "2015-03-18 22:45:00 UTC"
However, if I want to remove the time and keep only the date, what can I do? For example, the values should look like this: "2015-03-18" "2015-03-18"
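One possible way to drop the time, sketched in base R. The vector x below is a stand-in for crime2$Date and is an assumption; as.Date() returns Date objects, while format() returns plain character strings.

```r
# Stand-in for crime2$Date: POSIXct values in UTC (sample values from the question)
x <- as.POSIXct(c("2015-03-18 19:44:00", "2015-03-18 22:45:00"), tz = "UTC")

# Date objects: pass tz explicitly so the calendar day is taken in UTC
dates_only <- as.Date(x, tz = "UTC")

# Character strings, if only the printed form matters
dates_chr <- format(x, "%Y-%m-%d", tz = "UTC")

dates_only
dates_chr
```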
Date standardisation in a dataframe in R
In my dataframe, I have dates of the form YYYY-MM-DD YYYY.MM.DD YYYY-MM-DD HH:MM I want to standardise it into the form: YYYY-MM-DD in R. I've tried the parse_date_time() function in R, but not all the columns are parsed. Why is that so? Any help would be appreciated :-) Edit An example for the usage of parse_date_time(emails$mail_sent_date) is this [1] NA "2010-04-17 UTC" "2012-01-26 UTC" "2014-11-15 UTC" "2014-07-17 UTC" "2010-02-22 UTC" "2010-07-17 UTC" "2012-10-27 UTC" "2014-01-18 UTC" [10] "2010-01-11 UTC" NA "2010-11-28 UTC" "2012-09-24 UTC" "2014-05-30 UTC" "2014-05-30 UTC" "2010-07-31 UTC" "2007-07-28 UTC" NA [19] "2014-08-29 UTC" "2015-06-05 UTC" "2008-11-03 UTC" "2018-03-18 UTC" "2019-01-12 UTC" "2011-07-23 UTC" NA "2007-11-19 UTC" "2019-04-07 UTC" [28] "2010-11-28 UTC" "2019-11-22 UTC" "2019-03-28 UTC" "2013-06-22 UTC" "2013-12-08 UTC" "2012-06-08 UTC" "2011-12-09 UTC" "2017-10-23 UTC" "2017-03-26 UTC" [37] "2019-01-31 UTC" "2020-03-14 UTC" "2014-05-30 UTC" "2011-12-31 UTC" "2015-05-14 UTC" "2010-03-27 UTC" "2014-12-08 UTC" "2015-05-24 UTC" "2014-11-15 UTC" [46] NA "2018-05-26 UTC" "2019-02-28 UTC" NA "2015-06-11 UTC" "2012-06-09 UTC" "2013-06-16 UTC" NA "2014-07-12 UTC" [55] "2012-09-20 UTC" "2010-05-22 UTC" "2019-11-07 UTC" "2011-03-07 UTC" "2007-10-05 UTC" "2018-03-17 UTC" "2007-06-22 UTC" "2007-02-01 UTC" "2020-03-29 UTC" [64] "2010-03-21 UTC" "2019-02-28 UTC" NA "2008-03-17 UTC" "2013-03-14 UTC" "2014-05-12 UTC" "2015-12-19 UTC" "2010-04-05 UTC" NA [73] "2008-02-07 UTC" "2007-08-12 UTC" "2011-12-02 UTC" "2014-02-02 UTC" "2011-07-25 UTC" "2014-06-12 UTC" NA NA "2013-10-06 UTC" [82] "2019-05-18 UTC" "2011-12-19 UTC" NA "2012-03-18 UTC" "2013-07-22 UTC" "2017-01-21 UTC" "2013-09-26 UTC" "2019-04-18 UTC" "2012-10-01 UTC" [91] "2018-09-01 UTC" "2019-11-22 UTC" "2013-07-05 UTC" "2013-07-22 UTC" "2008-10-11 UTC" "2018-04-29 UTC" NA "2019-06-24 UTC" "2018-04-19 UTC" [100] "2015-08-21 UTC" NA NA "2015-04-09 UTC" "2012-02-11 UTC" "2011-11-13 UTC" "2013-04-11 
UTC" "2007-10-07 UTC" "2007-10-08 UTC" [109] "2012-01-14 UTC" "2012-06-02 UTC" "2011-07-04 UTC" "2019-05-17 UTC" "2012-09-09 UTC" NA "2018-09-29 UTC" "2015-06-04 UTC" "2014-01-13 UTC" [118] "2014-01-13 UTC" "2012-09-24 UTC" "2018-05-28 UTC" "2018-07-21 UTC" "2010-04-26 UTC" "2011-02-20 UTC" "2013-06-21 UTC" "2008-12-14 UTC" "2011-04-25 UTC" [127] "2014-07-31 UTC" "2015-06-08 UTC" "2015-10-25 UTC" "2019-06-29 UTC" "2011-02-21 UTC" "2017-01-09 UTC" NA "2015-06-21 UTC" "2014-07-28 UTC" [136] "2013-11-04 UTC" "2014-07-24 UTC" NA "2019-09-13 UTC" "2007-06-09 UTC" "2014-12-13 UTC" "2015-10-16 UTC" "2010-06-19 UTC" "2015-05-14 UTC" [145] "2011-07-29 UTC" "2007-10-01 UTC" NA NA "2010-09-25 UTC" "2010-04-15 UTC" "2020-03-05 UTC" "2017-06-30 UTC" NA [154] "2019-06-10 UTC" "2018-10-04 UTC" "2015-05-11 UTC" "2010-05-22 UTC" "2014-07-26 UTC" "2015-01-25 UTC" "2015-07-04 UTC" "2015-07-04 UTC" "2014-07-17 UTC" [163] "2010-09-18 UTC" "2007-01-08 UTC" "2019-10-21 UTC" "2014-06-30 UTC" "2008-08-01 UTC" NA "2010-08-13 UTC" NA NA [172] "2012-11-24 UTC" "2014-11-20 UTC" "2018-05-14 UTC" "2015-10-05 UTC" "2020-01-26 UTC" "2018-04-21 UTC" "2011-07-04 UTC" "2015-02-22 UTC" "2015-02-22 UTC" [181] "2008-10-11 UTC" "2017-01-05 UTC" "2011-05-21 UTC" NA "2015-09-27 UTC" "2011-08-28 UTC" "2019-03-09 UTC" "2018-11-29 UTC" "2014-07-11 UTC" [190] "2013-06-14 UTC" "2018-06-04 UTC" "2014-11-03 UTC" "2019-03-01 UTC" "2007-10-12 UTC" "2018-01-06 UTC" NA "2010-11-28 UTC" "2017-10-23 UTC" [199] "2014-03-23 UTC" "2018-11-11 UTC" "2019-05-18 UTC" "2014-10-02 UTC" NA NA "2011-07-31 UTC" "2010-07-16 UTC" "2015-04-09 UTC" [208] "2015-10-01 UTC" "2015-10-09 UTC" "2011-04-01 UTC" "2018-11-11 UTC" "2018-11-11 UTC" "2011-08-28 UTC" "2018-07-21 UTC" NA "2011-02-21 UTC" [217] "2018-03-17 UTC" NA "2014-05-11 UTC" "2012-03-23 UTC" "2014-05-25 UTC" "2014-03-23 UTC" "2013-01-20 UTC" NA "2014-07-11 UTC" [226] "2014-09-08 UTC" "2013-05-24 UTC" NA "2010-07-17 UTC" NA "2019-01-01 UTC" NA "2013-06-15 UTC" "2019-01-19 UTC" 
[235] "2020-02-02 UTC" "2013-03-14 UTC" "2012-08-04 UTC" "2015-02-13 UTC" "2010-06-18 UTC" NA "2013-10-20 UTC" "2015-12-17 UTC" "2017-09-01 UTC" [244] "2013-03-28 UTC" "2010-04-01 UTC" "2017-07-24 UTC" "2007-09-30 UTC" "2017-05-27 UTC" NA "2006-11-17 UTC" "2007-11-18 UTC" "2019-12-01 UTC" [253] "2015-10-12 UTC" "2015-03-27 UTC" "2017-12-02 UTC" "2018-09-03 UTC" "2018-03-04 UTC" "2015-03-14 UTC" NA "2010-01-25 UTC" "2008-07-04 UTC" [262] "2015-04-29 UTC" "2013-04-05 UTC" NA "2007-11-02 UTC" "2010-06-13 UTC" "2019-02-16 UTC" "2015-04-09 UTC" "2013-07-27 UTC" NA [271] "2018-08-25 UTC" "2019-06-14 UTC" Warning message: 39 failed to parse. A similar warning message returned when I used ymd()
1) Assuming that the formats are precisely the ones shown in the question (if not, please fix the question), this uses only base R. It relies on the fact that as.Date() ignores junk at the end of the string.
x <- c("2000-10-01", "2000.10.01", "2000-10-01 03:04")
as.Date(chartr(".", "-", x))
## [1] "2000-10-01" "2000-10-01" "2000-10-01"
2) Another approach is the anytime package:
library(anytime)
anydate(x)
## [1] "2000-10-01" "2000-10-01" "2000-10-01"
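Use the lubridate package and its ymd() function (note that the question reports a similar "failed to parse" warning from ymd() on this data):
library(lubridate)
ymd(column_of_your_dataframe)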
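When a column mixes several layouts, one option is to try each format in turn and fill in only the entries that are still unparsed. The helper below is a hypothetical base R sketch (parse_mixed_dates is not from any package), and the sample vector x is made up for illustration.

```r
# Hypothetical helper: try each format in order, filling only entries that are still NA
parse_mixed_dates <- function(x, formats) {
  out <- as.Date(rep(NA_real_, length(x)), origin = "1970-01-01")
  for (fmt in formats) {
    todo <- is.na(out)
    # strptime-style parsing ignores trailing text such as " 03:04"
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

x <- c("2000-10-01", "2000.10.02", "2000-10-03 03:04", "not a date")
res <- parse_mixed_dates(x, c("%Y-%m-%d", "%Y.%m.%d"))
res
```

Entries that match none of the formats stay NA, which makes them easy to find with x[is.na(res)].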
Sequence of only time (no dates) in r
I am trying to make a sequence that consists only of times at one-hour intervals, without dates. It should look like this:
"00:00:00" "1:00:00" "2:00:00" "3:00:00"
I know that this code works:
dat <- seq(
  from=as.POSIXct("00:00:00","%H:%M:%S", tz="UTC"),
  to=as.POSIXct("23:00:00", "%H:%M:%S", tz="UTC"),
  by="hour"
)
Which gives
[1] "2018-04-10 00:00:00 UTC" "2018-04-10 01:00:00 UTC" "2018-04-10 02:00:00 UTC" "2018-04-10 03:00:00 UTC" "2018-04-10 04:00:00 UTC"
[6] "2018-04-10 05:00:00 UTC" "2018-04-10 06:00:00 UTC" "2018-04-10 07:00:00 UTC" "2018-04-10 08:00:00 UTC" "2018-04-10 09:00:00 UTC"
[11] "2018-04-10 10:00:00 UTC" "2018-04-10 11:00:00 UTC" "2018-04-10 12:00:00 UTC" "2018-04-10 13:00:00 UTC" "2018-04-10 14:00:00 UTC"
[16] "2018-04-10 15:00:00 UTC" "2018-04-10 16:00:00 UTC" "2018-04-10 17:00:00 UTC" "2018-04-10 18:00:00 UTC" "2018-04-10 19:00:00 UTC"
[21] "2018-04-10 20:00:00 UTC" "2018-04-10 21:00:00 UTC" "2018-04-10 22:00:00 UTC" "2018-04-10 23:00:00 UTC"
But that is not what I want. Therefore I tried
library(chron)
seq(from = times("00:00:00"), to =times("23:00:00"), by="hour")
which gives an error
Error in convert.times(times., fmt) : format h:m:s may be incorrect
In addition: Warning message:
In unpaste(times, sep = fmt$sep, fnames = fmt$periods, nfields = 3) :
  wrong number of fields in entry(ies) 1
I am stuck now, so I hope somebody can help me with this. Of course I could just type it out, but I want to have a clean solution.
Using package chron, which provides a times class (times are stored as fractions of a day, so one hour is 1/24):
library(chron)
times("00:00:00") + (0:23)/24
#[1] 00:00:00 01:00:00 02:00:00 03:00:00 04:00:00 05:00:00 06:00:00 07:00:00 08:00:00 09:00:00 10:00:00 11:00:00 12:00:00 13:00:00 14:00:00
#[16] 15:00:00 16:00:00 17:00:00 18:00:00 19:00:00 20:00:00 21:00:00 22:00:00 23:00:00
You can use strftime() to extract values in any format as character:
dat <- seq(
  from=as.POSIXct("00:00:00","%H:%M:%S", tz="UTC"),
  to=as.POSIXct("23:00:00", "%H:%M:%S", tz="UTC"),
  by="hour"
)
strftime(dat, format="%H:%M:%S")
#"02:00:00" "03:00:00" "04:00:00" "05:00:00" "06:00:00" "07:00:00"
#"08:00:00" "09:00:00" "10:00:00" "11:00:00" "12:00:00" "13:00:00"
#"14:00:00" "15:00:00" "16:00:00" "17:00:00" "18:00:00" "19:00:00"
#"20:00:00" "21:00:00" "22:00:00" "23:00:00" "00:00:00" "01:00:00"
Note that the output above starts at "02:00:00" because strftime() formats in the local time zone by default. Passing tz="UTC" to strftime(), or calling format(dat, "%H:%M:%S") instead, gives "00:00:00" through "23:00:00".
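When you have a POSIXct object, you can extract only the hours, minutes and seconds with format(). Here from and to stand for the POSIXct start and end values; format() already returns character, so the as.character() wrapper is optional:
as.character(format(from, "%H:%M:%S"))
as.character(format(to, "%H:%M:%S"))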
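Putting the two answers above together, a minimal base R sketch might look like this. The anchor date 2018-04-10 is arbitrary (it only supplies a day to hang the hours on), and the names start, dat and hours_chr are illustrative.

```r
# Any date works as an anchor; only the time of day is kept in the end
start <- as.POSIXct("2018-04-10 00:00:00", tz = "UTC")

# 24 hourly steps starting at midnight
dat <- seq(from = start, by = "hour", length.out = 24)

# Format in UTC explicitly so the local time zone cannot shift the hours
hours_chr <- format(dat, "%H:%M:%S", tz = "UTC")
hours_chr
```

The result is a character vector, so it is suitable for labels but not for time arithmetic; the chron times class shown earlier keeps the values numeric.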
R - How to calculate in a new column the difference in seconds between the first and the remaining dates
I have the following dates and I want to calculate the difference between the first date and each of the other dates, i.e. date 2 - date 1, date 3 - date 1, etc., in seconds and stored in another column. Any help is appreciated; I am new to R.
"2009-06-01 16:00:00 UTC" "2009-06-29 16:00:00 UTC" "2009-06-29 17:00:00 UTC"
"2009-06-30 16:00:00 UTC" "2009-06-30 17:00:00 UTC" "2009-06-30 18:00:00 UTC"
"2009-06-30 19:00:00 UTC" "2009-07-01 08:00:00 UTC" "2009-07-01 09:00:00 UTC"
"2009-07-01 10:00:00 UTC" "2009-07-01 16:00:00 UTC" "2009-07-01 17:00:00 UTC"
"2009-07-01 18:00:00 UTC" "2009-07-01 19:00:00 UTC" "2009-07-02 08:00:00 UTC"
"2009-07-02 09:00:00 UTC" "2009-07-02 10:00:00 UTC" "2009-07-02 16:00:00 UTC"
"2009-07-02 17:00:00 UTC" "2009-07-02 18:00:00 UTC" "2009-07-02 19:00:00 UTC"
"2009-07-04 10:00:00 UTC" "2009-07-04 16:00:00 UTC" "2009-07-04 17:00:00 UTC"
"2010-06-22 16:00:00 UTC" "2010-06-22 17:00:00 UTC" "2010-06-22 18:00:00 UTC"
"2010-08-20 16:00:00 UTC" "2011-06-02 16:00:00 UTC" "2011-06-02 17:00:00 UTC"
"2011-06-02 18:00:00 UTC" "2011-06-03 10:00:00 UTC" "2011-06-03 16:00:00 UTC"
"2011-06-03 17:00:00 UTC" "2011-06-03 18:00:00 UTC" "2011-06-03 19:00:00 UTC"
First you'll want to convert your character strings to date-times. Once you've done this, you can easily use difftime() to calculate time distances. There are a number of packages that help you with this and even more ways to do so. So in addition to the answer provided using the lubridate package, here is a way to solve it in base R:
# (I'll assume your data is saved in a vector called my_dates)
my_dates <- gsub(" UTC", "", my_dates) # removes " UTC" from all your dates (for no reason, see edit below)
my_dates <- as.POSIXlt(my_dates, tz = "UTC") # converts to date-time format
difftime(time1 = my_dates, time2 = my_dates[1], units = "secs")
# Time differences in secs
# [1] 0 2419200 2422800 2505600 2509200 2512800 2516400 2563200 2566800 2570400 2592000 2595600
# [13] 2599200 2602800 2649600 2653200 2656800 2678400 2682000 2685600 2689200 2829600 2851200 2854800
# [25] 33350400 33354000 33357600 38448000 63158400 63162000 63165600 63223200 63244800 63248400 63252000 63255600
Note: In my initial answer, I used as.Date.character(), but this ignored the times after the dates! as.Date() also ignores the time and only keeps the dates. as.POSIXlt() does the job and keeps both the times and the dates.
Edit from comment: Apparently difftime() is clever enough to recognise strings as dates and automatically gets the right format for the dates, too (it converts its arguments with as.POSIXct() internally):
difftime(my_dates, my_dates[1], units = "secs")
# Time differences in secs
# [1] 0 2419200 2422800 2505600 2509200 2512800 2516400 2563200 2566800 2570400 2592000 2595600
# [13] 2599200 2602800 2649600 2653200 2656800 2678400 2682000 2685600 2689200 2829600 2851200 2854800
# [25] 33350400 33354000 33357600 38448000 63158400 63162000 63165600 63223200 63244800 63248400 63252000 63255600
The lubridate package is your friend in this scenario. Read the dates in one per line, parse them with ymd_hms(), and subtract the first element:
library(lubridate)
d <- read.table(text='"2009-06-01 16:00:00 UTC"
"2009-06-29 16:00:00 UTC"
"2009-06-29 17:00:00 UTC"
"2009-06-30 16:00:00 UTC"
"2009-06-30 17:00:00 UTC"
"2009-06-30 18:00:00 UTC"
"2009-06-30 19:00:00 UTC"
"2009-07-01 08:00:00 UTC"
"2009-07-01 09:00:00 UTC"
"2009-07-01 10:00:00 UTC"
"2009-07-01 16:00:00 UTC"
"2009-07-01 17:00:00 UTC"
"2009-07-01 18:00:00 UTC"
"2009-07-01 19:00:00 UTC"
"2009-07-02 08:00:00 UTC"
"2009-07-02 09:00:00 UTC"
"2009-07-02 10:00:00 UTC"
"2009-07-02 16:00:00 UTC"
"2009-07-02 17:00:00 UTC"
"2009-07-02 18:00:00 UTC"
"2009-07-02 19:00:00 UTC"
"2009-07-04 10:00:00 UTC"
"2009-07-04 16:00:00 UTC"
"2009-07-04 17:00:00 UTC"
"2010-06-22 16:00:00 UTC"
"2010-06-22 17:00:00 UTC"
"2010-06-22 18:00:00 UTC"
"2010-08-20 16:00:00 UTC"
"2011-06-02 16:00:00 UTC"
"2011-06-02 17:00:00 UTC"
"2011-06-02 18:00:00 UTC"
"2011-06-03 10:00:00 UTC"
"2011-06-03 16:00:00 UTC"
"2011-06-03 17:00:00 UTC"
"2011-06-03 18:00:00 UTC"
"2011-06-03 19:00:00 UTC"', stringsAsFactors=FALSE)
d <- ymd_hms(d[, 1])
# difference of every date from the first one, in seconds
difftime(d, d[1], units = "secs")
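Since the question asks for the result in a new column, here is a minimal base R sketch of that last step. It uses only the first three dates from the question, and the names df, date and secs_since_first are illustrative assumptions.

```r
# First three timestamps from the question
my_dates <- c("2009-06-01 16:00:00 UTC",
              "2009-06-29 16:00:00 UTC",
              "2009-06-29 17:00:00 UTC")

# The trailing " UTC" is ignored by the format; tz = "UTC" sets the zone explicitly
df <- data.frame(
  date = as.POSIXct(my_dates, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
)

# Seconds elapsed since the first row, stored as a plain numeric column
df$secs_since_first <- as.numeric(difftime(df$date, df$date[1], units = "secs"))
df
```

Wrapping difftime() in as.numeric() drops the units attribute, which tends to be easier to work with in later calculations.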
Generate a working day sequence in R
I want to generate a working week / working day sequence (Monday-Friday; 8am - 5pm) in R. However, I have only figured out how to extract a working week (Monday-Friday) with all 24 hours.
library(timeDate)
start <- as.POSIXct("2010-01-01")
interval <- 60
seq_1 <- as.timeDate(seq(from=start, by=interval*60, length.out = 200))
seq_2 <- seq_1[isWeekday(seq_1)]; seq_2
dayOfWeek(seq_2)
Is there a similar function which can extract only working hours? Thanks
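You can use the format() function to obtain the hours and keep only those in the working range (hours 8 through 17 cover 8am to 5pm, including the 17:00 timestamp):
seq_2[as.numeric(format(seq_2,'%H')) %in% 8:17 ]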
Select weekdays and then repeat with frequency equal to the desired hours. I'm afraid I missed your 8 o'clock start and used the phrase "9 to 5" as my guide:
# isWeekday() comes from the timeDate package loaded in the question
twoyears <- seq.Date(as.Date("2010-01-01"), by='day', length.out=365*2)
twoworkyrs <- twoyears[isWeekday(twoyears, wday = 1:5)]
twoworkyrs[ 1:10]
# [1] "2010-01-01" "2010-01-04" "2010-01-05" "2010-01-06" "2010-01-07" "2010-01-08"
# [7] "2010-01-11" "2010-01-12" "2010-01-13" "2010-01-14"
workhours <- as.POSIXct(
    as.numeric(rep(twoworkyrs, each=9))*24*3600 + # weekdays, as seconds since the epoch
    (9:17)*3600 ,                                 # working hours, recycled for each day
    origin="1970-01-01", tz="UTC")
#----- First two weeks ----------------
> workhours[1:90]
[1] "2010-01-01 09:00:00 UTC" "2010-01-01 10:00:00 UTC" "2010-01-01 11:00:00 UTC"
[4] "2010-01-01 12:00:00 UTC" "2010-01-01 13:00:00 UTC" "2010-01-01 14:00:00 UTC"
[7] "2010-01-01 15:00:00 UTC" "2010-01-01 16:00:00 UTC" "2010-01-01 17:00:00 UTC"
[10] "2010-01-04 09:00:00 UTC" "2010-01-04 10:00:00 UTC" "2010-01-04 11:00:00 UTC"
[13] "2010-01-04 12:00:00 UTC" "2010-01-04 13:00:00 UTC" "2010-01-04 14:00:00 UTC"
[16] "2010-01-04 15:00:00 UTC" "2010-01-04 16:00:00 UTC" "2010-01-04 17:00:00 UTC"
[19] "2010-01-05 09:00:00 UTC" "2010-01-05 10:00:00 UTC" "2010-01-05 11:00:00 UTC"
[22] "2010-01-05 12:00:00 UTC" "2010-01-05 13:00:00 UTC" "2010-01-05 14:00:00 UTC"
[25] "2010-01-05 15:00:00 UTC" "2010-01-05 16:00:00 UTC" "2010-01-05 17:00:00 UTC"
[snipped]
I must admit that timezone conversions are one of my weakest suits.
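For comparison, the same filtering can be sketched in base R without timeDate. This is a sketch under stated assumptions: hourly timestamps in UTC, working hours taken as the 08:00 through 17:00 stamps, and no handling of public holidays. The names hours, lt, keep and work_hours are illustrative.

```r
# Two weeks of hourly timestamps starting on Friday 2010-01-01, in UTC
start <- as.POSIXct("2010-01-01 00:00:00", tz = "UTC")
hours <- seq(from = start, by = "hour", length.out = 24 * 14)

# POSIXlt exposes the weekday (0 = Sunday ... 6 = Saturday) and the hour of day
lt <- as.POSIXlt(hours, tz = "UTC")

# Keep Monday to Friday, and the hourly stamps from 08:00 up to and including 17:00
keep <- lt$wday %in% 1:5 & lt$hour >= 8 & lt$hour <= 17
work_hours <- hours[keep]

head(work_hours, 12)
```

Change the upper bound to 16 if 17:00 should count as the end of the day rather than as a working hour of its own.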