Converting values in Date Time column into common format [duplicate]

I have a dataframe with a column containing Date Time values, but the values are in different formats.
I want to bring them all to the format "dd/mm/yyyy hh:mm". I tried using the lubridate package to convert the dates with the AM/PM text appended to the dates, but am unable to do so.
Date_Time
"11/01/2019 10:00"
"11/01/2019 11:00"
"11/01/2019 12:00"
"11/01/2019 13:00"
"11/01/2019 14:00"
"11/01/2019 15:00"
"11/01/2019 16:00"
"10/03/2019 23:00"
"10/04/2019 1:00"
"10/28/2019 05:00:00 AM"
"10/28/2019 10:00:00 PM"
"10/29/2019 02:00:00 AM"
"10/29/2019 03:00:00 AM"
"10/31/2019 01:00:00 PM"
"10/31/2019 02:00:00 PM"
"10/31/2019 10:00:00 PM"

You can use lubridate's parse_date_time and pass one order for each format that occurs in the column:
lubridate::parse_date_time(df$Date_Time, c('mdYHM', 'mdYIMSp'))
#[1] "2019-11-01 10:00:00 UTC" "2019-11-01 11:00:00 UTC" "2019-11-01 12:00:00 UTC"
#[4] "2019-11-01 13:00:00 UTC" "2019-11-01 14:00:00 UTC" "2019-11-01 15:00:00 UTC"
#[7] "2019-11-01 16:00:00 UTC" "2019-10-03 23:00:00 UTC" "2019-10-04 01:00:00 UTC"
#[10] "2019-10-28 05:00:00 UTC" "2019-10-28 22:00:00 UTC" "2019-10-29 02:00:00 UTC"
#[13] "2019-10-29 03:00:00 UTC" "2019-10-31 13:00:00 UTC" "2019-10-31 14:00:00 UTC"
#[16] "2019-10-31 22:00:00 UTC"
data
df <- structure(list(Date_Time = c("11/01/2019 10:00", "11/01/2019 11:00",
"11/01/2019 12:00", "11/01/2019 13:00", "11/01/2019 14:00", "11/01/2019 15:00",
"11/01/2019 16:00", "10/03/2019 23:00","10/04/2019 1:00","10/28/2019 05:00:00 AM",
"10/28/2019 10:00:00 PM", "10/29/2019 02:00:00 AM", "10/29/2019 03:00:00 AM",
"10/31/2019 01:00:00 PM", "10/31/2019 02:00:00 PM", "10/31/2019 10:00:00 PM"
)), class = "data.frame", row.names = c(NA, -16L))
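If lubridate is not available, the same idea can be sketched in base R. The helper name parse_mixed below is hypothetical, and the sketch assumes a C or English locale so that %p recognises AM/PM. Because strptime ignores trailing characters in the input, the stricter AM/PM format is tried first; otherwise "10:00:00 PM" would be read as 10:00. The final format() call produces the dd/mm/yyyy hh:mm strings the question asks for.

```r
# Sample values taken from the question, covering each format variant
x <- c("11/01/2019 13:00", "10/04/2019 1:00",
       "10/28/2019 05:00:00 AM", "10/28/2019 10:00:00 PM")

# Hypothetical helper: try each format in turn, filling only unparsed entries
parse_mixed <- function(x, formats, tz = "UTC") {
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (fmt in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

# The AM/PM format goes first because it is the stricter of the two
parsed <- parse_mixed(x, c("%m/%d/%Y %I:%M:%S %p", "%m/%d/%Y %H:%M"))

# Character vector in dd/mm/yyyy hh:mm form
formatted <- format(parsed, "%d/%m/%Y %H:%M")
print(formatted)
```

Note that the formatted result is a character vector, so keep the POSIXct column as well if you still need to sort or do arithmetic on the values.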

Related

Generate an ordered series of datetime

I am working in R.
I have to generate a series of dates and times. In particular, I would like to have two data points per day, hence to assign twice each date with a different time, for instance:
"2001-05-13 00:00:00"
"2001-05-13 12:00:00"
"2001-05-14 00:00:00"
"2001-05-14 12:00:00"
I found the following code to produce a series of dates:
seq(as.Date("2000/1/1"), as.Date("2003/1/1"), by = 0.5)
However, even with by = 0.5, the code returns only dates, not datetimes.
Any idea how to produce a series of datetimes?
as.Date produces only dates; use as.POSIXct to produce date-times.
seq(as.POSIXct("2000-01-01 00:00:00", tz = 'UTC'),
as.POSIXct("2003-01-01 00:00:00", tz = 'UTC'), by = '12 hours')
# [1] "2000-01-01 00:00:00 UTC" "2000-01-01 12:00:00 UTC"
# [3] "2000-01-02 00:00:00 UTC" "2000-01-02 12:00:00 UTC"
# [5] "2000-01-03 00:00:00 UTC" "2000-01-03 12:00:00 UTC"
# [7] "2000-01-04 00:00:00 UTC" "2000-01-04 12:00:00 UTC"
# [9] "2000-01-05 00:00:00 UTC" "2000-01-05 12:00:00 UTC"
#[11] "2000-01-06 00:00:00 UTC" "2000-01-06 12:00:00 UTC"
#[13] "2000-01-07 00:00:00 UTC" "2000-01-07 12:00:00 UTC"
#...
#...
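When the number of points is known rather than the end date, seq also accepts length.out. A minimal sketch using the start date from the question; the variable names are illustrative, and tz = "UTC" is an assumption made here so that daylight saving changes cannot shift the 12-hour steps.

```r
# Start at midnight on the first day of the series
start <- as.POSIXct("2001-05-13 00:00:00", tz = "UTC")

# Two points per day: step by 12 hours, four points in total
twice_daily <- seq(start, by = "12 hours", length.out = 4)

# Explicit format so midnight values keep their time component
stamps <- format(twice_daily, "%Y-%m-%d %H:%M:%S")
print(stamps)
```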

Keep date and time format consistent [duplicate]

I have some dates and times that, for some reason, switch formats at the turn of a new month. I would like to convert them to a consistent day, month, year format (19 rather than 2019) with a 24-hour clock and seconds. I managed to convert half of them using strptime, but that does not work for the entire dataset.
Here is an example:
datetime <- c("02/28/19 11:39:00 PM", "02/28/19 11:40:00 PM", "02/28/19 11:41:00 PM",
"02/28/19 11:42:00 PM", "02/28/19 11:43:00 PM", "02/28/19 11:44:00 PM",
"02/28/19 11:45:00 PM", "02/28/19 11:46:00 PM", "02/28/19 11:47:00 PM",
"02/28/19 11:48:00 PM", "02/28/19 11:49:00 PM", "02/28/19 11:50:00 PM",
"02/28/19 11:51:00 PM", "02/28/19 11:52:00 PM", "02/28/19 11:53:00 PM",
"02/28/19 11:54:00 PM", "02/28/19 11:55:00 PM", "02/28/19 11:56:00 PM",
"02/28/19 11:57:00 PM", "02/28/19 11:58:00 PM", "02/28/19 11:59:00 PM",
"03/01/2019 00:00", "03/01/2019 00:01", "03/01/2019 00:02", "03/01/2019 00:03",
"03/01/2019 00:04", "03/01/2019 00:05", "03/01/2019 00:06", "03/01/2019 00:07",
"03/01/2019 00:08", "03/01/2019 00:09", "03/01/2019 00:10", "03/01/2019 00:11",
"03/01/2019 00:12", "03/01/2019 00:13", "03/01/2019 00:14", "03/01/2019 00:15",
"03/01/2019 00:16", "03/01/2019 00:17", "03/01/2019 00:18", "03/01/2019 00:19",
"03/01/2019 00:20", "03/01/2019 00:21", "03/01/2019 00:22", "03/01/2019 00:23",
"03/01/2019 00:24", "03/01/2019 00:25", "03/01/2019 00:26", "03/01/2019 00:27",
"03/01/2019 00:28", "03/01/2019 00:29")
One possible approach is
lubridate::mdy_hms(datetime, truncated = 1L)
which returns
[1] "2019-02-28 23:39:00 UTC" "2019-02-28 23:40:00 UTC" "2019-02-28 23:41:00 UTC" "2019-02-28 23:42:00 UTC"
[5] "2019-02-28 23:43:00 UTC" "2019-02-28 23:44:00 UTC" "2019-02-28 23:45:00 UTC" "2019-02-28 23:46:00 UTC"
[9] "2019-02-28 23:47:00 UTC" "2019-02-28 23:48:00 UTC" "2019-02-28 23:49:00 UTC" "2019-02-28 23:50:00 UTC"
[13] "2019-02-28 23:51:00 UTC" "2019-02-28 23:52:00 UTC" "2019-02-28 23:53:00 UTC" "2019-02-28 23:54:00 UTC"
[17] "2019-02-28 23:55:00 UTC" "2019-02-28 23:56:00 UTC" "2019-02-28 23:57:00 UTC" "2019-02-28 23:58:00 UTC"
[21] "2019-02-28 23:59:00 UTC" "2019-03-01 00:00:00 UTC" "2019-03-01 00:01:00 UTC" "2019-03-01 00:02:00 UTC"
[25] "2019-03-01 00:03:00 UTC" "2019-03-01 00:04:00 UTC" "2019-03-01 00:05:00 UTC" "2019-03-01 00:06:00 UTC"
[29] "2019-03-01 00:07:00 UTC" "2019-03-01 00:08:00 UTC" "2019-03-01 00:09:00 UTC" "2019-03-01 00:10:00 UTC"
[33] "2019-03-01 00:11:00 UTC" "2019-03-01 00:12:00 UTC" "2019-03-01 00:13:00 UTC" "2019-03-01 00:14:00 UTC"
[37] "2019-03-01 00:15:00 UTC" "2019-03-01 00:16:00 UTC" "2019-03-01 00:17:00 UTC" "2019-03-01 00:18:00 UTC"
[41] "2019-03-01 00:19:00 UTC" "2019-03-01 00:20:00 UTC" "2019-03-01 00:21:00 UTC" "2019-03-01 00:22:00 UTC"
[45] "2019-03-01 00:23:00 UTC" "2019-03-01 00:24:00 UTC" "2019-03-01 00:25:00 UTC" "2019-03-01 00:26:00 UTC"
[49] "2019-03-01 00:27:00 UTC" "2019-03-01 00:28:00 UTC" "2019-03-01 00:29:00 UTC"
for the given sample dataset.
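The parsed values are POSIXct objects, which print in ISO order. To get the day, month, two-digit year layout the question asks for, they can be passed through format(). A small sketch on two of the parsed values, written as ISO strings here so the example does not depend on lubridate:

```r
# Two of the parsed values from the output above
parsed <- as.POSIXct(c("2019-02-28 23:39:00", "2019-03-01 00:00:00"), tz = "UTC")

# %d/%m/%y gives day/month/two-digit year; %H:%M:%S is the 24-hour clock with seconds
formatted <- format(parsed, "%d/%m/%y %H:%M:%S")
print(formatted)
```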

Parse 24:00 AM datetime in R

I have some data with an unconventional date format for midnight. In the raw data, midnight is being treated as "1/1/2018 24:00 AM" instead of "1/2/2018 00:00 AM". Why would anyone do this?!
I'd like to convert this character vector into a POSIXct() format.
Here is some example data:
datetime <- c("1/1/2018 11:00 PM", "1/1/2018 24:00 AM", "1/2/2018 01:00 AM")
The following code fails to parse midnight but does what I want otherwise:
as.POSIXct(datetime, format = "%m/%d/%Y %I:%M %p")
This returns the following:
[1] "2018-01-01 23:00:00 GMT" NA "2018-01-02 01:00:00 GMT"
An alternative is to use lubridate::mdy_hm which parses 24:00 AM correctly as 00:00 AM on the next day.
library(lubridate)
mdy_hm(datetime)
#[1] "2018-01-01 23:00:00 UTC" "2018-01-02 00:00:00 UTC"
#[3] "2018-01-02 01:00:00 UTC"
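For comparison, a base R workaround can be sketched as follows. It assumes that "24:00 AM" always means midnight at the end of the stated day and that the locale recognises AM/PM for %p. The idea is to rewrite those entries as "12:00 AM" (midnight at the start of the same day) and then add one day to them.

```r
datetime <- c("1/1/2018 11:00 PM", "1/1/2018 24:00 AM", "1/2/2018 01:00 AM")

# Flag the unconventional midnight entries
is_24 <- grepl(" 24:00 AM$", datetime)

# 12:00 AM is midnight at the start of the same day and is valid for %I
fixed <- sub(" 24:00 AM$", " 12:00 AM", datetime)
out <- as.POSIXct(fixed, format = "%m/%d/%Y %I:%M %p", tz = "UTC")

# Roll the flagged entries forward by one day (in seconds; UTC has no DST)
out[is_24] <- out[is_24] + 24 * 3600
print(out)
```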

Convert string with AM/PM to 24 hours date format

I have been trying to convert a string with times in AM/PM format to a 24-hour date object. strptime converts to 24-hour format, but keeps adding today's date to the output. The problem is that I need ONLY the TIME, such as '13:00'.
Thank you very much in advance!
am_pm <- c("8:00 am","3:30 PM", "10:00 AM", "8:00 AM", "9:00 PM", "9:00 AM")
strptime(am_pm, "%I:%M %p")
[1] "2017-10-19 08:00:00 UTC" "2017-10-19 15:30:00 UTC" "2017-10-19 10:00:00 UTC"
[4] "2017-10-19 08:00:00 UTC" "2017-10-19 21:00:00 UTC" "2017-10-19 09:00:00 UTC"
You will need an extra step for that. Use the as.ITime() function from the data.table package.
library(data.table)
am_pm <- c("8:00 am","3:30 PM", "10:00 AM", "8:00 AM", "9:00 PM", "9:00 AM")
mytime <- strptime(am_pm, "%I:%M %p")
as.ITime(mytime)
[1] "08:00:00" "15:30:00" "10:00:00" "08:00:00" "21:00:00" "09:00:00"
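If a character result such as '13:00' is enough, a base R sketch avoids the extra package: parse with strptime and immediately format the result without the date part. The input values below are a subset of the question's data, and tz = "UTC" is an assumption added here to keep the result independent of the local time zone.

```r
am_pm <- c("3:30 PM", "10:00 AM", "9:00 PM")

# strptime fills in today's date; format() with %H:%M drops it again
time_only <- format(strptime(am_pm, "%I:%M %p", tz = "UTC"), "%H:%M")
print(time_only)
```

The result is plain text, so it sorts correctly as a string but does not support time arithmetic the way an ITime value does.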

Generate a working day sequence in R

I want to generate a working week / working day sequence (Monday-Friday; 8am - 5pm) in R. However I only figured out how to extract a working week (Monday-Friday) with 24 hours.
library(timeDate)
start <- as.POSIXct("2010-01-01")
interval <- 60
seq_1 <- as.timeDate(seq(from=start, by=interval*60, length.out = 200))
seq_2 <- seq_1[isWeekday(seq_1)]; seq_2
dayOfWeek(seq_2)
Is there a similar function which can extract only working hours? Thanks
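You can use format() to extract the hour and keep only the working hours. The range 8:17 keeps the timestamps from 08:00 through 17:00; use 8:16 if the 17:00 timestamp should be excluded.
seq_2[as.numeric(format(seq_2,'%H')) %in% 8:17 ]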
Select weekdays and then repeat with frequency equal to the desired hours. I'm afraid I missed your 8 o'clock start and used the phrase "9 to 5" as my guide:
twoyears <- seq.Date(as.Date("2010-01-01"), by='day', length.out=365*2)
twoworkyrs <- twoyears[isWeekday(twoyears, wday = 1:5)]
twoworkyrs[ 1:10]
# [1] "2010-01-01" "2010-01-04" "2010-01-05" "2010-01-06" "2010-01-07" "2010-01-08"
# [7] "2010-01-11" "2010-01-12" "2010-01-13" "2010-01-14"
workhours <- as.POSIXct( as.numeric(rep(twoworkyrs, each=9))*24*3600 + # weekdays, in seconds
(9:17)*3600 , # working hours, 09:00 to 17:00
origin="1970-01-01", tz="UTC") # UTC matches the output shown below
#----- First two weeks ----------------
> workhours[1:90]
[1] "2010-01-01 09:00:00 UTC" "2010-01-01 10:00:00 UTC" "2010-01-01 11:00:00 UTC"
[4] "2010-01-01 12:00:00 UTC" "2010-01-01 13:00:00 UTC" "2010-01-01 14:00:00 UTC"
[7] "2010-01-01 15:00:00 UTC" "2010-01-01 16:00:00 UTC" "2010-01-01 17:00:00 UTC"
[10] "2010-01-04 09:00:00 UTC" "2010-01-04 10:00:00 UTC" "2010-01-04 11:00:00 UTC"
[13] "2010-01-04 12:00:00 UTC" "2010-01-04 13:00:00 UTC" "2010-01-04 14:00:00 UTC"
[16] "2010-01-04 15:00:00 UTC" "2010-01-04 16:00:00 UTC" "2010-01-04 17:00:00 UTC"
[19] "2010-01-05 09:00:00 UTC" "2010-01-05 10:00:00 UTC" "2010-01-05 11:00:00 UTC"
[22] "2010-01-05 12:00:00 UTC" "2010-01-05 13:00:00 UTC" "2010-01-05 14:00:00 UTC"
[25] "2010-01-05 15:00:00 UTC" "2010-01-05 16:00:00 UTC" "2010-01-05 17:00:00 UTC"
[snipped]
I must admit that timezone conversions are one of my weakest suits.
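The two answers can also be combined in base R without timeDate. The sketch below builds an hourly sequence and filters it on the wday and hour fields of POSIXlt. It assumes that "8am - 5pm" should include both the 08:00 and the 17:00 timestamps, and it uses UTC to sidestep the time zone questions mentioned above; the variable names are illustrative.

```r
# Two weeks of hourly timestamps starting on Friday 2010-01-01
hours <- seq(as.POSIXct("2010-01-01 00:00", tz = "UTC"),
             by = "1 hour", length.out = 24 * 14)

# POSIXlt exposes wday (0 = Sunday, ..., 6 = Saturday) and hour (0 to 23)
lt <- as.POSIXlt(hours)

# Keep Monday to Friday, 08:00 through 17:00 inclusive
work <- hours[lt$wday %in% 1:5 & lt$hour %in% 8:17]

print(head(work))
```

Public holidays are not handled here; they would need a separate list of dates to exclude.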
