Formatting inconsistent datetime variable [duplicate] - r

I have a large dataset (a few million observations) that contains a datetime variable in inconsistent formats: "%Y-%m-%d %H:%M:%S" and "%m/%d/%Y %H:%M:%S" (some values also lack the seconds).
Here is what the dataset looks like:
df <- data.frame(var1 = 1:6,
                 var2 = c("A", "B", "C", "A", "B", "C"),
                 datetime = c("2013-07-01 00:00:02", "2016-07-01 00:00:01",
                              "9/2/2014 00:01:20", "9/1/2014 00:00:25",
                              "1/1/2015 0:07", "6/1/2015 0:01"))
Is there an efficient way to convert the datetime variable into a single, consistent date-time format?

You can use the lubridate package like this:
lubridate::parse_date_time(x = df$datetime, c("ymd HMS","mdy HMS"))
[1] "2013-07-01 00:00:02 UTC" "2016-07-01 00:00:01 UTC" "2014-09-02 00:01:20 UTC"
[4] "2014-09-01 00:00:25 UTC" NA NA
Warning message:
2 failed to parse.
lubridate::parse_date_time(x = df$datetime, c("ymd HMS","mdy HMS","mdy HM"))
[1] "2013-07-01 00:00:02 UTC" "2016-07-01 00:00:01 UTC" "2014-09-02 00:01:20 UTC"
[4] "2014-09-01 00:00:25 UTC" "2015-01-01 00:07:00 UTC" "2015-06-01 00:01:00 UTC"
You can specify as many date-time formats as you need. Compare the two examples above: the first leaves two values unparsed, and adding "mdy HM" in the second handles them.
Hope this helps you. :)

A POSIXct solution using parse_date_time from lubridate.
EDIT: incorporating @Akarsh Jain's POSIXct formats for better time alignment.
library(lubridate)
df$new_date <- parse_date_time(df$datetime, c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S", "%m/%d/%Y %H:%M"))
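
If lubridate is not available, the same idea of trying several formats in turn can be sketched in base R. The helper name parse_mixed and the choice of UTC are assumptions made for illustration and do not come from the answers above. The formats are tried in order, and the one without seconds goes last because strptime ignores trailing characters and would otherwise match strings that do have seconds.

```r
# Hypothetical helper: try each format in turn and keep the first that parses.
parse_mixed <- function(x, formats, tz = "UTC") {
  x <- as.character(x)
  # Start with an all-NA POSIXct vector of the right length.
  out <- as.POSIXct(rep(NA_real_, length(x)), origin = "1970-01-01", tz = tz)
  for (fmt in formats) {
    todo <- is.na(out)          # values no earlier format could parse
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

datetime <- c("2013-07-01 00:00:02", "2016-07-01 00:00:01",
              "9/2/2014 00:01:20", "9/1/2014 00:00:25",
              "1/1/2015 0:07", "6/1/2015 0:01")
formats <- c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S", "%m/%d/%Y %H:%M")

parsed <- parse_mixed(datetime, formats)
format(parsed, "%Y-%m-%d %H:%M:%S")
```

Any value that is still NA afterwards matched none of the formats, so is.na(parsed) is a quick way to find rows that need another format added.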


Converting long integer into date and time in r [duplicate]

I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as the following:
"2020-10-19 20:20"
So far I have tried the following, but I cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year (%Y), month (%m), or day (%d).
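
One caveat when z is stored as a number rather than a string: R may render some large doubles in scientific notation when converting them to character, which would break the format match. A minimal sketch that sidesteps this by formatting the number explicitly with sprintf before parsing (the variable name z_chr and the UTC time zone are assumptions for illustration):

```r
z <- 20201019083000

# Render the number as plain digits, with no decimals and no exponent.
z_chr <- sprintf("%.0f", z)

# Parse the 14-digit string; tz = "UTC" is an assumption, use your own zone.
dt <- as.POSIXct(z_chr, format = "%Y%m%d%H%M%S", tz = "UTC")

format(dt, "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
```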

How to format these times in R

I've got some unusually formatted strings that I need to get into datetime objects, and I can't find what I need from the strptime function documentation. Examples of strings that I need formatted are:
4/16/2018 0:00:00
8/30/2019 14:35:00
11/15/2017 8:15:10
I can't find any specification that matches these kinds of strings that I can use as the format in strptime. Am I going to have to format the strings first?
We can use the format "%m/%d/%Y %H:%M:%S" to convert the strings to date-time objects:
as.POSIXct(str1, format = "%m/%d/%Y %H:%M:%S")
#[1] "2018-04-16 00:00:00 CDT" "2019-08-30 14:35:00 CDT" "2017-11-15 08:15:10 CST"
data
str1 <- c("4/16/2018 0:00:00", "8/30/2019 14:35:00", "11/15/2017 8:15:10")
I think this is enough:
library(lubridate)
dates = c('4/16/2018 0:00:00', '8/30/2019 14:35:00', '11/15/2017 8:15:10')
dates = mdy_hms(dates)
dates
Output:
[1] "2018-04-16 00:00:00 UTC" "2019-08-30 14:35:00 UTC" "2017-11-15 08:15:10 UTC"
These seem to be quite ordinary strings for strptime. Or was there another problem?
x <- "4/16/2018 0:00:01"
strptime(x ,"%m/%d/%Y %H:%M:%S")
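
The reason no pre-formatting is needed is that %m, %d and %H accept values without a leading zero on input. A short sketch of that, plus a check for strings that fail to parse; the extra "not a date" element and the UTC time zone are assumptions added for illustration:

```r
str1 <- c("4/16/2018 0:00:00", "8/30/2019 14:35:00",
          "11/15/2017 8:15:10", "not a date")

# Single-digit months, days and hours parse without zero padding.
parsed <- as.POSIXct(str1, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")

# Strings that do not match the format come back as NA instead of raising an error.
bad <- str1[is.na(parsed)]

format(parsed[1:3], "%Y-%m-%d %H:%M:%S")
bad
```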

Including seconds when using strptime with examples such as 10-10-2010 00:00:00 [duplicate]

I have had a good hunt around and I'm sure this has been answered before, but I can't seem to find any help!
I have a series of times in a data frame, some of which have a timestamp in the following format:
Date <- '2018-10-10'
Time <- '00:00:00'
When I use the strptime function, it returns only the date and drops the 00:00:00, see below:
datetime <- strptime(paste(Date,Time),
format = "%Y-%m-%d %H:%M:%S",
tz = 'GMT')
> datetime
[1] "2018-10-10 GMT"
If, for example, Time were '00:00:01', it would return
> datetime
[1] "2018-10-10 00:00:01 GMT"
Does anyone know a way of ensuring that the time is displayed for 00:00:00 instances? Desired output:
"2018-10-10 00:00:00 GMT"
Many thanks!!
Jim
When you type datetime and hit <Enter>, R uses the print method for that class to display it. For date-times, that method omits the time of day when every value falls exactly on midnight. Just because datetime prints as "2018-10-10 GMT" doesn't mean that it has forgotten about the hours, minutes, and seconds.
To ensure a consistent format of your POSIXlt object, you could use format:
format(datetime, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
#[1] "2018-10-10 00:00:00 GMT"
Similarly for the second case:
Date <- '2018-10-10'
Time <- '00:00:01'
datetime <- strptime(paste(Date,Time), format = "%Y-%m-%d %H:%M:%S", tz = 'GMT')
format(datetime, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
#[1] "2018-10-10 00:00:01 GMT"
Sample data
Date <- '2018-10-10'
Time <- '00:00:00'
datetime <- strptime(paste(Date,Time), format = "%Y-%m-%d %H:%M:%S", tz = 'GMT')
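
To check that the time of day really is stored, you can inspect the components of the POSIXlt object directly or do arithmetic on it. This is only a sketch to illustrate the point above; the variable name later is an assumption.

```r
datetime <- strptime("2018-10-10 00:00:00",
                     format = "%Y-%m-%d %H:%M:%S", tz = "GMT")

# The components are stored even though print() omits them at midnight.
c(hour = datetime$hour, min = datetime$min, sec = datetime$sec)

# The underlying value is a whole number of days since the epoch, in seconds.
secs <- as.numeric(as.POSIXct(datetime))
secs %% 86400
# 0

# Arithmetic still works at one-second resolution.
later <- datetime + 1
format(later, "%H:%M:%S")
# "00:00:01"
```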

I would like to extract the time from a character vector [duplicate]

I have date and time stamps stored as a character vector:
"2018-12-13 11:00:01 EST" "2018-10-23 22:00:01 EDT" "2018-11-03 14:15:00 EDT" "2018-10-04 19:30:00 EDT" "2018-11-10 17:15:31 EST" "2018-10-05 13:30:00 EDT"
How can I strip the time from this character vector?
PS: Can someone please help? I have tried using strptime, but I am getting NA values as a result.
It's a bit unclear whether you want the date or time but if you want the date then as.Date ignores any junk after the date so:
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")
as.Date(x)
## [1] "2018-12-13" "2018-10-23"
would be sufficient to get a Date vector from the input vector x. No packages are used.
If you want the time then:
read.table(text = x, as.is = TRUE)[[2]]
## [1] "11:00:01" "22:00:01"
If you want a data frame with each part in a separate column then:
read.table(text = x, as.is = TRUE, col.names = c("date", "time", "tz"))
## date time tz
## 1 2018-12-13 11:00:01 EST
## 2 2018-10-23 22:00:01 EDT
I think the OP wants to extract the time from date-time variable (going by the title of the question).
x <- "2018-12-13 11:00:01 EST"
as.character(strptime(x, "%Y-%m-%d %H:%M:%S"), "%H:%M:%S")
[1] "11:00:01"
Another option:
library(lubridate)
format(ymd_hms(x, tz = "EST"), "%H:%M:%S")
[1] "11:00:01"
The package lubridate makes everything like this easy:
library(lubridate)
x <- "2018-12-13 11:00:01 EST"
as_date(ymd_hms(x))
You can use the as.Date function and specify the format:
> as.Date("2018-12-13 11:00:01 EST", format="%Y-%m-%d")
[1] "2018-12-13"
If all values are in a vector:
x = c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
"2018-11-03 14:15:00 EDT", "2018-10-04 19:30:00 EDT",
"2018-11-10 17:15:31 EST", "2018-10-05 13:30:00 EDT")
> as.Date(x, format="%Y-%m-%d")
[1] "2018-12-13" "2018-10-23" "2018-11-03" "2018-10-04" "2018-11-10"
[6] "2018-10-05"
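
Putting the answers together, here is a base R sketch that pulls both the time and the date out of the whole vector in one pass. It assumes you want the clock time exactly as written in each string: the trailing EST/EDT text is not parsed, and tz = "UTC" is used only as a neutral container so that no daylight-saving shifts are applied.

```r
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
       "2018-11-03 14:15:00 EDT")

# The format stops after the seconds, so the zone abbreviation is ignored.
parsed <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

times <- format(parsed, "%H:%M:%S")
dates <- as.Date(parsed)

times
# "11:00:01" "22:00:01" "14:15:00"
```

Because the strings are fixed-width, substr(x, 12, 19) returns the same times without any parsing, though it will not catch malformed input.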

Converting Navy sunset/sunrise data into time

I have downloaded the 2015-2017 sunset/sunrise data from the Navy and I am trying to format it into dates and time to further use with other data I have. This is how my data set looks in R.
I have managed to convert the date into the right R date format. Yet I still can't convert my rise/set columns from a number/integer format to a time (as hh:mm).
Based on one internet source, I wrote the following code:
Sun2015$Srise<- format(strptime(Sun2015$Rise, format="%H:%M"))
However this gives NA in my data
OR
Sun2015$Srise<-str_pad(Sun2015$Rise, 4, pad="0")
Sun2015$Srise<-hour(hm(Sun2015$Srise))
Yet I received the following warning:
Warning message: In .parse_hms(..., order = "HM", quiet = quiet) :
Some strings failed to parse.
Is there a better way to convert the columns into the right time format so that I can merge the date and time columns into date-time columns for sunset and sunrise?
Thank you in advance for your help.
You can convert your military time to zero-padded 24-hour strings using sprintf("%04d", data) and go from there. For example, with the first 5 lines of your data:
# Sample of your data
Day <- c("1/1/2015", "1/2/2015", "1/3/2015", "1/4/2015", "1/5/2015")
Rise <- c(652,652,652,653,653)
Set <- c(1755,1756,1756,1757,1757)
sun2015 <- data.frame(Day, Rise, Set)
# Convert to 2400 style strings with leading zeroes where necessary
sun2015$Rise <- sprintf("%04d", sun2015$Rise)
sun2015$Set <- sprintf("%04d", sun2015$Set)
# Merge with your date
sun2015$day_rise <- as.POSIXct(paste0(sun2015$Day, " ",sun2015$Rise), format = "%m/%d/%Y %H%M", origin = "1970-01-01", tz = "UTC")
sun2015$day_set <- as.POSIXct(paste0(sun2015$Day, " ",sun2015$Set), format = "%m/%d/%Y %H%M", origin = "1970-01-01", tz = "UTC")
> sun2015$day_rise
[1] "2015-01-01 06:52:00 UTC" "2015-01-02 06:52:00 UTC" "2015-01-03 06:52:00 UTC" "2015-01-04 06:53:00 UTC"
[5] "2015-01-05 06:53:00 UTC"
> sun2015$day_set
[1] "2015-01-01 17:55:00 UTC" "2015-01-02 17:56:00 UTC" "2015-01-03 17:56:00 UTC" "2015-01-04 17:57:00 UTC"
[5] "2015-01-05 17:57:00 UTC"
You can adjust to the appropriate time zone if necessary.
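
An alternative sketch that avoids string padding: split each number into hours and minutes with integer division and remainder, then build an hh:mm string. The variable names below are assumptions for illustration, and UTC is used only as a placeholder time zone.

```r
Day  <- c("1/1/2015", "1/2/2015", "1/4/2015")
Rise <- c(652, 652, 653)

# 652 means 06:52: the hundreds hold the hour, the remainder holds the minutes.
hours   <- Rise %/% 100
minutes <- Rise %% 100
rise_hhmm <- sprintf("%02d:%02d", as.integer(hours), as.integer(minutes))

# Combine with the date column into a single date-time.
day_rise <- as.POSIXct(paste(Day, rise_hhmm),
                       format = "%m/%d/%Y %H:%M", tz = "UTC")

rise_hhmm
# "06:52" "06:52" "06:53"
```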
