NA returned while using strptime - r

I have a data frame with Date and Time columns. I am trying to combine these two columns, but strptime returns NA. I want to understand why this is happening.
x <- data.frame(date = "1/2/2007", time = "00:00:02")
y <- strptime(paste(x$date,x$time,sep = " "), format = "%b/%d/%y %H:%M:%S")

We need %m and %Y in place of %b and %y. %b is the abbreviated month name in the current locale on this platform and %y is the year without century (00–99), whereas the input has a numeric month and a four-digit year.
strptime(paste(x$date,x$time,sep = " "), "%m/%d/%Y %H:%M:%S")
#[1] "2007-01-02 00:00:02 IST"
To understand the format codes, it is best to check ?strptime
Or we can use mdy_hms from lubridate
library(lubridate)
with(x, mdy_hms(paste(date, time)))
#[1] "2007-01-02 00:00:02 UTC"
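To see the difference side by side, here is a minimal sketch that runs both formats on the same input. The names s, bad and good are only for illustration, and tz = "UTC" is an assumption added so the result does not depend on the local time zone.

```r
x <- data.frame(date = "1/2/2007", time = "00:00:02")
s <- paste(x$date, x$time)

# Original format: %b expects a month name such as "Jan", so "1" cannot match.
bad <- strptime(s, format = "%b/%d/%y %H:%M:%S", tz = "UTC")
is.na(bad)
# TRUE

# %m (numeric month) together with %Y (four-digit year) matches the input.
good <- strptime(s, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
format(good, "%Y-%m-%d %H:%M:%S")
# "2007-01-02 00:00:02"
```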

Related

Converting long integer into date and time in r

I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as follows:
"2020-10-19 20:20"
So far I have tried this but cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year %Y, month %m and day %d.
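The attempts with origin fail for a different reason: as.POSIXct with an origin treats z as a count of seconds since that origin, which is why the year comes out as 642048. Since z is stored as a double, it may also be safer to turn it into plain digits explicitly before parsing. A minimal sketch, assuming UTC is an acceptable time zone (z_chr and parsed are illustrative names):

```r
z <- 20201019083000

# sprintf("%.0f", ...) writes the number as plain digits, so the string
# cannot end up in scientific notation.
z_chr <- sprintf("%.0f", z)

# Parse the digits as year, month, day, hour, minute, second.
parsed <- as.POSIXct(z_chr, format = "%Y%m%d%H%M%S", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
```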

parsing a character date time format in R

I am trying to parse a column into two variables, "date" and "time" in R. I have installed the lubridate library.
The current csv file has the following timestamp format: yyyyMMdd hh:mm a (e.g. '20170423 12:26 AM') and imports the column as character.
I'm trying this, but it's not working on my current variable 'Tran_Date' (the code below doesn't work):
transactions_file <- as_date('Tran_Date', "%Y%m%d %H:%M %p")
I prefer a base R solution like this:
Tran_Date <- as.POSIXct("20170423 12:26 AM", format = "%Y%m%d %I:%M %p")
Tran_Date
#> [1] "2017-04-23 00:26:00 CEST"
transactions_file <- data.frame(
date = format(Tran_Date,"%m/%d/%Y"),
time = format(Tran_Date,"%H:%M")) # possibly add %p if you use %I
transactions_file
#>         date  time
#> 1 04/23/2017 00:26
With lubridate:
# install.packages(c("tidyverse"), dependencies = TRUE)
library(lubridate)
Tran_Date <- ymd_hm("20170423 12:26 AM")
Then you could reuse the format() calls above, or assemble the pieces from lubridate's accessors, e.g. paste month(Tran_Date) with day(Tran_Date) for the date and paste(hour(Tran_Date), minute(Tran_Date), sep = ":") for the time, or most likely something smarter.
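Since the question is about a whole column rather than a single string, here is a hedged sketch of the base R approach applied to a data frame. The two sample rows are made up for illustration, and it assumes an English or C locale so that %p recognises AM and PM.

```r
# Hypothetical stand-in for the imported csv, with the column as character.
transactions_file <- data.frame(
  Tran_Date = c("20170423 12:26 AM", "20170423 01:05 PM"),
  stringsAsFactors = FALSE
)

# %I is the 12-hour clock and must be paired with %p for AM/PM.
parsed <- as.POSIXct(transactions_file$Tran_Date,
                     format = "%Y%m%d %I:%M %p", tz = "UTC")

# Split the parsed value into a Date column and a 24-hour time string.
transactions_file$date <- as.Date(parsed, tz = "UTC")
transactions_file$time <- format(parsed, "%H:%M")
transactions_file$time
# "00:26" "13:05"
```

The format string is passed to as.POSIXct, not to as_date, and the function is given the column itself rather than the quoted name 'Tran_Date'.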

Dealing with date-time string that has day of the week

I have a date-time string that has day of the week and some meta-data in the string.
d <- "Fri, 14 Jul 2000 06:59:00 -0700 (PDT)"
I need to convert it into a date-time object (e.g. I have a column of these in a data.table) for further analysis. I have dealt with this using regexes to strip off meta-data from the string. Is there a better approach?
What I have is:
m <- regexpr("^\\w+,\\s+", d, perl=TRUE)
regmatches(d, m)
m <- regexpr("\\s-?\\d+\\s\\(\\w+\\)$", d, perl=TRUE)
regmatches(d, m)
ds <- sub("^\\w+,\\s+", "", d)
ds <- sub("\\s-?\\d+\\s\\(\\w+\\)$", "", ds)
Now I can convert this to date-time objects of class Date, POSIXlt or POSIXct for use in analysis.
dd <- strptime(ds, format="%d %b %Y %H:%M:%S")
dd <- as.Date(ds, format="%d %b %Y %H:%M:%S")
dd <- as.POSIXct(ds, format="%d %b %Y %H:%M:%S")
I wrote the anytime package to help with (among other things) these silly format strings -- so it heuristically just tries a number of them (and focuses on sane ones).
The input you have here qualifies (and is in fact a pretty common form):
R> anytime("Fri, 14 Jul 2000 06:59:00 -0700 (PDT)")
[1] "2000-07-14 06:59:00 CDT"
R>
We do not currently try to capture the timezone offset information at the end, so you have to deal with that after the fact. The display is in CDT which is my local timezone.
There is some more information about anytime on its webpage.
Assuming the format of the string is constant across your data:
time = trimws(unlist(strsplit(d, "[,-]"))[2])
#[1] "14 Jul 2000 06:59:00"
tz = unlist(strsplit(d, "[,-]"))[3]
tz = gsub("[^A-Z]", "", tz)
#[1] "PDT"
> as.Date(time, format = "%d %b %Y")
[1] "2000-07-14"
> as.POSIXct(time, format = "%d %b %Y %H:%M:%S") # specify the timezone with tz
[1] "2000-07-14 06:59:00 IST"
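Another option that avoids the regexes is to let strptime read the weekday and the numeric offset directly: %a matches the abbreviated weekday and %z the signed UTC offset, and ?strptime notes that trailing characters beyond the format are ignored, which covers the " (PDT)" part. A minimal sketch, assuming the result is wanted in UTC; the Sys.setlocale call is an added assumption so that %a and %b match English names.

```r
# %a and %b match English names only in an English or C locale.
invisible(Sys.setlocale("LC_TIME", "C"))

d <- "Fri, 14 Jul 2000 06:59:00 -0700 (PDT)"

# %z reads the "-0700" offset; the time is then converted to the tz given here.
parsed <- as.POSIXct(d, format = "%a, %d %b %Y %H:%M:%S %z", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M:%S")
# "2000-07-14 13:59:00"
```

Because as.POSIXct is vectorised, the same call should work on a whole column of such strings.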

R convert factor in format "1/15/2016 3:20:00 AM" to Date

Input is in the format
"1/15/2016 3:20:00 AM"
mydate<- factor("1/15/2016 3:20:00 AM")
I tried several variations, such as
mydate<-as.Date(mydate, format = "%m/%d/%y %I:%M %p")
but I get NA as output.
Please help!
Your string has both a date and a time component. I would first parse the full date and time to a POSIXct value, and then, if you're really only interested in the date part, you can coerce to Date:
dtstr <- '1/15/2016 3:20:00 AM';
dt <- as.POSIXct(dtstr,format='%m/%d/%Y %I:%M:%S %p');
dt;
## [1] "2016-01-15 03:20:00 EST"
date <- as.Date(dt);
date;
## [1] "2016-01-15"
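For the factor from the question, a short sketch of why the original call gives NA and how to get the date directly. The names attempt and date_only are illustrative, and the explicit as.character is there only to make the conversion from factor visible.

```r
mydate <- factor("1/15/2016 3:20:00 AM")

# Original attempt: %y reads only two digits of the year ("20"), so the rest
# of the string no longer lines up with the format and the result is NA.
attempt <- as.Date(as.character(mydate), format = "%m/%d/%y %I:%M %p")
is.na(attempt)
# TRUE

# Date only: %Y reads the four-digit year; the trailing time part is ignored.
date_only <- as.Date(as.character(mydate), format = "%m/%d/%Y")
format(date_only)
# "2016-01-15"
```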

convert character string with timezone to date in r

I have a vector of character strings looking like this. I want to convert them to dates. The time-zone characters are causing trouble.
> a
[1] "07/17/2014 5:01:22 PM EDT" "7/17/2014 2:01:05 PM PDT" "07/17/2014 4:00:48 PM CDT" "07/17/2014 3:05:16 PM MDT"
If I use: strptime(a, "%d/%m/%Y %I:%M:%S %p %Z") I get [1] NA
If I omit the "%Z" for time-zone, and use this:
strptime(a, "%m/%d/%Y %I:%M:%S %p", tz = "EST5EDT") I get
[1] "2014-07-17 17:01:22 EDT"
Since my strings contain various time zones - PDT, CDT, EDT, MDT , I can't default all time zones to EST5EDT. One way to overcome is split the vector into different vectors for each time-zone, remove the letters PDT / EDT etc. and apply the right timezone with strptime - "EST5EDT" , "CST6CDT" etc. Is there any other way to solve this?
If the date is always the first part of each element of the character vector and is always followed by the time, splitting the elements on whitespace is one possibility. If only the date is needed:
dates <- sapply(a, function(x) strsplit(x, split = " ")[[1]][1])
dates <- as.Date(as.character(dates), format = "%m/%d/%Y")
[1] "2014-07-17" "2014-07-17" "2014-07-17" "2014-07-17"
If also the time is needed:
datetime <- sapply(a, function(x) paste(strsplit(x, split = " ")[[1]][1:3],
collapse = " "))
datetime <- strptime(as.character(datetime), format = "%m/%d/%Y %I:%M:%S %p")
[1] "2014-07-17 17:01:22 CEST" "2014-07-17 14:01:05 CEST"
You can set a different timezone using the tz argument here.
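If the goal is to keep the per-element time zone rather than drop it, one possible approach is to look up a UTC offset for each abbreviation and shift the parsed times accordingly. This is only a sketch: the offset_hours table is an assumption that covers just the four abbreviations shown in the question, and it assumes an English or C locale so that %p recognises AM and PM.

```r
a <- c("07/17/2014 5:01:22 PM EDT", "7/17/2014 2:01:05 PM PDT",
       "07/17/2014 4:00:48 PM CDT", "07/17/2014 3:05:16 PM MDT")

# Hypothetical lookup table: hours relative to UTC for each abbreviation.
offset_hours <- c(EDT = -4, CDT = -5, MDT = -6, PDT = -7)

abbr  <- sub(".* ", "", a)        # last token, e.g. "EDT"
stamp <- sub(" [A-Z]+$", "", a)   # everything before the abbreviation

# Parse the wall-clock time as if it were UTC, then remove the offset
# to get the true instant in UTC.
wall <- as.POSIXct(stamp, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
utc  <- wall - unname(offset_hours[abbr]) * 3600
format(utc, "%Y-%m-%d %H:%M:%S")
# "2014-07-17 21:01:22" "2014-07-17 21:01:05" "2014-07-17 21:00:48" "2014-07-17 21:05:16"
```

All four values end up in a single POSIXct vector in UTC, so they can be compared or sorted directly without splitting the data by time zone.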
