Parsing ISO8601 date and time format in R [duplicate] - r

This question already has answers here:
Using strptime %z with special timezone format
(2 answers)
Closed 9 years ago.
This should be quick - we are parsing the following format in R:
2013-04-05T07:49:54-07:00
My current approach is
require(stringr)
timenoT <- str_replace_all("2013-04-05T07:49:54-07:00", "T", " ")
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S%z", tz="UTC")
but it gives NA.

%z is the signed offset from UTC in hours and minutes, written as hhmm, not hh:mm. Here's one way to remove the last colon:
newstring <- gsub("(.*).(..)$","\\1\\2","2013-04-05T07:49:54-07:00")
(timep <- strptime(newstring, "%Y-%m-%dT%H:%M:%S%z", tz="UTC"))
# [1] "2013-04-05 14:49:54 UTC"
Also note that you don't have to remove the "T".
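The colon removal can be wrapped in a small function. The following is a minimal sketch, not part of the original answer: parse_iso8601 is a hypothetical helper name, and treating a trailing Z as a zero offset is an added assumption.

```r
# Hypothetical helper: parse ISO 8601 timestamps whose offset is
# written as +hh:mm or -hh:mm (or as a trailing "Z").
parse_iso8601 <- function(x, tz = "UTC") {
  # Drop the colon inside a trailing offset, e.g. "-07:00" -> "-0700".
  x <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", x)
  # Assumption: a trailing "Z" means a zero offset.
  x <- sub("Z$", "+0000", x)
  as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S%z", tz = tz)
}

parsed <- parse_iso8601(c("2013-04-05T07:49:54-07:00", "2013-04-05T14:49:54Z"))
shown <- format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
shown
# [1] "2013-04-05 14:49:54" "2013-04-05 14:49:54"
```

Anchoring the pattern on the sign and digits, rather than on any character, keeps the substitution from touching strings that have no offset at all.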

You don't need the string replacement.
NA just means that the format as a whole did not match, so build your expression up in pieces:
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%d")
[1] "2013-04-05"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M")
[1] "2013-04-05 07:49:00"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M:%S")
[1] "2013-04-05 07:49:54"
R>
Also, ?strptime describes %z as standard only for output, although R also accepts it on input in the form hhmm (for example -0700). The offset in your string is written with a colon, which %z did not accept here, so your NA most likely comes from your use of %z.

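strptime("2013-04-05 07:49:54-07:00", "%Y-%m-%d %H:%M:%S", tz="UTC") gives 2013-04-05 07:49:54 UTC. Note that the trailing -07:00 is ignored rather than applied, so the result is the wall-clock time labelled as UTC.
If that is what you want, try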
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S", tz="UTC")
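To see what leaving out %z means in practice, the two approaches can be compared directly. This is a small sketch; the variable names ignored and applied are only illustrative.

```r
# Offset left in the string but not in the format: it is silently ignored.
ignored <- as.POSIXct("2013-04-05T07:49:54-07:00",
                      format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# Offset written as hhmm and matched by %z: it is applied.
applied <- as.POSIXct("2013-04-05T07:49:54-0700",
                      format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

gap <- difftime(applied, ignored, units = "hours")
gap
# Time difference of 7 hours
```

The two results differ by exactly the offset, so which one is right depends on whether you need the instant in time or just the wall-clock reading.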

Related

Converting long integer into date and time in r [duplicate]

This question already has answers here:
Convert integer to class Date
(3 answers)
Closed 1 year ago.
I have date and time information in the following format:
z <- 20201019083000
I want to convert it into a readable date and time format such as follows:
"2020-10-19 20:20"
So far I have tried this but cannot get the correct answer.
#in local
as.POSIXct(z, origin = "1904-01-01")
"642048-10-22 14:43:20 KST"
#in UTC
as.POSIXct(z, origin = "1960-01-01", tz = "GMT")
"642104-10-23 05:43:20 GMT"
#in
as.POSIXct(as.character(z), format = "%H%M%S")
"2021-07-13 20:20:10 KST"
Any better way to do it?
library(lubridate)
ymd_hms("20201019083000")
# [1] "2020-10-19 08:30:00 UTC"
# or, format the output:
format(ymd_hms("20201019083000"), "%Y-%m-%d %H:%M")
# "2020-10-19 08:30"
You can use as.POSIXct or strptime with the format %Y%m%d%H%M%S:
as.POSIXct(as.character(z), format="%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
strptime(z, "%Y%m%d%H%M%S")
#[1] "2020-10-19 08:30:00 CEST"
The format you tried, "%H%M%S", does not include the year (%Y), month (%m) and day (%d).
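When the value is stored as a number rather than a string, it can help to control the conversion to text explicitly. The sketch below is an addition to the answers above; it assumes the values fit exactly in a double and uses sprintf so that the result cannot end up in scientific notation.

```r
z <- c(20201019083000, 20201231235959)

# "%.0f" prints the full integer part, never "2.0201019083e+13".
z_chr <- sprintf("%.0f", z)

stamp <- as.POSIXct(z_chr, format = "%Y%m%d%H%M%S", tz = "UTC")
readable <- format(stamp, "%Y-%m-%d %H:%M")
readable
# [1] "2020-10-19 08:30" "2020-12-31 23:59"
```

Passing tz explicitly keeps the result independent of the machine's local time zone (the KST and CEST labels in the outputs above come from the posters' own locales).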

Convert dates to text in R

I have the following dataset with dates (YYYYMMDD):
> dates
[1] "20180412" "20180424" "20180506" "20180518" "20180530" "20180611" "20180623" "20180705" "20180717" "20180729"
I want to convert them to the form DDMMMYYYY, with the month as text. For example, 20180412 should become 12Apr2018.
Any suggestion on how to proceed?
You can try something like this:
# print today's date
today <- Sys.Date()
format(today, format="%B %d %Y")
# "June 20 2007"
See ?strptime for the symbols that can be used with the format() function to print dates.
You need to first parse the text strings as Date objects, and then format these Date objects to your liking to have the different text output:
R> library(anytime) ## one easy way to parse dates and times
R> dates <- anydate(c("20180412", "20180424", "20180506", "20180518", "20180530",
+ "20180611", "20180623", "20180705", "20180717", "20180729"))
R> dates
[1] "2018-04-12" "2018-04-24" "2018-05-06" "2018-05-18" "2018-05-30"
[6] "2018-06-11" "2018-06-23" "2018-07-05" "2018-07-17" "2018-07-29"
R>
R> txtdates <- format(dates, "%d%b%Y")
R> txtdates
[1] "12Apr2018" "24Apr2018" "06May2018" "18May2018" "30May2018"
[6] "11Jun2018" "23Jun2018" "05Jul2018" "17Jul2018" "29Jul2018"
R>
You could use the as.Date() and format() functions:
dts <- c("20180412", "20180424", "20180506", "20180518", "20180530",
"20180611", "20180623")
format(as.Date(dts, format = "%Y%m%d"), "%d%b%Y")
More information here
Simply use as.POSIXct and format:
dates <- c("20180412", "20180424", "20180506")
format(as.POSIXct(dates, format="%Y%m%d"),format="%d%b%y")
Output (%y gives a two-digit year; use %Y to get 12Apr2018):
[1] "12Apr18" "24Apr18" "06May18"
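One caveat with %b is that the month abbreviation follows the session's locale. If English abbreviations are required regardless of locale, a sketch along these lines (an addition, not from the answers above) builds the text from base R's month.abb constant instead:

```r
dts <- c("20180412", "20180506")
d <- as.Date(dts, format = "%Y%m%d")

# month.abb is always English: "Jan", "Feb", ...
mon <- month.abb[as.integer(format(d, "%m"))]
txt <- paste0(format(d, "%d"), mon, format(d, "%Y"))
txt
# [1] "12Apr2018" "06May2018"
```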

formatting time in R error

I have a Time column in my df with value 1.01.2016 0:00:05. I want it without the seconds and therefore used df$Time <- as.POSIXct(df$Time, format = "%d.%m.%Y :%H:%M", tz = "Asia/Kolkata"). But I get NA value. What is the problem here?
I suspect there are two things working here: the storage of a time object (POSIXt), and the representation of that object.
The string you present is (I believe) not a proper POSIXt (whether POSIXct or POSIXlt) object for R, which means it is just a character string. In that case, you can remove the seconds with:
gsub(':[^:]*$', '', '1.01.2016 0:00:05')
# [1] "1.01.2016 0:00"
However, that is still just a string, not a date or time object. If you parse it into a time-object that R knows about:
as.POSIXct("1.01.2016 0:00:05", format = "%d.%m.%Y %H:%M:%S", tz = "Asia/Kolkata")
# [1] "2016-01-01 00:00:05 IST"
then you now have a time object that R knows something about ... and it defaults to representing it (printing it on the console) with seconds-precision. Typically, all that is available to change for the console-printing is the precision of the seconds, as in
options("digits.secs")
# $digits.secs
# NULL
Sys.time()
# [1] "2018-06-26 18:21:06 PDT"
options("digits.secs"=3)
Sys.time()
# [1] "2018-06-26 18:21:10.090 PDT"
then you can get more. But alas, I do not think there is an R option to say "always print my POSIXt objects in this way". So your only choice is (at the point where you no longer need it to be a time-like object) to change it back into a string with no time-like value:
x <- as.POSIXct("1.01.2016 0:00:05", format = "%d.%m.%Y %H:%M:%S", tz = "Asia/Kolkata")
x
# [1] "2016-01-01 00:00:05 IST"
?strptime
# see that day-of-month can either be "%d" for 01-31 or "%e" for 1-31
format(x, format="%e.%m.%Y %H:%M")
# [1] " 1.01.2016 00:00"
(This works equally well for a vector.)
I lean towards converting to POSIXt and back to a string rather than my gsub example, because as.POSIXct will tell you (by returning NA) when the string does not match the date-time format you are expecting, whereas gsub will happily do something wrong or nothing.
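The difference between the two approaches shows up as soon as the input contains something unexpected. A minimal sketch, with a made-up second element standing in for bad data:

```r
times <- c("1.01.2016 0:00:05", "not a time")

parsed <- as.POSIXct(times, format = "%d.%m.%Y %H:%M:%S", tz = "Asia/Kolkata")

# Elements that did not match the format come back as NA.
bad <- is.na(parsed)
times[bad]
# [1] "not a time"

# Only the valid elements are turned back into strings without seconds.
trimmed <- format(parsed[!bad], "%d.%m.%Y %H:%M")
trimmed
# [1] "01.01.2016 00:00"

# gsub, by contrast, returns the bad input unchanged and without complaint.
gsub(":[^:]*$", "", times[bad])
# [1] "not a time"
```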
Try as.POSIXlt:
> test <- "1.01.2016 0:00:05"
> as.POSIXlt(test, format="%d.%m.%Y %H:%M:%S", tz="Asia/Kolkata")
[1] "2016-01-01 00:00:05 IST"

converting date into timestamp in R

I've searched for threads about timestamp conversion in R, but could not figure this out.
I need to convert a time column into timestamps so that R reads it as dates. When a cell has only a date without a time, there is no problem, but with the current format (with or without the trailing + in the cell) R treats the column as integer or factor.
How do I convert it into a timestamp?
Thank you.
You do not need to remove the +:
R> crappyinput <- c("2014-11-29 15:23:02+", "2014-11-29 15:38:36+",
+ "2014-11-29 15:52:49+")
R> pt <- strptime(crappyinput, "%Y-%m-%d %H:%M:%S")
R> pt
[1] "2014-11-29 15:23:02 CST" "2014-11-29 15:38:36 CST" "2014-11-29 15:52:49 CST"
R>
It will simply be ignored as trailing garbage.
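Would this work for you?
t <- c("2014-11-29 15:23:02+")
# drop the trailing "+"
t <- substr(t, 1, nchar(t)-1)
t
# [1] "2014-11-29 15:23:02"
# include the time fields in the format so they are not dropped
t <- strptime(t, format="%Y-%m-%d %H:%M:%S")
str(t)
# POSIXlt[1:1], format: "2014-11-29 15:23:02"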
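Since the question mentions that R treats the column as a factor, here is a sketch of the same conversion applied to a data frame column. The data frame df and its column name time are assumptions made for illustration; the factor is converted to character first so that the labels, not the underlying integer codes, are parsed.

```r
# Hypothetical data frame with the timestamps read in as a factor.
df <- data.frame(time = factor(c("2014-11-29 15:23:02+",
                                 "2014-11-29 15:38:36+")))

# as.character() recovers the text; the trailing "+" is ignored by the parser.
df$time <- as.POSIXct(as.character(df$time),
                      format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

class(df$time)
# [1] "POSIXct" "POSIXt"
```

POSIXct is used here instead of the POSIXlt returned by strptime because it stores one number per element, which tends to be more convenient inside a data frame.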

Converting chr "00:00:00" to date-time "00:00:00"

My question comes from this question. The question had the following character string.
x <- "2007-02-01 00:00:00"
y <- "02/01/2007 00:06:10"
If you try to convert this string to date-class object, something funny happens.
This is a sample from @nrusell's answer.
as.POSIXct(x,tz=Sys.timezone())
[1] "2007-02-01 EST"
as.POSIXct(y,format="%m/%d/%Y %H:%M:%S",tz=Sys.timezone())
[1] "2007-02-01 00:06:10 EST"
As you see, 00:00:00 disappears from the first example. @Richard Scriven left the following example in our discussion using lubridate.
dt <- as.POSIXct("2007-02-01 00:00:00")
hour(dt) <- hour(dt)+1
dt
[1] "2007-02-01 01:00:00 EST"
hour(dt) <- hour(dt)-1
dt
[1] "2007-02-01 EST"
Once again, 00:00:00 disappears. Why does R avoid keeping 00:00:00 in date-class object after conversion? How can we keep 00:00:00?
It is only the printing that drops the time when the time part of a date-time is midnight. This is explained literally in the ?strftime help, specifically under the format parameter:
A character string. The default is "%Y-%m-%d %H:%M:%S" if any
component has a time component which is not midnight, and "%Y-%m-%d"
otherwise
One idea is to redefine the S3 method print for POSIXct object:
print.POSIXct <- function(x,...)print(format(x,"%Y-%m-%d %H:%M:%S"))
Now, for your example, if you print your x date (with the midnight part) you get:
x <- "2007-02-01 00:00:00"
x <- as.POSIXct(x,tz=Sys.timezone())
x
[1] "2007-02-01 00:00:00"
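If overriding print.POSIXct for the whole session feels too invasive, an alternative sketch (swapped in here, not the answer's method) is to call format with an explicit format string only where the full timestamp is needed. It also shows that the stored value keeps its time of day even when the default printing hides it.

```r
x <- as.POSIXct("2007-02-01 00:00:00", tz = "UTC")

# The default format drops the time because it is midnight.
short <- format(x)
short
# [1] "2007-02-01"

# An explicit format always shows it.
full <- format(x, "%Y-%m-%d %H:%M:%S")
full
# [1] "2007-02-01 00:00:00"

# The underlying instant is intact: adding an hour behaves as expected.
later <- format(x + 3600, "%Y-%m-%d %H:%M:%S")
later
# [1] "2007-02-01 01:00:00"
```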

Resources