String format parsing in R [duplicate] - r

This question already has answers here:
How to convert dd/mm/yy to yyyy-mm-dd in R
(6 answers)
Closed 3 years ago.
When I parse a character string into a date, why does this not throw an error or return NA? I have tried the following:
t <- "31-Oct-2012"
as.Date(t, format = "%d-%B-%Y") # this produces the expected result
as.Date(t, format = "%d-%B-%y") # I was expecting an NA
Instead I get
[1] "2020-10-31"

Because %y matches a two-digit year, it consumes only the first two digits ("20") and ignores the rest of the string. It treats t as
as.Date("31-Oct-20", format = "%d-%B-%y")
#[1] "2020-10-31"
The same happens when anything at all follows the two-digit year. See
as.Date("31-Oct-20ABC", format = "%d-%B-%y")
#[1] "2020-10-31"
With %Y, R does not pad a short year: it reads "20" literally as the year 20 and returns a valid but probably unintended date for
as.Date("31-Oct-20", format = "%d-%B-%Y")
#[1] "0020-10-31"
but returns NA for
as.Date("31-Oct-ABC20", format = "%d-%B-%y")
#[1] NA
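If you do want an NA whenever the string contains more than the format describes, one option is a round-trip check: parse the string, format the result back with the same format, and keep the date only if you get the original string again. This is a minimal sketch, not a built-in feature. The helper name strict_as_date is hypothetical, and it assumes an English locale (for "Oct") and input that is already zero-padded and capitalised the way format() would print it.

```r
# Hypothetical helper: returns NA unless the parsed date prints back to exactly x.
strict_as_date <- function(x, fmt) {
  parsed <- as.Date(x, format = fmt)
  round_trip <- format(parsed, fmt)
  # Reject values that failed to parse or that left characters unconsumed.
  parsed[is.na(round_trip) | round_trip != x] <- NA
  parsed
}

strict_as_date("31-Oct-2012", "%d-%b-%Y")
#[1] "2012-10-31"
strict_as_date("31-Oct-2012", "%d-%b-%y")
#[1] NA
```

The check is deliberately strict: a value such as "1-Oct-2012" would also be rejected, because it formats back as "01-Oct-2012".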

Related

Convert a date time character string to date - YYYYMM format [duplicate]

This question already has answers here:
converting datetime string to POSIXct date/time format in R
(1 answer)
How to convert string "MMM DD YYYY" to date YYYY-MM-DD
(1 answer)
Closed 7 months ago.
I have a date-time character string that I need to convert to YYYYMM format. I cannot find a solution: every function I try returns NA or an unexpected date format.
Date_format_current <- '02/09/2020 23:35'
You can use the following code:
library(lubridate)
library(zoo)
current <- '02/09/2020 23:35'
as.yearmon(dmy_hm(current))
#> [1] "Sep 2020"
#Or
format(dmy_hm(current), "%Y-%m")
#> [1] "2020-09"
#Or
format_ISO8601(dmy_hm(current), precision = "ym")
#> [1] "2020-09"
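If you would rather avoid extra packages, the same result can probably be reached in base R. The sketch below assumes the string is day/month/year (as dmy_hm above does) and fixes the time zone to UTC, which is my assumption and does not affect the year and month here. It produces the compact YYYYMM form asked for in the question.

```r
current <- '02/09/2020 23:35'

# Parse as day/month/year hour:minute, then print only year and month.
parsed <- as.POSIXct(current, format = "%d/%m/%Y %H:%M", tz = "UTC")
yyyymm <- format(parsed, "%Y%m")
yyyymm
#> [1] "202009"
```

Note that the result is a character string, not a Date, since a Date always needs a day.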

R - Numeric(int) to Date conversion [duplicate]

This question already has answers here:
integer data frame to date in R [duplicate]
(3 answers)
Closed 4 years ago.
I have a data.frame with a column of
values(20180213190133, 20180213190136, 20180213190173 , 20180213190193 , 20180213190213, 20180213190233, 20180213190333, 20180213190533, 20180213190733, 20180213190833, 20180213190833, 20180213190833, 201802131901833, 20180213191133, 20180213192133, 20180213194133, 20180213199133, 20180213199133, 20180213199133, 20180213199133, 20180213190136.... 1200 entries)
I want to convert this column, which is of type int, to Date.
I tried using as.Date() and as.POSIXct(). Neither works; I get NA values.
Please let me know how I can convert this field from int to Date.
Thanks
Try this:
Input data
values<-c(20180213190133, 20180213190136, 20180213190173)
values_date<-as.Date(substr(as.character(values),start = 1,stop=8), format = "%Y%m%d")
> values_date
[1] "2018-02-13" "2018-02-13" "2018-02-13"
> class(values_date)
[1] "Date"
If you also want to keep the hour/minute/second, you can try this:
values_date<-as.POSIXlt(as.character(values), format = "%Y%m%d%H%M%S")
After this, the class will be "POSIXlt" "POSIXt" rather than Date. Note that your input data contains some invalid values:
in the third number, the last two digits are "73", which is not a valid value for seconds, so that entry is NA in the output.
values_date
[1] "2018-02-13 19:01:33 CET" "2018-02-13 19:01:36 CET" NA
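With 1200 entries it may help to list the values that fail to parse instead of spotting them by eye. The following is a rough sketch under a few assumptions of mine: the column is numeric, sprintf("%.0f", ...) is used to avoid scientific notation, and the time zone is set to UTC (the output above used the local zone, CET).

```r
values <- c(20180213190133, 20180213190136, 20180213190173)

# Convert to plain digit strings, then parse as YYYYMMDDHHMMSS.
stamps <- sprintf("%.0f", values)
parsed <- as.POSIXct(stamps, format = "%Y%m%d%H%M%S", tz = "UTC")

# Entries that could not be parsed, e.g. seconds outside the valid range.
bad <- values[is.na(parsed)]
sprintf("%.0f", bad)
#[1] "20180213190173"
```

The bad vector can then be inspected or corrected before the conversion is applied to the whole column.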

How to convert to a character date into standard date format in R [duplicate]

This question already has an answer here:
Convert R character to date [duplicate]
(1 answer)
Closed 4 years ago.
I have a csv file (df) with dates such as "Mar-97, Apr-97, ...". After importing it into R with read.csv and stringsAsFactors = FALSE, class(dates) is character.
I have tried df$dates <- as.Date(df$Dates, format = "%d-%b-%y") and as.Date(df$Dates, format = "%b-%y"). The class is converted to Date, but it shows NA values.
You can try the lubridate library:
library(lubridate)
> parse_date_time("Mar-97", "m y")
[1] "1997-03-01 UTC"
It is also vectorized:
df=c("Mar 17","Apr 17")
> parse_date_time(df, "m y")
[1] "2017-03-01 UTC" "2017-04-01 UTC"
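The NA values in the question most likely arise because a Date needs a day and "Mar-97" does not contain one. A base R workaround, sketched here on the assumption that the first of the month is an acceptable placeholder and that the session uses an English locale for month abbreviations, is to paste a day onto each string before parsing.

```r
dates <- c("Mar-97", "Apr-97")

# Prepend a placeholder day so the string matches "%d-%b-%y".
parsed <- as.Date(paste0("01-", dates), format = "%d-%b-%y")
parsed
#[1] "1997-03-01" "1997-04-01"
```

Keep in mind that %y maps two-digit years 69 to 99 onto 1969 to 1999 and 00 to 68 onto 2000 to 2068.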

How to extract the hour of this kind of timestamp? [duplicate]

This question already has an answer here:
Extracting 2 digit hour from POSIXct in R
(1 answer)
Closed 7 years ago.
This is what my timestamp looks like:
date1 = timestamp[1]
date1
[1] 07.01.2016 11:52:35
3455 Levels: 01.02.2016 00:11:52 01.02.2016 00:23:35 01.000:30:21 31.01.2016 23:16:18
When I try to extract the hour, I get an "invalid 'trim' argument" error. Thank you!
format(date1, "%I")
Error in format.default(structure(as.character(x), names = names(x), dim = dim(x), :
invalid 'trim' argument
How can I extract single components, like the hour, from this timestamp?
With base R:
format(strptime(x,format = '%d.%m.%Y %H:%M:%S'), "%H")
[1] "11"
data
x <- as.factor("07.01.2016 11:52:35")
You first need to parse the time
d = strptime(date1,format = '%d.%m.%Y %H:%M:%S')
Then use lubridate to extract parts like hour etc.
library(lubridate)
hour(d)
minute(d)
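The error in the question comes from calling format() on a factor, which is not a date-time object, so the format string ends up being interpreted as a different argument. A base R sketch of the whole path, from factor to a numeric hour, could look like this. The explicit as.character() and the UTC time zone are my additions for clarity, not something the original answers require.

```r
x <- factor("07.01.2016 11:52:35")

# Factor -> character -> date-time, then pull out the hour.
parsed <- as.POSIXct(as.character(x), format = "%d.%m.%Y %H:%M:%S", tz = "UTC")
hour_chr <- format(parsed, "%H")   # two-digit hour as text
hour_int <- as.integer(hour_chr)   # hour as a number
hour_chr
#[1] "11"
```

Use "%H" for the 24-hour clock; the "%I" from the question gives the 12-hour clock instead.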

Parsing ISO8601 date and time format in R [duplicate]

This question already has answers here:
Using strptime %z with special timezone format
(2 answers)
Closed 9 years ago.
This should be quick - we are parsing the following format in R:
2013-04-05T07:49:54-07:00
My current approach is
require(stringr)
timenoT <- str_replace_all("2013-04-05T07:49:54-07:00", "T", " ")
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S%z", tz="UTC")
but it gives NA.
%z is the signed offset in hours, in the format hhmm, not hh:mm. Here's one way to remove the last :.
newstring <- gsub("(.*).(..)$","\\1\\2","2013-04-05T07:49:54-07:00")
(timep <- strptime(newstring, "%Y-%m-%dT%H:%M:%S%z", tz="UTC"))
# [1] "2013-04-05 14:49:54 UTC"
Also note that you don't have to remove the "T".
You don't need the string replacement.
NA just means that the format as a whole did not match, so build your expression up piece by piece:
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%d")
[1] "2013-04-05"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M")
[1] "2013-04-05 07:49:00"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M:%S")
[1] "2013-04-05 07:49:54"
R>
Also, %z is standard only for output. R does accept it on input, but only in the hhmm form (for example -0700), so your NA most likely comes from using %z with the colon form -07:00.
strptime("2013-04-05 07:49:54-07:00", "%Y-%m-%d %H:%M:%S", tz="UTC") gives 2013-04-05 07:49:54 UTC
Try
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S", tz="UTC")
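Dropping %z, as in the last suggestion, also drops the offset, so 07:49:54 is then treated as if it were already UTC. If the offset matters, a variant of the colon-removal approach above can keep it. This is only a sketch: the regular expression assumes the string always ends in a signed hh:mm offset.

```r
iso <- "2013-04-05T07:49:54-07:00"

# Turn the trailing "-07:00" into "-0700" so that %z can read it.
no_colon <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", iso)
parsed <- as.POSIXct(no_colon, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")
format(parsed, "%Y-%m-%d %H:%M:%S")
# [1] "2013-04-05 14:49:54"
```

Strings ending in "Z" instead of a numeric offset would not match this pattern and need separate handling.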
