I am trying to parse Date-time in AM/PM format in R. I found that '%p' can handle this. However, when I try this:
mydate <- as.POSIXct("01.01.1970 01:00:00 PM", format="%d.%m.%Y %H:%M:%S %p", tz = "UTZ")
mydate
[1] "1970-01-01 01:00:00 UTZ"
> as.numeric(mydate)
[1] 3600
This is clearly 1 AM. I would have expected the output:
[1] "1970-01-01 13:00:00 UTZ"
[1] 46800
What am I missing?
R parses this as 1 AM rather than 1 PM, which is why you get 3600 as output.
as.POSIXct("01.01.1970 01:00:00 PM", format="%d.%m.%Y %H:%M:%S %p", tz = "UTC")
#[1] "1970-01-01 01:00:00 UTC"
The documentation at ?strptime says
%p
AM/PM indicator in the locale. Used in conjunction with %I and not with %H. An empty string in some locales (and the behaviour is undefined if used for input in such a locale).
You need to use %I instead of %H
mydate <- as.POSIXct("01.01.1970 01:00:00 PM", format="%d.%m.%Y %I:%M:%S %p",
tz = "UTC")
as.numeric(mydate)
#[1] 46800
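To make the difference concrete, here is a small side-by-side sketch in base R. It assumes a locale where the AM/PM indicator is "AM"/"PM" (for example C or English); the variable names are just for illustration.

```r
x <- "01.01.1970 01:00:00 PM"

# %H reads the hour on the 24-hour clock, so the trailing "PM" has no effect
with_H <- as.POSIXct(x, format = "%d.%m.%Y %H:%M:%S %p", tz = "UTC")
# %I reads the hour on the 12-hour clock and lets %p shift it
with_I <- as.POSIXct(x, format = "%d.%m.%Y %I:%M:%S %p", tz = "UTC")

as.numeric(with_H)  # 3600  (01:00)
as.numeric(with_I)  # 46800 (13:00)

# The 12 o'clock edge cases also come out right with %I
noon     <- as.POSIXct("01.01.1970 12:00:00 PM", format = "%d.%m.%Y %I:%M:%S %p", tz = "UTC")
midnight <- as.POSIXct("01.01.1970 12:00:00 AM", format = "%d.%m.%Y %I:%M:%S %p", tz = "UTC")
as.numeric(noon)      # 43200
as.numeric(midnight)  # 0
```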
An alternative with lubridate
library(lubridate)
seconds(dmy_hms("01.01.1970 01:00:00 PM"))
#[1] "46800S"
Related
I have a dataset with a column where date and time is stored.
The data I have is:
03/17/2020 09:30:00 PM
I want to convert AM/PM to a 24hour format.
My attempt was using this:
as.POSIXct(df$Date, format="%d/%m/%Y %I:%M:%S %p", tz="UTC")
When I run this on the whole dataset, the majority of the dates turn into NA.
Why is this happening? I am really confused.
Using lubridate:
x <- "03/17/2020 09:30:00 PM"
lubridate::mdy_hms(x)
[1] "2020-03-17 21:30:00 UTC"
Using as.POSIXct: note that you need the month/day convention, not day/month:
as.POSIXct(x, format="%m/%d/%Y %I:%M:%S %p", tz = "UTC")
[1] "2020-03-17 21:30:00 UTC"
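If it helps to see why most rows become NA, here is a rough base R sketch. The second timestamp is made up for illustration, and it again assumes an English-style AM/PM locale. With the day/month format, any row whose second field is above 12 fails to parse, and rows that do parse are silently read with day and month swapped.

```r
# Second value is a hypothetical example where both fields are <= 12
x <- c("03/17/2020 09:30:00 PM", "03/05/2020 09:30:00 PM")

wrong <- as.POSIXct(x, format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC")
right <- as.POSIXct(x, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

is.na(wrong)                  # TRUE FALSE: there is no month 17
format(wrong[2], "%Y-%m-%d")  # "2020-05-03": parsed, but as 3 May instead of 5 March
format(right, "%Y-%m-%d %H:%M:%S")
# "2020-03-17 21:30:00" "2020-03-05 21:30:00"
```

So a count of NAs alone understates the problem; the non-NA rows are worth checking too.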
I have one dataset that has a column with 200k rows and every row presents different timestamps.
Example:
02/20/2019 01:30:00 PM
15/02/2019 13:30:00
I have tried to use in R Studio:
dataset <-as.POSIXlt(dataset,format= "%m/%d/%Y %H:%M")
dataset <-as.POSIXlt(dataset,format= "%H:%M:%S")
dataset <-as.POSIXlt(dataset,format= "%m/%d/%Y %I:%M:%S")
But the value changes to "01:30:00", which is treated as AM, or sometimes the result is NA.
Do you know if there is another way?
See the parse_date_time() function from the lubridate package:
dat <- c("02/20/2019 01:30:00 PM", "15/02/2019 13:30:00")
lubridate::parse_date_time(dat, orders = c("%d/%m/%Y %H:%M:%S", "%m/%d/%Y %I:%M:%S %p"))
[1] "2019-02-20 13:30:00 UTC" "2019-02-15 13:30:00 UTC"
Note that if you have other combinations, such as day/month/year with the 12-hour clock, you'll have to add that specification to the orders argument:
dat2 <- c("20/02/2019 01:30:00 PM", "15/02/2019 13:30:00")
lubridate::parse_date_time(dat2, orders = c("%d/%m/%Y %H:%M:%S", "%m/%d/%Y %I:%M:%S %p")) # Wrong
[1] "2019-02-20 01:30:00 UTC" "2019-02-15 13:30:00 UTC"
lubridate::parse_date_time(dat2, orders = c("%d/%m/%Y %H:%M:%S", "%m/%d/%Y %I:%M:%S %p", "%d/%m/%Y %I:%M:%S %p")) # Right
[1] "2019-02-20 13:30:00 UTC" "2019-02-15 13:30:00 UTC"
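If you would rather stay in base R, the same idea can be sketched as "try each format in turn and only fill what is still NA". parse_mixed below is a hypothetical helper written for this answer, not a function from any package, and it assumes an English-style AM/PM locale. Formats containing %p go first, because strptime ignores trailing text and a 24-hour format would otherwise quietly drop the "PM".

```r
# Hypothetical helper: try formats in order, filling only the values still NA
parse_mixed <- function(x, formats, tz = "UTC") {
  out <- .POSIXct(rep(NA_real_, length(x)), tz = tz)
  for (fmt in formats) {
    todo <- is.na(out)
    if (!any(todo)) break
    out[todo] <- as.POSIXct(x[todo], format = fmt, tz = tz)
  }
  out
}

dat <- c("02/20/2019 01:30:00 PM", "15/02/2019 13:30:00")
res <- parse_mixed(dat, c("%m/%d/%Y %I:%M:%S %p", "%d/%m/%Y %H:%M:%S"))
format(res, "%Y-%m-%d %H:%M:%S")
# "2019-02-20 13:30:00" "2019-02-15 13:30:00"
```

As with parse_date_time, a date such as 02/03/2019 is ambiguous between the two conventions, and whichever format comes first wins.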
I am trying to figure out how to format the output of Sys.time() in R.
For example:
t <- Sys.time()
print(t)
# [1] "2017-07-26 09:41:29 CEST"
which is correct.
I want to make a string out of t made of the date, the hour and minute and the timezone.
I can use
format(t, format = "%F %R %Z")
# [1] "2017-07-26 09:41 CEST"
Which is what I expect.
However, I am having a hard time understanding the output if I set the timezone explicitly. For example:
format(t, format = "%F %R %Z", tz = "Europe/Stockholm")
# [1] "2017-07-26 09:41 CEST"
produces what I expect, but:
format(t, format = "%F %R %Z", tz = "CEST")
# [1] "2017-07-26 07:41 CEST"
which I think is wrong, I would have expected the output to be "2017-07-26 09:41 CEST" or "2017-07-26 09:41 Europe/Stockholm"
Also
format(t, format = "%F %R %Z", tz = "UTC+02:00")
# [1] "2017-07-26 05:41 UTC"
which I find even weirder, since I would have expected the output to be "2017-07-26 10:41 UTC+02:00"
In the answer I would like to know two things:
Why do my examples give the output they give?
Is there any way to have the timezone always written like "2017-07-26 10:41 UTC+02:00" or "2017-07-26 10:41 Europe/Stockholm"?
Even though R displays time zone in the console as "CEST" (which is %Z), there is no valid timezone by that name. You can check OlsonNames() for valid timezone names.
any(grepl("CEST", OlsonNames()))
#[1] FALSE
For cases when the timezone is displayed as CEST, it is still stored as "Europe/Stockholm" internally. We can check using dput
as.POSIXct("2017-07-26 10:46:12", tz = "Europe/Stockholm")
#[1] "2017-07-26 10:46:12 CEST"
dput(as.POSIXct("2017-07-26 10:46:12", tz = "Europe/Stockholm"))
#structure(1501058772, class = c("POSIXct", "POSIXt"), tzone = "Europe/Stockholm")
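Note that %Z is only for output and is not reliable for input. CEST is not a valid value for tz, and invalid tz values are commonly treated as UTC (read more at ?format.POSIXct or ?strptime). That is why you get unexpected output with format(t, format = "%F %R %Z", tz = "CEST"): the clock time shown is the UTC one. "UTC+02:00" is read as a POSIX-style specification, in which the sign is reversed relative to the usual notation, so it means two hours behind UTC; hence 05:41.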
Just use "Europe/Stockholm" explicitly.
any(grepl("Europe/Stockholm", OlsonNames()))
#[1] TRUE
As for formatting time in the specific format, try
format(as.POSIXct("2017-07-26 10:46:12", tz = "UTC"), "%F %R UTC%z")
#[1] "2017-07-26 10:46 UTC+0000"
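For the second part of the question, one possible sketch is to build the label yourself. It assumes the object carries an Olson name in its tzone attribute and that "Europe/Stockholm" is available in your time zone database; the variable names are only illustrative. %z gives the offset as "+0200", so the colon is inserted with sub().

```r
t <- as.POSIXct("2017-07-26 09:41:29", tz = "Europe/Stockholm")

stamp <- format(t, "%Y-%m-%d %H:%M")

# "+0200" -> "+02:00"
offset <- sub("^([+-]\\d{2})(\\d{2})$", "\\1:\\2", format(t, "%z"))
with_offset <- paste0(stamp, " UTC", offset)

# The Olson name is stored in the tzone attribute
with_name <- paste(stamp, attr(t, "tzone"))

with_offset  # "2017-07-26 09:41 UTC+02:00"
with_name    # "2017-07-26 09:41 Europe/Stockholm"
```

For a value from Sys.time() the tzone attribute is usually empty, so you would need to supply the name yourself, for example from Sys.timezone().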
I have a timestamp vector like
time_stamp <- c("7/1/2013", "7/1/2013 12:00:30 AM", "7/1/2013 12:01:00 AM", "7/1/2013 12:01:30 AM", "8/1/2013","8/1/2013 11:02:30 PM")
I want to format this to date class. I tried
strptime(time_stamp, format = "%d/%m/%Y %H:%M:%S", tz = "GMT")
but two of the timestamps have no time component, so they come out as NA; I would like them to default to 12:00:00 AM instead.
I can run a loop such as:
for (i in 1:length(time_stamp))
{
if(nchar(time_stamp[i])<11)
{
time_stamp[i] <- paste(time_stamp[i], " 12:00:00 AM")
}
}
time_stamp <- format(strptime(time_stamp, format = "%d/%m/%Y %I:%M:%S %p", tz = "GMT"), "%d/%m/%Y %H:%M:%S", tz = "GMT")
Is there a faster and cleaner way to accomplish this? The vector is a part of large dataset so I don't want to loop over it.
lubridate::parse_date_time can take multiple token orders, with or without the %:
lubridate::parse_date_time(time_stamp, orders = c("dmy IMS p", "dmy"))
## [1] "2013-01-07 00:00:00 UTC" "2013-01-07 00:00:30 UTC" "2013-01-07 00:01:00 UTC"
## [4] "2013-01-07 00:01:30 UTC" "2013-01-08 00:00:00 UTC" "2013-01-08 23:02:30 UTC"
Or use its truncated parameter:
lubridate::parse_date_time(time_stamp, orders = 'dmy IMS p', truncated = 4)
which returns the same thing.
Or use a bit of regex replacement and then process as normal:
as.POSIXct(sub("(\\d{4}$)", "\\1 12:00:00 AM", time_stamp),
format = "%d/%m/%Y %I:%M:%S %p", tz = "GMT")
#[1] "2013-01-07 00:00:00 GMT" "2013-01-07 00:00:30 GMT" "2013-01-07 00:01:00 GMT"
#[4] "2013-01-07 00:01:30 GMT" "2013-01-08 00:00:00 GMT" "2013-01-08 23:02:30 GMT"
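The loop from the question can also be written without a loop in base R. This is only a sketch: it assumes every full timestamp ends in "AM" or "PM", that anything else is a bare date, and that the locale uses English AM/PM indicators.

```r
time_stamp <- c("7/1/2013", "7/1/2013 12:00:30 AM", "7/1/2013 12:01:00 AM",
                "7/1/2013 12:01:30 AM", "8/1/2013", "8/1/2013 11:02:30 PM")

# Pad bare dates with midnight, leave complete timestamps untouched
has_time <- grepl("[AP]M$", time_stamp)
filled <- ifelse(has_time, time_stamp, paste(time_stamp, "12:00:00 AM"))

# %I with %p so that AM/PM is honoured
parsed <- as.POSIXct(filled, format = "%d/%m/%Y %I:%M:%S %p", tz = "GMT")
format(parsed, "%d/%m/%Y %H:%M:%S")
# "07/01/2013 00:00:00" "07/01/2013 00:00:30" "07/01/2013 00:01:00"
# "07/01/2013 00:01:30" "08/01/2013 00:00:00" "08/01/2013 23:02:30"
```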
I am trying to convert this time stamp to POSIXct
t1 <- c("19-Jun-13 06.00.00.00 PM")
If I do this:
t1 <- as.POSIXct(t1, format="%d-%b-%y %H:%M:%S")
would this convert the time stamp correctly? Does it take the AM/PM at the end into account?
Read ?strptime: %p only works with %I, not %H. Your time format is also incorrect: the time fields are separated by ".", not ":".
as.POSIXct("19-Jun-13 06.00.00.00 PM", format="%d-%b-%y %I.%M.%OS %p")
I don't understand why strptime doesn't recognize the %p format properly. However, the function dmy_hms from the lubridate package works well.
lubridate::dmy_hms("19-Jun-13 06.00.00.00 PM") creates the following result:
[1] "2013-06-19 18:00:00 UTC"
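which you can reformat as a string if you want, say as Y-m-d H:M, using format() (as.POSIXct() ignores the format argument when its input is already a date-time):
format(lubridate::dmy_hms("19-Jun-13 06.00.00.00 PM"), format="%Y-%m-%d %H:%M")
[1] "2013-06-19 18:00"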