as.POSIXct, timezone part is ignored - r

Here is my example.
test <- as.POSIXct(as.Date("2019-11-01"), tz = "UTC")
test
It prints:
[1] "2019-10-31 19:00:00 CDT"
It looks like the tz parameter was ignored:
attr(test, "tzone")
returns NULL.
Why does it come out with hour "19" and not 00? How can I make it 00 hours in UTC?
UPDATE
Here is even better case:
test_2 <- as.POSIXct("2019-11-01 00:00:00", tz = "UTC")
str(test_2)
attr(test_2, "tzone")
strftime(test_2, "%H")
It generates:
POSIXct[1:1], format: "2019-11-01"
[1] "UTC"
[1] "19"
Now it looks like the tz parameter is not ignored, yet the hour is still 19. Why?

We can use with_tz from lubridate
library(lubridate)
test1 <- with_tz(as.Date("2019-11-01"), tzone = 'UTC')
attr(test1, 'tzone')
#[1] "UTC"
Also, as.POSIXct can be directly applied
test2 <- as.POSIXct("2019-11-01", tz = "UTC")
test2
#[1] "2019-11-01 UTC"
attr(test2, 'tzone')
#[1] "UTC"
With strftime, use the option for tz
strftime(test2, "%H", tz = 'UTC')
#[1] "00"
If we look at the source of strftime, it does a second conversion with as.POSIXlt and then calls format to produce the formatted output
strftime
function (x, format = "", tz = "", usetz = FALSE, ...)
format(as.POSIXlt(x, tz = tz), format = format, usetz = usetz,
...)
According to ?as.POSIXlt
tz - time zone specification to be used for the conversion, if one is required. System-specific (see time zones), but "" is the current time zone, and "GMT" is UTC (Universal Time, Coordinated). Invalid values are most commonly treated as UTC, on some platforms with a warning.
as.POSIXlt(test2)$hour
#[1] 0
as.POSIXlt(test2, tz = "")$hour
#[1] 20
An empty string "" means the current time zone, i.e. the one reported by Sys.timezone()
as.POSIXlt(test2, tz = Sys.timezone())$hour
#[1] 20
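A minimal sketch of the difference, using base R only: format() on a POSIXct picks up the object's tzone attribute when tz is not supplied, whereas strftime() hands its default tz = "" to as.POSIXlt(), so the hour comes out in the session's time zone unless tz is given explicitly. The variable name x is only illustrative.

```r
x <- as.POSIXct("2019-11-01 00:00:00", tz = "UTC")

# format() falls back to the "tzone" attribute of x when tz is missing
format(x, "%H")

# strftime() needs the zone spelled out; here we reuse the attribute
strftime(x, "%H", tz = attr(x, "tzone"))
```

Both calls should report the hour in UTC regardless of Sys.timezone().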

Related

R round a date with timezone

timestamp = 1491800340000
I'm having trouble with some date manipulation in R. The timestamp above is:
2017-04-10T04:59:00.000 GMT
2017-04-09T23:59:00.000 America/Bogota (Local time)
I want to round it to 2017-04-09T00:00:00.000 GMT because my daily aggregations are set to 00:00 GMT.
How can I do that?
Here's what I tried:
> Sys.timezone()
[1] "America/Bogota"
> timestamp = 1491800340000
> date = strptime(timestamp / 1000, "%s");
[1] "2017-04-09 23:59:00 COT"
> midnightLocal = trunc(date, "day");
[1] "2017-04-09 COT"
> midnightUTC = strptime(format(midnightLocal, "%Y-%m-%d"), "%Y-%m-%d", tz = "UTC");
[1] "2017-04-09 UTC"
> truncatedtimestamp = as.integer(format(midnightUTC, "%s"));
[1] 1491714000
which is 2017-04-09T05:00:00.000 GMT (not midnight as I expected). Looks like I failed to specify the timezone somewhere?
I tried many things like POSIXct but did not succeed.
Any hint is appreciated!
Cheers
ps: I'd prefer not to install any package
A little trickery:
timestamp = 1491800340000
ts <- as.POSIXct(timestamp / 1000, origin = "1970-01-01 00:00:00 GMT")
ts2 <- as.Date(trunc(ts, "day"))
attr(ts2, "tzone") <- "GMT"
format(ts2, "%Y-%m-%d %H:%M:%S %Z") # to prove it's midnight
# [1] "2017-04-09 00:00:00 UTC"
class(ts2)
# [1] "Date"
From here you have a couple of options: a little brute-force (numeric conversion) or perhaps the more time-friendly/safe way.
Brute-force numeric:
ts3a <- as.numeric(ts2) * 60*60*24
ts3a
# [1] 1491696000
as.POSIXct(ts3a, origin = "1970-01-01 00:00:00 GMT", tz = "GMT")
# [1] "2017-04-09 GMT"
Time-friendly/safe:
ts3b <- as.POSIXct(ts2)
attr(ts3b, "tzone") <- "GMT"
ts3b
# [1] "2017-04-09 GMT"
(Since they are POSIXct, it's showing the date only because it is midnight; you can easily prove it's correct.)
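Another way to reach the same number with base R only, sketched under the assumption that the local zone is America/Bogota as in the question: take the calendar date as seen in the local zone, then parse that date string as midnight UTC. The names local_tz, local_date and midnight_utc are illustrative.

```r
timestamp <- 1491800340000
local_tz  <- "America/Bogota"   # assumed local zone, taken from the question

# The instant itself, independent of any display zone
ts <- as.POSIXct(timestamp / 1000, origin = "1970-01-01", tz = "UTC")

# Calendar date as seen in the local zone
local_date <- format(ts, "%Y-%m-%d", tz = local_tz)

# Midnight of that date, interpreted in UTC
midnight_utc <- as.POSIXct(local_date, tz = "UTC")
as.numeric(midnight_utc)
# [1] 1491696000
```

This avoids format(x, "%s"), which is platform-specific and is where the question's attempt picked up the local offset.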

How to properly handle timezone when passing POSIXct objects between R and Postgres DBMS?

I am struggling to understand what exactly happens behind the scenes when passing POSIXct objects between R and Postgres using RPostgreSQL. In the following example, I define two timestamp fields: one with a timezone the other one without. However, it appears that they are treated exactly the same when passing POSIXct objects via dbWriteTable and dbReadTable.
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, host = "127.0.0.1", port = "5432", user= "postgres",
dbname = "test_db")
q <- "
CREATE TABLE test_table
(
dttm timestamp without time zone,
dttmtz timestamp with time zone
)"
dbSendQuery(con, q)
# using timezone CET
dttm <- as.POSIXct("2016-01-01 10:20:10", tz="CET")
df <- data.frame(dttm = dttm, dttmtz = dttm)
dbWriteTable(con, "test_table", df, overwrite=FALSE, append=T, row.names=0)
# using timezone UTC
dttm <- as.POSIXct("2016-01-01 14:20:10", tz="UTC")
df <- data.frame(dttm = dttm, dttmtz = dttm)
dbWriteTable(con, "test_table", df, overwrite=FALSE, append=T, row.names=0)
df2 <- dbReadTable(con, "test_table")
Both fields come out exactly the same. It appears as if the timezones are completely discarded.
df2$dttm
[1] "2016-01-01 10:20:10 CET" "2016-01-01 14:20:10 CET"
df2$dttmtz
"2016-01-01 10:20:10 CET" "2016-01-01 14:20:10 CET"
QUESTIONS:
What exactly goes on behind the scenes?
How can I properly pass the POSIXct's timezone back and forth?
I think you've pointed out a bug in RPostgreSQL: it does not seem to be getting time zone from R for POSIXct objects. Timezone information can be passed correctly to PostgreSQL by formatting timestamps as character with offset from UTC (see example at bottom of this answer; added 2018-09-21). But first, here's an illustration of the apparent bug:
Modifying your code:
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, port = "5432", user= "postgres",
dbname = "test")
# timestamps in three different time zones
dt1 <- as.POSIXct("2016-01-01 10:20:10", tz="US/Eastern")
dt2 <- as.POSIXct("2016-01-01 10:20:10", tz="UTC")
dt3 <- as.POSIXct("2016-01-01 10:20:10", tz="Asia/Tokyo")
df <- data.frame(dt1=dt1, dt2=dt2, dt3=dt3)
q <- "
CREATE TABLE test_table
(
dt1 timestamp with time zone,
dt2 timestamp with time zone,
dt3 timestamp with time zone,
PRIMARY KEY (dt1)
)"
dbSendQuery(con, q)
dbWriteTable(con, "test_table", df, overwrite=FALSE, append=T, row.names=0)
df2 <- dbReadTable(con, "test_table")
Note that all three timestamps are equal, so the time zones are not handled correctly:
df2$dt1
"2016-01-01 10:20:10 EST"
df2$dt2
"2016-01-01 10:20:10 EST"
df2$dt3
"2016-01-01 10:20:10 EST"
The same is true in Postgres (as seen in pgAdmin).
This suggests that Postgres is not getting the time zone from R.
Note that if we manually change the time zone of one value in test_table (e.g., the first record, edited in pgAdmin) and fetch again
df2 <- dbReadTable(con, "test_table")
then the timezone is correctly handled
df2$dt1
"2016-01-01 05:20:10 EST"
df2$dt2
"2016-01-01 10:20:10 EST"
df2$dt3
"2016-01-01 10:20:10 EST"
So this suggests that RPostgreSQL is not correctly passing time zone information to Postgres, but that it is correctly getting time zone information from Postgres.
answer to the original question
To pass a timestamp with timezone from R to Postgres using RPostgreSQL, format it as a character string with the offset from UTC (e.g., "2016-01-01 10:20:10-0500") using format, and then write it to Postgres the same way as above.
E.g.:
#convert POSIXct to character with offset from UTC
df$dt1 <- format(df$dt1, format = "%Y-%m-%d %H:%M:%OS%z")
df$dt2 <- format(df$dt2, format = "%Y-%m-%d %H:%M:%OS%z")
df$dt3 <- format(df$dt3, format = "%Y-%m-%d %H:%M:%OS%z")
##> df
## dt1 dt2 dt3
##1 2016-01-01 10:20:10-0500 2016-01-01 10:20:10+0000 2016-01-01 10:20:10+0900
q <- "
CREATE TABLE test_table2
(
dt1 timestamp with time zone,
dt2 timestamp with time zone,
dt3 timestamp with time zone,
PRIMARY KEY (dt1)
)"
dbSendQuery(con, q)
dbWriteTable(con, "test_table2", df, overwrite=FALSE, append=T, row.names=0)
df3 <- dbReadTable(con, "test_table2")
#Note that times are now correct (in local time zone)
##> df3$dt1
##[1] "2016-01-01 10:20:10 EST"
##> df3$dt2
##[1] "2016-01-01 05:20:10 EST"
##> df3$dt3
##[1] "2015-12-31 20:20:10 EST"
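If a data frame has many timestamp columns, the per-column formatting can be wrapped in a small helper. This is only a sketch: the function name format_with_offset is hypothetical, it touches no database, and the zone America/New_York is used here as the canonical name for the zone that US/Eastern refers to.

```r
# Hypothetical helper: turn every POSIXct column into text with a UTC offset
format_with_offset <- function(df) {
  is_dt <- vapply(df, inherits, logical(1), what = "POSIXct")
  df[is_dt] <- lapply(df[is_dt], format, format = "%Y-%m-%d %H:%M:%S%z")
  df
}

df <- data.frame(
  dt1 = as.POSIXct("2016-01-01 10:20:10", tz = "America/New_York"),
  dt3 = as.POSIXct("2016-01-01 10:20:10", tz = "Asia/Tokyo")
)
out <- format_with_offset(df)
out
```

Each column is formatted in its own tzone, so the offset in the string reflects the zone the value was created in.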
First, let me say that I am not a fan of the way R handles this. Let's take a value of time in UTC:
dttm.utc <- as.POSIXct("2016-01-01 10:20:10", tz="UTC")
dttm.utc
[1] "2016-01-01 10:20:10 UTC"
Now we can convert it to CET timezone relatively easily:
dttm.cet <- format( dttm.utc, tz = "CET", usetz = T )
dttm.cet
[1] "2016-01-01 11:20:10 CET"
But notice that the two values are of different classes. The first is POSIXct, but the second has been converted to character by the format function.
class( dttm.utc )
[1] "POSIXct" "POSIXt"
class( dttm.cet )
[1] "character"
That's no good, because it means we can't just do the same conversion again in the other direction. We first need to convert the latter value back to a POSIX class, being very careful not to let R muck around with the timezone:
dttm.cet <- as.POSIXct( dttm.cet, tz = "CET" )
dttm.cet
[1] "2016-01-01 11:20:10 CET"
class( dttm.cet )
[1] "POSIXct" "POSIXt"
Now we can convert it:
format( dttm.cet, tz = "UTC", usetz = TRUE )
[1] "2016-01-01 10:20:10 UTC"
But that puts us back to character class. Very annoying. Here's a workaround. Build the two-step conversion into a function, and use that from now on.
convert.tz <- function( x, tz ) {
new <- format( x, tz = tz, usetz = T )
return( as.POSIXct( new, tz = tz ) )
}
Give that a try:
dttm.utc <- as.POSIXct("2016-01-01 10:20:10", tz="UTC")
dttm.utc
[1] "2016-01-01 10:20:10 UTC"
dttm.cet <- convert.tz( dttm.utc, "CET" )
dttm.cet
[1] "2016-01-01 11:20:10 CET"
class( dttm.utc )
[1] "POSIXct" "POSIXt"
class( dttm.cet )
[1] "POSIXct" "POSIXt"
So now the conversion doesn't change the format, which means we can go either way in the conversion, without changing the method:
convert.tz( dttm.cet, "UTC" )
[1] "2016-01-01 10:20:10 UTC"
Ahhh. Much better.
Of course, you could stick with base R, and do this every time you convert.
dttm.cet <- as.POSIXct( format( dttm.utc, tz = "CET", usetz = T ), tz = "CET" )
But personally, I like the function a lot better.
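For comparison, a sketch of a shorter route that stays in base R: a POSIXct stores seconds since the epoch, and its tzone attribute only controls how that instant is displayed, so changing the attribute converts between zones without a round trip through character. The function name set_tz is hypothetical.

```r
# Hypothetical helper: change the display zone, keep the instant
set_tz <- function(x, tz) {
  attr(x, "tzone") <- tz
  x
}

dttm.utc <- as.POSIXct("2016-01-01 10:20:10", tz = "UTC")
dttm.cet <- set_tz(dttm.utc, "CET")

format(dttm.cet, usetz = TRUE)
# [1] "2016-01-01 11:20:10 CET"
class(dttm.cet)
# [1] "POSIXct" "POSIXt"
```

Because no text parsing is involved, this also sidesteps ambiguous local times around daylight-saving changes.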

R: date formatting ignores timezone for POSIXlt objects

I cannot get R to format POSIXlt objects in the desired timezone. POSIXct works as expected. Is this a bug or am I missing something?
date.str = "2015-12-09 13:30"
from = "Europe/London"
to = "America/Los_Angeles"
lt = as.POSIXlt(date.str, tz=from)
format(lt, tz=to, usetz=TRUE)
#[1] "2015-12-09 13:30:00 GMT"
ct = as.POSIXct(date.str, tz=from)
format(ct, tz=to, usetz=TRUE)
#[1] "2015-12-09 05:30:00 PST"
The tzone attributes are the same:
attributes(ct)$tzone
#[1] "Europe/London"
attributes(lt)$tzone
#[1] "Europe/London"
Solution
As pointed out by @nicola, format.POSIXlt has no tz parameter. To print a POSIXlt date in another timezone, one can use the lubridate package to convert the POSIXlt object to the desired timezone first:
require(lubridate)
lt.changed = with_tz(lt, tz=to)
format(lt.changed, usetz=TRUE)
#[1] "2015-12-09 05:30:00 PST"
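The same result can be sketched without lubridate, assuming base R only: convert the POSIXlt to POSIXct (an absolute instant) and back to POSIXlt in the target zone. The name lt.la is illustrative.

```r
lt <- as.POSIXlt("2015-12-09 13:30", tz = "Europe/London")

# Go through POSIXct, then rebuild the broken-down time in the target zone
lt.la <- as.POSIXlt(as.POSIXct(lt), tz = "America/Los_Angeles")

format(lt.la, usetz = TRUE)
# [1] "2015-12-09 05:30:00 PST"
```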

as.Date() does not respect POSIXct time zones

Okay, so here is a subtle "quirk" in R's as.Date function when converting from a POSIXct with a timezone, and I am wondering whether it is a bug.
> as.POSIXct("2013-03-29", tz = "Europe/London")
[1] "2013-03-29 GMT"
> as.Date(as.POSIXct("2013-03-29", tz = "Europe/London"))
[1] "2013-03-29"
So far no problem, but.....
> as.POSIXct("2013-04-01", tz = "Europe/London")
[1] "2013-04-01 BST"
> as.Date(as.POSIXct("2013-04-01", tz = "Europe/London"))
[1] "2013-03-31"
Anybody seen this? Is this a bug or another quirk? April fools?
The default time zone for as.Date.POSIXct is "UTC" (see the help page). Try as.Date(as.POSIXct("2013-04-01", tz = "Europe/London"),tz = "Europe/London").
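A short sketch of that suggestion, reusing the object's own tzone attribute so the zone name is not typed twice (the variable name x is illustrative):

```r
# Midnight BST on 1 April is 23:00 UTC on 31 March
x <- as.POSIXct("2013-04-01", tz = "Europe/London")

as.Date(x)                          # default tz = "UTC"
# [1] "2013-03-31"

as.Date(x, tz = attr(x, "tzone"))   # use the zone stored on the object
# [1] "2013-04-01"
```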

as.POSIXct gives an unexpected timezone

I'm trying to convert a yearmon date (from the zoo package) to a POSIXct in the UTC timezone.
This is what I tried to do:
> as.POSIXct(as.yearmon("2010-01-01"), tz="UTC")
[1] "2010-01-01 01:00:00 CET"
I get the same when I convert a Date:
> as.POSIXct(as.Date("2010-01-01"),tz="UTC")
[1] "2010-01-01 01:00:00 CET"
The only way to get it to work is to pass a character as an argument:
> as.POSIXct("2010-01-01", tz="UTC")
[1] "2010-01-01 UTC"
I looked into the documentation of DateTimeClasses, tzset and timezones. My /etc/localtime is set to Europe/Amsterdam. I couldn't find a way to set the tz to UTC, other than setting the TZ environment variable:
> Sys.setenv(TZ="UTC")
> as.POSIXct(as.Date("2010-01-01"),tz="UTC")
[1] "2010-01-01 UTC"
Is it possible to directly set the timezone when creating a POSIXct from a yearmon or Date?
Edit:
I checked the function as.POSIXct.yearmon. It simply passes its argument on to as.POSIXct.Date:
> zoo:::as.POSIXct.yearmon
function (x, tz = "", ...)
as.POSIXct(as.Date(x), tz = tz, ...)
<environment: namespace:zoo>
So, as Joshua says, the timezone gets lost in as.POSIXct.Date. For now I'll use Richie's suggestion to set the tzone by hand:
attr(x, "tzone") <- 'UTC'
This solves the issue of the lost tzone, which is only used for presentation and not internally, as Grothendieck and Dwin suggested.
This is because as.POSIXct.Date doesn't pass ... to .POSIXct.
> as.POSIXct.Date
function (x, ...)
.POSIXct(unclass(x) * 86400)
<environment: namespace:base>
You are setting the timezone correctly in your code. The problem you are perceiving is only at the output stage. POSIXct values are all stored relative to UTC/GMT, and dates are assumed to be midnight times. Midnight UTC is 1 AM CET (which is apparently where you are).
> as.POSIXct(as.yearmon("2010-01-01"), tz="UTC")
[1] "2009-12-31 19:00:00 EST" # R reports the time in my locale's timezone
> dtval <- as.POSIXct(as.yearmon("2010-01-01"), tz="UTC")
> format(dtval, tz="UTC") # report the date in UTC note it is the correct date ... there
[1] "2010-01-01"
> format(dtval, tz="UTC", format="%Y-%m-%d ")
[1] "2010-01-01 " # use a format string
> format(dtval, tz="UTC", format="%Y-%m-%d %OS3")
[1] "2010-01-01 00.000" # use decimal time
See ?strptime for many, many other format possibilities.
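To make the "output stage only" point concrete, here is a small sketch using a Date (no zoo needed). It relies only on base R; the variable names d and x are illustrative.

```r
d <- as.Date("2010-01-01")
x <- as.POSIXct(d)   # midnight UTC, stored as seconds since the epoch

# The stored number is simply days * 86400, whatever zone is used for printing
as.numeric(x) == as.numeric(d) * 86400

# Setting the attribute changes the display zone, not the stored value
attr(x, "tzone") <- "UTC"
format(x, "%Y-%m-%d %H:%M:%S", usetz = TRUE)
# [1] "2010-01-01 00:00:00 UTC"
```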
In the help page ?as.POSIXct, for the tz argument it says
A timezone specification to be used for the conversion, if one is required. System-specific (see time zones), but "" is the current timezone, and "GMT" is UTC (Universal Time, Coordinated).
Does as.POSIXct(as.yearmon("2010-01-01"), tz="GMT") work for you?
After more perusal of the documentation, in the details section we see:
Dates without times are treated as being at midnight UTC.
So in your example, the tz argument is ignored. If you use as.POSIXlt it is easier to see what happens with the timezone. The following should all give the same answer, with UTC as the timezone.
unclass(as.POSIXlt(as.yearmon("2010-01-01")))
unclass(as.POSIXlt(as.yearmon("2010-01-01"), tz = "UTC"))
unclass(as.POSIXlt(as.yearmon("2010-01-01"), tz = "GMT"))
unclass(as.POSIXlt(as.yearmon("2010-01-01"), tz = "CET"))
In fact, since you are using as.yearmon (which strips the time out) you will never get to set the timezone. Compare, e.g.,
unclass(as.POSIXlt(as.yearmon("2010-01-01 12:00:00"), tz = "CET"))
unclass(as.POSIXlt("2010-01-01 12:00:00", tz = "CET"))
This seems to be an oddity with the date/time "POSIXct" class methods. Try formatting the "Date" or "yearmon" variable first so that as.POSIXct.character rather than as.POSIXct.{Date, yearmon} is dispatched:
Date
> d <- as.Date("2010-01-01")
> as.POSIXct(format(d), tz = "UTC")
[1] "2010-01-01 UTC"
yearmon
> library(zoo)
> y <- as.yearmon("2010-01")
> as.POSIXct(format(y, format = "%Y-%m-01"), tz = "UTC")
[1] "2010-01-01 UTC"
> # or
> as.POSIXct(format(as.Date(y)), tz = "UTC")
[1] "2010-01-01 UTC"
