Unexpected date when converting POSIXct date-time to Date - timezone issue? - r

When I try to coerce a POSIXct date-time to a Date using as.Date, it seems to return the wrong date.
I suspect it has got something to do with the time zone. I tried the tz argument in as.Date, but it didn't give the expected date.
# POSIXct returns day of month 24
data$Time[3]
# [1] "2020-03-24 00:02:00 IST"
class(data$Time[3])
# [1] "POSIXct" "POSIXt"
# coerce to Date, returns 23
as.Date(data$Time[3])
# [1] "2020-03-23"
# try the time zone argument, without luck
as.Date(data$Time[3], tz = "IST")
# [1] "2020-03-23"
# Warning message:
# In as.POSIXlt.POSIXct(x, tz = tz) : unknown timezone 'IST'
Sys.timezone()
# [1] "Asia/Calcutta"
Any ideas what may be going wrong here?

Using the setup in the Note at the end we can use any of these:
# same date as print(x) shows
as.Date(as.character(x))
## [1] "2020-03-24"
# use the time zone stored in x (or system time zone if that is "")
as.Date(x, tz = attr(x, "tzone"))
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = "")
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = Sys.timezone())
## [1] "2020-03-24"
# use indicated time zone
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"
Note
We have assumed this setup.
Sys.setenv(TZ = "Asia/Calcutta")
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")
R.version.string
## [1] "R version 4.0.2 Patched (2020-06-24 r78745)"
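As a minimal sketch of why the date shifts, the snippet below builds the same instant explicitly and converts it with two different tz values. The names x_ist, d_utc and d_local are made up for illustration, and "Asia/Kolkata" is assumed to be available in the system's time zone database.

```r
# 2020-03-24 00:02:00 in India is 2020-03-23 18:32:00 in UTC.
x_ist <- as.POSIXct("2020-03-24 00:02:00", tz = "Asia/Kolkata")

# The calendar date depends on the time zone the instant is viewed in.
d_utc   <- as.Date(x_ist, tz = "UTC")
d_local <- as.Date(x_ist, tz = "Asia/Kolkata")

format(d_utc)    # "2020-03-23"
format(d_local)  # "2020-03-24"
```

Passing tz explicitly, as in the answer above, avoids relying on whatever default the method uses.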

The clue is in the warning message. as.Date() doesn't know how to interpret IST as a time zone and so falls back to UTC. Assuming that IST means Indian Standard Time (rather than Irish Standard Time), i.e. UTC+5:30, as.Date() is giving the expected result for UTC, even if it is incorrect for your purposes.
Passing the date-time as a character string gives the desired result, because as.Date() then parses only the leading date part and ignores the rest of the string, including the UTC offset.
as.Date("2020-03-24 00:02:00 UTC+5:30")
# [1] "2020-03-24"
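A small check with made-up input strings suggests that as.Date() on a character string reads only the leading date, so the offset is not actually being interpreted. The variable names here are hypothetical.

```r
# Hypothetical inputs: same leading date, different trailing text.
with_offset <- as.Date("2020-03-24 00:02:00 UTC+5:30")
with_other  <- as.Date("2020-03-24 00:02:00 UTC-11:00")

identical(with_offset, with_other)  # TRUE: the trailing text plays no role
format(with_offset)                 # "2020-03-24"
```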

Related

How to change a number into datetime format in R

I have a vector a = 40208.64507.
In excel, I can automatically change a to a datetime: 2010/1/30 15:28:54 by click the Date type.
I tried some methods but I cannot get the same result in R, just as in excel.
a = 40208.64507
# in Excel, a can change into: 2010/1/30 15:28:54
as.Date(a, origin = "1899-12-30")
lubridate::as_datetime(a, origin = "1899-12-30")
Is there any way to get the same results in R as in Excel?
Here are several ways. chron class is the closest to Excel in terms of internal representations -- they are the same except for origin -- and the simplest so we list that one first. We also show how to use chron as an intermediate step to get POSIXct.
Base R provides an approach which avoids package dependencies and lubridate might be used if you are already using it.
1) Add the appropriate origin using chron to get a chron datetime or convert that to POSIXct. Like Excel, chron works in days and fractions of a day, but chron uses the UNIX Epoch as origin whereas Excel uses the one shown below.
library(chron)
a <- 40208.64507
# chron date/time
ch <- as.chron("1899-12-30") + a; ch
## [1] (01/30/10 15:28:54)
# POSIXct date/time in local time zone
ct <- as.POSIXct(ch); ct
## [1] "2010-01-30 10:28:54 EST"
# POSIXct date/time in UTC
as.POSIXct(format(ct, tz = "UTC"), tz = "UTC")
## [1] "2010-01-30 15:28:54 UTC"
2) Using only base R convert the number to Date class using the indicated origin and then to POSIXct.
# POSIXct with local time zone
ct <- as.POSIXct(as.Date(a, origin = "1899-12-30")); ct
## [1] "2010-01-30 10:28:54 EST"
# POSIXct with UTC time zone
as.POSIXct(format(ct, tz = "UTC"), tz = "UTC")
## [1] "2010-01-30 15:28:54 UTC"
3) Using lubridate it is similar to base R so we can write
library(lubridate)
# local time zone
as_datetime(as_date(a, origin = "1899-12-30"), tz = "")
## [1] "2010-01-30 15:28:54 EST"
# UTC time zone
as_datetime(as_date(a, origin = "1899-12-30"))
## [1] "2010-01-30 15:28:54 UTC"
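Another base R route, sketched here under the assumption that the serial number uses Excel's 1899-12-30 origin as in the question, is to scale days to seconds and build the POSIXct directly. The name dt is made up for illustration.

```r
a <- 40208.64507

# Excel serial numbers count days from 1899-12-30; POSIXct counts seconds,
# so scale days to seconds before adding the origin.
dt <- as.POSIXct(a * 86400, origin = "1899-12-30", tz = "UTC")

format(dt, "%Y-%m-%d %H:%M:%S")  # "2010-01-30 15:28:54"
```

Setting tz = "UTC" keeps the displayed clock time equal to what Excel shows, since the serial number itself carries no time zone.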

Sys.Date() with as.POSIXct()

Trying to get current date in a POSIXct class. I have tried the following:
as.POSIXct(Sys.Date(), format = "%m/%d/%y", tz = "EST")
and got
[1] "2021-02-12 19:00:00 EST"
and I wish to only get the date without the time but in POSIXct class. For instance:
[1] "2021-02-12"
Convert the Date class object to character first:
as.POSIXct(format(Sys.Date()))
## [1] "2021-02-13 EST"
Even shorter is:
trunc(Sys.time(), "day")
## [1] "2021-02-13 EST"
Note:
POSIXct objects are stored internally as seconds since the Epoch, not as separate date and time parts, so they always have a time; however, if the time is midnight, as it is here, it is not displayed when printed using the default formatting.
If you only need the date it is normally better to use Date class, since POSIXct can introduce subtle time zone errors if you are not careful, and there is typically no reason to expose yourself to that risk if you don't need to.
If you change the session's time zone then the time will be displayed, because midnight in one time zone is not midnight in other time zones.
x <- as.POSIXct(format(Sys.Date()))
x
## [1] "2021-02-13 EST"
# change time zone (the environment variable is TZ, upper case)
Sys.setenv(TZ = "GMT")
x
## [1] "2021-02-13 05:00:00 GMT"
# change back (assumes TZ was not set beforehand)
Sys.unsetenv("TZ")
x
## [1] "2021-02-13 EST"
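The same point can be sketched without touching the session's time zone. A fixed date stands in for Sys.Date() so the result is reproducible, and "America/New_York" is an assumed stand-in for the EST session shown above.

```r
# A fixed date stands in for Sys.Date() so the result is reproducible.
x <- as.POSIXct("2021-02-13", tz = "America/New_York")

# Internally this is just a count of seconds since 1970-01-01 00:00:00 UTC.
as.numeric(x)

# The same instant shown in two time zones.
format(x, "%Y-%m-%d %H:%M:%S", tz = "America/New_York")  # "2021-02-13 00:00:00"
format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")               # "2021-02-13 05:00:00"
```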

System date in Posixlt and Posixct

I am trying to get the last minute of yesterday using Sys.Date() in Posix time.
force_tz(as.POSIXlt(Sys.Date()-1), tz = 'America/New_York') + 86399
# [1] "2018-01-12 23:59:59 EST"
CORRECT
force_tz(as.POSIXct(Sys.Date()-1), tz = 'America/New_York') + 86399
# [1] "2018-01-12 15:59:59 EST"
INCORRECT
Sys.Date()
# [1] "2018-01-13"
Why do as.POSIXct and as.POSIXlt return two different values using Sys.Date(), and why is the difference 8 hours even after applying force_tz from lubridate?
As ever, debugonce is your friend. Running debugonce(force_tz), you can see that the difference in output comes from which branch force_tz takes. It first checks is.POSIXct(time), in which case the default tzone = "" is applied; in the POSIXlt case the default branch is hit, where as.POSIXct is applied to time and tz(time) (which comes out as UTC for a POSIXlt object) is used as the time zone.
This comes down to something subtle happening; from ?as.POSIXlt.Date:
Dates without times are treated as being at midnight UTC.
Hence
tz(as.POSIXlt(Sys.Date()-1))
# [1] "UTC"
But
tz(as.POSIXct(Sys.Date()-1))
# [1] ""
What's peculiar is this can't be overridden -- as.POSIXlt.Date doesn't accept a tz argument:
formals(as.POSIXlt.Date)
# $x
# $...
If you want to use POSIXct, how about the following?
force_tz(as.POSIXct(sprintf('%s 00:00:00', Sys.Date())), 'America/New_York') - 1L
# [1] "2018-01-12 23:59:59 EST"
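A base R variant of the same idea, sketched without lubridate: a fixed date stands in for Sys.Date(), and the names midnight_today and last_second_yesterday are made up for illustration.

```r
# Fixed stand-in for Sys.Date() so the example is reproducible.
today <- as.Date("2018-01-13")

# Going through a character string makes as.POSIXct interpret the date
# as midnight in the requested time zone rather than midnight UTC.
midnight_today <- as.POSIXct(format(today), tz = "America/New_York")
last_second_yesterday <- midnight_today - 1

format(last_second_yesterday, "%Y-%m-%d %H:%M:%S")  # "2018-01-12 23:59:59"
```

Subtracting one second from local midnight sidesteps the 86399 arithmetic and the UTC-midnight interpretation of Date objects discussed above.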

as.POSIXct gives an unexpected timezone

I'm trying to convert a yearmon date (from the zoo package) to a POSIXct in the UTC timezone.
This is what I tried to do:
> as.POSIXct(as.yearmon("2010-01-01"), tz="UTC")
[1] "2010-01-01 01:00:00 CET"
I get the same when I convert a Date:
> as.POSIXct(as.Date("2010-01-01"),tz="UTC")
[1] "2010-01-01 01:00:00 CET"
The only way to get it to work is to pass a character as an argument:
> as.POSIXct("2010-01-01", tz="UTC")
[1] "2010-01-01 UTC"
I looked into the documentation of DateTimeClasses, tzset and timezones. My /etc/localtime is set to Europe/Amsterdam. I couldn't find a way to set the tz to UTC, other than setting the TZ environment variable:
> Sys.setenv(TZ="UTC")
> as.POSIXct(as.Date("2010-01-01"),tz="UTC")
[1] "2010-01-01 UTC"
Is it possible to directly set the timezone when creating a POSIXct from a yearmon or Date?
Edit:
I checked the function as.POSIXct.yearmon. It passes its argument on to as.POSIXct.Date.
> zoo:::as.POSIXct.yearmon
function (x, tz = "", ...)
as.POSIXct(as.Date(x), tz = tz, ...)
<environment: namespace:zoo>
So, as Joshua says, the time zone gets lost in as.POSIXct.Date. For now I'll use Richie's suggestion to set the tzone by hand using:
attr(x, "tzone") <- 'UTC'
This solves the issue of the lost tzone, which is only used for presentation and not internally, as Grothendieck and Dwin suggested.
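The tzone workaround can be sketched as follows, using a plain Date so that zoo is not needed. The names p and q are hypothetical, and how the unmodified object prints may depend on the R version and the local time zone.

```r
d <- as.Date("2010-01-01")

# Converting a Date gives the instant of midnight UTC on that day.
p <- as.POSIXct(d)
as.numeric(p) == as.numeric(d) * 86400  # TRUE

# The tzone attribute changes only how the instant is displayed.
attr(p, "tzone") <- "UTC"
format(p, "%Y-%m-%d %H:%M:%S")  # "2010-01-01 00:00:00"

q <- p
attr(q, "tzone") <- "Europe/Amsterdam"
format(q, "%Y-%m-%d %H:%M:%S")  # "2010-01-01 01:00:00"
as.numeric(q) == as.numeric(p)  # TRUE
```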
This is because as.POSIXct.Date doesn't pass ... to .POSIXct.
> as.POSIXct.Date
function (x, ...)
.POSIXct(unclass(x) * 86400)
<environment: namespace:base>
You are setting the timezone correctly in your code. The problem you are perceiving is only at the output stage. POSIX values are all referenced to UTC/GMT. Dates are assumed to be midnight times. Midnight UTC is 1 AM CET (which is apparently where you are).
> as.POSIXct(as.yearmon("2010-01-01"), tz="UTC")
[1] "2009-12-31 19:00:00 EST" # R reports the time in my locale's timezone
> dtval <- as.POSIXct(as.yearmon("2010-01-01"), tz="UTC")
> format(dtval, tz="UTC") # report the date in UTC note it is the correct date ... there
[1] "2010-01-01"
> format(dtval, tz="UTC", format="%Y-%m-%d ")
[1] "2010-01-01 " # use a format string
> format(dtval, tz="UTC", format="%Y-%m-%d %OS3")
[1] "2010-01-01 00.000" # use decimal time
See ?strptime for many, many other format possibilities.
In the help page ?as.POSIXct, for the tz argument it says
A timezone specification to be used for the conversion, if one is required. System-specific (see time zones), but ‘""’ is the current timezone, and ‘"GMT"’ is UTC (Universal Time, Coordinated).
Does as.POSIXct(as.yearmon("2010-01-01"), tz="GMT") work for you?
After more perusal of the documentation, in the details section we see:
Dates without times are treated as being at midnight UTC.
So in your example, the tz argument is ignored. If you use as.POSIXlt it is easier to see what happens with the timezone. The following should all give the same answer, with UTC as the timezone.
unclass(as.POSIXlt(as.yearmon("2010-01-01")))
unclass(as.POSIXlt(as.yearmon("2010-01-01"), tz = "UTC"))
unclass(as.POSIXlt(as.yearmon("2010-01-01"), tz = "GMT"))
unclass(as.POSIXlt(as.yearmon("2010-01-01"), tz = "CET"))
In fact, since you are using as.yearmon (which strips the time out) you will never get to set the timezone. Compare, e.g.,
unclass(as.POSIXlt(as.yearmon("2010-01-01 12:00:00"), tz = "CET"))
unclass(as.POSIXlt("2010-01-01 12:00:00", tz = "CET"))
This seems to be an oddity with the date/time "POSIXct" class methods. Try formatting the "Date" or "yearmon" variable first so that as.POSIXct.character rather than as.POSIXct.{Date, yearmon} is dispatched:
Date
> d <- as.Date("2010-01-01")
> as.POSIXct(format(d), tz = "UTC")
[1] "2010-01-01 UTC"
yearmon
> library(zoo)
> y <- as.yearmon("2010-01")
> as.POSIXct(format(y, format = "%Y-%m-01"), tz = "UTC")
[1] "2010-01-01 UTC"
> # or
> as.POSIXct(format(as.Date(y)), tz = "UTC")
[1] "2010-01-01 UTC"
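To see what the character route does, the sketch below compares the instants it produces for two time zones. The variable names are made up, and "Europe/Amsterdam" is chosen only because it matches the asker's locale.

```r
d <- as.Date("2010-01-01")

# format(d) yields the string "2010-01-01", which is then read as a
# wall-clock time in the zone given by tz.
utc_midnight <- as.POSIXct(format(d), tz = "UTC")
ams_midnight <- as.POSIXct(format(d), tz = "Europe/Amsterdam")

# Midnight in Amsterdam (CET, UTC+1 in January) falls one hour before midnight UTC.
as.numeric(utc_midnight) - as.numeric(ams_midnight)  # 3600
```

So the character route honours tz when interpreting the date, whereas converting the Date object directly always starts from midnight UTC.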

How to extract the correct timezones from POSIXct and POSIXlt objects?

time1 = as.POSIXlt("2010-07-01 16:00:00", tz="Europe/London")
time1
# [1] "2010-07-01 16:00:00 Europe/London"
but
time2 = as.POSIXct("2010-07-01 16:00:00", tz="Europe/London")
time2
# [1] "2010-07-01 16:00:00 BST"
Why is the timezone presented differently? It is important for me because I need to extract the time zones from my date.
base::format(time1, format="%Z")
# [1] "BST"
base::format(time2, format="%Z")
# [1] "BST"
Both give the same "BST" for British Summer Time!
The issue is that "BST" does not seem to be recognized by POSIXct/POSIXlt:
as.POSIXlt("2010-07-01 16:00:00", tz="BST")
# [1] "2010-07-01 16:00:00 BST"
# Warning messages:
# 1: In strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
# unknown timezone 'BST'
# 2: In structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz) :
# unknown timezone 'BST'
# 3: In strptime(x, f, tz = tz) : unknown timezone 'BST'
as.POSIXct("2010-07-01 16:00:00", tz="BST")
# [1] "2010-07-01 16:00:00 GMT"
# Warning messages:
# 1: In strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
# unknown timezone 'BST'
# 2: In structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz) :
# unknown timezone 'BST'
# 3: In strptime(x, f, tz = tz) : unknown timezone 'BST'
# 4: In structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz) :
# unknown timezone 'BST'
# 5: In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'BST'
I am really confused.
I have 2 questions:
1/ What is the difference between the POSIXct and POSIXlt formats?
2/ Does anyone know what time zone I can use?
"Europe/London" works with POSIXlt but not POSIXct. Plus it cannot be extracted from a time using base::format
"BST" is not recognized as a valid timezone in as.POSIXct or as.POSIXlt functions.
@Koshke has already shown you
the difference in internal representation of both date types, and
that internally, both timezone specifications are the same.
You can get the timezone out in a standardized manner using attr(). This will get the timezone in the form specified in the zone.tab file, which is used by R to define the timezones (more info in ?timezones).
e.g.:
> attr(time1,"tzone")
[1] "Europe/London"
> attr(time2,"tzone")
[1] "Europe/London"
I am quite amazed though that POSIXct uses different indications for the timezones than POSIXlt, whereas the attributes are equal. Apparently, this "BST" only pops up when the POSIXct is printed. Before it gets printed, POSIXct gets converted again to POSIXlt, and the tzone attribute gets amended with synonyms:
> attr(as.POSIXlt(time2),"tzone")
[1] "Europe/London" "GMT" "BST"
This happens somewhere downstream of the internal R function as.POSIXlt, which I'm not able to look at for the moment due to more acute problems to solve. But feel free to go through it and see what exactly is going on there.
On a sidenote, "BST" is not recognized as a timezone (and it is not mentioned in zone.tab either) on my Windows 7 / R 2.13.0 install.
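A short sketch of extracting both forms of the time zone from the question's objects. Taking the first element of tzone is an assumption that covers the case where a POSIXlt object carries the abbreviations as extra elements; the names zone1, abbr1 and so on are made up.

```r
time1 <- as.POSIXlt("2010-07-01 16:00:00", tz = "Europe/London")
time2 <- as.POSIXct("2010-07-01 16:00:00", tz = "Europe/London")

# The first element of tzone is the zone name that was supplied; a POSIXlt
# object may carry the abbreviations as further elements.
zone1 <- attr(time1, "tzone")[1]
zone2 <- attr(time2, "tzone")[1]

# %Z gives the abbreviation in effect at that instant (depends on the
# platform's time zone database).
abbr1 <- format(time1, "%Z")
abbr2 <- format(time2, "%Z")

# The zone name round-trips as a tz argument, unlike the abbreviation.
time3 <- as.POSIXct("2010-07-01 16:00:00", tz = zone2)
identical(as.numeric(time3), as.numeric(time2))  # TRUE
```

If the goal is to re-create times later, storing the zone name rather than the abbreviation is the safer choice, since abbreviations such as "BST" are ambiguous and may not be accepted as tz values.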
Perhaps unclassing the objects helps you to inspect the differences:
> unclass(time1)
$sec
[1] 0
$min
[1] 0
... snip
$yday
[1] 181
$isdst
[1] 1
attr(,"tzone")
[1] "Europe/London"
> unclass(time2)
[1] 1277996400
attr(,"tzone")
[1] "Europe/London"
Thus, the POSIXlt contains the date as a list of components, while the POSIXct contains it as a numeric, i.e., UNIX epoch time.
As for the timezone, it would be beyond the scope of R.
See the explanation in http://en.wikipedia.org/wiki/Tz_database
As for the different behavior of
as.POSIXct("2010-07-01 16:00:00", tz="BST")
as.POSIXlt("2010-07-01 16:00:00", tz="BST")
I suspect there is a bug in as.POSIXct, which does not process the tz argument.
1/ What is the difference between POSIXct and POSIXlt formats
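POSIXct is the number of seconds since the epoch, stored as a single numeric value.
POSIXlt stores the datetime as a list of components (sec, min, hour, mday, mon, year and so on), which can then be formatted as %Y-%m-%d, %Y/%m/%d %H:%M:%S or other such formats.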
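The two representations can be compared side by side in a minimal sketch that reuses the question's date-time; the names ct and lt are made up.

```r
ct <- as.POSIXct("2010-07-01 16:00:00", tz = "Europe/London")
lt <- as.POSIXlt(ct)

# POSIXct: a single number of seconds since 1970-01-01 00:00:00 UTC.
as.numeric(ct)  # 1277996400

# POSIXlt: named components (mon is 0-based, year counts from 1900).
c(year = lt$year + 1900, month = lt$mon + 1, day = lt$mday, hour = lt$hour)

# Converting back recovers the same instant.
as.numeric(as.POSIXct(lt)) == as.numeric(ct)  # TRUE
```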
