I'd like to change the time zone of a POSIXct object in R, using the with_tz() function in the lubridate package.
This example I pulled from the web works for me:
meeting <- ymd_hms("2011-07-01 09:00:00", tz = "Pacific/Auckland")
with_tz(meeting, "America/Chicago")
But this one does not, using a snippet of some data:
atime <- as.POSIXct("2016-11-04 18:04:30",
format="%Y-%m-%d %H:%M:%S",
tz="PST")
atime_utc <- with_tz(atime, "UTC")
str() and tz() show that the new object has a time zone of "UTC", and is a POSIXct object, but the times are identical. There should be 8 hours between them after the time zone conversion.
Another solution using a different function would be fine, too.
The comments above should be well taken, but you can also try force_tz depending on your needs:
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
atime <- as.POSIXct("2016-11-04 18:04:30",
format="%Y-%m-%d %H:%M:%S",
tz="PST")
#> Warning in strptime(x, format, tz = tz): unknown timezone 'PST'
#> Warning in as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...): unknown
#> timezone 'PST'
tz(atime)
#> [1] "PST"
atime_utc <- with_tz(atime, "UTC")
force_tz(atime, "UTC")
#> [1] "2016-11-04 10:04:30 UTC"
Created on 2019-03-03 by the reprex package (v0.2.1)
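For reference, here is a minimal base-R sketch of the conversion the question is after. It assumes the timestamp was recorded in US Pacific time and substitutes the Olson name "America/Los_Angeles" for the unrecognized "PST"; that choice of zone is an assumption, not something stated in the question. On 2016-11-04 daylight saving time was still in effect there, so the offset from UTC is 7 hours rather than 8.

```r
# "PST" is not a recognized zone name, so R warns and falls back to UTC.
# Assumption: the data were recorded in US Pacific time.
atime <- as.POSIXct("2016-11-04 18:04:30",
                    format = "%Y-%m-%d %H:%M:%S",
                    tz = "America/Los_Angeles")

# Changing the "tzone" attribute keeps the instant and changes only the display;
# lubridate::with_tz(atime, "UTC") is meant to do the same thing.
atime_utc <- atime
attr(atime_utc, "tzone") <- "UTC"

format(atime, "%Y-%m-%d %H:%M:%S %Z")      # "2016-11-04 18:04:30 PDT"
format(atime_utc, "%Y-%m-%d %H:%M:%S %Z")  # "2016-11-05 01:04:30 UTC"
```

Both objects hold the same number of seconds since the epoch; only the printed clock time differs.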
I have a time series of CO2 data with a UNIX timestamp that looks like this: 1658759863
e.g.
df <- data.frame(timestamp=seq(1653998631,1663998631,10),co2_ppm=runif(1000001))
I then convert to POSIXct and filter data only after 11:52:51
df <- df%>%
mutate(timestamp=as.POSIXct(timestamp,origin="1970-01-01",tz = "GMT"))%>%
filter(timestamp>"2022-07-28 11:52:51" )
attr(df$timestamp,"tzone")
[1] "GMT"
When I filter it to plot only values after "11:52:00", it returns values after "10:52:00".
Why?
This is a daylight savings issue, see the example below.
s <- Sys.time()
as.POSIXct(as.integer(s), origin = "1970-01-01", tz = "GMT")
#> [1] "2022-07-28 15:05:57 GMT"
as.POSIXct(as.integer(s), origin = "1970-01-01", tz = "Europe/London")
#> [1] "2022-07-28 16:05:57 WEST"
s
#> [1] "2022-07-28 16:05:57 BST"
Created on 2022-07-28 by the reprex package (v2.0.1)
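A POSIXct value is stored as the number of seconds since the origin (1970-01-01 00:00:00 UTC); the time zone only controls how that number is parsed and displayed. Although London and Greenwich share a time zone in winter, London's official time (BST) is one hour ahead of GMT at this time of year.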
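One way to sidestep the ambiguity is to build the cutoff as a POSIXct with an explicit time zone instead of comparing against a bare string, which R may interpret in the session's local time zone. The sketch below uses three made-up timestamps (10:52:51, 11:52:51 and 11:53:01 GMT on 2022-07-28) rather than the original data, and assumes the cutoff is meant in GMT.

```r
# Hypothetical sample: UNIX timestamps around the cutoff, displayed in GMT.
timestamps <- as.POSIXct(c(1659005571, 1659009171, 1659009181),
                         origin = "1970-01-01", tz = "GMT")

# State the cutoff's time zone explicitly so it cannot be read as local time.
cutoff <- as.POSIXct("2022-07-28 11:52:51", tz = "GMT")

# Both sides are now instants, so the comparison does not depend on the
# time zone of the R session.
kept <- timestamps[timestamps > cutoff]
format(kept, "%Y-%m-%d %H:%M:%S")  # "2022-07-28 11:53:01"
```

The same cutoff object can be used inside dplyr::filter() in place of the string.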
I have a table with date and temperature, and I need to convert the date from this format 1/1/2021 12:00:00 AM to just 1/1/21. Please help. I tried to add a new column with the new date using this code
SFtemps$Date <- as.Date(strptime(SFtemps$Date, "%m/%d/%y %H:%M"))
but it's not working.
Use %I (hours as a decimal number, 01–12) together with %p (AM/PM) instead of %H, and parse the four-digit year with %Y. %y (year without century, 00–99) is only needed in the output format.
x <- "1/1/2021 12:00:00 AM"
format(strptime(x, "%m/%d/%Y %I:%M:%S %p"), "%m/%d/%y")
[1] "01/01/21"
Note that after you reformat the time object, it will be a plain character string and lose all attributes of a time object.
Your string is actually a timestamp, using lubridate you can convert it to the desired format and then leverage stamp function to get your data formatted in the manner that suits you.
org_str <- "1/1/2021 12:00:00 AM"
library("lubridate")
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
x <- mdy_hms(x = org_str, tz = "UTC")
sf <- stamp("29/01/1997")
#> Multiple formats matched: "%d/%Om/%Y"(1), "%d/%m/%Y"(1)
#> Using: "%d/%Om/%Y"
sf(x)
#> [1] "01/01/2021"
Created on 2022-04-22 by the reprex package (v2.0.1)
Explanation
Contrary to common perception the date doesn't have a format as such. Only string representation of the date can be of a specific format. Consider the following example. Running code:
x <- Sys.Date()
unclass(x)
will return a number, not a formatted date. This is because (see footnote 1):
Thus, objects of the class Date are internally represented as numbers. More specifically, dates in R are actually integers.
The same is applicable to the POSIX objects
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
xn <- now()
class(xn)
#> [1] "POSIXct" "POSIXt"
unclass(xn)
#> [1] 1650644616
#> attr(,"tzone")
#> [1] ""
Created on 2022-04-22 by the reprex package (v2.0.1)
You can always run:
format.Date(lubridate::now(), "%d-%m")
# [1] "22-04"
However, by doing this you are not formatting the date but creating a string representation of the date / timestamp object. Your string representation would have to be converted to timestamp / date object if you want to use it in common date time operations.
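A short base-R sketch of that round trip, using a fixed date so the result is reproducible. The variable names are illustrative only.

```r
d <- as.Date("2021-01-01")

# format() returns a character string, not a Date.
s <- format(d, "%m/%d/%y")
s         # "01/01/21"
class(s)  # "character"

# Date arithmetic works on the Date object...
d + 30    # "2021-01-31"

# ...but the string must be parsed back before it can be used that way.
d2 <- as.Date(s, format = "%m/%d/%y")
identical(d, d2)  # TRUE
```

Keeping the Date column as is and calling format() only at display time avoids the second parsing step.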
Suggestions
Whenever working with date time / date objects I would argue that it's advisable to keep those objects as a correct class maintaining the relevant details (such as time zone in case of timestamps) and only leverage formatting functions when utilising the data in analysis / visual representation, as shown in the sample.
1 Essentials of dates and times
Your format is a bit off. These are two options that work (lubridate is my favorite):
SFtemps <- list()
SFtemps$Date <- '1/1/2021 12:00:00 AM'
> as.Date(strptime(SFtemps$Date, "%m/%d/%Y %r"))
[1] "2021-01-01"
> as.Date(lubridate::mdy_hms(SFtemps$Date))
[1] "2021-01-01"
This will give you the date. If you want to see it as 1/1/2021, use the format function
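You can try tryFormats. as.Date() only needs the date part, so the format can stop after the year (%Y for a four-digit year):
x <- "1/1/2021 12:00:00 AM"
as.Date(x, tryFormats = c("%m/%d/%Y"), optional = FALSE)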
I am trying to find out if it is possible to save the date time zone into a
file in R.
Let's create a date/time variable set to EST time.
x <- as.POSIXct("2015-06-01 12:00:00", tz = "EST")
x
#> [1] "2015-06-01 12:00:00 EST"
lubridate::tz(x)
#> [1] "EST"
tmpfile <- tempfile()
I will save this date information (which is in EST time) using base R,
readr and data.table to see how the information is preserved.
Base R version
The date gets converted to UTC and the time is not changed.
write.csv(data.frame(x = x), tmpfile)
df <- read.csv(tmpfile)
df
#> X x
#> 1 1 2015-06-01 12:00:00
lubridate::tz(df$x)
#> [1] "UTC"
tibble/readr version
It gets converted to UTC and the time is formatted as UTC as described in
the help. Time is now 17:00:00.
… POSIXct values are formatted as ISO8601 with a UTC timezone. Note:
POSIXct objects in local or non-UTC timezones will be converted to UTC
before writing.
readr::write_csv(tibble::tibble(x = x), tmpfile)
df <- readr::read_csv(tmpfile)
#> Rows: 1 Columns: 1
#> ── Column specification ────────────────────────────────────────────────────────
#> Delimiter: ","
#> dttm (1): x
#>
#> ℹ Use `spec()` to retrieve the full column specification for this data.
#> ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
df
#> # A tibble: 1 × 1
#> x
#> <dttm>
#> 1 2015-06-01 17:00:00
lubridate::tz(df$x)
#> [1] "UTC"
data.table version
The date gets converted to UTC and the time is formatted as UTC as described in
the help. Time is now also 17:00:00.
data.table::fwrite(data.table::data.table(x = x), tmpfile)
df <- data.table::fread(tmpfile)
df
#> x
#> 1: 2015-06-01 17:00:00
lubridate::tz(df$x)
#> [1] "UTC"
Is it possible to preserve the time zone when saving a file? If the user is
not paying attention, this can go silently and create issues.
Created on 2022-03-14 by the reprex package (v2.0.1)
Saving to an .RDS is often the best choice to preserve an object exactly as it’s represented in R. In this case, it preserves timezone without conversion:
saveRDS(data.frame(x = x), tmpfile)
df <- readRDS(tmpfile)
df
# x
# 1 2015-06-01 12:00:00
lubridate::tz(df$x)
# [1] "EST"
df$x
# [1] "2015-06-01 12:00:00 EST"
You could simply paste the time zone on to the end of the datetime string.
x <- as.POSIXct("2015-06-01 12:00:00", tz = "EST")
df <- data.frame(x = paste(as.character(x), lubridate::tz(x)))
tmpfile <- tempfile()
write.csv(df, tmpfile, row.names = FALSE)
new_df <- read.csv(tmpfile)
new_df
#> x
#> 1 2015-06-01 12:00:00 EST
as.POSIXct(new_df$x, strsplit(new_df$x, ' ')[[1]][3])
#> [1] "2015-06-01 12:00:00 EST"
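A variation on the same idea, sketched under the assumption that every row of the column shares one time zone: store the zone name in its own column so that nothing has to be split out of the string on the way back in. The column name `tz` and the explicit format string are choices made for this sketch.

```r
x <- as.POSIXct("2015-06-01 12:00:00", tz = "EST")

# Write the local clock time and the zone name as two separate columns.
# An explicit format keeps the time part even when the time is midnight.
df <- data.frame(x  = format(x, "%Y-%m-%d %H:%M:%S"),
                 tz = attr(x, "tzone"))

tmpfile <- tempfile(fileext = ".csv")
write.csv(df, tmpfile, row.names = FALSE)

new_df <- read.csv(tmpfile, stringsAsFactors = FALSE)

# Re-parse the clock time in the stored zone (assumes one zone per column).
restored <- as.POSIXct(new_df$x, format = "%Y-%m-%d %H:%M:%S",
                       tz = new_df$tz[1])

attr(restored, "tzone")                 # "EST"
as.numeric(restored) == as.numeric(x)   # TRUE
```

If rows can carry different zones, a single POSIXct vector cannot represent that, since the "tzone" attribute applies to the whole vector.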
When I try to coerce a POSIXct date-time to a Date using as.Date, it seems to return wrong date.
I suspect it has got something to do with the time zone. I tried the tz argument in as.Date, but it didn't give the expected date.
# POSIXct returns day of month 24
data$Time[3]
# [1] "2020-03-24 00:02:00 IST"
class(data$Time[3])
# [1] "POSIXct" "POSIXt"
# coerce to Date, returns 23
as.Date(data$Time[3])
# [1] "2020-03-23"
# try the time zone argument, without luck
as.Date(data$Time[3], tz = "IST")
# [1] "2020-03-23"
# Warning message:
# In as.POSIXlt.POSIXct(x, tz = tz) : unknown timezone 'IST'
Sys.timezone()
# [1] "Asia/Calcutta"
Any ideas what may be going wrong here?
Using the setup in the Note at the end we can use any of these:
# same date as print(x) shows
as.Date(as.character(x))
## [1] "2020-03-24"
# use the time zone stored in x (or system time zone if that is "")
as.Date(x, tz = attr(x, "tzone"))
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = "")
## [1] "2020-03-24"
# use system time zone
as.Date(x, tz = Sys.timezone())
## [1] "2020-03-24"
# use indicated time zone
as.Date(x, tz = "Asia/Calcutta")
## [1] "2020-03-24"
Note
We have assumed this setup.
Sys.setenv(TZ = "Asia/Calcutta")
x <- structure(1584988320, class = c("POSIXct", "POSIXt"), tzone = "")
R.version.string
## [1] "R version 4.0.2 Patched (2020-06-24 r78745)"
The clue is in the warning message. as.Date() doesn't know how to interpret IST as a timezone and so defaults to UTC. Assuming that IST is Indian Standard Time (rather than Irish Standard time) and that IST is UTC+5:30, as.Date() is giving the expected result, even if it is incorrect for your purposes.
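Passing the date-time as a character string also gives the desired result, but not because the offset is interpreted: as.Date() on a string parses only the leading date and ignores everything after it.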
as.Date("2020-03-24 00:02:00 UTC+5:30")
[1] "2020-03-24"
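To make the mechanism concrete, here is a small sketch of how the `tz` argument changes the calendar date extracted from the same instant. It uses the Olson name "Asia/Kolkata" in place of the unrecognized abbreviation "IST"; the variable name is illustrative.

```r
# 00:02 in India is 18:32 on the previous day in UTC (offset +05:30).
x <- as.POSIXct("2020-03-24 00:02:00", tz = "Asia/Kolkata")

# Extracting the date in UTC gives the previous calendar day.
as.Date(x, tz = "UTC")           # "2020-03-23"

# Extracting it in the zone the time was recorded in gives the local date.
as.Date(x, tz = "Asia/Kolkata")  # "2020-03-24"
```

An unknown zone name such as "IST" is treated as UTC, which is why the `tz = "IST"` call in the question behaves like the first line.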
time1 = as.POSIXlt("2010-07-01 16:00:00", tz="Europe/London")
time1
# [1] "2010-07-01 16:00:00 Europe/London"
but
time2 = as.POSIXct("2010-07-01 16:00:00", tz="Europe/London")
time2
# [1] "2010-07-01 16:00:00 BST"
Why is the timezone presented differently? It is important for me because I need to extract the time zones from my date.
base::format(time1, format="%Z")
# [1] "BST"
base::format(time2, format="%Z")
# [1] "BST"
both give the same "BST" for British Summer Time!
The issue is that "BST" does not seem to be recognized as a time zone by as.POSIXct/as.POSIXlt:
as.POSIXlt("2010-07-01 16:00:00", tz="BST")
# [1] "2010-07-01 16:00:00 BST"
# Warning messages:
# 1: In strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
# unknown timezone 'BST'
# 2: In structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz) :
# unknown timezone 'BST'
# 3: In strptime(x, f, tz = tz) : unknown timezone 'BST'
as.POSIXct("2010-07-01 16:00:00", tz="BST")
# [1] "2010-07-01 16:00:00 GMT"
# Warning messages:
# 1: In strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
# unknown timezone 'BST'
# 2: In structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz) :
# unknown timezone 'BST'
# 3: In strptime(x, f, tz = tz) : unknown timezone 'BST'
# 4: In structure(xx, class = c("POSIXct", "POSIXt"), tzone = tz) :
# unknown timezone 'BST'
# 5: In as.POSIXlt.POSIXct(x, tz) : unknown timezone 'BST'
I am really confused.
I have 2 questions:
1/ What is the difference between the POSIXct and POSIXlt formats?
2/ Does anyone know which time zone I can use?
"Europe/London" works with POSIXlt but not POSIXct. Plus it cannot be extracted from a time using base::format
"BST" is not recognized as a valid timezone in as.POSIXct or as.POSIXlt functions.
@Koshke showed you already
the difference in internal representation of both date types, and
that internally, both timezone specifications are the same.
You can get the timezone out in a standardized manner using attr(). This will get the timezone in the form specified in the zone.tab file, which is used by R to define the timezones (More info in ?timezones ).
eg :
> attr(time1,"tzone")
[1] "Europe/London"
> attr(time2,"tzone")
[1] "Europe/London"
I am quite amazed though that POSIXct uses different indications for the timezones than POSIXlt, whereas the attributes are equal. Apparently, this "BST" only pops up when the POSIXct is printed. Before it gets printed, POSIXct gets converted again to POSIXlt, and the tzone attribute gets amended with synonyms :
> attr(as.POSIXlt(time2),"tzone")
[1] "Europe/London" "GMT" "BST"
This happens somewhere downstream of the internal R function as.POSIXlt, which I'm not able to look at for the moment due to more acute problems to solve. But feel free to go through it and see what exactly is going on there.
On a sidenote, "BST" is not recognized as a timezone (and it is not mentioned in zone.tab either) on my Windows 7 / R 2.13.0 install.
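A quick way to check a zone name before using it is to look it up in the list R reports via OlsonNames(). The helper name `is_known_tz` is made up for this sketch, and the result depends on the time zone database installed on the machine.

```r
# Hypothetical helper: TRUE if the name is in R's list of known zone names.
is_known_tz <- function(tz) {
  tz %in% OlsonNames()
}

is_known_tz("Europe/London")  # a region/city name from the tz database
is_known_tz("BST")            # an abbreviation, which the printed %Z uses
is_known_tz("Not/A_Zone")     # an obviously invalid name
```

Abbreviations such as "BST" describe the offset in force at a given moment and are generally output-only; the value to pass as `tz` is the region name stored in the "tzone" attribute.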
Perhaps unclassing the objects helps you to inspect the differences:
> unclass(time1)
$sec
[1] 0
$min
[1] 0
... snip
$yday
[1] 181
$isdst
[1] 1
attr(,"tzone")
[1] "Europe/London"
> unclass(time2)
[1] 1277996400
attr(,"tzone")
[1] "Europe/London"
Thus, the POSIXlt contains the date as a list of components, while the POSIXct contains it as a single number, i.e., UNIX epoch time.
As for the timezone, it would be beyond the scope of R.
See the explanation in http://en.wikipedia.org/wiki/Tz_database
As for the different behavior of
as.POSIXct("2010-07-01 16:00:00", tz="BST")
as.POSIXlt("2010-07-01 16:00:00", tz="BST")
I suspect there is a bug in as.POSIXct, which does not process the tz argument.
1/ What is the difference between POSIXct and POSIXlt formats
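POSIXct is the number of seconds since the epoch (1970-01-01 00:00:00 UTC), stored as a single number
POSIXlt stores the date-time as a list of components (sec, min, hour, mday, mon, year, wday, yday, isdst), which is convenient for extracting individual fields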
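The difference can be seen directly in a short sketch that reuses the date from the question. The variable names `ct` and `lt` are illustrative.

```r
ct <- as.POSIXct("2010-07-01 16:00:00", tz = "Europe/London")
lt <- as.POSIXlt(ct)

# POSIXct: one number, seconds since 1970-01-01 00:00:00 UTC.
as.numeric(ct)   # 1277996400

# POSIXlt: named components; year counts from 1900 and mon from 0.
lt$hour          # 16
lt$year + 1900   # 2010
lt$mon + 1       # 7

# The zone name lives in the "tzone" attribute of the POSIXct object,
# while %Z prints the abbreviation in force on that date.
attr(ct, "tzone")  # "Europe/London"
format(ct, "%Z")   # "BST"
```

POSIXct is usually the better choice for data frame columns and arithmetic, while POSIXlt is handy when individual fields are needed.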