Eliminate timestamp timezone offset in lattice xyplot - r

When using lattice to plot values against hourly timestamps, I've found an annoying timezone shift from UTC to local time in the graph's x-axis labels. Though this example uses lubridate, the issue also occurs when using POSIXct directly. For example:
library(lattice)
library(lubridate)
foo <- data.frame(t = seq(ymd_hms("2015-01-01 00:00:00"),
ymd_hms("2015-01-02 00:00:00"),
by = "hour"),
y = 1:25)
head(foo)
xyplot(y~t, foo) # time axis is behind by 5 hours (EST = UTC-5)
One solution is to specify the timezone explicitly:
tz(foo$t) <- "" # or tz(foo$t) <- "EST"
head(foo)
xyplot(y~t, foo) # time axis now agrees
Are there other ways to get lattice to plot directly in UTC without modifying the timezone of the data? Perhaps using the scales = list(x = list(format = ...)) argument? I can imagine situations where it would be bad to change the data's timezone, specifically when dealing with daylight saving events.

A brute force approach would be to temporarily overwrite your system's timezone with the timezone from your data, and reset it to its previous value once the plot has been created:
tzn = Sys.getenv("TZ")
Sys.setenv(TZ = tz(foo$t))
xyplot(
y ~ t
, data = foo
)
Sys.setenv(TZ = tzn)
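The save/override/restore pattern above can be wrapped in a function so that the previous TZ value is restored even if the plotting call fails. This is only a sketch in base R; `with_tz_env` is a hypothetical helper name, not something provided by lattice or lubridate.

```r
# Hypothetical helper (not from lattice or lubridate): evaluate `expr`
# while the TZ environment variable is temporarily set to `tz`.
with_tz_env <- function(tz, expr) {
  old <- Sys.getenv("TZ", unset = NA)
  # Restore the previous state on exit, even if `expr` raises an error.
  on.exit(if (is.na(old)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old))
  Sys.setenv(TZ = tz)
  force(expr)  # lazy evaluation: `expr` runs only now, under the new TZ
}

stamp <- as.POSIXct("2015-01-01 06:00:00", tz = "UTC")
# tz = "" means "the current time zone", which is UTC inside the helper.
utc_label <- with_tz_env("UTC", format(stamp, tz = ""))
print(utc_label)
```

With the question's data the call would presumably look like `with_tz_env(tz(foo$t), print(xyplot(y ~ t, foo)))`; the explicit `print()` is there so that the plot is drawn while the override is still active.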

This is a bug; the process of combining per-panel limits into an overall limit (even if there's only one panel) loses attributes, including the tz attribute.
p <- xyplot(y ~ t, foo)
str(attributes(foo$t))
# List of 2
# $ class: chr [1:2] "POSIXct" "POSIXt"
# $ tzone: chr "UTC"
str(attributes(p$x.limits))
# List of 1
# $ class: chr [1:2] "POSIXct" "POSIXt"
Fixed by setting
attributes(p$x.limits)$tzone <- "UTC"
Other workarounds are
xyplot(y ~ t, foo, xlim = extendrange(foo$t))
xyplot(y ~ t, foo, scales = "free")
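To see why a lost tzone attribute shifts the labels, here is a small base R sketch that removes the attribute by hand (it does not reproduce lattice's internals, it only mimics the effect described above). The instant stays the same; only the time zone used for printing changes.

```r
x <- as.POSIXct("2015-01-01 12:00:00", tz = "UTC")
y <- x
attr(y, "tzone") <- NULL   # mimic the attribute loss described above

x == y                     # TRUE: both still denote the same instant
format(x, usetz = TRUE)    # labelled in UTC
format(y, usetz = TRUE)    # labelled in the session's local time zone

attr(y, "tzone") <- "UTC"  # restoring the attribute restores UTC labels
identical(format(x), format(y))
```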

Related

Use of as.POSIXct and understanding timezone in time series- R

I am looking to set x-axis limits on a rather simple time series plot in R.
My plot produces limits that are 6 hours ahead of my time zone (plot would start and end at 14:00:00 in the example below).
I am currently in "America/Denver".
My data was previously plotted so that everything was shifted 6 hours back but I was able to align that properly on the x-axis, but now the bounds/limits of the x-axis are still a problem.
date_format <- function(format = "%b %d - %H:%M") {function(x) format(x, format)}
lims <- as.POSIXct(strptime(c("2021-05-04 08:00:00","2021-05-08 08:00:00"), format = "%Y-%m-%d %H:%M"))
combo_ch1short <- ggplot(data = data_combo_ch1short, aes(x = DateTime, y = Z.kOhm, color = probe.pair.name)) +
scale_x_datetime(labels = date_format(), limits = lims, date_breaks = "12 hours") + ...
Sorry, pretty new to this. Any help is GREATLY appreciated!
Edit:
data_combo_ch1short:
Time probe.pair.name DateTime Z.kOhm
1617890878 ch_1_ch_2 2021-04-12 17:52:32 5228.69
1617890878 ch_1_ch_3 2021-04-12 17:52:32 5031.88
1618251752 ch_1_ch_2 2021-04-12 18:22:32 4089.37
1618251752 ch_1_ch_3 2021-04-12 18:22:32 4231.90
...
You can create lims in any timezone by specifying it in the tz argument:
lims <- as.POSIXct(c("2021-05-04 08:00:00","2021-05-08 08:00:00"), tz = 'US/Mountain')
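A quick way to check what the tz argument does to a limit is to parse the same wall-clock string in two zones and compare the resulting instants. This sketch uses "America/Denver" (the zone named in the question) and assumes the system's time zone database knows that name.

```r
wall_clock <- "2021-05-04 08:00:00"
lim_denver <- as.POSIXct(wall_clock, tz = "America/Denver")
lim_utc    <- as.POSIXct(wall_clock, tz = "UTC")

# Offset between the two readings of the same wall-clock string, in hours.
as.numeric(difftime(lim_denver, lim_utc, units = "hours"))  # 6

# The Denver limit expressed in UTC: this is the 14:00 seen in the plot.
format(lim_denver, tz = "UTC", usetz = TRUE)
```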

Dynamically Change as.POSIXlt Value

In R, I am trying to read a file that has a timestamp, and update the timestamp based on the condition of another field. The below code works with no problem:
t <- data.frame(user = as.character(c("bshelton#email1.com", "lwong#email1.com")),
last_update = rep(as.POSIXlt(Sys.time(), tz = "America/Los_Angeles"), 2))
Sys.sleep(5)
t$last_update <- as.POSIXlt(ifelse(t$user == "bshelton#email1.com", Sys.time(), t$last_update), origin = "1970-01-01")
print(t)
The problem is when I read an existing file and try to dynamically change an as.POSIXlt value. The following code is producing the error that accompanies it in the code block afterwards:
t <- data.frame(user = as.character(c("bshelton#email1.com", "lwong2#email1.com")),
last_update = rep(as.POSIXlt(Sys.time(), tz = "America/Los_Angeles"), 2))
write.csv(t, "so_question.csv", row.names = FALSE)
t <- read.csv("so_question.csv")
t$last_update <- as.POSIXlt(t$last_update)
Sys.sleep(5)
t$last_update <- as.POSIXlt(ifelse(t$user == "bshelton#email1.com", Sys.time(), t$last_update), origin = "1970-01-01")
Error in as.POSIXlt.default(ifelse(t$user == "bshelton#email1.com", Sys.time(), :
do not know how to convert 'ifelse(t$user == "bshelton#email1.com", Sys.time(), t$last_update)' to class “POSIXlt”
In addition: Warning message:
In ans[!test & ok] <- rep(no, length.out = length(ans))[!test & :
number of items to replace is not a multiple of replacement length
The first case works only because you don't have what you think you have: those datetimes are in fact POSIXct, not POSIXlt:
last_update <- rep(as.POSIXlt(Sys.time(), tz = "America/Los_Angeles"), 2)
str(last_update)
#> POSIXlt[1:2], format: "2019-07-28 20:52:10" "2019-07-28 20:52:10"
t <- data.frame(user = as.character(c("bshelton#email1.com", "lwong#email1.com")),
last_update = last_update)
str(t)
#> 'data.frame': 2 obs. of 2 variables:
#> $ user : Factor w/ 2 levels "bshelton#email1.com",..: 1 2
#> $ last_update: POSIXct, format: "2019-07-28 20:52:10" "2019-07-28 20:52:10"
If you dig into ?data.frame, it says
data.frame converts each of its arguments to a data frame by calling as.data.frame(optional = TRUE). As that is a generic function, methods can be written to change the behaviour of arguments according to their classes: R comes with many such methods. Character variables passed to data.frame are converted to factor columns unless protected by I or argument stringsAsFactors is false. If a list or data frame or matrix is passed to data.frame it is as if each component or column had been passed as a separate argument (except for matrices protected by I).
This is what's happening: as.data.frame.POSIXlt in fact converts to POSIXct:
now <- Sys.time()
str(now)
#> POSIXct[1:1], format: "2019-07-28 22:50:12"
str(data.frame(time = now))
#> 'data.frame': 1 obs. of 1 variable:
#> $ time: POSIXct, format: "2019-07-28 22:50:12"
as.data.frame.POSIXlt
#> function (x, row.names = NULL, optional = FALSE, ...)
#> {
#> value <- as.data.frame.POSIXct(as.POSIXct(x), row.names,
#> optional, ...)
#> if (!optional)
#> names(value) <- deparse(substitute(x))[[1L]]
#> value
#> }
#> <bytecode: 0x7fc938a11060>
#> <environment: namespace:base>
More immediately, since Sys.time() returns a POSIXct object, ifelse(t$user == "bshelton#email1.com", Sys.time(), t$last_update) in the second case is getting a POSIXct object for one observation and POSIXlt for the other. The POSIXlt object's class attribute is dropped by ifelse revealing the list underneath, which ifelse then doesn't know how to turn into a vector together with the unclassed POSIXct object (which is just a number).
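The attribute stripping is easy to observe in isolation. A minimal base R sketch, using only POSIXct values so that it runs without error:

```r
now <- Sys.time()
class(now)                        # "POSIXct" "POSIXt"

picked <- ifelse(TRUE, now, now)  # ifelse() keeps the value, not the class
class(picked)                     # "numeric"

# The number is seconds since the epoch, so it can be converted back.
restored <- as.POSIXct(picked, origin = "1970-01-01")
class(restored)
```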
The solution here, then, is to follow the hint data.frame is giving you and use POSIXct instead of POSIXlt.
If you really want to make it work with POSIXlt, you can iterate over the conditions and POSIXlt vector with Map with if/else (which maintain attributes including class, but only handle scalar conditions) and coerce the resulting list back to a vector with do.call(c, ...):
t <- data.frame(user = as.character(c("bshelton#email1.com", "lwong#email1.com")),
last_update = rep(as.POSIXlt(Sys.time(), tz = "America/Los_Angeles"), 2))
t$last_update <- as.POSIXlt(t$last_update)
t$last_update <- do.call(c, Map(
function(condition, last_update){
if (condition) {
as.POSIXlt(Sys.time() + 5)
} else {
last_update
}
},
condition = t$user == "bshelton#email1.com",
last_update = t$last_update
))
t
#> user last_update
#> 1 bshelton#email1.com 2019-07-28 23:11:04
#> 2 lwong#email1.com 2019-07-28 23:10:59
...but frankly that's a little silly. Just use POSIXct instead, and your life will be better.
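For completeness, one way the POSIXct route might look is sketched below. It swaps ifelse for logical-index assignment, which keeps the class and time zone of the column; the data frame name `users` and the fixed starting timestamp are illustrative assumptions.

```r
users <- data.frame(
  user = c("bshelton#email1.com", "lwong#email1.com"),
  last_update = rep(as.POSIXct("2019-07-28 20:52:10",
                               tz = "America/Los_Angeles"), 2),
  stringsAsFactors = FALSE
)

# Update only the matching rows; `[<-` on POSIXct preserves the class.
is_target <- users$user == "bshelton#email1.com"
users$last_update[is_target] <- Sys.time()

class(users$last_update)
print(users)
```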

invalid 'tz' value, problems with time zone

I'm working with minute data of NASDAQ, it has the index "2015-07-13 12:05:00 EST". I adjusted the system time with Sys.setenv(TZ = 'EST').
I want to program a simple buy/hold/sell strategy, therefore I create a vector of flat positions as a foundation.
pos_flat <- xts(rep(0, nrow(NASDAQ)), index(NASDAQ))
Then I want to apply a constraint, that in a certain time window, positions are bound to be flat, which in my case means equal to 1.
pos_flat["T13:41/T14:00"] <- 1
And this returns the error:
"Error in as.POSIXlt.POSIXct(.POSIXct(.index(x)), tz = indexTZ(x)) :invalid 'tz' value".
I also get this error doing other calculations, I just used this example because it is easy and shows the problem.
As extra information:
> Sys.timezone
function (location = TRUE)
{
tz <- Sys.getenv("TZ", names = FALSE)
if (nzchar(tz))
return(tz)
if (location)
return(.Internal(tzone_name()))
z <- as.POSIXlt(Sys.time())
zz <- attr(z, "tzone")
if (length(zz) == 3L)
zz[2L + z$isdst]
else zz[1L]
}
<bytecode: 0x03648ff4>
<environment: namespace:base>
I don't understand the problem with the tz value... Any ideas?
The "invalid 'tz' value" error arises because tz must be a single character string, so R rejects a whole column such as tz = df$var. If you set tz = 'America/New_York' or some other single character value, it will work.
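The error is easy to reproduce in base R by passing more than one zone name to tz. A short sketch (the timestamp and zone names are made up for illustration):

```r
stamp <- as.POSIXct("2015-07-13 12:05:00", tz = "UTC")
zones <- c("America/New_York", "America/Chicago")

# A vector of zones is rejected...
msg <- tryCatch(
  format(stamp, tz = zones),
  error = function(e) conditionMessage(e)
)
print(msg)

# ...while a single zone name works.
format(stamp, tz = zones[1], usetz = TRUE)
```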
A better approach than the force_tz solution below for converting UTC times to various timezones based on location: subset the data and set tz per subset, based on a timezone column (which my data already has; if yours does not, you can create it). It is also simpler and better than looping through or using a nested ifelse. Just make sure you account for all timezones in your data, which you can list with unique(df$timezone):
df$datetime2[df$timezone == 'America/New_York'] <- format(df$datetime, tz="America/New_York")[df$timezone == 'America/New_York']
df$datetime2[df$timezone == 'America/Chicago'] <- format(df$datetime, tz="America/Chicago")[df$timezone == 'America/Chicago']
df$datetime2[df$timezone == 'America/Denver'] <- format(df$datetime, tz="America/Denver")[df$timezone == 'America/Denver']
df$datetime2[df$timezone == 'America/Los_Angeles'] <- format(df$datetime, tz="America/Los_Angeles")[df$timezone == 'America/Los_Angeles']
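When there are many zones, the four near-identical lines above could be generated from unique(df$timezone) instead of being written out. A sketch under the assumption that df has a POSIXct column `datetime` and a character column `timezone`, as in the answer; the two rows of data are invented for illustration.

```r
df <- data.frame(
  datetime = as.POSIXct(c("2015-12-12 13:34:56", "2015-12-14 16:23:32"),
                        tz = "UTC"),
  timezone = c("America/Los_Angeles", "America/New_York"),
  stringsAsFactors = FALSE
)

df$datetime2 <- NA_character_
for (zone in unique(df$timezone)) {
  rows <- df$timezone == zone
  # One format() call per zone, so tz is always a single string.
  df$datetime2[rows] <- format(df$datetime[rows], tz = zone)
}
print(df)
```

The result is a character column, as in the original lines, because a single POSIXct vector cannot carry a different time zone per element.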
Previous solution: Converting to Local Time in R - Vector of Timezones
require(lubridate)
require(dplyr)
df = data.frame(timestring = c("2015-12-12 13:34:56", "2015-12-14 16:23:32"), localzone = c("America/Los_Angeles", "America/New_York"), stringsAsFactors = F)
df$moment = as.POSIXct(df$timestring, format="%Y-%m-%d %H:%M:%S", tz="UTC")
df = df %>% rowwise() %>% mutate(localtime = force_tz(moment, localzone))
df
You are getting errors because "EST" is not a valid timezone specification. It's an abbreviation that's often used when printing and displaying timezones.
The index is printed as "2015-07-13 12:05:00 EST" because "EST" probably represents Eastern Standard Time in the United States. If you want to set the TZ environment variable to that timezone, you should use Sys.setenv() with Country/City notation:
Sys.setenv(TZ = "America/New_York")
You can also set the timezone in the xts constructor:
pos_flat <- xts(rep(0, nrow(NASDAQ)), index(NASDAQ), tzone = "America/New_York")
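Before setting TZ or tzone, it can help to check whether a name is known to the time zone database R is using. A small base R sketch; `is_known_tz` is a hypothetical helper name, and whether abbreviations such as "EST" are listed depends on the platform's database.

```r
# Hypothetical helper: TRUE if `zone` is listed in this system's tz database.
is_known_tz <- function(zone) zone %in% OlsonNames()

is_known_tz("America/New_York")
is_known_tz("Not/A_Zone")
```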
Your error occurs because of a misinterpretation of the time object. You need to have UNIX timestamps in order to use something like
pos_flat["T13:41/T14:00"] <- 1
Try a conversion of your indices by doing something like this:
index(NASDAQ) <- as.POSIXct(strptime(index(NASDAQ), "%Y-%m-%d %H:%M:%S"))
As you want to use EST, you have to change your environment variables (if you are not living in EST timezone). So all in all, this should work:
Sys.setenv(TZ = 'EST')
#load stuff
#...
index(NASDAQ) <- as.POSIXct(strptime(index(NASDAQ), "%Y-%m-%d %H:%M:%S"))
pos_flat <- xts(rep(0, nrow(NASDAQ)), index(NASDAQ))
pos_flat["T13:41/T14:00"] <- 1
For further information, have a look at the POSIXct and POSIXlt structures in R.
Best regards

transform "mFilter" object (list of Time-Series) to plot with ggplot2

I'm working with the hpfilter from the mFilter package and I can't seem to find a simple way to convert the list of Time-Series objects by hpfilter to a format I can use with ggplot2. I realize I can take it all apart and put it back together, but I imagine there's some simple way I have overlooked? I tried the code suggested in the SO discussion R list to data frame. However I couldn't find a way to convert the list of Time-Series objects to a data.frame in any simple way. The final goal is to reproduce the default plot produced by the mFilter object (see below)
Here's some example code
# install.packages(c("mFilter"), dependencies = TRUE)
library(mFilter)
data(unemp)
unemp.hp <- hpfilter(unemp, type=c("lambda"), freq = 1606)
# str(unemp.hp)
class(unemp.hp)
# [1] "mFilter"
plot(unemp.hp)
Hit <Return> to see next plot:
Also, why am I asked to " Hit <Return>" to see the plot?
plot() dispatches to plot.mFilter, whose ask parameter defaults to interactive() and is therefore TRUE in interactive sessions. You can disable the prompt by passing ask=FALSE in the call to plot:
plot(unemp.hp,ask=FALSE)
Data:
library(mFilter)
library(ggplot2)
library(gridExtra)
# library(zoo)
data(unemp)
unemp.hp <- hpfilter(unemp, type=c("lambda"), freq = 1606)
# str(unemp.hp)
class(unemp.hp)
# [1] "mFilter"
plot(unemp.hp,ask=FALSE)
To list the components of the object unemp.hp:
names(unemp.hp)
# [1] "cycle" "trend" "fmatrix" "title" "xname" "call" "type" "lambda" "method"
#[10] "x"
The relevant objects are x (the main unemp series), trend and cycle. All three are of class ts, so we first convert them to data.frames using a custom function and then plot them with ggplot and gridExtra (for grid.arrange):
objectList = list(unemp.hp$x,unemp.hp$trend,unemp.hp$cycle)
names(objectList) = c("unemp","trend","cycle")
sapply(objectList,class)
#unemp trend cycle
# "ts" "ts" "ts"
Conversion from ts to data.frame:
fn_ts_to_DF = function(x) {
DF = data.frame(date=zoo::as.Date(time(objectList[[x]])),tseries=as.matrix(objectList[[x]]))
colnames(DF)[2]=names(objectList)[x]
return(DF)
}
DFList=lapply(seq_along(objectList),fn_ts_to_DF)
names(DFList) = c("unemp","trend","cycle")
seriesTrend = merge(DFList$unemp,DFList$trend,by="date")
cycleSeries = DFList$cycle
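If zoo is not available, the ts-to-data.frame step can also be done in base R by keeping the time index as a decimal year rather than a Date. A sketch with a made-up quarterly series; `ts_to_df` is a hypothetical helper, not part of mFilter.

```r
# Small made-up quarterly series, used only for illustration.
x <- ts(c(5.1, 5.3, 5.0, 4.8), start = c(2000, 1), frequency = 4)

# Hypothetical helper: one row per observation, time as a decimal year.
ts_to_df <- function(series, value_name = "value") {
  out <- data.frame(time = as.numeric(time(series)),
                    value = as.numeric(series))
  names(out)[2] <- value_name
  out
}

unemp_df <- ts_to_df(x, "unemp")
print(unemp_df)
```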
Plots:
# melt() comes from the reshape2 package, which is not loaded above
gSeries = ggplot(reshape2::melt(seriesTrend,"date"),aes(x=date,y=value,color=variable)) + geom_line() +
ggtitle('Hodrick-Prescott Filter for unemp') +
theme(legend.title = element_blank(),legend.justification = c(0.1, 0.8), legend.position = c(0, 1),
legend.direction = "horizontal",legend.background = element_rect(fill="transparent",size=.5, linetype="dotted"))
gCycle = ggplot(cycleSeries,aes(x=date,y=cycle)) + geom_line(color="#619CFF") + ggtitle("Cyclical component (deviations from trend)")
gComb = grid.arrange(gSeries,gCycle,nrow=2)
I tried the prior answer, but it didn't work for me.
I was extracting the trend and cycle from a quarterly GDP series.
The data was a time series, so I did the following, which worked for me:
series <- list(gdp_ln$x, gdp_ln$trend, gdp_ln$cycle) # not named `list`, to avoid masking base::list
names(series) <- c("gdp","trend","cycle")
gdp <- data.frame(sapply(series, c))
Data:
> dput(gdp_ln)
structure(c(16.0275785360442, 16.0477176062761, 16.0718936895007,
16.0899963371452, 16.0875707712141, 16.0981391378223, 16.0988601288276,
16.1110815092797, 16.1244321329861, 16.1384685077996, 16.1451472350838,
16.148178781735, 16.161163569502, 16.1418894206861, 16.1634877625667,
16.1965372621761, 16.2216815829736, 16.2387677536829, 16.249412380526,
16.2690521777631, 16.2812185880068, 16.2951024427095, 16.2964024092233,
16.3127733881018, 16.3233290487177, 16.3369922768377, 16.3486515031696,
16.3489275708763, 16.3451264371757, 16.3524856433069, 16.3666338513045,
16.3801691039135, 16.3959993202765, 16.4135937981601, 16.4321203154987,
16.4488104165345, 16.4344524213544, 16.4302554348621, 16.4240722287677,
16.425087582257, 16.4350803035092, 16.4507216431126, 16.4670532627455,
16.4985227751756, 16.5094864456079, 16.5352746165004, 16.5504689966469,
16.5594976247513, 16.5754312535087, 16.592641573353, 16.6003340665324,
16.6063100774853, 16.6163655606058, 16.6370227688187, 16.6564363783854,
16.6577160570216, 16.6543595214556, 16.6773721241902, 16.6911082706925,
16.6935398489076, 16.6956102943815, 16.6798673418354, 16.6772670544553,
16.6678707780266, 16.6606889172344, 16.6678398460835, 16.6668473810049,
16.676020524389, 16.6775934319312, 16.6882821147755, 16.6957985899994,
16.7032334217472, 16.6926036544774, 16.7027214366522, 16.7103625977254,
16.7105344224572, 16.7042504851486, 16.7063913529457, 16.7100598555556,
16.6960591147037, 16.686477079594, 16.5740423808036, 16.6181175035946
), .Tsp = c(2000, 2020.5, 4), class = "ts")

Why read.zoo gives index as dates when times are available

I'm trying to understand my difficulties in the past with inputting zoo objects. The following two uses of read.zoo give different results despite the default argument for tz supposedly being "" and that is the only difference between the two read.zoo calls:
Lines <- "2013-11-25 12:41:21 2
2013-11-25 12:41:22.25 2
2013-11-25 12:41:22.75 75
2013-11-25 12:41:24.22 3
2013-11-25 12:41:25.22 1
2013-11-25 12:41:26.22 1"
library(zoo)
z <- read.zoo(text = Lines, index = 1:2)
> dput(z)
structure(c(2L, 2L, 75L, 3L, 1L, 1L), index = structure(c(16034,
16034, 16034, 16034, 16034, 16034), class = "Date"), class = "zoo")
z <- read.zoo(text = Lines, index = 1:2, tz="")
> dput(z)
structure(c(2L, 2L, 75L, 3L, 1L, 1L), index = structure(c(1385412081,
1385412082.25, 1385412082.75, 1385412084.22, 1385412085.22, 1385412086.22
), class = c("POSIXct", "POSIXt"), tzone = ""), class = "zoo")
>
The answer (of course) is in the sources for read.zoo(), wherein there is:
....
ix <- if (missing(format) || is.null(format)) {
if (missing(tz) || is.null(tz))
processFUN(ix)
else processFUN(ix, tz = tz)
}
else {
if (missing(tz) || is.null(tz))
processFUN(ix, format = format)
else processFUN(ix, format = format, tz = tz)
}
....
Even though the default for tz is "", in your first case tz is considered missing (by missing()) and hence processFUN(ix) is used. When you set tz = "", it is no longer missing and hence you get processFUN(ix, tz = tz).
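The distinction between "argument left at its default" and "argument explicitly set to the default value" can be seen with a toy function. This is a sketch of the mechanism only; `describe_tz` is a made-up name and is not the read.zoo code.

```r
# Toy function with the same shape of test as in the read.zoo excerpt.
describe_tz <- function(x, tz = "") {
  if (missing(tz) || is.null(tz)) {
    "tz treated as missing"
  } else {
    "tz supplied"
  }
}

describe_tz(1)           # "tz treated as missing"
describe_tz(1, tz = "")  # "tz supplied", although the value equals the default
```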
Without looking at the details of read.zoo() this could possibly be handled better by having tz = NULL or tz (no default) in the arguments and then in the code, if tz needs to be set to "" for some reason, do:
if (missing(tz) || is.null(tz)) {
tz <- ""
}
or perhaps this is not even needed if all that is required is to avoid the confusion about the two different calls?
Effectively, the default index class is "Date" unless tz is used in which case the default is "POSIXct". Thus the first example in the question gives "Date" class since that is the default and the second "POSIXct" since tz was specified.
If you want to specify the class without making use of these defaults then to be explicit use the FUN argument:
read.zoo(...whatever..., FUN = as.Date)
read.zoo(...whatever..., FUN = as.POSIXct) # might need FUN=paste,FUN2=as.POSIXct
read.zoo(...whatever..., FUN = as.yearmon)
# etc.
The FUN argument can also take a custom function as shown in the examples in the package.
Note that it always assumes standard formats (e.g. "%Y-%m-%d" in the case of "Date" class) if no format is specified and never tries to automatically determine the format.
The way it works is explained in detail in ?read.zoo, and there are many examples in ?read.zoo (there are 78 lines of code in the examples section) as well as in an entire vignette (one of six vignettes) dedicated just to read.zoo: "Reading Data in zoo".
Added: I have expanded the above. Also, in the development version of zoo the heuristic has been improved, and with that improvement the first example in the question does recognize the date/times and chooses POSIXct. Some clarification of the simple heuristic has also been added to the read.zoo help file so that the many examples provided do not have to be relied upon as much.
Here are some examples. Note that the heuristic referred to is a heuristic to determine the class of the time index only. It can only identify "numeric", "Date" and "POSIXct" classes. The heuristic cannot identify other classes (although you can specify them yourself using FUN=). Also the heuristic does not identify formats. If the format is not provided using format= or implicitly through FUN= then standard format is assumed, e.g. "%Y-%m-%d" in the case of "Date".
Lines <- "2013-11-25 12:41:21 2
2013-12-25 12:41:22.25 3
2013-12-26 12:41:22.75 8"
# explicit. Uses POSIXct.
z <- read.zoo(text = Lines, index = 1:2, FUN = paste, FUN2 = as.POSIXct)
# tz implies POSIXct
z <- read.zoo(text = Lines, index = 1:2, tz = "")
# heuristic: Date now; devel ver uses POSIXct
z <- read.zoo(text = Lines, index = 1:2)
Lines <- "2013-11-25 2
2013-12-25 3
2013-12-26 8"
z <- read.zoo(text = Lines, FUN = as.Date) # explicit. Uses Date.
z <- read.zoo(text = Lines, format = "%Y-%m-%d") # format & no tz implies Date
z <- read.zoo(text = Lines) # heuristic: Date
Note:
(1) In general, it's safer to be explicit by using FUN or by using tz and/or format as opposed to relying on the heuristic. If you are explicit by using FUN or semi-explicit by using tz and/or format then there is no change between the current and the development versions of read.zoo.
(2) It's safer to rely on the documentation rather than the internals, as the internals can change without warning and in fact have changed in the development version. If you really want to look at the code despite this, then the key statement that selects the class of the index if FUN is not explicitly defined is the if (is.null(FUN)) ... statement in the read.zoo source.
(3) I recommend using read.zoo as being easier, direct and compact rather than workarounds such as read.table followed by zoo. I have been using read.zoo for years as have many others and it seems pretty solid to me but if anyone finds specific problems with read.zoo or with the documentation (always possible since there is quite a bit of it) they can always be reported. Even though the package has been around for years improvements are still being made.
I suspect your use of read.zoo tripped you up. Here is what I did:
library(zoo)
tt <- read.table(text=Lines)
z <- zoo(as.integer(tt[,3]), order.by=as.POSIXct(paste(tt[,1], tt[,2])))
Now z is a proper zoo object:
R> z
2013-11-25 12:41:21.00 2013-11-25 12:41:22.25 2013-11-25 12:41:22.75
2 2 75
2013-11-25 12:41:24.22 2013-11-25 12:41:25.22 2013-11-25 12:41:26.22
3 1 1
R> class(z)
[1] "zoo"
R> class(index(z))
[1] "POSIXct" "POSIXt"
R>
And by making sure I used a POSIXct object for the index, I am in fact getting a POSIXct object back.
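The index-building step can be checked on its own in base R, without zoo, to confirm that pasting the date and time columns and calling as.POSIXct keeps the fractional seconds. A sketch using the first three rows of the question's data; tz = "UTC" is an assumption made here so that the result does not depend on the local time zone.

```r
Lines <- "2013-11-25 12:41:21 2
2013-11-25 12:41:22.25 2
2013-11-25 12:41:22.75 75"

tt <- read.table(text = Lines, stringsAsFactors = FALSE)

# Columns 1 and 2 hold the date and the time; join them before parsing.
stamps <- as.POSIXct(paste(tt[[1]], tt[[2]]), tz = "UTC")

class(stamps)
# Gaps between consecutive stamps, in seconds.
as.numeric(diff(stamps), units = "secs")
```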
