Trying to use ggplotly to graph time series data with a vertical line to indicate dates of interest.
Call fails with Error in Ops.Date(z[[xy]], 86400000) : * not defined for "Date" objects. I have tried unsuccessfully using both the latest CRAN and development versions of ggplot2 (as per plotly recommendation). Other SO questions (e.g., ggplotly and geom_bar when using dates - latest version of plotly (4.7.0)) do not address my concerns.
As illustrated below with plot object p - both ggplot and ggplotly work as expected. However, when a geom_vline() is added to the plot in p2, it only works correctly in ggplot, failing when calling ggplotly(p2).
library(plotly)
library(ggplot2)
library(magrittr)
set.seed(1)
df <- data.frame(date = seq(from = lubridate::ymd("2019-01-01"), by = 1, length.out = 10),
y = rnorm(10))
p <- df %>%
ggplot(aes(x = date, y = y)) +
geom_line()
p ## plots as expected
ggplotly(p) ## plots as expected
p2 <- p + geom_vline(xintercept = lubridate::ymd("2019-01-08"), linetype = "dashed")
p2 ## plots as expected
ggplotly(p2) ##fails
I just solved this using @Axeman's suggestion. In your case, you can just wrap the date passed to xintercept in as.numeric():
lubridate::ymd("2019-01-08")
becomes
as.numeric(lubridate::ymd("2019-01-08"))
Not pretty, but it works.
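For context, a Date in R is stored as a count of days since 1970-01-01, and the error message suggests that ggplotly() tries to multiply the intercept by 86400000 (milliseconds per day), which is not defined for Date objects. A minimal base-R sketch of what the conversion does (the variable names are illustrative only, not part of plotly):

```r
# A Date is a number of days since 1970-01-01 with a class attribute on top.
d <- as.Date("2019-01-08")
days <- as.numeric(d)      # strips the Date class, leaving the plain day count
days
#> [1] 17904

# Arithmetic that is undefined for Date objects works on the plain number.
ms <- days * 86400000      # milliseconds since the epoch

# The conversion is lossless: the day count maps back to the same date.
as.Date(days, origin = "1970-01-01")
#> [1] "2019-01-08"
```

This is why the plain number can be handed to geom_vline() without changing where the line is drawn.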
For future reference:
The pop-up window for vertical lines created via date (or POSIX*) to numeric conversions is rather blank. This matters particularly for POSIX* applications, where the exact time often cannot be read off directly.
In case you need more meaningful pop-up content, defining a text aesthetic can help (just ignore the 'unknown aesthetics' warning, as it doesn't seem to apply). Then simply specify what you want to see on mouse hover via the tooltip argument, i.e. leave out xintercept, and you're all set.
p2 = p +
geom_vline(
aes(
xintercept = as.numeric(lubridate::ymd("2019-01-08"))
, text = "date: 2019-01-08"
)
, linetype = "dashed"
)
ggplotly(p2, tooltip = c("x", "y", "text"))
I am trying to add lines for confidence intervals in R but lines() isn't working. In the following code b is a dataframe, 100 observations of 2 variables 'pred' and 'se'.
plot(c(1:300),b$pred,type="l",lwd=1.5)
lines(c(1:300),b$pred+2*b$se,type="l",lty=2,col='red')
The first line is working but the second is not. I have tried it with and without the x values (plot works with or without, lines works for neither). I can get lines to work for different dataframes, but not this one.
It seems very fragile to me to use 1:300 when also referencing b; it might work when b has 300 rows, but any other time it's going to either complain with warnings or recycle silently and show a misleading/meaningless plot. In general, "never" use hard-coded numbers when working programmatically like this; seq_len(nrow(b)) is a safer choice than 1:300.
The bounds (x/y limits) for the plot are defined by the first plot command. After that, in base R graphics, no other plotting command will alter the limits. This means it is highly likely that all of pred+2*se are greater than max(pred), so R does draw the lines, but they fall outside the plotting region and are clipped, so nothing appears.
For this, you need to set the limits up front, perhaps:
ylims <- with(b, range(c(pred, pred+2*se), na.rm = TRUE)) # the y range must cover both series
plot(seq_len(nrow(b)), b$pred, type="l", lwd=1.5, ylim=ylims)
lines(seq_len(nrow(b)), b$pred+2*b$se, type="l", lty=2, col='red')
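To see the fixed-limits behaviour directly, here is a small sketch with simulated data (the data frame below is a hypothetical stand-in for b, not the asker's data). It records the plot region before and after lines() and shows that nothing changed:

```r
set.seed(1)
# Hypothetical stand-in for the asker's data frame `b`
b <- data.frame(pred = rnorm(100), se = runif(100, 0.5, 1))
x <- seq_len(nrow(b))            # always matches the number of rows in b

pdf(NULL)                        # null device: nothing is written to disk
plot(x, b$pred, type = "l")
usr_before <- par("usr")         # plot region limits: c(x1, x2, y1, y2)
lines(x, b$pred + 2 * b$se, lty = 2, col = "red")
usr_after <- par("usr")
invisible(dev.off())

identical(usr_before, usr_after) # TRUE: lines() did not change the limits
```

Any part of the second series that lies above the original y range is simply clipped, which is why the limits have to be passed to plot() up front.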
That should address your question. Continue reading if you want to consider migration to ggplot2 ... not a one-for-one migration, not trivial, and perhaps premature at this point, but still something to think about.
While the above should fix the problem you cited, you might also consider migrating to ggplot2: it allows many other things (too many to discuss here), including updating the x/y limits with every "layer" you add to it. For instance, something like the following should work:
library(ggplot2)
ggplot(b, aes(x = seq_along(pred), y = pred)) +
geom_line(linewidth = 1.5) + # this is doing what your first 'plot' is doing
geom_line(aes(y = pred + 2*se), linetype = 2, color = "red") # your call to lines (lty = 2 becomes linetype = 2)
(Notice no need to handle the x/y limits manually, ggplot2 figures it out for you with each layer added.)
I'm going to infer that you'll want to add a pred - 2*se as well, in which case it'll be another call to geom_line, as in
ggplot(b, aes(x = seq_along(pred), y = pred)) +
geom_line(linewidth = 1.5) +
geom_line(aes(y = pred + 2*se), linetype = 2, color = "red") +
geom_line(aes(y = pred - 2*se), linetype = 2, color = "blue")
Note that ggplot2 would actually prefer that you handle this with "long" data ... in that case, we can do something like below:
library(dplyr)
library(tidyr) # pivot_longer
b %>%
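mutate(
x = row_number(), # b has no x column until it is created here
sehigh = pred + 2*se,
selow = pred - 2*se
) %>%
select(x, pred, sehigh, selow) %>% # drop se so it is not pivoted into a series of its own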
pivot_longer(-x, names_to = "type", values_to = "val") %>%
ggplot(aes(x, val, group = type, color = type)) +
geom_line() +
scale_color_manual(values = c(pred = "black", sehigh = "red", selow = "blue"))
In this case, only one call to geom_line, and ggplot will handle colors automatically (based on the new categorical variable type that we created in a previous step).
First post so please forgive any transgressions.
The data (russ_defensive) looks something like this
[Image: russ_defensive dataframe]
And this code is meant to create a facet_grid of stacked bars conditionally filled by capitalisation and with axis.text.x colour set to red or black based on whether the industry is defensive or not
library(dplyr)
library(ggplot2)
chart_foo <- ggplot(data = russ_defensive, aes(x = industry)) +
facet_grid(~ sector, space = "free", scales="free") +
geom_bar(stat="count") + aes(fill = capitalisation) +
theme(axis.text.x = element_text(angle = 90, color = ifelse(russ_defensive$defensive_industries == "N", "red", "black")))
Around half of the industries are non-defensive (so russ_defensive$defensive_industries is "N"); however, this code only turns one of the labels red and gives the following warning:
Warning message:
Vectorized input to `element_text()` is not officially supported.
Results may be unexpected or may change in future versions of ggplot2.
Is there a simple fix for this, or an alternative method for conditionally formatting labels based on a column of the dataset?
Thanks for any help, if a reproducible dataset would be useful please let me know.
Well, as the warning tells you it is not recommended to choose the axis text colours by using vectorised theme input (although many people try nonetheless). I believe this was also one of the motivations behind the ggtext package, in which you can use markdown to stylise your text. Below, you'll find an example with a standard dataset, I hope it translates well to yours. We just conditionally apply colouring to some of the x-axis categories.
library(ggplot2)
library(ggtext)
#> Warning: package 'ggtext' was built under R version 4.0.3
df <- transform(mtcars, car = rownames(mtcars))[1:10,]
red_cars <- sample(df$car, 5)
df$car <- ifelse(
df$car %in% red_cars,
paste0("<span style='color:#FF0000'>", df$car, "</span>"),
df$car
)
ggplot(df, aes(car, mpg)) +
geom_col() +
theme(axis.text.x = element_markdown(angle = 90))
Created on 2021-02-03 by the reprex package (v1.0.0)
For more examples, see https://github.com/wilkelab/ggtext#markdown-in-theme-elements
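As a rough sketch of how this might translate to the data in the question: the toy data frame below is hypothetical and only reuses the column names from the question (industry, sector, capitalisation, defensive_industries), and industry_label is a helper column introduced here for illustration.

```r
library(ggplot2)
library(ggtext)

# Hypothetical toy data using the column names from the question
russ_defensive <- data.frame(
  industry = c("Banks", "Utilities", "Airlines", "Food"),
  sector = c("Financials", "Utilities", "Industrials", "Staples"),
  capitalisation = c("Large", "Mid", "Small", "Large"),
  defensive_industries = c("N", "Y", "N", "Y")
)

# Wrap non-defensive industries in a red <span>; leave the rest untouched
russ_defensive$industry_label <- ifelse(
  russ_defensive$defensive_industries == "N",
  paste0("<span style='color:#FF0000'>", russ_defensive$industry, "</span>"),
  russ_defensive$industry
)

# Same chart as in the question, but the colour now lives in the label text
chart_foo <- ggplot(russ_defensive, aes(x = industry_label, fill = capitalisation)) +
  facet_grid(~ sector, space = "free", scales = "free") +
  geom_bar(stat = "count") +
  theme(axis.text.x = element_markdown(angle = 90))
```

Because the colour is carried by each label rather than by a vector in the theme, it stays attached to the right industry within each facet. Note that the added markup may change the alphabetical ordering of the axis categories.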
I'm having an issue when exporting stat_density2d plots.
ggplot(faithful, aes(eruptions, y = waiting, alpha = ..density..)) +
stat_density2d(geom = 'tile', contour = F)
When exporting as a png it looks like so:
But when I export as a PDF a grid appears:
I'm assuming that this is because the boundaries of the tiles overlap and so have equivalent of a doubled alpha value.
How can I edit just the edges of the tiles to stop this from happening?
Secondary question:
As Tjebo mentioned, geom = 'raster' would fix this problem. However, this raises a new issue: only one group gets plotted.
df <- faithful
df$new <- as.factor(ifelse(df$eruptions>3.5, 1, 0))
ggplot(df, aes(eruptions, waiting, fill = new, alpha = ..density..)) +
stat_density2d(geom = 'tile', contour = F) +
scale_fill_manual(values = c('1' = 'blue', '0' = 'red'))
ggplot(df, aes(eruptions, waiting, fill = new, alpha = ..density..)) +
stat_density2d(geom = 'raster', contour = F) +
scale_fill_manual(values = c('1' = 'blue', '0' = 'red'))
help on this second issue would also be much appreciated!
Now I decided to transform my comment into an answer instead. Hopefully it solves your problem.
There is an old, related google thread on this topic - It seems related to how the plots are vectorized in each pdf viewer.
A hack is suggested in this thread, but one solution might simply be to use geom = 'raster' instead.
library(ggplot2)
ggplot(faithful, aes(eruptions, y = waiting, alpha = ..density..)) +
stat_density2d(geom = 'raster', contour = F)
Created on 2019-08-02 by the reprex package (v0.3.0)
If you have a look at the geom_raster documentation - geom_raster is recommended if you want to export to pdf!
The most common use for rectangles is to draw a surface. You always want to use geom_raster here because it's so much faster, and produces smaller output when saving to PDF
edit - second part of the question
Your tile plot can't be correct: you are creating cut-offs (on your x value), so the fills should not overlap. This points to the core of the problem: alpha=..density.. probably calculates the density based on the entire sample, and not by group. I think the only way to go is to pre-calculate the density (e.g., using density()). In your example, for demonstration purposes, we luckily have this precalculated as faithfuld (this is likely not showing the results you really want, as it is the density of the entire sample!!).
I'd furthermore recommend not using numbers as your factor values, as this is pretty confusing for you and R. Use characters instead. Also, ideally don't use df as the name of a sample data frame, as this is a base R function ;)
Hope this helps
mydf <- faithfuld ## that is crucial!!! faithfuld contains a precalculated density column which is needed for the plot!!
mydf$new <- as.factor(ifelse(mydf$eruptions>3.5, 'large', 'small'))
ggplot(mydf, aes(eruptions, waiting)) +
geom_raster(aes(fill = new, alpha = density), interpolate = FALSE)
Created on 2019-08-02 by the reprex package (v0.3.0)
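If you do need the density per group, one possible way to pre-calculate it is sketched below. Using MASS::kde2d as the estimator is an assumption of this sketch (the answer above does not name a 2D estimator), and faithful_grp, lims and dens_by_group are illustrative names. The result has the same shape as faithfuld, plus the grouping column.

```r
faithful_grp <- faithful
faithful_grp$new <- ifelse(faithful_grp$eruptions > 3.5, "large", "small")

# Shared limits so that both groups are evaluated on the same grid
lims <- c(range(faithful_grp$eruptions), range(faithful_grp$waiting))

dens_by_group <- do.call(rbind, lapply(
  split(faithful_grp, faithful_grp$new),
  function(g) {
    # 2D kernel density estimate for this group only (assumed estimator)
    k <- MASS::kde2d(g$eruptions, g$waiting, n = 25, lims = lims)
    # expand.grid() varies its first argument fastest, matching as.vector(k$z)
    grid <- expand.grid(eruptions = k$x, waiting = k$y)
    grid$density <- as.vector(k$z)
    grid$new <- g$new[1]
    grid
  }
))
```

Each group contributes a full 25 x 25 grid, so the two groups overlap in position; how best to draw them on top of each other is left open here.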
In the latest version of gganimate by https://github.com/thomasp85 I would like to choose which parts of the plot stay static throughout the animation and which are animated. In the previous version of gganimate you could specify the frame in the aes of ggplot. Thus you could create a base plot that would be static and draw the animated plot over it. How can something similar be achieved in the latest version?
This has already been addressed in an issue for gganimate on GitHub: https://github.com/thomasp85/gganimate/issues/94
Basically, you give the layers that are meant to be static a separate data frame from the one you initially passed to ggplot. The example in the GitHub ticket I referred to is
library(gganimate)
#> Loading required package: ggplot2
ggplot(data = data.frame(x=1:10,y=1:10), aes(x=x,y=y)) +
geom_point() +
geom_line(data = data.frame(x2 = 1:10, y = 1:10),
aes(x = x2, y = y, group = 1)) +
transition_time(x)
animate(last_plot(), nframes = 50)
Here the line is held static, while the point is moving.
I have created a point plot using ggplot2 that works relatively well. I would love to run it using Plotly; however, when I do, it upsets the y axis and makes the legend very wonky. I will post some before and after below, but I am very new to both and looking for the right direction. The ggplot2 version is okay, but the added interactivity of plotly would be a huge win for what we are doing. Also a weird note: the top graph returned seems to cut off the plot (the highest value), not sure why. Thanks.
Code is:
library(ggplot2)
library(dplyr)
library(plotly)
library(sqldf)
library(tidyverse)
library(lubridate)
library(rio) #lets you use "import" for any file - without using extension name
options(scipen =999) #disable scientific notation
#prepare data:
setwd("C:/Users/hayescod/Desktop/BuysToForecastTracking")
Buys_To_Forecast <- import("BuysToForecastTrack")
colnames(Buys_To_Forecast) <- c("Date", "BusinessSegment", "Material", "StockNumber", "POCreatedBy", "PlantCode", "StockCategory", "Description", "Excess", "QuantityBought", "WareHouseSalesOrders", "GrandTotal", "Comments" )
Buys_To_Forecast$PlantCode <-factor(Buys_To_Forecast$PlantCode) #update PlantCode to factor
#use SQL to filter and order the data set:
btf <- sqldf("SELECT Date,
SUM(QuantityBought) AS 'QuantityBought',
Comments
FROM Buys_To_Forecast
GROUP BY Date, Comments
ORDER BY Date")
#use ggplot:
btfnew <- ggplot(data=btf, aes(x=Date, y=QuantityBought, color=Comments, size=QuantityBought)) +
geom_point() +
facet_grid(Comments~., scales="free")+
ggtitle("Buys To Forecast Review")+
theme(plot.title = element_text(hjust = 0.5),
axis.title.x = element_text(color="DarkBlue", size = 18),
axis.title.y = element_text(color="Red", size = 14))
btfnew #display the plot in ggplot
ggplotly(btfnew) #display the plot in Plotly