Is it possible to expand geom_ribbon to xlimits? - r

I have the following code (as an example) which I would like to adapt so that the ribbon extends over the entire x range, as geom_hline() does. The ribbon indicates which values are within accepted bounds. In my real application there is sometimes no upper or lower bound, so the hline by itself is not enough to determine whether values are within bounds.
library(ggplot2)
set.seed(2016-12-19)
dates <- seq(as.Date('2016-01-01'), as.Date('2016-12-31'), by = 1)
values <- rexp(length(dates), 1)
groups <- rpois(length(dates), 5)
temp <- data.frame(
date = dates,
value = values,
group = groups,
value.min = 0,
value.max = 2
)
ggplot(temp, aes(date, value)) +
geom_ribbon(aes(ymin = value.min, ymax = value.max), fill = '#00cc33', alpha = 0.6) +
geom_hline(aes(yintercept = value.min)) +
geom_hline(aes(yintercept = value.max)) +
geom_point() +
facet_wrap(~group)
I tried setting the x in geom_ribbon to dates as well, but then only fractions of the range are filled.
Also I tried this:
geom_ribbon(aes(ymin = -Inf, ymax = 2, x = dates), data = data.frame(), fill = '#00cc33', alpha = 0.6)
but then the data seems to be overwritten for the entire plot and I get the error Error in eval(expr, envir, enclos) : object 'value' not found. Even if it worked, the ribbon would still be too narrow, because the x limits are expanded beyond the range of the data.

Here's one way to do it:
ggplot(temp, aes(as.numeric(date), value)) +
geom_rect(aes(xmin=-Inf, xmax=Inf, ymin = value.min, ymax = value.max), temp[!duplicated(temp$group),], fill = '#00cc33', alpha = 0.6) +
geom_hline(aes(yintercept = value.min)) +
geom_hline(aes(yintercept = value.max)) +
geom_point() +
scale_x_continuous(labels = function(x) format(as.Date(x, origin = "1970-01-01"), "%b %y")) +
facet_wrap(~group)
Note that I used as.numeric(date), because otherwise Inf and -Inf yield
Error: Invalid input: date_trans works with objects of class Date only
To get date labels for numeric values, I adjusted the scale_x_continuous labels accordingly. (Although they are not exact here. You may want to adjust it by using the exact dates instead of month/year, or alternatively set manual breaks using the breaks argument and for example seq.Date.)
Also note that I used temp[!duplicated(temp$group),] to avoid overplotting and thus maintain the desired alpha transparency.
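As a rough sketch of the manual-breaks suggestion, assuming quarterly breaks are what you want (the name quarter_breaks and the three-month spacing are illustrative choices, not part of the answer above), the breaks can be built with seq() on Dates and converted to numbers for the continuous scale:

```r
library(ggplot2)

dates <- seq(as.Date("2016-01-01"), as.Date("2016-12-31"), by = 1)
temp <- data.frame(date = dates, value = rexp(length(dates), 1))

# Hypothetical choice: one break at the start of every quarter
quarter_breaks <- seq(as.Date("2016-01-01"), as.Date("2016-12-01"), by = "3 months")

p_breaks <- ggplot(temp, aes(as.numeric(date), value)) +
  geom_point() +
  scale_x_continuous(
    # The x values are days since 1970-01-01, so the breaks must be numeric too
    breaks = as.numeric(quarter_breaks),
    labels = function(x) format(as.Date(x, origin = "1970-01-01"), "%b %y")
  )
```

Because the breaks now sit on exact dates, the month/year labels should line up with the first day of each labelled month.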

Based on lukeA's answer I produced the following code, which I think is a little simpler:
library(ggplot2)
set.seed(2016-12-19)
dates <- seq(as.Date('2016-01-01'), as.Date('2016-12-31'), by = 1)
values <- rexp(length(dates), 1)
groups <- rpois(length(dates), 5)
temp <- data.frame(
date = dates,
value = values,
group = groups,
value.min = 1,
value.max = 2
)
bounds <- data.frame(
xmin = -Inf,
xmax = Inf,
ymin = temp$value.min[1],
ymax = temp$value.max[1]
)
ggplot(temp, aes(date, value)) +
geom_rect(
aes(
xmin = as.Date(xmin, origin = '1970-01-01'),
xmax = as.Date(xmax, origin = '1970-01-01'),
ymin = ymin,
ymax = ymax
),
data = bounds,
fill = '#00cc33',
alpha = 0.3,
inherit.aes = FALSE
) +
geom_point() +
facet_wrap(~group)
I created a temporary data frame containing the bounds for the rectangle, and added inherit.aes = FALSE since apparently the bounds otherwise overrule the temp data.frame (this still seems like a bug to me). By transforming -Inf and Inf to the correct data type I didn't need the custom labeller (if you're dealing with POSIXt, use the corresponding as.POSIXct/as.POSIXlt, as the automatic transformation fails).
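The question also mentions that a bound is sometimes missing. A minimal sketch of how the bounds data frame could cover that case, assuming NA is used to mean "no bound on this side" (make_bounds is a hypothetical helper, not something from ggplot2):

```r
library(ggplot2)

dates <- seq(as.Date("2016-01-01"), as.Date("2016-03-31"), by = 1)
temp <- data.frame(date = dates, value = rexp(length(dates), 1))

# Hypothetical helper: a missing (NA) bound becomes -Inf or Inf,
# so the rectangle runs to the edge of the panel on that side
make_bounds <- function(lower = NA, upper = NA) {
  data.frame(
    xmin = -Inf,
    xmax = Inf,
    ymin = if (is.na(lower)) -Inf else lower,
    ymax = if (is.na(upper)) Inf else upper
  )
}

bounds <- make_bounds(upper = 2)  # only an upper bound

p_bounds <- ggplot(temp, aes(date, value)) +
  geom_rect(
    aes(
      # Same conversion as above so the infinite values match the date scale
      xmin = as.Date(xmin, origin = "1970-01-01"),
      xmax = as.Date(xmax, origin = "1970-01-01"),
      ymin = ymin,
      ymax = ymax
    ),
    data = bounds,
    fill = "#00cc33",
    alpha = 0.3,
    inherit.aes = FALSE
  ) +
  geom_point()
```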

Related

Shade parts of a ggplot based on a (changing) dummy variable

I want to shade areas in a ggplot but I don't want to manually tell geom_rect() where to stop and where to start. My data changes and I always want to shade several areas based on a condition.
Here for example with the condition "negative":
library("ggplot2")
set.seed(3)
plotdata <- data.frame(somevalue = rnorm(10), indicator = 0 , counter = 1:10)
plotdata[plotdata$somevalue < 0,]$indicator <- 1
plotdata
I can do that manually:
plotranges <- data.frame(from = c(1,4,9), to = c(2,4,9))
ggplot() +
geom_line(data = plotdata, aes(x = counter, y = somevalue)) +
geom_rect(data = plotranges, aes(xmin = from - 1, xmax = to, ymin = -Inf, ymax = Inf), alpha = 0.4)
But my problem is that, so to speak, the set.seed() argument changes and I want to still automatically generate the plot without specifying min and max values of the shading manually. Is there a way (maybe without geom_rect() but instead geom_bar()?) to plot shading based directly on my indicator variable?
edit: Here is my own best attempt, as you can see not great:
ggplot(data = plotdata, aes(x = counter, y = somevalue)) + geom_line() +
geom_bar(aes(y = indicator*max(somevalue)), stat= "identity")
You can use stat_summary() to calculate the extremes of runs of your indicator. In the code below data.table::rleid() groups the data by runs of indicators. In the summary layer, y doesn't really do anything, so we use it to store the resolution of your datapoints, which we then later use to offset the xmin/xmax parameters. The after_stat() function is used to access computed variables after the ranges have been computed.
library("ggplot2")
plotdata <- data.frame(somevalue = rnorm(10), counter = 1:10)
plotdata$indicator <- as.numeric(plotdata$somevalue < 0)
ggplot(plotdata, aes(counter, somevalue)) +
stat_summary(
geom = "rect",
aes(group = data.table::rleid(indicator),
xmin = after_stat(xmin - y * 0.5),
xmax = after_stat(xmax + y * 0.5),
alpha = I((indicator) * 0.4),
y = resolution(counter)),
fun.min = min, fun.max = max,
orientation = "y", ymin = -Inf, ymax = Inf
) +
geom_line()
Created on 2021-09-14 by the reprex package (v2.0.1)
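If you would rather keep geom_rect() and avoid the data.table dependency, here is a sketch of an alternative that computes the runs with base R's rle(). The helper run_ranges and its column names from/to are made up for illustration, and this is a different technique from the stat_summary() approach above:

```r
library(ggplot2)

# Hypothetical helper: first and last x of every run where flag == 1
run_ranges <- function(x, flag) {
  runs <- rle(flag)
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  keep <- runs$values == 1
  data.frame(from = x[starts[keep]], to = x[ends[keep]])
}

set.seed(3)
plotdata <- data.frame(somevalue = rnorm(10), counter = 1:10)
plotdata$indicator <- as.numeric(plotdata$somevalue < 0)

shade <- run_ranges(plotdata$counter, plotdata$indicator)
# Pad each range by half the spacing between points, as the answer above does
half <- resolution(plotdata$counter) / 2

p_shade <- ggplot(plotdata, aes(counter, somevalue)) +
  geom_rect(
    data = shade,
    aes(xmin = from - half, xmax = to + half, ymin = -Inf, ymax = Inf),
    alpha = 0.4,
    inherit.aes = FALSE
  ) +
  geom_line()
```

The ranges are recomputed from the indicator every time, so nothing has to be edited by hand when the data changes.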

overlay discrete and continuous layer in ggplot - surprised that layer order matters

consider the following example dataset:
library(dplyr)
library(ggplot2)
d = mtcars %>%
as_tibble(rownames = "name") %>%
mutate(wt.cat = cut(wt, seq(1.5, 5.5, by = 1))) %>%
group_by(wt.cat) %>%
summarize(
Mean = mean(mpg),
Min = min(mpg),
Max = max(mpg)
)
Say I want to plot points for the "mean" value of each category in wt.cat and a ribbon showing the range. This works:
ggplot(d, aes(x = wt.cat)) +
geom_point(aes(y= Mean)) +
geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue")
But the points are masked by the ribbon. However, if I change the order of the layers so that the points are plotted on top of the ribbon, I get an error:
ggplot(d, aes(x = wt.cat)) +
geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
geom_point(aes(y= Mean))
## Error: Discrete value supplied to continuous scale
So even though I'm specifying the discrete axis as the "default" aesthetic, it gets overridden by the specification of the first plotted layer. The only way I can find around this is to plot a dummy point layer first:
ggplot(d, aes(x = wt.cat)) +
geom_point(aes(y= Mean), shape = NA) +
geom_ribbon(aes(x = as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
geom_point(aes(y= Mean))
## Warning message:
## Removed 4 rows containing missing values (geom_point).
Is there a more "proper" or correct way of combining discrete and continuous layers? Is there a solution that doesn't require creating a dummy layer?
Would something like this be a solution?
d %>% {
ggplot(., aes(x = wt.cat)) +
scale_x_discrete(labels = levels(.$wt.cat)) +
geom_ribbon(aes(x =as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
geom_point(aes(y=Mean))
}
I just learned you can wrap a pipe with { } and then reference the entire data frame with .
As camille said, the issue is that geom_ribbon requires a continuous scale because it plots area across values relating to the adjacent position. I believe the scale gets converted to continuous when geom_ribbon is added, but the labels are maintained.
Hope this helps
As per my reply -- the following works just as well if you want ggplot2 to handle all the labeling
d %>%
ggplot(aes(x = wt.cat)) +
scale_x_discrete() +
geom_ribbon(aes(x =as.numeric(wt.cat), ymin = Min, ymax = Max), fill = "blue") +
geom_point(aes(y=Mean))
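To see why as.numeric(wt.cat) lines up with the discrete axis, here is a small base R sketch (the four input values are arbitrary, chosen so that each bin gets one value). A factor is stored as integer codes 1, 2, ..., and a discrete position scale places its levels at those same integer positions:

```r
# One arbitrary value per bin, using the same breaks as in the question
wt.cat <- cut(c(2, 3, 4, 5), seq(1.5, 5.5, by = 1))

levels(wt.cat)           # the interval labels shown on the axis
codes <- as.numeric(wt.cat)
codes                    # the positions the ribbon is drawn at
```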

Creating a ggplot() from scratch in R to illustrate results

I'm a bit new to R and this is the first time I'd like to use ggplot(). My aim is to create a few plots that will look like the template below, which is an output from the package effects for those who know it:
Given this data:
Average Error Area
1: 0.4407528 0.1853854 Loliondo
2: 0.2895050 0.1945540 Seronera
How can I replicate the plot seen in the image with labels, error bars as in Error and the line connecting both Average points?
I hope somebody can put me on the right track and then I will go from there for other data I have.
Any help is appreciated!
Using ggplot2::geom_errorbar you can add error bars by first deriving your ymin and ymax.
library(dplyr)
library(ggplot2)
df <- tibble::tribble(~Average, ~Error, ~Area,
                      0.4407528, 0.1853854, "Loliondo",
                      0.2895050, 0.1945540, "Seronera")
# Derive the lower and upper ends of the error bars
dfnew <- df %>%
  mutate(ymin = Average - Error,
         ymax = Average + Error)
p <- ggplot(data = dfnew, aes(x = Area, y = Average)) +
  geom_point(colour = "blue") + geom_line(aes(group = 1), colour = "blue") +
  geom_errorbar(aes(x = Area, ymin = ymin, ymax = ymax), colour = "purple")
p
Here's a quick and dirty one that is similar to what was just posted:
library(tidyverse)
df <-
tibble(
average = c(0.44, 0.29),
error = c(0.185, 0.195),
area = c("Loliondo", "Seronera")
)
df %>%
ggplot(aes(x = area)) +
geom_line(
aes(y = average, group = 1),
color = "blue"
) +
geom_errorbar(
aes(ymin = average - 0.5 * error, ymax = average + 0.5 * error),
color = "purple",
width = 0.1
)
The trickiest part here is the group = 1 segment, which you need for the line to be drawn with factors on the x axis.
The aes(x = area) goes up top because it's used in both geoms, while the y, group, ymin, and ymax are used only locally. The color and width arguments appear outside of the aes() call since they are used for appearance modifications.
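Putting the two answers together, a sketch with points, the connecting line, error bars and axis labels might look like this. It assumes the bars should span Average plus or minus Error (as in the first answer); the axis titles and theme_bw() are guesses at the template, which is not reproduced here:

```r
library(ggplot2)

df <- data.frame(
  Area    = c("Loliondo", "Seronera"),
  Average = c(0.4407528, 0.2895050),
  Error   = c(0.1853854, 0.1945540)
)
# Assumption: Error is the half-width of the bar
df$ymin <- df$Average - df$Error
df$ymax <- df$Average + df$Error

p_err <- ggplot(df, aes(x = Area, y = Average)) +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), colour = "purple", width = 0.1) +
  geom_line(aes(group = 1), colour = "blue") +  # group = 1 joins the two categories
  geom_point(colour = "blue") +
  labs(x = "Area", y = "Average") +
  theme_bw()
```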

Extend x-axis with dates

I wish to use ggrepel to add labels to the ends of the lines of a ggplot. To do that, I need to make space for the labels, so I use scale_x_continuous to extend the x-axis. I'm not sure that's correct and am open to other strategies.
I can do it when the x_axis type is friendly numeric.
library("tidyverse")
library("ggrepel")
p <- tibble (
x = c(1991, 1999),
y = c(3, 5)
)
ggplot(p, aes(x, y)) + geom_line() + scale_x_continuous(limits = c(1991, 2020)) +
geom_text_repel(data = p[2,], aes(label = "Minimum Wage"), size = 4, nudge_x = 1, nudge_y = 0, colour = "gray50")
However, when I try something similar except the x-axis is of the evil date type, I get the error:
Error in as.Date.numeric(value) : 'origin' must be supplied
p <- tibble (
x = c(as.Date("1991-01-01"), as.Date("1999-01-01")),
y = c(2, 5)
)
range <- c(as.Date("1991-01-01"), as.Date("2020-01-01"))
ggplot(p, aes(x, y)) + geom_line() + scale_x_continuous(limits = range)
How can I get this to work with my arch nemesis, date?
Use scale_x_date instead of scale_x_continuous:
p <- tibble (
x = c(as.Date("1991-01-01"), as.Date("1999-01-01")),
y = c(2, 5)
)
range <- c(as.Date("1991-01-01"), as.Date("2020-01-01"))
ggplot(p, aes(x, y)) + geom_line() + scale_x_date(limits = range)
Note that scale_x_date() has an expand argument which controls how much padding is added beyond the limits. You could try expand = c(0, 0) to include only the dates specified in your limits = argument, or expand = c(f, 0), where f is the fraction of the full date range to add on each side beyond the dates given in limits =. (The first element is a multiplicative expansion; the second is an additive one in scale units, here days.) For example, f could be 0.01.
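A short sketch of both variants, using the limits from the question. The second one uses expansion(), which as far as I know needs ggplot2 3.3.0 or later; the 5% figure and the object names are arbitrary illustrations:

```r
library(ggplot2)

p <- data.frame(
  x = as.Date(c("1991-01-01", "1999-01-01")),
  y = c(2, 5)
)
lims <- as.Date(c("1991-01-01", "2020-01-01"))

# Axis starts and ends exactly at the limits
p_tight <- ggplot(p, aes(x, y)) +
  geom_line() +
  scale_x_date(limits = lims, expand = c(0, 0))

# No padding on the left, 5% of the range on the right
p_padded <- ggplot(p, aes(x, y)) +
  geom_line() +
  scale_x_date(limits = lims, expand = expansion(mult = c(0, 0.05)))

# The right-hand padding in days: 5% of the span of the limits
pad_days <- 0.05 * as.numeric(diff(lims))
```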

ggplot - annotate() - "Discrete value supplied to continuous scale"

I've read many SO answers about what can cause the error "Discrete value supplied to continuous scale", but I still fail to solve the following issue. In my case, the error is caused by using annotate(). If I get rid of + annotate(...), everything works well. Otherwise the error is raised.
My code is as follows:
base <- ggplot() +
annotate(geom = "rect", ymin = -Inf , ymax = 0, xmax = 0, xmin = Inf, alpha = .1)
annotated <- base +
geom_boxplot(outlier.shape=NA, data = technicalsHt, aes(x = name, y = px_last))
> base # fine
> annotated
Error: Discrete value supplied to continuous scale
Unfortunately, I cannot give the code leading to the data frame used here (viz. technicalsHt) because it is very long and reliant on APIs. A description of it:
> str(technicalsHt)
'data.frame': 512 obs. of 3 variables:
$ date : Date, format: "2016-11-14" "2016-11-15" ...
$ px_last: num 1.096 0.365 -0.067 0.796 0.281 ...
$ name : Factor w/ 4 levels "Stock Price Strength",..: 1 1 1 1 1 1 1 1 1 1 ...
> head(technicalsHt)
date px_last name
1 2016-11-14 1.09582090 Stock Price Strength
2 2016-11-15 0.36458685 Stock Price Strength
3 2016-11-16 -0.06696111 Stock Price Strength
4 2016-11-17 0.79613481 Stock Price Strength
5 2016-11-18 0.28067475 Stock Price Strength
6 2016-11-21 1.10780834 Stock Price Strength
The code without annotate works perfectly:
base <- ggplot()
annotated <- base +
geom_boxplot(outlier.shape=NA, data = technicalsHt, aes(x = name, y = px_last))
> annotated # fine
I tried playing around with technicalsHt, e.g. doing the following:
technicalsHt[,3] <- "hi"
technicalsHt[,2] <- rnorm(ncol(technicalsHt), 2,3)
but no matter what, using an annotate statement raises the error.
EDIT:
Following the answer below, I tried to put the data and aes in the initial ggplot call, and have geom_boxplot from the outset:
base <- ggplot(data = technicalsHt, aes(x = name, y = px_last)) + geom_boxplot(outlier.shape=NA)
# also tried: base <- ggplot(data = technicalsHt, aes(x = factor(name), y = px_last)) + geom_boxplot(outlier.shape=NA)
annotated <- base +
annotate(geom = "rect", ymin = -Inf , ymax = 0, xmax = 0, xmin = Inf, alpha = .1)
This works, but it is not really satisfactory since the annotation layer (shading part of the coordinate system) then covers the boxes.
(Another question also mentions this error in connection with annotate, but the answer given there does not solve my issue, so I would be extremely grateful for help. First of all, which of the variables is causing the problem?)
I had this issue and did not find the answer I wanted, so here is my solution. This is a bit prettier than plotting the boxplot twice.
If you want to annotate a rectangle below the points when there is a discrete scale, you need to declare the discrete scale to ggplot before adding the annotation:
ggplot(mtcars, aes(factor(cyl), mpg)) +
scale_x_discrete() +
annotate(geom = "rect", ymin = -Inf , ymax = 10, xmax = 0, xmin = Inf, alpha = .1) +
geom_boxplot()
Switch around the order, and bring the data and main aesthetics into your ggplot call. You are basically writing this:
p1 <- ggplot() +
annotate(geom = "rect", ymin = -Inf , ymax = 10, xmax = 0, xmin = Inf, alpha = .1)
At this point, p1 has a continuous x axis, since you provided numbers here.
p2 <- p1 + geom_boxplot(aes(factor(cyl), mpg), mtcars)
Now you add another layer that has a discrete axis, this yields an error.
If you write it the 'proper' way, everything is OK:
ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot() +
annotate(geom = "rect", ymin = -Inf , ymax = 10, xmax = 0, xmin = Inf, alpha = .1)
p.s.: Also it's not that hard to make a reproducible minimal example that accurately shows your problem, as you can see.
In response to the layering, the easiest workaround that I have found is simply plotting the same box plot twice. I am aware that it is unnecessary code, but it is a very quick fix for the layering issue.
ggplot(mtcars, aes(factor(cyl), mpg)) +
geom_boxplot() +
annotate(geom = "rect", ymin = -Inf , ymax = 10, xmax = 0, xmin = Inf, alpha = .1) +
geom_boxplot()
I cannot notice any image degradation from it as the pixels perfectly overlap. Feel free to correct me if anyone has a UHD monitor.

Resources