How to plot facets with discontinuous y-axis - r

I am trying to produce a plot with a discontinuous y-axis but can't get the facet titles to only show once:
Example Data:
library(ggplot2)
data(mpg)
p <- ggplot(mpg, aes(displ, cty)) +
  geom_point() +
  facet_grid(. ~ drv)
p
After much digging it appears that this is impossible in ggplot2, but I have discovered the gg.gap package. However, this package replicates the facet titles for each segment of the plot. Let's say I want a break in the y axis from 22-32 as follows:
library(gg.gap)
gg.gap(plot = p,
segments = c(22, 32),
ylim = c(0, 35))
Facet titles appear for each plot segment, which is confusing and looks terrible. I would be grateful for any insight or help anyone could provide! I'm stumped.
I know this is possible if I plot in base R, but given other constraints I am unable to do so (I need the graphics/grammar provided by ggplot2).
Thanks in advance!

This is a bit of an ugly workaround. The idea is to set y-values in the broken portion to NA so that no points are drawn there. Then, we facet on a findInterval() with the breaks of the axes (negative because we want to preserve bottom-to-top axes). Finally we manually resize the panels with ggh4x::force_panelsizes() to set the 2nd panel to have 0 height. Full disclaimer, I wrote ggh4x so I'm biased.
A few details: the strips along the y-direction are hidden by setting the relevant theme elements to blank. Also, ideally you'd calculate what proportion the upper facet should be relative to the lower facet and replace the 0.2 by that number.
library(ggplot2)
library(ggh4x)
ggplot(mpg, aes(displ, cty)) +
geom_point(aes(y = ifelse(cty >= 22 & cty < 32, NA, cty))) +
facet_grid(-findInterval(cty, c(-Inf, 22, 32, Inf)) ~ drv,
scales = "free_y", space = "free_y") +
theme(strip.background.y = element_blank(),
strip.text.y = element_blank(),
panel.spacing.y = unit(5.5/2, "pt")) +
force_panelsizes(rows = c(0.2, 0, 1))
#> Warning: Removed 20 rows containing missing values (geom_point).
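The proportion mentioned above can be estimated from the data ranges on either side of the break. This is only a minimal sketch, assuming the same break from 22 to 32; the names `breaks`, `lower`, `upper` and `ratio` are illustrative, and the result ignores the scale expansion that ggplot2 adds to each panel.

```r
library(ggplot2)  # only needed here for the mpg dataset

breaks <- c(22, 32)                              # start and end of the axis gap
lower  <- range(mpg$cty[mpg$cty <  breaks[1]])   # data range in the bottom panel
upper  <- range(mpg$cty[mpg$cty >= breaks[2]])   # data range in the top panel

# Height of the top panel relative to the bottom panel
ratio <- diff(upper) / diff(lower)
ratio
```

The value could then replace the hard-coded 0.2, e.g. `force_panelsizes(rows = c(ratio, 0, 1))`.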
Alternative approach for boxplot:
Instead of censoring the bit on the break, you can duplicate the data and manipulate the position scales to show what you want. We rely on the clipping of the data by the coordinate system to crop the graphical objects.
library(ggplot2)
library(ggh4x)
ggplot(mpg, aes(class, cty)) +
geom_boxplot(data = ~ transform(., facet = 2)) +
geom_boxplot(data = ~ transform(., facet = 1)) +
facet_grid(facet ~ drv, scales = "free_y", space = "free_y") +
facetted_pos_scales(y = list(
scale_y_continuous(limits = c(32, NA), oob = scales::oob_keep, # <- keeps data
expand = c(0, 0, 0.05, 0)),
scale_y_continuous(limits= c(NA, 21), oob = scales::oob_keep,
expand = c(0.05, 0, 0, 0))
)) +
theme(strip.background.y = element_blank(),
strip.text.y = element_blank())

Here's an approach that relies on changing the data before ggplot2, and then adjusting the scale labels, comparable to what you do for a secondary y axis.
library(dplyr)
library(ggplot2)
low_max <- 22.5
high_min <- 32.5
adjust <- high_min - low_max
mpg %>%
mutate(cty2 = as.numeric(cty),
cty2 = case_when(cty < low_max ~ cty2,
cty > high_min ~ cty2 - adjust,
TRUE ~ NA_real_)) %>%
ggplot(aes(displ, cty2)) +
geom_point() +
annotate("segment", color = "white", size = 2,
x = -Inf, xend = Inf, y = low_max, yend = low_max) +
scale_y_continuous(breaks = 1:50,
labels = function(x) {x + ifelse(x >= low_max, adjust, 0)}) +
facet_grid(. ~ drv)
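To see why the relabelling works, the two transformations can be checked without plotting. This is a sketch using the same thresholds as above; `compress()` and `relabel()` are hypothetical helper names that mirror the `case_when()` step and the `labels` function.

```r
low_max  <- 22.5
high_min <- 32.5
adjust   <- high_min - low_max

# Shift values above the gap down by its width; drop values inside the gap
compress <- function(y) {
  ifelse(y < low_max, y, ifelse(y > high_min, y - adjust, NA_real_))
}

# Map a compressed axis position back to the original value for labelling
relabel <- function(x) x + ifelse(x >= low_max, adjust, 0)

y <- c(9, 21, 25, 33, 35)
z <- compress(y)            # 25 falls inside the gap and becomes NA
data.frame(original = y, plotted = z, label = relabel(z))
```

Every value that survives the compression is labelled with its original value, which is what makes the axis read correctly on both sides of the break.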

Related

Calculate axis tick locations based on data in faceted plot

I have an issue where I would like to calculate locations of y-axis labels in a large plot made with facet_grid(). Let me show you what I mean using the mpg dataset.
require(tidyverse)
ggplot(mpg, aes(x = displ)) +
geom_point(aes(y = hwy)) +
facet_grid(drv ~ class, scales = "free")
You will notice that both axes use variable numbers of labels. Ignoring problems with the x-axis, I am interested in labelling only three values on the y-axis: 0, half of max value, and max value. It makes sense in my use-case, so here is what I tried.
ggplot(mpg, aes(x = displ)) +
geom_point(aes(y = hwy)) +
facet_grid(drv ~ class, scales = "free") +
geom_blank(aes(y = 0)) + # extends y-axis to 0
scale_y_continuous(expand = expansion(mult = c(0, 0.1)), # prevents ggplot2 from extending beyond y = 0
n.breaks = 3) # Three axis labels, please.
The plot correctly starts at y = 0 and labels it correctly. However, the remaining labels are poorly placed, and some panels get 2 or 4 labels instead of 3 for some reason.
If I could only directly calculate the label positions!
ggplot(mpg, aes(x = displ)) +
geom_point(aes(y = hwy)) +
facet_grid(drv ~ class, scales = "free") +
geom_blank(aes(y = 0)) +
scale_y_continuous(expand = expansion(mult = c(0, 0.1)),
n.breaks = 3,
breaks = c(0, 0.5*max(hwy), 1*max(hwy))) # Maybe these formulas works?
Error in check_breaks_labels(breaks, labels) : object 'hwy' not found
I believe providing break points by this way should work, but that my syntax is bad. How do I access and work with the data underlying the plot? Alternatively, if this doesn't work, can I manually specify y-axis labels for each row of panels?
I could really use some assistance here, please.
If you want custom rules for breaks, the easiest thing is to use a function implementing those rules given the (panel) limits.
Below is an example for labeling 0, the max and half-max.
library(ggplot2)
ggplot(mpg, aes(x = displ)) +
geom_point(aes(y = hwy)) +
facet_grid(drv ~ class, scales = "free") +
scale_y_continuous(expand = expansion(mult = c(0, 0.1)),
limits = c(0, NA), # <- fixed min, flexible max. replaces geom_blank
breaks = function(x){c(x[1], mean(x), x[2])})
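The breaks function is simply called with a length-two numeric vector of limits, so it can be tried on its own. A small sketch follows; `three_breaks()` is a hypothetical name for the same rule as the anonymous function above, and the input limits are made up for illustration.

```r
# Lower limit, midpoint and upper limit of whatever range is passed in
three_breaks <- function(limits) c(limits[1], mean(limits), limits[2])

three_breaks(c(0, 44))   # 0 22 44
three_breaks(c(0, 28))
```

Depending on the ggplot2 version, the range handed to the function may already include the scale expansion, so the top break may sit slightly above the largest data value in a panel.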
Alternatively, you can drop scales = "free" so that all panels share the same axes. Does that give you what you want?
require(tidyverse)
ggplot(mpg, aes(x = displ)) +
geom_point(aes(y = hwy)) +
facet_grid(drv ~ class)

Avoid overlapping text labels using ggplot2::coord_polar

I have a plot with points on a polar coordinate system. Each point has an associated label, which should be shown around the plot at the given angle. This can be achieved either using axis.text or geom_text; I have used geom_text here. Unfortunately, the text labels overlap. Using position = position_jitter() apparently only allows jittering by height, but not by width (i.e., does not solve the issue). MWE:
df <- data.frame("angle" = runif(50, 0, 359),
"projection" = runif(50, 0, 1),
"labels" = paste0("label_", 1:50))
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
theme_minimal() +
geom_text(aes(x=angle, y=1.1,
label=labels),
size = 3)
I would like to jitter the labels such that they do not overlap, but stay outside the plotting area. I have also tried to use the angle argument to conditionally angle the labels to make more space, but couldn't figure out the right formula to make the angles work.
Edit: Here is another way to create the plot, using scale_x_continuous to create the labels as axis.text.x. This does, however, again lead to overlapping labels.
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
scale_x_continuous(limits = c(0, 360), expand = c(0, 0), breaks = df$angle, labels=df$labels) +
theme_minimal() +
theme(panel.grid = element_blank())
ggrepel will work well in this context.
library(ggplot2)
library(ggrepel)
df <- data.frame("angle" = runif(50, 0, 359),
"projection" = runif(50, 0, 1),
"labels" = paste0("label_", 1:50))
ggplot(data = df, aes(x = angle, y = projection, label = labels)) +
geom_point() +
coord_polar() +
theme_minimal() +
geom_text_repel(size = 3)
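If rotating the labels is still of interest (the question mentions trying the angle argument), one commonly used formula is sketched below. It assumes the x scale spans exactly 0 to 360 and that coord_polar() uses its default start and direction; `label_angle()` is a hypothetical helper, not part of ggplot2 or ggrepel.

```r
# Rotation and justification for radial labels at polar position theta (degrees)
label_angle <- function(theta_deg) {
  a    <- 90 - theta_deg   # radial direction, reading outwards from the centre
  flip <- a < -90          # left half of the circle: turn by 180 to stay upright
  data.frame(
    angle = ifelse(flip, a + 180, a),
    hjust = ifelse(flip, 1, 0)   # keep the text on the outer side of its anchor
  )
}

res <- label_angle(c(0, 90, 180, 270))
res
```

The two columns could then be added to df and mapped in `geom_text(aes(angle = angle, hjust = hjust))`. This only keeps labels readable; it does not by itself prevent overlaps between neighbouring labels.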

Several distributions in the same plot -- using geom_density function from ggplot2

I think I'm very close to getting this code done, but I'm missing something here.
I want to "combine" two plots into just one like this:
The first plot has this code:
ggplot(test, aes(y=key,x=value)) +
geom_path()+
coord_flip()
And the second one has this one below:
ggplot(test, aes(x=value, fill=key)) +
geom_density() +
coord_flip()
This kind of multiple-distribution plot is often seen in statistics books when we read about normal distributions. The most useful link I've got so far was this one here.
Please use this code to reproduce my question:
library(tidyverse)
test <- data.frame(key = c("communication","gross_motor","fine_motor"),
value = rnorm(n=30,mean=0, sd=1))
ggplot(test, aes(x=value, fill=key)) +
geom_density() +
coord_flip()
ggplot(test, aes(y=key,x=value)) +
geom_path(size=2)+
coord_flip()
Thanks much
You might be interested in ridgeline plots from the ggridges package.
Ridgeline plots are partially overlapping line plots that create the impression of a mountain range. They can be quite useful for visualizing changes in distributions over time or space.
library(tidyverse)
library(ggridges)
set.seed(123)
test <- data.frame(
key = c("communication", "gross_motor", "fine_motor"),
value = rnorm(n = 30, mean = 0, sd = 1)
)
ggplot(test, aes(x = value, y = key)) +
geom_density_ridges(scale = 0.9) +
theme_ridges() +
NULL
#> Picking joint bandwidth of 0.525
Add median line:
ggplot(test, aes(x = value, y = key)) +
stat_density_ridges(quantile_lines = TRUE, quantiles = 2, scale = 0.9) +
coord_flip() +
theme_ridges() +
NULL
#> Picking joint bandwidth of 0.525
Simulate a rug:
ggplot(test, aes(x = value, y = key)) +
geom_density_ridges(
jittered_points = TRUE,
position = position_points_jitter(width = 0.05, height = 0),
point_shape = '|', point_size = 3, point_alpha = 1, alpha = 0.7
) +
theme_ridges() +
NULL
#> Picking joint bandwidth of 0.525
Created on 2018-10-16 by the reprex package (v0.2.1.9000)
I think the easiest way to do this is with facet_wrap(). If you don't like the default appearance of the facets you can tweak them with theme(), e.g.:
ggplot(test, aes(x=value, fill=key)) +
geom_density() +
facet_wrap(~ key) +
coord_flip() +
theme(panel.spacing.x = unit(0, "mm"))
Result:

ggplot2 add data from additional data frame next to plot

I would like to be able to extend my boxplots with additional information. Here is a working example for ggplot2:
library(ggplot2)
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
# Basic box plot
p <- ggplot(ToothGrowth, aes(x=dose, y=len)) +
geom_boxplot()
# Rotate the box plot
p + coord_flip()
I would like to add additional information from a separate data frame. For example:
extra <- data.frame(dose=factor(c(0.5,1,2)), label=c("Label1", "Label2", "Label3"), n=c("n=42","n=52","n=35"))
> extra
dose label n
1 0.5 Label1 n=42
2 1 Label2 n=52
3 2 Label3 n=35
I would like to create the following figure where the information to each dose (factor) is outside the plot and aligns with each of the dose levels (I made this in powerpoint as an example):
EDIT:
I would like to ask advice for an extension of the initial question.
What about this extension where I use fill to split up dose by the two groups?
ToothGrowth$dose <- as.factor(ToothGrowth$dose)
ToothGrowth$group <- head(rep(1:2, 100), dim(ToothGrowth)[1])
ToothGrowth$group <- factor(ToothGrowth$group)
p <- ggplot(ToothGrowth, aes(x=dose, y=len, fill=group)) +
geom_boxplot()
# Rotate the box plot
p + coord_flip()
extra <- data.frame(
dose=factor(rep(c(0.5,1,2), each=2)),
group=factor(rep(c(1:2), 3)),
label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)
Is it possible to align data from the new data frame (extra, 6 rows) with each of the dose/group combinations?
We can use geom_text with clip = "off" inside coord_flip:
ggplot(ToothGrowth, aes(x=dose, y=len)) +
geom_boxplot() +
geom_text(
y = max(ToothGrowth$len) * 1.1,
data = extra,
aes(x = dose, label = sprintf("%s\n%s", label, n)),
hjust = 0) +
coord_flip(clip = "off") +
theme(plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"))
Explanation: We place text outside of the plot area with geom_text and disable clipping with clip = "off" inside coord_flip. Lastly, we increase the plot margin to accommodate the additional labels. You can adjust the vertical y position in the margin (so the horizontal position in the plot because of the coordinate flip) by changing the factor in y = max(ToothGrowth$len) * 1.1.
In response to your edit, here is a possibility
extra <- data.frame(
dose=factor(rep(c(0.5,1,2), each=2)),
group=factor(rep(c(1:2), 3)),
label=c("Label1A", "Label1B", "Label2A", "Label2B", "Label3A", "Label3B"),
n=c("n=12","n=30","n=20", "n=32","n=15","n=20")
)
library(tidyverse)
ToothGrowth %>%
mutate(
dose = as.factor(dose),
group = as.factor(rep(1:2, nrow(ToothGrowth) / 2))) %>%
ggplot(aes(x = dose, y = len, fill = group)) +
geom_boxplot(position = position_dodge(width = 1)) +
geom_text(
data = extra %>%
mutate(
dose = as.factor(dose),
group = as.factor(group),
ymax = max(ToothGrowth$len) * 1.1),
aes(x = dose, y = ymax, label = sprintf("%s\n%s", label, n)),
position = position_dodge(width = 1),
size = 3,
hjust = 0) +
coord_flip(clip = "off", ylim = c(0, max(ToothGrowth$len))) +
theme(
plot.margin = unit(c(1, 5, 0.5, 0.5), "lines"),
legend.position = "bottom")
A few comments:
We ensure that labels match the dodged boxes by using position_dodge(width = 1) inside geom_text and geom_boxplot.
It seems that position_dodge does not like a global y (outside of aes). So we include the y position for the labels in extra and use it inside aes. As a result, we need to explicitly limit the range of the y axis. We can do that inside coord_flip with ylim = c(0, max(ToothGrowth$len)).

Create the effect of drawing an additional line and legend-item on an existing ggplot2

I am making a presentation and would like to present one line graph (geom_line()) with an appropriate legend. I then want to overlay a new geom_line and add the corresponding legend item. For aesthetic reasons, I want the overlay to not modify the legend location given in the first plot. The effect should be that one is drawing on an existing graph, and adding to its legend.
If I simply use ggplot to first make the first plot and then make a new plot with both lines, the location of the legend changes noticeably.
If I try to make the first plot be the full plot, but setting one of the line sizes to zero, I run into the problem that I can't suppress the legend-item for the size-zero line.
How can I achieve my desired effect with ggplot2?
EDIT:
Here is the code to make the two graphs that I first naively tried.
require(ggplot2)
require(reshape2)
x<-seq(-10,10,length=200)
G <- (1/(sqrt(2*pi))) * exp(-((x)^2)/(2))
G2 <- 2*(1/(pi))*(1/(x^2+1))
df = data.frame(x,G,G2)
ggplot(data = melt(data.frame(x,G),id.vars = 'x'))+
geom_line(aes(x=x, y=value, color=variable),size=.5)+
scale_color_manual("Distribution",values=c("orange"),labels=c("Gaussian"))+
coord_cartesian(ylim = c(0, 1))
ggplot(data = melt(data.frame(x,G,G2),id.vars = 'x'))+
geom_line(aes(x=x, y=value, color=variable),size=.5)+
scale_color_manual("Distribution",values=c("orange","blue"),labels=c("Gaussian","2Gaussian"))+
coord_cartesian(ylim = c(0, 1))
If it's not clear from these pictures that there is a problem, open up the images from these two links and flip from one to another.
http://rpubs.com/jwg/269311
http://rpubs.com/jwg/269312
NOTICE: The problem is even worse than I first described, since not only is the legend moving but the coordinate axis is moving as well.
Presumably this can be fixed by plotting both and then making its legend-item and the line invisible. Is this a possibility?
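Here's a solution which will keep everything aligned, with the bonus of animation. Note that it uses the original gganimate API (the frame aesthetic and the gganimate() function); current releases of gganimate replaced these with transition_*() functions and animate(), so this code needs an older version of the package.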
library(ggplot2)
library(tidyr)
library(gganimate)
p <- df %>%
gather(var, val, -x) %>%
ggplot(aes(x, val, frame = var)) +
geom_line(aes(color = var, group = var, cumulative = TRUE)) +
coord_cartesian(ylim = c(0, 1))
gganimate(p, "myplot.gif", "gif")
This should generate a file myplot.gif with this result:
Not sure if this is what you want, but here goes:
x<-seq(-10,10,length=200)
G <- (1/(sqrt(2*pi))) * exp(-((x)^2)/(2))
G2 <- 2*(1/(pi))*(1/(x^2+1))
df <- data.frame(x,G,G2)
df.plot <- tidyr::gather(df, key = 'variable', value = 'value', -x)
# First plot: both lines are mapped, but only G gets a colour and a legend entry
ggplot(df.plot, aes(x, value, color = variable)) +
  geom_line() +
  scale_color_manual(breaks = c("G"), values = c("orange", NA)) +
  coord_cartesian(xlim = c(-10, 10), ylim = c(0, 1)) +
  theme(legend.position = "right",
        legend.justification = "top")
# Second plot: same layout, with G2 now visible and listed in the legend
ggplot(df.plot, aes(x, value, color = variable)) +
  geom_line() +
  scale_color_manual(breaks = c("G", "G2"), values = c("orange", "blue")) +
  coord_cartesian(xlim = c(-10, 10), ylim = c(0, 1)) +
  theme(legend.position = "right",
        legend.justification = "top")

Resources