How to prevent xlim from changing the height using geom_curve?

I have the following code:
library(tidyverse)
tibble(x = 1:5, x1 = x + 1, c = c('a','a','a','b','b')) %>%
ggplot() +
geom_curve(aes(x = x, xend= x1, y = 0, yend = 0), curvature = -1.3, alpha=.2) +
facet_wrap(~ c, ncol=1)
but I would like to tweak the limits of the y axis to cut the background area above ~ .1.
I tried to do this:
tibble(x = 1:5, x1 = x + 1, c = c('a','a','a','b','b')) %>%
ggplot() +
geom_curve(aes(x = x, xend= x1, y = 0, yend = 0), curvature = -1.3, alpha=.2) +
facet_grid(c ~ .) +
ylim(0,.35) +
facet_wrap(~ c, ncol=1)
but it simply rescales the arches based on the values in ylim. How can I prevent this behavior?

coord_fixed() has arguments that allow you to control precisely what you would like to have here.
See also http://ggplot2.tidyverse.org/reference/coord_fixed.html for reference.
Unfortunately, you cannot refer to your x and x1 columns dynamically inside coord_fixed().
As long as you are fine with hard-coding absolute values (0.6 and 6.4 below), you can do something like this:
tibble(x = 1:5, x1 = x + 1, c = c('a','a','a','b','b')) %>%
ggplot(.) +
geom_curve(aes(x = x, xend = x1, y = 0, yend = 0), curvature = -1.3, alpha = .2) +
facet_grid(c ~ .) +
coord_fixed(ratio = 7, xlim = c(0.6, 6.4), ylim = c(0, 0.12), expand = FALSE) +
scale_y_continuous(breaks = c(0, 0.1))
Assuming this is the look you are after, note that I set expand = FALSE so that the y-axis starts exactly at zero, and added a buffer of 0.4 on each side of xlim and a small buffer to the upper bound of ylim.
I changed ratio from its default of 1 to 7 to scale the height back down from about 0.7 to about 0.1, which is what I understand you want in the end. With ratio = 1, one unit on the y-axis takes up the same length as one unit on the x-axis (which I believe is what you refer to as rescaling).
Finally, I set the y-axis breaks manually so that there are fewer of them and the grid boxes are a bit larger, which again is only my guess at what you want.
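If hard-coding 0.6 and 6.4 is a concern, one possible workaround is to compute the limits from the data before building the plot and pass the resulting vector to coord_fixed(). This is only a sketch: the names d, pad, and x_limits are made up for illustration, and the 0.4 padding simply mirrors the buffer used above.

```r
library(ggplot2)
library(tibble)

d <- tibble(x = 1:5, x1 = x + 1, c = c("a", "a", "a", "b", "b"))

# Derive the x limits from the data instead of typing them in
pad <- 0.4
x_limits <- range(c(d$x, d$x1)) + c(-pad, pad)  # 0.6 and 6.4 for this data

p <- ggplot(d) +
  geom_curve(aes(x = x, xend = x1, y = 0, yend = 0), curvature = -1.3, alpha = 0.2) +
  facet_grid(c ~ .) +
  coord_fixed(ratio = 7, xlim = x_limits, ylim = c(0, 0.12), expand = FALSE) +
  scale_y_continuous(breaks = c(0, 0.1))
p
```

The ratio and ylim values would still need manual tuning if the data changed substantially.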

Does replacing ylim(0,.35) with coord_fixed(ylim=c(0, 0.35)) do what you want?


How can arrows be added to faceted plots at different positions on each plot but at a constant angle?

This is a tricky one, and I'm making it trickier by basically asking two questions concurrently. But they're related (in practice, if not in theory).
The issue
Basically I want to use one script and one data frame to create a faceted plot with arrows:
at unique locations on each plot and
at consistent angles regardless of the scale.
The goal is to use the arrows to indicate dosing of a therapeutic, which might change for individuals or treatment cohorts.
Example dataset
Here's an example of the sort of data which I might need to plot with arrows:
df_3 <- tibble(
ID = rep(1:2, each = 10),
TIME = rep(1:10, times = 2),
DV = c(runif(10), runif(10) * 5)
)
An example plot of what I DON'T want
The following code generates an example of where I am, but not where I want to be:
ggplot(data = df_3, aes(x = TIME, y = DV)) +
geom_line() +
facet_wrap(~ID, scales = "free_y") +
annotate("segment",
x = c(0.5, 1.5, 2.5), xend = c(0, 1, 2), y = 0.05, yend = 0,
arrow = arrow(length = unit(0.20, "cm"), type = "closed"))
Note that the arrows can only be set to a single series of locations (which are tedious to add, since I need to finagle the x and xend variables to get the angle I want) and, because the y-axes are different, the angle of each set of arrows is different.
For example, let's say I want arrows at times 0, 1, and 2 for Individual 1, but at times 2, 4, and 6 for Individual 2.
I'm thinking I need to add the location for the arrows into the dataset, but I'm worried that will force ggplot to plot the arrows for every individual plot, creating fuzzy/dark arrows.
I'm open to any and all suggestions or thoughts. I appreciate your time.
The problem you're running into is that the arrows are totally defined in dataspace, which can skew the angle in visual space. One way to tackle it is to write your own geom that draws it exactly as you like, but that feels like overkill for a task that seems so simple.
This seems like a nice use case for Paul Murrell's gggrid package. One of the possibilities is to create a function that takes the raw data and transformed data (called coords), and outputs the desired graphical object.
# devtools::install_github("pmur002/gggrid")
library(ggplot2)
library(gggrid)
#> Loading required package: grid
df_3 <- data.frame(
ID = rep(1:2, each = 10),
TIME = rep(1:10, times = 2),
DV = c(runif(10), runif(10) * 5)
)
arrows <- function(data, coords) {
offset <- unit(4, "mm")
segmentsGrob(
x0 = unit(coords$x, "npc") + offset,
x1 = unit(coords$x, "npc"),
y0 = unit(coords$y, "npc") + offset,
y1 = unit(coords$y, "npc"),
arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
gp = gpar(fill = "black")
)
}
ggplot(data = df_3, aes(x = TIME, y = DV)) +
geom_line() +
facet_wrap(~ID, scales = "free_y") +
grid_panel(data = data.frame(TIME = c(0.5, 1.5, 2.5), DV = 0),
grob = arrows)
The nice thing is that this is not a static annotation such as annotation_custom(), which is simply repeated across facets. In the example below we can see that the second arrow gets sent to the second panel.
ggplot(data = df_3, aes(x = TIME, y = DV)) +
geom_line() +
facet_wrap(~ID, scales = "free_y") +
grid_panel(data = data.frame(TIME = c(0.5, 1.5, 2.5),
DV = 0, ID = c(1, 2, 1)), # <- facet var
grob = arrows)
Created on 2021-09-21 by the reprex package (v2.0.1)
I'm not sure I fully understand what you are trying to achieve, but one option for adding your arrows would be to use annotation_custom() instead of annotate().
set.seed(123)
library(tibble)
df_3 <- tibble(
ID = rep(1:2, each = 10),
TIME = rep(1:10, times = 2),
DV = c(runif(10), runif(10) * 5)
)
library(grid)
library(ggplot2)
segment <- segmentsGrob(x0 = unit(.05, "npc"), y0 = unit(.05, "npc"),
x1 = unit(0, "npc"), y1 = unit(0, "npc"),
arrow = arrow(angle = 45, length = unit(0.2, "cm"), type = "closed"), gp = gpar(fill = "black"))
ggplot(data = df_3, aes(x = TIME, y = DV)) +
geom_line() +
scale_x_continuous(limits = c(0, NA)) +
scale_y_continuous(limits = c(0, NA)) +
facet_wrap(~ID, scales = "free_y") +
lapply(0:2, function(xmin) {
annotation_custom(grob = segment, xmin = xmin, ymin = 0, xmax = max(df_3$TIME) + xmin)
})
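On the concern raised in the question that storing arrow locations in a data frame would draw every arrow in every panel: a layer whose data contains the faceting variable is split across panels just like the main data, so each arrow is drawn once, in its own panel. A minimal sketch with plain geom_segment() follows. The data frame dose_df and its columns are hypothetical, and because the segments are still defined in data space, the arrow height is scaled to each individual's largest DV as a rough way to keep the angles similar, not an exact one.

```r
library(ggplot2)

set.seed(123)
df_3 <- data.frame(
  ID   = rep(1:2, each = 10),
  TIME = rep(1:10, times = 2),
  DV   = c(runif(10), runif(10) * 5)
)

# Hypothetical dosing table: one row per arrow, tagged with the facet variable
dose_df <- data.frame(
  ID   = rep(1:2, each = 3),
  TIME = c(0, 1, 2, 2, 4, 6)
)

# Arrow height as a fixed fraction of each individual's largest DV
panel_max <- tapply(df_3$DV, df_3$ID, max)
dose_df$height <- 0.05 * as.numeric(panel_max[as.character(dose_df$ID)])

p <- ggplot(df_3, aes(x = TIME, y = DV)) +
  geom_line() +
  geom_segment(
    data = dose_df,
    aes(x = TIME + 0.5, xend = TIME, y = height, yend = 0),
    inherit.aes = FALSE,
    arrow = arrow(length = unit(0.2, "cm"), type = "closed")
  ) +
  facet_wrap(~ID, scales = "free_y")
p
```

Individual 1 gets arrows at times 0, 1, and 2, and Individual 2 at times 2, 4, and 6, with no overplotting between panels.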

Setting Midpoint for continuous diverging color scale on a heatmap

I need to adjust the midpoint location for a heatmap via ggplot2. I've googled around and have seen scale_fill_gradient2 be a great fit but the colors don't seem to match up to what I'm looking for. I know z needs a range from 0 to 1. Example dataset generation below:
library(ggplot2)
library(tibble)
library(RColorBrewer)
set.seed(5)
df <- as_tibble(expand.grid(x = -5:5, y = 0:5, z = NA))
df$z <- runif(length(df$z), min = 0, max = 1)
I tried plotting with the scale_fill_gradient2 but the blue color isn't coming as "dark" as I'd like.
ggplot(df, aes(x = x, y = y)) +
geom_tile(aes(fill = z)) +
scale_fill_gradient2(
low = 'red', mid = 'white', high = 'blue',
midpoint = 0.7, guide = 'colourbar', aesthetics = 'fill'
) +
scale_x_continuous(expand = c(0, 0), breaks = unique(df$x)) +
scale_y_continuous(expand = c(0, 0), breaks = unique(df$y))
Therefore, I'm using scale_fill_distiller with the color palette 'RdBu' which comes out with the color scale I need but the ranges and the midpoints aren't right.
ggplot(df, aes(x = x, y = y)) +
geom_tile(aes(fill = z)) +
scale_fill_distiller(palette = 'RdBu') +
scale_x_continuous(expand = c(0, 0), breaks = unique(df$x)) +
scale_y_continuous(expand = c(0, 0), breaks = unique(df$y))
Is there a way to get the 2nd color scale but with the option to set midpoint range as the first?
The color scales provided by the colorspace package will generally allow you much more fine-grained control. First, you can use the same colorscale but set the mid-point.
library(ggplot2)
library(tibble)
library(colorspace)
set.seed(5)
df <- as_tibble(expand.grid(x = -5:5, y = 0:5, z = NA))
df$z <- runif(length(df$z), min = 0, max = 1)
ggplot(df, aes(x = x, y = y)) +
geom_tile(aes(fill = z)) +
scale_fill_continuous_divergingx(palette = 'RdBu', mid = 0.7) +
scale_x_continuous(expand = c(0, 0), breaks = unique(df$x)) +
scale_y_continuous(expand = c(0, 0), breaks = unique(df$y))
However, as you see, this creates the same problem as before, because you'd have to be further away from the midpoint to get darker blues. Fortunately, the divergingx color scales allow you to manipulate either branch independently, and so we can create a scale that turns to dark blue much faster. You can play around with l3, p3, and p4 until you get the result you want.
ggplot(df, aes(x = x, y = y)) +
geom_tile(aes(fill = z)) +
scale_fill_continuous_divergingx(palette = 'RdBu', mid = 0.7, l3 = 0, p3 = .8, p4 = .6) +
scale_x_continuous(expand = c(0, 0), breaks = unique(df$x)) +
scale_y_continuous(expand = c(0, 0), breaks = unique(df$y))
Created on 2019-11-05 by the reprex package (v0.3.0)
Claus' answer is great (and I'm a fan of his work), but I'd like to add that you can retain control within vanilla ggplot as well if you use the scale_fill_gradientn() function:
library(ggplot2)
library(tibble)
set.seed(5)
df <- as_tibble(expand.grid(x = -5:5, y = 0:5, z = NA))
df$z <- runif(length(df$z), min = 0, max = 1)
ggplot(df, aes(x = x, y = y)) +
geom_tile(aes(fill = z)) +
scale_fill_gradientn(
colours = c("red", "white", "blue"),
values = c(0, 0.7, 1)
) +
scale_x_continuous(expand = c(0, 0), breaks = unique(df$x)) +
scale_y_continuous(expand = c(0, 0), breaks = unique(df$y))
A notable downside is that you have to provide the values argument in rescaled space, i.e. between 0 and 1. For example, if your fill values ranged from 0 to 10 and you wanted the midpoint at 0.7, you would have to provide values = c(0, 0.07, 1).
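One way to avoid computing the rescaled positions by hand is scales::rescale(), which maps values from a given range onto 0-1. The sketch below assumes fill values spanning 0 to 10 and a desired midpoint of 0.7; the object names (df10, fill_limits, midpoint, colour_values) are illustrative.

```r
library(ggplot2)
library(scales)

set.seed(5)
df10 <- expand.grid(x = -5:5, y = 0:5)
df10$z <- runif(nrow(df10), min = 0, max = 10)

fill_limits <- c(0, 10)
midpoint <- 0.7

# Positions of the three colours in rescaled (0-1) space: 0, 0.07, 1
colour_values <- rescale(c(fill_limits[1], midpoint, fill_limits[2]),
                         from = fill_limits)

p <- ggplot(df10, aes(x = x, y = y)) +
  geom_tile(aes(fill = z)) +
  scale_fill_gradientn(
    colours = c("red", "white", "blue"),
    values = colour_values,
    limits = fill_limits
  )
p
```

Fixing limits explicitly keeps the midpoint in place even when the observed data do not reach exactly 0 or 10.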

How to stop ggrepel labels moving between gganimate frames in R/ggplot2?

I would like to add labels to the end of lines in ggplot, avoid them overlapping, and avoid them moving around during animation.
So far I can put the labels in the right place and hold them static using geom_text, but the labels overlap, or I can prevent them overlapping using geom_text_repel but the labels do not appear where I want them to and then dance about once the plot is animated (this latter version is in the code below).
I thought a solution might involve effectively creating a static layer in ggplot (p1 below) then adding an animated layer (p2 below), but it seems not.
How do I hold some elements of a plot constant (i.e. static) in an animated ggplot? (In this case, the labels at the end of lines.)
Additionally, with geom_text the labels appear as I want them - at the end of each line, outside of the plot - but with geom_text_repel, the labels all move inside the plotting area. Why is this?
Here is some example data:
library(dplyr)
library(ggplot2)
library(gganimate)
library(ggrepel)
set.seed(99)
# data
static_data <- data.frame(
hline_label = c("fixed_label_1", "fixed_label_2", "fixed_label_3", "fixed_label_4",
"fixed_label_5", "fixed_label_6", "fixed_label_7", "fixed_label_8",
"fixed_label_9", "fixed_label_10"),
fixed_score = c(2.63, 2.45, 2.13, 2.29, 2.26, 2.34, 2.34, 2.11, 2.26, 2.37))
animated_data <- data.frame(condition = c("a", "b")) %>%
slice(rep(1:n(), each = 10)) %>%
group_by(condition) %>%
mutate(time_point = row_number()) %>%
ungroup() %>%
mutate(score = runif(20, 2, 3))
and this is the code I am using for my animated plot:
# colours for use in plot
condition_colours <- c("red", "blue")
# plot static background layer
p1 <- ggplot(static_data, aes(x = time_point)) +
scale_x_continuous(breaks = seq(0, 10, by = 2), expand = c(0, 0)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10), limits = c(2, 3), expand = c(0, 0)) +
# add horizontal line to show existing scores
geom_hline(aes(yintercept = fixed_score), alpha = 0.75) +
# add fixed labels to the end of lines (off plot)
geom_text_repel(aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0) +
coord_cartesian(clip = 'off') +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank(),
plot.margin = margin(5.5, 120, 5.5, 5.5))
# animated layer
p2 <- p1 +
geom_point(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
geom_line(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition),
show.legend = FALSE) +
scale_color_manual(values = condition_colours) +
geom_segment(data = animated_data,
aes(xend = time_point, yend = score, y = score, colour = condition),
linetype = 2) +
geom_text(data = animated_data,
aes(x = max(time_point) + 1, y = score, label = condition, colour = condition),
hjust = 0, size = 4) +
transition_reveal(time_point) +
ease_aes('linear')
# render animation
animate(p2, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)
Suggestions for consideration:
The specific repelling direction / amount / etc. in geom_text_repel is determined by a random seed. You can set seed to a constant value in order to get the same repelled positions in each frame of animation.
I don't think it's possible for repelled text to go beyond the plot area, even if you turn off clipping & specify some repel range outside plot limits. The whole point of that package is to keep text labels away from one another while remaining within the plot area. However, you can extend the plot area & use geom_segment instead of geom_hline to plot the horizontal lines, such that these lines stop before they reach the repelled text labels.
Since there are more geom layers using animated_data as their data source, it would be cleaner to put animated_data & associated common aesthetic mappings in the top level ggplot() call, rather than static_data.
Here's a possible implementation. Explanation in annotations:
p3 <- ggplot(animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
# static layers (assuming 11 is the desired ending point)
geom_segment(data = static_data,
aes(x = 0, xend = 11, y = fixed_score, yend = fixed_score),
inherit.aes = FALSE, colour = "grey25") +
geom_text_repel(data = static_data,
aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0, inherit.aes = FALSE,
seed = 123, # set a constant random seed
xlim = c(11, NA)) + # specify repel range to be from 11 onwards
# animated layers (only specify additional aesthetic mappings not mentioned above)
geom_point() +
geom_line() +
geom_segment(aes(xend = time_point, yend = score), linetype = 2) +
geom_text(aes(x = max(time_point) + 1, label = condition),
hjust = 0, size = 4) +
# static aesthetic settings (limits / expand arguments are specified in coordinates
# rather than scales, margin is no longer specified in theme since it's no longer
# necessary)
scale_x_continuous(breaks = seq(0, 10, by = 2)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10)) +
scale_color_manual(values = condition_colours) +
coord_cartesian(xlim = c(0, 13), ylim = c(2, 3), expand = FALSE) +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank()) +
# animation settings (unchanged)
transition_reveal(time_point) +
ease_aes('linear')
animate(p3, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)

ggplot2 - separating box plot labels by colour

I am trying to create a box plot with labels for some of the individual data points. The box plot is separated by two variables, mapped to x and colour. However, when I add labels using geom_text_repel from the ggrepel package (necessary for the real data), they separate by x but not by colour. See this minimal reproducible example:
library(ggplot2)
library(ggrepel)
## create dummy data frame
rep_id <- c("a", "a", "b", "b", "c", "c", "d", "d", "e", "e")
dil <- c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2)
bleach_time <- c(0, 24, 0, 24, 0, 24, 0, 24, 0, 24)
a_i <- c(0.1, 0.2, 0.35, 0.2, 0.01, 0.4, 0.23, 0.1, 0.2, 0.5)
iex <- data.frame(rep_id, dil, bleach_time, a_i)
rm(rep_id, dil, bleach_time, a_i)
## Plot box plot of a_i separated by bleach_time and dil
p <- ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE, segment.alpha = 0)
p
As you can see the labels are colour coded, but they are all lined up around the centre of each pair of plots rather than separated by the plots. I've tried nudge_x but that moves all the labels together. Is there a way I can move each set of labels individually?
For comparison here is the plot of my full data set with the outliers labelled - you can see how each set of labels isn't centred around the points it's labelling, complicating interpretation:
It looks like geom_text_repel needs position = position_dodge(width = __), not just the position = "dodge" shorthand I'd suggested, hence the error. You can mess around with setting the width; 0.7 looked okay to me.
library(tidyverse)
library(ggrepel)
ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
segment.alpha = 0, position = position_dodge(width = 0.7))
Since you're plotting distributions, it might be important to keep positions along the y-axis the same, and only let geom_text_repel jitter along the x-axis, so I repeated the plot with direction = "x", which made me notice something interesting...
ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
segment.alpha = 0, position = position_dodge(width = 0.7), direction = "x")
A couple of the labels are obscured because they have the same color as the fill of the boxplots! You can fix this with a better combination of color and fill palettes. The quick fix I used was turning down the luminosity of the color and turning up the luminosity of the fill in the scale_*_discrete calls to make them distinct (but also pretty ugly).
ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
geom_boxplot() +
geom_text_repel(aes(label = rep_id, colour = as.factor(dil)), na.rm = TRUE,
segment.alpha = 0, position = position_dodge(width = 0.7), direction = "x") +
scale_color_discrete(l = 30) +
scale_fill_discrete(l = 100)
Note that you can also adjust the force used in the repel, so if you need the labels to not overlap but to also hug closer to the middles of the boxplots, you can mess around with that setting as well.
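As a rough illustration of that last point, the sketch below lowers force from its default of 1 so that labels are pushed apart less strongly. The value 0.5 is an arbitrary starting point rather than a recommendation, and iex is rebuilt here with base data.frame() only to keep the example self-contained.

```r
library(ggplot2)
library(ggrepel)

# Same dummy data as in the question
iex <- data.frame(
  rep_id = rep(c("a", "b", "c", "d", "e"), each = 2),
  dil = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  bleach_time = rep(c(0, 24), times = 5),
  a_i = c(0.1, 0.2, 0.35, 0.2, 0.01, 0.4, 0.23, 0.1, 0.2, 0.5)
)

p_force <- ggplot(iex, aes(x = as.character(bleach_time), y = a_i, fill = as.factor(dil))) +
  geom_boxplot() +
  geom_text_repel(aes(label = rep_id, colour = as.factor(dil)),
                  segment.alpha = 0,
                  position = position_dodge(width = 0.7),
                  direction = "x",
                  force = 0.5)  # weaker repulsion than the default of 1
p_force
```

Trying a few values on the real data is probably the quickest way to find a balance between overlap and distance from the boxes.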

Plotting multiple density distributions on one plot

For teaching purposes I'm looking to create and plot multiple distributions on to one graph. The code I've been using to do this is:
library(ggplot2)
library(ggfortify)
# Create an initial graph with 1 distribution
p3 <- ggdistribution(dnorm,
seq(-5, 10,length=1000),
colour='blue',
mean=0.15,
sd=0.24,
fill='blue')
# Update p3 with second distribution
p3 <- ggdistribution(dnorm, seq(-5, 10,length=1000),
mean = 1.11,
sd = 0.55,
colour='green',
fill='green',p=p3)
# View p3
p3
Initially, this seems great because it produces a graph with both distributions:
The problems start when I try to change the appearance of the graph.
(1) First when I attempt to change the y-axis scale so that it ranges from 0 to 1 instead of the percentages it shows by default, I am able to do so, but something happens to the distributions. Here is the code I am using:
p3 <- p3 + ylim(0,1) + xlim (-2, 6) + labs(title="Plotting Multiple Distributions", x="Mean difference", y="Density")
And this returns the following graph:
Any advice on how I can change the y-axis without ruining the distribution would be very appreciated!
(2) Second, when I try to add 2 lines along the axes using this code:
p3 <- p3 + geom_segment(aes(x=0, y=0, xend=0, yend=0.98),
size=1,
arrow = arrow(length = unit(0.4,"cm")))
p3 <- p3 + geom_segment(aes(x=-2, y=0, xend=6, yend=0),
size=1)
...R returns the following error message:
Error in eval(expr, envir, enclos) : object 'ymin' not found
Any advice as to how I might add these lines to improve the aesthetics of the graph would be very appreciated.
Thank you in advance for your time.
Sounds like you wish to change the y-axis labels to the range (0, 1), without actually changing the underlying distribution. Here's one approach:
# after obtaining p3 from the two ggdistribution() functions
# get the upper limit for p3's current y-axis range, rounded up
y.orig <- layer_scales(p3)$y$range$range[2] # = 1.662259 in my case, yours may
# differ based on the distribution
y.orig <- ceiling(y.orig * 10) / 10 # = 1.7
p3 +
xlim(-2, 6) +
scale_y_continuous(breaks = seq(0, y.orig, length.out = 5),
labels = scales::percent(seq(0, 1, length.out = 5))) +
annotate("segment", x = 0, y = 0, xend = 0, yend = y.orig,
size = 1, arrow = arrow(length = unit(0.4, "cm"))) +
annotate("segment", x = -2, y = 0, xend = 6, yend = 0,
size = 1)
Or if you prefer to keep labels close to the fake axis created from line segments, include expand = c(0, 0) for x / y:
p3 +
scale_x_continuous(limits = c(-2, 6), expand = c(0, 0)) +
scale_y_continuous(breaks = seq(0, y.orig, length.out = 5),
labels = scales::percent(seq(0, 1, length.out = 5)),
expand = c(0, 0)) +
annotate("segment", x = 0, y = 0, xend = 0, yend = y.orig,
size = 1, arrow = arrow(length = unit(0.4, "cm"))) +
annotate("segment", x = -2, y = 0, xend = 6, yend = 0,
size = 1)
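As background on why ylim(0, 1) distorted the curves in the question: setting limits on a scale treats values outside the limits as missing, so every part of a density above 1 is removed before drawing, whereas coord_cartesian() only zooms the view. The sketch below uses a plain geom_line() on dnorm() values rather than ggdistribution(), an assumption made to keep it independent of ggfortify; the object names are illustrative.

```r
library(ggplot2)

dens <- data.frame(x = seq(-2, 6, length.out = 1000))
dens$y <- dnorm(dens$x, mean = 0.15, sd = 0.24)  # peaks above 1

# Number of points that a scale limit of 1 would discard
n_above <- sum(dens$y > 1)

base <- ggplot(dens, aes(x = x, y = y)) + geom_line(colour = "blue")

# Scale limits: values above 1 become NA, so the peak is cut out of the line
p_scale <- base + ylim(0, 1)

# Coordinate limits: all data are kept and the panel is only zoomed
p_zoom <- base + coord_cartesian(ylim = c(0, 1))
```

Printing p_scale gives a warning about removed rows, while p_zoom keeps the full curve and simply clips it at the top of the panel.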
