R ggplot2 ggrepel labelling positions

I am trying to add labels to a ggplot object. The labels do not look neat and tidy due to their positioning. I have tried using various geom_label_repel and geom_text_repel options but am not having much luck.
I cannot share the data unfortunately, but I have included one version of my code below and a screenshot of one section of the redacted graph. The graph has multiple peaks that need labelling. Each label has 2 lines.
I would like each connecting line to rise vertically from directly above its peak, turn at a right angle, and then continue horizontally for a short distance. The label should sit on top of this horizontal section of the line.
Some peaks are very close together, so their labels will need to be pushed up the y axis so that they stack neatly.
I hope that description makes sense. I would appreciate it if anyone is able to help.
Thank you!
library(ggplot2)
library(ggrepel)
library(dplyr)
upper_plot <- ggplot() +
geom_point(data = plot_data[which(plot_data$analysis == "Analysis1"), ],
aes(x = rel_pos, y = logged_p, color = as.factor(chr)),
size = 0.25) +
scale_color_manual(values = rep(my_upper_colors, nrow(axis_df))) +
geom_point(data=upper_highlight_pos2_old,
aes(x = rel_pos, y = logged_p),
color= c('grey'),
size=0.75,
pch = 16) +
geom_point(data=upper_labels_old,
aes(x = rel_pos, y = logged_p),
color='dark grey',
size=2,
pch = 18) +
geom_point(data=upper_highlight_pos2_novel,
aes(x = rel_pos, y = logged_p),
color= c('black'),
size=0.75,
pch = 16) +
geom_point(data=upper_labels_novel,
aes(x = rel_pos, y = logged_p),
color='black',
size=2,
pch = 18) +
scale_x_continuous(labels = axis_df$chr,
breaks = axis_df$chr_center,
expand = expansion(mult = 0.01)) +
scale_y_continuous(limits = c(0, maxp),
expand = expansion(mult = c(0.02, 0.06))) +
# geom_hline(yintercept = -log10(1e-5), color = "red", linetype = "dashed",
# size = 0.3) +
geom_hline(yintercept = -log10(5e-8), color = "black", linetype = "dashed",
size = 0.3) +
labs(x = "", y = bquote(atop('GWAS', '-log'[10]*'(p)'))) +
theme_classic() +
theme(legend.position = "none",
axis.title.x = element_blank(),
plot.margin = margin(t=5, b = 5, r=5, l = 10)) +
geom_label_repel(data = upper_labels,
aes(x = rel_pos, y = logged_p, label = label),
ylim = c(maxp / 3, NA),
size = 2,
force_pull = 0,
nudge_x = 0.5,
box.padding = 0.5,
nudge_y = 0.5,
min.segment.length = 0, # draw all lines no matter how short
segment.size = 0.2,
segment.curvature = -0.1,
segment.ncp = 3,
segment.angle = 45,
label.size=NA, #no border/box
fill = NA # no background
)
This is my current untidy layout...
EDIT:
This is the sort of layout I am after. The lines will need to be flexible and either be right-handed or left-handed depending on space (source: https://www.nature.com/articles/s41588-020-00725-7)
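One way to get the elbow-shaped connectors described above without relying on ggrepel is to compute the label positions yourself and draw each connector as two geom_segment pieces. The sketch below is only an illustration under stated assumptions: the peaks data frame, the elbow_layout() helper and its parameters (base_y, step, min_gap, arm) are hypothetical names that do not come from the question, and the greedy stacking rule is just one simple choice.

```r
# Hypothetical peak data standing in for `upper_labels` (all values are made up).
peaks <- data.frame(
  rel_pos  = c(10, 12, 13, 40, 70),
  logged_p = c(12, 9, 15, 20, 8),
  label    = c("GENE1\nrs1", "GENE2\nrs2", "GENE3\nrs3", "GENE4\nrs4", "GENE5\nrs5")
)

# Greedy stacking (an assumption, not a ggrepel feature): a peak that is closer
# than `min_gap` to the previous peak is placed one tier higher than that peak.
elbow_layout <- function(df, base_y, step, min_gap, arm) {
  df <- df[order(df$rel_pos), ]
  tier <- integer(nrow(df))
  for (i in seq_len(nrow(df))[-1]) {
    if (df$rel_pos[i] - df$rel_pos[i - 1] < min_gap) {
      tier[i] <- tier[i - 1] + 1L
    }
  }
  df$label_y <- base_y + tier * step  # height of the horizontal arm
  df$arm_end <- df$rel_pos + arm      # where the horizontal arm stops
  df
}

lay <- elbow_layout(peaks, base_y = 22, step = 3, min_gap = 5, arm = 4)

# Draw the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(lay, aes(x = rel_pos)) +
    geom_point(aes(y = logged_p)) +
    # vertical part: from the peak straight up to the label height
    geom_segment(aes(xend = rel_pos, y = logged_p, yend = label_y)) +
    # horizontal part: the short arm the label sits on
    geom_segment(aes(xend = arm_end, y = label_y, yend = label_y)) +
    geom_text(aes(y = label_y, label = label),
              hjust = 0, vjust = -0.2, size = 2, lineheight = 0.9) +
    theme_classic()
  print(p)
}
```

For a left-handed connector, a negative arm together with hjust = 1 in geom_text should mirror the layout. base_y needs to sit above the tallest peak so that the vertical parts do not cross other points.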

Related

How to present the results of a dataframe in a serial scale using ggplot as in the example attached?

I have this data frame :
Raw.Score = c(0,1,2,3,4,5,6,7,8)
Severity = c(-3.56553994,-2.70296933,-1.63969850,-0.81321707,-0.04629182,
0.73721320,1.61278518,2.76647043,3.94804472)
x = data.frame(Raw.Score = Raw.Score, Severity = Severity)
Raw.Score contains the raw scores from 0 to 8 (consider them the labels of the severity numbers).
Severity contains relative numbers that represent the locations of the scores in the diagram.
I want to present the results graphically, as in the following example, using ggplot (the example uses different numbers, but I want something similar).
As a fun exercise in ggplot-ing here is one approach to achieve or come close to your desired result.
Raw.Score = c(0,1,2,3,4,5,6,7,8)
Severity = c(-3.56553994,-2.70296933,-1.63969850,-0.81321707,-0.04629182,
0.73721320,1.61278518,2.76647043,3.94804472)
dat <- data.frame(Raw.Score, Severity)
library(ggplot2)
dat_tile <- data.frame(
Severity = seq(-4.1, 4.1, .05)
)
dat_axis <- data.frame(
Severity = seq(-4, 4, 2)
)
tile_height = .15
ymax <- .5
ggplot(dat, aes(y = 0, x = Severity, fill = Severity)) +
# Axis line
geom_hline(yintercept = -tile_height / 2) +
# Colorbar
geom_tile(data = dat_tile, aes(color = Severity), height = tile_height) +
# Segments connecting top and bottom labels
geom_segment(aes(xend = Severity, yend = -ymax, y = ymax), color = "orange") +
# Axis ticks aka dots
geom_point(data = dat_axis,
y = -tile_height / 2, shape = 21, stroke = 1, fill = "white") +
# ... and labels
geom_text(data = dat_axis, aes(label = Severity),
y = -tile_height / 2 - .1, vjust = 1, fontface = "bold") +
# Bottom labels
geom_label(aes(y = -ymax, label = scales::number(Severity, accuracy = .01))) +
# Top labels
geom_point(aes(y = ymax, color = Severity), size = 8) +
geom_text(aes(y = ymax, label = Raw.Score), fontface = "bold") +
# Colorbar annotations
annotate(geom = "text", fontface = "bold", label = "MILD", color = "black", x = -3.75, y = 0) +
annotate(geom = "text", fontface = "bold", label = "SEVERE", color = "white", x = 3.75, y = 0) +
# Fixing the scales
scale_x_continuous(expand = c(0, 0)) +
scale_y_continuous(limits = c(-ymax, ymax)) +
# Color gradient
scale_fill_gradient(low = "orange", high = "red", guide = "none") +
scale_color_gradient(low = "orange", high = "red", guide = "none") +
# Get rid of all non-data ink
theme_void() +
# Add some plot margin
theme(plot.margin = rep(unit(10, "pt"), 4)) +
coord_cartesian(clip = "off")

How to fill stat_ellipse with patterns with transparent backgrounds in R?

I am using stat_ellipse in R to generate ellipse area polygons from the data. However, they overlap significantly and turning the alpha level to transparent "kind of" works. I wanted to see if there was a way to fill specific ellipses with a pattern that has a transparent background since they overlap so much. Maybe I could have some solid colors and others patterns?
This is my working plot code now:
ggplot(data = claw3,
aes(x = iso1,
y = iso2,
fill = group,
lty = community,
shape = community)) +
stat_ellipse(aes(group = interaction(group, community),
lty = community),
alpha = 0.85, # transparency level, trying to make 2012 West more visible
color = "black",
level = p.ell,
type = "norm",
geom = "polygon",
size = 1.1) +
geom_point(aes(fill = group), size = 2, alpha = 1, color = "black") +
scale_fill_manual(values = c("blue", "grey30", "#00FFFFFF"), labels = c("2012", "2014", "2016")) +
scale_color_manual(values = c("blue", "grey30", "#00FFFFFF")) +
scale_linetype_manual(values = c("dotted","solid"))+
scale_shape_manual(values = c(21, 24))+
guides(shape = guide_legend(override.aes = list(fill = "white")), #overrides legend for the community boxes filled white
fill = guide_legend(override.aes = list(shape = NA, size = 1))) + #overrides legend for group removes shapes in year
ylab(expression(paste(delta^{15}, "N (\u2030)"))) +
xlab(expression(paste(delta^{13}, "C (\u2030)"))) +
scale_x_continuous(breaks= seq(-26.5, -19.5, by = 1),
#labels = c( -24, rep("", 2), -23, rep("", 2), -21),
limits = c(-26.5, -19.5),
expand = c(0, 0)) +
scale_y_continuous(breaks= seq(4, 11),
labels = c(4, "",6, "",8, "", 10, ""),
limits = c(3.5, 11),
expand = c(0, 0)) +
theme(text = element_text(size=14)) +
theme_classic(base_size = 14) +
theme(legend.title = element_blank())
I have recently found ggpattern, but it does not look like it is compatible with stat_ellipse, or I simply do not understand where to put it. I believe I would have to remove the scale_fill_manual and scale_color_manual calls, but that is about it.

CI/SD geom_ribbon() missing when zoomed in

I have an issue with geom_ribbon and I think this is a bug and not a feature.
I want to zoom in on the "interesting" part of my plot but I don't want ggplot to exclude anything just because the entire thing doesn't fit into the plot. For that I use coord_cartesian to do the limiting. And it works for lines and points and probably many other things (bars) but not for geom_ribbon. So here's an example:
# Load libraries
library(ggplot2)
# Create data:
set.seed(1234)
LineA=c(seq(1,20,0.1))
LineB=c(seq(1,25,0.1))
LineC=c(seq(1,19,0.1))
LineD=c(seq(1,60,0.1))
my_df=data.frame(Mean = c(sort(sample(LineA,40)),sort(sample(LineB,40)),sort(sample(LineC,40)),
sort(sample(LineD,40))))
my_df$Names=c(rep("Line-A",40),rep("Line-B",40),rep("Line-C",40),rep("Line-D",40))
my_df$SD=c(runif(n = 120, min = 1, max = 5),runif(n = 40, min = 1, max = 20))
my_df$Time=c(1:40,1:40,1:40,1:40)
my_df$Mean_low=my_df$Mean-my_df$SD
my_df$Mean_low[my_df$Mean_low<0]=0
my_df$Mean_hi=my_df$Mean+my_df$SD
head(my_df)
# Plot
# Ribbon visible:
ggplot(my_df, aes(x=Time, y=Mean)) + geom_line(aes(colour = Names), size = 1) +
geom_point(size = 2, aes(shape = Names, color = Names))+
geom_ribbon(aes(x = Time, y=NULL, ymin = Mean_low, ymax = Mean_hi, fill = Names),
show.legend = F, linetype = 0, alpha = 0.1, na.rm = T) +
geom_hline(yintercept = 20, linetype = "dotdash", color = "red", size = 1)+
theme_classic()+
scale_y_continuous("Mean value", breaks = seq(0, 100, 2), expand = expansion(mult = c(0, 0.01))) +
scale_x_continuous("Days", breaks = seq(0, max(my_df$Time),2),
expand = expansion(mult = c(0.01, 0.005))) +
coord_cartesian(ylim = c(0, 100), xlim = c(0, 50))
Here the ribbon is visible when all of it fits in the plot, but the ribbon for Line-D disappears completely when I limit the y axis, as shown below:
ggplot(my_df, aes(x=Time, y=Mean)) + geom_line(aes(colour = Names), size = 1) +
geom_point(size = 2, aes(shape = Names, color = Names))+
geom_ribbon(aes(x = Time, y=NULL, ymin = Mean_low, ymax = Mean_hi, fill = Names),
show.legend = F, linetype = 0, alpha = 0.1, na.rm = T) +
geom_hline(yintercept = 20, linetype = "dotdash", color = "red", size = 1)+
theme_classic()+
scale_y_continuous("Mean value", breaks = seq(0, 100, 2), expand = expansion(mult = c(0, 0.01))) +
scale_x_continuous("Days", breaks = seq(0, max(my_df$Time),2),
expand = expansion(mult = c(0.01, 0.005))) +
coord_cartesian(ylim = c(0, 30), xlim = c(0, 50))
I found only one workaround, also described here: Extended range in geom_ribbon. It consists of manually removing the data (setting values to NA) that would fall outside the limits, but that is a workaround and not a solution. The limiting/zooming works for most other geoms, so why not for geom_ribbon as well? Does anyone know a more elegant solution? Is it a bug? Should I report it to the ggplot2 developers?
Thank you!!
Installing and loading the ragg package [library(ragg)] makes the ribbons appear when the plot is exported/saved. The cut-off bands are still not visible when zooming in on the plot in the RStudio plot pane, though. It could be a bug in ggplot2.
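A variant of the workaround mentioned in the question is to clamp the ribbon edges to the zoomed range instead of setting them to NA, so the band is cut at the limit rather than dropped. This is only a sketch: the ribbon data frame below uses made-up values in place of my_df, clamp(), low_vis and hi_vis are hypothetical names, and it does not address whatever causes the ribbon to vanish in the first place.

```r
# Hypothetical values standing in for my_df$Mean_low / my_df$Mean_hi.
ribbon <- data.frame(
  Time     = 1:5,
  Mean_low = c(0, 2, 10, 25, 40),
  Mean_hi  = c(5, 12, 35, 60, 80)
)
y_limits <- c(0, 30)  # same range as coord_cartesian(ylim = c(0, 30))

# Clamp a numeric vector into [limits[1], limits[2]].
clamp <- function(x, limits) pmin(pmax(x, limits[1]), limits[2])

# Ribbon edges restricted to the visible range.
ribbon$low_vis <- clamp(ribbon$Mean_low, y_limits)
ribbon$hi_vis  <- clamp(ribbon$Mean_hi, y_limits)
print(ribbon)
```

The clamped columns would then replace the originals in the ribbon layer, for example geom_ribbon(aes(ymin = low_vis, ymax = hi_vis, fill = Names)), while the lines and points keep using the unmodified Mean.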

How to stop ggrepel labels moving between gganimate frames in R/ggplot2?

I would like to add labels to the end of lines in ggplot, avoid them overlapping, and avoid them moving around during animation.
So far I can put the labels in the right place and hold them static using geom_text, but the labels overlap, or I can prevent them overlapping using geom_text_repel but the labels do not appear where I want them to and then dance about once the plot is animated (this latter version is in the code below).
I thought a solution might involve effectively creating a static layer in ggplot (p1 below) then adding an animated layer (p2 below), but it seems not.
How do I hold some elements of a plot constant (i.e. static) in an animated ggplot? (In this case, the labels at the end of lines.)
Additionally, with geom_text the labels appear as I want them - at the end of each line, outside of the plot - but with geom_text_repel, the labels all move inside the plotting area. Why is this?
Here is some example data:
library(dplyr)
library(ggplot2)
library(gganimate)
library(ggrepel)
set.seed(99)
# data
static_data <- data.frame(
hline_label = c("fixed_label_1", "fixed_label_2", "fixed_label_3", "fixed_label_4",
"fixed_label_5", "fixed_label_6", "fixed_label_7", "fixed_label_8",
"fixed_label_9", "fixed_label_10"),
fixed_score = c(2.63, 2.45, 2.13, 2.29, 2.26, 2.34, 2.34, 2.11, 2.26, 2.37))
animated_data <- data.frame(condition = c("a", "b")) %>%
slice(rep(1:n(), each = 10)) %>%
group_by(condition) %>%
mutate(time_point = row_number()) %>%
ungroup() %>%
mutate(score = runif(20, 2, 3))
and this is the code I am using for my animated plot:
# colours for use in plot
condition_colours <- c("red", "blue")
# plot static background layer
p1 <- ggplot(static_data, aes(x = time_point)) +
scale_x_continuous(breaks = seq(0, 10, by = 2), expand = c(0, 0)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10), limits = c(2, 3), expand = c(0, 0)) +
# add horizontal line to show existing scores
geom_hline(aes(yintercept = fixed_score), alpha = 0.75) +
# add fixed labels to the end of lines (off plot)
geom_text_repel(aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0) +
coord_cartesian(clip = 'off') +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank(),
plot.margin = margin(5.5, 120, 5.5, 5.5))
# animated layer
p2 <- p1 +
geom_point(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
geom_line(data = animated_data,
aes(x = time_point, y = score, colour = condition, group = condition),
show.legend = FALSE) +
scale_color_manual(values = condition_colours) +
geom_segment(data = animated_data,
aes(xend = time_point, yend = score, y = score, colour = condition),
linetype = 2) +
geom_text(data = animated_data,
aes(x = max(time_point) + 1, y = score, label = condition, colour = condition),
hjust = 0, size = 4) +
transition_reveal(time_point) +
ease_aes('linear')
# render animation
animate(p2, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)
Suggestions for consideration:
The specific repelling direction / amount / etc. in geom_text_repel is determined by a random seed. You can set seed to a constant value in order to get the same repelled positions in each frame of animation.
I don't think it's possible for repelled text to go beyond the plot area, even if you turn off clipping & specify some repel range outside plot limits. The whole point of that package is to keep text labels away from one another while remaining within the plot area. However, you can extend the plot area & use geom_segment instead of geom_hline to plot the horizontal lines, such that these lines stop before they reach the repelled text labels.
Since there are more geom layers using animated_data as their data source, it would be cleaner to put animated_data & associated common aesthetic mappings in the top level ggplot() call, rather than static_data.
Here's a possible implementation. Explanation in annotations:
p3 <- ggplot(animated_data,
aes(x = time_point, y = score, colour = condition, group = condition)) +
# static layers (assuming 11 is the desired ending point)
geom_segment(data = static_data,
aes(x = 0, xend = 11, y = fixed_score, yend = fixed_score),
inherit.aes = FALSE, colour = "grey25") +
geom_text_repel(data = static_data,
aes(x = 11, y = fixed_score, label = hline_label),
hjust = 0, size = 4, direction = "y", box.padding = 1.0, inherit.aes = FALSE,
seed = 123, # set a constant random seed
xlim = c(11, NA)) + # specify repel range to be from 11 onwards
# animated layers (only specify additional aesthetic mappings not mentioned above)
geom_point() +
geom_line() +
geom_segment(aes(xend = time_point, yend = score), linetype = 2) +
geom_text(aes(x = max(time_point) + 1, label = condition),
hjust = 0, size = 4) +
# static aesthetic settings (limits / expand arguments are specified in coordinates
# rather than scales, margin is no longer specified in theme since it's no longer
# necessary)
scale_x_continuous(breaks = seq(0, 10, by = 2)) +
scale_y_continuous(breaks = seq(2, 3, by = 0.10)) +
scale_color_manual(values = condition_colours) +
coord_cartesian(xlim = c(0, 13), ylim = c(2, 3), expand = FALSE) +
guides(colour = "none") +
labs(title = "[Title Here]", x = "Time", y = "Mean score") +
theme_minimal() +
theme(panel.grid.minor = element_blank()) +
# animation settings (unchanged)
transition_reveal(time_point) +
ease_aes('linear')
animate(p3, nframes = 50, end_pause = 5, height = 1000, width = 1250, res = 120)

Blend Colours from Different Layers

I'm trying to create a Venn diagram where each circle has a unique colour, and the intersections blend those colours.
I can make the circles with the ggforce package. And I can blend the colours by setting the alpha values to, say, 0.6:
library(ggplot2)
library(ggforce)
propositions <- data.frame(
cirx = c(-.75 , .75),
ciry = c(0 , 0),
r = c(1.5 , 1.5),
labx = c(-2.25 , 2.25),
laby = c(1 , 1),
labl = c("A", "B")
)
ggplot(propositions) +
theme_void() + coord_fixed() +
xlim(-3,3) + ylim(-2,2) +
theme(panel.border = element_rect(colour = "black", fill = NA, size = 1)) +
geom_circle(aes(x0 = cirx, y0 = ciry, r = r), fill = "red", alpha = .6, data = propositions[1,]) +
geom_circle(aes(x0 = cirx, y0 = ciry, r = r), fill = "blue", alpha = .6, data = propositions[2,]) +
geom_text(aes(x = labx, y = laby, label = labl),
fontface = "italic", size = 10, family = "serif")
But the results are pretty poor:
The colours are washed out, and the intersection's colour isn't as distinct from the right-side circle's as I'd like. I want something closer to this (photoshopped) result:
I could do this if there was some way to designate and fill the intersection. In principle, this could be done with geom_ribbon(), I think. But that seems painful, and hacky. So I'm hoping for a more elegant solution.
Here's the workaround using geom_ribbon(). It's not a proper solution though, since it won't generalize to other shapes and intersections without manually redefining the boundaries of the ribbon, which can get real hairy fast.
There's gotta be a way to get ggplot2 to automatically do the work of blending colours across layers, right?
library(ggplot2)
library(ggforce)
x <- seq(-.75, .75, 0.01)
upper <- function(x) {
a <- sqrt(1.5^2 - (x[x < 0] - .75)^2)
b <- sqrt(1.5^2 - (x[x >= 0] + .75)^2)
c(a,b)
}
lower <- function(x) {
-upper(x)
}
ggplot() +
coord_fixed() + theme_void() +
xlim(-3,3) + ylim(-2,2) +
geom_circle(aes(x0 = -.75, y0 = 0, r = 1.5), fill = "red") +
geom_circle(aes(x0 = .75, y0 = 0, r = 1.5), fill = "blue") +
geom_ribbon(aes(x = x, ymin = lower(x), ymax = upper(x)), fill = "purple", colour = "black") +
theme(panel.border = element_rect(colour = "black", fill = NA, size = 1)) +
geom_text(aes(x = c(-2.25, 2.25), y = c(1, 1), label = c("A", "B")),
fontface = "italic", size = 10, family = "serif")
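The "purple" fill in the ribbon workaround is picked by hand. If the intersection colour should instead be derived from the two circle colours, one simple option is a channel-wise average in RGB space. This is a minimal sketch using only base R; mix_colours() is a hypothetical helper, and a plain RGB average is just one of several possible blending rules, not something ggplot2 does on its own.

```r
# Average two colours channel by channel in RGB space (a simple linear mix).
# `weight` is the share of col_a; 0.5 gives an even blend.
mix_colours <- function(col_a, col_b, weight = 0.5) {
  rgb_a <- grDevices::col2rgb(col_a)[, 1]
  rgb_b <- grDevices::col2rgb(col_b)[, 1]
  mixed <- unname(weight * rgb_a + (1 - weight) * rgb_b)
  grDevices::rgb(mixed[1], mixed[2], mixed[3], maxColorValue = 255)
}

intersection_fill <- mix_colours("red", "blue")
print(intersection_fill)
```

The result can be passed as fill = intersection_fill to the geom_ribbon() layer in place of the hard-coded "purple".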
