How to label lines obtained using split with ggrepel

I am trying to label four lines, grouped by the value of the variable cc, using ggrepel. Each plot shows all four labels instead of only the two that belong to it. How can I fix this?
In this example the labels sit at the last date, but I want something more flexible: each of the four labels placed at a point that I choose (e.g. b at date 1, a at date 2, etc.). How can I do that?
library(tidyverse)
library(ggrepel)
library(cowplot)
set.seed(1234)
df <- tibble(date = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
country = rep(c('a','b','c','d'),4),
value = runif(16),
cc = rep(c(1,1,2,2),4))
df$cc <- as.factor(df$cc)
# make list of plots
ggList <- lapply(split(df, df$cc), function(i) {
ggplot(i, aes(x = date, y = value, color = country)) +
geom_line(lwd = 1.1) +
geom_text_repel(data = subset(df, date == 4),
aes(label = country)) +
theme(legend.position = "none")
})
# plot as grid in 1 columns
cowplot::plot_grid(plotlist = ggList, ncol = 1,
align = 'v', labels = levels(df$cc))
Created on 2021-08-18 by the reprex package (v2.0.0)

Here I make a tibble to hold color and label-position preferences, and join it to df.
The geom_text_repel call should use i instead of df, so that the labels are split the same way as the lines. The one catch is that we then have to specify the four colors up front, since otherwise each chart would just use the two colors it needs.
set.seed(1234)
df <- tibble(date = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
country = rep(c('a','b','c','d'),4),
value = runif(16),
cc = rep(c(1,1,2,2),4))
label_pos <- tibble(country = letters[1:4],
label_pos = c(2, 1, 3, 2),
color = RColorBrewer::brewer.pal(4, "Set2")[1:4])
df <- df %>% left_join(label_pos, by = "country")
df$cc <- as.factor(df$cc)
# make list of plots
ggList <- lapply(split(df, df$cc), function(i) {
ggplot(i, aes(x = date, y = value, color = color)) +
geom_line(lwd = 1.1) +
geom_text_repel(data = subset(i, date == label_pos),
aes(label = country), box.padding = unit(0.02, "npc"), direction = "y") +
scale_color_identity() +
theme(legend.position = "none")
})
# plot as grid in 1 columns
cowplot::plot_grid(plotlist = ggList, ncol = 1,
align = 'v', labels = levels(df$cc))
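As a quick check of which rows the label layer receives once it is subset from i rather than df, here is a minimal base-R sketch. It is only an illustration, not part of the answer above: it uses merge() in place of left_join() and a plain data.frame, and label_rows is a name made up for this example.

```r
# Same toy data as above, built with base R only (illustrative sketch)
set.seed(1234)
df <- data.frame(date = rep(1:4, each = 4),
                 country = rep(c("a", "b", "c", "d"), 4),
                 value = runif(16),
                 cc = factor(rep(c(1, 1, 2, 2), 4)),
                 stringsAsFactors = FALSE)

# One preferred label date per country
label_pos <- data.frame(country = letters[1:4],
                        label_pos = c(2, 1, 3, 2),
                        stringsAsFactors = FALSE)
df <- merge(df, label_pos, by = "country")

# For each split, keep only the rows where the date matches the preferred date
label_rows <- lapply(split(df, df$cc), function(i) subset(i, date == label_pos))

# Number of label rows per plot: two for each level of cc
sapply(label_rows, nrow)
```

Each element of label_rows holds exactly the two countries drawn in that plot, which is why subsetting i gives two labels per panel while subsetting df gives four.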

Related

Adding geom_line between data points with different geom_boxplot fill variable

Hi I have a much larger data frame but a sample dummy df is as follows:
set.seed(23)
df = data.frame(name = c(rep("Bob",8),rep("Tom",8)),
topic = c(rep(c("Reading","Writing"),8)),
subject = c(rep(c("English","English","Spanish","Spanish"),4)),
exam = c(rep("First",4),rep("Second",4),rep("First",4),rep("Second",4)),
score = sample(1:100,16))
I have to plot it in the way shown in the picture below (for my original data frame), but with lines connecting each name's scores between the First and Second levels of the exam variable. I tried geom_line(aes(group = name)), but the lines do not connect the right points. Is there a way to connect the points that also respects the grouping by the fill variable, similar to how position_dodge() separates the points by their fill group? Thanks a lot!
library(ggplot2)
library(dplyr) # provides %>%
df %>% ggplot(aes(x=topic,y=score,fill=exam)) +
geom_boxplot(outlier.shape = NA) +
geom_point(size=1.75,position = position_dodge(width = 0.75)) +
facet_grid(~subject,switch = "y")
One option to achieve your desired result would be to group the lines by name and topic and dodge the lines manually instead of relying on position_dodge. To this end, convert topic to a numeric for the geom_line and shift each x position by a quarter of the dodge width (0.75 / 4), to the left for the first exam and to the right for the second, so that the lines align with the dodged points:
set.seed(23)
df <- data.frame(
name = c(rep("Bob", 8), rep("Tom", 8)),
topic = c(rep(c("Reading", "Writing"), 8)),
subject = c(rep(c("English", "English", "Spanish", "Spanish"), 4)),
exam = c(rep("First", 4), rep("Second", 4), rep("First", 4), rep("Second", 4)),
score = sample(1:100, 16)
)
library(ggplot2)
ggplot(df, aes(x = topic, y = score, fill = exam)) +
geom_boxplot(outlier.shape = NA) +
geom_point(size = 1.75, position = position_dodge(width = 0.75)) +
geom_line(aes(
x = as.numeric(factor(topic)) + .75 / 4 * ifelse(exam == "First", -1, 1),
group = interaction(name, topic)
)) +
facet_grid(~subject, switch = "y")
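The offset arithmetic can be checked on its own. The sketch below is my own illustration, assuming two fill groups and a dodge width of 0.75 as in the plot above; dodge_width, n_groups and x_line are names invented for this example.

```r
# Manual dodge: with n groups sharing a dodge width w, each group gets a
# slot of width w / n, so for two groups the centres sit at x -/+ w / 4.
dodge_width <- 0.75
n_groups <- 2
offset <- dodge_width / (2 * n_groups)

topic <- c("Reading", "Reading", "Writing", "Writing")
exam  <- c("First", "Second", "First", "Second")

# Discrete x positions are 1, 2, ...; shift left for "First", right for "Second"
x_line <- as.numeric(factor(topic)) + offset * ifelse(exam == "First", -1, 1)
x_line
# 0.8125 1.1875 1.8125 2.1875
```

With a different number of fill groups the offsets would no longer be a simple plus or minus, so the ifelse() would need to be replaced by a per-group offset.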

Avoid overlap of points on a timeline (1-D repelling)

I want to create a timeline plot that roughly resembles the example below: lots of overlap at some points, not a lot of overlap at others.
What I need: overlapping images should repel each other where necessary, eliminating or reducing overlap. Ideally I'd be able to implement either a vertical or horizontal repel.
library(tidyverse)
library(ggimage)
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = TRUE))
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = date, y = group, image = img, group = group), asp = 1)
Something similar to the repelling in ggbeeswarm::geom_beeswarm or ggrepel::geom_text_repel would be nice, but those don't support images. So I think I need to pre-apply some kind of 1-dimensional packing algorithm, implementing iterative pair-wise repulsion on my vector of dates within each group, to try to find a non-overlapping arrangement.
Any ideas? Thank you so much!
Created on 2021-10-30 by the reprex package (v2.0.1)
Here is the solution I’ve been able to come up with: it repurposes the circleRepelLayout function from the awesome packcircles package
into a repel_vector function that takes your overlapping vector and a "repel_radius" and returns, if possible, a non-overlapping version.
I demonstrate the solution with the richtext geom since this is a geom I’ve always wished had repel functionality.
library(packcircles)
library(tidyverse)
library(ggtext)
library(ggimage)
repel_vector <- function(vector, repel_radius = 1, repel_bounds = range(vector)){
stopifnot(is.numeric(vector))
repelled_vector <-
packcircles::circleRepelLayout(x = data.frame(vector, ypos = 1, repel_radius),
xysizecols = c("vector", "ypos", "repel_radius"),
xlim = repel_bounds, ylim = c(0,1),
wrap = FALSE) %>%
as.data.frame() %>%
.$layout.x
return(repelled_vector)
}
overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
repelled_vec_default <- repel_vector(overlapping_vec)
repelled_vec_tighter <- repel_vector(overlapping_vec, repel_radius = 0.35)
ggplot() +
annotate("richtext", x = overlapping_vec, y = 3, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_default, y = 2, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_tighter, y = 1, label = "**test**", alpha = 0.5) +
scale_y_continuous(breaks = 1:3, labels = c("Tighter repel", "Default repel", "Overlapping points"))
In theory you could apply this to 2D repelling as well.
To solve the problem in my question, this can be applied like so:
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = TRUE)) %>%
group_by(group) %>%
mutate(repelled_date = repel_vector(as.numeric(date),
repel_radius = 4,
repel_bounds = range(as.numeric(date)) + c(-3,3)),
repelled_date = as.Date(repelled_date, origin = "1970-01-01"))
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = repelled_date, y = group, image = img, group = group), asp = 1)
Created on 2021-10-30 by the reprex package (v2.0.1)
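For comparison, the "1-dimensional packing" idea from the question can also be sketched without any packages. This is a much cruder technique than packcircles::circleRepelLayout and is not what the answer above uses: it is a single left-to-right sweep that pushes each point right until it is at least min_gap away from its predecessor, so points only ever move to the right and no bounds are respected. The function name spread_min_gap and its arguments are made up for this illustration.

```r
# Greedy 1-D spreading: enforce a minimum gap between sorted neighbours.
# Illustrative sketch only; points are only ever pushed to the right.
spread_min_gap <- function(x, min_gap) {
  stopifnot(is.numeric(x), min_gap >= 0)
  ord <- order(x)
  out <- x[ord]
  for (k in seq_along(out)[-1]) {
    # keep the point where it is unless it is too close to the previous one
    out[k] <- max(out[k], out[k - 1] + min_gap)
  }
  result <- numeric(length(x))
  result[ord] <- out   # put values back in the original order
  result
}

overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
spread_vec <- spread_min_gap(overlapping_vec, min_gap = 0.5)
spread_vec
```

Inside a grouped mutate() it could be called on as.numeric(date) in the same way as repel_vector, but the iterative circle repulsion above is likely to give a more balanced layout because it moves points in both directions.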

Combine text and image in a geom_label_repel in ggplot

I'm trying to do a line graph and have the last point of each series be labelled by a combination of text and image. I usually use ggrepel package for this and have no problem doing this with text only. My problem is I can't figure out how to add an image in the label.
I thought that a label like Country <img src='https://link.com/to/flag.png' width='20'/> would work and so this is what I've tried to do:
library(dplyr)
library(ggplot2)
library(ggrepel)
# example df
df <- data.frame(
Country = c(rep("France", 5), rep("United Kingdom", 5)),
Ratio = rnorm(10),
Days = c(seq(1, 5, 1), seq(4, 8, 1)),
abbr = c(rep("FR", 5), rep("GB", 5))) %>%
group_by(Country) %>%
# add "label" only to last point of the graph
mutate(label = if_else(Days == max(Days),
# combine text and img of country's flag
true = paste0(Country, " <img src='https://raw.githubusercontent.com/behdad/region-flags/gh-pages/png/", abbr, ".png' width='20'/>"),
false = NA_character_)
)
# line graph
ggplot(data = df, aes(x = Days, y = Ratio, color = Country)) +
geom_line(size = 1) +
theme(legend.position = "none") +
geom_label_repel(aes(label = label),
nudge_x = 1,
na.rm = T)
But this prints the raw label string, HTML tag included, instead of the country's name followed by its flag as intended.
This is obviously not the way to go. Can anyone please help me?
Try this approach using the geom_richtext() function from ggtext. You can customize other elements if you wish. Here is the code:
library(dplyr)
library(ggplot2)
library(ggrepel)
library(ggtext)
# example df
df <- data.frame(
Country = c(rep("France", 5), rep("United Kingdom", 5)),
Ratio = rnorm(10),
Days = c(seq(1, 5, 1), seq(4, 8, 1)),
abbr = c(rep("FR", 5), rep("GB", 5))) %>%
group_by(Country) %>%
# add "label" only to last point of the graph
mutate(label = if_else(Days == max(Days),
# combine text and img of country's flag
true = paste0(Country, " <img src='https://raw.githubusercontent.com/behdad/region-flags/gh-pages/png/", abbr, ".png' width='20'/>"),
false = NA_character_)
)
# line graph
ggplot(data = df, aes(x = Days, y = Ratio, color = Country,label = label)) +
geom_line(size = 1) +
theme(legend.position = "none") +
geom_richtext(na.rm = T,nudge_x = -0.1,nudge_y = -0.1)

R ggplot facet_wrap y ticks on different sides

For some reason, I have to make a plot that looks more or less like this:
For this I use the following code:
library(ggplot2)
library(tidyverse)
set.seed(10)
df<-data.frame(Meas = runif(1000,0,10),
Prop1 = sample(x = LETTERS[1:3],1000,replace=TRUE),
Prop2 = sample(x = letters[1:5],1000,replace=TRUE),
Prop3 = sample(x=c("monkey","donkey","flipper"),1000,replace=TRUE))%>%
gather(Prop,Propvalue,-Meas)
ggplot(df,aes(x = Propvalue,y=Meas))+
geom_boxplot()+
facet_wrap(~Prop,ncol=2,scales="free_y")+
coord_flip()
I believe this would look better if the y-axis ticks of the right-hand panels appeared on the right. For the panels on the left, the ticks should remain where they are, but flipper and donkey should appear on the right side to avoid the gap between the left and right panels. I can't find a way to do this.
Here's a hack that utilises the sec.axis argument of ggplot2's continuous scales, which creates a secondary axis opposite the primary axis & has to be a one-to-one mapping of it. I call this a hack because it works only for continuous axes, so we need to map the categorical Propvalue to numeric values.
Note: I assumed in this example that you want all odd numbered PropX facets' labels on the left, & even numbered PropX facets' labels on the right. You can also tweak the options for other variations.
library(ggplot2)
library(tidyverse)
# generate data
set.seed(10)
df<-data.frame(Meas = runif(1000,0,10),
Prop1 = sample(x = LETTERS[1:3],1000,replace=TRUE),
Prop2 = sample(x=c("monkey","donkey","flipper"),1000,replace=TRUE),
Prop3 = sample(x = letters[1:5],1000,replace=TRUE))%>%
gather(Prop,Propvalue,-Meas)
# map Propvalue to integers, primary axis contents, & secondary axis contents.
df2 <- df %>%
mutate(Propvalue.int = as.integer(factor(Propvalue,
levels = df %>% select(Prop, Propvalue) %>%
arrange(Prop, Propvalue) %>% unique() %>%
select(Propvalue) %>% unlist())),
facet.column = ifelse(as.integer(str_extract(Prop, "[0-9]")) %% 2 == 0, 2, 1),
Propvalue.left = ifelse(facet.column == 1, Propvalue, ""),
Propvalue.right = ifelse(facet.column == 2, Propvalue, ""))
# create mapping table
integer2factor <- df2 %>%
select(Propvalue.int, Propvalue.left, Propvalue.right) %>%
unique() %>% arrange(Propvalue.int)
ggplot(df2,aes(x = Propvalue.int, y=Meas,
group = Propvalue.int))+
geom_boxplot() +
scale_x_continuous(breaks = integer2factor$Propvalue.int,
labels = integer2factor$Propvalue.left,
name = "Propvalue",
sec.axis = dup_axis(breaks = integer2factor$Propvalue.int,
labels = integer2factor$Propvalue.right,
name = "")) +
facet_wrap(~Prop,ncol=2,scales="free")+
coord_flip() +
theme(axis.ticks.y = element_blank())
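To make the mapping step easier to follow, here is a small base-R sketch of the same idea on a hand-made lookup table. It is an illustration only: the props data frame and its five rows are invented, and gsub() stands in for str_extract().

```r
# One row per (facet, category) pair; a tiny stand-in for the real data
props <- data.frame(Prop = c("Prop1", "Prop1", "Prop2", "Prop2", "Prop3"),
                    Propvalue = c("A", "B", "donkey", "monkey", "a"),
                    stringsAsFactors = FALSE)

# Give every category its own integer position on a continuous axis
props$Propvalue.int <- as.integer(factor(props$Propvalue,
                                         levels = unique(props$Propvalue)))

# Odd-numbered facets go in column 1, even-numbered facets in column 2
facet_number <- as.integer(gsub("[^0-9]", "", props$Prop))
props$facet.column <- ifelse(facet_number %% 2 == 0, 2, 1)

# A label is shown on exactly one side and left blank on the other
props$Propvalue.left  <- ifelse(props$facet.column == 1, props$Propvalue, "")
props$Propvalue.right <- ifelse(props$facet.column == 2, props$Propvalue, "")
props
```

The left labels feed the primary axis and the right labels feed the duplicated secondary axis, so each break is labelled on only one side of the panel.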
I believe this will do the trick.
library(ggplot2)
library(tidyverse)
library(tidyr)
set.seed(10)
df <-data.frame(Meas = runif(1000,0,10),
Prop1 = sample(x = LETTERS[1:3],1000,replace=TRUE),
Prop2 = sample(x = letters[1:5],1000,replace=TRUE),
Prop3 = sample(x=c("monkey","donkey","flipper"),1000,replace=TRUE))%>%
gather(Prop,Propvalue,-Meas)
ggplot(df,aes(x = Propvalue,y=Meas))+
geom_boxplot()+
facet_wrap(~Prop,ncol=2,scales="free_y")+
coord_flip()
p.list = lapply(sort(unique(df$Prop)), function(i) { # i <- "Prop1"
ggplot(df[df$Prop==i,],aes(x = Propvalue, y=Meas))+
geom_boxplot()+
facet_wrap(~Prop,scales="free_y")+
coord_flip()
})
p.list[[2]] <- p.list[[2]] + scale_x_discrete(position = "top")
library(gridExtra)
do.call(grid.arrange, c(p.list, nrow=2))

Color one point and add an annotation in ggplot2?

I have a data frame a with three columns: GeneName, Index1, Index2.
I draw a scatterplot like this:
ggplot(a, aes(log10(Index1+1), Index2)) +geom_point(alpha=1/5)
Then I want to color the point whose GeneName is "G1" and add a text box near that point. What might be the easiest way to do it?
You could create a subset containing just that point and then add it to the plot:
# create the subset
g1 <- subset(a, GeneName == "G1")
# plot the data
ggplot(a, aes(log10(Index1+1), Index2)) + geom_point(alpha=1/5) + # this is the base plot
geom_point(data=g1, colour="red") + # this adds a red point
geom_text(data=g1, label="G1", vjust=1) # this adds a label for the red point
NOTE: Since everyone keeps up-voting this question, I thought I would make it easier to read.
Something like this should work. You may need to mess around with the x and y arguments to geom_text().
library(ggplot2)
highlight.gene <- "G1"
set.seed(23456)
a <- data.frame(GeneName = paste("G", 1:10, sep = ""),
Index1 = runif(10, 100, 200),
Index2 = runif(10, 100, 150))
a$highlight <- ifelse(a$GeneName == highlight.gene, "highlight", "normal")
textdf <- a[a$GeneName == highlight.gene, ]
mycolours <- c("highlight" = "red", "normal" = "grey50")
a
textdf
ggplot(data = a, aes(x = Index1, y = Index2)) +
geom_point(size = 3, aes(colour = highlight)) +
scale_color_manual("Status", values = mycolours) +
geom_text(data = textdf, aes(x = Index1 * 1.05, y = Index2, label = "my label")) +
theme(legend.position = "none")
