I'm trying to draw a line graph in which the last point of each series is labelled with a combination of text and an image. I usually use the ggrepel package for this and have no problem with text-only labels. My problem is that I can't figure out how to add an image to the label.
I thought that a label like Country <img src='https://link.com/to/flag.png' width='20'/> would work, so this is what I tried:
library(dplyr)
library(ggplot2)
library(ggrepel)
# example df
df <- data.frame(
Country = c(rep("France", 5), rep("United Kingdom", 5)),
Ratio = rnorm(10),
Days = c(seq(1, 5, 1), seq(4, 8, 1)),
abbr = c(rep("FR", 5), rep("GB", 5))) %>%
group_by(Country) %>%
# add "label" only to last point of the graph
mutate(label = if_else(Days == max(Days),
# combine text and img of country's flag
true = paste0(Country, " <img src='https://raw.githubusercontent.com/behdad/region-flags/gh-pages/png/", abbr, ".png' width='20'/>"),
false = NA_character_)
)
# line graph
ggplot(data = df, aes(x = Days, y = Ratio, color = Country)) +
geom_line(size = 1) +
theme(legend.position = "none") +
geom_label_repel(aes(label = label),
nudge_x = 1,
na.rm = T)
But this renders the raw label string rather than the country's name followed by its flag:
This is obviously not the way to go. Can anyone please help me?
Try this approach using the geom_richtext() function from the ggtext package, which renders the <img> tag instead of printing it. You can customize other elements if you wish. Here is the code:
library(dplyr)
library(ggplot2)
library(ggrepel)
library(ggtext)
# example df
df <- data.frame(
Country = c(rep("France", 5), rep("United Kingdom", 5)),
Ratio = rnorm(10),
Days = c(seq(1, 5, 1), seq(4, 8, 1)),
abbr = c(rep("FR", 5), rep("GB", 5))) %>%
group_by(Country) %>%
# add "label" only to last point of the graph
mutate(label = if_else(Days == max(Days),
# combine text and img of country's flag
true = paste0(Country, " <img src='https://raw.githubusercontent.com/behdad/region-flags/gh-pages/png/", abbr, ".png' width='20'/>"),
false = NA_character_)
)
# line graph
ggplot(data = df, aes(x = Days, y = Ratio, color = Country, label = label)) +
geom_line(size = 1) +
theme(legend.position = "none") +
geom_richtext(na.rm = TRUE, nudge_x = -0.1, nudge_y = -0.1)
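The label column is the part that does the work here, so it may help to look at it in isolation. Below is a minimal base R sketch of how the rich-text strings could be built for the last point of each series only. The helper name flag_label and its width and base_url arguments are made up for illustration; only the URL pattern comes from the code above.

```r
# Hypothetical helper (not part of ggtext or ggrepel): builds "Name <img .../>" strings.
flag_label <- function(country, abbr, width = 20,
                       base_url = "https://raw.githubusercontent.com/behdad/region-flags/gh-pages/png/") {
  sprintf("%s <img src='%s%s.png' width='%d'/>", country, base_url, abbr, width)
}

df <- data.frame(
  Country = rep(c("France", "United Kingdom"), each = 5),
  Days    = c(1:5, 4:8),
  abbr    = rep(c("FR", "GB"), each = 5)
)

# TRUE only for the row holding the maximum Days value within each Country
is_last <- df$Days == ave(df$Days, df$Country, FUN = max)

# Label the last point of each series, leave every other row as NA
df$label <- ifelse(is_last, flag_label(df$Country, df$abbr), NA_character_)
```

The resulting label column can be mapped to the label aesthetic of geom_richtext() exactly as in the answer; rows with NA are dropped by na.rm = TRUE.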
For some reason, when producing a plotly graph with the ggplotly function, filtering via the legend does not resize the y-axis. The filtered portions are simply removed, while the y-axis stays at its original length. Please see this example:
library(plotly)
library(ggplot2)
library(dplyr)
lab <- paste("Vertical Label", c(1, 2, 3, 4, 5))
ds <- data.frame(x = sample(lab, size = 1000, replace = T),
y = sample(LETTERS[1:5], size = 1000, replace = T)) %>%
group_by(x,y) %>% summarise(count= n())
ggplotly(
ggplot(ds, aes(x = x,y=count, fill = y)) +
geom_col() +
theme(axis.text.x = element_text(angle = 90))
)
The same approach works with the plot_ly function. However, I need the same result with ggplotly:
plot_ly(ds, x = ~x, y = ~count, type = 'bar', color = ~y
) %>% layout(title = "Vertical Axis Lables",
xaxis = list(title = ""),
yaxis = list(title = ""), barmode = 'stack')
I couldn't find anything helpful on Stack Overflow or Google. I only came across an incomplete answer here:
https://community.rstudio.com/t/ggplotly-bar-chart-not-resizing-after-filtering/115675/3
Any help will be greatly appreciated.
Applying a tip from "R Plotly Legend Filtering" enables re-stacking and similar ordering, while enabling auto-scaling lets the y-axis adapt:
library(plotly)
library(ggplot2)
library(dplyr)
lab <- paste("Vertical Label", c(1, 2, 3, 4, 5))
ds <- data.frame(x = sample(lab, size = 1000, replace = T),
y = sample(LETTERS[1:5], size = 1000, replace = T)) %>%
group_by(x,y) %>% summarise(count= n())
p <- ggplotly(
ggplot(ds, aes(x = x,y=count, fill = y)) +
geom_col() +
theme(axis.text.x = element_text(angle = 90))
)
for (i in 1:length(p$x$data)) {
p$x$data[[i]]$base <- c()
tmp <- p$x$data[[i]]
p$x$data[[i]] <- p$x$data[[length(p$x$data) - i + 1]]
p$x$data[[length(p$x$data) - i + 1]] <- tmp
}
p
It is only necessary to reset the base of each trace stored in p$x$data; no reordering is needed:
for (i in 1:length(p$x$data)) {
p$x$data[[i]]$base <- c()
}
In the first example above, the additional reordering step causes two problems: (1) D does not resize, and (2) purple overlays A (A is never seen unless purple is filtered out).
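One plausible explanation for this can be checked in base R with a mock list standing in for p$x$data (the trace names A to E and the base = 0 values are assumptions for illustration, not real plotly output). Because the swap loop visits every index, each pair is swapped twice, so the order ends up unchanged, and the traces sitting in the last slots never have their base reset.

```r
# Mock stand-in for p$x$data: five traces, each carrying a `base` field
traces <- lapply(LETTERS[1:5], function(nm) list(name = nm, base = 0))

n <- length(traces)
for (i in seq_len(n)) {
  traces[[i]]$base <- NULL              # same effect as `<- c()`: drops the field
  tmp <- traces[[i]]                    # swap slot i with slot n - i + 1 ...
  traces[[i]] <- traces[[n - i + 1]]
  traces[[n - i + 1]] <- tmp            # ... which happens twice per pair overall
}

trace_order <- vapply(traces, function(tr) tr$name, character(1))
has_base    <- vapply(traces, function(tr) !is.null(tr$base), logical(1))
names(has_base) <- trace_order

trace_order  # order after the loop
has_base     # which traces still carry a `base` field
```

In this mock run the order comes back as A to E and only D and E keep their base, which would be consistent with D not resizing. Resetting base in a plain loop, without the swap, touches every trace.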
I want to create a timeline plot that roughly resembles the example below: lots of overlap at some points, not a lot of overlap at others.
What I need: overlapping images should repel each other where necessary, eliminating or reducing overlap. Ideally I'd be able to implement either a vertical or horizontal repel.
library(tidyverse)
library(ggimage)
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = T) )
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = date, y = group, image = img, group = group), asp = 1)
Something similar to the repelling in ggbeeswarm::geom_beeswarm or ggrepel::geom_text_repel would be nice, but those don't support images. So I think I need to pre-apply some kind of 1-dimensional packing algorithm, implementing iterative pair-wise repulsion on my vector of dates within each group, to try to find a non-overlapping arrangement.
Any ideas? Thank you so much!
Created on 2021-10-30 by the reprex package (v2.0.1)
Here is the solution I’ve been able to come up with, repurposing the circleRepelLayout function from the awesome packcircles package into a repel_vector function that takes in your overlapping vector and a "repel_radius", and returns, if possible, a non-overlapping version.
I demonstrate the solution with the richtext geom since this is a geom I’ve always wished had repel functionality.
library(packcircles)
library(tidyverse)
library(ggtext)
library(ggimage)
repel_vector <- function(vector, repel_radius = 1, repel_bounds = range(vector)){
stopifnot(is.numeric(vector))
repelled_vector <-
packcircles::circleRepelLayout(x = data.frame(vector, ypos = 1, repel_radius),
xysizecols = c("vector", "ypos", "repel_radius"),
xlim = repel_bounds, ylim = c(0,1),
wrap = FALSE) %>%
as.data.frame() %>%
.$layout.x
return(repelled_vector)
}
overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
repelled_vec_default <- repel_vector(overlapping_vec)
repelled_vec_tighter <- repel_vector(overlapping_vec, repel_radius = 0.35)
ggplot() +
annotate("richtext", x = overlapping_vec, y = 3, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_default, y = 2, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_tighter, y = 1, label = "**test**", alpha = 0.5) +
scale_y_continuous(breaks = 1:3, labels = c("Tighter repel", "Default repel", "Overlapping points"))
In theory you could apply this to 2D repelling as well.
To solve the problem in my question, this can be applied like so:
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = T) ) %>%
group_by(group) %>%
mutate(repelled_date = repel_vector(as.numeric(date),
repel_radius = 4,
repel_bounds = range(as.numeric(date)) + c(-3,3)),
repelled_date = as.Date(repelled_date, origin = "1970-01-01"))
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = repelled_date, y = group, image = img, group = group), asp = 1)
Created on 2021-10-30 by the reprex package (v2.0.1)
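For comparison, the "iterative pair-wise repulsion" idea mentioned in the question can also be sketched in base R without packcircles. This is only a rough illustration under stated assumptions, not what circleRepelLayout does internally: the names repel_1d, min_gap, max_iter and tol are made up here, and the loop simply pushes neighbouring values apart symmetrically until every gap is at least min_gap or the iteration limit is reached.

```r
# Hypothetical 1-D repel: push neighbouring values apart until they are >= min_gap apart
repel_1d <- function(x, min_gap = 1, max_iter = 1000, tol = 1e-9) {
  stopifnot(is.numeric(x), min_gap > 0)
  ord <- order(x)          # work on sorted values, restore input order at the end
  pos <- x[ord]
  for (iter in seq_len(max_iter)) {
    too_close <- which(diff(pos) < min_gap - tol)
    if (length(too_close) == 0) break   # nothing overlaps any more
    for (j in too_close) {
      deficit <- min_gap - (pos[j + 1] - pos[j])
      if (deficit > 0) {
        # move both neighbours by half the missing distance, so the mean is preserved
        pos[j]     <- pos[j] - deficit / 2
        pos[j + 1] <- pos[j + 1] + deficit / 2
      }
    }
  }
  out <- numeric(length(x))
  out[ord] <- pos          # back to the original element order
  out
}

overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
repelled <- repel_1d(overlapping_vec, min_gap = 1)
```

For dates, the same pattern as above applies: pass as.numeric(date) with min_gap in days inside a grouped mutate(), then convert back with as.Date(). Unlike repel_vector, this sketch has no bounds, so values can drift past the original range.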
I am trying to label 4 lines grouped by the value of the variable cc. To label the lines I use ggrepel, but I get all 4 labels in each graph instead of 2. How can I fix this?
In this example the labels are placed at the last date, but I want something more flexible: I want to place each of the 4 labels at a specific point that I choose (e.g. b at date 1, a at date 2, etc.). How can I do that?
library(tidyverse)
library(ggrepel)
library(cowplot)
set.seed(1234)
df <- tibble(date = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
country = rep(c('a','b','c','d'),4),
value = runif(16),
cc = rep(c(1,1,2,2),4))
df$cc <- as.factor(df$cc)
# make list of plots
ggList <- lapply(split(df, df$cc), function(i) {
ggplot(i, aes(x = date, y = value, color = country)) +
geom_line(lwd = 1.1) +
geom_text_repel(data = subset(df, date == 4),
aes(label = country)) +
theme(legend.position = "none")
})
# plot as grid in 1 columns
cowplot::plot_grid(plotlist = ggList, ncol = 1,
align = 'v', labels = levels(df$cc))
Created on 2021-08-18 by the reprex package (v2.0.0)
Here I make a tibble to hold color and position preferences, and join that to df.
The geom_text_repel line should use i instead of df so that the labels are split the same way as the lines; subsetting the full df is why all four labels show up in each panel. The only trouble is that this forces us to specify the four colors up front, since otherwise each chart would just use the two it needs.
set.seed(1234)
df <- tibble(date = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
country = rep(c('a','b','c','d'),4),
value = runif(16),
cc = rep(c(1,1,2,2),4))
label_pos <- tibble(country = letters[1:4],
label_pos = c(2, 1, 3, 2),
color = RColorBrewer::brewer.pal(4, "Set2")[1:4])
df <- df %>% left_join(label_pos, by = "country")
df$cc <- as.factor(df$cc)
# make list of plots
ggList <- lapply(split(df, df$cc), function(i) {
ggplot(i, aes(x = date, y = value, color = color)) +
geom_line(lwd = 1.1) +
geom_text_repel(data = subset(i, date == label_pos),
aes(label = country), box.padding = unit(0.02, "npc"), direction = "y") +
scale_color_identity() +
theme(legend.position = "none")
})
# plot as grid in 1 columns
cowplot::plot_grid(plotlist = ggList, ncol = 1,
align = 'v', labels = levels(df$cc))
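To see why subsetting i rather than df gives two labels per panel, the label rows can be computed on their own. The following is a base R sketch of the same join-then-filter logic, using merge() in place of left_join(); the object name label_rows is made up for illustration.

```r
set.seed(1234)
df <- data.frame(date    = rep(1:4, each = 4),
                 country = rep(c("a", "b", "c", "d"), 4),
                 value   = runif(16),
                 cc      = factor(rep(c(1, 1, 2, 2), 4)))

# Chosen label date for each country (same values as in the answer)
label_pos <- data.frame(country   = letters[1:4],
                        label_pos = c(2, 1, 3, 2))

df <- merge(df, label_pos, by = "country")

# Within each panel, keep only the row where the date matches the chosen label date
label_rows <- lapply(split(df, df$cc), function(i) i[i$date == i$label_pos, ])

label_rows
```

Each element of label_rows holds exactly the rows that geom_text_repel() receives for that panel, one per country in the panel.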
Following is the sample dataset that I have:
df <- structure(list(Class = c("A", "B", "C", "D"),
`Attempted` = c(374, 820, 31, 108),
`Missed` = c(291, 311, 5, 15),
`Cancelled` = c(330, 206, 6, 5),
`Unknown` = c(950, 341, 6, 13)),
class = "data.frame", row.names = c(NA, -4L))
I want to create a mosaic plot with percentages instead of absolute numbers. To be precise, I want to see what percentage of the total 'class A' population 'missed' their test, and similarly for the other classes.
I have not tried any code yet as I have absolutely no clue how to start. Can anyone please help me with this?
Using only one package (vcd), you can do the following. Note that I label the cells with the proportions within each class (i.e. each row sums to 1):
library(vcd)
M = as.matrix(df[,-1])
rownames(M) = df$Class # name the rows explicitly rather than relying on as.table() defaults
M = as.table(M)
names(dimnames(M)) = c("Class","result")
labs <- round(prop.table(M,margin=1), 2)
mosaic(M, pop = FALSE)
labeling_cells(text = labs, margin = 0)(M)
You can also just visualize it with a simple stacked bar plot in base R:
library(RColorBrewer)
barplot(t(labs),col=brewer.pal(4,"Set2"))
legend("bottomright",legend = colnames(labs),inset=c(0,1.1), xpd=TRUE,
fill =brewer.pal(4,"Set2"),horiz=TRUE,cex=0.7)
If you use ggplot2 or other gg packages, you need to pivot your data into long format:
library(tidyr)
library(dplyr)
library(ggplot2)
df_long = df %>%
pivot_longer(-Class) %>%
group_by(Class) %>%
mutate(total = sum(value),
p = round(100*value/total,digits=1)) %>%
ungroup()
ggplot(df_long,aes(x=Class,y=p,fill=name)) + geom_col() + geom_text(aes(label=p),position=position_stack(vjust=0.2))
If you want a mosaic-style plot in ggplot2, you can adapt this answer by z.lin. Note that I take the square root of the class totals for the bar widths to make the smaller classes more visible:
ggplot(df_long,
aes(x = Class, y = p, width = sqrt(total), fill = name)) +
geom_col(colour = "black") +
geom_text(aes(label = p), position = position_stack(vjust = 0.5)) +
facet_grid(~Class, scales = "free_x", space = "free_x") +
theme_void()
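To check the numbers behind any of these plots, the row percentages can be computed directly in base R. This is a small sketch using the sample data from the question; the object names counts and pct are made up here.

```r
df <- data.frame(Class     = c("A", "B", "C", "D"),
                 Attempted = c(374, 820, 31, 108),
                 Missed    = c(291, 311, 5, 15),
                 Cancelled = c(330, 206, 6, 5),
                 Unknown   = c(950, 341, 6, 13))

counts <- as.matrix(df[, -1])
rownames(counts) <- df$Class

# margin = 1 divides each cell by its row total, so every class sums to 100%
pct <- round(100 * prop.table(counts, margin = 1), 1)

pct["A", "Missed"]  # share of class A that missed the test, in percent
```

With margin = 2 the same call would give percentages within each outcome instead of within each class.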
Does anyone know if there is a way to add variable labels to the ggparcoord function in GGally? I've tried numerous ways with geom_text, but nothing is yielding results.
To be more explicit, I am looking to pass the row.names(mtcars) through geom_text. The only way that I can distinguish the car is passing row.names(mtcars) through the groupColumn argument, but I don't like the way this looks.
Doesn't work:
mtcars$carName <- row.names(mtcars) # This becomes column 12
library(GGally)
# Attempt 1
ggparcoord(mtcars,
columns = c(12, 1, 6),
groupColumn = 1) +
geom_text(aes(label = carName))
# Attempt 2
ggparcoord(mtcars,
columns = c(12, 1, 6),
groupColumn = 1,
mapping = aes(label = carName))
Any ideas would be appreciated!
Solution 1: If you want to stick close to your original attempt, you can calculate the appropriate y coordinates for the car names, & add that as a separate data source. Use inherit.aes = FALSE so that this geom_text layer doesn't inherit anything from the ggplot object created using ggparcoord():
library(dplyr)
p1 <- ggparcoord(mtcars,
columns = c(12, 1, 6),
groupColumn = 1) +
geom_text(data = mtcars %>%
select(carName) %>%
mutate(x = 1,
y = scale(as.integer(factor(carName)))),
aes(x = x, y = y, label = carName),
hjust = 1.1,
inherit.aes = FALSE) +
# optional: remove "carName" from x-axis labels
scale_x_discrete(labels = function(x) c("", x[-1])) +
# also optional: hide legend, which doesn't really seem relevant here
theme(legend.position = "none")
p1
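The y values used for the labels in Solution 1 can be inspected on their own. As far as I understand, ggparcoord() standardises each column by default (scale = "std"), and Solution 1 mirrors that by applying scale() to the integer codes of the car names. The base R sketch below, with the made-up name label_y, shows that this yields evenly spaced positions with mean 0 and standard deviation 1.

```r
car_names <- rownames(mtcars)

# Alphabetical integer codes 1..32, then centred and divided by their standard deviation
label_y <- as.numeric(scale(as.integer(factor(car_names))))

range(label_y)             # vertical extent of the labels
unique(round(diff(sort(label_y)), 10))  # a single value: the constant spacing between labels
```

This is why the labels in p1 are evenly spread, whereas labels positioned by mpg or wt follow the spacing of the data.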
Solution 2: This alternative uses carName as the group column, & doesn't pass it as one of the parallel coordinate columns (which I think might be closer to the use cases intended by this function). Specifying carName as the group column means the car names are captured in the data slot of the ggplot object created by ggparcoord(), so the text label layer can inherit them directly, & even filter for only the rows where variable == "mpg" (or whatever the first of the parallel coordinate columns is named in the actual use case). The y coordinates are not as evenly spread out as above, but geom_text_repel from the ggrepel package does a decent job of shifting overlapping text labels away from one another.
library(dplyr)
library(ggrepel)
p2 <- ggparcoord(mtcars,
columns = c(1, 6),
groupColumn = "carName") +
geom_text_repel(data = . %>%
filter(variable == "mpg"),
aes(x = variable, y = value, label = carName),
xlim = c(NA, 1)) + # limit repel region to the left of the 1st column
theme(legend.position = "none") # as before, hide legend since the labels
# are already in the plot
p2
Solution 3 / 4: You can actually plot the same with ggplot(), without relying on extensions that may do unexpected stuff behind the scenes:
library(dplyr)
library(tidyr)
library(ggrepel)
# similar output to solution 1
p3 <- mtcars %>%
select(carName, mpg, wt) %>%
mutate(carName.column = as.integer(factor(carName))) %>%
gather(variable, value, -carName) %>%
group_by(variable) %>%
mutate(value = scale(value)) %>%
ungroup() %>%
ggplot(aes(x = variable, y = value, label = carName, group = carName)) +
geom_line() +
geom_text(data = . %>% filter(variable == "carName.column"),
hjust = 1.1) +
scale_x_discrete(labels = function(x) c("", x[-1]))
p3
# similar output to solution 2
p4 <- mtcars %>%
select(carName, mpg, wt) %>%
gather(variable, value, -carName) %>%
group_by(variable) %>%
mutate(value = scale(value)) %>%
ungroup() %>%
ggplot(aes(x = variable, y = value, label = carName, group = carName)) +
geom_line() +
geom_text_repel(data = . %>% filter(variable == "mpg"),
xlim = c(NA, 1))
p4
Edit
You can add text labels on the right as well, for each of the above. Do note that the location for labels may not be nicely spaced out, since they are positioned according to wt's scaled values:
p1 +
geom_text(data = mtcars %>%
select(carName, wt) %>%
mutate(x = 3,
y = scale(wt)),
aes(x = x, y = y, label = carName),
hjust = -0.1,
inherit.aes = FALSE)
p2 +
geom_text_repel(data = . %>%
filter(variable == "wt"),
aes(x = variable, y = value, label = carName),
xlim = c(2, NA))
p3 +
geom_text(data = . %>% filter(variable == "wt"),
hjust = -0.1)
p4 +
geom_text_repel(data = . %>% filter(variable == "wt"),
xlim = c(2, NA))