automatically adding brackets or braces to ggplot and plotly objects - r

I want to automate adding brackets/braces to a ggplot object and then convert it to plotly using ggplotly.
library(tidyverse)
library(plotly)
#devtools::install_github("NicolasH2/ggbrace")
library(ggbrace)
set.seed(10)
mydata <- data.frame(xx = c(1:10), yy = sample(0:10, 10))
my_bracket_data <- data.frame(start = c(1, 4, 6),
end = c(3, 5, 9),
info = c("first", "second", "third"),
y_bottom = rep(11, 3),
y_top = rep(12, 3))
I can do this by using geom_brace but it involves manually typing it out for each brace rather than using the data that is already stored in a data.frame:
plot_geom_brace <- ggplot(data = mydata, aes(xx, yy)) +
geom_line(size = 1.5) +
geom_brace(aes(x = c(1, 3), y = c(11, 12), label = "first"), inherit.data = F, labelsize = 5) +
geom_brace(aes(x = c(4, 5), y = c(11, 12), label = "second"), inherit.data = F, labelsize = 5) +
geom_brace(aes(x = c(6, 9), y = c(11, 12), label = "third"), inherit.data = F, labelsize = 5)
plot_geom_brace
ggplotly(plot_geom_brace)
Is there a way that I can do this without repeatedly writing geom_brace layers for each brace (and instead access the data from my_bracket_data directly)?
As an aside this can be automated using geom_bracket but this is not supported by plotly yet.
library(ggpubr)
plot_geom_bracket <- ggplot(data = mydata, aes(xx, yy)) +
geom_line(size = 1.5) +
geom_bracket(xmin = my_bracket_data$start,
xmax = my_bracket_data$end,
y.position = rep(11, 3),
label = my_bracket_data$info,
inherit.aes = FALSE)
plot_geom_bracket
ggplotly(plot_geom_bracket)
# Warning message:
# In geom2trace.default(dots[[1L]][[1L]], dots[[2L]][[1L]], dots[[3L]][[1L]]) :
# geom_GeomBracket() has yet to be implemented in plotly.
# If you'd like to see this geom implemented,
# Please open an issue with your example code at
# https://github.com/ropensci/plotly/issues
Any suggestions?
Thanks

One option is to build two data frames from your bracket data. For the braces, reshape the data to a longer format with pivot_longer so that a single geom_brace layer can draw all three braces through aes, grouped by info. For the labels, create a small summary table holding the x and y positions for each info group. Here is some reproducible code:
library(tidyverse)
library(plotly)
#devtools::install_github("NicolasH2/ggbrace")
library(ggbrace)
set.seed(10)
mydata <- data.frame(xx = c(1:10), yy = sample(0:10, 10))
my_bracket_data <- data.frame(start = c(1, 4, 6),
end = c(3, 5, 9),
info = c("first", "second", "third"),
y_bottom = rep(11, 3),
y_top = rep(12, 3))
# Data for geom_brace
my_bracket_data_long <- my_bracket_data %>%
pivot_longer(cols = c(start, end), values_to = "x_value", names_to = "x_names") %>%
pivot_longer(cols = c(y_bottom, y_top), values_to = "y_value", names_to = "y_names")
# data for labels braces
my_bracket_data_labels <- my_bracket_data_long %>%
group_by(info) %>%
summarise(x_pos = mean(x_value),
y_pos = mean(y_value) + 1)
# plot
plot_geom_brace <- ggplot(data = mydata, aes(xx, yy)) +
geom_line(size = 1.5) +
geom_brace(data = my_bracket_data_long, aes(x = x_value, y = y_value, group = info)) +
geom_text(data = my_bracket_data_labels, aes(x = x_pos, y = y_pos, group = info, label = info))
ggplotly(plot_geom_brace)
Created on 2023-01-07 with reprex v2.0.2

Special-purpose ggplot2 extension packages like ggpubr usually don't play well with conversion to plotly objects.
If you don't want to type out each geom_brace call, you could loop over the rows of the data frame, build the geom statement as a string with paste, and add it to the existing plot object using eval, as in the following lines:
m<- data.frame(s = c(1, 4, 6), ## = my_bracket_data
e = c(3, 5, 9),
i = c("first", "second", "third"),
y_b = rep(11, 3),
y_t = rep(12, 3))
p<- ggplot(data = mydata, aes(xx, yy)) + geom_line(size = 1.5)
for (k in seq_len(nrow(m))) {
  # Build the geom_brace() call as a string; the label has to be quoted inside it
  input = paste('geom_brace(aes(x = c(',m[k,]$s,',',m[k,]$e,'), y = c(',m[k,]$y_b,',',
                m[k,]$y_t,'), label = "',m[k,]$i,'"), inherit.data = F, labelsize = 5)',sep='')
  p = p + eval(parse(text=input))
}
p
However, this is a rather hacky solution. That is what R code tends to become when you mix different styles, for example apply functions with tidyr syntax (or in this case ggplot2, which could be seen as an ancestor of tidyr) and a more imperative programming style (for, while, functions; yes, you can program in R), and then also want the whole thing converted automatically to JavaScript (i.e. plotly). It's a beautiful mess.
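If exact curly braces are not essential, a data-driven alternative is to draw square brackets from plain segments, which ggplotly is generally able to convert. The sketch below is only an illustration under stated assumptions and does not use the ggbrace API: bracket_segments and the 0.5 label offset are hypothetical choices, and the plotting part runs only if ggplot2 is installed.

```r
# Bracket definitions, as in the question
my_bracket_data <- data.frame(start = c(1, 4, 6),
                              end = c(3, 5, 9),
                              info = c("first", "second", "third"),
                              y_bottom = rep(11, 3),
                              y_top = rep(12, 3))

# Hypothetical helper table: three segments per bracket
# (top bar, left tick, right tick), built directly from the data frame
bracket_segments <- with(my_bracket_data, data.frame(
  info = rep(info, times = 3),
  x    = c(start, start, end),
  xend = c(end, start, end),
  y    = c(y_top, y_bottom, y_bottom),
  yend = c(y_top, y_top, y_top)
))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  set.seed(10)
  mydata <- data.frame(xx = 1:10, yy = sample(0:10, 10))
  p <- ggplot(mydata, aes(xx, yy)) +
    geom_line() +
    # one layer draws every bracket; no per-bracket geom calls needed
    geom_segment(data = bracket_segments,
                 aes(x = x, xend = xend, y = y, yend = yend),
                 inherit.aes = FALSE) +
    # labels centred above each bracket
    geom_text(data = my_bracket_data,
              aes(x = (start + end) / 2, y = y_top + 0.5, label = info),
              inherit.aes = FALSE)
  # plotly::ggplotly(p) can then be called on p as usual
}
```

Because everything is derived from my_bracket_data, adding a fourth bracket only means adding a row to that data frame.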

Related

Avoid overlap of points on a timeline (1-D repeling)

I want to create a timeline plot that roughly resembles the example below: lots of overlap at some points, not a lot of overlap at others.
What I need: overlapping images should repel each other where necessary, eliminating or reducing overlap. Ideally I'd be able to implement either a vertical or horizontal repel.
library(tidyverse)
library(ggimage)
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = T) )
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = date, y = group, image = img, group = group), asp = 1)
Something similar to the repelling in ggbeeswarm::geom_beeswarm or ggrepel::geom_text_repel would be nice, but those don't support images. So I think I need to pre-apply some kind of 1-dimensional packing algorithm, implementing iterative pair-wise repulsion on my vector of dates within each group, to try to find a non-overlapping arrangement.
Any ideas? Thank you so much!
Created on 2021-10-30 by the reprex package (v2.0.1)
Here is the solution I've been able to come up with: repurposing the circleRepelLayout function from the awesome packcircles package
into a repel_vector function that takes your overlapping vector and a "repel_radius" and returns, if possible, a non-overlapping version.
I demonstrate the solution with the richtext geom since this is a geom I’ve always wished had repel functionality.
library(packcircles)
library(tidyverse)
library(ggtext)
library(ggimage)
repel_vector <- function(vector, repel_radius = 1, repel_bounds = range(vector)){
stopifnot(is.numeric(vector))
repelled_vector <-
packcircles::circleRepelLayout(x = data.frame(vector, ypos = 1, repel_radius),
xysizecols = c("vector", "ypos", "repel_radius"),
xlim = repel_bounds, ylim = c(0,1),
wrap = FALSE) %>%
as.data.frame() %>%
.$layout.x
return(repelled_vector)
}
overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
repelled_vec_default <- repel_vector(overlapping_vec)
repelled_vec_tighter <- repel_vector(overlapping_vec, repel_radius = 0.35)
ggplot() +
annotate("richtext", x = overlapping_vec, y = 3, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_default, y = 2, label = "**test**", alpha = 0.5) +
annotate("richtext", x = repelled_vec_tighter, y = 1, label = "**test**", alpha = 0.5) +
scale_y_continuous(breaks = 1:3, labels = c("Tighter repel", "Default repel", "Overlapping points"))
In theory you could apply this to 2D repelling as well.
To solve the problem in my question, this can be applied like so:
test_img <- list.files(system.file("extdata", package="ggimage"), pattern="png", full.names=TRUE)
set.seed(123)
df <-
tibble(date = as.Date(paste0("2020-", round(runif(45, 1, 2)), "-", round(runif(45, 1, 10)))),
group = paste0("Timeline ", rep(1:9, each = 5)),
img = sample(test_img, size = 45, replace = T) ) %>%
group_by(group) %>%
mutate(repelled_date = repel_vector(as.numeric(date),
repel_radius = 4,
repel_bounds = range(as.numeric(date)) + c(-3,3)),
repelled_date = as.Date(repelled_date, origin = "1970-01-01"))
df %>%
ggplot() +
geom_line(aes(x = date, y = group, group = group), size = 5, alpha = 0.2) +
geom_image(aes(x = repelled_date, y = group, image = img, group = group), asp = 1)
Created on 2021-10-30 by the reprex package (v2.0.1)
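For comparison, the 1-D packing idea from the question can also be sketched without packcircles. The helper below, spread_min_gap, is a hypothetical name and a deliberately simple greedy pass: it sorts the values and pushes each one to the right just far enough to keep a minimum gap from its left neighbour. Unlike circleRepelLayout it only moves points in one direction and ignores bounds, so treat it as a rough starting point rather than an equivalent.

```r
# Hypothetical helper (not part of packcircles): enforce a minimum gap
# between neighbouring values with a single left-to-right pass.
spread_min_gap <- function(x, min_gap = 1) {
  stopifnot(is.numeric(x), min_gap >= 0)
  ord <- order(x)
  sorted <- x[ord]
  for (i in seq_along(sorted)[-1]) {
    # push the point right only if it sits too close to its left neighbour
    sorted[i] <- max(sorted[i], sorted[i - 1] + min_gap)
  }
  # return the values in the original input order
  result <- numeric(length(x))
  result[ord] <- sorted
  result
}

overlapping_vec <- c(1, 1.1, 1.2, 10, 10.1, 10.2)
spread_vec <- spread_min_gap(overlapping_vec, min_gap = 0.5)
```

Inside a grouped mutate it could be applied to as.numeric(date) in the same way as repel_vector above, with min_gap given in days.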

How to label lines obtained using split with ggrepel

I am trying to label 4 lines grouped by the value of variable cc. To label the lines I use ggrepel but I get all the 4 labels instead of 2 for each graph. How to correct this error?
The location of the labels is in this example at the last date but I want something more flexible: I want to locate each of the 4 labels in specific points that I chose (e.g. b at date 1, a at date 2, etc.). How to do that?
library(tidyverse)
library(ggrepel)
library(cowplot)
set.seed(1234)
df <- tibble(date = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
country = rep(c('a','b','c','d'),4),
value = runif(16),
cc = rep(c(1,1,2,2),4))
df$cc <- as.factor(df$cc)
# make list of plots
ggList <- lapply(split(df, df$cc), function(i) {
ggplot(i, aes(x = date, y = value, color = country)) +
geom_line(lwd = 1.1) +
geom_text_repel(data = subset(df, date == 4),
aes(label = country)) +
theme(legend.position = "none")
})
# plot as grid in 1 columns
cowplot::plot_grid(plotlist = ggList, ncol = 1,
align = 'v', labels = levels(df$cc))
Created on 2021-08-18 by the reprex package (v2.0.0)
Here I make a tibble to hold color and position preferences, and join that to df.
The geom_text_repel line should probably use i instead of df so that it's split the same way as the line. The only trouble is this forces us to specify that we want four colors up front, since otherwise each chart would just use the two it needs.
set.seed(1234)
df <- tibble(date = c(rep(1,4), rep(2,4), rep(3,4), rep(4,4)),
country = rep(c('a','b','c','d'),4),
value = runif(16),
cc = rep(c(1,1,2,2),4))
label_pos <- tibble(country = letters[1:4],
label_pos = c(2, 1, 3, 2),
color = RColorBrewer::brewer.pal(4, "Set2")[1:4])
df <- df %>% left_join(label_pos)
df$cc <- as.factor(df$cc)
# make list of plots
ggList <- lapply(split(df, df$cc), function(i) {
ggplot(i, aes(x = date, y = value, color = color)) +
geom_line(lwd = 1.1) +
geom_text_repel(data = subset(i, date == label_pos),
aes(label = country), box.padding = unit(0.02, "npc"), direction = "y") +
scale_color_identity() +
theme(legend.position = "none")
})
# plot as grid in 1 columns
cowplot::plot_grid(plotlist = ggList, ncol = 1,
align = 'v', labels = levels(df$cc))
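To see why subsetting i rather than df fixes the duplicated labels, it may help to look at the label rows on their own. The base R sketch below rebuilds the same data and keeps, for each cc group, only the rows where date equals the chosen label position; label_tbl and label_rows are hypothetical names used for illustration.

```r
set.seed(1234)
df <- data.frame(date = rep(1:4, each = 4),
                 country = rep(c("a", "b", "c", "d"), 4),
                 value = runif(16),
                 cc = factor(rep(c(1, 1, 2, 2), 4)))

# Preferred label position (date) for each country, as in the answer
label_tbl <- data.frame(country = letters[1:4],
                        label_pos = c(2, 1, 3, 2))
df <- merge(df, label_tbl, by = "country")

# Split first, then subset within each piece: every plot only sees
# the label rows that belong to its own cc group
label_rows <- lapply(split(df, df$cc), function(i) subset(i, date == label_pos))
label_rows
```

Each element of label_rows holds two rows, one per country in that group, which is exactly what geom_text_repel receives in the corrected code.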

Equivalent of gganimate::transition_events on plotly

In R, using gganimate, one can make an animated plot where events appear and disappear with time. For example:
library(lubridate)
library(gganimate)
df=data.frame(
x=c(1,2,3,4),
y=c(1,2,3,4),
start=c(1,2,3,4),
end=c(5,6,7,8),
en=as_date(1),
ex=as_date(1))
ggplot(data=df, aes(x=x,y=y))+
geom_point()+
gganimate::transition_events(
start=start,
end=end,
enter_length = as.numeric(en),
exit_length = as.numeric(ex))
This produces a plot in which points appear according to column "start" and disappear according to column "end".
I wonder if there is an easy way to achieve the same with plotly (preferably using ggplotly()), getting a slider to move along the time.
Here is an example using ggplotly. The result however isn't exactly the same:
library(plotly)
library(lubridate)
df = data.frame(
x = c(1, 2, 3, 4),
y = c(1, 2, 3, 4),
start = c(1, 2, 3, 4),
end = c(5, 6, 7, 8),
en = as_date(1),
ex = as_date(1)
)
frame_list <- Map(seq, from = df$start, to = df$end)
DF <- data.frame(x = rep(df$x, times = lengths(frame_list)),
y = rep(df$y, times = lengths(frame_list)),
frame = unlist(frame_list))
p <- ggplot(DF, aes(x, y)) +
geom_point(aes(size = y, frame = frame))
fig <- ggplotly(p)
# Assign the result so that the animation options are kept on fig
fig <- fig %>%
  animation_opts(
    frame = 0,
    easing = "linear",
    redraw = FALSE,
    mode = "immediate"
  )
fig
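The key step in this approach is the expansion of each point into one row per frame in which it should be visible. A small base R sketch of just that step, using the same start and end columns, shows how many points each frame ends up containing (points_per_frame is a name introduced here for illustration):

```r
df <- data.frame(x = 1:4, y = 1:4, start = 1:4, end = 5:8)

# One integer sequence per point: the frames during which it is shown
frame_list <- Map(seq, from = df$start, to = df$end)

# Repeat each point once per frame it lives in
DF <- data.frame(x = rep(df$x, times = lengths(frame_list)),
                 y = rep(df$y, times = lengths(frame_list)),
                 frame = unlist(frame_list))

# Frames 1 to 8 contain 1, 2, 3, 4, 4, 3, 2, 1 points respectively
points_per_frame <- table(DF$frame)
points_per_frame
```

This reproduces the appear and disappear timing of transition_events, but not its enter and exit transitions, which is one reason the result is not exactly the same.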

Replicate position_fill calculations

This is possibly more of a maths question than a coding one, but this has me really stuck! I have a data frame in the format of this small reproducible example:
library(dplyr)
library(ggplot2)
df <- data.frame(sample = c(rep(1, 3), rep(2, 3), rep(3 , 3), rep(4 , 3)),
condition = c(rep("a", 6), rep("b", 6)),
amino_acid = rep(c("Asx", "Glx", "Ser"), 4),
conc = c(2, 4, 3, 6, 7, 5, 1, 2, 1, 8, 2, 5),
total_conc = c(rep(9, 3), rep(12, 3), rep(4 , 3), rep(15 , 3))) %>%
mutate(pc_conc = conc / total_conc * 100)
I want to look at the change in the proportion of amino_acid concs (i.e. change in pc_conc) with the conditions a and b, which I can do using fun.y = "mean" and position = position_fill():
ggplot(df, aes(x = condition, y = pc_conc, fill = amino_acid)) +
geom_bar(stat = "summary", fun.y = "mean", na.rm = TRUE,
position = position_fill())
What I want is the values of each of these bars (i.e. the average proportion of conc for each amino_acid for each condition). If I average pc_conc in the original dataframe you can see that I don't get the same result:
df2 <- df %>% group_by(condition, amino_acid) %>%
summarise_at(vars(pc_conc), mean)
ggplot(df2, aes(x = condition, y = pc_conc, fill = amino_acid)) +
geom_bar(stat = "summary", fun.y = "identity", na.rm = TRUE)
How can I either extract the values calculated by the position_fill() function from the ggplot object or replicate those calculations so I have the same numbers? I've tried averaging the original conc and total_conc variables and calculating pc_conc but that also doesn't create the normalised percentages I'm looking for.
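One way to replicate the numbers, sketched here in base R: the summary stat first takes the mean of pc_conc for each condition and amino_acid, and position_fill then rescales those stacked means so that they sum to 1 within each condition. The column name fill_prop is a hypothetical choice. Note that in this example sample 2 has conc values summing to 18 while total_conc is 12, so its percentages add up to 150, which is why the plain group means do not already sum to 100 for condition a.

```r
df <- data.frame(sample = rep(1:4, each = 3),
                 condition = rep(c("a", "b"), each = 6),
                 amino_acid = rep(c("Asx", "Glx", "Ser"), 4),
                 conc = c(2, 4, 3, 6, 7, 5, 1, 2, 1, 8, 2, 5),
                 total_conc = rep(c(9, 12, 4, 15), each = 3))
df$pc_conc <- df$conc / df$total_conc * 100

# Step 1: the per-group mean that the summary stat computes
means <- aggregate(pc_conc ~ condition + amino_acid, data = df, FUN = mean)

# Step 2: the rescaling that position_fill applies within each x value
means$fill_prop <- means$pc_conc / ave(means$pc_conc, means$condition, FUN = sum)

means <- means[order(means$condition, means$amino_acid), ]
means
```

If the plot object is available, ggplot2::layer_data() applied to it returns the computed ymin and ymax of each bar, which can be used to cross-check these proportions.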

Conditional formatting of table in R...a better way?

Trying to improve this code. What I have worked up works but looks ugly and is VERY clumsy.
Looking for a ggplot method or something that is more user friendly. Would appreciate the tips and advice.
library("dplyr")
thi <- data.frame(RH = c(1,1,1,2,2,2,3,3,3), T = c(1,2,3,1,2,3,1,2,3), THI = c(8,8,5,7,5,10,5,8,7))
table_thi <- tapply(thi$THI, list(thi$RH, thi$T), mean) %>% as.table()
x = 1:ncol(table_thi)
y = 1:nrow(table_thi)
centers <- expand.grid(y,x)
image(x, y, t(table_thi),
col = c("lightgoldenrod", "darkgoldenrod", "darkorange"),
breaks = c(5,7,8,9),
xaxt = 'n',
yaxt = 'n',
xlab = '',
ylab = '',
ylim = c(max(y) + 0.5, min(y) - 0.5))
text(round(centers[,2],0), round(centers[,1],0), c(table_thi), col= "black")
mtext(paste(attributes(table_thi)$dimnames[[2]]), at=1:ncol(table_thi), padj = -1)
mtext(attributes(table_thi)$dimnames[[1]], at=1:nrow(table_thi), side = 2, las = 1, adj = 1.2)
abline(h=y + 0.5)
abline(v=x + 0.5)
How about this:
library(dplyr)
library(ggplot2)
thi <- data.frame(
RH = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
T = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
THI = c(8, 8, 5, 7, 5, 10, 5, 8, 7)
)
names(thi) = c('col1', 'col2', 'thi')
ggplot(thi, aes(x = col1, y = col2, fill = factor(thi), label = thi)) +
geom_tile() +
geom_text()
Or depending on whether thi is really factor (discrete) or continuous variable, you may want something like this:
ggplot(thi, aes(x = col1, y = col2, fill = thi, label = thi)) +
geom_tile() +
geom_text(color = 'white')
Note: You probably want to avoid column or variable names that clash with reserved words or built-in shorthands (e.g. avoid calling something T, because T is a built-in alias for TRUE). In the code above, I renamed the columns of your data.frame.
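If you want to keep the three colour bands from the original image() call rather than one colour per value, the values can be binned first. The sketch below uses base R cut() with the question's breaks; the band labels "low", "mid" and "high" are hypothetical, and include.lowest = TRUE is chosen so that a value of exactly 5 falls into the first band.

```r
thi <- data.frame(col1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                  col2 = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                  thi = c(8, 8, 5, 7, 5, 10, 5, 8, 7))

# Bin the values using the breaks from the question's image() call.
# Values outside the breaks (here: 10) become NA.
thi$band <- cut(thi$thi,
                breaks = c(5, 7, 8, 9),
                labels = c("low", "mid", "high"),
                include.lowest = TRUE)
thi
```

Mapping fill = band in geom_tile, together with scale_fill_manual and the three colours from the question, would then give a discrete, conditional colouring.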
Since the question says conditional formatting of a table, however, you may want to consider the gt package:
library(gt)
thi %>% gt()
Or this:
thi %>% gt() %>%
data_color(
columns = vars(thi),
colors = scales::col_factor(
palette = "Set1",
domain = NULL
))
Or maybe this:
thi %>% gt() %>%
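  tab_style(
    # cell_fill(), cell_text() and cells_body() replace the older
    # cells_styles() and cells_data() helpers in current gt releases
    style = list(
      cell_fill(color = "#F9E3D6"),
      cell_text(style = "italic")),
    locations = cells_body(
      columns = thi,
      rows = thi <= 7
    )
  )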
