Combine multiple facet strips across columns in ggplot2 facet_wrap - r

I am trying to combine facet strips across two adjacent panels (there are always two adjacent panels with the same first ID variable but two different scenarios; let's call them "A" and "B"). I am not particularly wedded to the gtable + grid solution I tried, but sadly I cannot use facet_nested() from the ggh4x package: I cannot install it on my company's server because of various restrictions and the dependencies it needs. I looked at using only the relevant code, but that is not easy either, again because of the dependencies.
A minimum viable example of the basic plot I want to make easier to read by indicating which panels "belong together" by combining the top facet strips looks like this:
library(tidyverse)
library(gtable)
library(grid)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id + id2, nrow = 4, ncol=8)
The strips with the "1"s, the ones with the "2"s etc. should be combined (in reality it's a somewhat longer text, but this is just for illustration). I tried to adapt an answer for a similar scenario (https://stackoverflow.com/a/40316170/7744356 - thank you @markus for finding it again); my attempt is below. As you can see, the height of what I produce seems wrong. I assume this must be some trivial thing I am overlooking or not understanding.
# Combine strips for a ID
g <- ggplot_gtable(ggplot_build(p1))
strip <- gtable_filter(g, "strip-t", trim = FALSE)
stript <- which(grepl('strip-t', g$layout$name))
stript2 = stript[idx*2-1]
top <- strip$layout$t[idx*2-1]
# # Using the $b below instead of b = top[i]+1, also seems not to work
#bot <- strip$layout$b[idx*2-1]
l <- strip$layout$l[idx*2-1]
r <- strip$layout$r[idx*2]
mat <- matrix(vector("list",
length = length(idx)*3),
nrow = length(idx))
mat[] <- list(zeroGrob())
res <- gtable_matrix("toprow", mat,
unit(c(1, 0, 1), "null"),
unit( rep(1, length(idx)),
"null"))
for (i in 1:length(stript2)){
if (i==1){
zz <- res %>%
gtable_add_grob(g$grobs[[stript2[i]]]$grobs[[1]], 1, 1, 1, 3) %>%
gtable_add_grob(g, .,
t = top[i],
l = l[i],
b = top[i]+1,
r = r[i],
name = c("add-strip"))
} else {
zz <- res %>%
gtable_add_grob(g$grobs[[stript2[i]]]$grobs[[1]], 1, 1, 1, 3) %>%
gtable_add_grob(zz, .,
t = top[i],
l = l[i],
b = top[i]+1,
r = r[i],
name = c("add-strip"))
}
}
grid::grid.draw(zz)
------------ Update with a ggh4x implementation -----------------
This may solve this type of problem for many, but it has its downsides: axis alignment across rows gets a bit manual, you probably need to remove x-axes manually and ensure the limits are the same, you need to add a unified y-axis label, it requires installing a package from GitHub (devtools::install_github("teunbrand/ggh4x@v0.1") for a specific version), and cowplot interacts badly with e.g. ggtern. So I'd love it if someone still managed to do a pure gtable + grid version.
library(tidyverse)
library(ggh4x)
library(cowplot)
plots = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n()),
plotrow=(id-1)%/%4+1) %>%
group_by(plotrow) %>%
group_map( ~ ggplot(data=.,
aes(x=x,y=y)) +
geom_jitter() +
facet_nested( ~ id + id2))
plot_grid(plotlist = plots, nrow = 4, ncol=1)

I'm a bit late to this game, but ggh4x now has a facet_nested_wrap() implementation that should greatly simplify this problem (disclaimer: I wrote ggh4x).
library(tidyverse)
library(ggh4x)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_nested_wrap(~id + id2, nrow = 4, ncol=8)
p1
Created on 2020-08-12 by the reprex package (v0.3.0)
Keep in mind that there might still be a few bugs in this. Also, I'm aware that this doesn't help the OP because his package versions are constrained, but I thought I'd mention it here anyway.

Here's a reprex of a somewhat pedestrian way to do it in grid. I have made the "parent" facet somewhat darker to emphasise the nesting, but if you prefer the color to match just change the rectGrob fill color to "gray85".
# Set up plot as per example
library(tidyverse)
library(gtable)
library(grid)
idx = 1:16
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n())) %>%
ggplot(aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id + id2, nrow = 4, ncol=8)
g <- ggplot_gtable(ggplot_build(p1))
# Code to produce facet strips
stript <- grep("strip", g$layout$name)
grid_cols <- sort(unique(g$layout[stript,]$l))
t_vals <- rep(sort(unique(g$layout[stript,]$t)), each = length(grid_cols)/2)
l_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 1], length = length(t_vals))
r_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 0], length = length(t_vals))
labs <- levels(as.factor(p1$data$id))
for(i in seq_along(labs))
{
filler <- rectGrob(y = 0.7, height = 0.6, gp = gpar(fill = "gray80", col = NA))
tg <- textGrob(label = labs[i], y = 0.75, gp = gpar(cex = 0.8))
g <- gtable_add_grob(g, filler, t = t_vals[i], l = l_vals[i], r = r_vals[i],
name = paste0("filler", i))
g <- gtable_add_grob(g, tg, t = t_vals[i], l = l_vals[i], r = r_vals[i],
name = paste0("textlab", i))
}
grid.newpage()
grid.draw(g)
And to demonstrate changing the rectGrob to 50% height and "gray85":
Or if you wanted you could assign a different fill for each cycle of the loop:
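A per-strip fill can be sketched by indexing a colour vector inside the same loop. This is only a minimal sketch: it assumes the gtable layout matches the reprex above (strips named "strip-t-...", two panels per id), and the `fills` vector, the "Pastel 1" palette and the seed are illustrative additions, not part of the original answer.

```r
library(ggplot2)
library(gtable)
library(grid)

set.seed(1)  # seed added only so the sketch is reproducible
idx <- 1:16
dat <- expand.grid(id = idx, id2 = c("A", "B"), x = 1:10)
dat$y <- rnorm(nrow(dat))
p1 <- ggplot(dat, aes(x = x, y = y)) +
  geom_jitter() +
  facet_wrap(~ id + id2, nrow = 4, ncol = 8)
g <- ggplotGrob(p1)

# Locate the top strips and pair up adjacent strip columns
stript    <- grep("strip-t", g$layout$name)
grid_cols <- sort(unique(g$layout$l[stript]))
t_vals <- rep(sort(unique(g$layout$t[stript])), each = length(grid_cols) / 2)
l_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 1], length.out = length(t_vals))
r_vals <- rep(grid_cols[seq_along(grid_cols) %% 2 == 0], length.out = length(t_vals))

# One fill per parent strip (illustrative palette choice)
fills <- hcl.colors(length(idx), palette = "Pastel 1")

for (i in seq_along(idx)) {
  filler <- rectGrob(y = 0.7, height = 0.6, gp = gpar(fill = fills[i], col = NA))
  tg <- textGrob(label = idx[i], y = 0.75, gp = gpar(cex = 0.8))
  g <- gtable_add_grob(g, filler, t = t_vals[i], l = l_vals[i], r = r_vals[i],
                       name = paste0("filler", i))
  g <- gtable_add_grob(g, tg, t = t_vals[i], l = l_vals[i], r = r_vals[i],
                       name = paste0("textlab", i))
}

grid.newpage()
grid.draw(g)
```

Any character vector of colours of the same length as the number of ids would work in place of `fills`.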
Obviously the above method might take a few tweaks to fit other plots with different numbers of levels etc.
Created on 2020-07-04 by the reprex package (v0.3.0)

This may not solve the issue directly, but I would like to post it because it presents the results in a different plot while keeping the same structure. You have to define the number of columns for the plot in plot_layout(ncol = 4). This code uses the patchwork package. I hope it is useful.
library(tidyverse)
library(gtable)
library(grid)
library(patchwork)
idx = 1:16
#Data
p1 = expand_grid(id=idx, id2=c("A", "B"), x=1:10) %>%
mutate(y=rnorm(n=n()))
#Split data
List <- split(p1,p1$id)
#Sketch function
myplot <- function(x)
{
d <- ggplot(x,aes(x=x,y=y)) +
geom_jitter() +
facet_wrap(~id2, nrow = 1, ncol=2)+
ggtitle(unique(x$id))+
theme(plot.title = element_text(hjust = 0.5))
return(d)
}
#List of plots
Lplots <- lapply(List,myplot)
#Concatenate plots
#Create chain for plots
chain <- paste0('Lplots[[',1:length(Lplots),']]',collapse = '+')
#Evaluate the object and create the plot
Plot <- eval(parse(text = chain))+plot_layout(ncol = 4)+
plot_annotation(title = 'A nice plot')&theme(plot.title = element_text(hjust=0.5))
#Display
Plot
You will end up with a plot like this:
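The eval(parse(...)) chain is not the only way to combine the list: patchwork also provides wrap_plots(), which accepts a list of plots directly. A minimal sketch under the same setup follows; the seed and the names `dat`, `plot_list` and `combined` are additions made for this example.

```r
library(ggplot2)
library(patchwork)

set.seed(42)  # seed is an addition, only for reproducibility
dat <- expand.grid(id = 1:16, id2 = c("A", "B"), x = 1:10)
dat$y <- rnorm(nrow(dat))

# One small two-panel plot per id, titled with the id
myplot <- function(d) {
  ggplot(d, aes(x = x, y = y)) +
    geom_jitter() +
    facet_wrap(~ id2, nrow = 1, ncol = 2) +
    ggtitle(unique(d$id)) +
    theme(plot.title = element_text(hjust = 0.5))
}
plot_list <- lapply(split(dat, dat$id), myplot)

# wrap_plots() takes the list as-is, so no string building or eval() is needed
combined <- wrap_plots(plot_list, ncol = 4) +
  plot_annotation(title = "A nice plot") &
  theme(plot.title = element_text(hjust = 0.5))
combined
```

This avoids parsing code from strings, which tends to be harder to debug when the list of plots changes.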

Related

how to (minimally) adjust geom_text() labels to avoid overlap on stacked geom_col()

How can I place non-overlapping direct labels on a stacked column chart?
geom_text_repel() is moving labels that don't need to be moved, creating readability problems.
library(tibble)
library(tidyr)
library(ggplot2)
library(ggrepel)
library(dplyr)
library(scales)
set.seed(23)
n <- 4
mu <- 4E6
sales <- tibble(
year=as.factor(seq(2019, length.out=n)),
A = rnorm(n=n, mean=mu, sd=mu/4),
B = rnorm(n=n, mean=6*mu, sd=mu),
C = rnorm(n=n, mean=mu/5, sd=mu/40),
D = rnorm(n=n, mean=mu/10, sd=mu/40),
E = rnorm(n=n, mean=4*mu, sd=mu)
) %>% pivot_longer(!year, names_to="product", values_to="sales")
p <- sales %>%
group_by(year, product) %>%
summarise(sales=sum(sales)) %>%
mutate(pct_of_year_sales = sales/sum(sales),
label=paste(
scales::label_dollar(scale=1/1E6, suffix="M", accuracy=0.1)(sales),
scales::label_percent(accuracy=0.1)(pct_of_year_sales),
sep=", ")) %>%
ggplot(aes(x=year, y=sales, fill=product, label=label)) +
geom_col() +
scale_y_continuous(labels = scales::label_dollar(scale=1/1E6, suffix="M"),
expand = expansion(mult = c(0, .05)))
p + geom_text(position=position_stack(vjust=0.5)) +
labs(title="geom_text()",
subtitle="overlapping labels")
p + geom_text_repel(position=position_stack(vjust = 0.5),
direction="y") +
labs(title="geom_text_repel()",
subtitle="text for series A moves needlessly")
Created on 2023-02-02 by the reprex package (v2.0.1)
I agree wholly with @tjebo's comment: sometimes adding so much text is not the best way. But sometimes it is mandated (out of our control), so I suggest addressing one of your complaints:
geom_text_repel() is moving labels that don't need to be moved
We can use geom_text on most (filtering by percentage sales) and geom_text_repel on the few.
Normally, I'd use data = ~ filter(., pct_of_year_sales <= 0.015) (and > for the main layer), but the stacking is disrupted when rows are dropped, because position_stack then sees fewer values per column. Instead, we can create two sets of labels where some are empty ("" or NA) depending on their pct_of_year_sales value. This way, geom_text_repel gets to see all columns/values and will place them appropriately.
dat <- sales %>%
group_by(year, product) %>%
summarise(sales=sum(sales), .groups="drop") %>%
mutate(pct_of_year_sales = sales/sum(sales),
label = paste(
scales::label_dollar(scale=1/1E6, suffix="M", accuracy=0.1)(sales),
scales::label_percent(accuracy=0.1)(pct_of_year_sales),
sep=", ")) %>%
mutate(
label1 = if_else(pct_of_year_sales > 0.015, label, NA_character_),
label2 = if_else(pct_of_year_sales <= 0.015, label, NA_character_)
)
p <- ggplot(dat, aes(x=year, y=sales, fill=product, label=label1)) +
geom_col() +
scale_y_continuous(labels = scales::label_dollar(scale=1/1E6, suffix="M"),
expand = expansion(mult = c(0, .05))) +
geom_text(position=position_stack(vjust=0.5), na.rm = TRUE) +
labs(title="geom_text()",
subtitle="overlapping labels")
p
p +
ggrepel::geom_text_repel(
min.segment.length = 0, force = 10,
aes(label = label2),
position=position_stack(vjust = 0.5), hjust = 0.25,
direction="y", na.rm=TRUE)
I added hjust= to shift them a little, which also helps to clarify the segment lines (without hjust, they tend to connect with the commas, which is a visually distracting artifact). You may want to play with force= or other segment-line aesthetics to break them out more clearly. (It would be feasible, for instance, to define hjust within the frame itself and assign that aesthetic within aes(..) instead, in order to control the horizontal justification per year. Just a thought.)
I think I like @r2evans's answer better, as it uses the tools as they're meant to be used. I went a different route.
It's really never too early to start learning about the grid...
This uses the libraries grid and gridExtra.
First, I saved the plot to an object, and investigated where the labels were and what settings were applied (using geom_text, not ...repel).
In grid, you can set the justification for each label. So I made the vertical justification of those values in C to 0 and in D to 1. This was enough in my plot pane... however, depending on the size of your graph, you may have to go to values that are further apart. Just remember that .5 is the middle, not 0.
See my code comments for a more in-depth explanation.
library(grid)
library(gridExtra)
pp <- p + geom_text(position=position_stack(vjust=0.5)) +
labs(title="geom_text()",
subtitle="overlapping labels")
pg <- ggplotGrob(pp) # create grid (gtable) of graph
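# the labels in geom_text
# note: the child name (GRID.text.246) and the grob index (6) come from this session and may differ; inspect names(pg$grobs[[6]]$children)
pg$grobs[[6]]$children$GRID.text.246$label # use this to see the label order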
# lbls top to btm then left to right
# get vjust to modify
gimme <- pg$grobs[[6]]$children$GRID.text.246$vjust
# which indices need to change (C and D labels)
ttxt <- seq(from = 3, by = 5, length.out = 4) # 5 labels in column
btxt <- seq(from = 4, by = 5, length.out = 4) # 4 columns
gimme[ttxt] <- 0 # set C to top of vspace
gimme[btxt] <- 1 # set D to bottom of vspace
# replace existing vjust
pg$grobs[[6]]$children$GRID.text.246$vjust <- gimme
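The effect of vjust on two labels that share an anchor can be seen in isolation with plain grid. This is a minimal sketch, not the plot above: the label strings and the names `labels` and `spread` are made up for illustration.

```r
library(grid)

# Two labels anchored at the same point, as labels on two thin stacked bars would be
labels <- textGrob(label = c("C: $0.8M", "D: $0.4M"),
                   x = 0.5, y = c(0.5, 0.5), vjust = c(0.5, 0.5))

# vjust = 0 puts the bottom edge of the text on the anchor (text sits above it);
# vjust = 1 puts the top edge on the anchor (text hangs below it)
spread <- editGrob(labels, vjust = c(0, 1))

grid.newpage()
grid.draw(spread)
```

Values below 0 or above 1 push the labels further apart, which is what "values that are further apart" refers to for smaller plots.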
You can view the plot with grid or change it back to a ggplot object. To change it back, use either ggplotify::as.ggplot or ggpubr::as_ggplot; they do the same thing.
grid.newpage()
grid.draw(pg)
# back to ggplot obj
ggpubr::as_ggplot(pg)

geom_boxplot with mapped, variable widths per bar?

I would like to be able to map the width of each of the boxplots in a plot to a variable, or otherwise specify it. Let's say I want the relative widths of the boxes in the figure below to be 1, 2, 3. Setting varwidth won't help me since the actual numbers of observations are the same for each bar.
I have the beginnings of a horrible hacky solution I can post, but would welcome something actually good!
library(ggplot2)
set.seed(101)
dd <- data.frame(f = factor(rep(LETTERS[1:3], each = 10)),
y = rnorm(30))
g1 <- ggplot(dd, aes(f,y)) + geom_boxplot()
print(g1)
My basic idea (which would take more work to make it nice) is to ggplot_build(); hack the relevant elements in the data for the layer; and redraw the plot: example below. Obviously not as nice as having a real mapping/scaling system but maybe OK for simple cases ...
rel_wid <- c(1, 2, 3)
g1B <- ggplot_build(g1)
newdat <- g1B$data[[1]]
wids <- mean(newdat$new_width)*rel_wid/mean(rel_wid)
newdat <- within(newdat,
{
xmin <- newx - wids/2
xmax <- newx + wids/2
})
g2 <- g1B
g2$data[[1]] <- newdat
library(grid)
grid.draw(ggplot_gtable(g2))
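The rescaling in wids keeps the average box width unchanged while imposing the requested ratios. A small worked example of that arithmetic, assuming (for illustration only) that every box starts with a built width of 0.75:

```r
rel_wid <- c(1, 2, 3)
base_width <- 0.75  # assumed value of mean(newdat$new_width), for illustration

# Scale so the ratios are 1:2:3 but the mean width stays at base_width
wids <- base_width * rel_wid / mean(rel_wid)
wids        # 0.375 0.750 1.125
mean(wids)  # 0.75
```

Because the widest box ends up wider than 1 here, neighbouring boxes can touch or overlap; a smaller base width avoids that.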
A bit more concise and does it all in one gulp, without having to build the plot first:
library(ggplot2)
set.seed(101)
dd <- data.frame(f = factor(rep(LETTERS[1:3], each = 10)), y = rnorm(30))
ggplot(dd, aes(f,y)) +
Map(\(a, b) geom_boxplot(data = a, width = b), split(dd, dd$f), 1:3 * 0.35)
Created on 2023-02-07 with reprex v2.0.2

violin plot in R and values in X axis

I created these two violin plots in R, using:
install.packages("vioplot")
par(mfrow = c(1, 2))
vioplot::vioplot(HEL$Y,las=2,main="HEL$Y",col="deepskyblue",notch=TRUE)
vioplot::vioplot(ITA$Y,las=2,main="ITA$Y",col="aquamarine",notch=TRUE)
As a result I get the following. However, I don't know why I get 1 and 2 on the X axis. How can I get rid of the 2?
Thanks for your help.
This mysterious behavior is due to the use of the argument "notch = TRUE". Example:
library(vioplot)
set.seed(456)
vioplot(rnorm(10), notch = TRUE)
My interpretation is that notch is not an argument of vioplot, so the function interprets it as data to add to the graph (see the little smudge at y = 1: that's where it wants to put the new data, since TRUE equals 1 when it is converted into a numeric).
To confirm that an unknown argument is interpreted as data to be plotted, here is a little experiment:
vioplot(rnorm(10), unknown_argument = rnorm(10))
And the result:
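The mechanism can be mimicked in base R with a toy function that collects everything passed through `...` as data. This is a hypothetical stand-in to illustrate the interpretation above, not vioplot's actual source; the name `count_datasets` is made up.

```r
# Toy stand-in (not vioplot's real code): every argument, named or not,
# that lands in `...` is treated as another dataset
count_datasets <- function(x, ...) {
  datasets <- list(x, ...)
  length(datasets)
}

set.seed(456)
count_datasets(rnorm(10))                # 1 dataset
count_datasets(rnorm(10), notch = TRUE)  # 2 datasets: TRUE is swept up as data
as.numeric(TRUE)                         # 1, the y value where the stray mark appears
```

Dropping the unsupported argument removes the second "dataset" and with it the extra tick on the axis.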
This is a ggplot2 solution in case you're interested.
library(ggplot2)
library(dplyr)
# Recreate similar data
HEL <- data.frame(Y = rnorm(50, 8, 3))
ITA <- data.frame(Y = rnorm(50, 9, 2))
# Join in a single dataframe and reshape to longer format
dat <- bind_rows(rename(HEL, hel_y = Y),
rename(ITA, ita_y = Y)) |>
tidyr::pivot_longer(everything(), values_drop_na = TRUE) # drop the NAs introduced by bind_rows
# Make the plots
dat |>
ggplot(aes(name, value)) +
geom_violin(aes(fill = name)) +
geom_boxplot(width = 0.1) +
scale_fill_manual(values = c("deepskyblue", "aquamarine")) +
theme(legend.position = "none")
Created on 2022-04-28 by the reprex package (v2.0.1)

breaks at integer powers of ten on ggplot2 log10 axes

Transforming ggplot2 axes to log10 using scales::trans_breaks() can sometimes (if the range is small enough) produce un-pretty breaks, at non-integer powers of ten.
Is there a general purpose way of setting these breaks to occur only at 10^x, where x are all integers, and, ideally, consecutive (e.g. 10^1, 10^2, 10^3)?
Here's an example of what I mean.
library(ggplot2)
# dummy data
df <- data.frame(fct = rep(c("A", "B", "C"), each = 3),
x = rep(1:3, 3),
y = 10^seq(from = -4, to = 1, length.out = 9))
p <- ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~ fct, scales = "free_y") # faceted to try and emphasise that it's general purpose, rather than specific to a particular axis range
The unwanted result -- y-axis breaks are at non-integer powers of ten (e.g. 10^2.8)
p + scale_y_log10(
breaks = scales::trans_breaks("log10", function(x) 10^x),
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
I can achieve the desired result for this particular example by adjusting the n argument to scales::trans_breaks(), as below. But this is not a general purpose solution, of the kind that could be applied without needing to adjust anything on a case-by-case basis.
p + scale_y_log10(
breaks = scales::trans_breaks("log10", function(x) 10^x, n = 1),
labels = scales::trans_format("log10", scales::math_format(10^.x))
)
Should add that I'm not wed to using scales::trans_breaks(), it's just that I've found it's the function that gets me closest to what I'm after.
Any help would be much appreciated, thank you!
Here is an approach that at the core has the following function.
breaks = function(x) {
brks <- extended_breaks(Q = c(1, 5))(log10(x))
10^(brks[brks %% 1 == 0])
}
It gives extended_breaks() a narrow set of 'nice numbers' and then filters out non-integers.
This gives us the following for your example case:
library(ggplot2)
library(scales)
# dummy data
df <- data.frame(fct = rep(c("A", "B", "C"), each = 3),
x = rep(1:3, 3),
y = 10^seq(from = -4, to = 1, length.out = 9))
ggplot(df, aes(x, y)) +
geom_point() +
facet_wrap(~ fct, scales = "free_y") +
scale_y_continuous(
trans = "log10",
breaks = function(x) {
brks <- extended_breaks(Q = c(1, 5))(log10(x))
10^(brks[brks %% 1 == 0])
},
labels = math_format(format = log10)
)
Created on 2021-01-19 by the reprex package (v0.3.0)
I haven't tested this on many other ranges that might be difficult, but it should generalise better than setting the number of desired breaks to 1. Difficult ranges might be those just in between -but not including- powers of 10. For example 11-99 or 101-999.
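One way to probe those ranges is to call the break function directly, outside of any plot. A minimal sketch: the wrapper name `integer_log_breaks` and the `ranges` list are made up for this check, and no particular output is claimed here.

```r
library(scales)

# Break function from the answer above, wrapped in a name for reuse
integer_log_breaks <- function(x) {
  brks <- extended_breaks(Q = c(1, 5))(log10(x))
  10^(brks[brks %% 1 == 0])
}

# Ranges to probe: the example data plus the two "difficult" cases
ranges <- list(example  = c(1e-4, 10),
               tens     = c(11, 99),
               hundreds = c(101, 999))

# An empty result for a range would mean no break is produced for that axis
lapply(ranges, integer_log_breaks)
```

Whatever comes back is restricted to integer powers of ten by the filter; the open question for the narrow ranges is only whether any break survives.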

Displaying saved ggplots in a grid when saved in a matrix in r

I have created plots using a loop and saved them in a matrix (see my earlier question). I now want to arrange the plots in a grid using plot_grid or similar. Is there a simple way to call the matrix of saved plots? I want the resulting grid of plots to match the layout of the matrix.
library(ggplot2)
library(gridExtra)
library(cowplot)
# create and save the plots to the matrix
plt <- vector('list', 10)
plt <- matrix(plt, nrow = 5, ncol = 2)
it = 1
while (it < 5){
myX = runif(10)
myY = runif(10)
df = data.frame(myX,myY)
plt[[it, 1]] = ggplot(data = df, aes(myX, myY)) +
geom_point(size = 2, color = "blue")
plt[[it, 2]] = ggplot(data = df, aes(myX, myY)) +
geom_point(size = 2, color = "red")
it = it + 1
}
# display the plots in a grid that matches the matrix format
a1<- plt[[1,1]]
b1 <-plt[[1,2]]
a2<- plt[[2,1]]
b2 <-plt[[2,2]]
plot_grid(a1, b1, a2, b2, ncol = 2)
What I have above works, but I have had to assign each of the elements of the matrix to a variable, and then manually call all the variables in order in the plot_grid command. I am trying to find a way to avoid this. I have just showed a grid of 4 here - I have many more plots in my real problem.
I have tried using plot_grid(plt, ncol = 2), which gives the error "Cannot convert object of class matrixarray into a grob." I have also tried mylist = c(plt[[1,1]], plt[[1,2]], plt[[2,1]], plt[[2,2]]) and plot_grid(mylist, ncol = 2) but get the same error.
I also tried to use do.call("grid.arrange", c(plt, ncol = 2)) based on this answer but could not get that working.
Storing non-atomic objects inside matrices can be very messy. So first gather the plots you want in a normal list; then, when passing the list to plot_grid, be sure to do so via the plotlist= parameter:
mylist = list(plt[[1,1]], plt[[1,2]], plt[[2,1]], plt[[2,2]])
plot_grid(plotlist=mylist, ncol=2)
If you wanted to plot all the values, then you would just need to transpose the list because the list is stored in column order but is plotted in row order by default
plot_grid(plotlist=t(plt), ncol=2)
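The ordering issue is easy to see with a small list matrix of labels standing in for the plots. A minimal base R sketch (the labels "a1", "b1", etc. are placeholders, not real plots):

```r
# 3 x 2 list matrix; R fills it column by column
m <- matrix(as.list(c("a1", "a2", "a3", "b1", "b2", "b3")), nrow = 3, ncol = 2)

unlist(m)     # "a1" "a2" "a3" "b1" "b2" "b3"  (storage order: down the columns)
unlist(t(m))  # "a1" "b1" "a2" "b2" "a3" "b3"  (after transposing: across the rows)
```

Since plot_grid fills its grid row by row, the transposed order is the one that reproduces the matrix layout.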
The do.call would have worked if you had a plain list and not a matrix list. You could use a helper function to remove the matrix part from the list when trying to plot. For example:
undim <- function(x) {dim(x) <- NULL; x}
do.call("plot_grid", c(undim(plt), ncol = 2))
But here the plotlist= parameter is definitely the better way to go.
In line with @MrFlick, I would suggest storing your plots in a list, which can easily be passed to plot_grid via the plotlist argument:
library(ggplot2)
library(gridExtra)
library(cowplot)
it = 1
plt <- list()
while (it < 5){
myX = runif(10)
myY = runif(10)
df = data.frame(myX,myY)
p1 <- ggplot(data = df, aes(myX, myY)) +
geom_point(size = 2, color = "blue")
p2 <- ggplot(data = df, aes(myX, myY)) +
geom_point(size = 2, color = "red")
plt <- c(plt, list(p1, p2))
it <- it + 1
}
# display the plots in a grid that matches the matrix format
plot_grid(plotlist = plt, ncol = 2)
