Stacking multiple figures together in ggplot - r

I am attempting to make publication ready figures where the bottom axis (with tick marks) of one figure is cleanly combined with the top axis of the figure below it. Here is an example of what it might look like, although this one doesn't have tick marks on each panel:
Here is my attempt to do so, by simply using grid.arrange:
#Libraries:
library(ggplot2)
library(dplyr)
library(gridExtra)
#Filter to create two separate data sets:
dna1 <- DNase %>% filter(Run == 1)
dna2 <- DNase %>% filter(Run == 2)
#Figure 1:
dna1_plot <- ggplot(dna1, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(axis.title.x = element_blank())
#Figure 2:
dna2_plot <- ggplot(dna2, aes(x = conc, y = density)) + geom_point() + theme_classic()
#Using grid.arrange to combine:
dna <- grid.arrange(dna1_plot, dna2_plot, nrow = 2)
And an attempt with some adjustments to the plot margins, although this didn't seem to work:
dna1_plot_round2 <- ggplot(dna1, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(axis.title.x = element_blank(),
plot.margin = unit(c(0,0,0,0), "cm"))
dna2_plot_round2 <- ggplot(dna2, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(plot.margin = unit(c(-0.5,-1,0,0), "cm"))
dna_round2 <- grid.arrange(dna1_plot_round2, dna2_plot_round2, nrow = 2)
Does anyone know the best way to stack figures like this in ggplot? Is there a better way than using grid.arrange? If possible it would be great to see how to do it with/without tick marks on each x axis as well.
Thank you!

You don't need any non-native ggplot stuff. Keep your data in one data frame and use facet_grid.
dna <- DNase %>% filter(Run %in% 1:2)
ggplot(dna, aes(x = conc, y = density)) +
geom_point() +
theme_bw() +
facet_grid(rows = vars(Run)) +
theme(panel.spacing = unit(0, "mm"))
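If tick marks are wanted under every panel, one possible variation (a sketch, not part of the answer above) is facet_wrap() with scales = "free_x", which draws an x axis, ticks and labels included, below each panel. The object names dna and p_ticks are only illustrative.

```r
library(ggplot2)

# Plain data frame holding the first two runs of the built-in DNase data
dna <- subset(as.data.frame(DNase), Run %in% 1:2)

# scales = "free_x" makes ggplot2 draw an x axis under each panel,
# so every panel keeps its own tick marks
p_ticks <- ggplot(dna, aes(x = conc, y = density)) +
  geom_point() +
  theme_bw() +
  facet_wrap(vars(Run), ncol = 1, scales = "free_x") +
  theme(panel.spacing = unit(0, "mm"))
```

Printing p_ticks draws the figure. Each panel gets its own x range with this setting, so the panels only line up when the runs cover the same conc values. As far as I know, recent ggplot2 releases (3.5.0 and later) also accept axes and axis.labels arguments in the facet functions for drawing inner axes without free scales; check your installed version before relying on that.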

The R package deeptime has a function called ggarrange2 that can achieve this. Instead of just pasting the plots together like grid.arrange (and ggarrange), it lines up all of the axes and axis labels from all of the plots.
library(deeptime)
# remove bottom axis elements, reduce bottom margin, add panel border
dna1_plot_round2 <- ggplot(dna1, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(axis.text.x = element_blank(), axis.ticks.x = element_blank(), axis.title.x = element_blank(),
plot.margin = margin(0,0,-.05,0, "cm"), panel.border = element_rect(fill = NA))
# reduce top margin (split the difference so the plots are the same height), add panel border
dna2_plot_round2 <- ggplot(dna2, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(plot.margin = margin(-.05,0,0,0, "cm"), panel.border = element_rect(fill = NA))
dna_round2 <- ggarrange2(dna1_plot_round2, dna2_plot_round2, nrow = 2)
You might also try the fairly recent patchwork package, although I don't have much experience with it.
Note that while Gregor's answer may be fine for this specific example, this answer might be more appropriate for other folks that come across this question (and see the example at the top of the question).

For your purposes, I believe Gregor Thomas' answer is best. But if you are in a situation where facets aren't the best option for combining two plots, the newish package {patchwork} handles this more elegantly than any alternatives I've seen.
Patchwork also provides lots of options for adding annotations surrounding the combined plot. The README and vignettes will get you started.
library(patchwork)
(dna1_plot / dna2_plot) +
plot_annotation(title = "Main title for combined plots")
Edit to better address @Cameron's question.
According to the package creator, {patchwork} does not add any space between the plots. The white space in the example above is due to the margins around each individual ggplot. These margins can be adjusted using the plot.margin argument in theme(), which takes a numeric vector of the top, right, bottom, and left margins.
In the example below, I set the bottom margin of dna1_plot to 0 and strip out all the bottom x-axis ticks and text. I also set the top margin of dna2_plot to 0. Doing this nearly makes the y-axis lines touch in the two plots.
dna1_plot <- ggplot(dna1, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(axis.title.x = element_blank(),
axis.ticks.x = element_blank(),
axis.text.x = element_blank(),
plot.margin = unit(c(1,1,0,1), "mm"))
#Figure 2:
dna2_plot <- ggplot(dna2, aes(x = conc, y = density)) + geom_point() + theme_classic() +
theme(plot.margin = unit(c(0,1,1,1), "mm"))
(dna1_plot / dna2_plot)
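As a variation on the margins discussed above, here is a minimal sketch that names the four margins explicitly with ggplot2's margin() helper and keeps the tick marks on the upper plot (axis.ticks.x is left alone). The helper base_plot and the object names top_plot, bottom_plot and stacked are made up for this illustration.

```r
library(ggplot2)
library(patchwork)

dna1 <- subset(as.data.frame(DNase), Run == 1)
dna2 <- subset(as.data.frame(DNase), Run == 2)

# Hypothetical helper so both panels share the same base layers
base_plot <- function(data) {
  ggplot(data, aes(x = conc, y = density)) + geom_point() + theme_classic()
}

# margin() takes top, right, bottom, left; naming them avoids mixing up the order
top_plot <- base_plot(dna1) +
  theme(axis.title.x = element_blank(),
        axis.text.x  = element_blank(),
        plot.margin  = margin(t = 1, r = 1, b = 0, l = 1, unit = "mm"))

bottom_plot <- base_plot(dna2) +
  theme(plot.margin = margin(t = 0, r = 1, b = 1, l = 1, unit = "mm"))

# The `/` operator from patchwork stacks the two plots vertically
stacked <- top_plot / bottom_plot
```

Printing stacked draws the combined figure. To drop the ticks on the upper plot as well, add axis.ticks.x = element_blank() to its theme() call.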

Related

Replace barplot axis labels with plots in R?

A while ago I asked this question about how to replace a barplots x-axis labels with individual plots and I received an answer. However, I'm back trying to do this again, except this time I want to flip the barplot. The issue I'm having is I cant figure out how to adapt the code in the previous answer to allow me to flip the plot.
For example, if I create some data and a barplot with the x-axis labels replaced by plots like so:
library(ggplot2)
library(dplyr)
library(patchwork)
df <- data.frame(vals = c(10, 5, 18),
name = c("A", "B", "C"))
bp <- df %>%
ggplot() +
geom_bar(aes(x = name, y = vals), stat = "identity") +
xlab("") +
theme_bw() +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank())
# create plots to use as x-axis --------------------------------------------
p1 <- ggplot(df, aes(x = vals, y = vals)) + geom_point() + theme_bw() +
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.title.y = element_blank(),
axis.text.y = element_blank(),
axis.ticks.y = element_blank())
p3 <- p2 <- p1
# turn into list of plots
myList <- list(p1, p2, p3)
# -------------------------------------------------------------------------
# attach plots to axis
width <- .9 # Default width of bars
p_axis <- ggplot(df) +
geom_blank(aes(x = name)) +
purrr::map2(myList, seq_along(myList), ~ annotation_custom(ggplotGrob(.x), xmin = .y - width / 2, xmax = .y + width / 2)) +
theme_void()
bp / p_axis + plot_layout(heights = c(4, 1))
That creates this:
Now, if I add in the line bp + coord_flip() while creating the barplot and continue with the rest of the code, the barplot is flipped, but the individual plots remain in place, like so:
I'm guessing I need to alter the p_axis part of the code to fix the individual plots where A, B, C are shown in the above plot... but I'm not sure exactly what to do to fix this. I tried experimenting but have been unsuccessful so far.
In annotation_custom I just changed xmin and xmax to ymin and ymax. I also changed bp / p_axis to p_axis|bp.
p_axis <- ggplot(df) +
geom_blank(aes(y = name)) +
purrr::map2(myList, seq_along(myList), ~ annotation_custom(ggplotGrob(.x), ymin = .y - width / 2, ymax = .y + width / 2)) +
theme_void()
p_axis|bp
Some fine-tuning of the widths is needed. Here is what it looks like now.
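The width fine-tuning mentioned above can be done with patchwork's plot_layout(widths = ...). The following is a self-contained sketch of the flipped version; the 1:4 ratio is an arbitrary starting point, and the names small, my_list, bar_width, axis_layers and combined are only illustrative. lapply() stands in for purrr::map2() to avoid the extra dependency.

```r
library(ggplot2)
library(patchwork)

df <- data.frame(vals = c(10, 5, 18), name = c("A", "B", "C"))

# Flipped bar plot with the category labels removed from the vertical axis
bp <- ggplot(df) +
  geom_col(aes(x = name, y = vals)) +
  coord_flip() +
  theme_bw() +
  theme(axis.title.y = element_blank(),
        axis.text.y  = element_blank(),
        axis.ticks.y = element_blank())

# One small placeholder plot per bar
small <- ggplot(df, aes(x = vals, y = vals)) +
  geom_point() +
  theme_bw() +
  theme(axis.title = element_blank(),
        axis.text  = element_blank(),
        axis.ticks = element_blank())
my_list <- list(small, small, small)

# Place each small plot at the vertical position of its bar
bar_width <- 0.9
axis_layers <- lapply(seq_along(my_list), function(i) {
  annotation_custom(ggplotGrob(my_list[[i]]),
                    ymin = i - bar_width / 2, ymax = i + bar_width / 2)
})

p_axis <- ggplot(df) + geom_blank(aes(y = name)) + axis_layers + theme_void()

# widths sets the relative width of the axis column and the bar plot
combined <- p_axis + bp + plot_layout(widths = c(1, 4))
```

Printing combined draws the figure; adjust the two numbers in widths until the small plots have the proportions you want.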
A simpler approach at this point is to use the ggExtra package, which has a function ggMarginal() that adds these plots with your choice of geom.
See https://geeksforgeeks.org/r-ggplot2-marginal-plots/ for a nice demonstration

gridExtra panel plot with identical panel sizes in ggplot

library(tidyverse)
library(grid)
df <- tibble(
date = as.Date(40100:40129, origin = "1899-12-30"),
value = rnorm(30, 8)
)
p1 <- ggplot(df, aes(date, value)) +
geom_line() +
scale_x_date(date_breaks = "1 day") +
theme(
axis.title.x = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0.5)
) +
coord_cartesian(xlim = c(min(df$date) + 0, max(df$date) - 0))
p2 <- ggplot(df, aes(date, value)) +
geom_bar(stat = "identity") +
scale_x_date(date_breaks = "1 day") +
theme(
axis.title.x = element_blank(),
axis.text.x = element_text(angle = 90, vjust = 0.5)
) +
coord_cartesian(xlim = c(min(df$date) + 0, max(df$date) - 0))
Let's create the plots p1 and p2 as shown above. I can plot these stacked on top of each other with widths that are exactly identical (zoom to full screen to make it obvious). Note that the dates line up perfectly. Code is directly below.
grid.newpage()
grid.draw(rbind(ggplotGrob(p1), ggplotGrob(p2), size = "last"))
Unfortunately I can't use ggsave() with this code chunk above so I go to the gridExtra package.
gridExtra::grid.arrange(p1, p2)
This almost works, but notice the dates don't quite line up perfectly, in a vertical fashion comparing the top graph to the bottom graph. So... what's the equivalent to rbind()s size = "last" to get me two grid.arrange'd objects with exactly identical widths (so the dates line up properly)?
As an alternative to grid, the new patchwork library might help here. It works with ggsave and does a good job of aligning plots.
https://github.com/thomasp85/patchwork
library(patchwork)
p1 / p2
I discovered a solution using the egg package (a separate package on CRAN that builds on ggplot2, not part of ggplot2 itself). I'm going to go this route to prevent having to install patchwork. It appears you need R 3.5+ to be able to install patchwork.
egg::ggarrange(p1, p2)
p <- egg::ggarrange(p1, p2)
ggsave(plot = p, "panel-plot.png")
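For completeness, the rbind() approach from the question can also be saved with ggsave(), because ggsave() draws whatever object is passed as plot, including a gtable. This is a minimal sketch with simplified versions of p1 and p2; the output path in tempdir() and the 8 x 6 inch size are assumptions for the example.

```r
library(ggplot2)
library(grid)

set.seed(1)
df <- data.frame(
  date  = as.Date(40100:40129, origin = "1899-12-30"),
  value = rnorm(30, 8)
)

p1 <- ggplot(df, aes(date, value)) + geom_line()
p2 <- ggplot(df, aes(date, value)) + geom_col()

# Same alignment as in the question: stack the gtables and share the column widths
stacked <- rbind(ggplotGrob(p1), ggplotGrob(p2), size = "last")

# Pass the gtable as `plot`; explicit width and height avoid depending on the open device
out_file <- file.path(tempdir(), "panel-plot.pdf")
ggsave(out_file, plot = stacked, width = 8, height = 6)
```

This keeps the exact alignment of the grid.draw() version without adding another package.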

How to manually edit a grid.arrange, ggplot_gtable and facet?

I'm plotting a hydrograph, and I additionally use facet_grid in R because I have objects with common features.
But when I use facet_grid the plot gets distorted, as shown in the figure below. How can I fix this?
Note that it is not aligned properly, the scale of the y axis is scrambled, etc.
Among the adjustments I tried, I realized that it is possible to greatly improve this plot. I've created an image based on the above plot and some of my other attempts, with adjustments made in Paint, to demonstrate what I'm trying to do.
Here's my code:
library(ggplot2)
library(grid)
library(gridExtra)
g1 <- ggplot(data_cet,
aes(x = Periodo,
y = Ind_plu)) +
geom_bar(stat = 'identity',
fill = "blue",
position = position_dodge()) +
ylab("Precip.") +
scale_y_reverse(labels = scales::comma) +
theme_bw() +
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank())
g2 <- ggplot(data_cet,
aes(x = Periodo,
y = Nivel,
colour = Bomba)) +
geom_line(aes(group = 1)) +
scale_color_manual(values = c("#0B775E", "#35274A", "#F2300F")) +
labs(colour = "Status CMB") +
facet_grid(arranjo + Bacia ~ .) +
scale_x_date(breaks = datebreaks_m,
labels = scales::date_format("%b/%y")) +
xlab('Período') + ylab('% Nível') +
theme_bw() +
theme(axis.text.x = element_text(face = "plain",
color = "black",
angle = 90),
axis.text.y = element_text(face = "plain",
color = "black"),
legend.title = element_blank(),
strip.background = element_blank(),
legend.position = "bottom")
g1 <- ggplot_gtable(ggplot_build(g1))
g2 <- ggplot_gtable(ggplot_build(g2))
maxWidth = unit.pmax(g1$widths[2:3], g2$widths[2:3])
g1$widths[2:3] <- maxWidth
g2$widths[2:3] <- maxWidth
plot_hyd <- grid.arrange(g1, g2, ncol = 1, heights = c(1, 3))
ggsave(file = "plot_hyd4.pdf", plot_hyd)
My dataset is too large; my apologies for not showing the dataset and dput().
You could add a widths = c(0.9, 1) to grid.arrange (fiddle with the first number some) to get your graphs to line up along the right side.
Otherwise, ggsave your file to a larger pdf. Your element_text objects, such as the legend, are absolute sizes, so if you scale up the pdf dimensions your graphs will look larger by comparison.
The exact values of widths and ggsave(width, height) are going to depend on your data, and unfortunately will take some trial and error. If you're using something like RStudio, I suggest fiddling with the grid.arrange call and finding the widths argument you like before calling ggsave. When you are ready to experiment with different ggsave width and height arguments, run it at a lower dpi the first few times so it processes more quickly.
Note that since you haven't included your data, I haven't tried to recreate this problem; this is just how I've solved this kind of issue in the past. If these suggestions don't work for you, let me know and I can use some built-in datasets to find another solution.
Following the logic of @Pintintended's tip, I adopted the layout_matrix argument:
plot_hyd <- grid.arrange(g1, g2,
layout_matrix = rbind(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,NA),
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
c(2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2)))
#ggsave(file="plot_hyd4.jpeg",plot_hyd,width=13,height=16,dpi=200)
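As an alternative to adjusting gtable widths or a layout matrix by hand, patchwork aligns the panel areas of stacked plots on its own, even when one of them is faceted and carries a legend. The following is only a sketch: since the original data_cet is not available, the economics and economics_long data sets that ship with ggplot2 are used as stand-ins, and the names top and bottom are illustrative.

```r
library(ggplot2)
library(patchwork)

# Stand-in for the precipitation panel: reversed bars hanging from the top
top <- ggplot(economics, aes(x = date, y = psavert)) +
  geom_col(fill = "blue") +
  scale_y_reverse() +
  theme_bw() +
  theme(axis.title.x = element_blank(),
        axis.text.x  = element_blank(),
        axis.ticks.x = element_blank())

# Stand-in for the level panel: one facet row per variable, legend at the bottom
bottom <- ggplot(economics_long, aes(x = date, y = value01, colour = variable)) +
  geom_line() +
  facet_grid(variable ~ .) +
  theme_bw() +
  theme(legend.position = "bottom",
        strip.background = element_blank())

# Stack the plots; heights gives the top plot a quarter of the panel space
plot_hyd <- top / bottom + plot_layout(heights = c(1, 3))
```

The resulting object can be passed to ggsave() like a regular ggplot, so the width and height advice above still applies.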

decrease size of dendrogram (or y-axis) ggplot

I have this code for a dendrogram. How can I decrease the size of the dendrogram (or y-axis)?
I am using this code as an example. In my dataset the labels are long, so there is not enough space to show them. For that reason, I would like to reduce the space used by the y axis, that is, decrease the distance between 0 and 150. Also, when I save the figure as a TIFF, the dendrogram takes up most of the figure and I cannot read the labels clearly.
df <- USArrests # really bad idea to muck up internal datasets
labs <- paste("sta_",1:50,sep="") # new labels
rownames(df) <- labs # set new row names
library(ggplot2)
library(ggdendro)
hc <- hclust(dist(df), "ave") # hierarchical clustering
dendr <- dendro_data(hc, type="rectangle") # convert for ggplot
clust <- cutree(hc,k=2) # find 2 clusters
clust.df <- data.frame(label=names(clust), cluster=factor(clust))
# dendr[["labels"]] has the labels, merge with clust.df based on label column
dendr[["labels"]] <- merge(dendr[["labels"]],clust.df, by="label")
# plot the dendrogram; note use of color=cluster in geom_text(...)
ggplot() +
geom_segment(data=segment(dendr), aes(x=x, y=y, xend=xend, yend=yend)) +
geom_text(data=label(dendr),
aes(x, y, label=label, hjust=0, color=cluster),
size=3) +
coord_flip() +
scale_y_reverse(expand=c(0.2, 0)) +
theme(axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
axis.text.y=element_blank(),
axis.title.y=element_blank(),
panel.background=element_rect(fill="white"),
panel.grid=element_blank())
How can I decrease the size of the dendrogram, similar to this heatmap?
(source: r-graph-gallery.com)
Thank you so much
For flexibility, I recommend putting the dendrogram labels on the x-axis itself, rather than text labels within the plot. Otherwise no matter what values you choose for expand in the y-axis, part of the labels could be cut off for some image sizes / dimensions.
Define colour palette for the dendrogram labels:
library(dplyr)
label.colour = label(dendr)$cluster %>%
factor(levels = levels(.),
labels = scales::hue_pal()(n_distinct(.))) %>%
as.character()
For the purpose of illustration, make some labels very long:
label.values <- forcats::fct_recode(
label(dendr)$label,
sta_45_abcdefghijklmnop = "sta_45",
sta_31_merrychristmas = "sta_31",
sta_6_9876543210 = "sta_6")
Plot:
p <- ggplot(segment(dendr)) +
geom_segment(aes(x=x, y=y, xend=xend, yend=yend)) +
coord_flip() +
scale_x_continuous(breaks = label(dendr)$x,
# I'm using label.values here because I made
# some long labels for illustration. you can
# simply use `labels = label(dendr)$label`
labels = label.values,
position = "top") +
scale_y_reverse(expand = c(0, 0)) +
theme_minimal() +
theme(axis.title = element_blank(),
axis.text.y = element_text(size = rel(0.9),
color = label.colour),
panel.grid = element_blank())
p
# or if you want a color legend for the clusters
p + geom_point(data = label(dendr),
aes(x = x, y = y, color = cluster), alpha = 0) +
scale_color_discrete(name = "Cluster",
guide = guide_legend(override.aes = list(alpha = 1))) +
theme(legend.position = "bottom")
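The label.colour step above maps each cluster to one of ggplot2's default hues. Here is a small stand-alone sketch of the same idea; the toy cluster vector stands in for label(dendr)$cluster, and the names cluster_palette and label_colour are only illustrative.

```r
library(scales)

# Toy cluster assignment standing in for label(dendr)$cluster
cluster <- factor(c(1, 2, 2, 1))

# hue_pal() returns a function; calling it with n gives n evenly spaced default hues
cluster_palette <- hue_pal()(nlevels(cluster))

# Index the palette by the integer code of each observation's cluster level
label_colour <- cluster_palette[as.integer(cluster)]
label_colour
```

Labels in the same cluster receive the same colour string, which is what the axis.text.y colour vector in the plot relies on.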
You can do this by adding a size parameter to axis.text.y like so:
theme(axis.line.y=element_blank(),
axis.ticks.y=element_blank(),
axis.text.y=element_text(size=12),
axis.title.y=element_blank(),
panel.background=element_rect(fill="white"),
panel.grid=element_blank())

Displaying multiple factors with Sina plots

NOTE: I have updated this post following discussion with Z. Lin. Originally, I had simplified my problem to a two factor design (see section "Original question"). However, my actual data consists of four factors, requiring facet_grid. I am therefore providing an example for a four factor design further below (see section "Edit").
Original question
Let's assume I have a two factor design with dv as my dependent variable and iv.x and iv.y as my factors/independent variables. Some quick sample data:
DF <- data.frame(dv = rnorm(900),
iv.x = sort(rep(letters[1:3], 300)),
iv.y = rep(sort(rep(rev(letters)[1:3], 100)), 3))
My goal is to display each condition separately as can nicely be done with violin plots:
ggplot(DF, aes(iv.x, dv, colour=iv.y)) + geom_violin()
I have recently come across Sina plots and would like to do the same here. Unfortunately Sina plots don't do this, collapsing the data instead.
ggplot(DF, aes(iv.x, dv, colour=iv.y)) + geom_sina()
An explicit call to position dodge doesn't help either, as this produces an error message:
ggplot(DF, aes(iv.x, dv, colour=iv.y)) + geom_sina(position = position_dodge(width = 0.5))
The authors of Sina plots have already been made aware of this issue in 2016:
https://github.com/thomasp85/ggforce/issues/47
My problem is more in terms of time. We soon want to submit a manuscript and Sina plots would be a great way to display our data. Can anyone think of a workaround for Sina plots such that I can still display two factors as in the example with violin plots above?
Edit
Sample data for a four factor design:
DF <- data.frame(dv=rnorm(400),
iv.w=sort(rep(letters[1:2],200)),
iv.x=rep(sort(rep(letters[3:4],100)), 2),
iv.y=rep(sort(rep(rev(letters)[1:2],50)),4),
iv.z=rep(sort(rep(letters[5:6],25)),8))
An example with violin plots of what I would like to create using Sina plots:
ggplot(DF, aes(iv.x, dv, colour=iv.y)) +
facet_grid(iv.w ~ iv.z) +
geom_violin(aes(y = dv, fill = iv.y),
position = position_dodge(width = 1))+
stat_summary(aes(y = dv, fill = iv.y), fun=mean, geom="point",
colour="black", show.legend = FALSE, size=.2,
position=position_dodge(width=1))+
stat_summary(aes(y = dv, fill = iv.y), fun.data=mean_cl_normal, geom="errorbar",
position=position_dodge(width=1), width=.2, show.legend = FALSE,
colour="black", size=.2)
Edited solution, since OP clarified that facets are required:
ggplot(DF, aes(x = interaction(iv.y, iv.x),
y = dv, fill = iv.y, colour = iv.y)) +
facet_grid(iv.w ~ iv.z) +
geom_sina() +
stat_summary(fun=mean, geom="point",
colour="black", show.legend = FALSE, size=.2,
position=position_dodge(width=1))+
stat_summary(fun.data=mean_cl_normal, geom="errorbar",
position=position_dodge(width=1), width=.2,
show.legend = FALSE,
colour="black", size=.2) +
scale_x_discrete(name = "iv.x",
labels = c("c", "", "d", "")) +
theme(panel.grid.major.x = element_blank(),
axis.text.x = element_text(hjust = -4),
axis.ticks.x = element_blank())
Instead of using facets to simulate dodging between colours, this approach creates a new variable interaction(colour.variable, x.variable) to be mapped to the x-axis.
The rest of the code in scale_x_discrete() & theme() are there to hide the default x-axis labels / ticks / grid lines.
axis.text.x = element_text(hjust = -4) is a hack that shifts x-axis labels to approximately the right position. It's ugly, but considering the use case is for a manuscript submission, I assume the size of plots will be fixed, and you just need to tweak it once.
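To see why labels = c("c", "", "d", "") lines up with the combined x variable, it helps to look at the level order that interaction() produces: the first factor varies fastest. The sketch below rebuilds only the two relevant factors from the sample data; x_groups and axis_labels are illustrative names.

```r
# The two factors from the four-factor sample data (400 rows each)
iv.x <- rep(sort(rep(letters[3:4], 100)), 2)
iv.y <- rep(sort(rep(rev(letters)[1:2], 50)), 4)

# interaction() puts the first factor (iv.y) in the fastest-changing position
x_groups <- interaction(iv.y, iv.x)
levels(x_groups)
# "y.c" "z.c" "y.d" "z.d"

# Label every first level of a pair with its iv.x value and blank out the second
axis_labels <- ifelse(seq_along(levels(x_groups)) %% 2 == 1,
                      sub("^.*\\.", "", levels(x_groups)),
                      "")
axis_labels
# "c" "" "d" ""
```

With more levels in either factor, the same pattern can be used to generate the labels instead of typing them by hand, provided the modulus matches the number of iv.y levels.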
Original solution:
Assuming your plots don't otherwise require facetting, you can simulate the appearance with facets:
ggplot(DF, aes(x = iv.y, y = dv, colour = iv.y)) +
geom_sina() +
facet_grid(~iv.x, switch = "x") +
labs(x = "iv.x") +
theme(axis.text.x = element_blank(), # hide iv.y labels
axis.ticks.x = element_blank(), # hide iv.y ticks
strip.background = element_blank(), # make facet strip background transparent
panel.spacing.x = unit(0, "mm")) # remove horizontal space between facets
