autocrop faceted plots made by ggplot - r

When making faceted plots in ggplot and changing the aspect ratio, there is usually a lot of white space either to the left and right of the graph or above and below it. For example:
library(ggplot2)
df <- data.frame(x=rep(1,3), y=rep(1,3), z=factor(letters[1:3]))
p <- ggplot(df, aes(x, y)) + geom_point() + coord_fixed(ratio=1) + facet_grid(z ~ .)
ggsave("plot.jpg", p, scale=1, device="jpeg")
Is there a way to autocrop the graph?

This is what I came up with. I tested it on your sample & it seems to work okay there, but apologies in advance if it breaks somewhere else. I had to dig into the grob version for the width/height information, and depending on whether the plot is faceted or not, the attribute information for "unit" is located at different places.
Hopefully someone more well-versed with the grid package's unit object can chip in, but here's what I've got:
library(ggplot2)
library(magrittr)  # provides %>%
# Note that you need to set an upper limit on the maximum height/width
# that the plot can occupy, via the max.dimension parameter (defaults to
# 10 inches in this function)
ggsave_autosize <- function(filename, plot = last_plot(), device = NULL, path = NULL, scale = 1,
                            max.dimension = 10, units = c("in", "cm", "mm"),
                            dpi = 300, limitsize = TRUE){
  # pick a single unit; without this the default is a length-3 vector
  units <- match.arg(units)
  # sum the values of all components whose unit is "null"
  # (this reads the "unit" attribute of grid units; newer versions of R store
  # units differently, so grid::unitType() may be needed there instead)
  sumUnitNull <- function(x){
    res <- 0
    for(i in seq_along(x)){
      check.unit <- ifelse(!is.null(attr(x[i], "unit")), attr(x[i], "unit"),
                           ifelse(!is.null(attr(x[i][[1]], "unit")), attr(x[i][[1]], "unit"), NA))
      if(!is.na(check.unit) && check.unit == "null") res <- res + as.numeric(x[i])
    }
    return(res)
  }
  # get width/height information from the plot object (likely in a mixture of different units)
  w <- ggplotGrob(plot)$widths
  h <- ggplotGrob(plot)$heights
  # define maximum dimensions (converted to inches)
  w.max <- grid::unit(max.dimension, units) %>% grid::convertUnit("in") %>% as.numeric()
  h.max <- grid::unit(max.dimension, units) %>% grid::convertUnit("in") %>% as.numeric()
  # sum the inflexible size components of the plot object's width/height
  # these components have unit = "in", "mm", "pt", "grobheight", etc
  w.in <- w %>% grid::convertUnit("in") %>% as.numeric() %>% sum()
  h.in <- h %>% grid::convertUnit("in") %>% as.numeric() %>% sum()
  # obtain the amount of space available for the flexible size components
  w.avail <- w.max - w.in
  h.avail <- h.max - h.in
  # sum the flexible sized components of the plot object's width/height
  # these components have unit = "null"
  w.f <- sumUnitNull(w)
  h.f <- sumUnitNull(h)
  # shrink the amount of available space based on what the flexible components would actually take up
  if(w.f/h.f > w.avail/h.avail) h.avail <- w.avail/w.f*h.f else w.avail <- h.avail/h.f*w.f
  w <- w.in + w.avail
  h <- h.in + h.avail
  # w and h were computed in inches above, so save in inches
  ggsave(filename, plot = plot, device = device, path = path, scale = scale,
         width = w, height = h, units = "in", dpi = dpi, limitsize = limitsize)
}
p <- ggplot(mpg, aes(displ, cty)) + geom_point() + coord_fixed(ratio=1)
p <- p + facet_grid(. ~ cyl)
ggsave("pOriginal.png", p + ggtitle("original"))
ggsave_autosize("pAutoSize.png", p + ggtitle("auto-resize"))
ggsave_autosize("pAutoSize8.png", p + ggtitle("auto-resize, max dim = 8in x 8in"), max.dimension = 8, units = "in")
Original version w/o cropping. There's black space on the left / right:
Automatically cropped version. Height = 10 inches:
Automatically cropped version. Height = 8 inches (so the font looks slightly larger):
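The resizing rule inside the function can be checked by hand with plain numbers. This is only a minimal sketch: `fit_flexible` is a hypothetical helper (not part of ggplot2 or grid), and the sizes in the example are made up.

```r
# Hypothetical helper mirroring the arithmetic in ggsave_autosize():
#   fixed = space taken by axes, strips, margins (inches)
#   flex  = summed "null" units of the panels (relative sizes only)
fit_flexible <- function(w.max, h.max, w.fixed, h.fixed, w.flex, h.flex) {
  w.avail <- w.max - w.fixed
  h.avail <- h.max - h.fixed
  if (w.flex / h.flex > w.avail / h.avail) {
    # panels are relatively wider than the free area: width is the constraint
    h.avail <- w.avail / w.flex * h.flex
  } else {
    # otherwise height is the constraint
    w.avail <- h.avail / h.flex * w.flex
  }
  c(width = w.fixed + w.avail, height = h.fixed + h.avail)
}

# Made-up example: four square panels side by side (flex 4 x 1),
# 1 in of fixed width and 1 in of fixed height, 10 in x 10 in limit.
size <- fit_flexible(10, 10, w.fixed = 1, h.fixed = 1, w.flex = 4, h.flex = 1)
print(size)  # width = 10, height = 3.25
```

With these numbers the width hits the 10 inch limit and the height shrinks to 3.25 inches, which is the "cropping" effect.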

Related

figuring out panel size given dimensions for the final plot in ggsave (to show count for geom_dotplot)

I'm trying to show the count for dotplot on the x-axis as outlined here:
showing count on x-axis for dot plot
library(ggplot2)
library(grid)
date = seq(as.Date("2016/1/5"), as.Date("2016/1/12"), "day")
value = c(11,11,12,12,13,14,14,14)
dat =data.frame(date = date, value = value)
### base plot
g <- ggplot(dat, aes(x = value)) + geom_dotplot(binwidth = 0.8) + coord_flip()
g # output to read parameter
### calculation of width and height of panel
current.vpTree()
seekViewport('panel.3-4-3-4')
real_width <- convertWidth(unit(1,'npc'), 'inch', TRUE)
real_height <- convertHeight(unit(1,'npc'), 'inch', TRUE)
### calculation of other values
height_coordinate_range <-diff(ggplot_build(g)$layout$panel_params[[1]]$y.range)
real_binwidth <- real_height / height_coordinate_range * 0.8 # 0.8 is the argument binwidth
num_balls <- real_width / 1.1 / real_binwidth # the number of stacked balls. 1.1 is expanding value.
g + ylim(0, num_balls)
However, it seems to me that real_width refers to width of the panel, not the whole plot. This leads to a misalignment between count ticks and dots when I use:
ggsave(g,
filename = "g.png",
path = getwd(),
device = "png",
height = real_height,
width = real_width,
units = "cm")
Given I want a plot that is 6cm x 6cm, how can I find out the dimensions of the panel, so that I can use the panel dimensions to calculate num_balls?
The moral of the story is: you can't know the panel size exactly until the plot is being drawn. However, you can approximate it by subtracting the dimensions of the plot decoration from the output dimensions. This approach ignores that, in reality, graphics devices can have a 'resolution'/'scaling' parameter that affects the true size of the text. The reasoning here is that since panels are 'null' units, they adapt to whatever is left of the output dimensions after all non-null units have been subtracted.
library(ggplot2)
library(grid)
dat =data.frame(
date = seq(as.Date("2016/1/5"), as.Date("2016/1/12"), "day"),
value = c(11,11,12,12,13,14,14,14)
)
g <- ggplot(dat, aes(x = value)) + geom_dotplot(binwidth = 0.8) + coord_flip()
output_width <- unit(6, "cm")
output_height <- unit(6, "cm")
# Convert plot to gtable
gt <- ggplotGrob(g)
# Find panel
is_panel <- grep("panel", gt$layout$name)
panel_location <- gt$layout[is_panel,]
# Note: panel width / heights are 'null' units
gt$widths[panel_location$l]
#> [1] 1null
gt$heights[panel_location$t]
#> [1] 1null
# Get widths/heights of everything except the panel
width <- gt$widths[-panel_location$l]
height <- gt$heights[-panel_location$t]
# Calculate as output size - size of plot decoration
panel_height <- output_height - sum(height)
panel_width <- output_width - sum(width)
convertUnit(panel_height, "cm")
#> [1] 4.6283951674277cm
convertUnit(panel_width, "cm")
#> [1] 4.57547850076103cm
Created on 2022-02-24 by the reprex package (v2.0.1)
There are ways to fix the dimensions of a panel that would make things easier, but that wasn't the question.
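Once the panel size has been approximated, the question's formulas for the number of stacked dots can be applied to it directly. The following is a sketch under stated assumptions: `dot_capacity` is a hypothetical helper that just wraps the arithmetic from the question, and the coordinate range of 3.3 is a made-up value standing in for the result of `ggplot_build()`.

```r
# Hypothetical helper: how many dots fit across the panel.
#   panel_width, panel_height: approximate panel size (same unit, e.g. cm)
#   coord_range: span of the data axis that runs along panel_height
#   binwidth:    the binwidth passed to geom_dotplot()
#   expand:      spacing factor between stacked dots (1.1 in the question)
dot_capacity <- function(panel_width, panel_height, coord_range,
                         binwidth, expand = 1.1) {
  # size of one bin (one dot diameter) in output units
  real_binwidth <- panel_height / coord_range * binwidth
  panel_width / expand / real_binwidth
}

# Panel size taken from the approximation above (in cm);
# coord_range = 3.3 is an assumed value for illustration only.
num_balls <- dot_capacity(4.575, 4.628, coord_range = 3.3, binwidth = 0.8)
print(num_balls)
```

The result would then be used as in the question, i.e. `g + ylim(0, num_balls)`.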

Using aesthetics to color sub-plots when creating custom geoms

I am creating a custom geom which displays a little plot-in-plot, a small mini barplot that can be displayed somewhere on the plot. You will see an example below (don't ask me why I do it, I need it Because Of Reasons).
My questions:
How does ggplot2 choose the colors of the bar? I don't see how they were defined. They magically appear when coord$transform is called. Why are colors for the same cylinder number on different parts of the plot different?
How can I make the colors matching between the mini barplots?
How can I add a legend describing the bars? I assume that would have to be called from a setup function of the geom or an associated stat, because normally each group gets only one key (and here we need several keys per group).
Example: using the mpg data set, we plot mean displacement over time. At each year, we put a small bar plot showing relative proportions of cars with a given number of cylinders (still working on the labels, though).
library(ggplot2)
library(grid)
library(dplyr)  # for %>%, group_by, summarise, count, left_join
## per year summary – mean displacement
point.data <- mpg %>% group_by(year) %>% summarise(mean_displ=mean(displ))
## per year, number of models with the given number of cylinders
bar.data <- mpg %>% group_by(year) %>% count(cyl) %>% left_join(point.data)
## position of the mini barplot
bar.data$x <- bar.data$year
bar.data$y <- bar.data$mean_displ + .15
I use my custom geom to show the bar plots (code for the geom is below)
ggplot(point.data, aes(x=year, y=mean_displ)) +
geom_point(size=8, col="lightblue") +
geom_line(size=3, col="lightblue") +
geom_bar_widget(data=bar.data,
aes(x=x, y=y, width=.2, height=.15,
group=year, value=n, fill=factor(n))) +
xlim(1995, 2010) + ylim(3, 4)
Result:
And here is the code for the geom:
## calculate the mini barplot
.bar_widget_bars <- function(x, y, w, h, v, fill) {
nv <- length(v)
v <- h * v / max(v) # scale the values
dx <- w / nv # width of a bar
# xx and yy are the mid positions of the rectangles
xx <- x - w/2 + seq(dx/2, w - dx/2, length.out=nv)
yy <- y - h/2 + v/2
## widths and heights of the rectangles
ww <- rep(dx, nv)
hh <- v
list(rectGrob(xx, yy, ww, hh, gp=gpar(fill=fill)))
}
## draw a mini barplot
.bar_widget_draw_group <- function(data, panel_params, coord, wgdata) {
ct <- coord$transform(data, panel_params)
grobs <- .bar_widget_bars(x=ct$x[1], y=ct$y[1],
w=ct$width[1], h=ct$height[1],
v=data$value, fill=ct$fill)
class(grobs) <- "gList"
gTree("bar_widget_grob", children=grobs)
}
## the widget for the mini barplot
GeomBarWidget <- ggproto("GeomBarWidget",
Geom,
required_aes=c("x", "y", "width", "height", "group", "value"),
default_aes=aes(shape=19, colour="black", fill=NULL, labels=NULL),
draw_key=draw_key_blank,  # pass the function itself, do not call it
draw_group=.bar_widget_draw_group,
extra_params=c("na.rm")
)
geom_bar_widget <- function(mapping = NULL, data = NULL,
stat = "identity",
position = "identity", na.rm = FALSE,
show.legend = FALSE,
inherit.aes = TRUE, ...) {
layer(
geom = GeomBarWidget, mapping = mapping,
data = data, stat = stat,
position = position, show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(na.rm = na.rm, ...)
)
}
OK, so the problem was not ggplot, the problem was the idiot who wrote the code.
In the code above I wrote fill=factor(n), but n is actually the value to show on the bar plot. Instead, I should have used fill=factor(cyl), which works exactly as expected.
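To see why the colours did not match, it helps to compare the factor levels the fill scale has to work with. A small sketch with made-up counts (not the real mpg numbers) in the same shape as bar.data:

```r
# Made-up counts in the same shape as bar.data (year, cyl, n)
bar.data <- data.frame(
  year = c(1999, 1999, 1999, 2008, 2008, 2008),
  cyl  = c(4, 6, 8, 4, 6, 8),
  n    = c(45, 45, 27, 36, 34, 43)
)

# fill = factor(n): one level (and so one colour) per distinct count,
# so the same cylinder number gets different colours in different years
levels(factor(bar.data$n))

# fill = factor(cyl): one level per cylinder number, shared across years
levels(factor(bar.data$cyl))
```

Here factor(n) has five levels while factor(cyl) has three, so only the latter gives each cylinder number a single colour across all mini barplots.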

Finagling the space and width arguments to barplot to align 2x1 plot window

I'd like to align the bottom barplot in the following so that the groups line up vertically between the two plots:
par(mfrow = c(2, 1))
n = 1:5
barplot(-2:2, width = n, space = .2)
barplot(matrix(-10:9, nrow = 4L, ncol = 5L), beside = TRUE,
width = rep(n/4, each = 5L), space = c(0, .8))
I've been staring at the definition of the space and width arguments to barplot (from ?barplot) for a while and I really expected the above to work (but clearly it didn't):
width -- optional vector of bar widths. Re-cycled to length the number of bars drawn. Specifying a single value will have no visible effect...
space -- the amount of space (as a fraction of the average bar width) left before each bar. May be given as a single number or one number per bar. If height is a matrix and beside is TRUE, space may be specified by two numbers, where the first is the space between bars in the same group, and the second the space between the groups. If not given explicitly, it defaults to c(0,1) if height is a matrix and beside is TRUE, and to 0.2 otherwise.
As I read it, this means we should be able to match the group widths in the top plot by dividing each group into 4 (hence n/4). For space, since we're dividing each bar's width by 4, the average width will as well; hence we should multiply the fraction by 4 to compensate for this (hence space = c(0, 4*.2)).
However it appears this is being ignored. In fact, it seems all the boxes have the same width! In tinkering around, I've only been able to get the relative within-group widths to vary.
Will it be possible to accomplish what I've got in mind with barplot? If not, can someone say how to do this in e.g. ggplot2?
It is possible to do this with base plot as well, but it helps to pass the matrix as a vector for the second plot. Subsequently, you need to realize the space argument is a fraction of the average bar width. I did it as follows:
par(mfrow = c(2, 1))
widthsbarplot1 <- 1:5
spacesbarplot1 <- c(0, rep(.2, 4))
barplot(-2:2, width = widthsbarplot1, space = spacesbarplot1)
widthsbarplot2 <- rep(widthsbarplot1/4, each = 4)
spacesbetweengroupsbarplot2 <- mean(widthsbarplot2)
allspacesbarplot2 <- c(rep(0,4), rep(c(spacesbetweengroupsbarplot2, rep(0,3)), 4))
matrix2 <- matrix(-10:9, nrow = 4L, ncol = 5L)
barplot(c(matrix2),
width = widthsbarplot2,
space = allspacesbarplot2,
col = c("red", "yellow", "green", "blue"))
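The effect of space being a fraction of the average bar width can be reproduced with a few lines of arithmetic. This is a sketch of how the bar positions appear to be computed for a plain vector of heights, as far as I can tell from the barplot.default source; `bar_midpoints` is a hypothetical helper written for illustration.

```r
# Sketch of how barplot() places bars for a plain vector:
# space is first multiplied by the mean bar width.
bar_midpoints <- function(width, space) {
  space <- rep_len(space, length(width)) * mean(width)
  right <- cumsum(space + width)  # right edge of each bar
  right - width / 2               # midpoint of each bar
}

# Widths 1:5 have mean 3, so space = 0.2 becomes a gap of 0.6 before every bar
mids <- bar_midpoints(width = 1:5, space = 0.2)
print(mids)  # 1.1 3.2 6.3 10.4 15.5
```

Because the gap depends on the mean of all widths, dividing the widths by 4 in the second plot also changes the absolute gap, which is why the spaces have to be rescaled.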
You can actually pass widths in ggplot as vectors as well. You'll need the dev version of ggplot2 to get the correct dodging though:
library(dplyr)
library(ggplot2)
df1 <- data.frame(n = 1:5, y = -2:2)
df1$x <- cumsum(df1$n)
df2 <- data.frame(n = rep(1:5, each = 4), y2 = -10:9)
df2$id <- 1:4 # just for the colors
df3 <- full_join(df1, df2)
p1 <- ggplot(df1, aes(x, y)) + geom_col(width = df1$n, col = 1)
p2 <- ggplot(df3, aes(x, y2, group = y2, fill = factor(id))) +
geom_col(width = df3$n, position = 'dodge2', col = 1) +
scale_fill_grey(guide = 'none')
cowplot::plot_grid(p1, p2, ncol = 1, align = 'v')
Another way, using only base R and still using barplot (not going "down" to rect) is to do it in several barplot calls, with add=TRUE, playing with space to put the groups of bars at the right place.
As already highlighted, the problem is that space is proportional to the mean of width. So you need to correct for that.
Here is my way:
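# n and pr_bp2 are assumed to be the widths and the matrix from the question
n <- 1:5
pr_bp2 <- matrix(-10:9, nrow = 4L, ncol = 5L)
# draw first barplot, getting back the value
bp <- barplot(-2:2, width = n, space = .2)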
# get the xlim
x_rg <- par("usr")[1:2]
# plot the "frame"
plot(0, 0, type="n", axes=FALSE, xlab="", ylab="", xlim=x_rg, xaxs="i", ylim=range(as.vector(pr_bp2)))
# plot the groups of bars, one at a time, specifying space, with a correction according to width, so that each group start where it should
sapply(1:5, function(i) barplot(pr_bp2[, i, drop=FALSE], beside = TRUE, width = n[i]/4, space = c((bp[i, 1]-n[i]/2)/(n[i]/4), rep(0, 3)), add=TRUE))
You can do this in ggplot2 by setting the x-axis locations of the bars explicitly and using geom_rect for plotting. Here's an example that's probably more complicated than it needs to be, but hopefully it will demonstrate the basic idea:
library(tidyverse)
sp = 0.4
d1 = data.frame(value=-2:2) %>%
mutate(key=paste0("V", 1:n()),
width=1:n(),
spacer = cumsum(rep(sp, n())) - sp,
xpos = cumsum(width) - 0.5*width + spacer)
d2 = matrix(-10:9, nrow = 4L, ncol = 5L) %>%
as.tibble %>%
gather(key, value) %>%
mutate(width = as.numeric(gsub("V","",key))) %>%
group_by(key) %>%
mutate(width = width/n()) %>%
ungroup %>%
mutate(spacer = rep(cumsum(rep(sp, length(unique(key)))) - sp, each=4),
xpos = cumsum(width) - 0.5*width + spacer)
d = bind_rows(list(d1=d1, d2=d2), .id='source') %>%
group_by(source, key) %>%
mutate(group = LETTERS[1:n()])
ggplot(d, aes(fill=group, colour=group)) +
geom_rect(aes(xmin=xpos-0.5*width, xmax=xpos+0.5*width, ymin=0, ymax=value)) +
facet_grid(source ~ ., scales="free_y") +
theme_bw() +
guides(fill="none", colour="none") +
scale_x_continuous(breaks = d1$xpos, labels=d1$key)

Fixing plot area width when using layout_matrix in grid.arrange

I am combining facet plots of tiles. I want each tile to be square, or at least take the same height and width.
So far I have managed to give equal height to each row of tiles using layout_matrix. I am stuck when trying to fix an equal width to each column of tiles (across the plots).
Some code based on mtcars to try and illustrate the layout of my plot (actual data way more complicated):
library("tidyverse")
library("gridExtra")
df0 <- mtcars %>%
group_by(cyl) %>%
count()
df1 <- mtcars %>%
rownames_to_column("car") %>%
mutate(man = gsub("([A-Za-z]+).*", "\\1", car))
g <- list()
for(i in 1:nrow(df0)){
g[[i]] <- ggplot(data = df1 %>% filter(cyl == df0$cyl[i]),
mapping = aes(x = "", y = car, fill = qsec)) +
geom_tile() +
facet_grid( man ~ ., scales = "free_y", space = "free") +
labs(x = "", y = "") +
guides(fill = "none") +
theme(strip.text.y = element_text(angle=0)) +
coord_fixed()
}
m0 <- cbind(c(rep(1, df0$n[1]), rep(NA, max(df0$n) - df0$n[1])),
c(rep(2, df0$n[2]), rep(NA, max(df0$n) - df0$n[2])),
c(rep(3, df0$n[3]), rep(NA, max(df0$n) - df0$n[3])))
grid.arrange(grobs = g, layout_matrix = m0)
Which produces this plot (minus my MS Paint skills):
Presumably the different lengths of the labels in the strip text and y axis lead to the different widths for the plotting area. I am not sure how to avoid this behavior, though. I thought I could create one big facet_grid, but I could not get anywhere near the layout of the plot above.
Turns out this is a rather tricky thing to do. Luckily, cowplot::plot_grid can already do the alignment that results in equal sizes of the columns. I just took that function and removed the fluff, and decoupled the heights from the grid pattern it normally uses. We end up with a little custom function that does the job (all credits to Claus Wilke):
plot_grid_gjabel <- function(plots, heights) {
grobs <- lapply(plots, function(x) {
if (!is.null(x))
cowplot:::ggplot_to_gtable(x)
else NULL
})
num_plots <- length(plots)
num_widths <- unique(lapply(grobs, function(x) {
length(x$widths)
}))
num_widths[num_widths == 0] <- NULL
max_widths <- do.call(grid::unit.pmax,
lapply(grobs, function(x) { x$widths }))
for (i in 1:num_plots) {
grobs[[i]]$widths <- max_widths
}
width <- 1 / num_plots
height <- heights / max(heights)
x <- cumsum(width[rep(1, num_plots)]) - width
p <- cowplot::ggdraw()
for (i in seq_along(plots)) {
p <- p + cowplot::draw_grob(grid::grobTree(grobs[[i]]), x[i], 1 - height[i],
width, height[i])
}
return(p)
}
We can simply call this like so:
plot_grid_gjabel(g, df0$n)
Resulting in:
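The key step in the function is taking the element-wise maximum of the column widths across all plots and assigning it back to each plot. The sketch below shows that idea with plain numbers and base R's pmax; the function above does the same thing on grid units with grid::unit.pmax. The widths are made up.

```r
# Made-up column widths (cm) of two plot tables: axis, panel, strip.
# The panel is flexible, so it is represented by 0 here.
widths_a <- c(axis = 1.2, panel = 0, strip = 0.8)
widths_b <- c(axis = 2.0, panel = 0, strip = 0.5)

# Element-wise maximum: every plot reserves room for the widest axis/strip
max_widths <- pmax(widths_a, widths_b)
print(max_widths)  # axis 2.0, panel 0.0, strip 0.8

# Giving both plots the same fixed widths leaves the same space for the panels
widths_a[] <- max_widths
widths_b[] <- max_widths
```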

set size of a plot area in ggplot2 [duplicate]

I have a series of ggplot2 graphics with a constant number of horizontal but a differing number of vertical facets. I would like to save the graphics as .pdf in landscape A4 format.
However, I don't know how to achieve identical proportions of the facets. If I try to tweak it manually and vary width and height for different numbers of vertical facets, the scales vary between plots, i.e., I get different point sizes and line widths.
In essence, how can I achieve identical facet sizes and scales for plots with a variable number of (vertical) facets?
Here is an example:
df <- expand.grid(a = 1:2, b = 1:5, x = 1:10)
df$y <- df$x
plot <- ggplot(data = df, mapping = aes(x = x, y = y)) +
geom_point()
plot1 <- plot + facet_grid(facets = "a ~ b")
plot2 <- plot + facet_grid(facets = ". ~ b")
ggsave(filename = "./figures/plot1.pdf", plot = plot1,
height = 210, width = 297, units = "mm")
ggsave(filename = "./figures/plot2.pdf", plot = plot2,
height = 210, width = 297, units = "mm")
I use this code to set panel sizes to absolute values; maybe it helps here:
library(ggplot2)
library(grid)  # for unit(), convertWidth(), convertHeight(), grid.draw()
set_panel_size <- function(p=NULL, g=ggplotGrob(p), file=NULL,
margin = unit(1,"mm"),
width=unit(4, "cm"),
height=unit(4, "cm")){
panels <- grep("panel", g$layout$name)
panel_index_w<- unique(g$layout$l[panels])
panel_index_h<- unique(g$layout$t[panels])
nw <- length(panel_index_w)
nh <- length(panel_index_h)
if(getRversion() < "3.3.0"){
# the following conversion is necessary
# because there is no `[<-`.unit method
# so promoting to unit.list allows standard list indexing
g$widths <- grid:::unit.list(g$widths)
g$heights <- grid:::unit.list(g$heights)
g$widths[panel_index_w] <- rep(list(width), nw)
g$heights[panel_index_h] <- rep(list(height), nh)
} else {
g$widths[panel_index_w] <- rep(width, nw)
g$heights[panel_index_h] <- rep(height, nh)
}
if(!is.null(file))
ggsave(file, g,
width = convertWidth(sum(g$widths) + margin,
unitTo = "in", valueOnly = TRUE),
height = convertHeight(sum(g$heights) + margin,
unitTo = "in", valueOnly = TRUE))
invisible(g)
}
print.fixed <- function(x) grid.draw(x)
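With fixed panel sizes, the output size grows with the number of facets while each panel (and therefore point sizes and line widths) stays the same. A rough sketch of that arithmetic, assuming a hypothetical helper `fixed_panel_output` and made-up decoration sizes; it ignores the spacing between panels for simplicity.

```r
# Hypothetical helper: output size (cm) for a grid of fixed-size panels.
#   panel:  side length of one panel
#   deco_w, deco_h: assumed total size of axes, strips and titles
#   margin: extra margin added around the plot
fixed_panel_output <- function(n_cols, n_rows, panel = 4,
                               deco_w = 1.5, deco_h = 1.5, margin = 0.1) {
  c(width  = n_cols * panel + deco_w + margin,
    height = n_rows * panel + deco_h + margin)
}

size1 <- fixed_panel_output(n_cols = 5, n_rows = 2)  # like plot1 (a ~ b)
size2 <- fixed_panel_output(n_cols = 5, n_rows = 1)  # like plot2 (. ~ b)
print(size1)  # width = 21.6, height = 9.6
print(size2)  # width = 21.6, height = 5.6
```

Only the height differs between the two outputs, so the panels themselves are identical in both files.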
