Retrieve facet labels from a ggplot or a gtable/gTree/grob/gDesc object - r

I have data I'm plotting using ggplot's facet_grid:
My data:
species <- c("species1","species2")
conditions <- c("cond1","cond2","cond3")
batches <- 1:6
df <- expand.grid(species=species,condition=conditions,batch=batches)
set.seed(1)
df$y <- rnorm(nrow(df))
df$replicate <- 1
df$col.fill <- paste(df$species,df$condition,df$batch,sep=".")
My plot:
integerBreaks <- function(n = 5, ...)
{
library(scales)
breaker <- pretty_breaks(n, ...)
function(x){
breaks <- breaker(x)
breaks[breaks == floor(breaks)]
}
}
library(ggplot2)
p <- ggplot(df,aes(x=replicate,y=y,color=col.fill))+
geom_point(size=3)+facet_grid(~col.fill,scales="free_x")+
scale_x_continuous(breaks=integerBreaks())+
theme_minimal()+theme(legend.position="none",axis.title=element_text(size=8))
which gives:
Obviously the labels are long and come out pretty messed up in the figure, so I was wondering if there's a way to edit these labels in the ggplot object (p) or the gtable/gTree/grob/gDesc object (ggplotGrob(p)).
I am aware that one way of getting better labels is to use the labeller function when the ggplot object is created but in my case I'm specifically looking for a way to edit the facet labels after the ggplot object has been created.

As I mentioned in the comments, the facet names are nested quite deeply within the gtable that ggplotGrob() gives you. However, this is still possible and since the OP explicitly wants to edit them after being plotted, you can do this with:
library(grid)
gg <- ggplotGrob(p)
edited_grobs <- mapply(FUN = function(x, y) {
x[["grobs"]][[1]][["children"]][[2]][["children"]][[1]][["label"]] <- y
return(x)
},
gg$grobs[which(grepl("strip-t",gg$layout$name))],
unique(gsub("cond","c", df$condition)),
SIMPLIFY = FALSE)
gg$grobs[which(grepl("strip-t",gg$layout$name))] <- edited_grobs
grid.draw(gg)
Note that this extracts all the strips using gg$grobs[which(grepl("strip-t",gg$layout$name))] and passes them to the mapply to be reset with the gsub(...) that OP specified in their comment.
In general, if you want to access just one of the text labels, there is a very similar structure which I made use of in my mapply:
num_to_access <- 1
gg$grobs[which(grepl("strip-t",gg$layout$name))][[num_to_access]][["grobs"]][[1]][["children"]][[2]][["children"]][[1]]$label
So to access the 4th label, for example, all you would need to do is change num_to_access to be 4. Hope this helps!
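The hard-coded index path above depends on how a particular ggplot2 version nests its strip grobs. As a rough sketch of a more forgiving approach, the helper below (relabel_text is a hypothetical name, not part of grid or ggplot2) walks a grob tree and rewrites the label of every text grob it finds. It is shown on a stand-in gTree built with grid only, on the assumption that a real strip has the same general shape (a background plus a nested title holding a text grob).

```r
library(grid)

# Hypothetical helper (not part of grid or ggplot2): walk a grob tree and
# replace the label of every text grob found along the way.
relabel_text <- function(grob, new_label) {
  if (inherits(grob, "text")) {
    grob$label <- new_label
    return(grob)
  }
  # gtables keep their contents in $grobs, gTrees in $children
  if (inherits(grob, "gtable")) {
    grob$grobs <- lapply(grob$grobs, relabel_text, new_label = new_label)
  }
  if (length(grob$children) > 0) {
    grob$children[] <- lapply(grob$children, relabel_text, new_label = new_label)
  }
  grob
}

# Stand-in for a strip: a gTree holding a background and a nested title
strip <- gTree(children = gList(
  rectGrob(name = "background"),
  gTree(children = gList(textGrob("species1.cond1.1", name = "text")),
        name = "title")
), name = "strip")

strip <- relabel_text(strip, "s1.c1.1")
new_label <- strip$children[[2]]$children[[1]]$label
```

On a real plot the same helper could presumably be applied to each element of gg$grobs[which(grepl("strip-t", gg$layout$name))], but I have not checked it against every ggplot2 version, so treat it as a starting point.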

Related

In R, how can I tell if the scales on a ggplot object are log or linear?

I have many ggplot objects where I wish to print some text (varies from plot to plot) in the same relative position on each plot, regardless of scale. What I have come up with to make it simple is to:
1. Define a rescale function (call it sx) to take the relative position I want and return that position on the plot's x axis.
sx <- function(pct, range=xr){
position <- range[1] + pct*(range[2]-range[1])
}
2. Make the plot without the text (call it plt).
3. Use the ggplot_build function to find the x scale's range:
xr <- ggplot_build(plt)$layout$panel_params[[1]]$x.range
4. Then add the text to the plot:
plt <- plt + annotate("text", x=sx(0.95), ....)
This works well for me, though I'm sure there are other solutions folks have derived. I like the solution because I only need to add one step (step 3) to each plot. And it's a simple modification to the annotate command (x goes to sx(x)).
If someone has a suggestion for a better method I'd like to hear about it. There is one thing about my solution though that gives me a little trouble and I'm asking for a little help:
My problem is that I need a separate function for log scales, (call it lx). It's a bit of a pain because every time I want to change the scale I need to modify the annotate commands (change sx to lx) and occasionally there are many. This could easily be solved in the sx function if there was a way to tell what the type of scale was. For instance, is there a parameter in ggplot_build objects that describe the log/lin nature of the scale? That seems to be the best place to find it (that's where I'm pulling the scale's range) but I've looked and can not figure it out. If there was, then I could add a command to step 3 above to define the scale type, and add a tag to the sx function in step 1. That would save me some tedious work.
So, just to reiterate: does anyone know how to tell the scaling (type of scale: log or linear) of a ggplot object? such as using the ggplot_build command's object?
Suppose we have a list of pre-built plots:
linear <- ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point()
log <- linear + scale_y_log10()
linear <- ggplot_build(linear)
log <- ggplot_build(log)
plotlist <- list(a = linear, b = log)
We can grab information about their position scales in the following way:
out <- lapply(names(plotlist), function(i) {
# Grab plot, panel parameters and scales
plot <- plotlist[[i]]
params <- plot$layout$panel_params[[1]]
scales <- plot$plot$scales$scales
# Only keep (continuous) position scales
keep <- vapply(scales, function(x) {
inherits(x, "ScaleContinuousPosition")
}, logical(1))
scales <- scales[keep]
# Grab relevant transformations
out <- lapply(scales, function(scale) {
data.frame(position = scale$aesthetics[1],
# And now for the actual question:
transformation = scale$trans$name,
plot = i)
})
out <- do.call(rbind, out)
# Grab relevant ranges
ranges <- params[paste0(out$position, ".range")]
out$min <- sapply(ranges, `[`, 1)
out$max <- sapply(ranges, `[`, 2)
out
})
out <- do.call(rbind, out)
Which will give us:
out
  position transformation plot       min      max
1        x       identity    a 1.8800000 4.520000
2        y       identity    a 4.1200000 8.080000
3        y         log-10    b 0.6202605 0.910835
4        x       identity    b 1.8800000 4.520000
Or if you prefer a straightforward answer:
log$plot$scales$scales[[1]]$trans$name
[1] "log-10"
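Once the transformation name is known, the sx/lx pair from the question can be folded into one function. This is only a sketch: rel_pos is a hypothetical name, and it assumes (as the table above suggests, where the log-10 range runs from about 0.62 to 0.91) that the panel range of a log10 scale is stored in log10 units and has to be back-transformed before it is handed to annotate().

```r
# Hypothetical helper: convert a relative position (0 to 1) along an axis
# into data units, given the panel range and the transformation name.
rel_pos <- function(pct, range, trans = "identity") {
  pos <- range[1] + pct * (range[2] - range[1])
  switch(trans,
    "identity" = pos,
    "log-10"   = 10^pos,  # assumed: log10 panel ranges are in log10 units
    stop("unsupported transformation: ", trans)
  )
}

linear_pos <- rel_pos(0.5, c(2, 4))            # halfway along a linear axis
log_pos    <- rel_pos(0.5, c(0, 2), "log-10")  # halfway along a log10 axis
```

The trans argument would be filled from something like scale$trans$name as shown above, so the annotate() calls themselves would not need to change when the scale does.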

Looping cut2 color argument in qplot

First off fair warning that this is relevant to a quiz question from coursera.org practical machine learning. However, my question does not deal with the actual question asked, but is a tangential question about plotting.
I have a training set of data and I am trying to create a plot for each predictor that includes the outcome on the y axis, the index of the data set on the x axis, and colors the plot by the predictor in order to determine the cause of bias along the index. To make the color argument more clear I am trying to use cut2() from the Hmisc package.
Here is my data:
library(ggplot2)
library(caret)
library(AppliedPredictiveModeling)
library(Hmisc)
data(concrete)
set.seed(1000)
inTrain = createDataPartition(mixtures$CompressiveStrength, p = 3/4)[[1]]
training = mixtures[ inTrain,]
testing = mixtures[-inTrain,]
training$index <- 1:nrow(training)
I tried this and it makes all the plots but they are all the same color.
plotCols <- function(x) {
cols <- names(x)
for (i in 1:length(cols)) {
assign(paste0("cutEx",i), cut2(x[ ,i]))
print(qplot(x$index, x$CompressiveStrength, color=paste0("cutEx",i)))
}
}
plotCols(training)
Then I tried this and it makes all the plots, and this time they are colored but the cut doesn't work.
plotCols <- function(x) {
cols <- names(x)
for (i in 1:length(cols)) {
assign(cols[i], cut2(x[ ,i]))
print(qplot(x$index, x$CompressiveStrength, color=x[ ,cols[i]]))
}
}
plotCols(training)
It seems qplot() doesn't like having paste() in the color argument. Does anyone know another way to loop through the color argument and still keep my cuts? Any help is greatly appreciated!
Your desired output is easier to achieve using ggplot() instead of qplot(), since you can use aes_string(), which accepts strings as arguments.
plotCols <- function(x) {
cols <- names(x)
for (i in 1:length(cols)) {
assign(paste0("cutEx", i), cut2(x[, i]))
p <- ggplot(x) +
aes_string("index", "CompressiveStrength", color = paste0("cutEx", i)) +
geom_point()
print(p)
}
}
plotCols(training)
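Note that aes_string() has been soft-deprecated in newer ggplot2 releases (3.0.0 onwards, as far as I know). A sketch of the same idea with the .data pronoun follows. It uses mtcars and base cut() instead of the concrete data and Hmisc::cut2() so that it runs on its own; plot_by and the wt_bin column are names made up for the illustration.

```r
library(ggplot2)

# Hypothetical helper: colour a scatter plot by a column given as a string.
plot_by <- function(df, colour_col) {
  ggplot(df, aes(.data[["index"]], .data[["mpg"]],
                 colour = .data[[colour_col]])) +
    geom_point()
}

dat <- mtcars
dat$index <- seq_len(nrow(dat))
# Bin a numeric predictor into three intervals and store it as a column
dat$wt_bin <- cut(dat$wt, breaks = 3)

p <- plot_by(dat, "wt_bin")
```

Storing the binned values as columns of the data frame, rather than as loose variables created with assign(), keeps everything the plot needs in one place.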

plotting a list of grobs

DISCLOSURE: I'm not sure how to make a reproducible example for this question.
I'm trying to plot a list of grobs using the gridExtra package.
I have some code that looks like this:
## Make Graphic Objects for Spec and raw traces
for (i in 1:length(morletPlots)){
gridplots_Spec[[i]]=ggplotGrob(morletPlots[[i]])
gridplots_Raw[[i]]=ggplotGrob(rawPlot[[i]])
gridplots_Raw[[i]]$widths=gridplots_Spec[[i]]$widths
}
names(gridplots_Spec)=names(morletPlots)
names(gridplots_Raw)=names(rawPlot)
## Combine spec and Raw traces
g=list()
for (i in 1:length(rawPlot)){
g[[i]]=arrangeGrob(gridplots_Spec[i],gridplots_Raw[i],heights=c(4/5,1/5))
}
numPlots = as.numeric(length(g))
##Plot both
for (i in 1:numPlots){
grid.draw(g[i],ncol=2)
}
Let me walk through the code.
morletPlots = a list of ggplots
rawPlot = a list of ggplots
gridplots_Spec and gridplots_Raw = lists of grobs from the ggplots made above.
g = a list of the two grobs above combined, so combining gridplots_Spec[1] and gridplots_Raw[1], and so on for the length of the list.
Now my goal would be to plot all of those into 2 columns. But whenever I pass gridplots_Spec[i] through the grid.draw loop I get an error:
Error in UseMethod("grid.draw") :
no applicable method for 'grid.draw' applied to an object of class "list"
I can't unlist it because it just turns into a long character vector. Any ideas?
If it's absolutely crucial I can spend the time to make a reproducible example, but I'm more likely just missing a simple step.
Here's my interpretation of your script, if it's not the intended result you may want to use some bits and pieces to make your question reproducible.
library(grid)
library(gridExtra)
library(ggplot2)
morletPlots <- replicate(5, ggplot(), simplify = FALSE)
rawplot <- replicate(5, ggplot(), simplify = FALSE)
glets <- lapply(morletPlots, ggplotGrob)
graws <- lapply(rawplot, ggplotGrob)
rawlet <- function(raw, let, heights=c(4,1)){
g <- rbind(let, raw)
panels <- g$layout[grepl("panel", g$layout$name), ]
# g$heights <- grid:::unit.list(g$heights) # not needed
g$heights[unique(panels$t)] <- lapply(heights, unit, "null")
g
}
combined <- mapply(rawlet, raw = graws, let=glets, SIMPLIFY = FALSE)
grid.newpage()
grid.arrange(grobs=combined, ncol=2)
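As for the error message in the question: single-bracket indexing such as g[i] returns a list of length one, while grid.draw() needs the grob itself, which is what g[[i]] gives you. A small grid-only illustration (the two grobs are arbitrary stand-ins):

```r
library(grid)

grobs <- list(rectGrob(), circleGrob())

one_bracket <- grobs[1]    # still a list, so grid.draw() has no method for it
two_bracket <- grobs[[1]]  # the grob itself

# Drawing works once the elements are extracted with [[ ]]
for (i in seq_along(grobs)) {
  grid.newpage()
  grid.draw(grobs[[i]])
}
```

The same applies to arrangeGrob(gridplots_Spec[i], gridplots_Raw[i], ...) in the question, which also receives lists rather than grobs.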
Edit: I can't resist this mischievous hack to colour the plots for illustration; feel free to ignore it.
palette(RColorBrewer::brewer.pal(8, "Pastel1"))
ggplot.numeric = function(i) ggplot2::ggplot() +
theme(panel.background=element_rect(fill=i))
morletPlots <- lapply(1:5, ggplot)
rawplot <- lapply(1:5, ggplot)

Inverse of ggplotGrob?

I have a function which manipulates a ggplot object, by converting it to a grob and then modifying the layers. I would like the function to return a ggplot object not a grob. Is there a simple way to convert a grob back to gg?
The documentation on ggplotGrob is awfully sparse.
Simple example:
P <- ggplot(iris) + geom_bar(aes(x=Species, y=Petal.Width), stat="identity")
G <- ggplotGrob(P)
... some manipulation to G ...
## DESIRED:
P2 <- inverse_of_ggplotGrob(G)
such that, we can continue to use basic ggplot syntax, ie
`P2 + ylab ("The Width of the Petal")`
UPDATE:
To answer the question in the comment, the motivation here is to modify the colors of facet labels programmatically, based on the value of the label name in each facet. The functions below work nicely (based on input from baptiste in a previous question).
I would like for the return value from colorByGroup to be a ggplot object, not simply a grob.
Here is the code, for those interested
get_grob_strips <- function(G, strips=grep(pattern="strip.*", G$layout$name)) {
if (inherits(G, "gg"))
G <- ggplotGrob(G)
if (!inherits(G, "gtable"))
stop ("G must be a gtable object or a gg object")
strip.type <- G$layout[strips, "name"]
## I know this works for a simple
strip.nms <- sapply(strips, function(i) {
attributes(G$grobs[[i]]$width$arg1)$data[[1]][["label"]]
})
data.table(grob_index=strips, type=strip.type, group=strip.nms)
}
refill <- function(strip, colour){
strip[["children"]][[1]][["gp"]][["fill"]] <- colour
return(strip)
}
colorByGroup <- function(P, colors, showWarnings=TRUE) {
## The names of colors should match to the groups in facet
G <- ggplotGrob(P)
DT.strips <- get_grob_strips(G)
groups <- names(colors)
if (is.null(groups) || !is.character(groups)) {
groups <- unique(DT.strips$group)
if (length(colors) < length(groups))
stop ("not enough colors specified")
colors <- colors[seq(groups)]
names(colors) <- groups
}
## 'groups' should match the 'group' in DT.strips, which came from the facet_name
matched_groups <- intersect(groups, DT.strips$group)
if (!length(matched_groups))
stop ("no groups match")
if (showWarnings) {
if (length(wh <- setdiff(groups, DT.strips$group)))
warning ("values in 'groups' but not a facet label: \n", paste(wh, collapse=", "))
if (length(wh <- setdiff(DT.strips$group, groups)))
warning ("values in facet label but not in 'groups': \n", paste(wh, collapse=", "))
}
## identify the indices to the grob and the appropriate color
DT.strips[, color := colors[group]]
inds <- DT.strips[!is.na(color), grob_index]
cols <- DT.strips[!is.na(color), color]
## Fill in the appropriate colors, using refill()
G$grobs[inds] <- mapply(refill, strip = G$grobs[inds], colour = cols, SIMPLIFY = FALSE)
G
}
I would say no. ggplotGrob is a one-way street. grob objects are drawing primitives defined by grid. You can create arbitrary grobs from scratch. There's no general way to turn a random collection of grobs back into a function that would generate them (it's not invertible because it's not 1:1). Once you go grob, you never go back.
You could wrap a ggplot object in a custom class and overload the plot/print commands to do some custom grob manipulation, but that's probably even more hack-ish.
You can try the following:
p = ggplotify::as.ggplot(g)
For more info, see https://cran.r-project.org/web/packages/ggplotify/vignettes/ggplotify.html
It involves a little bit of a cheat, annotation_custom(as.grob(plot),...), so it may not work for all circumstances: https://github.com/GuangchuangYu/ggplotify/blob/master/R/as-ggplot.R
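To make the "cheat" concrete, here is a rough hand-rolled version of the same idea: draw the grob inside an otherwise empty ggplot with annotation_custom(). This is my own approximation of the approach, not the ggplotify source, and the unit-square data frame is just scaffolding for the illustration.

```r
library(ggplot2)

P <- ggplot(iris, aes(Species, Petal.Width)) + geom_col()
G <- ggplotGrob(P)
# ... any manipulation of G would go here ...

# Wrap the grob in a blank plot that spans the unit square
P2 <- ggplot(data.frame(x = 0:1, y = 0:1), aes(x, y)) +
  geom_blank() +
  annotation_custom(G, xmin = 0, xmax = 1, ymin = 0, ymax = 1) +
  scale_x_continuous(limits = c(0, 1), expand = c(0, 0)) +
  scale_y_continuous(limits = c(0, 1), expand = c(0, 0)) +
  theme_void()
```

P2 is a ggplot object again, but only in a shallow sense: adding ylab() or a scale to it acts on the empty wrapper, not on the original plot that is now frozen inside the grob.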
Have a look at the ggpubr package: it has a function as_ggplot(). If your grob is not too complex it might be a solution!
I would also advise having a look at the patchwork package, which combines ggplots nicely. It is likely not what you are looking for, but have a look.

How can I arrange an arbitrary number of ggplots using grid.arrange?

This is cross-posted on the ggplot2 google group
My situation is that I'm working on a function that outputs an arbitrary number of plots (depending upon the input data supplied by the user). The function returns a list of n plots, and I'd like to lay those plots out in 2 x 2 formation. I'm struggling with the simultaneous problems of:
How can I allow the flexibility to be handed an arbitrary (n) number of plots?
How can I also specify I want them laid out 2 x 2
My current strategy uses grid.arrange from the gridExtra package. It's probably not optimal, especially since, and this is key, it totally doesn't work. Here's my commented sample code, experimenting with three plots:
library(ggplot2)
library(gridExtra)
x <- qplot(mpg, disp, data = mtcars)
y <- qplot(hp, wt, data = mtcars)
z <- qplot(qsec, wt, data = mtcars)
# A normal, plain-jane call to grid.arrange is fine for displaying all my plots
grid.arrange(x, y, z)
# But, for my purposes, I need a 2 x 2 layout. So the command below works acceptably.
grid.arrange(x, y, z, nrow = 2, ncol = 2)
# The problem is that the function I'm developing outputs a LIST of an arbitrary
# number of plots, and I'd like to be able to plot every plot in the list on a 2 x 2
# laid-out page. I can at least plot a list of plots by constructing a do.call()
# expression, below. (Note: it totally even surprises me that this do.call expression
# DOES work. I'm astounded.)
plot.list <- list(x, y, z)
do.call(grid.arrange, plot.list)
# But now I need 2 x 2 pages. No problem, right? Since do.call() is taking a list of
# arguments, I'll just add my grid.layout arguments to the list. Since grid.arrange is
# supposed to pass layout arguments along to grid.layout anyway, this should work.
args.list <- c(plot.list, "nrow = 2", "ncol = 2")
# Except that the line below is going to fail, producing an "input must be grobs!"
# error
do.call(grid.arrange, args.list)
As I am wont to do, I humbly huddle in the corner, eagerly awaiting the sagacious feedback of a community far wiser than I. Especially if I'm making this harder than it needs to be.
You're ALMOST there! The problem is that do.call expects your args to be in a named list object. You've put them in the list, but as character strings, not named list items.
I think this should work:
args.list <- c(plot.list, 2,2)
names(args.list) <- c("x", "y", "z", "nrow", "ncol")
As Ben and Joshua pointed out in the comments, I could have assigned names when I created the list:
args.list <- c(plot.list,list(nrow=2,ncol=2))
or
args.list <- list(x=x, y=y, z=z, nrow=2, ncol=2)
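The difference between a named list element and a character string is easy to see with a base function. The sketch below uses paste() purely as a stand-in for grid.arrange():

```r
values <- list(1, 2, 3)

# Named element: treated as the 'sep' argument of paste()
good <- do.call(paste, c(values, list(sep = "-")))

# Character string: just one more unnamed value to paste
bad <- do.call(paste, c(values, "sep = -"))

print(good)
print(bad)
```

In the grid.arrange() case the stray strings "nrow = 2" and "ncol = 2" are likewise passed along as if they were plots, which is why the "input must be grobs!" error appears.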
Try this,
require(ggplot2)
require(gridExtra)
plots <- lapply(1:11, function(.x) qplot(1:10,rnorm(10), main=paste("plot",.x)))
params <- list(nrow=2, ncol=2)
n <- with(params, nrow*ncol)
## add one page if division is not complete
pages <- length(plots) %/% n + as.logical(length(plots) %% n)
groups <- split(seq_along(plots),
gl(pages, n, length(plots)))
pl <-
lapply(names(groups), function(g)
{
do.call(arrangeGrob, c(plots[groups[[g]]], params,
list(main=paste("page", g, "of", pages))))
})
class(pl) <- c("arrangelist", "ggplot", class(pl))
print.arrangelist = function(x, ...) lapply(x, function(.x) {
if(dev.interactive()) dev.new() else grid.newpage()
grid.draw(.x)
}, ...)
## interactive use; open new devices
pl
## non-interactive use, multipage pdf
ggsave("multipage.pdf", pl)
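For readers on a more recent gridExtra (2.0 or later, as far as I know), marrangeGrob() appears to wrap up the same paging logic. A sketch under that assumption, with ggplot() used in place of qplot():

```r
library(ggplot2)
library(gridExtra)

set.seed(1)
plots <- lapply(1:11, function(i) {
  ggplot(data.frame(x = 1:10, y = rnorm(10)), aes(x, y)) +
    geom_point() +
    ggtitle(paste("plot", i))
})

# Lay the plots out 2 x 2, spilling onto as many pages as needed
ml <- marrangeGrob(grobs = plots, nrow = 2, ncol = 2)
n_pages <- length(ml)
```

Printing ml at the console should open one page per group of four plots, and the gridExtra vignette describes saving it with ggsave("multipage.pdf", ml).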
I'm answering a bit late, but stumbled on a solution at the R Graphics Cookbook that does something very similar using a custom function called multiplot. Perhaps it will help others who find this question. I'm also adding the answer as the solution may be newer than the other answers to this question.
Multiple graphs on one page (ggplot2)
Here's the current function, though please use the above link, as the author noted that it's been updated for ggplot2 0.9.3, which indicates it may change again.
# Multiple plot function
#
# ggplot objects can be passed in ..., or to plotlist (as a list of ggplot objects)
# - cols: Number of columns in layout
# - layout: A matrix specifying the layout. If present, 'cols' is ignored.
#
# If the layout is something like matrix(c(1,2,3,3), nrow=2, byrow=TRUE),
# then plot 1 will go in the upper left, 2 will go in the upper right, and
# 3 will go all the way across the bottom.
#
multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
require(grid)
# Make a list from the ... arguments and plotlist
plots <- c(list(...), plotlist)
numPlots = length(plots)
# If layout is NULL, then use 'cols' to determine layout
if (is.null(layout)) {
# Make the panel
# ncol: Number of columns of plots
# nrow: Number of rows needed, calculated from # of cols
layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
ncol = cols, nrow = ceiling(numPlots/cols))
}
if (numPlots==1) {
print(plots[[1]])
} else {
# Set up the page
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))
# Make each plot, in the correct location
for (i in 1:numPlots) {
# Get the i,j matrix positions of the regions that contain this subplot
matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
layout.pos.col = matchidx$col))
}
}
}
One creates plot objects:
p1 <- ggplot(...)
p2 <- ggplot(...)
# etc.
And then passes them to multiplot:
multiplot(p1, p2, ..., cols = n)
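To see how the layout matrix drives placement, it may help to run the lookup that multiplot does for each plot on its own. This uses the example matrix from the function's comments:

```r
# Plot 1 upper left, plot 2 upper right, plot 3 across the bottom
layout <- matrix(c(1, 2, 3, 3), nrow = 2, byrow = TRUE)

# Same lookup as inside multiplot(), here for plot 3
matchidx <- as.data.frame(which(layout == 3, arr.ind = TRUE))

print(layout)
print(matchidx)
```

For plot 3 this returns row 2 in both columns 1 and 2, so its viewport spans the whole bottom row.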

Resources