Indexes within ggplot in R - r

I am generating 8 plots with the following code, and I want them all on the same page with no titles. More specifically, I want each plot to have a letter (a, b, ...) in its upper-left corner, and a one-row legend below the plots (e.g. Plots: a. category one, b. category two, ...).
Code:
g1= ggplot(som, aes(x=value, y=variable))+geom_smooth(method=lm,alpha=0.25,col='green',lwd=0.1) +ylim(0,1000)+xlim(-2,2)+
geom_point(shape=23,fill="black",size=0.2)+theme_bw()+theme(plot.background = element_blank(),panel.grid.major = element_blank()
,panel.grid.minor = element_blank()) +labs(x="something here",y="something else")+
theme(axis.title.x = element_text(face="bold", size=7),axis.text.x = element_text(size=5))+
theme(axis.title.y = element_text(face="bold", size=7),axis.text.y = element_text(size=5))+
theme(plot.title = element_text(lineheight=.8, face="bold",size=8))
grid.arrange(g1,g2,g3,g4,g5,g6,g7,g8,ncol=2)
Is it possible to do that with ggplot? If so, how can I do this?
p.s I have no problem with the above code
Thank you.

This is how you could do it with library(cowplot).
First some plots:
library(ggplot2)
library(cowplot)
set.seed(1)
plots <- list()
for (i in 1:8) {
my_cars <- mtcars[sample(1:nrow(mtcars), 10), ]
plots[[i]] <- ggplot(my_cars, aes(mpg, hp, color = as.factor(cyl))) +
geom_point() +
geom_smooth(method = "lm", color = "black")
}
Then to have a unifying title (or legend here) we use a combination of two plot_grid() calls.
lbls <- LETTERS[1:length(plots)]
# add a line break before "E" because the legend text is long
lbls <- gsub("E", "\nE", lbls)
grid <- plot_grid(plotlist = plots, labels = lbls, ncol = 2)
legend <- ggdraw() +
draw_label(paste0(lbls, "= category",1:length(plots), collapse = " "))
plot_grid(grid, legend, rel_heights = c(1, .1), ncol = 1)
The documentation for cowplot is great and has a ton of examples. Check it out here and here. Let me know if you get stuck.
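For the one-row legend text the question asks for (Plots: a. category one, ...), the string can be assembled in base R before handing it to draw_label(). This is a minimal sketch, assuming lowercase letters and placeholder category names; both are illustrative and not taken from the original data.

```r
# Hypothetical category names; replace with the real ones
categories <- paste("category", 1:8)
panel_letters <- letters[seq_along(categories)]   # "a" ... "h"

# One-row caption of the form "Plots: a. <name>, b. <name>, ..."
caption <- paste0("Plots: ",
                  paste0(panel_letters, ". ", categories, collapse = ", "))
caption
```

panel_letters could then be passed as labels = in plot_grid(), and caption as the text for draw_label().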

Related

ggplot - control the number of graph per a page [duplicate]

I have the facet_wrap function to make multiple graphs (n=~51) but they all appear on one page. Now after searching, I found out that ggplot2 can't place graphs on multiple pages.
Is there a way to do this? I looked at this question (Multiple graphs over multiple pages using ggplot) and tried out the code, with little success.
Here is the code for my graphs. It produces ~51 graphs on one page, making them very small and hard to see. If I could print this as one graph per page in a PDF, that would be great:
ggplot(indbill, aes(x = prey, y = weight), tab) +
geom_polygon(aes(group = load, color = capture), fill = NA, size = 0.75) +
facet_wrap(~ individual) +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_text(size=rel(0.5)),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()) +
xlab("") + ylab("") +
guides(color = guide_legend(ncol=2)) +
coord_radar()
If someone could write up a little code and explain it to me, that would be great.
There are multiple ways to do the pagination: ggforce or gridExtra::marrangeGrob. See also this answer for another example.
ggforce:
library(ggplot2)
# install.packages("ggforce")
library(ggforce)
# Standard facetting: too many small plots
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_wrap(~cut:clarity, ncol = 3)
# Pagination: page 1
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_wrap_paginate(~cut:clarity, ncol = 3, nrow = 3, page = 1)
# Pagination: page 2
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_wrap_paginate(~cut:clarity, ncol = 3, nrow = 3, page = 2)
# Works with grid as well
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_grid_paginate(color~cut:clarity, ncol = 3, nrow = 3, page = 4)
gridExtra:
# install.packages("gridExtra")
library(gridExtra)
set.seed(123)
pl <- lapply(1:11, function(.x)
qplot(1:10, rnorm(10), main=paste("plot", .x)))
ml <- marrangeGrob(pl, nrow=2, ncol=2)
## non-interactive use, multipage pdf
## ggsave("multipage.pdf", ml)
## interactive use; calling `dev.new` multiple times
ml
Created on 2018-08-09 by the reprex package (v0.2.0.9000).
One option is to just plot, say, six levels of individual at a time using the same code you're using now. You'll just need to iterate it several times, once for each subset of your data. You haven't provided sample data, so here's an example using the Baseball data frame:
library(ggplot2)
library(vcd) # For the Baseball data
data(Baseball)
pdf("baseball.pdf", 7, 5)
for (i in seq(1, length(unique(Baseball$team87)), 6)) {
print(ggplot(Baseball[Baseball$team87 %in% levels(Baseball$team87)[i:(i+5)], ],
aes(hits86, sal87)) +
geom_point() +
facet_wrap(~ team87) +
scale_y_continuous(limits=c(0, max(Baseball$sal87, na.rm=TRUE))) +
scale_x_continuous(limits=c(0, max(Baseball$hits86))) +
theme_bw())
}
dev.off()
The code above will produce a PDF file with four pages of plots, each with six facets to a page. You can also create four separate PDF files, one for each group of six facets:
for (i in seq(1, length(unique(Baseball$team87)), 6)) {
pdf(paste0("baseball_",i,".pdf"), 7, 5)
...ggplot code...
dev.off()
}
Another option, if you need more flexibility, is to create a separate plot for each level (that is, each unique value) of the facetting variable and save all of the individual plots in a list. Then you can lay out any number of the plots on each page. That's probably overkill here, but here's an example where the flexibility comes in handy.
First, let's create all of the plots. We'll use team87 as our facetting column. So we want to make one plot for each level of team87. We'll do this by splitting the data by team87 and making a separate plot for each subset of the data.
In the code below, split splits the data into separate data frames for each level of team87. The lapply wrapper sequentially feeds each data subset into ggplot to create a plot for each team. We save the output in plist, a list of (in this case) 24 plots.
plist = lapply(split(Baseball, Baseball$team87), function(d) {
ggplot(d, aes(hits86, sal87)) +
geom_point() +
facet_wrap(~ team87) +
scale_y_continuous(limits=c(0, max(Baseball$sal87, na.rm=TRUE))) +
scale_x_continuous(limits=c(0, max(Baseball$hits86))) +
theme_bw() +
theme(plot.margin=unit(rep(0.4,4),"lines"),
axis.title=element_blank())
})
Now we'll lay out six plots at a time in a PDF file. Below are two options, one with four separate PDF files, each with six plots, the other with a single four-page PDF file. I've also pasted in one of the plots at the bottom. We use grid.arrange to lay out the plots, including using the left and bottom arguments to add axis titles.
library(gridExtra)
# Four separate single-page PDF files, each with six plots
for (i in seq(1, length(plist), 6)) {
pdf(paste0("baseball_",i,".pdf"), 7, 5)
grid.arrange(grobs=plist[i:(i+5)],
ncol=3, left="Salary 1987", bottom="Hits 1986")
dev.off()
}
# Four pages of plots in one PDF file
pdf("baseball.pdf", 7, 5)
for (i in seq(1, length(plist), 6)) {
grid.arrange(grobs=plist[i:(i+5)],
ncol=3, left="Salary 1987", bottom="Hits 1986")
}
dev.off()
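The index i:(i+5) above works because 24 plots divide evenly into pages of six. If the number of plots is not a multiple of the page size, the last range runs past the end of the list and yields NULL entries. Here is a small sketch of one way to guard against that, using a stand-in list (fake_plist is a hypothetical placeholder for plist):

```r
# Stand-in for a list of 26 plots (hypothetical; any list behaves the same)
fake_plist <- as.list(1:26)
per_page <- 6

# Start index of each page
starts <- seq(1, length(fake_plist), by = per_page)

# Clamp the end of each range to the length of the list
pages <- lapply(starts, function(i) {
  fake_plist[i:min(i + per_page - 1, length(fake_plist))]
})

lengths(pages)
```

Each element of pages could then be passed as grobs = to grid.arrange() inside the loop.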
Something like:
by(indbill, indbill$individual, function (x){
ggplot(x, aes(x = prey, y = weight), tab) +
geom_polygon(aes(group = load, color = capture), fill = NA, size = 0.75) +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_text(size=rel(0.5)),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()) +
xlab("") + ylab("") +
guides(color = guide_legend(ncol=2)) +
coord_radar()
})
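Note that ggplot objects created inside a function are not drawn automatically. Inside by() (or a for loop), each plot has to be print()ed while a PDF device is open to get one plot per page. Below is a hedged sketch of that pattern, using mtcars as stand-in data since indbill is not available; the temporary file name and the chosen columns are assumptions for illustration.

```r
library(ggplot2)

# Temporary output file (hypothetical location)
out_file <- tempfile(fileext = ".pdf")

pdf(out_file, width = 7, height = 5)
plots <- by(mtcars, mtcars$cyl, function(d) {
  p <- ggplot(d, aes(wt, mpg)) +
    geom_point() +
    ggtitle(paste("cyl =", d$cyl[1]))
  print(p)  # each print() starts a new page on the open PDF device
  p
}, simplify = FALSE)
dev.off()
```

The same structure should apply to the radar plots in the question: replace mtcars$cyl with indbill$individual and the body with the original ggplot call.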

cowplot, shared legend, spacing goes wrong when output to png

I have a problem with combining ggplots using cowplot and ggpubr which is driving me crazy.
The problem is with the legend. When it's displayed with its original graph on its own, the spacing between the elements of the legend (guide title, key, key-label) is absolutely fine.
However, if I extract the legend from the original plot, and then display it in a combined plot (using either plot_grid from cowplot or ggarrange from ggpubr) then the spacing goes haywire. The longer the text, the more the spacing expands.
What is wrong here, and how do I fix it so that the legend in the combined plot looks exactly like the one in the original individual plot?
Example
This example uses ggarrange from ggpubr; my results with get_legend and plot_grid using cowplot are similar. Treatment names are entirely made up.
library(survival)
library(broom)
library(dplyr)
library(foreach)
library(ggpubr)
fit <- survfit(Surv(time,status == 2) ~ trt + sex, data=pbc)
time.xticks <- seq(0, 4000, 1000)
delta <- 0.00001
# Survival plot
kmdata <- tidy(fit) %>%
mutate(trt=factor(gsub('trt=(.*),.*','\\1',strata)),
sex=factor(gsub('.*sex=(.*)','\\1',strata), levels=levels(pbc$sex)))
p1 <- ggplot(data=filter(kmdata, time<=max(time.xticks)), aes(x=time, y=estimate, colour=sex, linetype=trt)) + geom_step() +
scale_x_continuous(breaks = time.xticks,
limits = c(min(time.xticks), max(time.xticks))) +
scale_colour_discrete(name="Sex", labels=c("Male","Female")) +
scale_linetype_discrete(name="Treatment group", labels=c("Zyxatrxilbroh 35 mg", "Placebo 35 mg")) +
theme(legend.position = "bottom", legend.box = "horizontal",
legend.background = element_rect(fill="grey90", colour="black", size=0),
legend.key.height=unit(0.2, "cm"),
text=element_text(size=18))
tardata <- foreach(s=unique(kmdata$strata), .combine="rbind") %do% {
filter(kmdata, strata==s)[findInterval(pmax(0, time.xticks-delta), filter(kmdata, strata==s)$time)+1,] %>%
bind_cols(tibble(time.xticks))
} %>%
mutate(ypos = -((as.integer(sex)-1)*(length(unique(pbc$trt))+2) + as.integer(trt) + 1))
tarheads <- tibble(xpos=0,
ypos=-(((1:length(unique(pbc$sex))) - 1)*(length(unique(pbc$trt)) + 2) + 1),
lab=levels(pbc$sex))
risk.yticks <- sort(unique(tardata$ypos))
risk.ylabels <- rep(rev(paste("trt =",levels(kmdata$trt))), length(unique(kmdata$sex)))
# Number-at-risk table
p2 <- ggplot(data=tardata, aes(x=time.xticks, y=ypos, label=n.risk, colour=sex)) + geom_text() +
geom_text(data=tarheads, aes(x=xpos, y=ypos, label=lab), colour="black", hjust="left") +
scale_x_continuous(breaks = time.xticks,
limits = c(min(time.xticks), max(time.xticks))) +
scale_y_continuous(breaks = risk.yticks,
labels = risk.ylabels) +
theme(text=element_text(size=18))
# put the two together
p.comb <- ggarrange(p1, p2, heights = c(2, 0.8), ncol=1,
align="v", common.legend = TRUE, legend="bottom")
# alternate version with guide headings at the top left instead of at the side
p1.a <- p1 + guides(colour = guide_legend(order=1,
title.position = "top",
title.hjust = 0),
linetype = guide_legend(order=1,
title.position = "top",
title.hjust = 0))
p.comb.a <- ggarrange(p1.a, p2, heights = c(2, 0.8), ncol=1,
align="v", common.legend = TRUE, legend="bottom")
# send to png
png("test-p1.png", width=8, height=4.5, units="in", res=200, type="cairo")
plot(p1)
dev.off()
png("test-pcomb.png", width=8, height=4.5, units="in", res=200, type="cairo")
plot(p.comb)
dev.off()
Results
Individual plot with correct legend spacing:
Combined plot with legend spacing expanded so much that the legend no longer fits in the image:

Efficient way to map data to legend text color in ggplot2

I'm wondering if there's an efficient way to map data onto legend text color in ggplot2, just like we can do with axis text. Reproducible example follows.
First, let's make a plot:
library(ggplot2)
library(dplyr)
drv_counts <- mutate(mpg,
drv = case_when(drv == "r" ~ "rear wheel drive",
drv == "4" ~ "4 wheel drive",
drv == "f" ~ "front wheel drive"),
model_drv = interaction(model, drv)) %>%
group_by(model_drv) %>%
summarize(model = model[1], drv = drv[1], count = n()) %>%
arrange(drv, count) %>%
mutate(model = factor(model, levels = model))
p <- ggplot(drv_counts, aes(x=model, y=count, fill=drv)) +
geom_col() + coord_flip() + guides(fill = guide_legend(reverse=T)) +
theme_minimal()
p
Now let's color the axis labels by drive train. This is very easy:
# ggplot2 colors
cols <- c("4 wheel drive" = "#F8766D", "front wheel drive" = "#00BA38", "rear wheel drive" = "#619CFF")
p2 <- p + theme(axis.text.y = element_text(color = cols[drv_counts$drv]))
p2
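The expression cols[drv_counts$drv] is ordinary named-vector indexing: each data value is looked up by name, producing one color per row. Here is a tiny illustration with a made-up drv vector (the values below are illustrative, not the real drv_counts):

```r
cols <- c("4 wheel drive" = "#F8766D",
          "front wheel drive" = "#00BA38",
          "rear wheel drive" = "#619CFF")

# Hypothetical per-row drive types, in plotting order
drv <- c("rear wheel drive", "4 wheel drive", "4 wheel drive", "front wheel drive")

# One color per element of drv, matched by name
label_cols <- unname(cols[drv])
label_cols
```

This is why the vector handed to element_text(color = ...) has to follow the order of the axis labels rather than the order of the palette.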
Now let's try the same trick on the legend. It doesn't work:
p2 + theme(legend.text = element_text(color = cols))
The reason this doesn't work for legend text but does work for axis text is that all the axis labels are drawn in one grob, and hence we can give that grob a vector of colors, but the legend labels are drawn in separate grobs.
We can go in and color all the grobs manually, but that's super ugly and cumbersome:
g <- ggplotGrob(p2)
g$grobs[[15]]$grobs[[1]]$grobs[[9]]$children[[1]]$gp$col <- cols[g$grobs[[15]]$grobs[[1]]$grobs[[9]]$children[[1]]$label]
g$grobs[[15]]$grobs[[1]]$grobs[[10]]$children[[1]]$gp$col <- cols[g$grobs[[15]]$grobs[[1]]$grobs[[10]]$children[[1]]$label]
g$grobs[[15]]$grobs[[1]]$grobs[[11]]$children[[1]]$gp$col <- cols[g$grobs[[15]]$grobs[[1]]$grobs[[11]]$children[[1]]$label]
grid::grid.newpage()
grid::grid.draw(g)
My question is: Can somebody think of a way of getting this effect without having to dig down into the grob tree? I'm Ok with a patch to ggplot2 if it's only a few modified lines. Alternatively, can the digging down into the grob tree be automated so I don't have to access child grobs by manually setting list indices that will change the moment I make a minor change to the figure?
Update: A related question can be found here. To make my question distinct, let's add the requirement that colors aren't copied over from the symbols but rather can be set to any arbitrary values. This added requirement has real-world relevance because I usually use a darker color for text than for symbols.
Here's a pretty mediocre method of hacking grobs together to make a legend. I set up a palette based on the unique values of the drv variable (so it can be scaled to larger datasets or more colors). Then I mapped over the values of the palette to make each legend item: a rectGrob and a textGrob, both with the corresponding color from the palette. These could definitely be tweaked to look better. All of these get arranged into a new grob and stuck alongside the plot with cowplot. It isn't gorgeous but it might be a start.
library(tidyverse)
library(grid)
library(gridExtra)
pal <- colorspace::qualitative_hcl(n = length(unique(drv_counts$drv)), l = 60, c = 70) %>%
setNames(unique(drv_counts$drv))
p2 <- ggplot(drv_counts, aes(x=model, y=count, fill=drv)) +
geom_col() +
coord_flip() +
theme_minimal() +
scale_fill_manual(values = pal, guide = F) +
theme(axis.text.y = element_text(color = pal[drv_counts$drv]))
legend <- pal %>%
imap(function(col, grp) {
rect <- rectGrob(x = 0, width = unit(0.5, "line"), height = unit(0.5, "line"), gp = gpar(col = col, fill = col), hjust = 0)
label <- textGrob(label = grp, gp = gpar(col = colorspace::darken(col, 0.4), fontsize = 10), x = 0, hjust = 0)
cowplot::plot_grid(rect, label, nrow = 1, rel_widths = c(0.12, 1))
}) %>%
arrangeGrob(grobs = rev(.), padding = unit(0.1, "line"), heights = rep(unit(1.1, "line"), 3))
cowplot::plot_grid(p2, legend, rel_widths = c(1, 0.45))
Created on 2018-05-26 by the reprex package (v0.2.0).

Add separate legend to PDF from lapply

I've created a multi-page PDF with plots generated from a few hundred unique identifiers. Basically, I would like to add a separate legend panel once per page.
The PDF is basically constructed as detailed here and here
There are dozens of walk-throughs on how to add a separate legend for a few graphical objects using grid.arrange, the most promising are here and here.
Basically, the steps are:
create the database
use lapply to make a list of the graphical objects and
create a PDF that chunks up the list of graphical objects.
I suspect the process falls apart at step 3 - adding the legend to the list of grobs.
To reproduce the problem
color.names <- setNames(c("A", "B", "C", "D", "F"), c("green3", "chocolate1", "darkgoldenrod1", "firebrick1"))
group.colors <- c(A = "#333BFF", B = "#CC6600", C ="#9633FF", D = "#E2FF33", F = "#E3DB71")
SOexample <- data.frame(
studentid = runif(100,min=500000, max=999999),
grade = runif(100, min=20, max=100),
lettergrade = sample(c("A", "B","C","D","F"),size=100,replace=TRUE),
firstname = sample(c("Alan", "Billy","Charles","Donna","Felicia"),size=100,replace=TRUE)
)
To generate the legend
df <- SOexample
gl <- ggplot(df, aes(x=" ", y=as.numeric(grade), ymin=50, ymax=100))+ geom_boxplot()+ guides(fill=FALSE) + geom_point(aes(colour=lettergrade)) + labs( x=" ", y=" ") + ggtitle(sprintf("%s", df$firstname), aes(cex=.05)) + scale_colour_manual(name="Number", values=group.colors) + scale_fill_manual(name="", values="red") + theme_grey() + theme(legend.position="none", plot.title = element_text(size = 8, face = "bold"), plot.subtitle=element_blank()) + theme(axis.title.x=element_blank(), axis.text.x=element_blank(), axis.ticks.x=element_blank())
The function to grab the legend using cowplot
install.packages("cowplot")
library(cowplot)
leg <- get_legend(gl + theme(legend.position="right"))
To make all the graphical objects
plist = lapply(split(SOexample, factor(SOexample$studentid)), function(df) { ggplot(df, aes(x=" ", y=as.numeric(grade), ymin=50, ymax=100))+ geom_boxplot()+ guides(fill=FALSE) + geom_point(aes(colour=lettergrade)) + labs( x=" ", y=" ") + ggtitle(sprintf("%s", df$firstname), aes(cex=.05)) + scale_colour_manual(name="Number", values=group.colors) + scale_fill_manual(name="", values="red") + theme_grey() +theme(legend.position="none", plot.title = element_text(size = 8, face = "bold"), plot.subtitle=element_blank()) + theme(axis.title.x=element_blank(),axis.text.x=element_blank(), axis.ticks.x=element_blank())})
Making the PDF
pdf("allpeople.pdf", pointsize=8)
for (i in seq(1, length(plist), 11)) {
grid.arrange(grobs=plist[i:(i+11)],
ncol=4, left="Magic Numbers", bottom=" ")
}
dev.off()
I suspect the process is falling apart in the create PDF stage. Ideally, I would add the legend as a graphical object in / at the grid.arrange step, e.g.,
grobs[12]<- leg
But no luck, and also the last item in the plist() process seems to have not been fully converted to a graphical object.
Using this auto-generating method, i.e., cannot list out graphical objects individually, how does one add the legend to each page of the PDF?
There are various options (ggsave('file.pdf',marrangeGrob(plist,ncol=4,nrow=3)), for instance), but I'd probably do something like this for finer control:
pl <- split(plist, gl(10,10))
pdf("allpeople.pdf", pointsize=8)
for (i in seq_along(pl)) {
grid.arrange(grobs=c(pl[[i]], list(leg)),
ncol=4,
left="Magic Numbers",
bottom=" ")
}
dev.off()
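Here gl(10, 10) builds a factor with 10 levels, each repeated 10 times, so split() cuts the 100 plots into 10 pages of 10. Below is a sketch of the same chunking for an arbitrary number of plots, using a placeholder list (items and per_page are hypothetical names):

```r
items <- as.list(seq_len(23))   # placeholder for a list of 23 plots
per_page <- 10

# Page number for each item: ten 1s, ten 2s, then three 3s
page_id <- ceiling(seq_along(items) / per_page)
pl <- split(items, page_id)

lengths(pl)
```

With per_page = 10 and exactly 100 items, page_id matches the integer codes of gl(10, 10), so the loop above should work unchanged on pl.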

Put legend under each facet using facet_grid; adding one title and one caption to plot

I'm working with a plot analogous to the following:
ggplot(data=mtcars, aes(x=wt, y=mpg, color=carb)) +
geom_line() + facet_grid(gear ~ .) +
ggtitle(expression("Title")) +
labs(caption = "Sources: Compustat, Author's Calculations") +
theme(plot.title = element_text(size = 20, hjust = 0.5),
plot.caption=element_text(size=8, hjust=.5),
strip.background = element_blank(),
strip.text = element_blank(),
legend.title = element_blank())
I'm trying to do the following:
Insert a legend beneath each of the 3 facets, each legend specific to the facet above it.
Insert one plot title (as opposed to the same title above each facet).
Insert one caption beneath the final facet (as opposed to three captions beneath each facet).
I was able to reproduce this example on assigning a legend to each facet.
However, the plot title was placed above and the caption below each facet. Also, this example uses facet_wrap and not facet_grid.
Thank you in advance.
library(dplyr)
library(ggplot2)
tempgg <- mtcars %>%
group_by(gear) %>%
do(gg = {ggplot(data=., aes(x=wt, y=mpg, color=carb)) +
geom_point() +
labs(x = NULL) +
guides(color = guide_colorbar(title.position = "left")) +
theme(plot.title = element_text(size = 20, hjust = 0.5),
plot.caption=element_text(size=8, hjust=.5),
legend.position = "bottom")})
tempgg$gg[1][[1]] <- tempgg$gg[1][[1]] + labs(title = "Top title")
tempgg$gg[3][[1]] <- tempgg$gg[3][[1]] + labs(x = "Axis label", caption = "Bottom caption")
tempgg %>% gridExtra::grid.arrange(grobs = .$gg)
This isn't the most elegant way to do it. Each of the three grobs gets an equal space when you grid.arrange them, so the first and last ones are squished from the title and caption taking up space. You could add something like heights = c(3,2,3) inside the grid.arrange call, but you'd have to fiddle with each of the heights to get it to look right, and even then it would be a visual approximation, not exact.
To do it the more precise way, you'd need to look at the underlying gtables in each of the grobs. https://stackoverflow.com/users/471093/baptiste is the expert on that.
Update:
I used a @baptiste solution, which is still not particularly elegant, but gives you the same plot space for each panel. Use this snippet in place of the last line above.
tempggt <- tempgg %>% do(ggt = ggplot_gtable(ggplot_build(.$gg))) %>% .$ggt
gg1 <- tempggt[[1]]
gg2 <- tempggt[[2]]
gg3 <- tempggt[[3]]
gridExtra::grid.arrange(gridExtra::rbind.gtable(gg1, gg2, gg3))

Resources