Add separate legend to PDF from lapply - r

I've created a multi-page PDF with plots generated from a few hundred unique identifiers. I would like to add a separate legend panel once per page.
The PDF is constructed as detailed here and here.
There are dozens of walk-throughs on how to add a separate legend for a few graphical objects using grid.arrange; the most promising are here and here.
The steps are:
1. create the database,
2. use lapply to make a list of the graphical objects, and
3. create a PDF that chunks up the list of graphical objects.
I suspect the process falls apart at step 3, when adding the legend to the list of grobs.
To reproduce the problem
color.names <- setNames(c("A", "B", "C", "D", "F"), c("green3", "chocolate1", "darkgoldenrod1", "firebrick1"))
group.colors <- c(A = "#333BFF", B = "#CC6600", C ="#9633FF", D = "#E2FF33", F = "#E3DB71")
SOexample <- data.frame(
studentid = runif(100,min=500000, max=999999),
grade = runif(100, min=20, max=100),
lettergrade = sample(c("A", "B","C","D","F"),size=100,replace=TRUE),
firstname = sample(c("Alan", "Billy","Charles","Donna","Felicia"),size=100,replace=TRUE)
)
To generate the legend
df <- SOexample
gl <- ggplot(df, aes(x=" ", y=as.numeric(grade), ymin=50, ymax=100))+ geom_boxplot()+ guides(fill=FALSE) + geom_point(aes(colour=lettergrade)) + labs( x=" ", y=" ") + ggtitle(sprintf("%s", df$firstname), aes(cex=.05)) + scale_colour_manual(name="Number", values=group.colors) + scale_fill_manual(name="", values="red") + theme_grey() + theme(legend.position="none", plot.title = element_text(size = 8, face = "bold"), plot.subtitle=element_blank()) + theme(axis.title.x=element_blank(), axis.text.x=element_blank(), axis.ticks.x=element_blank())
The function to grab the legend using cowplot
install.packages("cowplot")
library(cowplot)
leg <- get_legend(gl + theme(legend.position="right"))
To make all the graphical objects
plist = lapply(split(SOexample, factor(SOexample$studentid)), function(df) { ggplot(df, aes(x=" ", y=as.numeric(grade), ymin=50, ymax=100))+ geom_boxplot()+ guides(fill=FALSE) + geom_point(aes(colour=lettergrade)) + labs( x=" ", y=" ") + ggtitle(sprintf("%s", df$firstname), aes(cex=.05)) + scale_colour_manual(name="Number", values=group.colors) + scale_fill_manual(name="", values="red") + theme_grey() +theme(legend.position="none", plot.title = element_text(size = 8, face = "bold"), plot.subtitle=element_blank()) + theme(axis.title.x=element_blank(),axis.text.x=element_blank(), axis.ticks.x=element_blank())})
Making the PDF
pdf("allpeople.pdf", pointsize=8)
for (i in seq(1, length(plist), 11)) {
grid.arrange(grobs=plist[i:(i+11)],
ncol=4, left="Magic Numbers", bottom=" ")
}
dev.off()
I suspect the process is falling apart in the create PDF stage. Ideally, I would add the legend as a graphical object in / at the grid.arrange step, e.g.,
grobs[12]<- leg
But no luck, and also the last item in the plist() process seems to have not been fully converted to a graphical object.
With this auto-generating method (i.e., the graphical objects cannot be listed out individually), how does one add the legend to each page of the PDF?

There are various options (ggsave('file.pdf',marrangeGrob(plist,ncol=4,nrow=3)), for instance), but I'd probably do something like this for finer control:
pl <- split(plist, gl(10,10))
pdf("allpeople.pdf", pointsize=8)
for (i in seq_along(pl)) {
grid.arrange(grobs=c(pl[[i]], list(leg)),
ncol=4,
left="Magic Numbers",
bottom=" ")
}
dev.off()
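The split above assumes exactly 100 plots. For a list of arbitrary length, the pages can be built by position instead. This is a minimal sketch in base R; chunk_pages is a hypothetical helper name, and the integers stand in for the ggplot objects in plist:

```r
# Hypothetical helper: split a list into consecutive pages
# holding at most `per_page` items each.
chunk_pages <- function(items, per_page) {
  split(items, ceiling(seq_along(items) / per_page))
}

# Stand-in for plist: 25 items, 11 per page
pages <- chunk_pages(as.list(1:25), per_page = 11)
page_sizes <- unname(lengths(pages))
print(page_sizes)
```

With per_page = 11 and ncol = 4, each page keeps the twelfth cell free for the legend, e.g. grid.arrange(grobs = c(pages[[i]], list(leg)), ncol = 4).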

Related

ggplot - control the number of graph per a page [duplicate]

I have the facet_wrap function to make multiple graphs (n=~51) but they all appear on one page. Now after searching, I found out that ggplot2 can't place graphs on multiple pages.
Is there a way to do this? I looked at this question (Multiple graphs over multiple pages using ggplot) and tried out the code, with little success.
Here is my code for my graphs, it produces ~51 graphs on one page, making them very small and hard to see, if I could print this to 1 graph per page in a pdf, that would be great:
ggplot(indbill, aes(x = prey, y = weight), tab) +
geom_polygon(aes(group = load, color = capture), fill = NA, size = 0.75) +
facet_wrap(~ individual) +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_text(size=rel(0.5)),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()) +
xlab("") + ylab("") +
guides(color = guide_legend(ncol=2)) +
coord_radar()
If someone could write up a little code and explain it to me, that would be great.
There are multiple ways to do the pagination: ggforce or gridExtra::marrangeGrob. See also this answer for another example.
ggforce:
library(ggplot2)
# install.packages("ggforce")
library(ggforce)
# Standard facetting: too many small plots
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_wrap(~cut:clarity, ncol = 3)
# Pagination: page 1
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_wrap_paginate(~cut:clarity, ncol = 3, nrow = 3, page = 1)
# Pagination: page 2
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_wrap_paginate(~cut:clarity, ncol = 3, nrow = 3, page = 2)
# Works with grid as well
ggplot(diamonds) +
geom_point(aes(carat, price), alpha = 0.1) +
facet_grid_paginate(color~cut:clarity, ncol = 3, nrow = 3, page = 4)
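To write every page to one PDF instead of requesting pages one at a time, the paginated plot can be rebuilt in a loop. This is a minimal sketch, assuming ggforce is installed; n_pages() is the ggforce function that reports how many pages a paginated facet needs, and the output file name is only a placeholder:

```r
library(ggplot2)
library(ggforce)

# Base plot without facetting; the paginated facet is added per page below
base <- ggplot(diamonds) +
  geom_point(aes(carat, price), alpha = 0.1)

# Build one page to find out how many pages the layout needs
n <- n_pages(base + facet_wrap_paginate(~cut:clarity, ncol = 3, nrow = 3, page = 1))

out_file <- "diamonds_pages.pdf"  # placeholder file name
pdf(out_file, width = 7, height = 7)
for (i in seq_len(n)) {
  # ggplot objects must be printed explicitly inside a loop
  print(base + facet_wrap_paginate(~cut:clarity, ncol = 3, nrow = 3, page = i))
}
dev.off()
```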
gridExtra:
# install.packages("gridExtra")
library(gridExtra)
set.seed(123)
pl <- lapply(1:11, function(.x)
qplot(1:10, rnorm(10), main=paste("plot", .x)))
ml <- marrangeGrob(pl, nrow=2, ncol=2)
## non-interactive use, multipage pdf
## ggsave("multipage.pdf", ml)
## interactive use; calling `dev.new` multiple times
ml
Created on 2018-08-09 by the reprex package (v0.2.0.9000).
One option is to just plot, say, six levels of individual at a time using the same code you're using now. You'll just need to iterate it several times, once for each subset of your data. You haven't provided sample data, so here's an example using the Baseball data frame:
library(ggplot2)
library(vcd) # For the Baseball data
data(Baseball)
pdf("baseball.pdf", 7, 5)
for (i in seq(1, length(unique(Baseball$team87)), 6)) {
print(ggplot(Baseball[Baseball$team87 %in% levels(Baseball$team87)[i:(i+5)], ],
aes(hits86, sal87)) +
geom_point() +
facet_wrap(~ team87) +
scale_y_continuous(limits=c(0, max(Baseball$sal87, na.rm=TRUE))) +
scale_x_continuous(limits=c(0, max(Baseball$hits86))) +
theme_bw())
}
dev.off()
The code above will produce a PDF file with four pages of plots, each with six facets to a page. You can also create four separate PDF files, one for each group of six facets:
for (i in seq(1, length(unique(Baseball$team87)), 6)) {
pdf(paste0("baseball_",i,".pdf"), 7, 5)
...ggplot code...
dev.off()
}
Another option, if you need more flexibility, is to create a separate plot for each level (that is, each unique value) of the facetting variable and save all of the individual plots in a list. Then you can lay out any number of the plots on each page. That's probably overkill here, but here's an example where the flexibility comes in handy.
First, let's create all of the plots. We'll use team87 as our facetting column. So we want to make one plot for each level of team87. We'll do this by splitting the data by team87 and making a separate plot for each subset of the data.
In the code below, split splits the data into separate data frames for each level of team87. The lapply wrapper sequentially feeds each data subset into ggplot to create a plot for each team. We save the output in plist, a list of (in this case) 24 plots.
plist = lapply(split(Baseball, Baseball$team87), function(d) {
ggplot(d, aes(hits86, sal87)) +
geom_point() +
facet_wrap(~ team87) +
scale_y_continuous(limits=c(0, max(Baseball$sal87, na.rm=TRUE))) +
scale_x_continuous(limits=c(0, max(Baseball$hits86))) +
theme_bw() +
theme(plot.margin=unit(rep(0.4,4),"lines"),
axis.title=element_blank())
})
Now we'll lay out six plots at a time in a PDF file. Below are two options, one with four separate PDF files, each with six plots, the other with a single four-page PDF file. I've also pasted in one of the plots at the bottom. We use grid.arrange to lay out the plots, including using the left and bottom arguments to add axis titles.
library(gridExtra)
# Four separate single-page PDF files, each with six plots
for (i in seq(1, length(plist), 6)) {
pdf(paste0("baseball_",i,".pdf"), 7, 5)
grid.arrange(grobs=plist[i:(i+5)],
ncol=3, left="Salary 1987", bottom="Hits 1986")
dev.off()
}
# Four pages of plots in one PDF file
pdf("baseball.pdf", 7, 5)
for (i in seq(1, length(plist), 6)) {
grid.arrange(grobs=plist[i:(i+5)],
ncol=3, left="Salary 1987", bottom="Hits 1986")
}
dev.off()
Something like:
by(indbill, indbill$individual, function (x){
ggplot(x, aes(x = prey, y = weight), tab) +
geom_polygon(aes(group = load, color = capture), fill = NA, size = 0.75) +
theme(axis.ticks.x = element_blank(),
axis.text.x = element_text(size=rel(0.5)),
axis.ticks.y = element_blank(),
axis.text.y = element_blank()) +
xlab("") + ylab("") +
guides(color = guide_legend(ncol=2)) +
coord_radar()
})

Hide legend elements in ggplot2

I am trying to plot the parameter estimates and levels of hierarchy from a stan model output. For the legend, I am hoping to remove all labels except for the "Overall Effects" label but I can't figure out how to remove all of the species successfully.
Here is the code:
ggplot(dfwide, aes(x=Estimate, y=var, color=factor(sp), size=factor(rndm),
alpha=factor(rndm))) +
geom_point(position =pd) +
geom_errorbarh(aes(xmin=(`2.5%`), xmax=(`95%`)), position=pd,
size=.5, height = 0, width=0) +
geom_vline(xintercept=0) +
scale_colour_manual(values=c("blue", "red", "orangered1","orangered3", "sienna4",
"sienna2", "green4", "green3", "purple2", "magenta2"),
labels=c("Overall Effects", expression(italic("A. pensylvanicum"),
italic("A. rubrum"), italic("A. saccharum"),
italic("B. alleghaniensis"), italic("B. papyrifera"),
italic("F. grandifolia"), italic("I. mucronata"),
italic("P. grandidentata"), italic("Q. rubra")))) +
scale_size_manual(values=c(3, 1, 1, 1, 1, 1, 1, 1, 1, 1)) +
scale_shape_manual(labels="", values=c("1"=16,"2"=16)) +
scale_alpha_manual(values=c(1, 0.4)) + guides(size=FALSE, alpha=FALSE) +
ggtitle(label = "A.") +
scale_y_discrete(limits = rev(unique(sort(dfwide$var))), labels=estimates) +
ylab("") +
labs(col="Effects") + theme(legend.title=element_blank())
The key point is that removing only some of the labels from a legend can't be done with ggplot2 functions alone. You need to work with grid, the lower-level graphics system on which both lattice and ggplot2 are built.
Three functions are needed: grid.force(), grid.ls() and grid.remove(). After drawing the plot with ggplot2, calling grid.force() and then grid.ls() lists every element in the picture (points, lines, text, etc.). Next, find the elements you are interested in. This step is interactive, because the element names that ggplot2 generates are made up of numbers and text and are not always meaningful. Once you have identified the names, use grid.remove() to remove those elements. Below is the sample code I made.
library(grid)
library(ggplot2)
set.seed(1)
data <- data.frame(x = rep(1:10, 2), y = sample(1:100, 20),
type = sample(c("A", "B"), 20, replace = TRUE))
ggplot(data, aes(x = x, y =y,color = type))+
geom_point()+
geom_line()+
scale_color_manual(values = c("blue", "darkred"))+
theme_bw()
At this point the whole picture has been drawn; next we remove some of its elements.
grid.force()
grid.ls()
grid.ls() lists all the element names (the exact names depend on the plot); the ones to drop are then passed to grid.remove():
grid.remove("key-4-1-1.5-2-5-2")
grid.remove("key-4-1-2.5-2-5-2")
grid.remove("label-4-3.5-4-5-4")
It's not perfect, but my solution would be to actually make two plots and combine them together. See this post where I lifted the extraction code from.
I don't have your data, but I think you will get the idea below:
library(ggplot2)
library(gridExtra)
library(grid)
# g_legend credit goes to https://stackoverflow.com/a/11886071/2060081
g_legend<-function(a.gplot){
tmp <- ggplot_gtable(ggplot_build(a.gplot))
leg <- which(sapply(tmp$grobs, function(x) x$name) == "guide-box")
legend <- tmp$grobs[[leg]]
return(legend)}
p_legend = ggplot(dfwide[dfwide$sp=='Overall Effects', ], aes(x=Estimate, y=var, color=factor(sp),
size=factor(rndm),
alpha=factor(rndm))) +
geom_point(position =pd) +
geom_errorbarh(aes(xmin=(`2.5%`), xmax=(`95%`)), position=pd,
size=.5, height = 0, width=0) +
geom_vline(xintercept=0) +
scale_colour_manual(values=c("blue"),
labels=c("Overall Effects")) +
scale_size_manual(values=c(3)) +
scale_shape_manual(labels="", values=c("1"=16,"2"=16)) +
scale_alpha_manual(values=c(1, 0.4)) + guides(size=FALSE, alpha=FALSE) +
ggtitle(label = "A.") +
scale_y_discrete(limits = rev(unique(sort(dfwide$var))), labels=estimates) +
ylab("") +
labs(col="Effects") + theme(legend.title=element_blank())
p_legend = g_legend(p_legend)
One of your plots will just be the legend. Subset your data based on the Overall Effects and then plot the two plots together as a grid.

Indexes within ggplot in R

I have 8 plots generated with the following code, and I want them on the same page with no titles. More specifically, I want each plot to have a letter (a, b, ...) in its upper-left corner, and a one-row legend below the plots (e.g. Plots: a. category one, b. category two, ...).
Code:
g1= ggplot(som, aes(x=value, y=variable))+geom_smooth(method=lm,alpha=0.25,col='green',lwd=0.1) +ylim(0,1000)+xlim(-2,2)+
geom_point(shape=23,fill="black",size=0.2)+theme_bw()+theme(plot.background = element_blank(),panel.grid.major = element_blank()
,panel.grid.minor = element_blank()) +labs(x="something here",y="something else")+
theme(axis.title.x = element_text(face="bold", size=7),axis.text.x = element_text(size=5))+
theme(axis.title.y = element_text(face="bold", size=7),axis.text.y = element_text(size=5))+
theme(plot.title = element_text(lineheight=.8, face="bold",size=8))
grid.arrange(g1,g2,g3,g4,g5,g6,g7,g8,ncol=2)
Is it possible to do that with ggplot? If so, how can I do this?
p.s I have no problem with the above code
Thank you.
This is how you could do it with library(cowplot).
First some plots:
set.seed(1)
plots <- list()
for (i in 1:8) {
my_cars <- mtcars[sample(1:nrow(mtcars), 10), ]
plots[[i]] <- ggplot(my_cars, aes(mpg, hp, color = as.factor(cyl))) +
geom_point() +
geom_smooth(method = "lm", color = "black")
}
Then to have a unifying title (or legend here) we use a combination of two plot_grid() calls.
lbls <- LETTERS[1:length(plots)]
# add a line break because its long
lbls <- gsub("E", "\nE", lbls)
grid <- plot_grid(plotlist = plots, labels = lbls, ncol = 2)
legend <- ggdraw() +
draw_label(paste0(lbls, "= category",1:length(plots), collapse = " "))
plot_grid(grid, legend, rel_heights = c(1, .1), ncol = 1)
The documentation for cowplot is great and has a ton of examples. Check it out here and here. Let me know if you get stuck.

How can I automatically plot graphs in R with ggplot and save them to a folder?

I'm trying to create a graph (using quickplot) for each column of a data set and save it to a folder as a pdf -any advice would be much appreciated!
So far I've made a test data frame (before I try it with 500+ columns)
test.data <-cbind.data.frame(data$col_1,data$col_2,data$col_3)
Then I've tried to write a function to plot and save the graphs. I'm trying to make the graphs bar charts (with some title & color specifications) which show the count of the no. people in each category. So the columns typically consist of categorical data.
plot.graphs <- function(x) {
for(i in colSums(x)){
plots <- quickplot(i) +
geom_bar(color= "#6267c1", fill="#6267c1") +
labs(title= "i",
x="i",
y="Count") +
theme(help()
plot.title = element_text(colour = "#453694"),
axis.title = element_text(colour ="#453694"))
ggsave(plots,filename = "testplot",nm[1],".pdf",sep="")
print(plots)
}
}
plot.graphs(test.data)
However, this seems to come up with lots of errors so I don't think I'm doing it right.
Try wrapping your plot-code with a pdf() graphical device and dev.off(). pdf() will open a pdf graphical device, and store all graphics you generate in a file, until you close the graphical device with dev.off().
I can't test your code because I don't have the dataset, but try this:
pdf(file = 'test.pdf', onefile = TRUE, paper = 'special', height = 11, width = 8.5)
# Loop over the column names (colSums() would fail on categorical columns)
for(i in names(x)){
plots <- quickplot(x[[i]]) +
geom_bar(color= "#6267c1", fill="#6267c1") +
labs(title= i,
x=i,
y="Count") +
theme(
plot.title = element_text(colour = "#453694"),
axis.title = element_text(colour ="#453694"))
# ggplot objects are not drawn automatically inside a loop
print(plots)
}
dev.off()
Also see: https://stat.ethz.ch/R-manual/R-devel/library/grDevices/html/pdf.html
Just in case this is helpful for anyone, the following script ended up working for me:
plot.auto <- function(data, list = as.list(colnames(data))){
df <- data
ln <- length(names(data))
for(i in 1:ln){
plot <- quickplot(na.omit(df[,i],main=names(df)[i])) +
geom_bar() +
labs(title= colnames((df[,i])),
x=colnames((df)[i]),
y="y axis title") +
# this makes the titles and text particular colours
theme(
plot.title = element_text(colour = "#455876"),
axis.title = element_text(colour ="#455467"),
# this puts the labels on an angle so they don't overlap
axis.text.x = element_text(angle = 30, hjust = 1))+
# this makes the title of the plot the same as the column name
ggtitle(colnames((df)[i])) +
geom_text(stat='count',aes(label=..count..),vjust=-0.3)
print(length(unique(df[,i])))
# this deletes any characters in the column names which will make saving difficult
save_name1<- gsub("/","or",as.character(colnames(df)[i]))
save_name<- gsub("\\?","",save_name1)
#this tells you each title of the graph as the function runs
print(save_name)
# this saves each graph as a pdf in a folder which must be in your Working Directory (e.g. Auto_Plot_Folder), with the file name the same as the column name
ggsave(plot,filename =
paste("Auto_Plot_folder/",save_name,".pdf",sep =""),device ="pdf")
}
}

How to put more information on my R Pie Chart ?

I have the following simple R code.
vect <- c(a = 4, b = 7, c = 5)
pie(vect, labels = c("A", "B", "C"), col = c("#999999", "#6F6F6F", "#000000"))
which prints the following pie chart:
How do I modify the above code to print more detail on my pie chart, so that it looks like this other one?
Thanks for any reply !
My comment above, in detail:
Create a dummy data frame for the example:
vect <- data.frame(v=c(rep('a', 25), rep('b', 30), rep('c', 45)))
Call ggplot for a pie-chart (coord_polar) based on the manual:
p <- ggplot(vect, aes(x=factor(1), fill = factor(v))) + geom_bar(width = 1) + coord_polar(theta="y")
Tweak it:
# opts() has been removed from ggplot2; ggtitle() sets the title instead
p + ggtitle("Decision tree") + xlab('') + ylab('') +
theme_bw() + scale_fill_grey(name = "Rankings")
Resulting in:
I would recommend looking into a few of the options available in base graphics. For the plot title, see the main argument of pie(); to add a key for the colour palette, see the legend() function.
You may also refer to a simple plot tutorial here: http://www.harding.edu/fmccown/r/#piecharts
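A minimal base-graphics sketch of that suggestion follows. The target chart is not reproduced here, so the percentage labels are an assumption about what "more detail" means, and the title and legend heading are borrowed from the ggplot2 answer above:

```r
vect <- c(a = 4, b = 7, c = 5)
cols <- c("#999999", "#6F6F6F", "#000000")

# Assumed extra detail: each slice's share of the total, rounded to whole percent
pct <- round(100 * vect / sum(vect))
lbls <- paste0(c("A", "B", "C"), " (", pct, "%)")

# main sets the title; legend() adds a colour key
pie(vect, labels = lbls, col = cols, main = "Decision tree")
legend("topright", legend = c("A", "B", "C"), fill = cols, title = "Rankings")
```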
