Saving ggplots to a list in a for loop - r

I produce nine ggplots within a for loop and subsequently arrange those plots using grid.arrange:
plot_list <- list()
for(i in c(3:ncol(bilanz.vol))) {
histogram <- ggplot(data = bilanz.vol, aes(x = bilanz.vol[,i])) +
geom_histogram() +
scale_x_log10() +
ggtitle(paste(varnames[i]))
# ggsave(filename = paste("Graphs/", vars[i], ".png", sep = ""), width = 16, height = 12, units = "cm")
plot_list <- c(plot_list, list(histogram))
}
library(gridExtra)
png(filename = "Graphs/non-mfi.png", width = 1280, height = 960, units = "px")
do.call(grid.arrange, c(plot_list, list(ncol = 3)))
dev.off()
The code itself works fine and there are no errors. But for some reason I do not understand, the grid shows the same (last) histogram nine times. Still, each plot shows the correct title.
Interestingly, when I uncomment the ggsave line in the code above, each plot is saved correctly (separately) and shows the expected histogram.
Any ideas?

The reason is that ggplot does not evaluate the expression in the aes call when the plot object is created; it evaluates it later, when the plot is drawn. Until then it just sets up the plot and stores the data inside it. In your case "the data" is the entire data frame bilanz.vol, and since i = ncol(bilanz.vol) after the for loop completes, the expression bilanz.vol[,i] evaluates to the same column for all plot objects. The titles are correct because ggtitle evaluates paste(varnames[i]) immediately, inside the loop. This also explains why ggsave inside the loop works: each plot is drawn while i still has the right value.
To make it work you could do this inside the loop, which makes sure each plot object contains its own data set my.data:
my.data <- data.frame(x = bilanz.vol[,i])
histogram <- ggplot(data = my.data, aes(x = x)) +
  geom_histogram() +
  scale_x_log10() +
  ggtitle(paste(varnames[i]))
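As an alternative sketch (not from the original answer), the plots can be built with lapply and the .data pronoun, so that each plot refers to its column by name rather than through a loop index that changes later. The data frame df and its column names below are made up for illustration, and this assumes ggplot2 3.0.0 or later.

```r
library(ggplot2)

# Hypothetical stand-in for bilanz.vol: two id columns, then numeric columns
set.seed(42)
df <- data.frame(
  id    = 1:50,
  group = "a",
  small = runif(50, 1, 10),
  large = runif(50, 10, 100)
)

# Columns to plot (column 3 onwards, as in the question)
cols <- names(df)[3:ncol(df)]

# Each call of the anonymous function has its own `col`, so the lazily
# evaluated aes() expression still sees the right column at draw time
plot_list <- lapply(cols, function(col) {
  ggplot(df, aes(x = .data[[col]])) +
    geom_histogram(bins = 10) +
    scale_x_log10() +
    ggtitle(col)
})
```

The resulting list can then be passed to grid.arrange(grobs = plot_list, ncol = 3) in place of the do.call construction.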

Related

(R) Graphs in a loop and function are empty, even though they work when plotted standalone

I have a boxplot where I annotate the number of data points per plot and the significance levels as letters above the plots. When plotted in a normal, top-level workflow, they take about 1-2 seconds to appear in an X Window System Graphics (X11) window, and the plot gets saved afterwards. When the plot command is wrapped in a for loop or called from a function, the X11 window stays empty and gets saved like that.
Here is a minimal example using mtcars, showcasing the same problem. Without context this example does not make sense.
library(ggplot2)
setwd("C:/")
output <- "C:/"
data <- mtcars
data$cyl <- as.factor(data$cyl)
#----normal plotting----
x11()
ggplot(data, aes(x = cyl, y = mpg))+
stat_boxplot(geom = "errorbar")+
geom_boxplot()
savePlot(paste0(output, "example_normal", ".tiff"), type = "tiff")
dev.off()
#----plotting through a function----
my.plot <- function(x)
{
  x11()
  ggplot(x, aes(x = cyl, y = mpg))+
    stat_boxplot(geom = "errorbar")+
    geom_boxplot()
  savePlot(paste0(output, "example_function", ".tiff"), type = "tiff")
  dev.off()
}
my.plot(data)
Cheers
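I had to wrap the call in print(ggplot(...)) to make it work in a for loop. At the top level R prints the plot automatically, but inside a for loop, or in a function where the ggplot call is not the returned value, the plot is only drawn when it is printed explicitly.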
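A minimal sketch of that fix, assuming the goal is just a saved image file: build the plot, then either print it explicitly or hand it to ggsave, which does not depend on auto-printing. The function name save_boxplot and the output path are made up for illustration, and PNG is used instead of TIFF to keep the example portable.

```r
library(ggplot2)

save_boxplot <- function(df, file) {
  p <- ggplot(df, aes(x = cyl, y = mpg)) +
    stat_boxplot(geom = "errorbar") +
    geom_boxplot()
  # ggsave() renders the plot object itself, so no print() call and no
  # open X11 window are needed
  ggsave(file, plot = p, width = 12, height = 9, units = "cm")
  invisible(p)
}

data <- mtcars
data$cyl <- as.factor(data$cyl)

out_file <- file.path(tempdir(), "example_function.png")
save_boxplot(data, out_file)
```

If the X11 workflow from the question is preferred, storing the plot in p and calling print(p) before savePlot() should have the same effect.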

ggsave cuts off part of the common legend created with ggarrange

I am trying to generate multiple plots from my data by using lapply and then arranging the resulting list with ggarrange. When I try to save the final figure with ggsave part of the legend text is cut off in the png.
First I define what I want to plot along with Plot titles and colors
main.overview <- list(
  c("AA", "AA", "black"),
  c("X5.HETE", "5-HETE", "red"))
I then define a function to generate the plots.
plot.overview = function(data, mediator) {
analyte <- mediator[[1]]
name <- mediator[[2]]
color <- mediator[[3]]
ggplot(data = data, aes_string(x="Compound",y=analyte)) +
geom_boxplot(aes(fill=Compound)) +
labs(title=name) +
scale_fill_brewer(palette="Reds") +
theme_classic() +
theme(plot.title = element_text(hjust = 0.5, color = color),axis.title.x = element_blank(),axis.title.y = element_blank())}
Finally I call the function and arrange the plots into a figure
myplots <- lapply(main.overview, plot.overview, data=lm)
arrange <- ggarrange(plotlist = myplots, common.legend = TRUE, nrow=1, legend = "right")
figure <- annotate_figure(arrange, left = text_grob(expression(10^6~cells), rot=90))
ggsave("overview.png", dpi="print", device="png",plot=figure, height=10, width=30, units="cm")
In the final png, however, the common legend I put on the right is cut off.
EDIT:
I have figured out part of the problem: it only occurs on my desktop PC and not on my laptop, so it might be caused by additional packages or by different versions of the R libraries.

arbitrary number of plots for grid.arrange

I'm trying to plot an arbitrary number of bar plots with rmarkdown, arranged in 2 columns. In my example there will be 20 plots in total, so I was hoping to get 10 plots in each column; however, I can't seem to get this to work with grid.arrange.
plot.categoric = function(df, feature){
df = data.frame(x=df[,feature])
plot.feature = ggplot(df, aes(x=x, fill = x)) +
geom_bar() +
geom_text(aes(label=scales::percent(..count../1460)), stat='count', vjust=-.4) +
labs(x=feature, fill=feature) +
ggtitle(paste0(length(df$x))) +
theme_minimal()
return(plot.feature)
}
plist = list()
for (i in 1:20){
plist = c(plist, list(plot.categoric(train, cat_features[i])))
}
args.list = c(plist, list(ncol=2))
do.call("grid.arrange", args.list)
When I knit this to html I'm getting the following output:
I was hoping I would get something along the lines of:
but even with this the figure sizes are still funky, I've tried playing with heights and widths but still no luck. Apologies if this is a long question
If you have all the ggplot objects in a list then you can easily build the two column graphic via gridExtra::grid.arrange. Here is a simple example that will put eight graphics into a 4x2 matrix.
library(ggplot2)
library(gridExtra)
# Build a set of plots
plots <-
lapply(unique(diamonds$clarity),
function(cl) {
ggplot(subset(diamonds, clarity %in% cl)) +
aes(x = carat, y = price, color = color) +
geom_point()
})
length(plots)
# [1] 8
grid.arrange(grobs = plots, ncol = 2)
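For the figure-size part of the question, one possible approach (an assumption, not part of the answer above) is to derive the number of rows from the number of plots and scale the output height accordingly. The sketch below uses mtcars columns as hypothetical categorical features and an arbitrary 3 inches per row.

```r
library(ggplot2)
library(gridExtra)

# Hypothetical categorical features, one bar plot each
features <- c("cyl", "gear", "carb", "am", "vs")
plots <- lapply(features, function(f) {
  ggplot(mtcars, aes(x = factor(.data[[f]]))) +
    geom_bar() +
    labs(x = f) +
    theme_minimal()
})

n_col <- 2
n_row <- ceiling(length(plots) / n_col)  # 5 plots in 2 columns -> 3 rows

# arrangeGrob() builds the layout without drawing it
combined <- arrangeGrob(grobs = plots, ncol = n_col)

# Height grows with the number of rows (3 inches per row is arbitrary)
out_file <- file.path(tempdir(), "features.png")
ggsave(out_file, combined, width = 8, height = 3 * n_row, units = "in")
```

In an R Markdown document the same idea applies through the fig.height and fig.width chunk options, which are given in inches.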

Formatting output with Knitr, ggplot2 and xtable

I am trying to achieve the following task with Knitr, ggplot2 and xtable:
1. Generate several annotated plots of beta distributions with ggplot2.
2. Write the output in a layout such that every plot is followed by a corresponding summary stats table.
3. Write the code such that both PDF and HTML reports can be generated in a presentable way.
Here is my attempt at this task (Rnw file):
\documentclass{article}
\begin{document}
Test for ggplot2 with Knitr
<<Initialize, echo=FALSE>>=
library(ggplot2)
library(ggthemes)
library(data.table)
library(grid)
library(xtable)
library(plyr)
pltlist <- list()
statlist <- list()
@
The libraries are loaded. Now run the main loop
<<plotloop, echo=FALSE>>=
for (k in seq(1,7)){
x <- data.table(rbeta(100000,1.6,14+k))
xmean <- mean(x$V1, na.rm=T)
xqtl <- quantile(x$V1, probs = c(0.995), names=F)
xdiff <- xqtl - xmean
dens <- density(x$V1)
xscale <- (max(dens$x, na.rm=T) - min(dens$x, na.rm=T))/100
yscale <- (max(dens$y, na.rm=T))/100
y_max <- max(dens$y, na.rm=T)
y_intercept <- y_max-(10*yscale)
data <- data.frame(x)
y <- ggplot(data, aes(x=V1)) + geom_density(colour="darkgreen", size=2, fill="green",alpha=.3) +
geom_vline(xintercept = xmean, colour="blue", linetype = "longdash") +
geom_vline(xintercept = xqtl, colour="red", linetype = "longdash") +
geom_segment(aes(x=xmean, xend=xqtl, y=y_intercept, yend=y_intercept), colour="red", linetype = "solid", arrow = arrow(length = unit(0.2, "cm"), ends = "both", type = "closed")) +
annotate("text", x = xmean+xscale, y = y_max, label = paste("Val1:",round(xmean,4)), hjust=0) +
annotate("text", x = xqtl+xscale, y = y_max, label = paste("Val2:",round(xqtl,4))) +
annotate("text", x = xmean+10*xscale, y = y_max-15*yscale, label = paste("Val3:",round(xdiff,4))) +
xlim(min(dens$x, na.rm=T), xqtl + 9*xscale) +
xlab("Values") +
ggtitle("Beta Distribution") +
theme_bw() +
theme(plot.title = element_text(hjust = 0, vjust=2))
pltlist[[k]] <- y
statlist[[k]] <- list(mean=xmean, quantile=xqtl)
}
stats <- ldply(statlist, data.frame)
@
Plots are ready. Now Plot them
<<PrintPlots, warning=FALSE, results='asis', echo=FALSE, cache=TRUE, fig.height=3.5>>=
for (k in seq(1,7)){
print(pltlist[[k]])
print(xtable(stats[k,], caption="Summary Statistics", digits=6))
}
@
Plotting Finished.
\end{document}
I am faced with several issues after running this code:
1. When I run this as plain R code and print the plots from the list, the horizontal line from the geom_segment part moves all over the place. Only the last plot looks as I would expect; in all the other plots the geom_segment line moves around seemingly at random. If I plot the figures individually, without putting them in a list, they are fine.
2. I am also unable to give the plots separate captions, as I can for the tables.
Points to note :
I am storing the beta-random numbers in data.table since in our actual code, we are using data.table. However for the purposes of testing ggplot2 in this way, I convert the data.table into a data.frame, as ggplot2 requires.
I also need to generate the random numbers within the loop and generate the plots per iteration (so something like first generating the random numbers and then using melt would not work here), since generating the random numbers is emulating a complex database call per iteration of the loop.
I am using RStudio Version 0.98.1091 and
R version 3.1.2 (2014-10-31) on Windows 8.1
This is the expected Plot:
This is the plot I am getting when plotting from the list:
My output in PDF form :
PDF Output
Please advise if you have any ideas for solutions.
Thank you,
SG
I don't know why the horizontal line in geom_segment is "moving around" from plot to plot, rather than spanning xmean to xqtl. However, I was able to get the horizontal line in the correct location by getting the value from the stats data frame, rather than from direct calculation of the mean and quantile. You just have to create the stats data frame before the loop, rather than after, so that you can use it in the loop.
stats <- ldply(statlist, data.frame)
for (k in seq(1,7)){
...
y <- ggplot(data, aes(x=V1)) +
...
geom_segment(aes(x=stats[k,1], xend=stats[k,2], y=y_intercept, yend=y_intercept),
colour="red", linetype = "solid",
arrow = arrow(length = unit(0.2, "cm"), ends = "both", type = "closed")) +
...
pltlist[[k]] <- y
statlist[[k]] <- list(mean=xmean, quantile=xqtl)
}
Hopefully, someone else will be able to explain the anomalous behavior, but at least this seems to fix the problem.
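A likely explanation is the same lazy evaluation described in the first answer on this page: aes(x = xmean, xend = xqtl, ...) is not evaluated inside the loop but when the plot is printed, at which point xmean, xqtl and y_intercept hold the values from the last iteration. That would also explain why only the last plot looks right. Under that assumption, a sketch of a more direct fix is to pass the constants through annotate, which takes fixed values immediately instead of aesthetic mappings. The sample size, the three iterations and the y position of 1 are simplifications for illustration.

```r
library(ggplot2)
library(grid)  # for arrow() and unit()

set.seed(1)
pltlist <- list()
means <- numeric(3)

for (k in 1:3) {
  d <- data.frame(V1 = rbeta(1000, 1.6, 14 + k))
  xmean <- mean(d$V1)
  xqtl <- quantile(d$V1, probs = 0.995, names = FALSE)
  means[k] <- xmean

  pltlist[[k]] <- ggplot(d, aes(x = V1)) +
    geom_density() +
    # annotate() stores the current numbers in the layer, so later
    # changes to xmean and xqtl do not affect this plot
    annotate("segment", x = xmean, xend = xqtl, y = 1, yend = 1,
             colour = "red",
             arrow = arrow(length = unit(0.2, "cm"), ends = "both",
                           type = "closed"))
}
```

The same pattern should carry over to the geom_vline calls, which already receive their values outside aes and are therefore unaffected.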
For the figure caption, you can add a fig.cap argument to the chunk where you plot the figures, although this results in the same caption for each figure and causes the figures and tables to be plotted in separate groups, rather than interleaved:
<<PrintPlots, warning=FALSE, results='asis', echo=FALSE, cache=TRUE, fig.cap="Caption", fig.height=3.5>>=
for (k in seq(1,7)){
print(pltlist[[k]])
print(xtable(stats[k,], caption="Summary Statistics", digits=6))
}
You might want to use R Markdown and knitr which is easier than using LaTeX and R (as also zhaoy suggested).
You might also want to check out the ReporteRs package. I think it is actually easier to use than knitr. However, you cannot generate PDFs with it. But you can use pandoc to convert them into PDFs.

Keep all plot components same size in ggplot2 between two plots

I would like two separate plots. I am using them in different frames of a beamer presentation and I will add one line to the other (eventually, not in the example below). Thus I do not want the presentation to "skip" ("jump"?) from one slide to the next. I would like it to look like the line is being added naturally. I believe the code below shows the problem. It is subtle, but note how the plot area of the second plot is slightly larger than that of the first plot. This happens because of the y-axis label.
library(ggplot2)
dfr1 <- data.frame(
time = 1:10,
value = runif(10)
)
dfr2 <- data.frame(
time = 1:10,
value = runif(10, 1000, 1001)
)
p1 <- ggplot(dfr1, aes(time, value)) + geom_line() + scale_y_continuous(breaks = NULL) + scale_x_continuous(breaks = NULL) + ylab(expression(hat(z)==hat(gamma)[1]*time+hat(gamma)[4]*time^2))
print(p1)
dev.new()
p2 <- ggplot(dfr2, aes(time, value)) + geom_line() + scale_y_continuous(breaks = NULL) + scale_x_continuous(breaks = NULL) + ylab(".")
print(p2)
I would prefer to not have a hackish solution such as setting the size of the axis label manually or adding spaces on the x-axis (see one reference below), because I will use this technique in several settings and the labels can change at any time (I like reproducibility so want a flexible solution).
I've searched a lot and have found the following:
Specifying ggplot2 panel width
How can I make consistent-width plots in ggplot (with legends)?
https://groups.google.com/forum/#!topic/ggplot2/2MNoYtX8EEY
How can I add variable size y-axis labels in R with ggplot2 without changing the plot width?
They do not work for me, mainly because I need separate plots, so it is not a matter of aligning them vertically on one combined plot as in some of the above solutions.
I haven't tried it, but this might work:
gl <- lapply(list(p1,p2), ggplotGrob)
library(grid)
widths <- do.call(unit.pmax, lapply(gl, "[[", "widths"))
heights <- do.call(unit.pmax, lapply(gl, "[[", "heights"))
lg <- lapply(gl, function(g) {g$widths <- widths; g$heights <- heights; g})
grid.newpage()
grid.draw(lg[[1]])
grid.newpage()
grid.draw(lg[[2]])
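To get separate files for the beamer frames, the aligned grobs could be written to one PDF each. This is a sketch along the lines of the code above; the file names and the 5 x 4 inch size are arbitrary, and the axis labels are simplified.

```r
library(ggplot2)
library(grid)

dfr1 <- data.frame(time = 1:10, value = runif(10))
dfr2 <- data.frame(time = 1:10, value = runif(10, 1000, 1001))

p1 <- ggplot(dfr1, aes(time, value)) + geom_line() +
  ylab(expression(hat(z) == hat(gamma)[1] * time))
p2 <- ggplot(dfr2, aes(time, value)) + geom_line() + ylab(".")

# Convert to gtables and give both the element-wise maximum layout sizes
gl <- lapply(list(p1, p2), ggplotGrob)
widths <- unit.pmax(gl[[1]]$widths, gl[[2]]$widths)
heights <- unit.pmax(gl[[1]]$heights, gl[[2]]$heights)
aligned <- lapply(gl, function(g) {
  g$widths <- widths
  g$heights <- heights
  g
})

# One PDF per slide, all with identical device size
files <- file.path(tempdir(), c("slide1.pdf", "slide2.pdf"))
for (i in seq_along(aligned)) {
  pdf(files[i], width = 5, height = 4)
  grid.draw(aligned[[i]])
  dev.off()
}
```

Because both files share the device size and the layout widths and heights, the panel should sit at the same position on each slide.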
How about using this for p2:
p2 <- ggplot(dfr2, aes(time, value)) + geom_line() +
scale_y_continuous(breaks = NULL) +
scale_x_continuous(breaks = NULL) +
ylab(expression(hat(z)==hat(gamma)[1]*time+hat(gamma)[4]*time^2)) +
theme(axis.title.y=element_text(color=NA))
This has the same label as p1, but the color is NA so it doesn't display. You could also use color="white".

Resources