R: Making Bigger Graphs when using facet_wrap() - r

I'm having some trouble visualizing some data in an R Markdown document. I'm attaching a picture for your reference.
I would like the graphs produced to be larger, and would expect the HTML page to allow for these graphs to be spread out, but they are all coming back "squished"
g <- ggplot(item_loc_metrics, aes(capc_ssp_ratio, avg_wk_bkrm_eoh)) + geom_point(color="firebrick")
g
I run this and it returns a nicely formatted graph:
This snippet of code works fine, but I would like to cut this same graph 60+ times based on the store I'm looking at. I've tried to to do that with this bit:
g2 <- ggplot(item_loc_metrics, aes(capc_ssp_ratio, avg_wk_bkrm_eoh)) + geom_point(color="firebrick") + facet_wrap(~CO_LOC_N, ncol=5, scales = "fixed", shrink = FALSE)
g2
I end up then getting something that looks like this:

To let this question have an answer, I add my two cents.
You can increase fig.height and fig.width(as suggested in the comment) and set ncol to one.

Related

R patchwork package trouble displaying many plots

I am enjoying the patchwork package quite a bit and I am generally happy with even its default formatting.
However, I cannot figure out how to get a reasonable result when trying to display many plots. The output is unreadable (plots shrunk down and dominated by text) and the saved output is completely blank.
Input:
require(ggplot2)
require(patchwork)
plots <- list()
testfunction <- function(x) {
plot_placeholder <- ggplot(mtcars) +
geom_point(aes(mpg, disp)) +
ggtitle(paste("p", x, sep = ""))
plots[[length(plots) + 1]] <<- plot_placeholder
}
mapply(testfunction, c(1:66), SIMPLIFY = FALSE)
patchwork::wrap_plots(plots)
ggsave("test.pdf")
Output:
Is there a way within patchwork to make the page larger or wrap plots across multiple pages of a PDF? Or is there a different package that may help with this?
Update:
Maurits Evers pointed out that this is an issue with my arguments in ggsave(), not patchwork. With these changes the plots look much better in the .pdf output, but were still being truncated at the bottom of page 1. From this post, found the last bit of change needed to generate desired results
Input:
# ...
pdf()
patchwork::wrap_plots(plots)
ggsave("test.pdf", width = 20, height = 20)
dev.off()
Output:

r - Missing object when ggsave output as .svg

I'm attempting to step through a dataset and create a histogram and summary table for each factor and save the output as a .svg . The histogram is created using ggplot2 and the summary table using summary().
I have successfully used the code below to save the output to a single .pdf with each page containing the relevant histogram/table. However, when I attempt to save each histogram/table combo into a set of .svg images using ggsave only the ggplot histogram is showing up in the .svg. The table is just white space.
I've tried using dev.copy Cairo and svg but all end up with the same result: Histogram renders, but table does not. If I save the image as a .png the table shows up.
I'm using the iris data as a reproducible dataset. I'm not using R-Studio which I saw was causing some "empty plot" grief for others.
#packages used
library(ggplot2)
library(gridExtra)
library(gtable)
library(Cairo)
#Create iris histogram plot
iris.hp<-ggplot(data=iris, aes(x=Sepal.Length)) +
geom_histogram(binwidth =.25,origin=-0.125,
right = TRUE,col="white", fill="steelblue4",alpha=1) +
labs(title = "Iris Sepal Length")+
labs(x="Sepal Length", y="Count")
iris.list<-by(data = iris, INDICES = iris$Species, simplify = TRUE,FUN = function(x)
{iris.hp %+% x + ggtitle(unique(x$Species))})
#Generate list of data to create summary statistics table
sum.str<-aggregate(Sepal.Length~Species,iris,summary)
spec<-sum.str[,1]
spec.stats<-sum.str[,2]
sum.data<-data.frame(spec,spec.stats)
sum.table<-tableGrob(sum.data)
colnames(sum.data) <-c("species","sep.len.min","sep.len.1stQ","sep.len.med",
"sep.len.mean","sep. len.3rdQ","sep.len.max")
table.list<-by(data = sum.data, INDICES = sum.data$"species", simplify = TRUE,
FUN = function(x) {tableGrob(x)})
#Combined histogram and summary table across multiple plots
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list,table.list))),
nrow=2, ncol=1, top = quote(paste(iris$labels$Species,'\nPage', g, 'of',pages)))
#bypass the class check per #baptiste
ggsave <- ggplot2::ggsave; body(ggsave) <- body(ggplot2::ggsave)[-2]
#
for(i in 1:3){
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list[i],table.list[i]))),
nrow=2, ncol=1,heights=c(1.65,.35),
top = quote(paste(iris$labels$Species,'\nPage', g, 'of',pages)))
prefix<-unique(iris$Species)
prefix<-prefix[i]
filename<-paste(prefix,".svg",sep="")
ggsave(filename,multi.plots)
#dev.off()
}
Edit removed theme tt3 that #rawr referenced. It was accidentally left in example code. It was not causing the problem, just in case anyone was curious.
Edit: Removing previous answer regarding it working under 32bit install and not x64 install because that was not the problem. Still unsure what was causing the issue, but it is working now. Leaving the info about grid.export as it may be a useful alternative for someone else.
Below is the loop for saving the .svg's using grid.export(), although I was having some text formatting issues with this (different dataset).
for(i in 1:3){
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list[i],table.list[i]))),
nrow=2, ncol=1,heights=c(1.65,.35), top =quote(paste(iris$labels$Species,'\nPage', g,
'of',pages)))
prefix<-unique(iris$Species)
prefix<-prefix[i]
filename<-paste(prefix,".svg",sep="")
grid.draw(multi.plots)
grid.export(filename)
grid.newpage()
}
EDIT: As for using arrangeGrob per #baptiste's comment. Below is the updated code. I was incorrectly using the single brackets [] for the returned by list, so I switched to the correct double brackets [[]] and used grid.draw to on the ggsave call.
for(i in 1:3){
prefix<-unique(iris$Species)
prefix<-prefix[i]
multi.plots<-grid.arrange(arrangeGrob(iris.list[[i]],table.list[[i]],
nrow=2,ncol=1,top = quote(paste(iris$labels$Species))))
filename<-paste(prefix,".svg",sep="")
ggsave(filename,grid.draw(multi.plots))
}

Issue: ggplot2 replicates last plot of a list in grid

I have some 16 plots. I want to plot all of these in grid manner with ggplot2. But, whenever I plot, I get a grid with all the plots same, i.e, last plot saved in a list gets plotted at all the 16 places of grid. To replicate the same issue, here I am providing a simple example with two files. Although data are entirely different, but plots drawn are similar.
library(ggplot2)
library(grid)
library(gridExtra)
library(scales)
set.seed(1006)
date1<- as.POSIXct(seq(from=1443709107,by=3600,to=1446214707),origin="1970-01-01")
power <- rnorm(length(date1),100,5)#with normal distribution
write.csv(data.frame(date1,power),"file1.csv",row.names = FALSE,quote = FALSE)
# Now another dataset with uniform distribution
write.csv(data.frame(date1,power=runif(length(date1))),"file2.csv",row.names = FALSE,quote = FALSE)
path=getwd()
files=list.files(path,pattern="*.csv")
plist<-list()# for saving intermediate ggplots
for(i in 1:length(files))
{
dframe<-read.csv(paste(path,"/",files[i],sep = ""),head=TRUE,sep=",")
dframe$date1= as.POSIXct(dframe$date1)
plist[[i]]<- ggplot(dframe)+aes(dframe$date1,dframe$power)+geom_line()
}
grid.arrange(plist[[1]],plist[[2]],ncol = 1,nrow=2)
You need to remove the dframe from your call to aes. You should do that anyway because you have provided a data-argument. In this case it's even more important because while you save the ggplot-object, things don't get evaluated until the call to plot/grid.arrange. When you do that, it looks at the current value of dframe, which is the last dataset in your iteration.
You need to plot with:
ggplot(dframe)+aes(date1,power)+geom_line()

Assigning "beanplot" object to variable in R

I have found that the beanplot is the best way to represent my data. I want to look at multiple beanplots together to visualize my data. Each of my plots contains 3 variables, so each one looks something like what would be generated by this code:
library(beanplot)
a <- rnorm(100)
b <- rnorm(100)
c <- rnorm(100)
beanplot(a, b ,c ,ylim = c(-4, 4), main = "Beanplot",
col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
(Would have just included an image but my reputation score is not high enough, sorry)
I have 421 of these that I want to put into one long PDF (EDIT: One plot per page is fine, this was just poor wording on my part). The approach I have taken was to first generate the beanplots in a for loop and store them in a list at each iteration. Then I will use the multiplot function (from the R Cookbook page on multiplot) to display all of my plots on one long column so I can begin my analysis.
The problem is that the beanplot function does not appear to be set up to assign plot objects as a variable. Example:
library(beanplot)
a <- rnorm(100)
b <- rnorm(100)
plot1 <- beanplot(a, b, ylim = c(-5,5), main = "Beanplot",
col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
plot1
If you then type plot1 into the R console, you will get back two of the plot parameters but not the plot itself. This means that when I store the plots in the list, I am unable to graph them with multiplot. It will simply return the plot parameters and a blank plot.
This behavior does not seem to be the case with qplot for example which will return a plot when you recall the stored plot. Example:
library(ggplot2)
a <- rnorm(100)
b <- rnorm(100)
plot2 <- qplot(a,b)
plot2
There is no equivalent to the beanplot that I know of in ggplot. Is there some sort of workaround I can use for this issue?
Thank you.
You can simply open a PDF device with pdf() and keep the default parameter onefile=TRUE. Then call all your beanplot()s, one after the other. They will all be in one PDF document, each one on a separate page. See here.

How to deal with a lot of plots in R

I have a for loop which produces 60 plots. I would like to save all this plots in only one file.
If I set par(mfrow=c(10,6)) it says : Error in plot.new() : figure margins too large
What can I do?
My code is as follows:
pdf(file="figure.pdf")
par(mfrow=c(10,6))
for(i in 1:60){
x=rnorm(100)
y=rnorm(100)
plot(x,y)
}
dev.off()
Your default plot, as stated in the loop, does not use the space very effectively. If you look at just a single plot, you can see it has large margins, both between axis and edge and plot area and axis text. Effectively, there is a lot of space-hogging.
Secondly, the default pdf-function creates small pages, 7 by 7 inches. That is not a large sheet to plot on.
Trying to plot a 10 x 6 or 12 x 5 on 7 by 7 inches is therefore trying to squeeze in a lot of whitespace on very little space.
For it to succeed, you must look into the margin-options of par which is mar, mai, oma and omi, and probably some more. Consult the documentation with the command
?par
In addition to this, you could consider not displaying axis-text, tick-marks, tick-labels and titles for every one of the 60 sub-plots, as this too will save you space.
But somebody has already gone through some of this trouble for you. Look into the lattice-package or ggplot2, which has some excellent methods for making table-like subplots.
But there is another pressing issue: What are you trying to display with 60 subplots?
Update
Seeing what you are trying to do, here is a small example of faceting in ggplot2. It uses the Tufte-theme from jrnold's ggthemes, which is copied into here and then modified slightly in the line after the function.
library(ggplot2)
library(scales)
#### Setup the `theme` for the plot, i.e. the appearance of background, lines, margins, etc. of the plot.
## This function returns a theme-object, which ggplot2 uses to control the appearance.
theme_tufte <- function(ticks=TRUE, base_family="serif", base_size=11) {
ret <- theme_bw(base_family=base_family, base_size=base_size) +
theme(
legend.background = element_blank(),
legend.key = element_blank(),
panel.background = element_blank(),
panel.border = element_blank(),
strip.background = element_blank(),
plot.background = element_blank(),
axis.line = element_blank(),
panel.grid = element_blank())
if (!ticks) {
ret <- ret + theme(axis.ticks = element_blank())
}
ret
}
## Here I modify the theme returned from the function,
theme <- theme_tufte() + theme(panel.margin=unit(c(0,0,0,0), 'lines'), panel.border=element_rect(colour='grey', fill=NA))
## and instruct ggplot2 to use this theme as default.
theme_set(theme)
#### Some data generation.
size = 60*30
data <- data.frame(x=runif(size), y=rexp(size)+rnorm(size), mdl=sample(60,size, replace=TRUE))
#### Main plotting routine.
ggplot(data, aes(x,y, group=mdl)) ## base state of the plot to be used on all "layers", i.e. which data to use and which mappings to use (x should use x-variable, y should use the y-variable
+ geom_point() ## a layer that renders data as points, creates the scatterplot
+ stat_quantile(formula=y~x) ## another layer that adds some statistics, in this case the 25%, 50% and 75% quantile lines.
+ facet_wrap(~ mdl, ncol=6) ## Without this, all the groups would be displayed in one large plot; this breaks it up according to the `mdl`-variable.
The usual challenge in using ggplot2 is restructuring all your data into data.frames. For this task, the reshape2 and plyr-packages might be of good use.
For you, I would imagine that your function that creates the subplot both calculates the estimation and creates the plot. This means that you have to split the function into calculating the estimation, returning it to a data.frame, which you then can collate and pass to ggplot.
Output the plots to a pdf:
X = matrix(rnorm(60*100), ncol=60)
Y = matrix(rnorm(60*100), ncol=60)
pdf(file="fileName.pdf")
for(j in 1:60){
plot(X[,j], Y[,j])
}
dev.off()
For placing many plots on a page or document (and I have created images with literally thousands of plots in them), it is convenient to separate the work between R--which creates the plots individually--and other software which is better suited for arranging arrays of things. If this reminds you of spreadsheets or word processing tables, then we are thinking alike.
This page, which is a screenshot from a PDF file, contains over 200 statistical graphics. Although it has been greatly reduced (to 40% nominal size) in order to obscure proprietary data, the original has all the detail of the original R graphics and can be zoomed to 1600% without problem.
Two mechanisms have worked reasonably well. For up to several hundred plots, a little macro to import and re-sequence a set of bitmapped image files (.emf or .wmf) into a Word document does fine. For better control, I turn to a comparable Excel macro. It is driven by a sheet that is empty of everything except a row with column headers and a column with row headers. (You can see them at the left and top of the figure.) The macro deletes everything else on that sheet (except for formatting), then munges each possible combination of row and column header into a file name and if it finds that file, it imports it into the corresponding cell. The whole operation takes just a few seconds for several thousand images.
Obviously this communication mechanism between R and the other software is primitive, consisting of a collection of image files having a standard naming convention. But the code needed to implement it all is brief (albeit customized to each situation) and it works reliably. For example, if you encapsulate the plotting code within a function, then it will be called within a loop to create many similar plots. At the end of that function add a few lines to save the plot to a file, something like this:
path <- "W: <whatever>/" # Folder for the output files
ext <- "wmf" # or "emf" or "png" or ... # Format (and extension) of the output
...
if (save) {
outfile <- paste(path, paste(munge(well), munge(parm), sep="_"), sep="/")
outfile <- paste(outfile, ext, sep=".")
savePlot(filename=outfile, type=ext)
}
In this case each plot is identified by two loop variables, well and parm, both of which are strings (they correspond to the column and row headers). The function for creating acceptable filenames merely strips out punctuation, replacing it by an anodyne placeholder:
munge <- function(s) gsub("[[:punct:]]", "_", s)
Once those images have been imported into Word, Excel, or wherever you like, it's fairly easy to reorganize them, place other material around them, etc., and then print the result in PDF format.
There is an art to creating these very large "small multiples" (in Tufte's terminology). To the extent possible, it helps to follow Tufte's principle of increasing the data:ink ratio by erasing inessential material. That makes graphical patterns clear even when the tableau has been greatly reduced in size in order to comprehend all its rows and columns at once. Although the preceding figure is a poor example--the individual plots had to have axes, gridlines, labels, and so on so that they can be read in detail when zoomed--the power of this method to reveal patterns is clear even at this scale. It is crucial to make the plots comparable to one another. In this example, which consists of time series, every plot has the same range on the x-axis; within each row (which corresponds to a different type of observation), the ranges on the y-axes are the same; and all color schemes and methods of symbolization are the same throughout.
You could also use knitr. This didn't instantly convert over to base graphics (and I've got to run now), but using ggplot works easily.
\documentclass{article}
\begin{document}
<<echo = FALSE, fig.keep='high', fig.height=3, fig.width=4>>=
require(ggplot2)
for (i in 1:10) print(ggplot(mtcars, aes(x = disp, y = mpg)) + geom_point())
#
\end{document}
The above code will produce a nice multi-page pdf with all the graphs.
For a very simple solution to this type of issue, I found that setting a large "Windows" device manages to make the window big enough for many uses.
windows(50,50)
par(mfrow=c(10,6))
for(i in 1:60){
x=rnorm(100)
y=rnorm(100)
plot(x,y)
}
Or in my case,
windows(20,20)
plot(Plotting_I_Need_In_Rows_of_4, mfrow=c(4,4))

Resources