I would like to create a variable p that contains a plot with four ggplot2 subplots. I am able to achieve this with the below code:
library(ggplot2)
library(gridExtra)
data = diamonds[1:50,]
x = data$x
myPlots = lapply(c(1,5,6,7), function(i){
y = as.data.frame(data[,i])
y = y[,1]
df = data.frame(x=x,y=y)
p <- qplot(x, y, data=df)
p
})
p = do.call("grid.arrange", c(myPlots, ncol=2))
I like that I can use the variable p later by calling:
library(grid)
grid.draw(p)
However, I do not like that when I initially create p with the do.call("grid.arrange") syntax, it plots it automatically (at least in RStudio).
My question is: Is it possible to create p to be stored for later use, without plotting it upon its creation?
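One possible approach, sketched here and not taken from the original post, is to replace grid.arrange() with gridExtra::arrangeGrob(). It builds the same layout and returns it as a gtable without drawing anything. The sketch also uses an explicit ggplot() call where the question used qplot(); that swap is only an assumption made for illustration.

```r
library(ggplot2)
library(gridExtra)
library(grid)

data <- diamonds[1:50, ]

# Build the four scatter plots; nothing is drawn at this point
myPlots <- lapply(c(1, 5, 6, 7), function(i) {
  df <- data.frame(x = data$x, y = data[[i]])
  ggplot(df, aes(x = x, y = y)) + geom_point()
})

# arrangeGrob() lays the plots out and returns a gtable without drawing it
p <- arrangeGrob(grobs = myPlots, ncol = 2)

# Draw later, only when the figure is actually needed:
# grid.newpage(); grid.draw(p)
```

Calling grid.newpage() before grid.draw(p) avoids drawing on top of whatever the device already shows.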
Related
I am trying to construct a list of ggplot graphics, which will be plotted later. What I have so far, using Anscombe's quartet for an example, is:
library(ggplot2)
library(gridExtra)
base <- ggplot() + xlim(4,19)
plots = vector(mode = "list", length = 4)
for(i in 1:4) {
x <- anscombe[,i]
y <- anscombe[,i+4]
p <- geom_point(aes(x,y),colour="blue")
q <- geom_smooth(aes(x,y),method="lm",colour="red",fullrange=T)
plots[[i]] <- base+p+q
}
grid.arrange(grobs = plots,ncol=2)
As I travel through the loop, I want the current values of the plots p and q to be added with the base plot, into the i-th value of the list. That is, so that list element number i contains the plots relating to the i-th x and y columns from the dataset.
However, what happens is that the last plot only is drawn, four times. I've done something very similar with base R, using mfrow, plot and abline, so that I believe my logic is correct, but my implementation isn't. I suspect that the issue is with these lines:
plots = vector(mode = "list", length = 4)
plots[[i]] <- base+p+q
How can I create a list of ggplot graphics; starting with an empty list?
(If this is a trivial and stupid question, I apologise. I am very new both to R and to the Grammar of Graphics.)
The code works properly if lapply() is used instead of a for loop.
plots <- lapply(1:4, function(i) {
# create plot number i
})
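The reason for this issue is that ggplot2 evaluates aesthetics lazily: aes(x, y) stores the expressions x and y, not their values, and only looks them up when the plot is rendered. In the for loop, x and y are global variables that every iteration overwrites, so by the time the plots are rendered they hold the data from i=4 and the last plot is displayed four times. Inside lapply(), each call to the function has its own x and y.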
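The effect can be reproduced in base R without ggplot2. In this minimal sketch, capture() and resolve() are hypothetical stand-ins for aes() and for rendering; they are not ggplot2 functions, and they only mimic the idea of storing an expression together with the environment it came from.

```r
# Hypothetical stand-in for aes(): remember an expression and where it came from
capture <- function(expr) {
  env <- parent.frame()                  # environment of the caller
  list(expr = substitute(expr), env = env)
}

# Hypothetical stand-in for rendering: look the expression up only now
resolve <- function(cap) eval(cap$expr, cap$env)

# for loop: every capture points at the same environment and the same x
loop_caps <- list()
for (i in 1:3) {
  x <- i * 10
  loop_caps[[i]] <- capture(x)
}
loop_values <- sapply(loop_caps, resolve)   # 30 30 30

# lapply: each function call has its own environment and its own x
fun_caps <- lapply(1:3, function(i) {
  x <- i * 10
  capture(x)
})
fun_values <- sapply(fun_caps, resolve)     # 10 20 30
```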
Full working example:
library(ggplot2)
library(gridExtra)
base <- ggplot() + xlim(4,19)
plots <- lapply(1:4, function(i) {
x <- anscombe[,i]
y <- anscombe[,i+4]
p <- geom_point(aes(x,y),colour="blue")
q <- geom_smooth(aes(x,y),method="lm",colour="red",fullrange=T)
base+p+q
})
grid.arrange(grobs = plots,ncol=2)
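To force evaluation, there is a simple solution: change aes(...) into aes_(...), which evaluates its arguments immediately, and your code works. Note that aes_() is deprecated in recent ggplot2 releases.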
library(ggplot2)
library(gridExtra)
base <- ggplot() + xlim(4,19)
plots <- lapply(1:4, function(i) {
x <- anscombe[,i]
y <- anscombe[,i+4]
p <- geom_point(aes_(x,y),colour="blue")
q <- geom_smooth(aes_(x,y),method="lm",colour="red",fullrange=T)
base+p+q
})
grid.arrange(grobs = plots,ncol=2)
I'm trying to plot multiple plots on a grid using ggplot2 in a for loop, followed by grid.arrange. But all the plots are identical afterwards.
library(ggplot2)
library(grid)
library(gridExtra) # provides grid.arrange()
test = data.frame(matrix(rnorm(320), ncol=16 ))
names(test) = sapply(1:16, function(x) paste0("var_",as.character(x)))
plotlist = list()
for (i in 1:(dim(test)[2]-1)){
plotlist[[i]] = ggplot(test) +
geom_point(aes(get(x=names(test)[dim(test)[2]]), y=get(names(test)[i])))
}
pdf("output.pdf")
do.call(grid.arrange, list(grobs=plotlist, nrow=3))
dev.off()
When running this code, it seems like the get() calls are only evaluated at the time of the grid.arrange call, so all of the y vectors in the plot are identical as "var_15". Is there a way to force get evaluation immediately, so that I get 15 different plots?
Thanks!
Here are two ways that use purrr::map functions instead of a for-loop. I find that I have less of a clear sense of what's going on when I try to use loops, and since there are functions like the apply and map families that fit so neatly into R's vector operations paradigm, I generally go with mapping instead.
The first example makes use of cowplot::plot_grid, which can take a list of plots and arrange them. The second uses the newer patchwork package, which lets you add plots together—like literally saying plot1 + plot2—and add a layout. To do all those additions, I use purrr::reduce with + as the function being applied to all the plots.
library(tidyverse)
set.seed(722)
test = data.frame(matrix(rnorm(320), ncol=16 ))
names(test) = sapply(1:16, function(x) paste0("var_",as.character(x)))
# extract all but last column
xvars <- test[, -ncol(test)]
By using purrr::imap, I can map over all the columns and apply a function with 2 arguments: the column itself, and its name. That way I can set an x-axis label that specifies the column name. I can also easily access the column of data without having to use get or any tidyeval tricks (although for something more complicated, a tidyeval solution might be better).
plots <- imap(xvars, function(variable, var_name) {
df <- tibble(x = variable, y = test[, ncol(test)]) # tibble() replaces the deprecated data_frame()
ggplot(df, aes(x = x, y = y)) +
geom_point() +
xlab(var_name)
})
cowplot::plot_grid(plotlist = plots, nrow = 3)
library(patchwork)
# same as plots[[1]] + plots[[2]] + plots[[3]] + ...
reduce(plots, `+`) + plot_layout(nrow = 3)
Created on 2018-07-22 by the reprex package (v0.2.0).
Try this:
library(ggplot2)
library(grid)
library(gridExtra)
set.seed(1234)
test = data.frame(matrix(rnorm(320), ncol=16 ))
names(test) = sapply(1:16, function(x) paste0("var_",as.character(x)))
plotlist = list()
for (i in 1:(dim(test)[2]-1)) {
# Define here the dataset for the i-th plot
df <- data.frame(x=test$var_16, y=test[, i])
plotlist[[i]] = ggplot(data=df, aes(x=x, y=y)) + geom_point()
}
grid.arrange(grobs=plotlist, nrow=3)
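A further variant, sketched under the assumption that ggplot2 3.0.0 or later is installed (for the .data pronoun), keeps the original data frame and selects columns by name. It uses lapply() so that each plot gets its own y_name; the names x_name, y_names and grid_15 are chosen here for illustration.

```r
library(ggplot2)
library(gridExtra)

set.seed(1234)
test <- data.frame(matrix(rnorm(320), ncol = 16))
names(test) <- paste0("var_", 1:16)

x_name  <- names(test)[ncol(test)]     # var_16 goes on the x-axis
y_names <- names(test)[-ncol(test)]    # var_1 ... var_15 go on the y-axis

# Each call to the function has its own y_name, so nothing is overwritten
plotlist <- lapply(y_names, function(y_name) {
  ggplot(test, aes(x = .data[[x_name]], y = .data[[y_name]])) +
    geom_point() +
    labs(x = x_name, y = y_name)
})

# Lay out without drawing; draw later with grid::grid.draw(grid_15)
grid_15 <- arrangeGrob(grobs = plotlist, nrow = 3)
```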
Context: I have a dataset of 50+ features, and I would like to produce a boxplot, a histogram, and summary statistics for each of them, for presentation purposes. That makes 150+ plots. The code I have used so far is:
library(ggplot2)
library(dplyr)
library(ggpubr)
library(ggthemes)
library(Rmisc)
library(gridExtra)
myplots <- list() # new empty list
for (i in seq(2,5,3)){
local({
i <- i
p1 <- ggplot(data=dataset,aes(x=dataset[ ,i], colour=label))+
geom_histogram(alpha=.01, position="identity",bins = 33, fill = "white") +
xlab(colnames(dataset)[ i]) + scale_y_log10() + theme_few()
p2<- ggplot(data=dataset, aes( x=label, y=dataset[ ,i], colour=label)) +
geom_boxplot()+ylab(colnames(dataset)[ i]) +theme_few()
p3<- summary(dataset[ ,i])
print(i)
print(p1)
print(p2)
print(p3)
myplots[[i]] <<- p1 # histogram
myplots[[i+1]] <<- p2 # boxplot
myplots[[i+2]] <<- p3 # summary
})
}
myplots[[2]]
length(myplots)
n <- length(myplots)
nCol <- floor(sqrt(n))
do.call("grid.arrange", c(myplots, ncol=nCol)) # PROBLEM: can't print summary as grob
I have created a list of plots, every 3 elements represent the results of a histogram, boxplot, and summary for each feature. I iterate through each of the 50+ features, appending each of the results to my list (not the best way to go about doing this I know). I then run into the following issue when I attempt to print the list through grid arrange:
Error in gList(list(grobs = list(list(x = 0.5, y = 0.5, width = 1, height = 1, :
only 'grobs' allowed in "gList"
Understandably so, as the summary function does not produce a graphical object. Any ideas as to how I can overcome this setback apart from not including summary statistics at all?
Hi, after combining several of the suggestions here, I managed to figure out how to plot the summary statistics for each feature as a grob, looping over the features of my dataset.
library(skimr)
library(gridExtra)
library(ggplot2)
library(dplyr)
mysumplots <- list() # new empty list
for (i in seq(2,ncol(dataset))){
local({
i <- i
sampletable <- data.frame(skim((dataset[ ,i]))) #creates a skim data frame
summarystats<-select(sampletable, stat, formatted) #select relevant df columns
summarystats<-slice(summarystats , 4:10) #select relevant stats
p3<-tableGrob(summarystats, rows=NULL) #converts df into a tableGrob
mysumplots[[i]] <<- p3 # append the tableGrob to the list of summary tables
})
}
do.call("grid.arrange", c(mysumplots, ncol=3)) # use grid arrange to plot all my grobs
This creates a skim data frame for each column (feature), selects the relevant statistics, converts them to a tableGrob assigned to p3, and appends that grob to a list of tableGrobs, one per feature. grid.arrange() then prints all of the tableGrobs.
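The same idea can be sketched without skimr, using base summary() and gridExtra::tableGrob(). The dataset below is a hypothetical stand-in (one label column plus three numeric features), and summary_grobs and summary_page are names invented for this example.

```r
library(gridExtra)

# Hypothetical stand-in for the real dataset: a label plus numeric features
set.seed(1)
dataset <- data.frame(
  label = rep(c("a", "b"), each = 10),
  f1 = rnorm(20),
  f2 = runif(20),
  f3 = rexp(20)
)

# One table grob per feature, built from the six-number summary
summary_grobs <- lapply(names(dataset)[-1], function(feature) {
  s <- summary(dataset[[feature]])
  tab <- data.frame(stat = names(s),
                    value = format(as.numeric(s), digits = 3))
  names(tab)[2] <- feature            # use the feature name as column header
  tableGrob(tab, rows = NULL)
})

# Combine into one layout without drawing; draw with grid::grid.draw()
summary_page <- arrangeGrob(grobs = summary_grobs, ncol = 3)
```

Because lapply() returns a list with no gaps, there are no empty slots to handle, unlike a list that is filled starting from index 2.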
I'd like to save multiple ggplots as jpegs through a for loop. But when I've tried to adapt code I've written for a basic plot command, I get no output (nothing is saved to my working directory).
For example, this works great:
library(cowplot)
library(ggplot2)
X<-c(1,2,3,4,5,6,7,8,9)
Y1<-c(2,3,4,4,3,2,4,5,6)
Y2<-c(3,4,5,3,2,1,1,2,3)
Y3<-c(4,5,6,7,8,9,8,7,6)
DF<-data.frame(X,Y1,Y2,Y3)
for(i in 1:3){
jpeg(paste(i,".jpeg",sep=""))
plot(DF[,1],DF[,i+1])
dev.off()
}
I end up getting three jpeg files saved to my working directory.
I'm not sure how to properly index the ggplot call here for i, but even this should return 3 instances of the same plot:
for(i in 1:3){
jpeg(paste(i,".jpeg",sep=""))
ggplot(data=DF,aes(x=X,y=Y1))+geom_line()
dev.off()
}
In the end, I was hoping to combine multiple plots onto one jpeg, and then save multiple jpegs like this:
for(i in 1:3){
jpeg(paste(i,".jpeg",sep=""))
A<-ggplot(data=DF,aes(x=X,y=Y1))+geom_line()
B<-ggplot(data=DF,aes(x=X,y=Y2))+geom_line()
C<-ggplot(data=DF,aes(x=X,y=Y3))+geom_line()
plot_grid(A,B,C)
dev.off()
}
So this loop should also produce three copies of the same plot, each under a different indexed file name. But again, I get nothing.
So my question is: why is there a difference between base plotting and ggplot2 plotting in this for loop? And how can one save multiple jpegs from ggplots like above?
How about
library(gridExtra) # gridExtra::arrangeGrob
for(i in 1:3) {
jpeg(paste0(i, ".jpg"))
A <- ggplot(data = DF, aes(x = X, y = Y1)) + geom_line()
B <- ggplot(data = DF, aes(x = X, y = Y2)) + geom_line()
C <- ggplot(data = DF, aes(x = X, y = Y3)) + geom_line()
grid.arrange(arrangeGrob(A, B, C, ncol = 3))
dev.off()
}
Note: this solution does not produce the side annotations of cowplot ("A", "B", "C").
Using your code, with ggsave() in place of jpeg():
for(i in 1:3){
A<-ggplot(data=DF,aes(x=X,y=Y1))+geom_line()
B<-ggplot(data=DF,aes(x=X,y=Y2))+geom_line()
C<-ggplot(data=DF,aes(x=X,y=Y3))+geom_line()
k<-plot_grid(A,B,C)
# ggsave() opens and closes its own device; index the file name by i
ggsave(filename = paste0("path/finalplot_", i, ".jpeg"), plot = k)
}
See ?ggsave for other arguments you can specify, such as height and width.
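As for why the original loop saved nothing: base plot() draws as a side effect, while a ggplot object is only drawn when it is printed. At the top level R prints the value of an expression automatically, but inside a for loop it does not, so the plot has to be printed explicitly. A minimal sketch, assuming the files should go to a temporary directory (out_dir and out_files are names chosen for this example) and using the .data pronoun to index the y column by i:

```r
library(ggplot2)

DF <- data.frame(
  X  = 1:9,
  Y1 = c(2, 3, 4, 4, 3, 2, 4, 5, 6),
  Y2 = c(3, 4, 5, 3, 2, 1, 1, 2, 3),
  Y3 = c(4, 5, 6, 7, 8, 9, 8, 7, 6)
)

out_dir   <- tempdir()   # assumption: write to a temporary directory
out_files <- file.path(out_dir, paste0(1:3, ".jpeg"))

for (i in 1:3) {
  y_name <- paste0("Y", i)
  p <- ggplot(DF, aes(x = X, y = .data[[y_name]])) +
    geom_line() +
    ylab(y_name)

  jpeg(out_files[i])
  print(p)    # explicit print; a bare `p` draws nothing inside a loop
  dev.off()
}
```

The plot is printed in the same iteration that defines y_name, so each file shows its own column.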
Does anyone know how to use manipulate() on a ggplot, in order to easily select a smoothing (span) level? I've tried the following without success:
# fake data
xvals <- 1:10
yvals <- xvals^2*exp(rnorm(2,5,0.6))
data <- data.frame(xvals,yvals)
# plot with manipulate
manipulate(
ggplot(data,aes(xvals,yvals)) +
geom_smooth(span=slider(0.5,5)) +
geom_point()
)
I want to be able to cycle through "smoothing levels" easily.
Changed your data to have more data points.
xvals <- 1:100
yvals <- rnorm(100)
data <- data.frame(xvals,yvals)
You have to give a name to the value used with span= in geom_smooth() (for example, span.val) and then define span.val=slider(0.1,1) outside the ggplot() call, in this example as the second argument to manipulate().
library(manipulate)
library(ggplot2)
manipulate({
#define plotting function
ggplot(data,aes(xvals,yvals)) +
geom_smooth(method="loess",span=span.val) +
geom_point()},
#define variable that will be changed in plot
span.val=slider(0.1,1)
)
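manipulate() only runs inside RStudio. Where it is not available, a static comparison can be sketched by building one plot per span value; the three values in span_values are an arbitrary choice for illustration, not something from the answer above.

```r
library(ggplot2)

set.seed(42)
data <- data.frame(xvals = 1:100, yvals = rnorm(100))

span_values <- c(0.2, 0.5, 1)   # assumed smoothing levels to compare

# One plot per span; each function call has its own span.val
span_plots <- lapply(span_values, function(span.val) {
  ggplot(data, aes(xvals, yvals)) +
    geom_smooth(method = "loess", formula = y ~ x, span = span.val) +
    geom_point() +
    ggtitle(paste("span =", span.val))
})

# Print one at a time, e.g. print(span_plots[[1]])
```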