Huge fan of facet plots in ggplot2. However, sometimes I have too many subplots and it'd be nice to break them up into a list of plots. For example
df <- data.frame(x=seq(1,24,1), y=seq(1,24,1), z=rep(seq(1,12),each=2))
df
x y z
1 1 1 1
2 2 2 1
3 3 3 2
4 4 4 2
5 5 5 3
. . . .
. . . .
myplot <- ggplot(df,aes(x=x, y=y))+geom_point()+facet_wrap(~z)
myplot
How would I write a function to take the resulting plot and split it into a list of plots? Something along these lines
splitFacet <- function(subsPerPlot){
# Method to break a single facet plot into a list of facet plots, each with at most `subsPerPlot` subplots
# code...
return(listOfPlots)
}
Split plot into individual plots
We build a function along these steps:
1. We go through the structure of the object to get the names of the variables used for faceting (here 'z').
2. We overwrite the facet element of our plot object with the one from an empty ggplot object (so if we print it at this stage, the facets are gone).
3. We extract the data and split it along the variables identified in step 1.
4. We overwrite the original data with each subset (12 times here) and store all outputs in a list.
code
splitFacet <- function(x){
facet_vars <- names(x$facet$params$facets) # 1
x$facet <- ggplot2::ggplot()$facet # 2
datasets <- split(x$data, x$data[facet_vars]) # 3
new_plots <- lapply(datasets,function(new_data) { # 4
x$data <- new_data
x})
}
new_plots <- splitFacet(myplot)
length(new_plots) # [1] 12
new_plots[[3]] # 3rd plot
Split plot into faceted plots of n subplots max
If we want to keep the facets but have fewer facets per plot, we can skip step 2 and rework the split so that each subset includes several values of the faceting variables.
Rather than writing a separate function, we'll generalize the first one: n is the maximum number of facets per plot.
n = NULL gives the previous output, which is slightly different from n = 1 (one facet per plot, with the facet strip kept).
splitFacet <- function(x, n = NULL){
facet_vars <- names(x$facet$params$facets) # 1
if(is.null(n)){
x$facet <- ggplot2::ggplot()$facet # 2a
datasets <- split(x$data, x$data[facet_vars]) # 3a
} else {
inter0 <- interaction(x$data[facet_vars], drop = TRUE) # 2b
inter <- ceiling(as.numeric(inter0)/n)
datasets <- split(x$data, inter) # 3b
}
new_plots <- lapply(datasets,function(new_data) { # 4
x$data <- new_data
x})
}
new_plots2 <- splitFacet(myplot,4)
length(new_plots2) # [1] 3
new_plots2[[2]]
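To see what steps 2b and 3b do, here is a small sketch on the faceting column alone. It assumes the same toy data as the question (12 values of z, two rows each) and n = 4; the names level_id and page are only illustrative.

```r
# Toy data: only the faceting column matters for the grouping logic.
df <- data.frame(z = rep(1:12, each = 2))
n <- 4  # assumed maximum number of facets per plot

# 2b: give each facet level an integer id (1..12 here)
inter0 <- interaction(df["z"], drop = TRUE)
level_id <- as.numeric(inter0)

# Integer-divide the ids into pages: levels 1-4 -> 1, 5-8 -> 2, 9-12 -> 3
page <- ceiling(level_id / n)

# 3b: split on the page index; each piece keeps up to n facet levels
pages <- split(df$z, page)
lapply(pages, unique)
```

Each element of the resulting list holds the z values that end up together in one faceted plot.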
This might come in handy too:
unfacet <- function(x){
x$facet <- ggplot2::ggplot()$facet
x
}
The tidy way
If the code that built the plot is available, there is no need to go through all this trouble: we can split the data before feeding it to ggplot:
library(tidyverse)
n_facets <- 4 # maximum number of facets per plot
myplots3 <-
df %>%
split(ceiling(group_indices(.,z)/n_facets)) %>%
map(~ggplot(.,aes(x =x, y=y))+geom_point()+facet_wrap(~z))
myplots3[[3]]
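As far as I know, recent dplyr versions warn when group_indices() is called with grouping columns. If that is a concern, the same split can be sketched with base R plus ggplot2 only. The value of n_facets and the name myplots4 are assumptions made for this example.

```r
library(ggplot2)

df <- data.frame(x = 1:24, y = 1:24, z = rep(1:12, each = 2))
n_facets <- 4  # assumed: maximum number of facets per plot

# Number the facet levels 1, 2, 3, ... in sorted order, then bin them into pages.
level_id <- as.integer(factor(df$z))
page <- ceiling(level_id / n_facets)

# One faceted plot per page of data.
myplots4 <- lapply(split(df, page), function(d) {
  ggplot(d, aes(x = x, y = y)) + geom_point() + facet_wrap(~z)
})
length(myplots4)
```

With 12 levels and 4 facets per plot this gives three plots, which can be printed with myplots4[[1]] and so on.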
While I was looking for a solution to this, I came across ggplus, specifically the function facet_multiple:
https://github.com/guiastrennec/ggplus
It lets you split a facet over a number of pages by specifying the amount of plots you want per page. In your example it would be:
library(ggplus)
df <- data.frame(x=seq(1,24,1), y=seq(1,24,1), z=rep(seq(1,12),each=2))
myplot <- ggplot(df,aes(x=x, y=y))+geom_point()
facet_multiple(plot = myplot, facets = 'z', ncol = 2, nrow = 2)
Is this the sort of thing you need? It worked a treat for me.
This is similar to Moody_Muddskipper's answer, but works with any type of faceting (facet_grid or facet_wrap), handles arbitrary expressions in facets, and doesn't draw facet strip bars.
library(rlang)
library(ggplot2)
split_facets <- function(x) {
facet_expr <- unlist(x[["facet"]][["params"]][c("cols", "rows", "facets")])
facet_levels <- lapply(facet_expr, rlang::eval_tidy, data = x[["data"]])
facet_id <- do.call(interaction, facet_levels)
panel_data <- split(x[["data"]], facet_id)
plots <- vector("list", length(panel_data))
for (ii in seq_along(plots)) {
plots[[ii]] <- x
plots[[ii]][["data"]] <- panel_data[[ii]]
plots[[ii]][["facet"]] <- facet_null()
}
plots
}
split_facets(ggplot(df,aes(x=x, y=y))+geom_point()+facet_wrap(~z))
split_facets(ggplot(df,aes(x=x, y=y))+geom_point()+facet_grid(z %% 2 ~ z %% 5))
It uses rlang::eval_tidy to evaluate the facet expressions, combines them into a single categorical factor, then uses that to split the data. It also "suppresses" each subplot's faceting part by replacing it with facet_null().
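The evaluate-then-combine step can be tried on its own. The sketch below assumes the facet expressions are available as plain quoted calls in a hypothetical list facet_exprs; inside a real ggplot object they are stored differently, so this only illustrates the idea.

```r
library(rlang)

df <- data.frame(z = 1:12)

# Hypothetical stand-ins for the row and column expressions of facet_grid(z %% 2 ~ z %% 5)
facet_exprs <- list(quote(z %% 2), quote(z %% 5))

# Evaluate each expression with the data frame columns in scope
facet_levels <- lapply(facet_exprs, eval_tidy, data = df)

# Combine into one factor: one level per (row, column) combination
facet_id <- do.call(interaction, facet_levels)
nlevels(facet_id)

# Split the data on that factor, one piece per panel
panel_data <- split(df, facet_id)
```

Because interaction() keeps every combination of levels unless drop = TRUE is given, combinations that never occur in the data would produce empty pieces.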
Posting this for anyone wanting to use ggplus. ggplus will work with later versions of R, but you need to install it using the developer's directions, i.e.
devtools::install_github("guiastrennec/ggplus")
I ran into the same issue when trying to install it using RStudio, then realized that it's just not one of the "standard packages." I'm using R 3.4.4.
Related
I have five graphs plotted, each with one slight variable change, the randmod function as seen below.
library(spatstat)
library(ggplot2)
library(dplyr)
library(ggpubr)
library(tidyr)
set.seed(4)
dim <- 2000
radiusCluster<-100
lambdaParent<-.02
lambdaDaughter<-30
hosts<-900
randmod<-0 #this is the variable that changes
delta.t <- 1
iterations <- 1000
sigma <- 0.1
beta <- 1
theta <- 10
b <- .4
numbparents<-rpois(1,lambdaParent*dim)
xxParent<-runif(numbparents,0+radiusCluster,dim-radiusCluster)
yyParent<-runif(numbparents,0+radiusCluster,dim-radiusCluster)
numbdaughter<-rpois(numbparents,(lambdaDaughter))
sumdaughter<-sum(numbdaughter)
theta<-2*pi*runif(sumdaughter)
rho<-radiusCluster*sqrt(runif(sumdaughter))
xx0=rho*cos(theta)
yy0=rho*sin(theta)
xx<-rep(xxParent,numbdaughter)
yy<-rep(yyParent,numbdaughter)
xx<-xx+xx0
yy<-yy+yy0
cds<-data.frame(xx,yy)
is_outlier<-function(x){
x > dim| x < 0
}
cds<-cds[!(is_outlier(cds$xx)|is_outlier(cds$yy)),]
sampleselect<-sample(1:nrow(cds),hosts,replace=F)
cds<-cds%>%slice(sampleselect)
randfunction<-function(x){
x<-runif(length(x),0,dim)
}
randselect<-sample(1:nrow(cds),floor(hosts*randmod),replace=F)
cds[randselect,]<-apply(cds[randselect,],1,randfunction)
landscape<-ppp(x=cds$xx,y=cds$yy,window=owin(xrange=c(0,dim),yrange=c(0,dim)))
plot1<-ggplot(data.frame(landscape))+geom_point(aes(x=x,y=y))+coord_equal()+theme_minimal()+ggtitle("Rf=0")
plot1
This produces a graph identical to this:
I repeat this process for 4 other values of randmod, i.e.:
set.seed(4)
dim <- 2000
radiusCluster<-100
lambdaParent<-.02
lambdaDaughter<-30
hosts<-900
randmod<-0.25 #change in randmod
delta.t <- 1
iterations <- 1000
sigma <- 0.1
beta <- 1
theta <- 10
b <- .4
numbparents<-rpois(1,lambdaParent*dim)
xxParent<-runif(numbparents,0+radiusCluster,dim-radiusCluster)
yyParent<-runif(numbparents,0+radiusCluster,dim-radiusCluster)
numbdaughter<-rpois(numbparents,(lambdaDaughter))
sumdaughter<-sum(numbdaughter)
theta<-2*pi*runif(sumdaughter)
rho<-radiusCluster*sqrt(runif(sumdaughter))
xx0=rho*cos(theta)
yy0=rho*sin(theta)
xx<-rep(xxParent,numbdaughter)
yy<-rep(yyParent,numbdaughter)
xx<-xx+xx0
yy<-yy+yy0
cds<-data.frame(xx,yy)
is_outlier<-function(x){
x > dim| x < 0
}
cds<-cds[!(is_outlier(cds$xx)|is_outlier(cds$yy)),]
sampleselect<-sample(1:nrow(cds),hosts,replace=F)
cds<-cds%>%slice(sampleselect)
randfunction<-function(x){
x<-runif(length(x),0,dim)
}
randselect<-sample(1:nrow(cds),floor(hosts*randmod),replace=F)
cds[randselect,]<-apply(cds[randselect,],1,randfunction)
landscape<-ppp(x=cds$xx,y=cds$yy,window=owin(xrange=c(0,dim),yrange=c(0,dim)))
plot2<-ggplot(data.frame(landscape))+geom_point(aes(x=x,y=y))+coord_equal()+theme_minimal()+ggtitle("Rf=0.25")
plot2
Producing the graph below:
My problem is this, when I use ggarrange, the graphs become squished together and very unclear.
ggarrange(plot1,plot2,plot3,plot4,plot5,nrow=3,ncol=2)
I've tried other packages such as "egg" and "cowplot" to produce a graph that is at least reasonable in the plotting frame but without success. I have also tried:
ggsave("arrange.png", arrangeGrob(grobs = l))
But this also produces the same squished plot. Is it possible to either increase the scale of the plots within the equivalent of ggarrange, or possibly save the plots to a separate file that will maintain their original size?
I need to present this information clearly so that is why the graph as it stands is unacceptable.
Try with patchwork:
library(patchwork)
#Code
G <- wrap_plots(list(plot1,plot2,plot3,plot4,plot5),nrow=3,ncol=2)
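One likely cause of the squished look is the size of the plotting device rather than the arranging function, so it may help to save the combined figure with explicit dimensions. The sketch below uses five stand-in plots built from hypothetical random data (substitute plot1 to plot5); the 10 x 15 inch size and the file name are assumptions to adjust.

```r
library(ggplot2)
library(patchwork)

# Five stand-in plots (hypothetical data; replace with plot1 ... plot5).
plots <- lapply(1:5, function(i) {
  d <- data.frame(x = runif(50, 0, 2000), y = runif(50, 0, 2000))
  ggplot(d, aes(x = x, y = y)) + geom_point() + coord_equal() +
    theme_minimal() + ggtitle(paste0("Plot ", i))
})

G <- wrap_plots(plots, nrow = 3, ncol = 2)

# Roughly 5 x 5 inches per panel: 2 columns x 3 rows -> 10 x 15 inches.
out_file <- file.path(tempdir(), "arranged.pdf")
ggsave(out_file, plot = G, width = 10, height = 15, units = "in")
```

Changing the file extension to .png writes a PNG instead. To keep every plot at its original size, calling ggsave() once per element of plots is another option.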
From an experiment I do, I get large files with time series data. Each column represents one series that I would like to plot in a graph. The ranges of the X and Y axis are not important as I only need it as an overview.
The problem is that I have 150-300 columns (and 400 rows) per data frame, and I am not able to figure out how to plot more than 10 graphs at once.
library(ggplot2)
library(reshape2)
csv <- read.csv(file = "CSV-File-path", header = F, sep = ";", dec = ".")[,1:10]
df <- as.data.frame(csv)
plot.ts(df)
The moment I change [,1:10] to [,1:11] I get an error:
Error in plotts(x = x, y = y, plot.type = plot.type, xy.labels =
xy.labels, : cannot plot more than 10 series as "multiple"
Ideally I would like the output to be a multi-page PDF file with at least 10 graphs per page. I am fairly new to R and hope you are able to help me.
And here is a ggplot2 way to do it:
library(ggplot2)
library(reshape2)
nrow <- 200
ncol <- 24
df <- data.frame(matrix(rnorm(nrow*ncol),nrow,ncol))
# use all columns and shorten the measure name to mvar
mdf <- melt(df,id.vars=c(),variable.name="mvar")
gf <- ggplot(mdf,aes(value,fill=mvar)) +
geom_histogram(binwidth=0.1) +
facet_grid(mvar~.)
print(gf) # print it out so we can see it
ggsave("gplot.pdf",plot=gf) # now save it as a PDF
This is what the plot looks like:
Here is one way to do it. This one groups the columns in groups of 5 and then writes them as separate pages in a single PDF. But if it were me I would be using ggplot2 and doing them in a single plot.
nrow <- 18
ncol <- 20
df <- data.frame(matrix(runif(nrow*ncol),nrow,ncol))
plots <- list()
ngroup <- 5
icol <- 1
while(icol<=ncol(df)){
print(icol)
print(length(plots))
ecol <- min(icol+ngroup-1,ncol(df))
plot.ts(df[,icol:ecol])
plots[[length(plots)+1]] <- recordPlot()
icol <- ecol+1
}
graphics.off()
pdf('plots.pdf', onefile=TRUE)
for (p in plots) {
replayPlot(p)
}
graphics.off()
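A shorter variant of the same idea writes straight to the PDF device, with 10 series per page as asked in the question. This is only a sketch: the simulated data frame stands in for the CSV, and per_page, col_groups and out_file are names chosen for the example.

```r
# Hypothetical stand-in for the CSV: 400 rows, 35 series.
set.seed(1)
df <- data.frame(matrix(rnorm(400 * 35), nrow = 400, ncol = 35))

per_page <- 10  # plot.ts() draws at most 10 series as "multiple"
col_groups <- split(seq_len(ncol(df)), ceiling(seq_len(ncol(df)) / per_page))

out_file <- file.path(tempdir(), "series.pdf")
pdf(out_file, onefile = TRUE)
for (cols in col_groups) {
  # drop = FALSE keeps a data frame even if the last group has one column
  plot.ts(df[, cols, drop = FALSE],
          main = sprintf("Columns %d-%d", min(cols), max(cols)))
}
dev.off()
```

Each pass through the loop starts a new page, so 35 columns give four pages (10, 10, 10 and 5 series).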
I would like to put a long legend into two columns and I am not having any success. Here's the code that I'm using with the solution found elsewhere which does not work for geom='area', though it works for my other plots. The plot that I do get from the code below looks like:
So how do I plot Q1 with the legend in two columns please?
NVER <- 10
NGRID <- 20
MAT <- matrix(NA, nrow=NVER, ncol=NGRID)
gsd <- 0.1 # standard deviation of the Gaussians
verlocs <- seq(from=0, to=1, length.out=NVER)
thegrid <- seq(from=0, to=1, length.out=NGRID)
# create a mixture of Gaussians with modes spaced evenly on 0 to 1
# i.e. the first mode is at 0 and the last mode is at 1
for (i in 1:NVER) {
# add the shape of gaussian i
MAT[i,] <- dnorm(thegrid, verlocs[[i]], sd=gsd)
}
M2 <- MAT/rowSums(MAT)
colnames(M2) <- as.character(thegrid)
# rownames(M2) <- as.character(verlocs)
library(reshape2)
D2 <- melt(M2)
# head(D2)
# str(D2)
D2$Var1 <- ordered(D2$Var1)
library(ggplot2)
Q1 <- qplot(Var2, value, data=D2, order=Var1, fill=Var1, geom='area')
Q1
# ggsave('sillyrainbow.png')
# now try the stackoverflow guide() solution
Q1 + guides(col=guide_legend(ncol=2)) # try but fail to put the legend in two columns!
Note that the solution from "Creating columns within a legend list while using ggplot in R code" is incorporated above, and unfortunately it does not work!
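You are referring to the wrong guide. The legend in this plot belongs to the fill aesthetic, not col (colour), so the number of columns has to be set on fill: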
Q1 + guides(fill=guide_legend(ncol=2))
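For reference, here is a self-contained sketch of the same fix written with ggplot() instead of qplot(), which newer ggplot2 releases mark as deprecated. The data frame D is a hypothetical stand-in for D2, built from ten Gaussian bumps on a 20-point grid.

```r
library(ggplot2)

# Hypothetical stand-in data: 10 groups stacked over a 20-point grid.
D <- expand.grid(grid = seq(0, 1, length.out = 20), group = factor(1:10))
D$value <- dnorm(D$grid, mean = (as.integer(D$group) - 1) / 9, sd = 0.1)

Q <- ggplot(D, aes(x = grid, y = value, fill = group)) +
  geom_area() +
  guides(fill = guide_legend(ncol = 2))  # the legend is keyed on fill, so target fill
Q
```

The rule of thumb is that guides() takes the name of the aesthetic that created the legend, so a legend produced by fill = ... is adjusted through fill, and one produced by colour = ... through colour.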
I would like to create 4 plots which show 4 different conditions in a simulation. The 4 conditions in the simulation are iterated using a for loop. What I would like to do is:
for (cond in 1:4){
1.RUN SIMULATION
2.PLOT RESULTS
}
In the end I would like to have 4 plots arranged on a grid. With plot() I can just use par(mfrow) and the plots would be added automatically. Is there a way to do the same with ggplot?
I am aware that I could use grid.arrange() but that would require storing the plots in separate objects, plot1...plot5. But it's not possible to do:
for (cond in 1:4){
1. run simulation
2. plot[cond]<-ggplot(...)
}
I cannot give separate names to the plots, like plot1, plot2, plot3 within the loop.
You could use gridExtra package:
library(gridExtra)
library(ggplot2)
p <- list()
for(i in 1:4){
p[[i]] <- ggplot(YOUR DATA, ETC.)
}
do.call(grid.arrange,p)
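A runnable version of that loop might look like the sketch below. The simulation is a placeholder (a random walk whose step size depends on the condition), and the names sim and plots are made up for the example; grid.arrange() also accepts the list directly through its grobs argument.

```r
library(ggplot2)
library(gridExtra)

plots <- vector("list", 4)
for (cond in 1:4) {
  # 1. run the simulation (placeholder: random walk with sd depending on cond)
  sim <- data.frame(step = 1:100, value = cumsum(rnorm(100, sd = cond)))
  # 2. store the plot in the list; the data frame is kept inside the plot object
  plots[[cond]] <- ggplot(sim, aes(x = step, y = value)) +
    geom_line() +
    ggtitle(paste("Condition", cond))
}

# Arrange the four stored plots on a 2 x 2 grid
grid.arrange(grobs = plots, ncol = 2)
```

Passing the data frame to ggplot() inside the loop matters: each plot keeps its own copy of the data, so later iterations do not overwrite earlier plots.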
I would use faceting in this case. In my experience, explicitly arranging sub-plots is rarely needed in ggplot2. A mockup example will probably illustrate my point better:
run_model = function(id) {
data.frame(x_values = 1:1000,
y_values = runif(1000),
id = sprintf('Plot %d', id))
}
df = do.call('rbind', lapply(1:4, run_model))
head(df)
x_values y_values id
1 1 0.7000696 Plot 1
2 2 0.3992786 Plot 1
3 3 0.2718229 Plot 1
4 4 0.4049928 Plot 1
5 5 0.4158864 Plot 1
6 6 0.1457746 Plot 1
Here, id is the column that specifies which model run a value belongs to. Plotting it can simply be done using:
library(ggplot2)
ggplot(df, aes(x = x_values, y = y_values)) + geom_point() + facet_wrap(~ id)
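Another option is to use the multiplot function. It is not part of ggplot2: its definition is given on the Cookbook for R page linked below and has to be sourced first: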
library(ggplot2)
p <- list()
for(i in 1:4){
p[[i]] <- ggplot(YOUR DATA, ETC.)
}
do.call(multiplot,p)
More information about that - http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_%28ggplot2%29/
Is it possible to plot pairs of columns in a single plot with a loop? For example, if I have a data frame of time series with 10 columns (x1, x2.. x10), I would like to create 5 plots: 1st plot will display x1 and x2, the 2nd plot would display x3 and x4 and so on.
Any plotting method would be useful, (zoo, lattice, ggplot2).
I got stuck at creating a loop to plot a single variable:
set.seed(1)
x<- data.frame(replicate(10,rnorm(10, mean = 0, sd = 1)))
cols <- seq(1,10)
library(zoo)
z <- read.zoo(x)
for (i in cols) {
plot(z[,i], screen = 1)
}
Thanks in advance.
How about this with ggplot2 and reshape2:
require(reshape2)
require(ggplot2)
m<-melt(matrix(z,10))
m$facet<-cut(m$Var2,c(0,2,4,6,8,10))
ggplot(m)+geom_line(aes(x=Var1,y=value,group=Var2,color=factor(Var2)))+facet_wrap(~ facet)
It can be done in a single line without a loop, as shown below; the col argument makes the odd series black and the even series red. Note that z in the question has 9 columns (since the first column of x is used as the time index), so we use a 10-column z below instead, which is likely what was intended.
library(zoo)
# test data
set.seed(123); z <- zoo(matrix(rnorm(250), 25)); colnames(z) <- make.names(1:10)
plot(z, screen = rep(colnames(z)[c(TRUE, FALSE)], each = 2), col = 1:2)
The output is shown below. To produce a single column add the argument nc=1 or to produce a lattice plot replace plot with xyplot.
ADDED: lattice solution.
Like this? Although I am not clear on how you want to plot it.
par(mfrow=c(1,5))
for (i in seq(1,10,by=2)){
plot(x[,i],x[,i+1])
}
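The loop above draws each pair as a scatter plot of one column against the other. If the intent is instead to show both series of a pair as lines over time, a base R sketch with matplot() could look like this; it assumes the row order is the time index, and starts is a name chosen for the example.

```r
set.seed(1)
x <- data.frame(replicate(10, rnorm(10, mean = 0, sd = 1)))

op <- par(mfrow = c(1, 5))           # five panels side by side
starts <- seq(1, ncol(x), by = 2)    # first column of each pair: 1, 3, 5, 7, 9
for (i in starts) {
  # Draw columns i and i + 1 as two lines in the same panel
  matplot(as.matrix(x[, c(i, i + 1)]), type = "l", lty = 1, col = 1:2,
          xlab = "time", ylab = "value",
          main = paste(names(x)[i], "and", names(x)[i + 1]))
}
par(op)                              # restore the previous layout
```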