Prevent a plot to be overwrite in a for loop - r

I am trying to create three different plots in a for loop and then plotting them together in the same graph.
I know that some questions regarding this topic have already been asked. But I do not know what I am doing wrong. Why is my plot being overwritten.
Nevertheless, I tried both solutions (creating a list or using assign function) and I do not know why I get my plot overwriten at the end of the loop.
So, the first solution is to create a list:
library(gridExtra)
library(ggplot2)
out<-list()
for (i in c(1,2,4)){
print(i)
name= paste("WT.1",colnames(WT.1#meta.data[i]), sep=" ")
print(name)
out[[length(out) + 1]] <- qplot(NEW.1#meta.data[i],
geom="density",
main= name)
print(out[[i]])
}
grid.arrange(out[[1]], out[[2]], out[[3]], nrow = 2)
When I print the plot inside the loop, I get what I want...but of course they are not together.
First Plot
When I plot them all together at the end, I get the same plot for all of the three: the last Plot I did.
All together
This is the second option: assign function. I have exactly the same problem.
for ( i in c(1,2,4)) {
assign(paste("WT.1",colnames(WT.1#meta.data[i]),sep="."),
qplot(NEW.1#meta.data[i],geom="density",
main=paste0("WT.1",colnames(WT.1#meta.data[i]))))
}

You're missing to dev.off inside the loop for every iteration. Reproducible code below:
library(gridExtra)
library(ggplot2)
out<-list()
for (i in c(1,2,3)){
print(i)
out[[i]] <- qplot(1:100, rnorm(100), colour = runif(100))
print(out[[i]])
dev.off()
}
grid.arrange(out[[1]], out[[2]], out[[3]], nrow = 2)

Related

How can automate this section of grid.arrange function in R

I have a code chunk that looks like this:
p1 <- plotQC(sce_1, type = "highest-expression")
p2 <- plotQC(sce_2, type = "highest-expression")
p3 <- plotQC(sce_3, type = "highest-expression")
p4 <- plotQC(sce_4, type = "highest-expression")
grid.arrange(p1,p2,p3,p4,ncol=2)
This works very well and has no errors or warnings.
I want to put a loop around. What I have done is
for (i in 1:length(paths))
assign(paste0("p",i), plotQC(get(paste0("sce_",i)), type = "highest-expression"))
grid.arrange(p1,p2,p3,p4,ncol=2)
The second chunk also works very well.However, I would like to make grid.arrange work without manually telling it about p1,p2,p3,p4 but it should detect it the number of p objects.
How can I do this? I am working in R markdown.
If you want to keep with your loop, you could also try this:
p.list <- list()
for (i in 1:length(paths)){
p <- plotQC(get(paste0("sce_",i)), type = "highest-expression")
p.list[[i]] <- p
}
cowplot::plot_grid(plotlist = p.list)
Here, instead of assigning the plot, we save it in a list called p.list, then we build your grid from the list. I used plot_grid from cowplot because it accepts a list of plots as an argument, and I find it easier to work with the plot grid overall.
While all this works, I think you'll agree that the following is better as it in no place requires you to repeat any of the lines or manually specify some numbers:
sce <- list(sce_1, sce_2, sce_3, sce_4)
p <- lapply(sce, plotQC, type = "highest-expression")
do.call(grid.arrange, c(p, ncol = 2))
In particular, working with lists is much better in such cases. For this purpose you probably should produce sce_1, ..., sce_4 also differently, as list elements.

Understanding the duplication of plots in cowplot plot_grid

In desperate need of a sanity check. I am struggling to see why the result of plot_grid (cowplot) of N plots in my code is producing N identical plots. From the list I provide, I've taken out each data frame to verify that each plot should be different, however, when I pass in the complete list to plot_grid they all look identical.
p <- vector("list",length(dataList))
for(i in 1:length(dataList)) {
df <- dataList[[i]]
p[[i]] <- ggplot(df, aes(df$base)) + geom_bar()
}
multi <- plot_grid(plotlist=p, align="hv")
save_plot(paste("data_freqs.tiff",sep=""), multi, dpi=300, base_aspect_ratio=1.5)
For example, when type the following I can see the data is different:
a<-dataList[[1]]
b<-dataList[[2]]
sum(a$base=="T")
>1245
sum(b$base=="T")
>1034
However, I end up with multiple plots of identical T values (all fixed to 1245).
Any help is much appreciated.
Thanks

Retrieve facet labels from a ggplot or a gtable/gTree/grob/gDesc object

I have data I'm plotting using ggplot's facet_grid:
My data:
species <- c("spcies1","species2")
conditions <- c("cond1","cond2","cond3")
batches <- 1:6
df <- expand.grid(species=species,condition=conditions,batch=batches)
set.seed(1)
df$y <- rnorm(nrow(df))
df$replicate <- 1
df$col.fill <- paste(df$species,df$condition,df$batch,sep=".")
My plot:
integerBreaks <- function(n = 5, ...)
{
library(scales)
breaker <- pretty_breaks(n, ...)
function(x){
breaks <- breaker(x)
breaks[breaks == floor(breaks)]
}
}
library(ggplot2)
p <- ggplot(df,aes(x=replicate,y=y,color=col.fill))+
geom_point(size=3)+facet_grid(~col.fill,scales="free_x")+
scale_x_continuous(breaks=integerBreaks())+
theme_minimal()+theme(legend.position="none",axis.title=element_text(size=8))
which gives:
Obviously the labels are long and come out pretty messed up in the figure so I was wondering if there's a way edit these labels in the ggplot object (p) or the gtable/gTree/grob/gDesc object (ggplotGrob(p)).
I am aware that one way of getting better labels is to use the labeller function when the ggplot object is created but in my case I'm specifically looking for a way to edit the facet labels after the ggplot object has been created.
As I mentioned in the comments, the facet names are nested quite deeply within the gtable that ggplotGrob() gives you. However, this is still possible and since the OP explicitly wants to edit them after being plotted, you can do this with:
library(grid)
gg <- ggplotGrob(p)
edited_grobs <- mapply(FUN = function(x, y) {
x[["grobs"]][[1]][["children"]][[2]][["children"]][[1]][["label"]] <- y
return(x)
},
gg$grobs[which(grepl("strip-t",gg$layout$name))],
unique(gsub("cond","c", df$condition)),
SIMPLIFY = FALSE)
gg$grobs[which(grepl("strip-t",gg$layout$name))] <- edited_grobs
grid.draw(gg)
Note that this extracts all the strips using gg$grobs[which(grepl("strip-t",gg$layout$name))] and passes them to the mapply to be reset with the gsub(...) that OP specified in their comment.
In general, if you want to access just one of the text labels, there is a very similar structure which I made use of in my mapply:
num_to_access <- 1
gg$grobs[which(grepl("strip-t",gg$layout$name))][[num_to_access]][["grobs"]][[1]][["children"]][[2]][["children"]][[1]]$label
So to access the 4th label for example all you would need to do is change num_to_acces to be 4. Hope this helps!

Save plots as R objects and displaying in grid

In the following reproducible example I try to create a function for a ggplot distribution plot and saving it as an R object, with the intention of displaying two plots in a grid.
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
output<-list(distribution,var1,dat)
return(output)
}
Call to function:
set.seed(100)
df <- data.frame(x = rnorm(100, mean=10),y =rep(1,100))
output1 <- ggplothist(dat=df,var1='x')
output1[1]
All fine untill now.
Then i want to make a second plot, (of note mean=100 instead of previous 10)
df2 <- data.frame(x = rep(1,1000),y = rnorm(1000, mean=100))
output2 <- ggplothist(dat=df2,var1='y')
output2[1]
Then i try to replot first distribution with mean 10.
output1[1]
I get the same distibution as before?
If however i use the information contained inside the function, return it back and reset it as a global variable it works.
var1=as.numeric(output1[2]);dat=as.data.frame(output1[3]);p1 <- output1[1]
p1
If anyone can explain why this happens I would like to know. It seems that in order to to draw the intended distribution I have to reset the data.frame and variable to what was used to draw the plot. Is there a way to save the plot as an object without having to this. luckly I can replot the first distribution.
but i can't plot them both at the same time
var1=as.numeric(output2[2]);dat=as.data.frame(output2[3]);p2 <- output2[1]
grid.arrange(p1,p2)
ERROR: Error in gList(list(list(data = list(x = c(9.66707664902549, 11.3631137069225, :
only 'grobs' allowed in "gList"
In this" Grid of multiple ggplot2 plots which have been made in a for loop " answer is suggested to use a list for containing the plots
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
plot(distribution)
pltlist <- list()
pltlist[["plot"]] <- distribution
output<-list(pltlist,var1,dat)
return(output)
}
output1 <- ggplothist(dat=df,var1='x')
p1<-output1[1]
output2 <- ggplothist(dat=df2,var1='y')
p2<-output2[1]
output1[1]
Will produce the distribution with mean=100 again instead of mean=10
and:
grid.arrange(p1,p2)
will produce the same Error
Error in gList(list(list(plot = list(data = list(x = c(9.66707664902549, :
only 'grobs' allowed in "gList"
As a last attempt i try to use recordPlot() to record everything about the plot into an object. The following is now inside the function.
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
plot(distribution)
distribution<-recordPlot()
output<-list(distribution,var1,dat)
return(output)
}
This function will produce the same errors as before, dependent on resetting the dat, and var1 variables to what is needed for drawing the distribution. and similarly can't be put inside a grid.
I've tried similar things like arrangeGrob() in this question "R saving multiple ggplot2 plots as R-object in list and re-displaying in grid " but with no luck.
I would really like a solution that creates an R object containing the plot, that can be redrawn by itself and can be used inside a grid without having to reset the variables used to draw the plot each time it is done. I would also like to understand wht this is happening as I don't consider it intuitive at all.
The only solution I can think of is to draw the plot as a png file, saved somewhere and then have the function return the path such that i can be reused - is that what other people are doing?.
Thanks for reading, and sorry for the long question.
Found a solution
How can I reference the local environment within a function, in R?
by inserting
localenv <- environment()
And referencing that in the ggplot
distribution <- ggplot(data=dat, aes(dat[,var1]),environment = localenv)
made it all work! even with grid arrange!

ggplot2 : printing multiple plots in one page with a loop

I have several subjects for which I need to generate a plot, as I have many subjects I'd like to have several plots in one page rather than one figure for subject.
Here it is what I have done so far:
Read txt file with subjects name
subjs <- scan ("ListSubjs.txt", what = "")
Create a list to hold plot objects
pltList <- list()
for(s in 1:length(subjs))
{
setwd(file.path("C:/Users/", subjs[[s]])) #load subj directory
ifile=paste("Co","data.txt",sep="",collapse=NULL) #Read subj file
dat = read.table(ifile)
dat <- unlist(dat, use.names = FALSE) #make dat usable for ggplot2
df <- data.frame(dat)
pltList[[s]]<- print(ggplot( df, aes(x=dat)) + #save each plot with unique name
geom_histogram(binwidth=.01, colour="cyan", fill="cyan") +
geom_vline(aes(xintercept=0), # Ignore NA values for mean
color="red", linetype="dashed", size=1)+
xlab(paste("Co_data", subjs[[s]] , sep=" ",collapse=NULL)))
}
At this point I can display the single plots for example by
print (pltList[1]) #will print first plot
print(pltList[2]) # will print second plot
I d like to have a solution by which several plots are displayed in the same page, I 've tried something along the lines of previous posts but I don't manage to make it work
for example:
for (p in seq(length(pltList))) {
do.call("grid.arrange", pltList[[p]])
}
gives me the following error
Error in arrangeGrob(..., as.table = as.table, clip = clip, main = main, :
input must be grobs!
I can use more basic graphing features, but I d like to achieve this by using ggplot. Many thanks for consideration
Matilde
Your error comes from indexing a list with [[:
consider
pl = list(qplot(1,1), qplot(2,2))
pl[[1]] returns the first plot, but do.call expects a list of arguments. You could do it with, do.call(grid.arrange, pl[1]) (no error), but that's probably not what you want (it arranges one plot on the page, there's little point in doing that). Presumably you wanted all plots,
grid.arrange(grobs = pl)
or, equivalently,
do.call(grid.arrange, pl)
If you want a selection of this list, use [,
grid.arrange(grobs = pl[1:2])
do.call(grid.arrange, pl[1:2])
Further parameters can be passed trivially with the first syntax; with do.call care must be taken to make sure the list is in the correct form,
grid.arrange(grobs = pl[1:2], ncol=3, top=textGrob("title"))
do.call(grid.arrange, c(pl[1:2], list(ncol=3, top=textGrob("title"))))
library(gridExtra) # for grid.arrange
library(grid)
grid.arrange(pltList[[1]], pltList[[2]], pltList[[3]], pltList[[4]], ncol = 2, main = "Whatever") # say you have 4 plots
OR,
do.call(grid.arrange,pltList)
I wish I had enough reputation to comment instead of answer, but anyway you can use the following solution to get it work.
I would do exactly what you did to get the pltList, then use the multiplot function from this recipe. Note that you will need to specify the number of columns. For example, if you want to plot all plots in the list into two columns, you can do this:
print(multiplot(plotlist=pltList, cols=2))

Resources