Join two graphs (one beside other) GWAS - r

I have this code to generate 2 different types of graphs (manhattan plot and a QQ plot)
# Set up the work directory in which all data is gonna be extracted
gwasResults2 = read.csv("DWStem.csv") #Change name of the file
library(qqman) #Run to create plots
library(cowplot)
library(extrafont)
library(grid)
library(gridExtra)
MH <- manhattan(gwasResults2, chr="CHR", bp="BP", snp="SNP", p="P",
col = c("chartreuse2", "darkorange1", "gold1"),ylim=c(0,-log10(1e-06)), chrlabs = NULL,
suggestiveline = -log10(1e-03), genomewideline = -log10(1e-05),
highlight = NULL, logp = TRUE, annotatePval = NULL,
annotateTop = TRUE, main='DWStem')
QQ <- qq(gwasResults2$P, main='DWStem', pch = 24, cex=1, col="gold", bg="brown1", lwd=1, xlim=c(0,5), ylim=c(0,5)) #Run to create qqplot $P need to be there!
Total <- plot_grid(MH, QQ, labels = c("a", "b"), ncol = 2)
But apparently I can't put them side by side, because I get the following error:
Error in plot_to_gtable(x) :
Argument needs to be of class "ggplot", "gtable", "grob", "recordedplot", or a function that plots to an R graphicsdevice when called, but is a list
Any idea of how I can solve it?
In advance, thank you! :D

The functions manhattan and qq produce base graphics, not grid graphics. You need to use base graphics methods for the layout. For example, using reproducible data,
par(mfrow=c(1,2))
manhattan(gwasResults, main = "a")
qq(gwasResults$P, main = "b")
produces
If your plots used grid graphics (produced by grid, ggplot2 or lattice), your method would have worked. If some use grid and some use base graphics, it is possible to mix them in the same display, but it is not easy. See the gridBase and gridGraphics packages.
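If the two panels should not share the width equally, base graphics also offers layout(). The following is a minimal sketch under stated assumptions: the p-values and positions are simulated, plain plot() calls stand in for manhattan() and qq() so the block runs without qqman, and pdf(NULL) is used only so that it runs without opening a window.

```r
# Sketch: base-graphics layout with unequal panel widths (simulated data).
set.seed(1)
pvals <- runif(1000)        # hypothetical p-values
pos   <- seq_along(pvals)   # hypothetical SNP positions

pdf(NULL)                   # null device; replace with png()/pdf("file") to save

# One row, two columns; the left panel is twice as wide as the right one.
n_panels <- layout(matrix(1:2, nrow = 1), widths = c(2, 1))

# Left panel: Manhattan-style scatter of -log10(p) against position
plot(pos, -log10(pvals), pch = 20,
     xlab = "Position", ylab = "-log10(p)", main = "a")

# Right panel: QQ plot of observed against expected -log10(p)
expected <- -log10(ppoints(length(pvals)))
observed <- -log10(sort(pvals))
plot(expected, observed,
     xlab = "Expected -log10(p)", ylab = "Observed -log10(p)", main = "b")
abline(0, 1, col = "red")   # reference line y = x

dev.off()
```

With qqman loaded, the two plot() calls could be swapped for manhattan(...) and qq(...), since both draw with base graphics.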
EDITED to add:
If you have gridGraphics installed, then it's actually not so bad to mix the base graphics with grid graphics. You just set MH and QQ to be functions producing the graphs, rather than the graphs themselves. For example,
MH <- function() { manhattan(gwasResults) }
QQ <- function() { qq(gwasResults$P) }
Total <- plot_grid(MH, QQ, labels = c("a", "b"), ncol = 2)
When you print Total, you get this:
The graphs have lost their y axis labels, but otherwise look okay.

Related

Adding legend to venn diagram

I am using the VennDiagram library to plot Venn diagrams. But its venn.diagram function has no option to add a legend, and set names are displayed on or close to the sets themselves.
library(VennDiagram)
x <- list(c(1,2,3,4,5),c(4,5,6,7,8,9,10))
venn.diagram(x,filename="test.png",fill=c("#80b1d3","#b3de69"),
category.names=c("A","B"),height=500,width=500,res=150)
And with many sets, overplotting names is an issue and I would like to have a legend instead. The function is built on grid graphics and I have no idea how grid plotting works. But, I am attempting to add a legend anyway.
Looking into the venn.diagram function, I find that the final plotted object is grob.list, a gList object, and it is plotted using grid.draw().
png(filename = filename, height = height, width = width,
units = units, res = resolution)
grid.draw(grob.list)
dev.off()
I figured out that I could create a legend by modifying the venn.diagram function with the code below.
cols <- c("#80b1d3","#b3de69")
lg <- legendGrob(labels=category.names, pch=rep(19,length(category.names)),
gp=gpar(col=cols, fill="gray"),byrow=TRUE)
Draw the object lg
png(filename = filename, height = height, width = width,
units = units, res = resolution)
grid.draw(lg)
dev.off()
to get a legend
How do I put the venn diagram (gList) and the legend (gTree,grob) together in a usable way? I am hoping to get something like base plot style:
or the ggplot style
If you are allowed to use other packages than VennDiagram, I suggest the following code using the eulerr package:
library(eulerr)
vd <- euler(c(A = 5, B = 3, "A&B" = 2))
plot(vd, counts = TRUE,lwd = 2,
fill=c("#80b1d3","#b3de69"),
opacity = .7,
key = list( space= "right", columns=1))
With key you define the legend location and appearance.
If you want to continue using the VennDiagram package and learn a bit of grid on the way:
Prepare diagram and legend
library(VennDiagram)
x <- list(c(1,2,3,4,5),c(4,5,6,7,8,9,10))
diag <- venn.diagram(x,NULL,fill=c("#80b1d3","#b3de69"),
category.names=c("A","B"),height=500,width=500,res=150)
cols <- c("#80b1d3","#b3de69")
lg <- legendGrob(labels=c("A","B"), pch=rep(19,length(c("A","B"))),
gp=gpar(col=cols, fill="gray"),
byrow=TRUE)
Transform the diagram to a gTree
(I'd love to find a better way if anyone knows one)
library(gridExtra)
g <- gTree(children = gList(diag))
Plot the two gTrees side by side
gridExtra::grid.arrange(g, lg, ncol = 2, widths = c(4,1))
Or one above the other
grid.arrange(g, lg, nrow = 2, heights = c(4,1))
I have found a solution as well, but the venn diagram region is not square aspect ratio. And the legend is not spaced ideally.
library(gridGraphics)
png("test.png",height=600,width=600)
grab_grob <- function(){grid.echo();grid.grab()}
grid.draw(diag)
g <- grab_grob()
grid.arrange(g,lg,ncol=2,widths=grid::unit(c(0.7,0.3),"npc"))
dev.off()
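One possible way to keep the diagram region square is a grid layout with respect = TRUE, which forces relative ("null") widths and heights to use the same physical scale. This is only a sketch under assumptions: two circleGrob objects stand in for the gList returned by venn.diagram() so that the block runs with grid alone, and the 4:1 width split is an arbitrary choice.

```r
library(grid)

cols <- c("#80b1d3", "#b3de69")

# Stand-in for the gList returned by venn.diagram(x, NULL, ...)
diag_grobs <- gList(
  circleGrob(x = 0.4, y = 0.5, r = 0.3, gp = gpar(fill = cols[1], alpha = 0.7)),
  circleGrob(x = 0.6, y = 0.5, r = 0.3, gp = gpar(fill = cols[2], alpha = 0.7))
)

lg <- legendGrob(labels = c("A", "B"), pch = rep(19, 2),
                 gp = gpar(col = cols, fill = "gray"), byrow = TRUE)

# 4 x 4 "null" units for the diagram cell, 1 x 4 for the legend cell;
# respect = TRUE keeps the first cell square whatever the device shape.
lay <- grid.layout(nrow = 1, ncol = 2,
                   widths  = unit(c(4, 1), "null"),
                   heights = unit(4, "null"),
                   respect = TRUE)

pdf(NULL)                                   # null device; use png("test.png") to save
grid.newpage()
pushViewport(viewport(layout = lay))

pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
grid.draw(diag_grobs)                       # diagram in the square cell
popViewport()

pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 2))
grid.draw(lg)                               # legend in the narrow cell
popViewport(2)                              # back to the top-level viewport
dev.off()
```

With VennDiagram installed, diag_grobs would be replaced by the diag object created earlier.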

Save multiple ggplot2 plots as R object in list and re-displaying in grid

I would like to save multiple plots (with ggplot2) to a list during a large for-loop. And then subsequently display the images in a grid (with grid.arrange)
I have tried two solutions to this:
1 storing it in a list, like so:
pltlist[["qplot"]] <- qplot
however, for some reason this does not save the plot correctly.
So I resorted to a second strategy, recordPlot().
This was able to save the plot correctly, but I was unable to use it in a grid.
Reproducible example:
require(ggplot2);require(grid);require(gridExtra)
df <- data.frame(x = rnorm(100),y = rnorm(100))
histoplot <- ggplot(df, aes(x=x)) + geom_histogram(aes(y=..density..),binwidth=.1,colour="black", fill="white")
qplot <- qplot(sample = df$y, stat="qq")
pltlist <- list()
pltlist[["qplot"]] <- qplot
pltlist[["histoplot"]] <- histoplot
grid.arrange(pltlist[["qplot"]],pltlist[["histoplot"]], ncol=2)
The code above works, but in my actual code it produces the wrong graph.
Then I tried recordPlot()
print(histoplot)
c1 <- recordPlot()
print(qplot)
c2 <- recordPlot()
I am able to display all the plots individually
but grid.arrange produces an error:
grid.arrange(replayPlot(c1),replayPlot(c2), ncol=2) # = Error
Error in gList(list(wrapvp = list(x = 0.5, y = 0.5, width = 1, height = 1, :
only 'grobs' allowed in "gList"
In the thread "Saving grid.arrange() plot to file" they discuss a solution that uses arrangeGrob() instead:
arrangeGrob(c1, c1, ncol=2) # Error
Error in vapply(x$grobs, as.character, character(1)) :
values must be length 1,
but FUN(X[[1]]) result is length 3
I am forced to use recordPlot() instead of saving to a list, because in my actual code the plot stored in the list is not the same as the plot drawn immediately. Unfortunately I cannot reproduce that problem in a small example, sorry.
In my actual code I run a large for-loop over several variables, compute a correlation for each, and make scatterplots, which I name according to their significance level. I then want to re-display the significant plots in a grid, in a dynamic knitr report.
I am aware that I could just re-plot the significant plots after the for-loop instead of saving them (I can't save as png while running knitr either). However, I would like to find a way to dynamically save the plots as R objects and then replot them in a grid afterwards.
Thanks for Reading
"R version 3.2.1"
Windows 7 64bit - RStudio - Version 0.99.652
attached base packages:
[1] grid grDevices datasets utils graphics stats methods base
other attached packages:
[1] gridExtra_2.0.0 ggplot2_1.0.1
I can think of two solutions.
1. If your goal is to just save the list of plots as R objects, I recommend:
saveRDS(object = pltlist, file = "file_path")
This way, when you wish to reload these graphs, you can just use readRDS(). You can then put them in cowplot or grid.arrange. This command works for all lists and R objects.
One caveat to this approach: if the settings or labeling for ggplot2 depend on things in the environment (not the data, but settings for point size, shape, or coloring) rather than on the ggplot2 call used to make the graph, your graphs won't work until you restore those dependencies. One reason to save some dependencies is to modularize the scripts that make the graphs.
Another caveat is performance: From my experience, I found it is actually faster to read in the data and remake individual graphs than load in an RDS file of all the graphs when you have a large number of graphs (100+ graphs).
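A minimal round-trip sketch of the saveRDS()/readRDS() idea, assuming ggplot2 is installed. The data, the list name pltlist, and the temporary file path are illustrative only.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = rnorm(100))   # hypothetical data

# A named list of ggplot objects, as in the question
pltlist <- list(
  histoplot = ggplot(df, aes(x = x)) +
    geom_histogram(binwidth = 0.1, colour = "black", fill = "white")
)

# Save the whole list to disk, then read it back (e.g. in a later session)
rds_path <- tempfile(fileext = ".rds")
saveRDS(object = pltlist, file = rds_path)
restored <- readRDS(rds_path)

# The restored elements are ordinary ggplot objects and can be printed
# or passed to grid.arrange()/plot_grid() as usual.
class(restored$histoplot)
```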
2. If your goal is to save an 'image' or 'picture' of each graph (single and/or multiplot as .png, .jpeg, etc.), and later adjust things in a grid manually outside of R such as powerpoint or photoshop, I recommend:
filenames <- c("Filename_1", "Filename_2") #actual file names you want...
lapply(seq_along(pltlist), function(i) {
ggsave(filename = filenames[i], plot = pltlist[[i]], ...) #use your settings here
})
Settings I like for single plots:
lapply(seq_along(pltlist), function(i) ggsave(
plot = pltlist[[i]],
filename = paste0("plot_", i, "_", ".tiff"), #you can even paste in pltlist[[i]]$labels$title
device = "tiff", width=180, height=180, units="mm", dpi=300, compression = "lzw", #compression for tiff
path = paste0("../Blabla") #must be an existing directory.
))
You may want to take the manual approach if you are very particular about the grid arrangement and you don't have too many figures to make for publication. Otherwise, when you use grid.arrange you'll want to set all the specifications in the plots themselves (adjusting font, increasing axis label size, custom colors, etc.), then adjust the width and height accordingly.
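For the in-session case (collect plots in a loop, then arrange them), here is a minimal sketch assuming ggplot2 (3.0 or later, for the .data pronoun) and gridExtra (2.0 or later, for the grobs argument). The data frame and column names are hypothetical. One common cause of a stored plot differing from the one drawn immediately is that the plot refers to a loop variable that changes later; whether that was the cause in the question is not known, but the sketch guards against it by freezing the variable per iteration.

```r
library(ggplot2)
library(gridExtra)

set.seed(42)
df <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50), y = rnorm(50))

pltlist <- list()
for (v in c("a", "b", "c")) {
  # local() gives each iteration its own copy of the column name, so the
  # stored plot does not change when v changes in later iterations.
  pltlist[[v]] <- local({
    v_now <- v
    ggplot(df, aes(x = .data[[v_now]], y = y)) +
      geom_point() +
      labs(x = v_now, title = paste("y vs", v_now))
  })
}

# Combine the stored plots into one grob; draw it now or later.
combined <- arrangeGrob(grobs = pltlist, ncol = 2, top = "Scatterplots")

pdf(NULL)                 # null device; in knitr simply call grid.draw()
grid::grid.draw(combined)
dev.off()
```

To show only a subset, index the list first, for example arrangeGrob(grobs = pltlist[c("a", "c")], ncol = 2).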
Reviving this post to add multiplot here, as it fits exactly.
require(ggplot2)
mydd <- setNames( data.frame( matrix( rep(c("x","y","z"), each=10) ),
c(rnorm(10), rnorm(10), rnorm(10)) ), c("points", "data") )
# points data
# 1 x 0.733013658
# 2 x 0.218838717
# 3 x -0.008303382
# 4 x 2.225820069
# ...
p1 <- ggplot( mydd[mydd$points == "x",] ) + geom_line( aes( 1:10, data, col=points ) )
p2 <- ggplot( mydd[mydd$points == "y",] ) + geom_line( aes( 1:10, data, col=points ) )
p3 <- ggplot( mydd[mydd$points == "z",] ) + geom_line( aes( 1:10, data, col=points ) )
multiplot(p1,p2,p3, cols=1)
Here multiplot is defined as follows (run the definition before the call above):
multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
library(grid)
# Make a list from the ... arguments and plotlist
plots <- c(list(...), plotlist)
numPlots = length(plots)
# If layout is NULL, then use 'cols' to determine layout
if (is.null(layout)) {
# Make the panel
# ncol: Number of columns of plots
# nrow: Number of rows needed, calculated from # of cols
layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
ncol = cols, nrow = ceiling(numPlots/cols))
}
if (numPlots==1) {
print(plots[[1]])
} else {
# Set up the page
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))
# Make each plot, in the correct location
for (i in 1:numPlots) {
# Get the i,j matrix positions of the regions that contain this subplot
matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
layout.pos.col = matchidx$col))
}
}
}
Result:

Combine phylogenetic tree with x,y graph

I'm trying to arrange a phylogenetic tree onto a graph showing physiological data for a set of related organisms. Something like the picture below. This was put together in PowerPoint from 2 separate graphs. I guess it gets the job done, but I was hoping to create a single image, which I think will be easier to format into a document. I am able to produce the graph I want using ggplot2, and import the tree using ape. I was thinking there should be a way to save the tree as a graphical object and then arrange it with the graph using the grid.arrange function in gridExtra. The problem is that ape won't let me save the tree as a graphical object, e.g.,
p2<-plot(tree, dir = "u", show.tip.label = FALSE)
just plots the tree and when you call p2 it just gives a list of arguments. I'm wondering if anyone has any tips.
Thanks!
I'm not sure if that will work with gtable from CRAN
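require(ggplot2)
require(grid)     # unit(), rectGrob(), viewport functions
require(gridBase)
require(gtable)
require(ape)      # rtree() and the plot method for trees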
p <- qplot(1,1)
g <- ggplotGrob(p)
g <- gtable_add_rows(g, unit(2,"in"), nrow(g))
g <- gtable_add_grob(g, rectGrob(),
t = 7, l=4, b=7, r=4)
grid.newpage()
grid.draw(g)
#grid.force()
#grid.ls(grobs=F, viewports=T)
seekViewport("layout.7-4-7-4")
par(plt=gridPLT(), new=TRUE)
plot(rtree(10), "c", FALSE, direction = "u")
upViewport()
First, I'd like to thank baptiste for ALL his answers, which solved most of my issues with ggplot2.
Second, I had a similar question, which was how to include a tree from ape inside a heatmap obtained with ggplot2. Baptiste made my day, and I thought my simplified version could help.
I used only what was useful for me (removing the call to gtable_add_rows).
library(ape)
tr <- read.tree("mytree.tree")
# heat is the heatmap ggplot, using geom_tile
g <- ggplotGrob(heat)
grid.newpage()
grid.draw(g)
# use oma to reduce the tree so it fits
par(new = TRUE, oma = c(5, 4, 5, 38))
plot(tr)
nodelabels(tr$node.label, cex = 1, frame = "none", col = "black", adj = c(-0.3, 0.5))
add.scale.bar()
# use dev.copy2pdf and not ggsave
dev.copy2pdf(file = "heatmap_prob.pdf")
the result is here

superpose a histogram and an xyplot

I'd like to superpose a histogram and an xyplot representing the cumulative distribution function using r's lattice package.
I've tried to accomplish this with custom panel functions, but can't seem to get it right--I'm getting hung up on one plot being univariate and one being bivariate I think.
Here's an example with the two plots I want stacked vertically:
library(lattice)
set.seed(1)
x <- rnorm(100, 0, 1)
discrete.cdf <- function(x, decreasing=FALSE){
x <- x[order(x,decreasing=decreasing)]
result <- data.frame(rank=1:length(x),x=x)
result$cdf <- result$rank/nrow(result)
return(result)
}
my.df <- discrete.cdf(x)
chart.hist <- histogram(~x, data=my.df, xlab="")
chart.cdf <- xyplot(100*cdf~x, data=my.df, type="s",
ylab="Cumulative Percent of Total")
graphics.off()
trellis.device(width = 6, height = 8)
print(chart.hist, split = c(1,1,1,2), more = TRUE)
print(chart.cdf, split = c(1,2,1,2))
I'd like these superposed in the same frame, rather than stacked.
The following code doesn't work, nor do any of the simple variations of it that I have tried:
xyplot(cdf~x,data=cdf,
panel=function(...){
panel.xyplot(...)
panel.histogram(~x)
})
You were on the right track with your custom panel function. The trick is passing the correct arguments to the panel.* functions. For panel.histogram, this means not passing a formula and supplying an appropriate value to the breaks argument:
EDIT Proper percent values on y-axis and type of plots
xyplot(100*cdf~x,data=my.df,
panel=function(...){
panel.histogram(..., breaks = do.breaks(range(x), nint = 8),
type = "percent")
panel.xyplot(..., type = "s")
})
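A self-contained variant of the same idea, as a sketch assuming only the lattice package. Here the panel function names x and y explicitly and the breaks are computed once from the data rather than from a global variable; the ylim value and the 8 bins are arbitrary choices.

```r
library(lattice)

set.seed(1)
x <- rnorm(100)

# Empirical CDF on the sorted data: rank / n
my.df <- data.frame(x = sort(x), cdf = seq_along(x) / length(x))

# Histogram breaks computed once from the data
brks <- do.breaks(range(my.df$x), nint = 8)

p <- xyplot(100 * cdf ~ x, data = my.df,
            ylim = c(0, 105),
            ylab = "Percent / cumulative percent of total",
            panel = function(x, y, ...) {
              # Histogram first, so the CDF step line is drawn on top of it
              panel.histogram(x, breaks = brks, type = "percent")
              panel.xyplot(x, y, type = "s")
            })

pdf(NULL)   # null device; drop this line to draw on screen
print(p)
dev.off()
```

Both layers share one y-axis, so bar heights and the step line are both read in percent.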
This answer is just a placeholder until a better answer comes.
The hist() function from the graphics package has an option called add. The following does what you want in the "classical" way:
plot( my.df$x, my.df$cdf * 100, type= "l" )
hist( my.df$x, add= TRUE )

R - save multiplot to file

I’d really appreciate your help with the following problem. I know several ways to save a single plot to a file. My question is: How do I correctly save a multiplot to a file?
To begin with, I’m not an experienced R user. I use ggplot2 to create my plots, and another thing I should probably mention is that I use the RStudio GUI. Using an example from the R Cookbook, I'm able to create multiple plots in one window.
I would like to save this so-called multiplot to a file (preferably as jpeg), but somehow fail to do this.
I’m creating the multiplot as follows:
##define multiplot function
multiplot <- function(..., plotlist=NULL, cols) {
require(grid)
# Make a list from the ... arguments and plotlist
plots <- c(list(...), plotlist)
numPlots = length(plots)
# Make the panel
plotCols = cols # Number of columns of plots
plotRows = ceiling(numPlots/plotCols) # Number of rows needed, calculated from # of cols
# Set up the page
grid.newpage()
pushViewport(viewport(layout = grid.layout(plotRows, plotCols)))
vplayout <- function(x, y)
viewport(layout.pos.row = x, layout.pos.col = y)
# Make each plot, in the correct location
for (i in 1:numPlots) {
curRow = ceiling(i/plotCols)
curCol = (i-1) %% plotCols + 1
print(plots[[i]], vp = vplayout(curRow, curCol ))
}
}
## define subplots (short example here, I specified some more aesthetics in my script)
plot1a <- qplot(variable1,variable2,data=Mydataframe1)
plot1b <- qplot(variable1,variable3,data=Mydataframe1)
plot1c <- qplot(variable1,variable2,data=Mydataframe2)
plot1d <- qplot(variable1,variable3,data=Mydataframe2)
## plot in one frame
Myplot <- multiplot(plot1a,plot1b,plot1c,plot1d, cols=2)
This gives the desired result. The problem arises when I try to save to a file. I can do this manually in RStudio (using Export -> Save plot as image), but I would like to run everything in a script. I manage to save only subplot1d (which is last_plot()), and not the complete multiplot.
What I’ve tried so far:
Using ggsave
ggsave(filename = "D:/R/plots/Myplots.jpg")
This results in only subplot 1d being saved.
Using jpeg(), print() and dev.off()
jpeg(filename = "Myplot.jpg", pointsize =12, quality = 200, bg = "white", res = NA, restoreConsole = TRUE)
print(Myplot)
dev.off()
This results in a completely white image (just the background I assume). print(Myplot) returns NULL.
Not sure what I’m doing wrong here. My lack of understanding R is the reason I am stuck trying to find a solution. Can anyone explain what I’m doing wrong and perhaps suggest a way to solve my problem(s)?
It's because Myplot is the returned value from your multiplot function, and it returns nothing (its job is to print the graphs). You need to call multiplot with the jpeg device open:
jpeg(filename = "Myplot.jpg", pointsize =12, quality = 200, bg = "white", res = NA, restoreConsole = TRUE)
multiplot(plot1a,plot1b,plot1c,plot1d, cols=2)
dev.off()
should work.
Using the example code (R cookbook), it works for me
png("chickweight.png")
multiplot(p1, p2, p3, p4, cols=2)
dev.off()
And for completeness' sake, ggsave does not work here because by default it saves only the last displayed ggplot object (last_plot()), which in your case is just the last subplot. This happens because multiplot builds the figure by drawing the individual ggplot objects onto different regions of the graphics device. An alternative is to combine the ggplot objects into a single grob and then draw or save that object, which is compatible with ggsave through its plot argument. This approach is implemented by arrangeGrob in the gridExtra package.
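A minimal sketch of the arrangeGrob route, assuming ggplot2 (2.0 or later) and gridExtra are installed. The data frame d, its columns, and the output path are hypothetical. The file is written as PDF to a temporary directory; changing the extension to .jpg should select the jpeg device instead.

```r
library(ggplot2)
library(gridExtra)

set.seed(1)
d <- data.frame(v1 = rnorm(30), v2 = rnorm(30), v3 = rnorm(30))  # hypothetical data

plot1a <- ggplot(d, aes(v1, v2)) + geom_point()
plot1b <- ggplot(d, aes(v1, v3)) + geom_point()

# Combine the plots into one grob instead of printing them one by one
combined <- arrangeGrob(plot1a, plot1b, ncol = 2)

# Pass the combined grob explicitly, so ggsave does not fall back to last_plot()
out_file <- file.path(tempdir(), "Myplot.pdf")
ggsave(filename = out_file, plot = combined, width = 8, height = 4)
```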

Resources