How to put a comma in large numbers in a VennDiagram plot?

I have a Venn diagram that I make with the package VennDiagram. The numbers are above 100,000.
I would like the number in the middle to be 150,001, with a comma separator, or 150 001, with a small space in between. Is this possible to do with VennDiagram?
This is my example:
library(VennDiagram)
venn.diagram(x = list(A = 1:200000,B = 50000:300000), filename = "../example.tiff")

I don't think you can do this easily. There are two print modes, raw and percent, but these are hard-coded in the function (have a look at VennDiagram::draw.triple.venn). You can add formats by changing the function (which I wouldn't fancy) or by manually tweaking the grobs (which is done below):
library(VennDiagram)
p <- venn.diagram(x = list(A = 1:200000,B = 50000:300000), filename = NULL)
# Change labels for first three text grobs
# hard-coded three, but it would be the number of text labels
# minus the number of groups passed to venn.diagram
idx <- sapply(p, function(i) grepl("text", i$name))
for(i in 1:3){
  p[idx][[i]]$label <-
    format(as.numeric(p[idx][[i]]$label), big.mark=",", scientific=FALSE)
}
grid.newpage()
grid.draw(p)
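The formatting call used in the loop can be tried on its own. A minimal sketch in base R, where the variable name n_overlap is only illustrative:

```r
# Size of the overlap between the two sets from the question
n_overlap <- length(intersect(1:200000, 50000:300000))

# Comma as thousands separator; scientific = FALSE avoids output like 1.5e+05
with_comma <- format(n_overlap, big.mark = ",", scientific = FALSE)

# A space as separator works the same way
with_space <- format(n_overlap, big.mark = " ", scientific = FALSE)

print(with_comma)  # "150,001"
print(with_space)  # "150 001"
```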

Related

Save multiple ggplots from a for loop in a single plot in a particular layout

I am trying to plot a single image that contains 35 ggplots. The order of the plots in the single image is fixed and is shown below.
I also want blank cells as shown in the grid image. Each cell should hold the plot for a particular drug number. I have a data frame "drug_dctv2" which I split into a list so that I can read the data in a for loop.
The problem is: in plot_list[[i]], only the last plot is saved 35 times, for every i from 1 to 35. I am also not sure how to save the plots in the particular order shown in the grid.
Through my internet search I found libraries like "cowplot" and "gridExtra", but I couldn't find a proper way to use them.
I made a plot layout file that contains the drug names in the order shown in the grid image, with "tab" inserted in place of the blank spaces. But I cannot find a way to proceed from there.
I am new to R. Any help and suggestions will be appreciated.
The data set looks as shown below. Each drug has 10 data points.
Drug_name  conc  viab
Drug_1     1     1.0265
Drug_1     0.1   1.2365
Drug_1     0.01  0.5896
--         --    --
Drug_2     1     2.0584
Drug_2     0.1   1.0277
Drug_2     0.01  1.5696
--         --    --
#
split <- split(file,rep(1:35,each=10)) #### this will be used in the for loop
plot_list = list()
for(i in 1:length(split))
{
  data <- split[[i]]
  c <- data$conc
  v <- data$viab
  p = ggplot(data = data,aes(x=c,y=v))+geom_point()+ylim(0,1.5)+
    scale_x_continuous(trans='log10')+
    theme(axis.text = element_blank(),axis.title = element_blank()) +
    geom_line(data=line_data, aes(x=x,y=y2),color ="red",size=1)
  plot_list[[i]] = p
}
Thank you in advance !!
ggplot2, like many tidyverse packages, uses lazy non-standard evaluation. The expression you provide inside aes() is not evaluated until the plot is built (e.g. printed or saved).
The expression in your question refers to the vectors c and v defined in the for loop. These vectors change on each iteration, but the aes() call only stores a reference to c and v in the environment where the for loop is running, so the c and v values used in every plot are the ones available when the plot is printed or saved, that is, those from the last iteration.
You can, as mentioned in the comments, use a column from the data frame directly, since ggplot evaluates the data frame when ggplot() is called.
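A minimal sketch of that column-based approach, assuming ggplot2 is installed. The data frame drug_data and the list per_drug are made-up stand-ins for the data in the question, and the red reference line is left out because line_data is not shown:

```r
library(ggplot2)

# Made-up stand-in for the data in the question (assumption)
drug_data <- data.frame(
  Drug_name = rep(c("Drug_1", "Drug_2"), each = 3),
  conc      = rep(c(1, 0.1, 0.01), times = 2),
  viab      = c(1.0265, 1.2365, 0.5896, 2.0584, 1.0277, 1.5696)
)
per_drug <- split(drug_data, drug_data$Drug_name)

plot_list <- list()
for (i in seq_along(per_drug)) {
  # conc and viab are looked up in the data frame attached to each plot,
  # not in loop variables, so every plot keeps its own values
  plot_list[[i]] <- ggplot(per_drug[[i]], aes(x = conc, y = viab)) +
    geom_point() +
    scale_x_continuous(trans = "log10")
}
```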
An alternative, if you want to keep using c and v, is to make sure each iteration runs in an independent environment, so that the ggplot references to c and v point to a different c and v on each iteration. This can be done, for instance, by replacing the for loop with an lapply call.
plot_list <- lapply(split, function(data_drug) {
  c <- data_drug$conc
  v <- data_drug$viab
  ggplot(data = data_drug,aes(x=c,y=v))+geom_point()+ylim(0,1.5)+
    scale_x_continuous(trans='log10')+
    theme(axis.text = element_blank(),axis.title = element_blank()) +
    geom_line(data=line_data, aes(x=x,y=y2),color ="red",size=1)
})
This is one beautiful example where a for loop and an lapply call produce different results and it's a great learning experience about non standard evaluation and variable environments.
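The same difference can be reproduced in base R without ggplot2. The sketch below is an analogy, not ggplot2's internals: closures stand in for the delayed aes() expressions, and the names getters_for and getters_lapply are illustrative.

```r
# for loop: all closures share one environment, so they all see the final i
getters_for <- list()
for (i in 1:3) {
  getters_for[[i]] <- function() i
}
from_for <- sapply(getters_for, function(f) f())

# lapply: each call of the anonymous function has its own environment
getters_lapply <- lapply(1:3, function(i) {
  force(i)  # evaluate the argument now rather than lazily
  function() i
})
from_lapply <- sapply(getters_lapply, function(f) f())

print(from_for)     # 3 3 3
print(from_lapply)  # 1 2 3
```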
To combine the plots look at cowplot::plot_grid https://wilkelab.org/cowplot/articles/plot_grid.html
Something like this should work
library(cowplot)
plot_grid(
plot_list[[35]], plot_list[[5]], plot_list[[3]], plot_list[[2]],
plot_list[[34]], plot_list[[1]], plot_list[[4]], plot_list[[6]],
plot_list[[32]], plot_list[[8]], NULL, NULL,
plot_list[[30]], plot_list[[7]], plot_list[[33]] , NULL,
labels = "AUTO", ncol = 4
)
You can put all the function arguments in a list and use do.call to call the function with the arguments:
plot_order <- c(
35, 5, 3, 2,
34, 1, 4, 6,
32, 8, NA, NA
)
plot_grid_args <- c(plot_list[plot_order], list(ncol = 4))
do.call(plot_grid, plot_grid_args)
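The NA entries in plot_order rely on how R indexes lists: an NA index yields a NULL element, which plot_grid draws as an empty cell. A small base R sketch, with character strings standing in for the plots (the stand-in values are illustrative):

```r
# Strings stand in for ggplot objects so the sketch needs no packages
plot_list <- list("plot 1", "plot 2", "plot 3")
plot_order <- c(3, NA, 1)

# Indexing a list with NA gives NULL at that position
ordered <- plot_list[plot_order]

print(ordered[[1]])           # "plot 3"
print(is.null(ordered[[2]]))  # TRUE
print(ordered[[3]])           # "plot 1"
```

With real plots, the reordered list should also work through the plotlist argument, as in plot_grid(plotlist = plot_list[plot_order], ncol = 4).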
So, finally I was able to solve this problem.
I made a variable layout holding the positions of the drugs as they appear in the split list. For example, drug_35 has to come first on the grid and it is at the 35th position in the split list, so 35 comes first in the layout variable, and so on.
I made a text file with the grid layout as shown in the image above, read that file into the R script, and built the layout variable with a few lines of code. For the sake of simplicity I am not showing those lines here, but I hope the concept is clear.
lay <- read.delim("layout.txt",stringsAsFactors = FALSE,sep = "\t", header = F)
lay1 = c(t(lay))
col_n = ncol(lay)
row_n = nrow(lay)
split <- split(file,rep(1:35,each=10))
## layout = 35 5 3 2 34 1 4 6 32 8 0 0 30 7 33 .....
## 0 means blank spaces
png("PLOT.png", width = 6, height = 10, units = "in", res = 400)
par(mfrow=c(row_n,col_n),mar=c(2,0.7,1.5,0.5)) ## margins: bottom, left, top and right
for(i in layout)
{
  ## If the entry is 0, frame() draws a blank cell in the grid image
  ## and the loop moves on to the next entry
  if(i== 0) { frame(); next; }
  data <- split[[i]]
  c <- data$conc
  v <- data$viab
  plot(c,v,xlab = NULL,ylab = NULL, axes = F,log = "x")
}
dev.off()
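The lines that turn the layout file into the index vector are not shown above. One possible way to do that step, offered as an assumption rather than the author's actual code, is match(). The names layout_names and drug_names are hypothetical, and "tab" marks a blank cell as in the question:

```r
# Drug names in the order they appear in the split list (hypothetical)
drug_names <- c("Drug_1", "Drug_2", "Drug_3")

# Grid read row by row from the layout file; "tab" marks a blank cell
layout_names <- c("Drug_3", "Drug_1", "tab", "Drug_2")

# Position of each grid entry in drug_names; unmatched entries become 0
layout_idx <- match(layout_names, drug_names, nomatch = 0L)

print(layout_idx)  # 3 1 0 2
```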

R: Increase space between multiple boxplots to avoid omitted x axis labels

Let's say I generate 5 sets of random data and want to visualize them using boxplots and save those to a file "boxplots.png". Using the code
png("boxplots.png")
data <- matrix(rnorm(25),5,5)
boxplot(data, names = c("Name1","Name2","Name3","Name4","Name5"))
dev.off()
there are 5 boxplots created as desired in "boxplots.png", however the names for the second ("Name2") and the fourth ("Name4") boxplot are omitted. Even changing the window of my png-view makes no difference. How can I avoid this behavior?
Thank you!
Your offered code does not produce an overlap in my setting, but that point is relatively moot: you want a way to allow more space between words.
One (brute-force-ish) way to fix the symptom is to alternate putting them on separate lines:
set.seed(42)
data <- matrix(rnorm(25),5,5)
nms <- c("Name1","Name2","Name3","Name4","Name5")
oddnums <- which(seq_along(nms) %% 2 == 1)
evennums <- which(seq_along(nms) %% 2 == 0)
(There's got to be a better way to do that, but it works.)
From here:
png("boxplot.png", height = 240)
boxplot(data, names = FALSE)
mtext(nms[oddnums], side = 1, line = 2, at = oddnums)
mtext(nms[evennums], side = 1, line = 1, at = evennums)
dev.off()
(The use of png is not important here, I just used it because of your edit.)
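The two index vectors can also be built directly with seq(), which avoids the modulo test. A sketch using the same nms vector; odd_idx and even_idx are illustrative names:

```r
nms <- c("Name1", "Name2", "Name3", "Name4", "Name5")

# Positions 1, 3, 5, ... and 2, 4, ...
odd_idx  <- seq(1, length(nms), by = 2)
even_idx <- seq(2, length(nms), by = 2)

print(nms[odd_idx])   # "Name1" "Name3" "Name5"
print(nms[even_idx])  # "Name2" "Name4"
```

These can take the place of the two which() calls; the mtext() calls stay the same apart from the variable names.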

combine multiple plots to a gif

I'm trying to use the caTools package to combine multiple plots into a gif.
My basic code looks like:
for (i in 1:100) {
  plot(....) # plots a few points and lines, changes slightly with each i
}
I would like to combine these into a gif to see the "evolution" of the plot.
However, write.gif() from caTools needs an image as input.
For each i, how do I convert the plot into an image without
1. saving to disk as an intermediate step
2. depending on ImageMagick or similar external dependencies?
Please feel free to point out if this is a duplicate. [Creating a Movie from a Series of Plots in R doesn't seem to answer this.]
EDIT: Basically this requires us to convert a plot to a matrix. Since this very likely happens every time someone saves a plot, it should not be very difficult. However, I'm not able to work out exactly how to do this.
I suggest using the animation package and ImageMagick instead:
library(animation)
## make sure ImageMagick has been installed in your system
saveGIF({
for (i in 1:10) plot(runif(10), ylim = 0:1)
})
Otherwise you could try something along these lines (plenty of room for optimization):
library(png)
library(caTools)
library(abind)
# create gif frames and write them to pngs in a temp dir
dir.create(dir <- tempfile(""))
for (i in 1:8) {
png(file.path(dir, paste0(sprintf("%04d", i), ".png")))
plot(runif(10), ylim = 0:1, col = i)
dev.off()
}
# read pngs, create global palette, convert rasters to integer arrays and write animated gif
imgs <- lapply(list.files(dir, full.names = TRUE), function(fn) as.raster(readPNG(fn)))
frames <- abind(imgs, along = 3) # combine raster pngs in list to an array
cols <- unique(as.vector(frames)) # determine unique colors, should be fewer than 257
frames <- aperm(array(match(frames, cols) - 1, dim = dim(frames)), c(2,1,3)) # replace rgb color codes (#ffffff) by integer indices in cols, beginning with 0 (note: array has to be transposed again, otherwise images are flipped)
write.gif(
image = frames, # array of integers
filename = tf <- tempfile(fileext = ".gif"), # create temporary filename
delay = 100, # 100/100=1 second delay between frames
col = c(cols, rep("#FFFFFF", 256-length(cols))) # color palette with 256 colors (fill unused color indices with white)
)
# open gif (windows)
shell.exec(tf)
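The palette step is the least obvious part of the code above. A tiny base R sketch of it on a made-up 2 x 2 frame (frame_cols and pal are illustrative names), showing how colour strings become zero-based palette indices:

```r
# A made-up 2 x 2 "frame" of colour codes
frame_cols <- matrix(c("#FFFFFF", "#000000", "#FF0000", "#FFFFFF"), nrow = 2)

# Global palette: every distinct colour once, in order of first appearance
pal <- unique(as.vector(frame_cols))

# Replace each colour by its zero-based position in the palette
frame_idx <- array(match(frame_cols, pal) - 1L, dim = dim(frame_cols))

print(pal)        # "#FFFFFF" "#000000" "#FF0000"
print(frame_idx)  # first column 0 1, second column 2 0
```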
Is that what you are looking for?
library(ggplot2)
a <- 0:10
df <- data.frame(a=a,b=a)
steps <-function(end){
a <- ggplot(df[1:end,], aes(a,b)) +
geom_point() +
scale_x_continuous(limits=c(0,10)) +
scale_y_continuous(limits=c(0,10))
print(a)
}
gif <- function() {
lapply(seq(1,11,1), function(i) {
steps(i)
})
}
library(animation)
saveGIF(gif(), interval = .2, movie.name="test.gif")
I liked @ttlngr's answer but I wanted something a bit simpler (it still uses ImageMagick).
saveGIF({
for (i in 1:10){
a <- ggplot(df[1:i,], aes(a,b)) +
geom_point() +
scale_x_continuous(limits=c(0,10)) +
scale_y_continuous(limits=c(0,10))
print(a)}
}, interval = .2, movie.name="test.gif")

Save multiple ggplot2 plots as R object in list and re-displaying in grid

I would like to save multiple plots (made with ggplot2) to a list during a large for loop, and then display the images in a grid (with grid.arrange).
I have tried two solutions to this:
1. Storing it in a list, like so:
pltlist[["qplot"]] <- qplot
However, for some reason this does not save the plot correctly.
So I resorted to a second strategy, which is recordPlot().
This was able to save the plot correctly, but I was unable to
use it in a grid.
Reproducible example:
require(ggplot2);require(grid);require(gridExtra)
df <- data.frame(x = rnorm(100),y = rnorm(100))
histoplot <- ggplot(df, aes(x=x)) + geom_histogram(aes(y=..density..),binwidth=.1,colour="black", fill="white")
qplot <- qplot(sample = df$y, stat="qq")
pltlist <- list()
pltlist[["qplot"]] <- qplot
pltlist[["histoplot"]] <- histoplot
grid.arrange(pltlist[["qplot"]],pltlist[["histoplot"]], ncol=2)
The above code works here but produces the wrong graph in my actual code.
Then I tried recordPlot():
print(histoplot)
c1 <- recordPlot()
print(qplot)
c2 <- recordPlot()
I am able to display all the plots individually
but grid.arrange produces an error:
grid.arrange(replayPlot(c1),replayPlot(c2), ncol=2) # = Error
Error in gList(list(wrapvp = list(x = 0.5, y = 0.5, width = 1, height = 1, :
only 'grobs' allowed in "gList"
In the thread "Saving grid.arrange() plot to file"
they discuss a solution which uses arrangeGrob() instead:
arrangeGrob(c1, c1, ncol=2) # Error
Error in vapply(x$grobs, as.character, character(1)) :
values must be length 1,
but FUN(X[[1]]) result is length 3
I am forced to use recordPlot() instead of saving to a list, since the list approach does not produce the same graph when saved as when it is plotted immediately. Unfortunately I cannot replicate this in the example, sorry.
In my actual code I run a large for loop over several variables, computing a correlation with each and making scatterplots, which I name according to their significance level. I then want to re-display the plots that were significant in a grid, in a dynamic knitr report.
I am aware that I could just re-plot the significant plots after the for loop instead of saving them (I can't save as png while using knitr either). However, I would like to find a way to dynamically save the plots as R objects and then replot them in a grid afterwards.
Thanks for Reading
"R version 3.2.1"
Windows 7 64bit - RStudio - Version 0.99.652
attached base packages:
[1] grid grDevices datasets utils graphics stats methods base
other attached packages:
[1] gridExtra_2.0.0 ggplot2_1.0.1
I can think of two solutions.
1. If your goal is to just save the list of plots as R objects, I recommend:
saveRDS(object = pltlist, file = "file_path")
This way, when you wish to reload these graphs, you can just use readRDS(). You can then put them in cowplot or grid.arrange. This command works for all lists and R objects.
One caveat to this approach: if the settings or labeling of a ggplot2 graph depend on things in the environment (not the data, but things like settings for point size, shape, or coloring) rather than on the ggplot2 call used to make the graph, your graphs won't work until you restore those dependencies. One reason to keep such dependencies separate is to modularize the scripts that make the graphs.
Another caveat is performance: From my experience, I found it is actually faster to read in the data and remake individual graphs than load in an RDS file of all the graphs when you have a large number of graphs (100+ graphs).
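A minimal round trip with saveRDS() and readRDS(), sketched in base R. The list contents are placeholders standing in for ggplot objects, and the file goes to a temporary path rather than a real "file_path":

```r
# Placeholder list; in practice the elements would be ggplot objects
pltlist <- list(qplot = "stand-in for the QQ plot",
                histoplot = "stand-in for the histogram")

# Write the whole list to a single file and read it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(object = pltlist, file = rds_path)
restored <- readRDS(rds_path)

print(identical(pltlist, restored))  # TRUE
print(names(restored))               # "qplot" "histoplot"
```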
2. If your goal is to save an 'image' or 'picture' of each graph (single and/or multiplot as .png, .jpeg, etc.), and later adjust things in a grid manually outside of R such as powerpoint or photoshop, I recommend:
filenames <- c("Filename_1", "Filename_2") #actual file names you want...
lapply(seq_along(pltlist), function(i) {
ggsave(filename = filenames[i], plot = pltlist[[i]], ...) #use your settings here
})
Settings I like for single plots:
lapply(seq_along(pltlist), function(i) ggsave(
plot = pltlist[[i]],
filename = paste0("plot_", i, "_", ".tiff"), #you can even paste in pltlist[[i]]$labels$title
device = "tiff", width=180, height=180, units="mm", dpi=300, compression = "lzw", #compression for tiff
path = paste0("../Blabla") #must be an existing directory.
))
You may want to do the manual approach if you're really OCD about the grid arrangement and you don't have too many of them to make for publications. Otherwise, when you do grid.arrange you'll want to do all the specifications there (adjusting font, increasing axis label size, custom colors, etc.), then adjust the width and height accordingly.
Reviving this post to add multiplot here, as it fits exactly.
require(ggplot2)
mydd <- setNames( data.frame( matrix( rep(c("x","y","z"), each=10) ),
c(rnorm(10), rnorm(10), rnorm(10)) ), c("points", "data") )
# points data
# 1 x 0.733013658
# 2 x 0.218838717
# 3 x -0.008303382
# 4 x 2.225820069
# ...
p1 <- ggplot( mydd[mydd$points == "x",] ) + geom_line( aes( 1:10, data, col=points ) )
p2 <- ggplot( mydd[mydd$points == "y",] ) + geom_line( aes( 1:10, data, col=points ) )
p3 <- ggplot( mydd[mydd$points == "z",] ) + geom_line( aes( 1:10, data, col=points ) )
multiplot(p1,p2,p3, cols=1)
multiplot:
multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
  library(grid)
  # Make a list from the ... arguments and plotlist
  plots <- c(list(...), plotlist)
  numPlots = length(plots)
  # If layout is NULL, then use 'cols' to determine layout
  if (is.null(layout)) {
    # Make the panel
    # ncol: Number of columns of plots
    # nrow: Number of rows needed, calculated from # of cols
    layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
                     ncol = cols, nrow = ceiling(numPlots/cols))
  }
  if (numPlots==1) {
    print(plots[[1]])
  } else {
    # Set up the page
    grid.newpage()
    pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))
    # Make each plot, in the correct location
    for (i in 1:numPlots) {
      # Get the i,j matrix positions of the regions that contain this subplot
      matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
      print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
                                      layout.pos.col = matchidx$col))
    }
  }
}
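To see how multiplot places each plot, the layout matrix can be built on its own. A small base R sketch for five plots in two columns; num_plots and pos are illustrative names:

```r
num_plots <- 5
cols <- 2

# Same construction as in multiplot(): filled column by column
layout <- matrix(seq(1, cols * ceiling(num_plots / cols)),
                 ncol = cols, nrow = ceiling(num_plots / cols))

# Row and column of the cell that holds plot 4
pos <- which(layout == 4, arr.ind = TRUE)

print(layout)
print(pos)  # row 1, col 2
```

Because the matrix is filled column by column, plots 1 to 3 go down the first column and plots 4 and 5 start the second; cell 6 stays empty.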

ggplot2 : printing multiple plots in one page with a loop

I have several subjects for which I need to generate a plot. As I have many subjects, I'd like to have several plots on one page rather than one figure per subject.
Here is what I have done so far:
Read txt file with subjects name
subjs <- scan ("ListSubjs.txt", what = "")
Create a list to hold plot objects
pltList <- list()
for(s in 1:length(subjs))
{
setwd(file.path("C:/Users/", subjs[[s]])) #load subj directory
ifile=paste("Co","data.txt",sep="",collapse=NULL) #Read subj file
dat = read.table(ifile)
dat <- unlist(dat, use.names = FALSE) #make dat usable for ggplot2
df <- data.frame(dat)
pltList[[s]]<- print(ggplot( df, aes(x=dat)) + #save each plot with unique name
geom_histogram(binwidth=.01, colour="cyan", fill="cyan") +
geom_vline(aes(xintercept=0), # Ignore NA values for mean
color="red", linetype="dashed", size=1)+
xlab(paste("Co_data", subjs[[s]] , sep=" ",collapse=NULL)))
}
At this point I can display the single plots for example by
print (pltList[1]) #will print first plot
print(pltList[2]) # will print second plot
I'd like a solution that displays several plots on the same page. I've tried something along the lines of previous posts, but I can't make it work.
For example:
for (p in seq(length(pltList))) {
do.call("grid.arrange", pltList[[p]])
}
gives me the following error
Error in arrangeGrob(..., as.table = as.table, clip = clip, main = main, :
input must be grobs!
I can use more basic graphing features, but I'd like to achieve this using ggplot. Many thanks for your consideration.
Matilde
Your error comes from indexing a list with [[:
consider
pl = list(qplot(1,1), qplot(2,2))
pl[[1]] returns the first plot, but do.call expects a list of arguments. You could do it with, do.call(grid.arrange, pl[1]) (no error), but that's probably not what you want (it arranges one plot on the page, there's little point in doing that). Presumably you wanted all plots,
grid.arrange(grobs = pl)
or, equivalently,
do.call(grid.arrange, pl)
If you want a selection of this list, use [,
grid.arrange(grobs = pl[1:2])
do.call(grid.arrange, pl[1:2])
Further parameters can be passed trivially with the first syntax; with do.call care must be taken to make sure the list is in the correct form,
grid.arrange(grobs = pl[1:2], ncol=3, top=textGrob("title"))
do.call(grid.arrange, c(pl[1:2], list(ncol=3, top=textGrob("title"))))
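The difference between [ and [[, and the shape of the list that do.call expects, can be checked without any plotting packages. In this sketch the strings stand in for plots and count_args is a hypothetical helper, not part of gridExtra:

```r
pl <- list(first = "plot 1", second = "plot 2")

# [ keeps the list wrapper, [[ extracts the element itself
print(is.list(pl[1]))    # TRUE
print(is.list(pl[[1]]))  # FALSE

# Hypothetical helper that reports how many arguments it received
count_args <- function(...) length(list(...))

# Each list element becomes one argument of the called function
n_plots_only  <- do.call(count_args, pl[1:2])
n_with_option <- do.call(count_args, c(pl[1:2], list(ncol = 3)))

print(n_plots_only)   # 2
print(n_with_option)  # 3
```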
library(gridExtra) # for grid.arrange
library(grid)
grid.arrange(pltList[[1]], pltList[[2]], pltList[[3]], pltList[[4]], ncol = 2, main = "Whatever") # say you have 4 plots
OR,
do.call(grid.arrange,pltList)
I wish I had enough reputation to comment instead of answer, but anyway you can use the following solution to get it work.
I would do exactly what you did to get the pltList, then use the multiplot function from this recipe. Note that you will need to specify the number of columns. For example, if you want to plot all plots in the list into two columns, you can do this:
print(multiplot(plotlist=pltList, cols=2))
