R : Create list of plots with for loop - r

I try to create a list of plots of my data using a for loop to filter (="TAB_tmp2") and add the new plot in the list (="ListeGRAPH"). I think the problem comes from the difference of filter data table (="TAB_tmp2").
I have read several topics on the web about that but I can't find a solution which could works in this case.
My code :
rm(list=ls()) # delete objects
#====================================
# Create data for the example
#====================================
TAB = data.frame(Types_Mesures = c(rep(1,3),rep(2,5),rep(3,10)))
TAB$ID_mesuresParType=NA
TAB$Mesures=log(c(1:length(TAB$Types_Mesures)))
Nb_Types=length(unique(TAB$Types_Mesures)) # in the real data, the number of "Types_Mesures" can change
for (x in 1:Nb_Types) {
TAB_tmp=TAB[TAB$Types_Mesures==x,2]
TAB[TAB$Types_Mesures==x,2]=c(1:length(TAB_tmp))
}
#====================================
# List of plots
#====================================
library(gridExtra)
library(ggplot2)
INPUTDirectory= "D:/TEST/"
setwd(dir=INPUTDirectory)
ListeGRAPH <- list()
for (x in 1:Nb_Types) {
TAB_tmp2=TAB[TAB$Types_Mesures==x,]
ListeGRAPH[[x]] <- ggplot(data = TAB_tmp2) +
geom_line(aes(x = TAB_tmp2$ID_mesuresParType, y = TAB_tmp2$Mesures))
# #Save graph
# png(filename = paste("TAB_plot_T",x,".png", sep = ""))
# print(ListeGRAPH[[x]])
# graphics.off()
}
gridExtra::grid.arrange(grobs = ListeGRAPH)
When I run the code, I have this error :
Error: Aesthetics must be either length 1 or the same as the data (3):
x, y
It seems that grid.arrange don't accept plots of different dimensions ?
How could I do to make the list of plots with this kind of table ? In my real data the number of "Types_Mesures" can change.
More over, I think the for loop don't allow to use a temporary variable (="TAB_tmp2") to create the list of plot but this code works when I save my plot in PNG files.
Thanks a lot for you help !

The problem is actually not with grid.arrange. When you're creating the plots with ggplot, you do not need to use $ for indexing of columns. So instead of:
ListeGRAPH[[x]] <- ggplot(data = TAB_tmp2) +
geom_line(aes(x = TAB_tmp2$ID_mesuresParType, y = TAB_tmp2$Mesures))
you should use:
ListeGRAPH[[x]] <- ggplot(data = TAB_tmp2) +
geom_line(aes(x = ID_mesuresParType, y = Mesures))
and then you will be able to plot the results using grid.arrange.

Related

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied DBSCAN algorithm on built-in dataset iris in R. But I am getting error when tried to visualise the output using the plot( ).
Following is my code.
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscacn, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
geom_point(mapping = aes(x=Species, y = cluster, colour = cluster),
position = "jitter") +
labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.

Save multiple ggplot2 plots as R object in list and re-displaying in grid

I would like to save multiple plots (with ggplot2) to a list during a large for-loop. And then subsequently display the images in a grid (with grid.arrange)
I have tried two solutions to this:
1 storing it in a list, like so:
pltlist[["qplot"]] <- qplot
however for some reason this does save the plot correctly.
So I resorted to a second strategy which is recordPlot()
This was able to save the plot correctly, but unable to
use it in a grid.
Reproducable Example:
require(ggplot2);require(grid);require(gridExtra)
df <- data.frame(x = rnorm(100),y = rnorm(100))
histoplot <- ggplot(df, aes(x=x)) + geom_histogram(aes(y=..density..),binwidth=.1,colour="black", fill="white")
qplot <- qplot(sample = df$y, stat="qq")
pltlist <- list()
pltlist[["qplot"]] <- qplot
pltlist[["histoplot"]] <- histoplot
grid.arrange(pltlist[["qplot"]],pltlist[["histoplot"]], ncol=2)
above code works but produces the wrong graph
in my actual code
Then I tried recordPlot()
print(histoplot)
c1 <- recordPlot()
print(qplot)
c2 <- recordPlot()
I am able to display all the plots individually
but grid.arrange produces an error:
grid.arrange(replayPlot(c1),replayPlot(c2), ncol=2) # = Error
Error in gList(list(wrapvp = list(x = 0.5, y = 0.5, width = 1, height = 1, :
only 'grobs' allowed in "gList"
In this thread Saving grid.arrange() plot to file
They dicuss a solution which utilizes arrangeGrob() instead
arrangeGrob(c1, c1, ncol=2) # Error
Error in vapply(x$grobs, as.character, character(1)) :
values must be length 1,
but FUN(X[[1]]) result is length 3
I am forced to use the recordPlot() instead of saving to a list since this does not produce the same graph when saved as when it is plotted immediately, which I unfortunately cannot replicate, sorry.
In my actual code I am doing a large for-loop, looping through several variables, making a correlation with each and making scatterplots, where I name the scatterplots dependent on their significans level. I then want to re-display the plots that were significant in a grid, in a dynamic knitr report.
I am aware that I could just re-plot the plots that were significant after the for-loop instead of saving them, (I can't save as png while doing knitr either). However I would like to find a way to dynammically save the plots as R-objects and then replot them in a grid afterwards.
Thanks for Reading
"R version 3.2.1"
Windows 7 64bit - RStudio - Version 0.99.652
attached base packages:
[1] grid grDevices datasets utils graphics stats methods base
other attached packages:
[1] gridExtra_2.0.0 ggplot2_1.0.1
I can think of two solutions.
1. If your goal is to just save the list of plots as R objects, I recommend:
saveRDS(object = pltlist, file = "file_path")
This way when you wish to reload in these graphs, you can just use readRDS(). You can then put them in cowplot or gridarrange. This command works for all lists and R Objects.
One caveat to this approach is if settings/labeling for ggplot2 is dependent upon things in the environment (not the data, but stuff like settings for point size, shape, or coloring) instead of the ggplot2 function used to make the graph), your graphs won't work until you restore your dependencies. One reason to save some dependencies is to modularize your scripts to make the graphs.
Another caveat is performance: From my experience, I found it is actually faster to read in the data and remake individual graphs than load in an RDS file of all the graphs when you have a large number of graphs (100+ graphs).
2. If your goal is to save an 'image' or 'picture' of each graph (single and/or multiplot as .png, .jpeg, etc.), and later adjust things in a grid manually outside of R such as powerpoint or photoshop, I recommend:
filenames <- c("Filename_1", "Filename_2") #actual file names you want...
lapply(seq_along(pltlist), function(i) {
ggsave(filename = filenames[i], plot = pltlist[[i]], ...) #use your settings here
})
Settings I like for single plots:
lapply(seq_along(pltlist), function(i) ggsave(
plot = pltlist[[i]],
filename = paste0("plot_", i, "_", ".tiff"), #you can even paste in pltlist[[i]]$labels$title
device = "tiff", width=180, height=180, units="mm", dpi=300, compression = "lzw", #compression for tiff
path = paste0("../Blabla") #must be an existing directory.
))
You may want to do the manual approach if you're really OCD about the grid arrangement and you don't have too many of them to make for publications. Otherwise, when you do grid.arrange you'll want to do all the specifications there (adjusting font, increasing axis label size, custom colors, etc.), then adjust the width and height accordingly.
Reviving this post to add multiplot here, as it fits exactly.
require(ggplot2)
mydd <- setNames( data.frame( matrix( rep(c("x","y","z"), each=10) ),
c(rnorm(10), rnorm(10), rnorm(10)) ), c("points", "data") )
# points data
# 1 x 0.733013658
# 2 x 0.218838717
# 3 x -0.008303382
# 4 x 2.225820069
# ...
p1 <- ggplot( mydd[mydd$point == "x",] ) + geom_line( aes( 1:10, data, col=points ) )
p2 <- ggplot( mydd[mydd$point == "y",] ) + geom_line( aes( 1:10, data, col=points ) )
p3 <- ggplot( mydd[mydd$point == "z",] ) + geom_line( aes( 1:10, data, col=points ) )
multiplot(p1,p2,p3, cols=1)
multiplot:
multiplot <- function(..., plotlist=NULL, file, cols=1, layout=NULL) {
library(grid)
# Make a list from the ... arguments and plotlist
plots <- c(list(...), plotlist)
numPlots = length(plots)
# If layout is NULL, then use 'cols' to determine layout
if (is.null(layout)) {
# Make the panel
# ncol: Number of columns of plots
# nrow: Number of rows needed, calculated from # of cols
layout <- matrix(seq(1, cols * ceiling(numPlots/cols)),
ncol = cols, nrow = ceiling(numPlots/cols))
}
if (numPlots==1) {
print(plots[[1]])
} else {
# Set up the page
grid.newpage()
pushViewport(viewport(layout = grid.layout(nrow(layout), ncol(layout))))
# Make each plot, in the correct location
for (i in 1:numPlots) {
# Get the i,j matrix positions of the regions that contain this subplot
matchidx <- as.data.frame(which(layout == i, arr.ind = TRUE))
print(plots[[i]], vp = viewport(layout.pos.row = matchidx$row,
layout.pos.col = matchidx$col))
}
}
}
Result:

ggplot2 : printing multiple plots in one page with a loop

I have several subjects for which I need to generate a plot, as I have many subjects I'd like to have several plots in one page rather than one figure for subject.
Here it is what I have done so far:
Read txt file with subjects name
subjs <- scan ("ListSubjs.txt", what = "")
Create a list to hold plot objects
pltList <- list()
for(s in 1:length(subjs))
{
setwd(file.path("C:/Users/", subjs[[s]])) #load subj directory
ifile=paste("Co","data.txt",sep="",collapse=NULL) #Read subj file
dat = read.table(ifile)
dat <- unlist(dat, use.names = FALSE) #make dat usable for ggplot2
df <- data.frame(dat)
pltList[[s]]<- print(ggplot( df, aes(x=dat)) + #save each plot with unique name
geom_histogram(binwidth=.01, colour="cyan", fill="cyan") +
geom_vline(aes(xintercept=0), # Ignore NA values for mean
color="red", linetype="dashed", size=1)+
xlab(paste("Co_data", subjs[[s]] , sep=" ",collapse=NULL)))
}
At this point I can display the single plots for example by
print (pltList[1]) #will print first plot
print(pltList[2]) # will print second plot
I d like to have a solution by which several plots are displayed in the same page, I 've tried something along the lines of previous posts but I don't manage to make it work
for example:
for (p in seq(length(pltList))) {
do.call("grid.arrange", pltList[[p]])
}
gives me the following error
Error in arrangeGrob(..., as.table = as.table, clip = clip, main = main, :
input must be grobs!
I can use more basic graphing features, but I d like to achieve this by using ggplot. Many thanks for consideration
Matilde
Your error comes from indexing a list with [[:
consider
pl = list(qplot(1,1), qplot(2,2))
pl[[1]] returns the first plot, but do.call expects a list of arguments. You could do it with, do.call(grid.arrange, pl[1]) (no error), but that's probably not what you want (it arranges one plot on the page, there's little point in doing that). Presumably you wanted all plots,
grid.arrange(grobs = pl)
or, equivalently,
do.call(grid.arrange, pl)
If you want a selection of this list, use [,
grid.arrange(grobs = pl[1:2])
do.call(grid.arrange, pl[1:2])
Further parameters can be passed trivially with the first syntax; with do.call care must be taken to make sure the list is in the correct form,
grid.arrange(grobs = pl[1:2], ncol=3, top=textGrob("title"))
do.call(grid.arrange, c(pl[1:2], list(ncol=3, top=textGrob("title"))))
library(gridExtra) # for grid.arrange
library(grid)
grid.arrange(pltList[[1]], pltList[[2]], pltList[[3]], pltList[[4]], ncol = 2, main = "Whatever") # say you have 4 plots
OR,
do.call(grid.arrange,pltList)
I wish I had enough reputation to comment instead of answer, but anyway you can use the following solution to get it work.
I would do exactly what you did to get the pltList, then use the multiplot function from this recipe. Note that you will need to specify the number of columns. For example, if you want to plot all plots in the list into two columns, you can do this:
print(multiplot(plotlist=pltList, cols=2))

Store the result of a plot() call to a variable without sending to current graphics device

This is really one of two questions - either:
1) How do I store the result of a print() call [i.e. x <- print(something) ] without sending anything to current graphics output?
-or-
2) Is there a function or method in ggplot that will store a plot() call to a variable without calling plot() directly? ggplotGrob is in the ballpark, but a ggplotGrob object doesn't return a list with $data in it the same way you get when you store the result of print() to a variable.
I'm using a technique picked up from this SO answer to pull out the points of a geom_density curve, and then using that data to generate some annotations. I've outlined the issue below -- when I call this as a function, I get the undesired intermediate plot object in my pdf, along with the final plot. The goal is to get rid of that undesired plot; given that base hist() has a plot = FALSE option I was hopeful that someone who knows something more about R viewports would be able to fix my plot() call (solution #1), but any solution is fine, frankly.
library(ggplot2)
library(plyr)
demo <- function (df) {
p <- ggplot(
df
,aes(
x = rating
)
) +
geom_density()
#plot the object so we can access $data
render_plot <- plot(p + ggtitle("Don't want this plot"))
#grab just the DF for the density line
density_df <- render_plot$data[[1]]
#get the maximum density value
max_y <- ddply(density_df, "group", summarise, y = max(y))
#join that back to the data to find the matching row
anno <- join(density_df, max_y, type = 'inner')
#use this to annotate
p <- p + annotate(
geom = 'text'
,x = anno$x
,y = anno$y
,label = round(anno$density, 3)
) +
ggtitle('Keep this plot')
return(p)
}
#call to demo outputs an undesired plot to the graphics device
ex <- demo(movies[movies$Comedy ==1,])
plot(ex)
#this is problematic if you are trying to make a PDF
#a distinct name for the pdf to avoid filesystem issues
unq_name <- as.character(format(Sys.time(), "%X"))
unq_name <- gsub(':', '', unq_name)
pdf(paste(unq_name , '.pdf', sep=''))
p <- demo(movies[movies$Drama ==1,])
print(p)
dev.off()
Use ggplot_build:
render_plot <- ggplot_build(p + ggtitle("Don't want this plot"))

NP chart using ggplot2

how i can generate NP chart using ggplot2?
I made simple Rscript which generates bar, point charts. I am supplying data by csv file. how many columns do i need to specify and in gplot functions what arguments do i need to pass?
I am very new to R, ggplots.
EDIT :
This is what is meant by an NP chart.
Current code attempt:
#load library ggplot2
library(ggplot2)
#get arguments
args <- commandArgs(TRUE)
pdfname <- args[1]
graphtype <- args[2]
datafile <- args[3]
#read csv file
tasks <- read.csv(datafile , header = T)
#name the pdf from passed arg 1
pdf(pdfname)
#main magic that generates the graph
qplot(x,y, data=tasks, geom = graphtype)
#clean up
dev.off()
In .csv file there are 2 columns x,y i call this script by Rscript cne.R 11_16.pdf "point" "data.csv".
Thanks you very much #mathematical.coffee this is what i need but
1> I am reading data from csv file which contains following data
this is my data
Month,Rate
"Jan","37.50"
"Feb","32.94"
"Mar","25.00"
"Apr","33.33"
"May","33.08"
"Jun","29.09"
"Jul","12.00"
"Aug","10.00"
"Sep","6.00"
"Oct","23.00"
"Nov","9.00"
"Dec","14.00"
2> I want to display value on each plotting point. and also display value for UCL,Cl,LCL, and give different label to x and y.
Problem when i read data it is not in the same order as in csv file. how to fix it?
You combine ggplot(tasks,aes(x=x,y=y)) with geom_line and geom_point to get the lines connected by points.
If you additionally want the UCL/LCL/etc drawn you add in a geom_hline (horizontal line).
To add text to these lines you can use geom_text.
An example:
library(ggplot2)
# generate some data to use, say monthly up to a year from today.
n <- 12
tasks <- data.frame(
x = seq(Sys.Date(),by="month",length=n),
y = runif(n) )
CL = median(tasks$y) # substitue however you calculate CL here
LCL = quantile(tasks$y,.25) # substitue however you calculate LCL here
UCL = quantile(tasks$y,.75) # substitue however you calculate UCL here
limits = c(UCL,CL,LCL)
lbls = c('UCL','CL','LCL')
p <- ggplot(tasks,aes(x=x,y=y)) + # store x/y values
geom_line() + # add line
geom_point(aes(colour=(y>LCL&y<UCL))) + # add points, colour if outside limits
opts(legend.position='none', # remove legend for colours
axis.text.x=theme_text(angle=90)) # rotate x axis labels
# Now add in the limits.
# horizontal lines + dashed for upper/lower and solid for the CL
p <- p + geom_hline(aes(yintercept=limits,linetype=lbls)) + # draw lines
geom_text(aes(y=limits,x=tasks$x[n],label=lbls,vjust=-0.2,cex=.8)) # draw text
# display
print(p)
which gives:

Resources