Save automatically produced plots in R - r

I'm using a function in R that analyses my data and produces several plots.
The function is snpzip from the adegenet package.
I would like to automatically save the three plots that the function produces as part of its output. Do you have any suggestions on how to do this?
I know how to save a single plot, for instance with png() or pdf() followed by dev.off(). My problem is that running snpzip(snps, phen, method = "centroid") produces three plots, and I would like to save all of them.
I report here the same example as in the "adegenet" package:
library(adegenet)
simpop <- glSim(100, 10000, n.snp.struc = 10, grp.size = c(0.3,0.7),
LD = FALSE, alpha = 0.4, k = 4)
snps <- as.matrix(simpop)
phen <- simpop@pop
outcome <- snpzip(snps, phen, method = "centroid")

If you use a filename with a C integer format in it, then R will substitute the page number for that part of the name, generating multiple files. For example,
png("page%d.png")
plot(1)
plot(2)
plot(3)
dev.off()
will generate 3 files, page1.png, page2.png, and page3.png. For pdf(), you also need onefile=FALSE:
pdf("page%d.pdf", onefile = FALSE)
plot(1)
plot(2)
plot(3)
dev.off()
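The same idea can be wrapped in a small helper so that any call that draws several plots, such as the snpzip() call in the question, writes one file per plot. This is only a sketch: save_all_plots and its arguments are hypothetical names, not part of adegenet or base R. It uses pdf() because that device is available in every R installation; png(pattern) should work the same way where PNG support exists.

```r
# Hypothetical helper: evaluate `code` with a multi-file PDF device open,
# so every new plot page lands in its own numbered file.
save_all_plots <- function(code, pattern = "page%d.pdf") {
  pdf(pattern, onefile = FALSE)   # %d is replaced by the page number
  dev <- dev.cur()
  on.exit(dev.off(dev))           # always close the device, even on error
  force(code)                     # the plots are drawn here (lazy evaluation)
}

# Usage with stand-in plots; the value of the last expression is returned.
out_dir <- tempdir()
res <- save_all_plots({
  plot(1)
  plot(2)
  plot(3)
  "done"
}, pattern = file.path(out_dir, "page%d.pdf"))

list.files(out_dir, pattern = "^page[0-9]+\\.pdf$")
```

Assuming snpzip() draws on the current graphics device, the call from the question could then be written as outcome <- save_all_plots(snpzip(snps, phen, method = "centroid")).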

Related

Suppress graph output of a function [duplicate]

I am trying to turn off the display of plot in R.
I read Disable GUI, graphics devices in R but the only solution given is to write the plot to a file.
What if I don't want to pollute the workspace and what if I don't have write permission ?
I tried options(device=NULL) but it didn't work.
The context is the package NbClust : I want what NbClust() returns but I do not want to display the plot it does.
Thanks in advance !
edit : Here is a reproducible example using data from the rattle package :)
data(wine, package="rattle")
df <- scale (wine[-1])
library(NbClust)
# This produces a graph output which I don't want
nc <- NbClust(df, min.nc=2, max.nc=15, method="kmeans")
# This is the plot I want ;)
barplot(table(nc$Best.n[1,]),
xlab="Number of Clusters", ylab="Number of Criteria",
main="Number of Clusters Chosen by 26 Criteria")
You can wrap the call in
pdf(file = NULL)
and
dev.off()
This sends all the output to a null file which effectively hides it.
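A minimal sketch of that wrapping, written as a reusable helper. The names without_plots and noisy_mean are hypothetical and only serve as an illustration; noisy_mean stands in for any function that plots as a side effect and returns a value, as NbClust() does.

```r
# Hypothetical helper: run `code` while a null PDF device swallows all plots.
without_plots <- function(code) {
  pdf(file = NULL)        # no file is created; drawing goes nowhere
  dev <- dev.cur()
  on.exit(dev.off(dev))   # close the null device again, even on error
  force(code)
}

# Stand-in for a function that plots as a side effect and returns a value.
noisy_mean <- function(x) {
  plot(x)
  mean(x)
}

m <- without_plots(noisy_mean(1:10))
m
```

Under the same assumption, the call from the question would become nc <- without_plots(NbClust(df, min.nc = 2, max.nc = 15, method = "kmeans")). Nothing is written to disk, so no write permission is needed.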
Luckily it seems that NbClust is one giant messy function with some other functions in it and lots of icky looking code. The plotting is done in one of two places.
Create a copy of NbClust:
> MyNbClust = NbClust
and then edit this function. Change the header to:
MyNbClust <-
function (data, diss = "NULL", distance = "euclidean", min.nc = 2,
max.nc = 15, method = "ward", index = "all", alphaBeale = 0.1, plotetc=FALSE)
{
and then wrap the plotting code in if blocks. Around line 1588:
if(plotetc){
par(mfrow = c(1, 2))
[etc]
cat(paste(...
}
and similarly around line 1610. Save. Now use:
nc = MyNbClust(...etc....)
and you see no plots unless you add plotetc=TRUE.
Then ask the devs to include your patch.

drake readd function not working for plots

I'm trying to troubleshoot why drake plots are not showing up with readd(); the rest of the pipeline seems to have worked, though.
I'm not sure whether this is caused by minfi::densityPlot or something else; I suspect the latter, as it also fails for barplot(), which is base R.
In the RMarkdown report I have readd(dplot1) etc. in the chunks, but the output is NULL.
This is the code I have in my R/setup.R file:
library(drake)
library(tidyverse)
library(magrittr)
library(minfi)
library(DNAmArray)
library(methylumi)
library(RColorBrewer)
library(minfiData)
pkgconfig::set_config("drake::strings_in_dots" = "literals") # New file API
# Your custom code is a bunch of functions.
make_beta <- function(rgSet){
rgSet_betas = minfi::getBeta(rgSet)
}
make_filter <- function(rgSet){
rgSet_filtered = DNAmArray::probeFiltering(rgSet)
}
This is my R/plan.R file:
# The workflow plan data frame outlines what you are going to do
plan <- drake_plan(
baseDir = system.file("extdata", package = "minfiData"),
targets = read.metharray.sheet(baseDir),
rgSet = read.metharray.exp(targets = targets),
mSetSq = preprocessQuantile(rgSet),
detP = detectionP(rgSet),
dplot1 = densityPlot(rgSet, sampGroups=targets$Sample_Group,main="Raw", legend=FALSE),
dplot2 = densityPlot (getBeta (mSetSq), sampGroups=targets$Sample_Group, main="Normalized", legend=FALSE),
pal = RColorBrewer::brewer.pal (8,"Dark2"),
dplot3 = barplot (colMeans (detP[,1:6]), col=pal[ factor (targets$Sample_Group[1:6])], las=2, cex.names=0.8, ylab="Mean detection p-values"),
report = rmarkdown::render(
knitr_in("report.Rmd"),
output_file = file_out("report.html"),
quiet = TRUE
)
)
After using make(plan) it looks like everything ran smoothly:
config <- drake_config(plan)
vis_drake_graph(config)
I am able to use loadd() to load the objects needed for one of these plots and then make the plots, like this:
loadd(rgSet)
loadd(targets)
densityPlot(rgSet, sampGroups=targets$Sample_Group,main="Raw", legend=FALSE)
But the readd() command doesn't work?
The output in the .html for dplot3 looks weird...
Fortunately, this is expected behavior. drake targets are return values of commands, and so the value of dplot3 is supposed to be the return value of barplot(). The return value of barplot() is actually not a plot. The "Value" section of the help file (?barplot) explains the return value.
A numeric vector (or matrix, when beside = TRUE), say mp, giving the coordinates of all the bar midpoints drawn, useful for adding to the graph.
If beside is true, use colMeans(mp) for the midpoints of each group of bars, see example.
So what is going on? As with most base graphics functions, the plot from barplot() is actually a side effect. barplot() sends the plot to a graphics device and then returns something else to the user.
Have you considered ggplot2? The return value of ggplot() is actually a plot object, which is more intuitive. If you want to stick with base graphics, maybe you could save the plot to an output file.
plan <- drake_plan(
...,
dplot3 = {
pdf(file_out("dplot3.pdf"))
barplot(...)
dev.off()
}
)
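The point about return values can be checked without drake. The short sketch below uses made-up bar heights and a null PDF device (so no file is written) to show that what barplot() hands back, and hence what a target would store, is numeric data rather than a picture.

```r
# Draw into a null device so the example leaves no file behind.
pdf(file = NULL)
mp <- barplot(c(a = 3, b = 5, c = 2))   # the plot itself is a side effect
invisible(dev.off())

# What a drake target would store: bar midpoints, not a plot.
is.numeric(mp)
length(mp)
```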

How to store results of a simulation and plot all the results in one plot using plot_KDE in Luminescence package for R

I am creating a simulation using random number simulations. This gives 100 sets of 45 values with error.
First I would like to store the results of these simulations.
I then need to plot the results of these simulations on one plot. The plot I need to produce uses the package Luminescence and is of the type KDE.
I have managed to produce the separate entities but am struggling to both store the results and to produce the plot with all the simulations.
So far I have created the simulation:
Simulation <- function() {
RNC <- rescale (SFMT(45, dim=1, mexp=216091,
usepset=T, withtorus= F, usetime=T),
c(0.01,130))
RNC_error <- RNC*0.15
df <-data.frame(RNC,RNC_error)
}
the plot I want to create uses the following:
library("Luminescence")
plot_KDE(data=df, na.rm = TRUE,
values.cumulative = TRUE, order = TRUE,
boxplot = F, rug = F,
summary.method = "MCM", bw = "nrd0",
output = TRUE)
For my final result I require the numerical results of all the simulations stored and a single KDE plot with the results of all the simulations.
Split your problem into two parts.
Storing results. You have a data frame, df, so just use write.csv() to store the results to a CSV file, i.e.
write.csv(df, file="some_file.csv")
Storing your plot. Obviously you can't use a csv file, so instead we'll use a pdf or png, e.g.
# Open the file
pdf("figure_file.pdf")
plot_KDE(data=df, na.rm = TRUE,
values.cumulative = TRUE, order = TRUE,
boxplot = F, rug = F,
summary.method = "MCM", bw = "nrd0",
output = TRUE)
# Close the file
dev.off()
To save as a png use png() instead of pdf()
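To keep all 100 simulations rather than a single data frame, one option is to run the simulation in a loop, tag each run, and stack the results before writing them out. The sketch below is an assumption-laden illustration: simulate_once is a hypothetical name, and runif() stands in for the rescale(SFMT(...)) call from the question so that the example needs only base R.

```r
# Hypothetical stand-in for the question's Simulation():
# 45 values between 0.01 and 130, each with a 15% error.
simulate_once <- function(n = 45) {
  RNC <- runif(n, min = 0.01, max = 130)  # replaces rescale(SFMT(...))
  data.frame(RNC = RNC, RNC_error = RNC * 0.15)
}

set.seed(1)  # for reproducibility of this sketch only

# Run 100 simulations and label each row with its run number.
runs <- lapply(seq_len(100), function(i) cbind(run = i, simulate_once()))
all_runs <- do.call(rbind, runs)

# Store every simulated value in one CSV file.
out_file <- file.path(tempdir(), "simulations.csv")
write.csv(all_runs, out_file, row.names = FALSE)

dim(all_runs)
```

The combined values could then be passed to a single plot_KDE() call, for example with data = all_runs[, c("RNC", "RNC_error")], inside the pdf()/dev.off() pair shown above.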

How to find byte sizes of R figures on pages?

I would like to monitor basic quality measures of the figures R produces on individual pages, such as the byte size of each page.
So far I can only do quality assurance on the average page; see the section on that below.
I think there must be something built in for this task that goes beyond average measures.
The code below produces 4 pages in Rplots.pdf. I would like to know the byte size of each page in that output; any other statistics about the page output are also welcome.
You can get basic memory monitoring per object, but I would like it to correspond to the output in the PDF.
# https://stat.ethz.ch/R-manual/R-devel/library/graphics/html/plot.html
require(stats) # for lowess, rpois, rnorm
plot(cars)
lines(lowess(cars))
plot(sin, -pi, 2*pi) # see ?plot.function
## Discrete Distribution Plot:
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10,
main = "rpois(100, lambda = 5)")
## Simple quantiles/ECDF, see ecdf() {library(stats)} for a better one:
plot(x <- sort(rnorm(47)), type = "s", main = "plot(x, type = \"s\")")
points(x, cex = .5, col = "dark red")
## TODO: summarise here the byte size of each figure (pages 1-4)
# Output: Rplots.pdf with 4 pages; I want to know the size of each page in bytes
I am currently doing the basic quality assurance in command-line but would like to move some of it to R, to observe bugs faster.
Expected output: byte size, for instance like 4th column of ls -l
Getting the average byte size of an individual page in an output document
Limitations
Requires homogeneous pages. This method only works if the pages all come from the same kind of sample.
Otherwise it is troublesome, because an average does not describe the individual pages.
Other possible weaknesses
PDF elements and metadata. The method considers the PDF file as a whole rather than the graphic objects themselves. This limits the usefulness of the absolute value, because the file size also includes headers and other metadata that are not part of the graphic objects.
Code
filename <- "main.pdf"
filesize <- file.size(filename)
# http://unix.stackexchange.com/q/331175/16920
pages <- Rpoppler::PDF_info(filename)$Pages
# print page size (= filesize / pages)
pagesize <- filesize / pages
## data of example file
num 7350960
int 62
num 118564
Input: just any 62-pages document
Output: average individual page size (118564)
Testing and's answer
Output but you cannot change the input easily to your wanted PDF-file
files size_bytes
[1,] "./test_page_size_pdf/page01.pdf" "4,123,942"
[2,] "./test_page_size_pdf/page02.pdf" " 4,971"
[3,] "./test_page_size_pdf/page03.pdf" " 4,672"
[4,] "./test_page_size_pdf/page04.pdf" " 5,370"
Input: just any 64-pages document
Expected output: 67 (= 64 + 3) pages, not 4 analysed
R: 3.3.2
OS: Debian 8.5
Download and install the pdftk utility if it is not already on your system, and then try one of the following alternatives from within R.
1) It will return a data frame with the page file sizes in bytes and other information.
myfile <- "Rplots.pdf"
system(paste("pdftk", myfile, "burst"))
file.info(Sys.glob("pg_*.pdf"))
It will also generate a file doc_data.txt with some miscellaneous information that may or may not be of interest.
1a) This alternative will not generate any files. It simply returns the sizes of the pages, in bytes as counted by wc -c, as a numeric vector.
myfile <- "Rplots.pdf"
pages <- as.numeric(read.dcf(pipe(paste("pdftk", myfile, "dump_data")))[, "NumberOfPages"])
cmds <- sprintf("pdftk %s cat %d output - | wc -c", myfile, seq_len(pages))
unname(sapply(cmds, function(cmd) scan(pipe(cmd), quiet = TRUE)))
The above should work if pdftk and wc are on your path. Note that on Windows, wc is included in the Rtools distribution and is typically found at "C:\\Rtools\\bin\\wc" once Rtools is installed.
2) This alternative is similar to (1) but uses the animation package:
library(animation)
ani.options(pdftk = "/path/to/pdftk")
pdftk("Rplots.pdf", "burst", "pg_%04d.pdf", "")
file.info(Sys.glob("pg_*.pdf"))
To measure the size of each page in a pdf-file I suggest this:
test_size <- TRUE
pdf_name <- "masterpiece"
if(test_size){
dir.create("test_page_size_pdf")
pdf_address <- paste0("./test_page_size_pdf/page%02d.pdf")
} else { pdf_address <- paste0("./", pdf_name, ".pdf")}
pdf(pdf_address, width=10, height=6, onefile=!test_size)
par(mar=c(1,1,1,1), oma=c(1,1,1,1))
plot(rnorm(10^6, 100, 5), type="l")
plot(sin, -pi, 2*pi)
plot(table(rpois(100, 5)), type = "h", col = "red", lwd = 10,
main = "rpois(100, lambda = 5)")
plot(x <- sort(rnorm(47)), type = "s", main = "plot(x, type = \"s\")")
points(x, cex = .5, col = "dark red")
dev.off()
if(test_size){
files <- paste0("./test_page_size_pdf/", list.files("./test_page_size_pdf/"))
size_bytes <- format(file.size(files), big.mark = ",")
file.remove(files)
unlink("test_page_size_pdf", recursive = TRUE) # file.remove() cannot delete a directory on every platform
cbind(files, size_bytes)
}
The size of a PDF page in R depends on three things: the content of the plot(), the options passed to pdf(), and the plotting options, which are set here with par().
All of this is difficult to estimate in advance. You also mention that you would like something similar to the shell command ls, which works on files. So in this solution I create a temporary folder with dir.create() and save every page of the PDF as a separate file in it, using the onefile option. When plotting is finished, the sizes are read, the per-page files and the temporary folder are deleted, and the result is shown in the console.
When you are finished with testing and want the result in a single file, just change the variable in the first line of the script to test_size <- FALSE. By the way, I have some doubt that the size of a page is a good proxy for the quality of an image. PDF is a vector format, so the size corresponds to the number of elements: see the size of the first page in my example, where I plot 1 million points.
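The per-page approach can also be packaged as a function that cleans up after itself. This is a sketch only: page_sizes is a hypothetical name, and it assumes that all plots are drawn by the code passed to it. It writes one PDF per page into a temporary directory, measures each file, and removes the directory on exit.

```r
# Hypothetical helper: return the byte size of each PDF page drawn by `code`.
page_sizes <- function(code, ...) {
  dir <- tempfile("pages_")
  dir.create(dir)
  on.exit(unlink(dir, recursive = TRUE), add = TRUE)  # remove temp files

  # One file per page; extra arguments (width, height, ...) go to pdf().
  pdf(file.path(dir, "page%03d.pdf"), onefile = FALSE, ...)
  dev <- dev.cur()
  tryCatch(force(code), finally = dev.off(dev))       # draw, then close

  files <- sort(list.files(dir, full.names = TRUE))
  data.frame(page = seq_along(files), bytes = file.size(files))
}

sizes <- page_sizes({
  plot(rnorm(1e4), type = "l")   # many line segments: a large page
  plot(sin, -pi, 2 * pi)         # a simple curve: a small page
}, width = 10, height = 6)

sizes
```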

knitr adds an empty figure with ssplot from seqHMM package

I have the following chunk in RStudio:
<<sumfig,dependson='data',fig.cap="Summary of sequences">>=
ssplot(smult)
@
ssplot is a function in seqHMM package which creates a frequency graph and smult is my sequence data.
When I run my code, I get two figures in my pdf: The first one is an empty white figure with label {fig:sumfig1} and the second one is the real figure with label {fig:sumfig1}. I have similar experience with other plots from this package. I also have some other graphs in my file from other packages which work just fine.
Is something wrong with the package, or am I doing something wrong?
The root of this issue seems to be seqHMM::ssplot, not knitr: even in an interactive session, ssplot generates two plots, an empty one and the actual plot.
If there is only one plot generated in the chunk with ssplot, the chunk option fig.keep = "last" can be used to disregard the first plot and show only the second (last) one.
\documentclass{article}
\begin{document}
<<echo = FALSE, message = FALSE, fig.keep = "last">>=
library(seqHMM)
# from ?ssplot
data("biofam3c")
# Creating sequence objects
child_seq <- seqdef(biofam3c$children, start = 15)
marr_seq <- seqdef(biofam3c$married, start = 15)
left_seq <- seqdef(biofam3c$left, start = 15)
## Choosing colors
attr(child_seq, "cpal") <- c("#66C2A5", "#FC8D62")
attr(marr_seq, "cpal") <- c("#AB82FF", "#E6AB02", "#E7298A")
attr(left_seq, "cpal") <- c("#A6CEE3", "#E31A1C")
# Plotting state distribution plots of observations
ssplot(list("Children" = child_seq, "Marriage" = marr_seq,
"Residence" = left_seq))
@
\end{document}
As of knitr 1.14 (the current development version, available on GitHub), you can also use fig.keep to specify which plots exactly you want to keep: fig.keep = c(1,3) will keep the first and the third plot.
