drake readd function not working for plots - r

I'm trying to troubleshoot why drake plots are not showing up with readd(); the rest of the pipeline seems to have worked, though.
I'm not sure if this is caused by minfi::densityPlot or something else; I suspect the latter, as it's also not working for the barplot function, which is base R.
In the RMarkdown report I have readd(dplot1) etc. in the chunks, but the output is NULL.
This is the code I have in my R/setup.R file:
library(drake)
library(tidyverse)
library(magrittr)
library(minfi)
library(DNAmArray)
library(methylumi)
library(RColorBrewer)
library(minfiData)
pkgconfig::set_config("drake::strings_in_dots" = "literals") # New file API
# Your custom code is a bunch of functions.
make_beta <- function(rgSet){
rgSet_betas = minfi::getBeta(rgSet)
}
make_filter <- function(rgSet){
rgSet_filtered = DNAmArray::probeFiltering(rgSet)
}
This is my R/plan.R file:
# The workflow plan data frame outlines what you are going to do
plan <- drake_plan(
baseDir = system.file("extdata", package = "minfiData"),
targets = read.metharray.sheet(baseDir),
rgSet = read.metharray.exp(targets = targets),
mSetSq = preprocessQuantile(rgSet),
detP = detectionP(rgSet),
dplot1 = densityPlot(rgSet, sampGroups=targets$Sample_Group,main="Raw", legend=FALSE),
dplot2 = densityPlot (getBeta (mSetSq), sampGroups=targets$Sample_Group, main="Normalized", legend=FALSE),
pal = RColorBrewer::brewer.pal (8,"Dark2"),
dplot3 = barplot (colMeans (detP[,1:6]), col=pal[ factor (targets$Sample_Group[1:6])], las=2, cex.names=0.8, ylab="Mean detection p-values"),
report = rmarkdown::render(
knitr_in("report.Rmd"),
output_file = file_out("report.html"),
quiet = TRUE
)
)
After using make(plan) it looks like everything ran smoothly:
config <- drake_config(plan)
vis_drake_graph(config)
I am able to use loadd() to load the objects needed for one of these plots and then make the plots, like this:
loadd(rgSet)
loadd(targets)
densityPlot(rgSet, sampGroups=targets$Sample_Group,main="Raw", legend=FALSE)
But the readd() command doesn't work?
The output in the .html for dplot3 looks weird...

Fortunately, this is expected behavior. drake targets are return values of commands, and so the value of dplot3 is supposed to be the return value of barplot(). The return value of barplot() is actually not a plot. The "Value" section of the help file (?barplot) explains the return value.
A numeric vector (or matrix, when beside = TRUE), say mp, giving the coordinates of all the bar midpoints drawn, useful for adding to the graph.
If beside is true, use colMeans(mp) for the midpoints of each group of bars, see example.
So what is going on? As with most base graphics functions, the plot from barplot() is actually a side effect. barplot() sends the plot to a graphics device and then returns something else to the user.
Have you considered ggplot2? The return value of ggplot() is actually a plot object, which is more intuitive. If you want to stick with base graphics, maybe you could save the plot to an output file.
plan <- drake_plan(
...,
dplot3 = {
pdf(file_out("dplot3.pdf"))
barplot(...)
dev.off()
}
)
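To see the side effect and the return value separately, here is a small base-R sketch with no drake involved. The toy data and the temporary file name are placeholders I made up. It draws a bar plot into a PDF device and then inspects what barplot() actually hands back.

```r
# Toy data standing in for colMeans(detP[, 1:6]); the values are made up.
heights <- c(a = 0.010, b = 0.020, c = 0.015)

# Open a PDF device so the plot (the side effect) lands in a file.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
midpoints <- barplot(heights, las = 2, ylab = "Mean detection p-values")
invisible(dev.off())

# The return value is the vector of bar midpoints, not a plot object.
is.numeric(midpoints)
length(midpoints)

# The plot itself only exists in the file written by the device.
file.exists(out_file)
```

This is why readd(dplot3) shows numbers rather than a figure: the stored target is whatever the command returned.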

Related

Suppress graph output of a function [duplicate]

I am trying to turn off the display of a plot in R.
I read Disable GUI, graphics devices in R, but the only solution given is to write the plot to a file.
What if I don't want to pollute the workspace, and what if I don't have write permission?
I tried options(device=NULL) but it didn't work.
The context is the package NbClust : I want what NbClust() returns but I do not want to display the plot it does.
Thanks in advance !
edit : Here is a reproducible example using data from the rattle package :)
data(wine, package="rattle")
df <- scale (wine[-1])
library(NbClust)
# This produces a graph output which I don't want
nc <- NbClust(df, min.nc=2, max.nc=15, method="kmeans")
# This is the plot I want ;)
barplot(table(nc$Best.n[1,]),
xlab="Number of Clusters", ylab="Number of Criteria",
main="Number of Clusters Chosen by 26 Criteria")
You can wrap the call in
pdf(file = NULL)
and
dev.off()
This sends all the output to a null file which effectively hides it.
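As a rough sketch of that wrapping, the helper below (with_null_device is a hypothetical name, not part of any package) opens a null PDF device, evaluates the expression, and closes the device again even if the expression fails. Here barplot() stands in for a plotting function such as NbClust().

```r
# Hypothetical helper: evaluate `expr` while a null PDF device is active,
# so anything it draws is discarded instead of displayed or written to disk.
with_null_device <- function(expr) {
  pdf(file = NULL)
  dev_id <- dev.cur()                    # remember which device we opened
  on.exit(dev.off(dev_id), add = TRUE)   # close it even if expr errors
  force(expr)  # the lazy argument is evaluated here, after the device opens
}

# barplot() stands in for a function that plots as a side effect.
res <- with_null_device(barplot(c(2, 5, 3)))

# The return value is still available to the caller.
res
```

Because R evaluates arguments lazily, the plotting call only runs once the null device is already open.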
Luckily it seems that NbClust is one giant messy function with some other functions in it and lots of icky looking code. The plotting is done in one of two places.
Create a copy of NbClust:
> MyNbClust = NbClust
and then edit this function. Change the header to:
MyNbClust <-
function (data, diss = "NULL", distance = "euclidean", min.nc = 2,
max.nc = 15, method = "ward", index = "all", alphaBeale = 0.1, plotetc=FALSE)
{
and then wrap the plotting code in if blocks. Around line 1588:
if(plotetc){
par(mfrow = c(1, 2))
[etc]
cat(paste(...
}
and similarly around line 1610. Save. Now use:
nc = MyNbClust(...etc....)
and you see no plots unless you add plotetc=TRUE.
Then ask the devs to include your patch.
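The same idea in miniature, assuming a made-up function (count_votes is hypothetical and much simpler than NbClust): the plotting code sits behind a flag that defaults to FALSE, so callers get the return value without any graphics unless they ask for them.

```r
# Hypothetical stand-in for a function that mixes computation and plotting.
count_votes <- function(votes, plotetc = FALSE) {
  counts <- table(votes)
  if (plotetc) {
    # Plotting only happens when explicitly requested.
    barplot(counts, xlab = "Number of Clusters", ylab = "Number of Criteria")
  }
  counts
}

votes <- c(2, 3, 3, 3, 4, 3, 2)    # made-up data
nc_counts <- count_votes(votes)    # no plot is drawn
nc_counts[["3"]]                   # number of votes for 3 clusters
```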

Error in axis(side = side, at = at, labels = labels, ...) : invalid value specified for graphical parameter "pch"

I have applied the DBSCAN algorithm to the built-in iris dataset in R, but I get an error when I try to visualise the output using plot().
Following is my code.
library(fpc)
library(dbscan)
data("iris")
head(iris,2)
data1 <- iris[,1:4]
head(data1,2)
set.seed(220)
db <- dbscan(data1,eps = 0.45,minPts = 5)
table(db$cluster,iris$Species)
plot(db,data1,main = 'DBSCAN')
Error: Error in axis(side = side, at = at, labels = labels, ...) :
invalid value specified for graphical parameter "pch"
How to rectify this error?
I have a suggestion below, but first I see two issues:
You're loading two packages, fpc and dbscan, both of which have different functions named dbscan(). This could create tricky bugs later (e.g. if you change the order in which you load the packages, different functions will be run).
It's not clear what you're trying to plot, either what the x- or y-axes should be or the type of plot. The function plot() generally takes a vector of values for the x-axis and another for the y-axis (although not always, consult ?plot), but here you're passing it a data.frame and a dbscan object, and it doesn't know how to handle it.
Here's one way of approaching it, using ggplot() to make a scatterplot, and dplyr for some convenience functions:
# load our packages
# note: only loading dbscan, not loading fpc since we're not using it
library(dbscan)
library(ggplot2)
library(dplyr)
# run dbscan::dbscan() on the first four columns of iris
db <- dbscan::dbscan(iris[,1:4],eps = 0.45,minPts = 5)
# create a new data frame by binding the derived clusters to the original data
# this keeps our input and output in the same dataframe for ease of reference
data2 <- bind_cols(iris, cluster = factor(db$cluster))
# make a table to confirm it gives the same results as the original code
table(data2$cluster, data2$Species)
# using ggplot, make a point plot with "jitter" so each point is visible
# x-axis is species, y-axis is cluster, also coloured according to cluster
ggplot(data2) +
geom_point(mapping = aes(x=Species, y = cluster, colour = cluster),
position = "jitter") +
labs(title = "DBSCAN")
Here's the image it generates:
If you're looking for something else, please be more specific about what the final plot should look like.

Save automatically produced plots in R

I'm using a function in R able to analyse my data and produce several plots.
The function is "snpzip" from adegenet package.
I would like to save automatically the three plots that the function produces as part of the output. Do you have any suggestion on how to do it?
I want to point to the fact that I know how to save a single plot, for instance with png or pdf followed by dev.off(). My problem is that when I run snpzip(snps, phen, method = "centroid"), the outcomes are three plots (which I would like to save).
I report here the same example as in the "adegenet" package:
simpop <- glSim(100, 10000, n.snp.struc = 10, grp.size = c(0.3,0.7),
LD = FALSE, alpha = 0.4, k = 4)
snps <- as.matrix(simpop)
phen <- simpop@pop
outcome <- snpzip(snps, phen, method = "centroid")
If you use a filename with a C integer format in it, then R will substitute the page number for that part of the name, generating multiple files. For example,
png("page%d.png")
plot(1)
plot(2)
plot(3)
dev.off()
will generate 3 files, page1.png, page2.png, and page3.png. For pdf(), you also need onefile=FALSE:
pdf("page%d.pdf", onefile = FALSE)
plot(1)
plot(2)
plot(3)
dev.off()
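Applied to a function that draws several plots in one call, the same pattern might look like the sketch below. draw_three is a hypothetical stand-in for snpzip(), and the output directory is a temporary one chosen for illustration.

```r
# Hypothetical stand-in for a function like snpzip() that draws several plots.
draw_three <- function() {
  plot(1:10, main = "first")
  hist(c(1, 2, 2, 3, 3, 3), main = "second")
  barplot(c(3, 1, 2), main = "third")
  invisible("result")
}

out_dir <- tempfile("plots_")
dir.create(out_dir)

# %d is replaced by the page number; onefile = FALSE gives one file per page.
pdf(file.path(out_dir, "page%d.pdf"), onefile = FALSE)
outcome <- draw_three()   # the return value is kept as usual
invisible(dev.off())

# List the files the device produced, one per plot.
saved <- sort(list.files(out_dir))
saved
```

The device has to be opened before the function is called and closed afterwards, so every plot drawn in between goes to its own page.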

Change of colors in compare.matrix command in r

I'm trying to change the colors for the compare.matrix command in r, but the error is always the same:
Error in image.default(x = mids, y = mids, z = mdata, col = c(heat.colors(10)[10:1]), :
formal argument "col" matched by multiple actual arguments
My code is very simple:
compare.matrix(current,ech_b1,nbins=40)
and some of my attempts are:
compare.matrix(current,ech_b1,nbins=40,col=c(grey.colors(5)))
compare.matrix(current,ech_b1,nbins=40,col=c(grey.colors(10)[10:1]))
Assuming you're using compare.matrix() from the SDMTools package, the color arguments appear to be hard-coded into the function, so you'll need to redefine the function in order to make them flexible:
# this shows you the code in the console
SDMTools::compare.matrix
function(x,y,nbins,...){
#---- preceding code snipped ----#
suppressWarnings(image(x=mids, y=mids, z=mdata, col=c(heat.colors(10)[10:1]),...))
#overlay contours
contour(x=mids, y=mids, z=mdata, col="black", lty="solid", add=TRUE,...)
}
So you can make a new one like so, but bummer, there are two functions using the ellipsis that have a col argument predefined. If you'll only be using extra args to image() and not to contour(), this is cheap and easy.
my.compare.matrix <- function(x,y,nbins,...){
#---- preceding code snipped ----#
suppressWarnings(image(x=mids, y=mids, z=mdata,...))
#overlay contours
contour(x=mids, y=mids, z=mdata, col="black", lty="solid", add=TRUE)
}
If, however, you want to use ... for both internal calls, then the only way I know of to avoid confusion about redundant argument names is to do something like:
my.compare.matrix <- function(x,y,nbins,
image.args = list(col=c(heat.colors(10)[10:1])),
contour.args = list(col="black", lty="solid")){
#---- preceding code snipped ----#
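# add the computed data to both argument lists (names must be quoted or use $)
contour.args$x <- contour.args$y <- image.args$x <- image.args$y <- mids
contour.args$z <- image.args$z <- mdata
contour.args$add <- TRUE # overlay the contours, as in the original function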
suppressWarnings(do.call(image, image.args))
#overlay contours
do.call(contour, contour.args)
}
Decomposing this change: instead of ... make a named list of arguments, where the previous hard codes are now defaults. You can then change these items by renaming them in the list or adding to the list. This could be more elegant on the user side, but it gets the job done. Both of the above modifications are untested, but should get you there, and this is all prefaced by my above comment. There may be some other problem that cannot be detected by SO Samaritans because you didn't specify the package or the data.
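Here is a self-contained sketch of the named-list idea using only base R. plot_matrix is a hypothetical function, not SDMTools code, and it uses the built-in volcano matrix in place of the binned data. It merges user-supplied lists over the defaults with modifyList(), which is one way to let callers override a single item without retyping the rest.

```r
# Hypothetical function: separate argument lists for image() and contour().
plot_matrix <- function(z, image.args = list(), contour.args = list()) {
  x <- seq_len(nrow(z))
  y <- seq_len(ncol(z))
  # Former hard-coded values become defaults that the caller can override.
  image.args <- utils::modifyList(list(col = rev(heat.colors(10))), image.args)
  contour.args <- utils::modifyList(list(col = "black", lty = "solid"),
                                    contour.args)
  do.call(image, c(list(x = x, y = y, z = z), image.args))
  # Overlay contours on the image.
  do.call(contour, c(list(x = x, y = y, z = z, add = TRUE), contour.args))
  invisible(list(image = image.args, contour = contour.args))
}

pdf(file = NULL)   # discard the drawing; only the merged arguments matter here
used <- plot_matrix(volcano, image.args = list(col = grey.colors(5)))
invisible(dev.off())

used$image$col     # the grey palette replaced the heat colours
used$contour$col   # the contour default is untouched
```

Each internal call receives its own col, so the "matched by multiple actual arguments" clash cannot occur.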

Multiple plots with high-level plotting functions, especially plot.rqs()

I am trying to plot two regression summaries side-by-side with one centered title. Each regression summary is generated by plot.rqs() and amounts to a set of 9 plots.
I've tried using par(mfrow=c(1,2)) already, but as I learnt from Paul Murrell's (2006) book, high-level functions like plot.rqs() or pairs() save the graphics state before drawing and then restore the graphics state once completed, so that pre-emptive calls to par() or layout() can't help me. plot.rqs() doesn't have a 'panel' function either.
It seems that the only way to achieve the result is to modify the plot.rqs() function to get a new function, say modified.plot.rqs(), and then run
par(mfrow=c(1,2))
modified.plot.rqs(summary(fit1))
modified.plot.rqs(summary(fit2))
par(mfrow=c(1,1))
From there I might be able to work out how to add an overall title to the image using layout(). Does anyone know how to create a modified.plot.rqs() function that could be used in this way?
Thanks
You can patch a function as follows: use dput and capture.output to retrieve the code of the function as a string; change it as you want (here, I just replace each call to par that starts a line with a call to a function that does nothing); finally, evaluate the result to produce a new function.
library(quantreg)
a <- capture.output(dput(plot.summary.rqs))
b <- gsub("^\\s*par\\(", "nop(", a)
nop <- function(...) {}
my.plot.summary.rqs <- eval(parse(text=b))
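The same patching steps can be tried without quantreg on a toy function. In this sketch draw_panel is a hypothetical function that resets the layout itself, much as the real plotting method does; after patching, the caller's layout survives the call.

```r
# Hypothetical function that overrides the caller's layout.
draw_panel <- function() {
  par(mfrow = c(1, 1))
  plot(1:5)
}

nop <- function(...) invisible(NULL)   # replacement that ignores its arguments

src <- capture.output(dput(draw_panel))          # function code as text lines
patched_src <- gsub("^\\s*par\\(", "nop(", src)  # swap line-leading par( calls
patched_draw_panel <- eval(parse(text = patched_src))

pdf(file = NULL)           # discard the drawing; we only inspect the layout
par(mfrow = c(1, 2))       # layout chosen by the caller
patched_draw_panel()       # no longer resets the layout
layout_after <- par("mfrow")
invisible(dev.off())

layout_after
```

Note that the regular expression only catches par( at the start of a line, so a call nested inside another expression would be left alone.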
First we generate an example object, fm. Then we copy plot.rqs and use trace on the copy to insert par <- list at the top, which effectively nullifies any use of par within the function. Then we do the same with plot.summary.rqs. Finally, we test it out with our own par:
library(quantreg)
example(plot.rqs) # fm to use in example
# plot.rqs
plot.rqs <- quantreg::plot.rqs
trace("plot.rqs", quote(par <- list), print = FALSE)
# plot.summary.rqs
plot.summary.rqs <- quantreg::plot.summary.rqs
trace("plot.summary.rqs", quote(par <- list), print = FALSE)
# test it out
op <- par(mfrow = c(2, 2))
plot(summary(fm))
plot(fm)
title("My Plots", outer = TRUE, line = -1)
par(op)
EDIT: added plot.summary.rqs.
