Use of recordPlot() and replayPlot() in Parallel in R to save plot in the same PDF - r

I would like to plot data in parallel using foreach in R, but I haven't found a way to get all my plots into the same PDF file. I thought of using recordPlot() to save my plots in a list and then replaying them on a pdf device, but it doesn't work.
I get the following error:
Error in replayPlot(x) : loading snapshot from a different session
I also tried ggplot, but it is too slow with my large dataset.
Here is a piece of code showing my problem:
# Creating a dataframe : df
df=as.data.frame(matrix(nrow=1, ncol=10))
df=apply(df, 2, function(x) runif(100))
# Plotting function
par.plot=function(dat){
plot(dat)
p=recordPlot()
return(p)}
#Applying the function in parallel
library("parallel")
library("foreach")
library("doParallel")
cl <- makeCluster(detectCores())
registerDoParallel(cl, cores = detectCores())
plot.lst = foreach(i = 1:nrow(df)) %dopar% {
par.plot(df[i,])
}
# Trying to get 1st plot
plot.lst[[1]]
Error in replayPlot(x) : loading snapshot from a different session
Replacing %dopar% with %do% works when I try to get my plots, because they seem to have been generated in the same environment.
I know I can open a pdf device inside the loop to generate one file per iteration, but I would like to know whether there is a way to get a single file for all my plots from the output of my function.
Or do you know an easy way to merge my PDF files afterwards?
Thanks for your help.
Charles

In my opinion your question can be divided into two distinct parts:
1. Using the replayPlot function with %dopar% without getting the weird error
2. Somehow getting one file at the end
The first question is easy to answer. You get this error because R remembers where (at the OS level) the plots were generated. You can get the same effect by using RStudio Server and trying to replay some of the recorded plots a couple of hours after closing the browser tab. In brief, the issue is that R remembers the PID of the process that generated the plot (I don't know why, though!):
# generate a plot
plot(iris[, 1:2])
# record the plot
myplot <- recordPlot()
# check the PID
attr(x = myplot, which = "pid")
The good thing is that you can overwrite this by assigning your current PID:
attr(x = myplot, which = "pid") <- Sys.getpid()
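So you only need to change the last line of your code to the following:
# close any devices that are still open, then open the target PDF
graphics.off()
pdf(file = "plot.lst.pdf")
lapply(plot.lst, function(x){
# make the snapshot look as if it came from the current process
attr(x = x, which = "pid") <- Sys.getpid()
replayPlot(x)})
# close the PDF device so the file is written out
graphics.off()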
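As a minimal, same-session sketch of the record-then-replay pattern (it does not exercise the cross-process case, so no PID rewriting is needed here): the plots are recorded on a null PDF device with the display list switched on, which is an assumption about how one might record without a screen device, and then replayed one per page into a single file. The names `plots`, `same_pid`, and `out_file` are made up for this illustration.

```r
# Record three base-graphics plots without showing them on screen.
# pdf(NULL) creates no file; the display list must be enabled explicitly
# because file devices do not record plots by default.
pdf(NULL)
dev.control(displaylist = "enable")
plots <- lapply(1:3, function(i) {
  plot(runif(100), main = paste("Plot", i))
  recordPlot()
})
invisible(dev.off())

# Each snapshot stores the PID of the process that recorded it.
same_pid <- vapply(
  plots,
  function(p) identical(attr(p, "pid"), Sys.getpid()),
  logical(1)
)

# Replay every snapshot into one PDF; each replay starts a new page.
out_file <- file.path(tempdir(), "all_plots.pdf")
pdf(out_file)
for (p in plots) replayPlot(p)
invisible(dev.off())
```

Because everything happens in one process, `same_pid` is all TRUE here; the PID mismatch only appears when the snapshots come back from worker processes.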
The part above entirely solves your problem, but in case you are interested in merging PDF files, follow this discussion:
Merging existing PDF files using R

Related

R autoplot works when running line-by-line but not when source-ing R script

I'm observing a very strange behaviour with R.
The following code works when I type it in line-by-line into an instance of R run from my terminal. (OS is Debian Linux.)
However it does not work when I try and run source("script.R").
It also does not work from within R Studio.
Specifically, it fails to produce graphical output with autoplot. Writing to pdf file does not work, and if I remove the pdf() and dev.off() lines, no window containing the figure is opened.
Here's a copy of my script...
library(lubridate)
library(ggplot2)
library(matrixStats)
library(forecast)
df_input <- read.csv("postprocessed.csv")
x <- df_input$time
y <- df_input$value
df <- data.frame(x, y)
x <- df$x
y <- df$y
holtmodel <- holt(y)
pdf("autoplot.pdf")
autoplot(holtmodel)
dev.off()
And for convenience, here's a datafile.
"","time","value"
"1",1,2.61066016308988
"2",2,3.41246054742996
"3",3,3.8608767964033
"4",4,4.28686048552237
"5",5,4.4923132964825
"6",6,4.50557049744317
"7",7,4.50944447661246
"8",8,4.51097373134893
"9",9,4.48788748823809
"10",10,4.34603985656981
"11",11,4.28677073671406
"12",12,4.20065901625172
"13",13,4.02514194962519
"14",14,3.91360194972916
"15",15,3.85865748409081
"16",16,3.81318053258601
"17",17,3.70380706527433
"18",18,3.61552922363713
"19",19,3.61405310598722
"20",20,3.64591327503384
"21",21,3.70234435835577
"22",22,3.73503970503372
"23",23,3.81003078640584
"24",24,3.88201196162666
"25",25,3.89872518158949
"26",26,3.97432743542362
"27",27,4.2523675144599
"28",28,4.34654855854847
"29",29,4.49276038902684
"30",30,4.67830892029687
"31",31,4.91896819673664
"32",32,5.04350767355202
"33",33,5.09073406942046
"34",34,5.18510849382162
"35",35,5.18353176529036
"36",36,5.2210776270173
"37",37,5.22643491929207
"38",38,5.11137006553725
"39",39,5.01052467981257
"40",40,5.0361056705898
"41",41,5.18149486951409
"42",42,5.36334869132276
"43",43,5.43053620818444
"44",44,5.60001072279525
Pretty confused because it seems like a trivial script!
Change it to:
print(autoplot(holtmodel))
When you step through code, you get an implicit print(...) on each line. When you source() a script, you don't. ggplot (and others!) use print() to trigger their plotting (so that you can conveniently build up a plot step by step without having to wait for flickering figures).
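The difference can be seen without ggplot2 at all. The sketch below uses a made-up class, `demo_plot`, whose print method only counts how often it is called; it stands in for a plot object and is not part of any package. A script containing the bare object is sourced first, then a script containing an explicit print() call.

```r
# Hypothetical stand-in for a plot object: printing it counts as "drawing" it.
state <- new.env()
state$draws <- 0
print.demo_plot <- function(x, ...) {
  state$draws <- state$draws + 1
  invisible(x)
}
p <- structure(list(), class = "demo_plot")

script <- tempfile(fileext = ".R")

# A bare expression is auto-printed at the console, but not under source().
writeLines("p", script)
source(script, local = TRUE)
draws_bare <- state$draws       # still 0: nothing was drawn

# An explicit print() call works everywhere.
writeLines("print(p)", script)
source(script, local = TRUE)
draws_explicit <- state$draws   # now 1
```

Calling source() with print.eval = TRUE is another way to get the console behaviour back for a whole script.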

How can I View the output of this animation code in Rstudio

This R code creates an animated plot. It runs without error, but I have not been able to view the result. It is supposed to save its output to a PDF file; I can see the file but am unable to open it. I got the code from "How do I transfer output of animation R package on a beamer frame"
because I want to learn how to include an R animated plot in LaTeX, and I was given this as an example. Can you show me how to view its output, either in RStudio or in the file the code saves it to? If the output is meant to be viewed in the PDF it is saved to, how do I do that? I am using Acrobat Reader DC.
brownianMotion <- function(n=10,xlim=c(-20,20),ylim=c(-20,20),steps=50)
{
x=rnorm(n)
y=rnorm(n)
for (i in 1:steps) {
plot(x,y,xlim = xlim,ylim = ylim)
text(x,y)
# iterate over particles
for(k in 1:n){
walk=rnorm(2); # random move of particle
x[k]=x[k]+walk[1] # new position
y[k]=y[k]+walk[2]
# simple model for preventing a particle from moving past the limits
if(x[k]<xlim[1]) x[k]=xlim[1]
if(x[k]>xlim[2]) x[k]=xlim[2]
if(y[k]<ylim[1]) y[k]=ylim[1]
if(y[k]>ylim[2]) y[k]=ylim[2]
}
}
}
pdf("frames.pdf") # output device and file name
par(xaxs="i", yaxs="i", pty="s") # square plot region
par(mai=c(0.9,0.9,0.2,0.2)) # plot margins
brownianMotion(n=20, steps=400) # 20 particles, 400 time steps
There are two things here:
1. You need to add dev.off() after plotting so that the output device is closed and the file is written out completely. Without it, frames.pdf is left unfinished, which is why the viewer cannot open it.
2. The loop over steps draws every frame on the same device, so all frames end up as pages of the single file frames.pdf rather than as separate files. Following this tutorial, you should instead write separate PDF files to an output folder, then animate them within LaTeX.
brownianMotion <- function(n=10,xlim=c(-20,20),ylim=c(-20,20),steps=50){
x=rnorm(n)
y=rnorm(n)
for (i in 1:steps) {
pdf(paste0("out/frames", i, ".pdf")) # save frames{i}.pdf to 'out' folder
plot(x,y,xlim = xlim,ylim = ylim)
text(x,y)
dev.off() # Adding dev.off()
...
}
}
par(xaxs="i", yaxs="i", pty="s") # square plot region
par(mai=c(0.9,0.9,0.2,0.2)) # plot margins
if (!dir.exists("out")) dir.create("out") # create 'out' folder if it doesn't exist
brownianMotion(n=20, steps=4) # 20 particles, 4 steps
The out folder will be located where your working directory is (use getwd() to see it).
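For reference, here is one way the per-frame approach could look as a complete script. It is a sketch under a few assumptions: the function name `brownian_frames` and its `out_dir` argument are invented for this example, the par() settings are applied after each device is opened so that they affect the PDF being written, and the particle update uses a vectorized pmin()/pmax() clamp in place of the original inner loop.

```r
# Write one PDF per time step and return the file paths.
brownian_frames <- function(out_dir, n = 10, steps = 4,
                            xlim = c(-20, 20), ylim = c(-20, 20)) {
  if (!dir.exists(out_dir)) dir.create(out_dir, recursive = TRUE)
  x <- rnorm(n)
  y <- rnorm(n)
  files <- character(steps)
  for (i in seq_len(steps)) {
    files[i] <- file.path(out_dir, paste0("frames", i, ".pdf"))
    pdf(files[i])                       # open a device for this frame
    par(xaxs = "i", yaxs = "i", pty = "s", mai = c(0.9, 0.9, 0.2, 0.2))
    plot(x, y, xlim = xlim, ylim = ylim)
    text(x, y)                          # label each particle with its index
    dev.off()                           # close the device so the file is complete
    # random step for every particle, clamped to the plot limits
    x <- pmin(pmax(x + rnorm(n), xlim[1]), xlim[2])
    y <- pmin(pmax(y + rnorm(n), ylim[1]), ylim[2])
  }
  files
}

frame_files <- brownian_frames(file.path(tempdir(), "out"), n = 20, steps = 4)
```

The example writes to a temporary folder; pass "out" instead to get the folder next to your working directory.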

R script that should plot xts class data to a png file doesn't produce a png file

I had a working script that used a loop to produce many png file plots of xts class data. Now the script throws an error and if I comment out the line throwing the error (a call to abline() ) then the script executes but without producing a png file. The issue seems to involve plotting xts class data and/or a loop or script.
Searching on stackoverflow didn't provide a solution or reference to this issue. I've reproduced the issue in the following example. In practice, the script would use different filenames within the loop and non-trivial data.
# put following code in 'myscript.R' and execute using source('myscript.R',print.eval=TRUE) or source('myscript.R')
# xts class data
data <- xts(seq(1:10),order.by=as.Date(seq(1:10)))
# a non xts version of same data
#data <- seq(1:10)
for(i in 1:1) {
filename <- 'myfile.png'
png(filename)
plot(data)
lines( (data-1),col='red')
abline(h=1)
dev.off()
}
# The call to abline in above script with xts class data gives error 'plot.new has not been called yet'
# If comment out the call to abline it completes but doesn't produce a png file
# script works fine with abline for non xts data
With xts >= 0.10.1, this saves the plot you want to the file:
for(i in 1:1) {
filename <- 'myfile.png'
png(filename)
plot(data)
print(lines( (data-1),col='red', on = 1))
print(lines(xts(x = rep(1, NROW(data)), order.by = index(data)),col='green', on= 1) )
dev.off()
}
Use the print calls for the extra lines. I'd also use lines for the horizontal line instead of abline, as this is more consistent with plotting with xts.
Also, your error can be avoided if you do print(abline(h=1)).

R plot loop problems with last image

I've created a script that fits several regression models. I use a triple loop to save the model results in a list, so that I can then call whatever I need for plotting etc. I then created another three loops to plot my data. Everything seems to work, but the last iteration of the loop that creates the PDF files hangs and the file is corrupted. I can, of course, add dummy data in order to get the plot that I need, but I cannot understand what is going wrong.
I've tried all the options for graphics.off() and dev.off(), but it seems I'm getting something wrong. My R version is 3.3.2. Any help appreciated.
# Plot DSC thermograms of isothermal crystallization
for (intK in 1:nrow(sample_levels)) #the levels of my sample
{ for (intJ in 1:nrow(conc_levels)) #concentration levels of my samples
{
plot(0,0,type='n', xlim=c(0,lim_Max_Time) ,ylim=c(0,lim_Max_exo_up),xlab=expression(paste("Time(s)")),ylab=expression(paste("Heat flow (J/g) -exo up")) ) #create null plot
for (intL in 1:nrow(Temperatures_levels))
{
if (!is.null(matrix_Avrami[[intK,intJ,intL]] ))
{
data_plot <-matrix_Avrami[[intK,intJ,intL]] #recall my data from previous part in the script
Time_p=data_plot[5] #choose the x I need
Time_p<-as.matrix(Time_p) #to avoid Error in xy.coords(x, y) : 'x' and 'y' lengths differ
Heat_flow_exo_up<-data_plot[4] #my y
Heat_flow_exo_up<-as.matrix(Heat_flow_exo_up) #same as before, to avoid the error
points(Time_p,Heat_flow_exo_up,pch=intL) #create correctly the plot I need
}
}
title(main=paste("Conc",as.character(conc_levels[intJ,]),"% GO",as.character(sample_levels[intK,]), sep = " " ) )
legend ("topright", paste(as.character(Temperatures_levels[,]),"°C",sep = ""),pch=1:nrow(Temperatures_levels))
mypath <- file.path("C:","R","SAVEHERE",paste("Heat_flow_vs_Time", as.character(intK),as.character(intJ),".pdf", sep = ""))
pdf(file=mypath)
}
dev.off()
} #the last plot of the loop correctly visualized in my console
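One thing that stands out in the loop above is the order of the device calls: pdf() is opened only after the plot has been drawn, and dev.off() runs once per outer iteration rather than once per file. A device has to be open before anything is drawn on it, and closed before the next file is started. The sketch below shows that order with made-up data; `groups`, `out_dir`, and the plotted values are hypothetical stand-ins, not the poster's objects.

```r
out_dir <- file.path(tempdir(), "loop_plots")
dir.create(out_dir, showWarnings = FALSE)

groups <- c("A", "B", "C")   # hypothetical stand-in for the sample/concentration levels
paths <- character(0)

for (g in groups) {
  path <- file.path(out_dir, paste0("Heat_flow_vs_Time_", g, ".pdf"))
  pdf(file = path)           # 1. open the device first
  plot(0, 0, type = "n", xlim = c(0, 10), ylim = c(0, 1),
       xlab = "Time (s)", ylab = "Heat flow (J/g)")
  points(1:10, runif(10), pch = 1)    # 2. draw everything for this file
  title(main = paste("Group", g))
  dev.off()                  # 3. close it before the next iteration
  paths <- c(paths, path)
}
```

With this order every iteration produces one complete file, including the last one.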

R programming - Graphic edges too large error while using clustering.plot in EMA package

I'm an R programming beginner and I'm trying to use the clustering.plot method available in the R package EMA. My clustering works fine and I can see the results populated as well. However, when I try to generate a heat map using clustering.plot, it gives me an error "Error in plot.new (): graphic edges too large". My code is below:
#Loading library
library(EMA)
library(colonCA)
#Some information about the data
data(colonCA)
summary(colonCA)
class(colonCA) #Expression set
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
#Applying average linkage clustering on colonCA data using Pearson correlation
expr_genes <- genes.selection(expr_mat, thres.num=100)
expr_sample <- clustering(expr_mat[expr_genes,],metric = "pearson",method = "average")
expr_gene <- clustering(data = t(expr_mat[expr_genes,]),metric = "pearson",method = "average")
expr_clust <- clustering.plot(tree = expr_sample,tree.sup=expr_gene,data=expr_mat[expr_genes,],title = "Heat map of clustering",trim.heatmap =1)
I do not get any error when it comes to actually executing the clustering process. Could someone help?
In your example, some of the rownames of expr_mat are very long (max(nchar(rownames(expr_mat))) = 271 characters). The clustering.plot function tries to make a margin large enough for all the names, but because the names are so long, there isn't room for anything else.
The really long names seem to have long stretches of periods in them. One way to condense the names of these genes is to replace runs of 2 or more periods with just one, so I would add in this line
#Extract expression matrix from colonCA
expr_mat <- exprs(colonCA)
rownames(expr_mat)<-gsub("\\.{2,}","\\.", rownames(expr_mat))
Then you can run all the other commands and plot like normal.
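To see what the substitution does, here is a small sketch on made-up names (the vector `long_names` is invented for illustration and is not taken from colonCA). It uses a plain "." as the replacement, which should be equivalent to the escaped form above.

```r
# Hypothetical row names with long runs of periods
long_names <- c("H.sapiens.mRNA......for.protein....X", "short.name")

# Collapse every run of two or more periods into a single period
short_names <- gsub("\\.{2,}", ".", long_names)

# Compare the lengths before and after
nchar(long_names)
nchar(short_names)
```

Names that contain only single periods, such as the second one, are left unchanged.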
