R package or technique to save progress in a parallel computing job

I am trying to schedule the following R script (exposure_train.R):
.libPaths("/gscratch/csde/mienkoja/rpackages")
library(Rmpi)
library(missForest)
library(snow)
library(doSNOW)
# load the data prepared above
load("/home/mienkoja/pse_rodis/dat_clean.rds")
# get rid of garbage from memory
gc()
# read node information from the system environment
nodefile <- Sys.getenv("PBS_NODEFILE")
# assign node information to a nodes object
nodes <- readLines(nodefile)
# create a cluster
cl <- makeMPIcluster(length(nodes), includemaster=TRUE)
# register the cluster
registerDoSNOW(cl)
rf_random <- missForest(dat_clean, maxiter = 10, ntree = 100, parallelize = "forests")
stopCluster(cl)
save(rf_random, file = "/home/mienkoja/pse_rodis/rf_random_out.rds")
I am working on a cluster with several hundred nodes, each of which typically has 16 processor cores and 128GB of memory. All the nodes run CentOS 6 Linux, and they are tied together by the Moab cluster software. I am scheduling the job with a TORQUE scheduler using the following bash file:
#!/bin/bash
### User specs
## Name the job "hyak_train"
#PBS -N hyak_train
## Request 16 CPUs (cores) on 2 nodes, 32 total cores
## If the job doesn't finish in 1 day, cancel it
#PBS -l nodes=2:ppn=16,pmem=2gb,feature=16core,walltime=24:00:00
## Put the output from jobs into the below directory
#PBS -o /gscratch/csde/mienkoja/exposure_train_outn_2p16
## Put both the stderr and stdout into a single file
#PBS -k oe
#PBS -j oe
#PBS -d /home/mienkoja/pse_rodis
## Send an email when the job is aborted, begins, or terminates
#PBS -m abe
### Standard specs
HYAK_NPE=$(wc -l < $PBS_NODEFILE)
HYAK_NNODES=$(uniq $PBS_NODEFILE | wc -l )
HYAK_TPN=$((HYAK_NPE/HYAK_NNODES))
NODEMEM=`grep MemTotal /proc/meminfo | awk '{print $2}'`
NODEFREE=$((NODEMEM-2097152))
MEMPERTASK=$((NODEFREE/HYAK_TPN))
ulimit -v $MEMPERTASK
export MX_RCACHE=0
### Modules
module load r_3.2.4
module load icc_14.0.3-ompi_1.8.3
### App
Rscript exposure_train.R
I am getting the following output using the script above:
Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0729:14816] [[43070,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 2822635520
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0829:32101] [[43995,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 2883256320
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0829:32615] [[43481,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 2849570816
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0768:11458] [[54933,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 3600089088
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...done!
missForest iteration 2 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
....
This output repeats itself several times, but the point is that I get 22 successful iterations (i.e. missForest iteration 1 in progress...done!) over the course of the day that I have the job scheduled.
The problem is that I am scheduling the job in a backfill queue and the job gets preempted (at least) once every 4 hours. This means that I never get beyond the second iteration (i.e. missForest iteration 2 in progress...) because the job starts over every 4 hours.
While not all iterations are created equal, I only want maxiter = 10 (see the R script above), so one day seems to be enough time to run the job, provided I can build some sort of checkpointing scheme (saving my progress) into the R script.
The source of the missForest function is available in the package repo. The relevant foreach loop (based on my parameterization of parallelize = "forests") is pasted below.
foreach(xntree=idiv(ntree, chunks=getDoParWorkers()),
        .combine='combine', .multicombine=TRUE,
        .packages='randomForest') %dopar% {
  randomForest(
    x = obsX,
    y = obsY,
    ntree = xntree,
    mtry = mtry,
    replace = replace,
    classwt = if (!is.null(classwt)) classwt[[varInd]] else
      rep(1, nlevels(obsY)),
    cutoff = if (!is.null(cutoff)) cutoff[[varInd]] else
      rep(1/nlevels(obsY), nlevels(obsY)),
    strata = if (!is.null(strata)) strata[[varInd]] else obsY,
    sampsize = if (!is.null(sampsize)) sampsize[[varInd]] else
      if (replace) nrow(obsX) else ceiling(0.632*nrow(obsX)),
    nodesize = if (!is.null(nodesize)) nodesize[2] else 5,
    maxnodes = if (!is.null(maxnodes)) maxnodes else NULL)
}
It seems like the solution is just to re-write the missForest function and write the relevant information from each iteration to an appropriately indexed file so that I can pick up where I left off on the next preemption.
QUESTION: Are there any R packages/techniques available to simplify the implementation of a checkpointing scheme like this?
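A minimal sketch of the kind of scheme described above, using only base R (saveRDS, readRDS, file.rename). It does not touch missForest itself: run_with_checkpoint and step_fun are hypothetical names chosen for illustration, and the toy step function stands in for one imputation iteration. The state is written to a temporary file and then renamed, on the assumption that the rename replaces the old checkpoint in one step on the Linux file system in use.

```r
# Hypothetical helper (not part of missForest or any package): runs an
# iterative computation and saves its state after every iteration.
run_with_checkpoint <- function(step_fun, init, max_iter, ckpt_file) {
  # Resume from the last saved state if a checkpoint already exists
  if (file.exists(ckpt_file)) {
    state <- readRDS(ckpt_file)
  } else {
    state <- list(iter = 0L, value = init)
  }
  while (state$iter < max_iter) {
    state$value <- step_fun(state$value)
    state$iter <- state$iter + 1L
    # Write to a temporary file first, then rename it, so that a preemption
    # in the middle of a write does not damage the previous checkpoint
    tmp <- paste0(ckpt_file, ".tmp")
    saveRDS(state, tmp)
    file.rename(tmp, ckpt_file)
  }
  state
}

# Toy usage: each "iteration" just adds 1 to the running value
ckpt <- file.path(tempdir(), "toy_checkpoint.rds")
unlink(ckpt)
partial <- run_with_checkpoint(function(v) v + 1, init = 0,
                               max_iter = 3, ckpt_file = ckpt)
# A later call (for example after the job is restarted) continues from
# iteration 3 instead of starting over
final <- run_with_checkpoint(function(v) v + 1, init = 0,
                             max_iter = 10, ckpt_file = ckpt)
```

Applying this to missForest would still require a modified copy of the function that exposes its per-iteration state (the current imputed data and iteration counter), which is the rewrite mentioned above.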

Related

Checkpoint package R not finding dependency

I'm trying to set up a checkpoint using the checkpoint package.
I'm doing a simple example in which I only load tidyverse and I run into an error for missing dependencies.
library(checkpoint)
#> Warning: package 'checkpoint' was built under R version 3.6.3
#>
#> checkpoint: Part of the Reproducible R Toolkit from Microsoft
#> https://mran.microsoft.com/documents/rro/reproducibility/
checkpoint(snapshotDate = "2020-06-12", forceInstall = TRUE)
#> Scanning for packages used in this project
#> No file at path 'C:\...'.
#> - Discovered 3 packages
#> Unable to parse 1 files:
#> - reprex_reprex.spin.Rmd
#> Removing packages to force re-install
#> Installing packages used in this project
#> - Installing 'tidyverse'
#> tidyverse
#> also installing the dependencies 'tibble', 'broom', 'dbplyr', 'forcats', 'haven', 'modelr'
#> - Installing 'knitr'
#> knitr
#> checkpoint process complete
#> ---
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.6.3
#> Error: package or namespace load failed for 'tidyverse' in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
#> namespace 'rlang' 0.4.4 is already loaded, but >= 0.4.6 is required
Created on 2020-06-13 by the reprex package (v0.3.0)
According to the documentation, I'd expect the dependencies to install automatically, but this seems not to be the case. If I update the rlang package using install.packages, everything works fine:
library(checkpoint)
#> Warning: package 'checkpoint' was built under R version 3.6.3
#>
#> checkpoint: Part of the Reproducible R Toolkit from Microsoft
#> https://mran.microsoft.com/documents/rro/reproducibility/
checkpoint(snapshotDate = "2020-06-12", forceInstall = TRUE, auto.install.knitr = FALSE)
#> Scanning for packages used in this project
#> No file at path 'C:\...'.
#> - Discovered 2 packages
#> Unable to parse 1 files:
#> - reprex_reprex.spin.Rmd
#> Removing packages to force re-install
#> Installing packages used in this project
#> - Installing 'tidyverse'
#> tidyverse
#> also installing the dependencies 'tibble', 'broom', 'dbplyr', 'forcats', 'haven', 'modelr'
#> checkpoint process complete
#> ---
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.6.3
#> Warning: package 'ggplot2' was built under R version 3.6.2
#> Warning: package 'tibble' was built under R version 3.6.3
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'readr' was built under R version 3.6.2
#> Warning: package 'purrr' was built under R version 3.6.2
#> Warning: package 'dplyr' was built under R version 3.6.3
#> Warning: package 'stringr' was built under R version 3.6.2
#> Warning: package 'forcats' was built under R version 3.6.3
Created on 2020-06-13 by the reprex package (v0.3.0)
I believe it might be related to having rlang loaded via a namespace (and not attached), but I thought checkpoint isolated these issues.
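The difference between a namespace that is loaded and a package that is attached can be inspected with base R alone. The sketch below uses the tools package (which ships with R) purely as a stand-in for rlang; whether an already loaded rlang 0.4.4 is really what blocks the newer version here is an assumption, not something the output above confirms.

```r
# 'tools' stands in for rlang here, for illustration only
requireNamespace("tools", quietly = TRUE)       # loads the namespace, does not attach it
ns_loaded    <- isNamespaceLoaded("tools")      # TRUE once the namespace is in memory
pkg_attached <- "package:tools" %in% search()   # TRUE only after library(tools)
loaded_version <- getNamespaceVersion("tools")  # version of the copy currently loaded

cat("loaded:", ns_loaded, " attached:", pkg_attached,
    " version:", loaded_version, "\n")
```

A namespace that is already loaded stays at the version that was loaded, even if a newer version is later installed into a library on .libPaths(), until it is unloaded or the session is restarted.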

Running R function jags on cluster

I am trying to run an R program on a cluster. In the R program, the jags function is called from the R2jags package. If I don't use the cluster and simply use R, the program works fine. However, when I submit a job, I get the following error. If I use the cluster but don't call the jags function, it works just fine.
Loading required package: rjags
Loading required package: coda
Linked to JAGS 4.0.0
Loaded modules: basemod,bugs
Attaching package: ‘R2jags’
The following object is masked from ‘package:coda’:
traceplot
*** caught illegal operation ***
address 0x7fe566be8917, cause 'illegal operand'
Traceback:
1: dyn.load(file)
2: load.module(jags.module[m])
3: jags(model.file = "myfile.txt", data = model.data, inits = model.initial.values, parameters = model.parameters, n.chains = 1, n.iter = 500, n.burnin = 5, n.thin = 5)
An irrecoverable exception occurred. R is aborting now ...
line 15: 34161 Illegal instruction (core dumped) Rscript test1.R

'Error: could not find function runmean' from an installed package: caTools?

I installed the 'caTools' R package from the command line:
$ R
> install.packages("caTools", lib="~/R/library")
Then I ran this command:
INPUT=/home/user/file.bam
OUTPUT=/home/user/file_cor.bam
Rscript run_spp_nodups.R -c=$INPUT -savp -out=$OUTPUT
And got the error:
Error: could not find function "runmean"
Execution halted
The function 'runmean' belongs to the package I installed, 'caTools'.
The R version is appropriate, as R in my machine is version 3.3.2 and 'caTools' depends on R (≥ 2.2.0).
The R code of 'run_spp_nodups.R' is too big to paste here, so I show only the part with runmean:
# Smooth the cross-correlation curve if required
cc <- crosscorr$cross.correlation
crosscorr$min.cc <- crosscorr$cross.correlation[ length(crosscorr$cross.correlation$y) , ] # minimum value and shift of cross-correlation
cat("Minimum cross-correlation value", crosscorr$min.cc$y,"\n",file=stdout())
cat("Minimum cross-correlation shift", crosscorr$min.cc$x,"\n",file=stdout())
sbw <- 2*floor(ceiling(5/iparams$sep.range[2]) / 2) + 1 # smoothing bandwidth
cc$y <- runmean(cc$y,sbw,alg="fast")
What's happening, and how can I solve it?

Error of prcomp in R

I am running a microarray data analysis,
raw_data = read.celfiles(....... )
exp_raw <- log2(exprs(raw_data))
PCA_raw <- prcomp(t(exp_raw), scale = FALSE)
and I got
Error in La.svd(x, nu, nv) : LAPACK routines cannot be loaded
Besides: Warning message:
In La.svd(x, nu, nv) :
unable to load share-object'/Library/Frameworks/R.framework/Resources/modules//lapack.so' : `maximal number of DLLs reached...
These are the packages I loaded:
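# library() attaches one package per call, so loop over the package names
pkgs <- c("Biobase", "oligoClasses", "knitr", "BiocStyle", "oligo", "geneplotter", "ggplot2", "dplyr", "LSD", "gplots", "RColorBrewer", "ArrayExpress", "arrayQualityMetrics", "stringr", "matrixStats", "topGO", "genefilter", "pd.hugene.1.0.st.v1", "hugene10sttranscriptcluster.db", "pheatmap", "mvtnorm", "DAAG", "multcomp", "limma", "ReactomePA", "clusterProfiler", "openxlsx", "devtools", "biomaRt", "EnrichmentBrowser")
invisible(lapply(pkgs, library, character.only = TRUE))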
My session info:
 setting  value
 version  R version 3.4.1 (2017-06-30)
 system   x86_64, darwin15.6.0
 ui       RStudio (1.0.153)
 language (EN)
 collate  zh_TW.UTF-8
 date     2017-10-03
Can someone tell me how to fix this?
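The "maximal number of DLLs reached" message refers to the number of shared libraries loaded into the R session. A small sketch, using only base R, that reports how many are currently loaded; in R 3.4.x the limit can, as far as I know, be raised through the R_MAX_NUM_DLLS environment variable, which has to be set before R starts (for example in .Renviron), so treat that part as an assumption to check against the dyn.load help page for the installed R version.

```r
# Count the shared libraries (DLLs) loaded in the current session
dlls   <- getLoadedDLLs()
n_dlls <- length(dlls)

# R_MAX_NUM_DLLS is read at startup; here we only report whether it is set
max_dlls <- Sys.getenv("R_MAX_NUM_DLLS", unset = NA)

cat("Loaded DLLs:", n_dlls, "\n")
cat("R_MAX_NUM_DLLS:", if (is.na(max_dlls)) "not set" else max_dlls, "\n")

# Names of the first few loaded DLLs, to see which packages contribute
print(head(names(dlls)))
```

Running this after loading the packages listed above would show how close the session is to the limit; attaching fewer packages at once is the other obvious way to stay under it.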

Getting TTR to work on R 2.13? [duplicate]

Possible Duplicate:
Cannot install R-forge package using install.packages
Has anyone gotten the latest version of TTR from R-Forge working on R 2.13? I can't install it on either my Mac or my PC, even if I try compiling from source.
/edit: here's the exact error I'm getting, when I try to install from the R command line.
install.packages("TTR", repos="http://R-Forge.R-project.org")
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
package ‘TTR’ is not available (for R version 2.13.0)
Yes, sure:
edd#max:~/svn/ttr$ svn up
At revision 107.
edd#max:~/svn/ttr$ R CMD INSTALL .
* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘TTR’ ...
** libs
make: Nothing to be done for `all'.
installing to /usr/local/lib/R/site-library/TTR/libs
** R
** data
** preparing package for lazy loading
Loading required package: zoo
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
* DONE (TTR)
and
edd#max:~/svn/ttr$ R -e 'library(TTR); example(EMA)'
R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)
[...]
R> library(TTR); example(EMA)
Loading required package: xts
Loading required package: zoo
EMAR> data(ttrc)
EMAR> ema.20 <- EMA(ttrc[,"Close"], 20)
EMAR> sma.20 <- SMA(ttrc[,"Close"], 20)
EMAR> dema.20 <- DEMA(ttrc[,"Close"], 20)
EMAR> evwma.20 <- EVWMA(ttrc[,"Close"], ttrc[,"Volume"], 20)
EMAR> zlema.20 <- ZLEMA(ttrc[,"Close"], 20)
EMAR> ## Example of Tim Tillson's T3 indicator
EMAR> T3 <- function(x, n=10, v=1) DEMA(DEMA(DEMA(x,n,v),n,v),n,v)
EMAR> t3 <- T3(ttrc[,"Close"])
EMAR> ## Example of short-term instability of EMA
EMAR> ## (and other indicators mentioned above)
EMAR> x <- rnorm(100)
EMAR> tail( EMA(x[90:100],10), 1 )
[1] 0.192859
EMAR> tail( EMA(x[70:100],10), 1 )
[1] 0.149217
EMAR> tail( EMA(x[50:100],10), 1 )
[1] 0.153751
EMAR> tail( EMA(x[30:100],10), 1 )
[1] 0.153703
EMAR> tail( EMA(x[10:100],10), 1 )
[1] 0.153703
EMAR> tail( EMA(x[ 1:100],10), 1 )
[1] 0.153703
R>
The News says it's now on CRAN. My Mac has 0.20-2 on it and the installer reports that to be the most recent. Loading seems to succeed and no errors from running a few examples.
Re the r-forge version 20-3 I get this:
install.packages("TTR", repos="http://R-Forge.R-project.org")
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
package ‘TTR’ is not available (for R version 2.13.0 beta)
R version 2.13.0 beta (2011-04-04 r55296) (not the latest)
