I'm trying to set up a checkpoint using the checkpoint package.
I'm doing a simple example in which I only load tidyverse and I run into an error for missing dependencies.
library(checkpoint)
#> Warning: package 'checkpoint' was built under R version 3.6.3
#>
#> checkpoint: Part of the Reproducible R Toolkit from Microsoft
#> https://mran.microsoft.com/documents/rro/reproducibility/
checkpoint(snapshotDate = "2020-06-12", forceInstall = TRUE)
#> Scanning for packages used in this project
#> No file at path 'C:\...'.
#> - Discovered 3 packages
#> Unable to parse 1 files:
#> - reprex_reprex.spin.Rmd
#> Removing packages to force re-install
#> Installing packages used in this project
#> - Installing 'tidyverse'
#> tidyverse
#> also installing the dependencies 'tibble', 'broom', 'dbplyr', 'forcats', 'haven', 'modelr'
#> - Installing 'knitr'
#> knitr
#> checkpoint process complete
#> ---
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.6.3
#> Error: package or namespace load failed for 'tidyverse' in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
#> namespace 'rlang' 0.4.4 is already loaded, but >= 0.4.6 is required
Created on 2020-06-13 by the reprex package (v0.3.0)
According to the documentation, I'd expect the dependencies to be installed automatically, but this seems not to be the case. If I update the rlang package using install.packages, everything works fine:
library(checkpoint)
#> Warning: package 'checkpoint' was built under R version 3.6.3
#>
#> checkpoint: Part of the Reproducible R Toolkit from Microsoft
#> https://mran.microsoft.com/documents/rro/reproducibility/
checkpoint(snapshotDate = "2020-06-12", forceInstall = TRUE, auto.install.knitr = FALSE)
#> Scanning for packages used in this project
#> No file at path 'C:\...'.
#> - Discovered 2 packages
#> Unable to parse 1 files:
#> - reprex_reprex.spin.Rmd
#> Removing packages to force re-install
#> Installing packages used in this project
#> - Installing 'tidyverse'
#> tidyverse
#> also installing the dependencies 'tibble', 'broom', 'dbplyr', 'forcats', 'haven', 'modelr'
#> checkpoint process complete
#> ---
library(tidyverse)
#> Warning: package 'tidyverse' was built under R version 3.6.3
#> Warning: package 'ggplot2' was built under R version 3.6.2
#> Warning: package 'tibble' was built under R version 3.6.3
#> Warning: package 'tidyr' was built under R version 3.6.2
#> Warning: package 'readr' was built under R version 3.6.2
#> Warning: package 'purrr' was built under R version 3.6.2
#> Warning: package 'dplyr' was built under R version 3.6.3
#> Warning: package 'stringr' was built under R version 3.6.2
#> Warning: package 'forcats' was built under R version 3.6.3
Created on 2020-06-13 by the reprex package (v0.3.0)
I believe it might be related to having rlang loaded via a namespace (and not attached), but I thought checkpoint isolated these issues.
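One way to check this suspicion is to look at which namespaces are already loaded, and at what version, before calling checkpoint(). The following is a minimal base-R sketch; the helper name loaded_versions is made up for illustration, and whether restarting R before calling checkpoint() resolves the error is an assumption that the output above does not confirm.

```r
# Hypothetical helper: report the version of each package in `pkgs`
# whose namespace is already loaded in this session (NA if not loaded).
loaded_versions <- function(pkgs) {
  vapply(pkgs, function(p) {
    if (p %in% loadedNamespaces()) {
      unname(as.character(getNamespaceVersion(p)))
    } else {
      NA_character_
    }
  }, character(1))
}

# If rlang is listed with an old version before checkpoint() has run,
# it was loaded from the regular library (for example by a startup script).
print(loaded_versions(c("rlang", "base")))
```

Running this at the very top of the script, before library(checkpoint), shows whether rlang 0.4.4 is already in the session at that point.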
Loading the package logistf breaks MCMCglmm(). Unloading logistf before running the command doesn't remove the error.
Why is that? Is there a way to solve this?
Works
library(MCMCglmm)
#> Loading required package: Matrix
#> Loading required package: coda
#> Loading required package: ape
data(PlodiaPO)
MCMCglmm(PO ~ plate, data = PlodiaPO)
#>
#> MCMC iteration = 0
#>
#> MCMC iteration = 1000
#>
#> MCMC iteration = 2000
#>
#> MCMC iteration = 3000
#>
[...]
#> attr(,"class")
#> [1] "MCMCglmm"
Created on 2022-06-07 by the reprex package (v2.0.1)
Doesn't work
library(logistf)
library(MCMCglmm)
#> Loading required package: Matrix
#> Loading required package: coda
#> Loading required package: ape
data(PlodiaPO)
MCMCglmm(PO ~ plate, data = PlodiaPO)
#> Error in terms.formula(formula, data = data): invalid term in model formula
unloadNamespace("logistf")
MCMCglmm(PO ~ plate, data = PlodiaPO)
#> Error in terms.formula(formula, data = data): invalid term in model formula
Created on 2022-06-07 by the reprex package (v2.0.1)
After some research I found that the problem does not come from logistf itself but from formula.tools, a package that it imports. To reproduce the error, try:
library(formula.tools)
#>formula.tools-1.7.1 - Copyright © 2022 Decision Patterns
library(MCMCglmm)
#> Loading required package: Matrix
#> Loading required package: coda
#> Loading required package: ape
data(PlodiaPO)
MCMCglmm(PO ~ plate, data = PlodiaPO)
#> Error in terms.formula(formula, data = data) :
invalid term in model formula
This is a known issue for formula.tools; see "Weird package dependency introduces error".
The solution detailed in that issue is:
fork the formula.tools repo
remove this line: https://github.com/decisionpatterns/formula.tools/blob/45b6654e4d8570cbaf1e2fd527652471202d97ad/NAMESPACE#L3
install_github from your fork
OR
run as.character.formula = function(x) as.character.default(x) right after loading formula.tools. That might break code that relies on the formula.tools version of as.character.formula, though (I'm not sure).
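To see why the second workaround helps, here is a small sketch that needs neither formula.tools nor MCMCglmm. The first as.character.formula below is only a stand-in: it assumes the package method deparses the whole formula into a single string, which is my understanding of formula.tools but is not confirmed by the output above.

```r
f <- PO ~ plate

# Base R behaviour: as.character() on a formula gives one element per part.
base_result <- as.character(f)

# Stand-in for a package-supplied S3 method (assumption: it deparses the
# whole formula into a single string).
as.character.formula <- function(x, ...) paste(deparse(x), collapse = " ")
masked_result <- as.character(f)

# The workaround quoted above: delegate back to the default method.
as.character.formula <- function(x, ...) as.character.default(x)
restored_result <- as.character(f)

print(base_result)
print(masked_result)
print(restored_result)
```

Code that indexes into the result of as.character(formula) gets something different once such a method is registered, and that registration survives unloadNamespace("logistf") because the method belongs to formula.tools, not to logistf.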
Thanks for this question
I am trying to run a one-way Welch ANOVA in R using the package userfriendlyscience. Below are my steps:
install the package using the following command
install.packages("userfriendlyscience")
load the package using library(userfriendlyscience).
run the oneway test using the command:
one.way <- oneway(data$group, y = data$Volume, posthoc = 'games-howell')
However, I get the following errors:
When I load the package:
Error: package or namespace load failed for ‘userfriendlyscience’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘psych’
In addition: Warning message:
package ‘userfriendlyscience’ was built under R version 3.5.3
When I run the oneway test:
Error in oneway(data$group, y = data$Volume, posthoc = "games-howell") : could not find function "oneway"
I am using R version 3.5.2. Could this be the problem? If so, is there a workaround, or should I install a newer version of R?
Thanks
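The first error says that the dependency psych is not installed, and the second error follows from it, because oneway() only exists once the package has loaded. A minimal sketch for finding out which packages are missing is shown below; the helper name missing_packages is made up for illustration, and the install.packages call is left commented out because it needs network access.

```r
# Hypothetical helper: return the packages in `pkgs` that are not installed.
# system.file() returns "" for a package that cannot be found in the library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                      logical(1))
  pkgs[!installed]
}

to_install <- missing_packages(c("psych", "userfriendlyscience"))
print(to_install)

# Uncomment to install what is missing, together with its own dependencies:
# install.packages(to_install, dependencies = TRUE)
```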
I have run
install.packages('statnet')
library('statnet')
Result:
Loading required package: tergm
Loading required package: ergm
Error: package or namespace load failed for ‘ergm’ in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]):
there is no package called ‘statnet.common’
Error: package ‘ergm’ could not be loaded
In addition: Warning messages:
1: package ‘statnet’ was built under R version 3.4.4
2: package ‘tergm’ was built under R version 3.4.4
Error: package ‘ergm’ could not be loaded
5. stop(gettextf("package %s could not be loaded", sQuote(pkg)), call. = FALSE, domain = NA)
4. .getRequiredPackages2(pkgInfo, quietly = quietly)
3. library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc = lib.loc, quietly = quietly)
2. .getRequiredPackages2(pkgInfo, quietly = quietly)
1. library(statnet)
Next I tried
install.packages('ergm') # Worked with warning: dependency ‘statnet.common’ is not available
But library(statnet) still does not work, and library(ergm) has a similar error message.
I also tried install.packages('statnet.common'), but I get the same message:
package ‘statnet.common’ is not available
I'm running RStudio Version 1.1.419, with R version 3.4.3 on Windows 10
Any ideas how to load statnet in R?
Updating my version of R to 3.5.3 solved my problem.
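install.packages reports "is not available" when the repository has no build of the package for the running R version, which can happen when a package requires a newer R. Assuming that was the cause here (the post does not say which R version statnet.common required), a quick check of the running version could look like this; the helper name r_at_least and the threshold "3.5.0" are illustrative only.

```r
# Hypothetical helper: TRUE if the running R is at least `min_version`.
r_at_least <- function(min_version) {
  getRversion() >= package_version(min_version)
}

print(getRversion())
print(r_at_least("3.5.0"))
```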
I am trying to schedule the following R script (exposure_train.R):
.libPaths("/gscratch/csde/mienkoja/rpackages")
library(Rmpi)
library(missForest)
library(snow)
library(doSNOW)
# load the data prepared above
load("/home/mienkoja/pse_rodis/dat_clean.rds")
# get rid of garbage from memory
gc()
# read node information from the system environment
nodefile <- Sys.getenv("PBS_NODEFILE")
# assign node information to a nodes object
nodes <- readLines(nodefile)
# create a cluster
cl <- makeMPIcluster(length(nodes), includemaster=TRUE)
# register the cluster
registerDoSNOW(cl)
rf_random <- missForest(dat_clean, maxiter = 10, ntree = 100, parallelize = "forests")
stopCluster(cl)
save(rf_random
,file = "/home/mienkoja/pse_rodis/rf_random_out.rds")
I am working on a cluster with several hundred nodes - each of which typically has 16 processor cores and 128GB of memory. All the nodes run CentOS 6 Linux, and they are tied together by the Moab cluster software. I am scheduling the job with a TORQUE scheduler using the following bash file
#!/bin/bash
### User specs
## Name the job "hyak_train"
#PBS -N hyak_train
## Request 16 CPUs (cores) on 2 nodes, 32 total cores
## If the job doesn't finish in 1 day, cancel it
#PBS -l nodes=2:ppn=16,pmem=2gb,feature=16core,walltime=24:00:00
## Put the output from jobs into the below directory
#PBS -o /gscratch/csde/mienkoja/exposure_train_outn_2p16
## Put both the stderr and stdout into a single file
#PBS -k oe
#PBS -j oe
#PBS -d /home/mienkoja/pse_rodis
## Send an email when the job is aborted, begins, or terminates
#PBS -m abe
### Standard specs
HYAK_NPE=$(wc -l < $PBS_NODEFILE)
HYAK_NNODES=$(uniq $PBS_NODEFILE | wc -l )
HYAK_TPN=$((HYAK_NPE/HYAK_NNODES))
NODEMEM=`grep MemTotal /proc/meminfo | awk '{print $2}'`
NODEFREE=$((NODEMEM-2097152))
MEMPERTASK=$((NODEFREE/HYAK_TPN))
ulimit -v $MEMPERTASK
export MX_RCACHE=0
### Modules
module load r_3.2.4
module load icc_14.0.3-ompi_1.8.3
### App
Rscript exposure_train.R
I am getting the following output using the script above:
Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0729:14816] [[43070,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 2822635520
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0829:32101] [[43995,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 2883256320
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0829:32615] [[43481,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 2849570816
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
Warning message:
package ‘snow’ was built under R version 3.3.2
Warning message:
package ‘doSNOW’ was built under R version 3.3.2
used (Mb) gc trigger (Mb) max used (Mb)
Ncells 114354 6.2 350000 18.7 350000 18.7
Vcells 13378510 102.1 17463238 133.3 13379451 102.1
[n0768:11458] [[54933,1],0] FORKING HNP: orted --hnp --set-sid --report-uri 17 --singleton-died-pipe 18 -mca state_novm_select 1 -mca ess_base_jobid 3600089088
32 slaves are spawned successfully. 0 failed.
missForest iteration 1 in progress...done!
missForest iteration 2 in progress...Warning message:
package ‘Rmpi’ was built under R version 3.3.2
Loading required package: foreach
Loading required package: iterators
Warning messages:
1: package ‘doMPI’ was built under R version 3.3.2
2: package ‘foreach’ was built under R version 3.3.2
3: package ‘iterators’ was built under R version 3.3.2
randomForest 4.6-12
Type rfNews() to see new features/changes/bug fixes.
Warning message:
package ‘randomForest’ was built under R version 3.3.2
Attaching package: ‘snow’
The following object is masked from ‘package:doMPI’:
sinkWorkerOutput
....
This output repeats itself several times, but the point is that I get 22 successful iterations (i.e. missForest iteration 1 in progress...done!) over the course of the day that I have the job scheduled.
The problem is that I am scheduling the job in a backfill queue and the job gets preempted (at least) once every 4 hours. This means that I never get beyond the second iteration (i.e. missForest iteration 2 in progress...) because the job starts over every 4 hours.
While all iterations are not created equal, since I only want maxiter = 10 (see R script above) it seems that one day is enough time to run the job providing I can build some sort of checkpointing scheme (saving my progress) into the R script.
The source of the missForest function is available in the package repo. The relevant foreach loop (based on my parameterization of parallelize = "forests") is pasted below.
foreach(xntree=idiv(ntree, chunks=getDoParWorkers()),
.combine='combine', .multicombine=TRUE,
.packages='randomForest') %dopar% {
randomForest(
x = obsX,
y = obsY,
ntree = xntree,
mtry = mtry,
replace = replace,
classwt = if (!is.null(classwt)) classwt[[varInd]] else
rep(1, nlevels(obsY)),
cutoff = if (!is.null(cutoff)) cutoff[[varInd]] else
rep(1/nlevels(obsY), nlevels(obsY)),
strata = if (!is.null(strata)) strata[[varInd]] else obsY,
sampsize = if (!is.null(sampsize)) sampsize[[varInd]] else
if (replace) nrow(obsX) else ceiling(0.632*nrow(obsX)),
nodesize = if (!is.null(nodesize)) nodesize[2] else 5,
maxnodes = if (!is.null(maxnodes)) maxnodes else NULL)
}
It seems like the solution is just to re-write the missForest function and write the relevant information from each iteration to an appropriately indexed file so that I can pick up where I left off on the next preemption.
QUESTION: Are there any R packages/techniques available to simplify the implementation of a checkpointing scheme like this?
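Without relying on any particular package, the idea described above can be sketched with saveRDS and readRDS: keep the iteration counter and the current state in one object, write it to disk after every completed iteration, and read it back at start-up if the file exists. This is a generic sketch, not missForest's own code; the names run_with_checkpoint, step, and the checkpoint path are all hypothetical, and adapting it to missForest would mean treating the body of its outer loop as the step function.

```r
# Hypothetical helper: run `step` until `max_iter` iterations are complete,
# saving progress after each one so a restarted job can resume.
run_with_checkpoint <- function(init, step, max_iter, path) {
  if (file.exists(path)) {
    ckpt <- readRDS(path)                    # resume after the last finished iteration
  } else {
    ckpt <- list(iter = 0L, state = init)    # fresh start
  }
  while (ckpt$iter < max_iter) {
    ckpt$state <- step(ckpt$state, ckpt$iter + 1L)
    ckpt$iter <- ckpt$iter + 1L
    tmp <- paste0(path, ".tmp")
    saveRDS(ckpt, tmp)                       # write to a temporary file first
    file.rename(tmp, path)                   # then replace the old checkpoint
  }
  ckpt$state
}

# Toy usage: the "state" is a running sum and each iteration adds its index.
ckpt_file <- tempfile(fileext = ".rds")
first_run <- run_with_checkpoint(0, function(s, i) s + i, 3L, ckpt_file)
# A second call with a larger max_iter picks up at iteration 4.
second_run <- run_with_checkpoint(0, function(s, i) s + i, 5L, ckpt_file)
print(c(first_run, second_run))
```

Writing to a temporary file and renaming it afterwards is meant to avoid leaving a half-written checkpoint if the job is preempted during the save; how file.rename behaves when the target already exists can differ between platforms, so that part should be verified on the cluster's file system.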
Possible Duplicate:
Cannot install R-forge package using install.packages
Has anyone gotten the latest version of TTR from R-forge working on R 2.13? I can't install it on either my mac or my PC, even if I try compiling from the source.
/edit: here's the exact error I'm getting, when I try to install from the R command line.
install.packages("TTR", repos="http://R-Forge.R-project.org")
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
package ‘TTR’ is not available (for R version 2.13.0)
Yes, sure:
edd@max:~/svn/ttr$ svn up
At revision 107.
edd@max:~/svn/ttr$ R CMD INSTALL .
* installing to library ‘/usr/local/lib/R/site-library’
* installing *source* package ‘TTR’ ...
** libs
make: Nothing to be done for `all'.
installing to /usr/local/lib/R/site-library/TTR/libs
** R
** data
** preparing package for lazy loading
Loading required package: zoo
** help
*** installing help indices
** building package indices ...
** testing if installed package can be loaded
* DONE (TTR)
and
edd@max:~/svn/ttr$ R -e 'library(TTR); example(EMA)'
R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)
[...]
R> library(TTR); example(EMA)
Loading required package: xts
Loading required package: zoo
EMAR> data(ttrc)
EMAR> ema.20 <- EMA(ttrc[,"Close"], 20)
EMAR> sma.20 <- SMA(ttrc[,"Close"], 20)
EMAR> dema.20 <- DEMA(ttrc[,"Close"], 20)
EMAR> evwma.20 <- EVWMA(ttrc[,"Close"], ttrc[,"Volume"], 20)
EMAR> zlema.20 <- ZLEMA(ttrc[,"Close"], 20)
EMAR> ## Example of Tim Tillson's T3 indicator
EMAR> T3 <- function(x, n=10, v=1) DEMA(DEMA(DEMA(x,n,v),n,v),n,v)
EMAR> t3 <- T3(ttrc[,"Close"])
EMAR> ## Example of short-term instability of EMA
EMAR> ## (and other indicators mentioned above)
EMAR> x <- rnorm(100)
EMAR> tail( EMA(x[90:100],10), 1 )
[1] 0.192859
EMAR> tail( EMA(x[70:100],10), 1 )
[1] 0.149217
EMAR> tail( EMA(x[50:100],10), 1 )
[1] 0.153751
EMAR> tail( EMA(x[30:100],10), 1 )
[1] 0.153703
EMAR> tail( EMA(x[10:100],10), 1 )
[1] 0.153703
EMAR> tail( EMA(x[ 1:100],10), 1 )
[1] 0.153703
R>
The News says it's now on CRAN. My Mac has 0.20-2 on it and the installer reports that to be the most recent. Loading seems to succeed and no errors from running a few examples.
Regarding the R-Forge version (20-3), I get this:
install.packages("TTR", repos="http://R-Forge.R-project.org")
Warning message:
In getDependencies(pkgs, dependencies, available, lib) :
package ‘TTR’ is not available (for R version 2.13.0 beta)
R version 2.13.0 beta (2011-04-04 r55296) (not the latest)