Listing R Package Dependencies Without Installing Packages

Is there a simple way to get a list of R package dependencies (all recursive dependencies) for a given package, without installing the package and its dependencies? Something similar to a fake install in portupgrade or apt.

You can use the result of the available.packages function. For example, to see what ggplot2 depends on:
pack <- available.packages()
pack["ggplot2","Depends"]
Which gives:
[1] "R (>= 2.14), stats, methods"
Note that depending on what you want to achieve, you may need to check the Imports field, too.
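The Depends and Imports fields are comma-separated strings with optional version constraints, so they need a little parsing before they can be combined. A minimal sketch, assuming a hypothetical helper named parse_deps (not part of any package) and the field format shown above:

```r
# Hypothetical helper: turn one dependency field such as
# "R (>= 2.14), stats, methods" into a vector of package names.
parse_deps <- function(field) {
  if (is.na(field)) return(character(0))        # field absent for this package
  parts <- trimws(strsplit(field, ",")[[1]])    # one entry per dependency
  pkgs <- sub("[[:space:]]*\\(.*$", "", parts)  # drop "(>= x.y)" constraints
  pkgs[nzchar(pkgs) & pkgs != "R"]              # R itself is not a package
}

depends <- parse_deps("R (>= 2.14), stats, methods")
imports <- parse_deps(NA_character_)
direct <- union(depends, imports)
print(direct)
```

With a live index, the same helper could be applied to pack["ggplot2", "Depends"] and pack["ggplot2", "Imports"].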

I am surprised no one mentioned tools::package_dependencies(), which is the simplest solution and has a recursive argument (which the accepted solution does not offer).
Simple example looking at the recursive dependencies for the first 200 packages on CRAN:
library(tidyverse)
avail_pks <- available.packages()
deps <- tools::package_dependencies(packages = avail_pks[1:200, "Package"],
                                    recursive = TRUE)
tibble(Package = names(deps),
       data = map(deps, as_tibble)) %>%
  unnest(data)
#> # A tibble: 7,125 x 2
#> Package value
#> <chr> <chr>
#> 1 A3 xtable
#> 2 A3 pbapply
#> 3 A3 parallel
#> 4 A3 stats
#> 5 A3 utils
#> 6 aaSEA DT
#> 7 aaSEA networkD3
#> 8 aaSEA shiny
#> 9 aaSEA shinydashboard
#> 10 aaSEA magrittr
#> # … with 7,115 more rows
Created on 2020-12-04 by the reprex package (v0.3.0)
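The db argument of tools::package_dependencies() accepts a matrix in the same layout as the result of available.packages(), so the effect of recursive = TRUE can be tried offline. A small sketch with a made-up three-package index (pkgA, pkgB and pkgC are hypothetical names, not real CRAN packages):

```r
# Toy package index in the layout returned by available.packages();
# the packages and their relationships are invented for illustration.
db <- rbind(
  c(Package = "pkgA", Depends = "R (>= 3.5.0), pkgB", Imports = NA, LinkingTo = NA),
  c(Package = "pkgB", Depends = NA, Imports = "pkgC (>= 1.0)", LinkingTo = NA),
  c(Package = "pkgC", Depends = NA, Imports = NA, LinkingTo = NA)
)
rownames(db) <- db[, "Package"]

fields <- c("Depends", "Imports", "LinkingTo")

# Direct dependencies only
direct <- tools::package_dependencies("pkgA", db = db, which = fields)

# Direct plus indirect dependencies
all_deps <- tools::package_dependencies("pkgA", db = db, which = fields,
                                        recursive = TRUE)

print(direct$pkgA)
print(sort(all_deps$pkgA))
```

The non-recursive call should report only pkgB, while the recursive call should also pick up pkgC through pkgB.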

Another neat and simple solution is the internal function recursivePackageDependencies from the packrat package. However, the package in question must be installed in some library on your machine. The advantage is that it works with self-made non-CRAN packages as well. Example:
packrat:::recursivePackageDependencies("ggplot2",lib.loc = .libPaths()[1])
giving:
[1] "R6" "RColorBrewer" "Rcpp" "colorspace" "dichromat" "digest" "gtable"
[8] "labeling" "lazyeval" "magrittr" "munsell" "plyr" "reshape2" "rlang"
[15] "scales" "stringi" "stringr" "tibble" "viridisLite"

I do not have R installed, and I needed to find out which R packages were dependencies of a list of R packages being requested for use at my company.
I wrote a bash script that iterates over a list of R packages in a file and recursively discovers their dependencies.
The script uses a file named rinput_orig.txt as input (example below). The script will create a file named rinput.txt as it does its work.
The script will create the following files:
rdepsfound.txt - Lists dependencies found including the R Package that is dependent upon it (example below).
routput.txt - Lists all R Packages (from original list and list of dependencies) along with the license and CRAN URL (example below).
r404.txt - List of R Packages where a 404 was received when trying to curl. This is handy if your original list has any typos.
Bash script:
#!/bin/bash
# CLEANUP (-f: do not fail when a file from a previous run is missing)
rm -f routput.txt rdepsfound.txt r404.txt
# COPY ORIGINAL INPUT TO WORKING INPUT
cp rinput_orig.txt rinput.txt
IFS=","
while read PACKAGE; do
    echo Processing $PACKAGE...
    PACKAGEURL="http://cran.r-project.org/web/packages/${PACKAGE}/index.html"
    if [ `curl -o /dev/null --silent --head --write-out '%{http_code}\n' ${PACKAGEURL}` != 404 ]; then
        # GET LICENSE INFO OF PACKAGE
        LICENSEINFO=$(curl ${PACKAGEURL} 2>/dev/null | grep -A1 "License:" | grep -v "License:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}' | sed "s/|/,/g" | sed "s/+/,/g")
        for x in ${LICENSEINFO[*]}
        do
            # SAVE LICENSE
            LICENSE=$(echo ${x} | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}')
            break
        done
        # WRITE PACKAGE AND LICENSE TO OUTPUT FILE
        echo $PACKAGE $LICENSE $PACKAGEURL >> routput.txt
        # GET DEPENDENCIES OF PACKAGE
        DEPS=$(curl ${PACKAGEURL} 2>/dev/null | grep -A1 "Depends:" | grep -v "Depends:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}')
        for x in ${DEPS[*]}
        do
            FOUNDDEP=$(echo "${x}" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}' | sed "s/<\/span>//g")
            if [ "$FOUNDDEP" != "" ]; then
                echo Found dependency $FOUNDDEP for $PACKAGE...
                # -x -F: match the whole line literally, so a name that is only
                # a substring of another package name does not count as present
                if grep -qxF "$FOUNDDEP" rinput.txt; then
                    echo $FOUNDDEP already exists in package list...
                else
                    echo Adding $FOUNDDEP to package list...
                    # SAVE FOUND DEPENDENCY BACK TO INPUT LIST
                    echo $FOUNDDEP >> rinput.txt
                    # SAVE FOUND DEPENDENCY TO DEPENDENCY LIST FOR EASY VIEWING OF ALL FOUND DEPENDENCIES
                    echo $FOUNDDEP is a dependency of $PACKAGE >> rdepsfound.txt
                fi
            fi
        done
    else
        echo Skipping $PACKAGE because 404 was received...
        echo $PACKAGE $PACKAGEURL >> r404.txt
    fi
done < rinput.txt
echo -e "\nRESULT:"
sort -u routput.txt
Example rinput_orig.txt:
shiny
rmarkdown
xtable
RODBC
RJDBC
XLConnect
openxlsx
xlsx
Rcpp
Example console output when running script:
Processing shiny...
Processing rmarkdown...
Processing xtable...
Processing RODBC...
Processing RJDBC...
Found dependency DBI for RJDBC...
Adding DBI to package list...
Found dependency rJava for RJDBC...
Adding rJava to package list...
Processing XLConnect...
Found dependency XLConnectJars for XLConnect...
Adding XLConnectJars to package list...
Processing openxlsx...
Processing xlsx...
Found dependency rJava for xlsx...
rJava already exists in package list...
Found dependency xlsxjars for xlsx...
Adding xlsxjars to package list...
Processing Rcpp...
Processing DBI...
Processing rJava...
Processing XLConnectJars...
Processing xlsxjars...
Found dependency rJava for xlsxjars...
rJava already exists in package list...
Example rdepsfound.txt:
DBI is a dependency of RJDBC
rJava is a dependency of RJDBC
XLConnectJars is a dependency of XLConnect
xlsxjars is a dependency of xlsx
Example routput.txt:
shiny GPL-3 http://cran.r-project.org/web/packages/shiny/index.html
rmarkdown GPL-3 http://cran.r-project.org/web/packages/rmarkdown/index.html
xtable GPL-2 http://cran.r-project.org/web/packages/xtable/index.html
RODBC GPL-2 http://cran.r-project.org/web/packages/RODBC/index.html
RJDBC GPL-2 http://cran.r-project.org/web/packages/RJDBC/index.html
XLConnect GPL-3 http://cran.r-project.org/web/packages/XLConnect/index.html
openxlsx GPL-3 http://cran.r-project.org/web/packages/openxlsx/index.html
xlsx GPL-3 http://cran.r-project.org/web/packages/xlsx/index.html
Rcpp GPL-2 http://cran.r-project.org/web/packages/Rcpp/index.html
DBI LGPL-2 http://cran.r-project.org/web/packages/DBI/index.html
rJava GPL-2 http://cran.r-project.org/web/packages/rJava/index.html
XLConnectJars GPL-3 http://cran.r-project.org/web/packages/XLConnectJars/index.html
xlsxjars GPL-3 http://cran.r-project.org/web/packages/xlsxjars/index.html
I hope this helps someone!
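The core of the script is a worklist: each package read from rinput.txt may append newly found dependencies to the end of the same file, so they are processed later in the same loop. The same idea as a minimal R sketch, with a hard-coded map of the direct dependencies reported in the example console output standing in for the CRAN pages (no network access; discover is a hypothetical function name):

```r
# Direct dependencies as reported in the example console output above.
dep_map <- list(
  RJDBC     = c("DBI", "rJava"),
  XLConnect = "XLConnectJars",
  xlsx      = c("rJava", "xlsxjars"),
  xlsxjars  = "rJava"
)

discover <- function(start, dep_map) {
  queue <- start                    # plays the role of rinput.txt
  i <- 1
  while (i <= length(queue)) {
    found <- dep_map[[queue[i]]]    # NULL when nothing is recorded
    queue <- c(queue, setdiff(found, queue))  # append only unseen names
    i <- i + 1
  }
  setdiff(queue, start)             # dependencies that were not in the input
}

added <- discover(c("shiny", "RJDBC", "XLConnect", "xlsx"), dep_map)
print(added)
```

Because the queue grows while it is being walked, dependencies of dependencies are visited without explicit recursion, which is also why the script can read and append to the same file.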

I tested my own solution (which checks locally installed packages) against the packrat and tools ones, and there are clear differences between the methods.
tools::package_dependencies appears to return too much under older R versions (up to 4.1.0, with recursive = TRUE) and is not an efficient solution.
From the R 4.1.0 NEWS:
"Function tools::package_dependencies() (in package tools) can now use different dependency types for direct and recursive dependencies."
packrat:::recursivePackageDependencies uses available.packages, so it is based on the newest remote packages, not the local ones.
By default my function skips base packages; change the base argument if you want to include them too.
Tested under R 4.1.0:
get_deps <- function(package, fields = c("Depends", "Imports", "LinkingTo"), base = FALSE, lib.loc = NULL) {
  stopifnot((length(package) == 1) && is.character(package))
  stopifnot(all(fields %in% c("Depends", "Imports", "Suggests", "LinkingTo")))
  stopifnot(is.logical(base))
  stopifnot(package %in% rownames(utils::installed.packages(lib.loc = lib.loc)))
  # Accumulates every dependency found during the recursion
  paks_global <- NULL
  deps <- function(pak, fields) {
    pks <- utils::packageDescription(pak, lib.loc = lib.loc)
    res <- NULL
    for (f in fields) {
      ff <- pks[[f]]
      if (!is.null(ff)) {
        # Split the field on commas and keep only the package name,
        # dropping version constraints such as "(>= 1.0)"
        res <- c(
          res,
          vapply(
            strsplit(trimws(strsplit(ff, ",")[[1]]), "[ \n\\(]"),
            function(x) x[1],
            character(1)
          )
        )
      }
    }
    if (is.null(res)) {
      return(NULL)
    }
    for (r in res) {
      if (r != "R" && !r %in% paks_global) {
        paks_global <<- c(r, paks_global)
        deps(r, fields)
      }
    }
  }
  deps(package, fields)
  setdiff(unique(paks_global), c(
    package,
    "R",
    if (!base) {
      c("stats", "graphics", "grDevices", "utils", "datasets", "methods", "base", "tools")
    } else {
      NULL
    }
  ))
}
own = get_deps("shiny", fields = c("Depends", "Imports"))
packrat = packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports"))
tools = tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]]
setdiff(own, packrat)
#> character(0)
setdiff(packrat, own)
#> character(0)
setdiff(own, tools)
#> character(0)
setdiff(tools, own)
#> [1] "methods" "utils" "grDevices" "tools" "stats" "graphics"
setdiff(packrat, tools)
#> character(0)
setdiff(tools, packrat)
#> [1] "methods" "utils" "grDevices" "tools" "stats" "graphics"
own
#> [1] "lifecycle" "ellipsis" "cachem" "jquerylib" "rappdirs"
#> [6] "fs" "sass" "bslib" "glue" "commonmark"
#> [11] "withr" "fastmap" "crayon" "sourcetools" "base64enc"
#> [16] "htmltools" "digest" "xtable" "jsonlite" "mime"
#> [21] "magrittr" "rlang" "later" "promises" "R6"
#> [26] "Rcpp" "httpuv"
packrat
#> [1] "R6" "Rcpp" "base64enc" "bslib" "cachem"
#> [6] "commonmark" "crayon" "digest" "ellipsis" "fastmap"
#> [11] "fs" "glue" "htmltools" "httpuv" "jquerylib"
#> [16] "jsonlite" "later" "lifecycle" "magrittr" "mime"
#> [21] "promises" "rappdirs" "rlang" "sass" "sourcetools"
#> [26] "withr" "xtable"
tools
#> [1] "methods" "utils" "grDevices" "httpuv" "mime"
#> [6] "jsonlite" "xtable" "digest" "htmltools" "R6"
#> [11] "sourcetools" "later" "promises" "tools" "crayon"
#> [16] "rlang" "fastmap" "withr" "commonmark" "glue"
#> [21] "bslib" "cachem" "ellipsis" "lifecycle" "sass"
#> [26] "jquerylib" "magrittr" "base64enc" "Rcpp" "stats"
#> [31] "graphics" "fs" "rappdirs"
microbenchmark::microbenchmark(get_deps("shiny", fields = c("Depends", "Imports")),
packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports")),
tools = tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]],
times = 5
)
#> Warning in microbenchmark::microbenchmark(get_deps("shiny", fields =
#> c("Depends", : less accurate nanosecond times to avoid potential integer
#> overflows
#> Unit: milliseconds
#> expr
#> get_deps("shiny", fields = c("Depends", "Imports"))
#> packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports"))
#> tools
#> min lq mean median uq max neval
#> 5.316552 5.607365 6.054568 5.674359 6.633308 7.041258 5
#> 18.767340 19.387588 21.739127 21.581457 23.526169 25.433079 5
#> 411.589734 449.179354 458.526354 465.431262 468.440211 497.991207 5
Created on 2021-06-25 by the reprex package (v0.3.0)
Evidence that something was wrong with the tools solution under older R versions, tested under R 3.6.3:
paks <- tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]]
"lifecycle" %in% paks
#> [1] TRUE
any(c(paks, "shiny") %in% tools::dependsOnPkgs("lifecycle"))
#> [1] FALSE
Created on 2021-06-25 by the reprex package (v0.3.0)

Try this: tools::package_dependencies(recursive = TRUE)$package_name
As an example, here are the dependencies for dplyr:
tools::package_dependencies(recursive = TRUE)$dplyr
[1] "ellipsis" "generics" "glue" "lifecycle" "magrittr" "methods"
[7] "R6" "rlang" "tibble" "tidyselect" "utils" "vctrs"
[13] "cli" "crayon" "fansi" "pillar" "pkgconfig" "purrr"
[19] "digest" "assertthat" "grDevices" "utf8" "tools"

Related

Rscript called with crontab not finding local packages

I have the following R script ~/test.R :
print(.libPaths())
print(system(command = "whoami",ignore.stderr = TRUE))
library(lubridate)
ymd("2022-09-15")
If I run this script from the terminal with /opt/R/3.6.2/lib64/R/bin/Rscript test.R > test2.log I get the following output:
[1] "/home/domain/username/R/library/3.6.2"
[2] "/applis/R/site-library/x86_64-pc-linux-gnu/3.6.2"
[3] "/opt/R/3.6.2/lib64/R/library"
username@domain
[1] 0
[1] "2022-09-15"
So it's working as intended and I have 3 paths for packages. Now let's run this script with cron :
* * * * * /opt/R/3.6.2/lib64/R/bin/Rscript $HOME/test.R > $HOME/test.log 2>&1
I get this for test.log:
[1] "/opt/R/3.6.2/lib64/R/library"
username@domain
[1] 0
Error in library(lubridate) :
there is no package called ‘lubridate’
Execution halted
So I only have one path for libraries, consequently lubridate is not found, because it's installed in /home/domain/username/R/library/3.6.2. I cannot install packages within /opt/R/3.6.2/lib64/R/library, so I'm looking for a way to add libpaths to crontab.
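One possible workaround, assuming the extra library directories are simply not configured in cron's minimal environment, is to prepend them explicitly at the top of the script before any library() call. A hedged sketch; a temporary directory stands in for the real library path so the example runs anywhere:

```r
# Stand-in for a path such as the user library printed by .libPaths() in the
# interactive session; replace with the real directory in an actual script.
extra_lib <- file.path(tempdir(), "extra-library")
dir.create(extra_lib, showWarnings = FALSE, recursive = TRUE)

# .libPaths() keeps only directories that exist, so the directory
# must be present before it is added.
.libPaths(c(extra_lib, .libPaths()))
print(.libPaths())
```

Setting the R_LIBS_USER environment variable on the crontab line is another option, since R reads it at startup.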

Why would loading a package change the resid function being used?

I understand that resid() is a generic function in R, and which specific residual function is used depends on the object to which resid() is applied, just like print().
However, I noticed that, sometimes loading a package would change which specific residual function is used, yielding drastically different residual plots. Could anyone help me understand why that happens?
This is an example from my data:
> #### Showing packages loaded after starting up R ####
> search()
[1] ".GlobalEnv" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices" "package:utils"
[7] "package:datasets" "package:methods" "Autoloads" "package:base"
>
> #### Before loading nlme ####
>
> ## s1 is a gls object, calculated using the nlme package
> s1 <- readRDS("../Data/my_gls.RDS")
> qqnorm(resid(s1, type = "pearson"), main = "before loading nlme")
> qqline(resid(s1, type = "pearson"))
>
> methods(resid)
[1] residuals.default* residuals.glm residuals.HoltWinters* residuals.isoreg* residuals.lm
[6] residuals.nls* residuals.smooth.spline* residuals.tukeyline*
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, envir) :
generic function 'resid' dispatches methods for generic 'residuals'
> sloop::s3_dispatch(resid(s1, type = "pearson"))
resid.gls
=> resid.default
> ## the resid.default is used
And the resulting qqplot is
Then, after loading the nlme package,
> #### After loading nlme ####
>
> library(nlme)
Warning message:
package ‘nlme’ was built under R version 4.1.2
> search()
[1] ".GlobalEnv" "package:nlme" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices"
[7] "package:utils" "package:datasets" "package:methods" "Autoloads" "package:base"
>
> # s2 is the same as s1
> s2 <- readRDS("../Data/my_gls.RDS")
> qqnorm(resid(s2, type = "pearson"), main = "after loading nlme")
> qqline(resid(s2, type = "pearson"))
>
> methods(resid)
[1] residuals.default* residuals.glm residuals.gls* residuals.glsStruct* residuals.gnls*
[6] residuals.gnlsStruct* residuals.HoltWinters* residuals.isoreg* residuals.lm residuals.lme*
[11] residuals.lmeStruct* residuals.lmList* residuals.nlmeStruct* residuals.nls* residuals.smooth.spline*
[16] residuals.tukeyline*
see '?methods' for accessing help and source code
Warning message:
In .S3methods(generic.function, class, envir) :
generic function 'resid' dispatches methods for generic 'residuals'
> sloop::s3_dispatch(resid(s2, type = "pearson"))
=> resid.gls
* resid.default
> # resid.gls is used
the qqplot looks like this
As the command sloop::s3_dispatch(resid(s1, type = "pearson")) indicated, resid.default is the function being used before the nlme package is loaded, but resid.gls is the one being used after nlme is loaded. Why does this change happen? Is it because resid.gls is not among the methods available to resid() before nlme is loaded, as the first methods(resid) output suggests?
I am using R 4.1.0, and I would appreciate your feedback very much, if any. Thank you.
> version
_
platform x86_64-w64-mingw32
arch x86_64
os mingw32
system x86_64, mingw32
status
major 4
minor 1.0
year 2021
month 05
day 18
svn rev 80317
language R
version.string R version 4.1.0 (2021-05-18)
nickname Camp Pontanezen
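One plausible reading of the output above: the gls methods live in the nlme namespace and only become visible to dispatch once that namespace is loaded, so until then the call falls through to the default method. The mechanism can be imitated without nlme; the generic and class names below (describe, toy_gls) are made up for illustration:

```r
# A generic with only a default method, like resid() before nlme is loaded
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "default method"

obj <- structure(list(), class = "toy_gls")

# No class-specific method exists yet, so dispatch falls back to the default
before <- describe(obj)

# Defining the method plays the role of loading the package that provides it
describe.toy_gls <- function(x, ...) "toy_gls method"
after <- describe(obj)

print(c(before = before, after = after))
```

The object itself is unchanged between the two calls; only the set of methods that dispatch can find has changed.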

Julia Conda : Not a Conda environment error

I want to run Python from Julia using PyCall and Conda.jl. Once I added Conda to one of my Julia environments, a conda environment was automatically created:
(base) ➜ ~ conda info --envs
# Out >
/Users/imantha/.julia/conda/3
base * /Users/imantha/Software/miniforge3
CRTTEnv /Users/imantha/Software/miniforge3/envs/CRTTEnv
mlEnv /Users/imantha/Software/miniforge3/envs/mlEnv
nlpEnv /Users/imantha/Software/miniforge3/envs/nlpEnv
pyEnv /Users/imantha/Software/miniforge3/envs/pyEnv
The very first one is the Julia environment.
Now when I enter the following in Julia, I get:
julia> using Conda
julia> Conda.PYTHONDIR
# Out >
"/Users/imantha/.julia/conda/3/bin"
But if I try to add a package or list the installed packages, I get:
julia> Conda.list(Conda.PYTHONDIR)
# Out >
EnvironmentLocationNotFound: Not a conda environment: /Users/imantha/.julia/conda/3/bin
ERROR: failed process: Process(setenv(`/Users/imantha/.julia/conda/3/bin/conda list`,["XPC_FLAGS=0x0", "LSCOLORS=Gxfxcxdxbxegedabagacad", "PATH=/Users/imantha/Software/miniforge3/bin:/Users/imantha/Software/miniforge3/condabin:/Users/imantha/.cabal/bin:/Users/imantha/.ghcup/bin:/opt/homebrew/bin:/opt/homebrew/sbin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/Applications/Julia-1.6.app/Contents/Resources/julia/bin:/Users/imantha/Software/anaconda3/bin:/Applications/R.app/Contents/MacOS/R#:/Library/PostgreSQL/9.4/bin", "_CE_M=", "PWD=/Users/imantha", "XPC_SERVICE_NAME=0", "TERM_PROGRAM=Apple_Terminal", "HOMEBREW_PREFIX=/opt/homebrew", "SHELL=/bin/zsh", "__CF_USER_TEXT_ENCODING=0x1F5:0:2" … "HOMEBREW_REPOSITORY=/opt/homebrew", "_CE_CONDA=", "USER=imantha", "TERM=xterm-256color", "HOME=/Users/imantha", "TERM_PROGRAM_VERSION=440", "OPENBLAS_MAIN_FREE=1", "LESS=-R", "PYTHONIOENCODING=UTF-8", "ZSH=/Users/imantha/.oh-my-zsh"]), ProcessExited(1)) [1]
Stacktrace:
[1] pipeline_error
@ ./process.jl:525 [inlined]
[2] run(::Cmd; wait::Bool)
@ Base ./process.jl:440
[3] run
@ ./process.jl:438 [inlined]
[4] runconda(args::Cmd, env::String)
@ Conda ~/.julia/packages/Conda/sNGum/src/Conda.jl:129
[5] list(env::String)
@ Conda ~/.julia/packages/Conda/sNGum/src/Conda.jl:262
[6] top-level scope
@ REPL[34]:1
Any ideas why this may be?

Colophon for an R book

At the end of an R book, I'd like to show the versions of main R packages used to compile the book. I'm wondering if there is anything I could do better than just use sessionInfo() in a chunk, e.g.,
\section*{Colophon}
This book was produced using \Sexpr{R.version.string},
\pkg{knitr} (\Sexpr{packageDescription("knitr")[["Version"]]})
and other package versions listed below.
<<session-info, size='footnotesize',R.options=list(width=90)>>=
print(sessionInfo(), locale = FALSE)
@
In particular, sessionInfo() lists all packages loaded indirectly as well as those loaded directly.
```{r}
library(knitr)
p = devtools::loaded_packages()
p$version = unlist(lapply(p$package, function(x) as.character(packageVersion(x))))
kable(p[order(p$package),], row.names=FALSE)
```
If you do not have devtools installed, steal the code from loaded_packages.
This will give a comma-separated list of the packages loaded into the current session:
pkgs <- sort(sub("package:", "", grep("package:", search(), value = TRUE)));
toString(Map(function(p) sprintf("%s (%s)", p, packageVersion(p)), pkgs))
giving this string which you can insert by placing the code above in a \Sexpr:
[1] "base (3.2.0), datasets (3.2.0), graphics (3.2.0), grDevices (3.2.0), methods (3.2.0), stats (3.2.0), utils (3.2.0)"
Only core R functions are used in this code.
I don't want to list all packages loaded (base packages, dependencies) in the current session, so I came up with a better solution for my needs. Maybe this will be useful to someone else.
1. Find all packages explicitly loaded via library() in the .Rnw files for the book.
2. Use devtools:::package_info() for formatting.
For (1.), I used the following pipe of shell commands, all standard except for my trusty tcgrep perl script that finds strings in files recursively:
tcgrep -E Rnw '^library(.*)' . \
| grep '/ch' \
| perl -p -e 's/^.*://; s/\s*#.*//' \
| perl -p -e 's/library\(([\w\d]+)\)/"$1"/g; s/;/, /' \
| sort -u | perl -p -e 's/\n/, /' > packages-used.R
This gave me
packages <- c(
"AER", "ca", "car", "colorspace", "corrplot", "countreg", "directlabels", "effects", "ggparallel", "ggplot2", "ggtern", "gmodels", "gnm", "gpairs", "heplots", "Lahman", "lattice", "lmtest", "logmult", "MASS", "MASS", "countreg", "mgcv", "nnet", "plyr", "pscl", "RColorBrewer", "reshape2", "rms", "rsm", "sandwich", "splines", "UBbipl", "vcd", "vcdExtra", "VGAM", "xtable")
Then for (2.),
library(devtools)
pkg_info <- devtools:::package_info(packages)
# clean up unwanted
pkg_info$source <- sub(" \\(R.*\\)", "", pkg_info$source)
pkg_info <- pkg_info[,-2]
pkg_info
I like the result because it also identifies non-CRAN (development version) packages. I could also format this with kable:
> pkg_info
package version date source
AER 1.2-3 2015-02-24 CRAN
ca 0.60 2015-03-01 R-Forge
car 2.0-25 2015-03-03 R-Forge
colorspace 1.2-6 2015-03-11 CRAN
corrplot 0.73 2013-10-15 CRAN
countreg 0.1-2 2014-10-17 R-Forge
directlabels 2013.6.15 2013-07-23 CRAN
effects 3.0-4 2015-03-22 R-Forge
ggparallel 0.1.1 2012-09-09 CRAN
ggplot2 1.0.1 2015-03-17 CRAN
ggtern 1.0.5.0 2015-04-15 CRAN
gmodels 2.15.4.1 2013-09-21 CRAN
gnm 1.0-8 2015-04-22 CRAN
gpairs 1.2 2014-03-09 CRAN
heplots 1.0-15 2015-04-18 CRAN
Lahman 3.0-1 2014-09-13 CRAN
lattice 0.20-31 2015-03-30 CRAN
lmtest 0.9-33 2014-01-23 CRAN
logmult 0.6.2 2015-04-22 CRAN
MASS 7.3-40 2015-03-21 CRAN
mgcv 1.8-6 2015-03-31 CRAN
nnet 7.3-9 2015-02-11 CRAN
plyr 1.8.2 2015-04-21 CRAN
pscl 1.4.9 2015-03-29 CRAN
RColorBrewer 1.1-2 2014-12-07 CRAN
reshape2 1.4.1 2014-12-06 CRAN
rms 4.3-1 2015-05-01 CRAN
rsm 2.7-2 2015-05-13 CRAN
sandwich 2.3-3 2015-03-26 CRAN
UBbipl 3.0.4 2013-10-13 local
vcd 1.4-0 2015-04-20 local
vcdExtra 0.6-8 2015-04-16 CRAN
VGAM 0.9-8 2015-05-11 CRAN
xtable 1.7-4 2014-09-12 CRAN
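If devtools is not available, a similar table for an explicitly chosen set of packages can be sketched with base R only. The package vector below is a placeholder using packages that ship with R; in practice it would be the packages vector built above. Unlike the devtools output, this sketch lists versions only, without date or source:

```r
pkgs <- c("stats", "utils", "tools")   # placeholder list of package names

# Version of an installed package as a plain string
version_of <- function(p) as.character(utils::packageVersion(p))

colophon <- data.frame(
  package = pkgs,
  version = vapply(pkgs, version_of, character(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
print(colophon)
```

The resulting data frame can be passed to kable() in the same way as pkg_info.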
If you're using LaTeX, you could simply generate a bibliography for all the packages using:
%% begin.rcode rubber, results = 'asis', cache = FALSE
% write_bib(file = "generated.bib")
%% end.rcode
You can put this after your \end{document} and add a corresponding \bibliography{mybib,generated} entry. This way you can also reference the packages with the usual \cite{}.

readRDS() loads extra packages

Under what circumstances does the readRDS() function in R try to load packages/namespaces? I was surprised to see the following in a fresh R session:
> loadedNamespaces()
[1] "base" "datasets" "graphics" "grDevices" "methods" "stats"
[7] "tools" "utils"
> x <- readRDS('../../../../data/models/my_model.rds')
There were 19 warnings (use warnings() to see them)
> loadedNamespaces()
[1] "base" "class" "colorspace" "data.table"
[5] "datasets" "dichromat" "e1071" "earth"
[9] "evaluate" "fields" "formatR" "gbm"
[13] "ggthemes" "graphics" "grDevices" "grid"
[17] "Iso" "knitr" "labeling" "lattice"
[21] "lubridate" "MASS" "methods" "munsell"
[25] "plotmo" "plyr" "proto" "quantreg"
[29] "randomForest" "RColorBrewer" "reshape2" "rJava"
[33] "scales" "spam" "SparseM" "splines"
[37] "stats" "stringr" "survival" "tools"
[41] "utils" "wra" "wra.ops" "xlsx"
[45] "xlsxjars" "xts" "zoo"
If any of those new packages aren't available, the readRDS() fails.
The 19 warnings mentioned are:
> warnings()
Warning messages:
1: replacing previous import ‘hour’ when loading ‘data.table’
2: replacing previous import ‘last’ when loading ‘data.table’
3: replacing previous import ‘mday’ when loading ‘data.table’
4: replacing previous import ‘month’ when loading ‘data.table’
5: replacing previous import ‘quarter’ when loading ‘data.table’
6: replacing previous import ‘wday’ when loading ‘data.table’
7: replacing previous import ‘week’ when loading ‘data.table’
8: replacing previous import ‘yday’ when loading ‘data.table’
9: replacing previous import ‘year’ when loading ‘data.table’
10: replacing previous import ‘here’ when loading ‘plyr’
11: replacing previous import ‘hour’ when loading ‘data.table’
12: replacing previous import ‘last’ when loading ‘data.table’
13: replacing previous import ‘mday’ when loading ‘data.table’
14: replacing previous import ‘month’ when loading ‘data.table’
15: replacing previous import ‘quarter’ when loading ‘data.table’
16: replacing previous import ‘wday’ when loading ‘data.table’
17: replacing previous import ‘week’ when loading ‘data.table’
18: replacing previous import ‘yday’ when loading ‘data.table’
19: replacing previous import ‘year’ when loading ‘data.table’
So apparently it's loading something like lubridate and then data.table, generating namespace conflicts as it goes.
FWIW, unserialize() gives the same results.
What I really want is to load these objects without also loading everything the person who saved them seemed to have loaded at the time, which is what it sort of looks like it's doing.
Update: here are the classes in the object x:
> classes <- function(x) {
cl <- c()
for(i in x) {
cl <- c(cl, if(is.list(i)) c(class(i), classes(i)) else class(i))
}
cl
}
> unique(classes(x))
[1] "list" "numeric" "rq"
[4] "terms" "formula" "call"
[7] "character" "smooth.spline" "integer"
[10] "smooth.spline.fit"
rq is from the quantreg package, all the rest are from base or stats.
Ok. This may not be a useful answer (which would need more details) but I think it is at least an answer to the "under what circumstances.." part.
First of all, I think it is not specific to readRDS but works the same way with any save'd objects that can be load'ed.
The "under what circumstances" part: when the saved object contains an environment having a package/namespace environment as a parent. Or when it contains a function whose environment is a package/namespace environment.
require(Matrix)
foo <- list(
a = 1,
b = new.env(parent=environment(Matrix)),
c = "c")
save(foo, file="foo.rda")
loadedNamespaces() # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces() # no Matrix there!
load("foo.rda")
loadedNamespaces() # Matrix is back again
And the following works too:
require(Matrix)
bar <- list(
a = 1,
b = force,
c = "c")
environment(bar$b) <- environment(Matrix)
save(bar, file="bar.rda")
loadedNamespaces() # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces() # no Matrix there!
load("bar.rda")
loadedNamespaces() # Matrix is back!
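Before saving, it can help to see which function elements of a list carry a namespace environment, since those are the ones that pull a package back in on load. A small sketch using only base R; ns_of is a hypothetical helper name, and it only inspects top-level function elements (environments stored directly or nested deeper are not covered):

```r
# Hypothetical helper: name of the namespace a function was defined in,
# or NA when the element is not a function or has no namespace environment.
ns_of <- function(x) {
  if (!is.function(x)) return(NA_character_)
  env <- environment(x)   # NULL for primitive functions
  if (is.environment(env) && isNamespace(env)) {
    environmentName(env)
  } else {
    NA_character_
  }
}

obj <- list(
  a = 1,
  b = stats::sd,            # closure defined in the stats namespace
  c = function(x) x + 1     # closure defined here, not in a namespace
)
ns <- vapply(obj, ns_of, character(1))
print(ns)
```

Elements reported with a namespace name are candidates for the environment replacement described next.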
I haven't tried but there's no reason why it shouldn't work the same way with saveRDS/readRDS. And the solution: if that does no harm to the saved objects (i.e., if you're sure that the environments are actually not needed), you can remove the parent environments by replacing them e.g. by setting the parent.env to something that makes sense. So using the foo above,
parent.env(foo$b) <- baseenv()
save(foo, file="foo.rda")
loadedNamespaces() # Matrix is there ....
unloadNamespace("Matrix")
loadedNamespaces() # no Matrix there ...
load("foo.rda")
loadedNamespaces() # still no Matrix ...
One painful workaround I've come up with is to cleanse the object of any environments it had attached to it, by a nasty eval:
sanitizeEnvironments <- function(obj) {
tc <- textConnection(NULL, 'w')
dput(obj, tc)
source(textConnection(textConnectionValue(tc)))$value
}
I can take the old object, run it through this function, then do saveRDS() on it again. Then loading the new object doesn't blow chunks all over my namespace.
