readRDS() loads extra packages - r

Under what circumstances does the readRDS() function in R try to load packages/namespaces? I was surprised to see the following in a fresh R session:
> loadedNamespaces()
[1] "base" "datasets" "graphics" "grDevices" "methods" "stats"
[7] "tools" "utils"
> x <- readRDS('../../../../data/models/my_model.rds')
There were 19 warnings (use warnings() to see them)
> loadedNamespaces()
[1] "base" "class" "colorspace" "data.table"
[5] "datasets" "dichromat" "e1071" "earth"
[9] "evaluate" "fields" "formatR" "gbm"
[13] "ggthemes" "graphics" "grDevices" "grid"
[17] "Iso" "knitr" "labeling" "lattice"
[21] "lubridate" "MASS" "methods" "munsell"
[25] "plotmo" "plyr" "proto" "quantreg"
[29] "randomForest" "RColorBrewer" "reshape2" "rJava"
[33] "scales" "spam" "SparseM" "splines"
[37] "stats" "stringr" "survival" "tools"
[41] "utils" "wra" "wra.ops" "xlsx"
[45] "xlsxjars" "xts" "zoo"
If any of those new packages aren't available, the readRDS() fails.
The 19 warnings mentioned are:
> warnings()
Warning messages:
1: replacing previous import ‘hour’ when loading ‘data.table’
2: replacing previous import ‘last’ when loading ‘data.table’
3: replacing previous import ‘mday’ when loading ‘data.table’
4: replacing previous import ‘month’ when loading ‘data.table’
5: replacing previous import ‘quarter’ when loading ‘data.table’
6: replacing previous import ‘wday’ when loading ‘data.table’
7: replacing previous import ‘week’ when loading ‘data.table’
8: replacing previous import ‘yday’ when loading ‘data.table’
9: replacing previous import ‘year’ when loading ‘data.table’
10: replacing previous import ‘here’ when loading ‘plyr’
11: replacing previous import ‘hour’ when loading ‘data.table’
12: replacing previous import ‘last’ when loading ‘data.table’
13: replacing previous import ‘mday’ when loading ‘data.table’
14: replacing previous import ‘month’ when loading ‘data.table’
15: replacing previous import ‘quarter’ when loading ‘data.table’
16: replacing previous import ‘wday’ when loading ‘data.table’
17: replacing previous import ‘week’ when loading ‘data.table’
18: replacing previous import ‘yday’ when loading ‘data.table’
19: replacing previous import ‘year’ when loading ‘data.table’
So apparently it's loading something like lubridate and then data.table, generating namespace conflicts as it goes.
FWIW, unserialize() gives the same results.
What I really want is to load these objects without also loading everything the person who saved them seemed to have loaded at the time, which is what it sort of looks like it's doing.
Update: here are the classes in the object x:
> classes <- function(x) {
cl <- c()
for(i in x) {
cl <- c(cl, if(is.list(i)) c(class(i), classes(i)) else class(i))
}
cl
}
> unique(classes(x))
[1] "list" "numeric" "rq"
[4] "terms" "formula" "call"
[7] "character" "smooth.spline" "integer"
[10] "smooth.spline.fit"
qr is from the quantreg package, all the rest are from base or stats.

Ok. This may not be a useful answer (which would need more details) but I think it is at least an aswer to the "under what circumstances.." part.
First of all, I think it is not specific to readRDS but works the same way with any save'd objects that can be load'ed.
The "under what circumstances" part: when the saved object contains an environment having a package/namespace environment as a parent. Or when it contains a function whose environment is a package/namespace environment.
require(Matrix)
foo <- list(
a = 1,
b = new.env(parent=environment(Matrix)),
c = "c")
save(foo, file="foo.rda")
loadedNamespaces() # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces() # no Matrix there!
load("foo.rda")
loadedNamespaces() # Matrix is back again
And the following works too:
require(Matrix)
bar <- list(
a = 1,
b = force,
c = "c")
environment(bar$b) <- environment(Matrix)
save(bar, file="bar.rda")
loadedNamespaces() # Matrix is there!
detach("package:Matrix")
unloadNamespace("Matrix")
loadedNamespaces() # no Matrix there!
load("bar.rda")
loadedNamespaces() # Matrix is back!
I haven't tried but there's no reason why it shouldn't work the same way with saveRDS/readRDS. And the solution: if that does no harm to the saved objects (i.e., if you're sure that the environments are actually not needed), you can remove the parent environments by replacing them e.g. by setting the parent.env to something that makes sense. So using the foo above,
parent.env(foo$b) <- baseenv()
save(foo, file="foo.rda")
loadedNamespaces() # Matrix is there ....
unloadNamespace("Matrix")
loadedNamespaces() # no Matrix there ...
load("foo.rda")
loadedNamespaces() # still no Matrix ...

One painful workaround I've come up with is to cleanse the object of any environments it had attached to it, by a nasty eval:
sanitizeEnvironments <- function(obj) {
tc <- textConnection(NULL, 'w')
dput(obj, tc)
source(textConnection(textConnectionValue(tc)))$value
}
I can take the old object, run it through this function, then do saveRDS() on it again. Then loading the new object doesn't blow chunks all over my namespace.

Related

Error in R parallel:Error in checkForRemoteErrors(val) : 2 nodes produced errors; first error: cannot open the connection

I wrote a function to run R parallel, but it doesn't seem to work. The code is
'''
rm(list=ls())
square<-function(x){
library(Iso)
y=ufit(x,lmode<-2,x<-c(1:length(x)),type="b")[[2]]
return(y)
}
num<-c(1,2,1,4)
cl <- makeCluster(getOption("cl.cores",2))
clusterExport(cl,"square")
results<-parLapply(cl,num,square)
stopCluster(cl)
'''
and the error is:
Error in checkForRemoteErrors(val) :
2 nodes produced errors; first error: cannot open the connection
I think a possible reason is that I used the Iso package in the function. BUT I DON'T KNOW how to solve it.
You have to export your functions/whole packages to each cluster if you want to do it in parallel:
library(doSNOW)
## the rest is the same
rm(list=ls())
square<-function(x){
y=ufit(x,lmode<-2,x<-c(1:length(x)),type="b")[[2]]
return(y)
}
num<-c(1,2,1,4)
cl <- makeCluster(getOption("cl.cores",2))
clusterExport(cl,"square")
clusterEvalQ(cl,library(Iso))
## here you should see smth like this, where each cluster prints attached libraries
[[1]]
[1] "Iso" "snow" "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base"
[[2]]
[1] "Iso" "snow" "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base"
## then just call the same as with parallel
results<-parLapply(cl,num,square)
stopCluster(cl)
## alternative is to use Iso::ufit

summarise(n=n()) returns "Evaluation error: This function should not be called directly." even when plyr is not loaded

I remove plyr, load dplyr and check the current packages
detach("package:plyr", unload=TRUE)
library(dplyr)
(.packages())
[1] "dplyr" "bindrcpp" "stats" "graphics" "grDevices"
"utils" "datasets"
[8] "methods" "base"
For info here are the conflicts:
conflicts()
[1] "filter" "lag" "body<-" "intersect" "kronecker"
"setdiff" "setequal"
[8] "union"
Then I use summarise and get the error. This is the same code that I used 6 months ago without issue.
by_vs_am <- group_by(mtcars, vs, am)
by_vs <- summarise(by_vs_am, n = n())
Error in summarise_impl(.data, dots) : Evaluation error: This
function should not be called directly.
Try use dplyr::n() instead.
The code should look like this:
by_vs_am <- group_by(mtcars, vs, am)
by_vs <- summarise(by_vs_am, n = dplyr::n())
As mentioned by others, this has to do with conflicts. Looking at your loaded packages and their dependences can help. For me it was the XML library, so I ran detach("package:XML", unload = TRUE) to fix it.

Listing R Package Dependencies Without Installing Packages

Is there a simple way to get a list of R package dependencies (all recursive dependencies) for a given package, without installing the package and it's dependencies? Something similar to a fake install in portupgrade or apt.
You can use the result of the available.packages function. For example, to see what ggplot2 depends on :
pack <- available.packages()
pack["ggplot2","Depends"]
Which gives :
[1] "R (>= 2.14), stats, methods"
Note that depending on what you want to achieve, you may need to check the Imports field, too.
I am surprised no one mentioned tools::package_dependencies() , which is the simplest solution, and has a recursive argument (which the accepted solution does not offer).
Simple example looking at the recursive dependencies for the first 200 packages on CRAN:
library(tidyverse)
avail_pks <- available.packages()
deps <- tools::package_dependencies(packages = avail_pks[1:200, "Package"],
recursive = TRUE)
tibble(Package=names(deps),
data=map(deps, as_tibble)) %>%
unnest(data)
#> # A tibble: 7,125 x 2
#> Package value
#> <chr> <chr>
#> 1 A3 xtable
#> 2 A3 pbapply
#> 3 A3 parallel
#> 4 A3 stats
#> 5 A3 utils
#> 6 aaSEA DT
#> 7 aaSEA networkD3
#> 8 aaSEA shiny
#> 9 aaSEA shinydashboard
#> 10 aaSEA magrittr
#> # … with 7,115 more rows
Created on 2020-12-04 by the reprex package (v0.3.0)
Another neat and simple solution is the internal function recursivePackageDependencies from the library packrat. However, the package must be installed in some library on your machine. The advantage is that it works with selfmade non-CRAN packages as well. Example:
packrat:::recursivePackageDependencies("ggplot2",lib.loc = .libPaths()[1])
giving:
[1] "R6" "RColorBrewer" "Rcpp" "colorspace" "dichromat" "digest" "gtable"
[8] "labeling" "lazyeval" "magrittr" "munsell" "plyr" "reshape2" "rlang"
[15] "scales" "stringi" "stringr" "tibble" "viridisLite"
I do not have R installed and I needed to find out which R Packages were dependencies upon a list of R Packages being requested for usage at my company.
I wrote a bash script that iterates over a list of R Packages in a file and will recursively discover dependencies.
The script uses a file named rinput_orig.txt as input (example below). The script will create a file named rinput.txt as it does its work.
The script will create the following files:
rdepsfound.txt - Lists dependencies found including the R Package that is dependent upon it (example below).
routput.txt - Lists all R Packages (from original list and list of dependencies) along with the license and CRAN URL (example below).
r404.txt - List of R Packages where a 404 was received when trying to curl. This is handy if your original list has any typos.
Bash script:
#!/bin/bash
# CLEANUP
rm routput.txt
rm rdepsfound.txt
rm r404.txt
# COPY ORIGINAL INPUT TO WORKING INPUT
cp rinput_orig.txt rinput.txt
IFS=","
while read PACKAGE; do
echo Processing $PACKAGE...
PACKAGEURL="http://cran.r-project.org/web/packages/${PACKAGE}/index.html"
if [ `curl -o /dev/null --silent --head --write-out '%{http_code}\n' ${PACKAGEURL}` != 404 ]; then
# GET LICENSE INFO OF PACKAGE
LICENSEINFO=$(curl ${PACKAGEURL} 2>/dev/null | grep -A1 "License:" | grep -v "License:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}' | sed "s/|/,/g" | sed "s/+/,/g")
for x in ${LICENSEINFO[*]}
do
# SAVE LICENSE
LICENSE=$(echo ${x} | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}')
break
done
# WRITE PACKAGE AND LICENSE TO OUTPUT FILE
echo $PACKAGE $LICENSE $PACKAGEURL >> routput.txt
# GET DEPENDENCIES OF PACKAGE
DEPS=$(curl ${PACKAGEURL} 2>/dev/null | grep -A1 "Depends:" | grep -v "Depends:" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[0]}')
for x in ${DEPS[*]}
do
FOUNDDEP=$(echo "${x}" | gawk 'match($0, /<a href=".*">(.*)<\/a>/, a) {print a[1]}' | sed "s/<\/span>//g")
if [ "$FOUNDDEP" != "" ]; then
echo Found dependency $FOUNDDEP for $PACKAGE...
grep $FOUNDDEP rinput.txt > /dev/null
if [ "$?" = "0" ]; then
echo $FOUNDDEP already exists in package list...
else
echo Adding $FOUNDDEP to package list...
# SAVE FOUND DEPENDENCY BACK TO INPUT LIST
echo $FOUNDDEP >> rinput.txt
# SAVE FOUND DEPENDENCY TO DEPENDENCY LIST FOR EASY VIEWING OF ALL FOUND DEPENDENCIES
echo $FOUNDDEP is a dependency of $PACKAGE >> rdepsfound.txt
fi
fi
done
else
echo Skipping $PACKAGE because 404 was received...
echo $PACKAGE $PACKAGEURL >> r404.txt
fi
done < rinput.txt
echo -e "\nRESULT:"
sort -u routput.txt
Example rinput_orig.txt:
shiny
rmarkdown
xtable
RODBC
RJDBC
XLConnect
openxlsx
xlsx
Rcpp
Example console output when running script:
Processing shiny...
Processing rmarkdown...
Processing xtable...
Processing RODBC...
Processing RJDBC...
Found dependency DBI for RJDBC...
Adding DBI to package list...
Found dependency rJava for RJDBC...
Adding rJava to package list...
Processing XLConnect...
Found dependency XLConnectJars for XLConnect...
Adding XLConnectJars to package list...
Processing openxlsx...
Processing xlsx...
Found dependency rJava for xlsx...
rJava already exists in package list...
Found dependency xlsxjars for xlsx...
Adding xlsxjars to package list...
Processing Rcpp...
Processing DBI...
Processing rJava...
Processing XLConnectJars...
Processing xlsxjars...
Found dependency rJava for xlsxjars...
rJava already exists in package list...
Example rdepsfound.txt:
DBI is a dependency of RJDBC
rJava is a dependency of RJDBC
XLConnectJars is a dependency of XLConnect
xlsxjars is a dependency of xlsx
Example routput.txt:
shiny GPL-3 http://cran.r-project.org/web/packages/shiny/index.html
rmarkdown GPL-3 http://cran.r-project.org/web/packages/rmarkdown/index.html
xtable GPL-2 http://cran.r-project.org/web/packages/xtable/index.html
RODBC GPL-2 http://cran.r-project.org/web/packages/RODBC/index.html
RJDBC GPL-2 http://cran.r-project.org/web/packages/RJDBC/index.html
XLConnect GPL-3 http://cran.r-project.org/web/packages/XLConnect/index.html
openxlsx GPL-3 http://cran.r-project.org/web/packages/openxlsx/index.html
xlsx GPL-3 http://cran.r-project.org/web/packages/xlsx/index.html
Rcpp GPL-2 http://cran.r-project.org/web/packages/Rcpp/index.html
DBI LGPL-2 http://cran.r-project.org/web/packages/DBI/index.html
rJava GPL-2 http://cran.r-project.org/web/packages/rJava/index.html
XLConnectJars GPL-3 http://cran.r-project.org/web/packages/XLConnectJars/index.html
xlsxjars GPL-3 http://cran.r-project.org/web/packages/xlsxjars/index.html
I hope this helps someone!
I tested my own solution (local installed packages checked) against packrat and tools ones.
You could find out clear differences between methods.
tools::package_dependencies looks to give too much for older R versions (till 4.1.0 and recursive = TRUE) and is not efficient solution.
R 4.1.0 NEWS
"Function tools::package_dependencies() (in package tools) can now use different dependency types for direct and recursive dependencies."
packrat:::recursivePackageDependencies is using available.packages so it is based on newest remote packages, not local ones.
My function by default is skipping base packages, change the base arg if you want to attach them too.
Tested under R 4.1.0:
get_deps <- function(package, fields = c("Depends", "Imports", "LinkingTo"), base = FALSE, lib.loc = NULL) {
stopifnot((length(package) == 1) && is.character(package))
stopifnot(all(fields %in% c("Depends", "Imports", "Suggests", "LinkingTo")))
stopifnot(is.logical(base))
stopifnot(package %in% rownames(utils::installed.packages(lib.loc = lib.loc)))
paks_global <- NULL
deps <- function(pak, fileds) {
pks <- packageDescription(pak)
res <- NULL
for (f in fileds) {
ff <- pks[[f]]
if (!is.null(ff)) {
res <- c(
res,
vapply(
strsplit(trimws(strsplit(ff, ",")[[1]]), "[ \n\\(]"),
function(x) x[1],
character(1)
)
)
}
}
if (is.null(res)) {
return(NULL)
}
for (r in res) {
if (r != "R" && !r %in% paks_global) {
paks_global <<- c(r, paks_global)
deps(r, fields)
}
}
}
deps(package, fields)
setdiff(unique(paks_global), c(
package,
"R",
if (!base) {
c(
"stats",
"graphics",
"grDevices",
"utils",
"datasets",
"methods",
"base",
"tools"
)
} else {
NULL
}
))
}
own = get_deps("shiny", fields = c("Depends", "Imports"))
packrat = packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports"))
tools = tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]]
setdiff(own, packrat)
#> character(0)
setdiff(packrat, own)
#> character(0)
setdiff(own, tools)
#> character(0)
setdiff(tools, own)
#> [1] "methods" "utils" "grDevices" "tools" "stats" "graphics"
setdiff(packrat, tools)
#> character(0)
setdiff(tools, packrat)
#> [1] "methods" "utils" "grDevices" "tools" "stats" "graphics"
own
#> [1] "lifecycle" "ellipsis" "cachem" "jquerylib" "rappdirs"
#> [6] "fs" "sass" "bslib" "glue" "commonmark"
#> [11] "withr" "fastmap" "crayon" "sourcetools" "base64enc"
#> [16] "htmltools" "digest" "xtable" "jsonlite" "mime"
#> [21] "magrittr" "rlang" "later" "promises" "R6"
#> [26] "Rcpp" "httpuv"
packrat
#> [1] "R6" "Rcpp" "base64enc" "bslib" "cachem"
#> [6] "commonmark" "crayon" "digest" "ellipsis" "fastmap"
#> [11] "fs" "glue" "htmltools" "httpuv" "jquerylib"
#> [16] "jsonlite" "later" "lifecycle" "magrittr" "mime"
#> [21] "promises" "rappdirs" "rlang" "sass" "sourcetools"
#> [26] "withr" "xtable"
tools
#> [1] "methods" "utils" "grDevices" "httpuv" "mime"
#> [6] "jsonlite" "xtable" "digest" "htmltools" "R6"
#> [11] "sourcetools" "later" "promises" "tools" "crayon"
#> [16] "rlang" "fastmap" "withr" "commonmark" "glue"
#> [21] "bslib" "cachem" "ellipsis" "lifecycle" "sass"
#> [26] "jquerylib" "magrittr" "base64enc" "Rcpp" "stats"
#> [31] "graphics" "fs" "rappdirs"
microbenchmark::microbenchmark(get_deps("shiny", fields = c("Depends", "Imports")),
packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports")),
tools = tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]],
times = 5
)
#> Warning in microbenchmark::microbenchmark(get_deps("shiny", fields =
#> c("Depends", : less accurate nanosecond times to avoid potential integer
#> overflows
#> Unit: milliseconds
#> expr
#> get_deps("shiny", fields = c("Depends", "Imports"))
#> packrat:::recursivePackageDependencies("shiny", lib.loc = .libPaths(), fields = c("Depends", "Imports"))
#> tools
#> min lq mean median uq max neval
#> 5.316552 5.607365 6.054568 5.674359 6.633308 7.041258 5
#> 18.767340 19.387588 21.739127 21.581457 23.526169 25.433079 5
#> 411.589734 449.179354 458.526354 465.431262 468.440211 497.991207 5
Created on 2021-06-25 by the reprex package (v0.3.0)
Proof that sth was wrong with tools solution under older R versions. Tested under R 3.6.3.
paks <- tools::package_dependencies("shiny", which = c("Depends", "Imports"), recursive = TRUE)[[1]]
"lifecycle" %in% paks
#> [1] TRUE
any(c(paks, "shiny") %in% tools::dependsOnPkgs("lifecycle"))
#> [1] FALSE
Created on 2021-06-25 by the reprex package (v0.3.0)
Try this: tools::package_dependencies(recursive = TRUE)$package_name
As an example- here are the dependencies for dplyr:
tools::package_dependencies(recursive = TRUE)$dplyr
[1] "ellipsis" "generics" "glue" "lifecycle" "magrittr" "methods"
[7] "R6" "rlang" "tibble" "tidyselect" "utils" "vctrs"
[13] "cli" "crayon" "fansi" "pillar" "pkgconfig" "purrr"
[19] "digest" "assertthat" "grDevices" "utf8" "tools"

XCMS Package - Retention time

Is there a simple way to get the list/array of retention times from a xcmsRaw object?
Example Code:
xraw <- xcmsRaw(cdfFile)
So for example getting information from it :
xraw#env$intensity
or
xraw#env$mz
You can see what slots are available in your xcmsRaw instance with
> slotNames(xraw)
[1] "env" "tic" "scantime"
[4] "scanindex" "polarity" "acquisitionNum"
[7] "profmethod" "profparam" "mzrange"
[10] "gradient" "msnScanindex" "msnAcquisitionNum"
[13] "msnPrecursorScan" "msnLevel" "msnRt"
[16] "msnPrecursorMz" "msnPrecursorIntensity" "msnPrecursorCharge"
[19] "msnCollisionEnergy" "filepath"
What you want is xraw#msnRt - it is a vector of numeric.
The env slot is a environment that stores 3 variables:
> ls(xraw#env)
[1] "intensity" "mz" "profile"
More details on the class itself at class?xcmsRaw.
EDIT: The msnRt slot is populated only if you specify includeMSn = TRUE and your input file must be in mzXML or mzML, not in cdf; if you use the faahKO example from ?xcmasRaw, you will see that
xr <- xcmsRaw(cdffiles[1], includeMSn = TRUE)
Warning message:
In xcmsRaw(cdffiles[1], includeMSn = TRUE) :
Reading of MSn spectra for NetCDF not supported
Also, xr#msnRt will only store the retention times for MSn scans, with n > 1. See the xset#rt where xset is an xcmsSet instance for the raw/corrected MS1 retention times as provided by xcms.
EDIT2: Alternatively, have a go with the mzR package
> library(mzR)
> cdffiles[1]
[2] "/home/lgatto/R/x86_64-unknown-linux-gnu-library/2.16/faahKO/cdf/KO/ko15.CDF"
> xx <- openMSfile(cdffiles[1])
> xx
Mass Spectrometry file handle.
Filename: /home/lgatto/R/x86_64-unknown-linux-gnu-library/2.16/faahKO/cdf/KO/ko15.CDF
Number of scans: 1278
> hd <- header(xx)
> names(hd)
[1] "seqNum" "acquisitionNum"
[3] "msLevel" "peaksCount"
[5] "totIonCurrent" "retentionTime"
[7] "basePeakMZ" "basePeakIntensity"
[9] "collisionEnergy" "ionisationEnergy"
[11] "highMZ" "precursorScanNum"
[13] "precursorMZ" "precursorCharge"
[15] "precursorIntensity" "mergedScan"
[17] "mergedResultScanNum" "mergedResultStartScanNum"
[19] "mergedResultEndScanNum"
> class(hd)
[1] "data.frame"
> dim(hd)
[1] 1278 19
but you will be outside of the default xcms pipeline if you take this route (although Steffen Neumann, from xcms, does know mzR very well, oubviously).
Finally, you are better of to use the Bioconductor mailing list of the xcms online forum if you want to maximise your chances to get feedback from the xcms developers.
Hope this helps.
Good answer but i was looking for this :
xraw <- xcmsRaw(cdfFile)
dp_index <- which(duplicated(rawMat(xraw)[,1]))
xraw_rt_dp <- rawMat(xraw)[,1]
xrawData.rt <- xraw_rt_dp[-dp_index]
Now :
xrawData.rt #contains the retention time.
Observation --> Using mzr package:
nc <- mzR:::netCDFOpen(cdfFile)
ncData <- mzR:::netCDFRawData(nc)
mzR:::netCDFClose(nc)
ncData$rt #contains the same retention time for the same input !!!

parent.env( x ) confusion

I've read the documentation for parent.env() and it seems fairly straightforward - it returns the enclosing environment. However, if I use parent.env() to walk the chain of enclosing environments, I see something that I cannot explain. First, the code (taken from "R in a nutshell")
library( PerformanceAnalytics )
x = environment(chart.RelativePerformance)
while (environmentName(x) != environmentName(emptyenv()))
{
print(environmentName(parent.env(x)))
x <- parent.env(x)
}
And the results:
[1] "imports:PerformanceAnalytics"
[1] "base"
[1] "R_GlobalEnv"
[1] "package:PerformanceAnalytics"
[1] "package:xts"
[1] "package:zoo"
[1] "tools:rstudio"
[1] "package:stats"
[1] "package:graphics"
[1] "package:utils"
[1] "package:datasets"
[1] "package:grDevices"
[1] "package:roxygen2"
[1] "package:digest"
[1] "package:methods"
[1] "Autoloads"
[1] "base"
[1] "R_EmptyEnv"
How can we explain the "base" at the top and the "base" at the bottom? Also, how can we explain "package:PerformanceAnalytics" and "imports:PerformanceAnalytics"? Everything would seem consistent without the first two lines. That is, function chart.RelativePerformance is in the package:PerformanceAnalytics environment which is created by xts, which is created by zoo, ... all the way up (or down) to base and the empty environment.
Also, the documentation is not super clear on this - is the "enclosing environment" the environment in which another environment is created and thus walking parent.env() shows a "creation" chain?
Edit
Shameless plug: I wrote a blog post that explains environments, parent.env(), enclosures, namespace/package, etc. with intuitive diagrams.
1) Regarding how base could be there twice (given that environments form a tree), its the fault of the environmentName function. Actually the first occurrence is .BaseNamespaceEnv and the latter occurrence is baseenv().
> identical(baseenv(), .BaseNamespaceEnv)
[1] FALSE
2) Regarding the imports:PerformanceAnalytics that is a special environment that R sets up to hold the imports mentioned in the package's NAMESPACE or DESCRIPTION file so that objects in it are encountered before anything else.
Try running this for some clarity. The str(p) and following if statements will give a better idea of what p is:
library( PerformanceAnalytics )
x <- environment(chart.RelativePerformance)
str(x)
while (environmentName(x) != environmentName(emptyenv())) {
p <- parent.env(x)
cat("------------------------------\n")
str(p)
if (identical(p, .BaseNamespaceEnv)) cat("Same as .BaseNamespaceEnv\n")
if (identical(p, baseenv())) cat("Same as baseenv()\n")
x <- p
}
The first few items in your results give evidence of the rules R uses to search for variables used in functions in packages with namespaces. From the R-ext manual:
The namespace controls the search strategy for variables used by functions in the package.
If not found locally, R searches the package namespace first, then the imports, then the base
namespace and then the normal search path.
Elaborating just a bit, have a look at the first few lines of chart.RelativePerformance:
head(body(chart.RelativePerformance), 5)
# {
# Ra = checkData(Ra)
# Rb = checkData(Rb)
# columns.a = ncol(Ra)
# columns.b = ncol(Rb)
# }
When a call to chart.RelativePerformance is being evaluated, each of those symbols --- whether the checkData on line 1, or the ncol on line 3 --- needs to be found somewhere on the search path. Here are the first few enclosing environments checked:
First off is namespace:PerformanceAnalytics. checkData is found there, but ncol is not.
Next stop (and the first location listed in your results) is imports:PerformanceAnalytics. This is the list of functions specified as imports in the package's NAMESPACE file. ncol is not found here either.
The base environment namespace (where ncol will be found) is the last stop before proceeding to the normal search path. Almost any R function will use some base functions, so this stop ensures that none of that functionality can be broken by objects in the global environment or in other packages. (R's designers could have left it to package authors to explicitly import the base environment in their NAMESPACE files, but adding this default pass through base does seem like the better design decision.)
The second base is .BaseNamespaceEnv, while the second to last base is baseenv(). These are not different (probably w.r.t. its parents). The parent of .BaseNamespaceEnv is .GlobalEnv, while that of baseenv() is emptyenv().
In a package, as #Josh says, R searches the namespace of the package, then the imports, and then the base (i.e., BaseNamespaceEnv).
you can find this by, e.g.:
> library(zoo)
> packageDescription("zoo")
Package: zoo
# ... snip ...
Imports: stats, utils, graphics, grDevices, lattice (>= 0.18-1)
# ... snip ...
> x <- environment(zoo)
> x
<environment: namespace:zoo>
> ls(x) # objects in zoo
[1] "-.yearmon" "-.yearqtr" "[.yearmon"
[4] "[.yearqtr" "[.zoo" "[<-.zoo"
# ... snip ...
> y <- parent.env(x)
> y # namespace of imported packages
<environment: 0x116e37468>
attr(,"name")
[1] "imports:zoo"
> ls(y) # objects in the imported packages
[1] "?" "abline"
[3] "acf" "acf2AR"
# ... snip ...

Resources