I've read the documentation for parent.env() and it seems fairly straightforward - it returns the enclosing environment. However, if I use parent.env() to walk the chain of enclosing environments, I see something that I cannot explain. First, the code (taken from "R in a nutshell")
library( PerformanceAnalytics )
x = environment(chart.RelativePerformance)
while (environmentName(x) != environmentName(emptyenv()))
{
print(environmentName(parent.env(x)))
x <- parent.env(x)
}
And the results:
[1] "imports:PerformanceAnalytics"
[1] "base"
[1] "R_GlobalEnv"
[1] "package:PerformanceAnalytics"
[1] "package:xts"
[1] "package:zoo"
[1] "tools:rstudio"
[1] "package:stats"
[1] "package:graphics"
[1] "package:utils"
[1] "package:datasets"
[1] "package:grDevices"
[1] "package:roxygen2"
[1] "package:digest"
[1] "package:methods"
[1] "Autoloads"
[1] "base"
[1] "R_EmptyEnv"
How can we explain the "base" at the top and the "base" at the bottom? Also, how can we explain "package:PerformanceAnalytics" and "imports:PerformanceAnalytics"? Everything would seem consistent without the first two lines. That is, function chart.RelativePerformance is in the package:PerformanceAnalytics environment which is created by xts, which is created by zoo, ... all the way up (or down) to base and the empty environment.
Also, the documentation is not super clear on this - is the "enclosing environment" the environment in which another environment is created and thus walking parent.env() shows a "creation" chain?
Edit
Shameless plug: I wrote a blog post that explains environments, parent.env(), enclosures, namespace/package, etc. with intuitive diagrams.
1) Regarding how base could be there twice (given that environments form a tree), it's the fault of the environmentName function. The first occurrence is actually .BaseNamespaceEnv and the later occurrence is baseenv().
> identical(baseenv(), .BaseNamespaceEnv)
[1] FALSE
2) Regarding imports:PerformanceAnalytics, that is a special environment that R sets up to hold the imports mentioned in the package's NAMESPACE or DESCRIPTION file, so that objects in it are encountered before anything else.
Try running this for some clarity. The str(p) and following if statements will give a better idea of what p is:
library( PerformanceAnalytics )
x <- environment(chart.RelativePerformance)
str(x)
while (environmentName(x) != environmentName(emptyenv())) {
p <- parent.env(x)
cat("------------------------------\n")
str(p)
if (identical(p, .BaseNamespaceEnv)) cat("Same as .BaseNamespaceEnv\n")
if (identical(p, baseenv())) cat("Same as baseenv()\n")
x <- p
}
The first few items in your results give evidence of the rules R uses to search for variables used in functions in packages with namespaces. From the R-ext manual:
The namespace controls the search strategy for variables used by functions in the package.
If not found locally, R searches the package namespace first, then the imports, then the base
namespace and then the normal search path.
Elaborating just a bit, have a look at the first few lines of chart.RelativePerformance:
head(body(chart.RelativePerformance), 5)
# {
# Ra = checkData(Ra)
# Rb = checkData(Rb)
# columns.a = ncol(Ra)
# columns.b = ncol(Rb)
# }
When a call to chart.RelativePerformance is being evaluated, each of those symbols --- whether the checkData on line 1, or the ncol on line 3 --- needs to be found somewhere on the search path. Here are the first few enclosing environments checked:
First off is namespace:PerformanceAnalytics. checkData is found there, but ncol is not.
Next stop (and the first location listed in your results) is imports:PerformanceAnalytics. This is the list of functions specified as imports in the package's NAMESPACE file. ncol is not found here either.
The base namespace (where ncol will be found) is the last stop before proceeding to the normal search path. Almost any R function will use some base functions, so this stop ensures that none of that functionality can be broken by objects in the global environment or in other packages. (R's designers could have left it to package authors to explicitly import the base environment in their NAMESPACE files, but adding this default pass through base does seem like the better design decision.)
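The lookup order described above can be checked directly. This is a minimal sketch that uses stats::sd as a stand-in for chart.RelativePerformance, on the assumption that any function living in a package namespace behaves the same way; the variable names are only illustrative.

```r
# Walk the first few enclosing environments of a function defined in a namespace.
ns      <- environment(stats::sd)  # the package namespace
imports <- parent.env(ns)          # objects imported by the package
base_ns <- parent.env(imports)     # the base namespace
after   <- parent.env(base_ns)     # start of the normal search path

lookup_order <- c(
  environmentName(ns),
  environmentName(imports),
  environmentName(base_ns),
  environmentName(after)
)
print(lookup_order)
# [1] "stats"         "imports:stats" "base"          "R_GlobalEnv"
```

Note that environmentName() reports the namespace simply as "stats", which is why the printed chain alone does not make the namespace/package distinction obvious.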
The second entry, base, is .BaseNamespaceEnv, while the second-to-last entry, base, is baseenv(). As far as I can tell, the two expose the same base objects and differ in their parents: the parent of .BaseNamespaceEnv is .GlobalEnv, while that of baseenv() is emptyenv().
In a package, as @Josh says, R searches the namespace of the package, then the imports, and then the base namespace (i.e., .BaseNamespaceEnv).
You can see this with, e.g.:
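The difference between the two "base" entries can be verified in a few lines. This is a small sketch using only base R; the variable names are made up for illustration.

```r
ns_base  <- .BaseNamespaceEnv  # the "base" that follows the imports
pkg_base <- baseenv()          # the "base" at the end of the search path

# Both report the same name, which is why the printed chain looks odd.
same_name <- environmentName(ns_base) == environmentName(pkg_base)

# They are nevertheless distinct environments with different parents.
same_object         <- identical(ns_base, pkg_base)
ns_parent_is_global <- identical(parent.env(ns_base), globalenv())
pkg_parent_is_empty <- identical(parent.env(pkg_base), emptyenv())

print(c(same_name, same_object, ns_parent_is_global, pkg_parent_is_empty))
# [1]  TRUE FALSE  TRUE  TRUE
```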
> library(zoo)
> packageDescription("zoo")
Package: zoo
# ... snip ...
Imports: stats, utils, graphics, grDevices, lattice (>= 0.18-1)
# ... snip ...
> x <- environment(zoo)
> x
<environment: namespace:zoo>
> ls(x) # objects in zoo
[1] "-.yearmon" "-.yearqtr" "[.yearmon"
[4] "[.yearqtr" "[.zoo" "[<-.zoo"
# ... snip ...
> y <- parent.env(x)
> y # namespace of imported packages
<environment: 0x116e37468>
attr(,"name")
[1] "imports:zoo"
> ls(y) # objects in the imported packages
[1] "?" "abline"
[3] "acf" "acf2AR"
# ... snip ...
According to this post, environment() is the function to call to get the current environment.
However, I found that this does not seem to be the case inside eval, as the following examples show.
.env <- new.env()
.env$info$progress <- 3
.expr <- "environment()$info$progress <- 5"
eval(parse(text = .expr), envir = .env, enclos = .env)
> invalid (NULL) left side of assignment
I also tried the assign function, but it does not work either:
.env <- new.env()
.env$info$progress <- 3
.expr <- "assign(info$progress, 11, envir = environment())"
eval(parse(text = .expr), envir = .env, enclos = .env)
> Error in assign(info$progress, 11, envir = environment()) :
> invalid first argument
So environment() apparently failed to find the current environment inside eval.
I would appreciate it if anyone could tell me how to access the current environment in the examples above, or how to work around this issue in eval.
environment() does what you think it does. The issue is with assigning directly to the result of a function call.
> new.env()$info$progress <- 3
Error in new.env()$info$progress <- 3 :
invalid (NULL) left side of assignment
> .env <- new.env()
> .env$info$progress <- 3
> evalq(identical(environment(), .env), envir = .env)
[1] TRUE
> evalq({ e <- environment(); e$info$progress <- 5 }, envir = .env)
> .env$info
$progress
[1] 5
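For the string-based expressions in the question, two variations seem to work. This is a sketch, not the only way to do it: the names env, after_plain and after_assign are illustrative, and utils::modifyList is just one convenient way to update a list element.

```r
env <- new.env()
env$info <- list(progress = 3)

# 1) A plain assignment evaluated inside `env` modifies `info` in `env`;
#    no call to environment() is needed on the left-hand side.
eval(parse(text = "info$progress <- 5"), envir = env)
after_plain <- env$info$progress

# 2) assign() wants the *name* of the variable as a string, not its value.
eval(
  parse(text = 'assign("info", utils::modifyList(info, list(progress = 11)), envir = environment())'),
  envir = env
)
after_assign <- env$info$progress

print(c(after_plain, after_assign))
# [1]  5 11
```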
The goal (which I thought was access to a defined environment) can be accomplished by considering the fact that no call to environment is needed. That function with a NULL argument doesn't retrieve anything useful. The .env object is an environment, so the assignment should just be into it:
.env <- new.env()
.env$info$progres <- 3
.expr <- ".env$info$progres <- 5"
eval(parse(text = .expr) )
#------------
> ls(envir=.env)
[1] "info"
> ?get
> get("info", envir=.env)
$progres
[1] 5
The environment assignment operation is supposed to put values into the environment of functions. I think it's probably undefined when you make an assignment into an unbound environment. I would not have thought that environment()$info$progres <- 5 would have succeeded in placing a value into .env since the target of environment(.)<- was NULL.
Responding to your comment: I'm not sure what was meant by "a current environment". There is "the current environment" and the .env-environment was not that environment (nor was it ever that environment, even for an instant). Creating an environment with new.env does not make it the current environment. It only creates an environment which allows you to store or retrieve objects in it by referencing its name.
.env <- new.env()
environment()
#<environment: R_GlobalEnv>
It isn't even on the search path. It's kind of "on the sidelines" waiting to be referenced.
> search()
[1] ".GlobalEnv" "package:acs" "package:XML" "package:acepack" "package:abind"
[6] "package:downloader" "package:forcats" "package:stringr" "package:dplyr" "package:purrr"
[11] "package:readr" "package:tidyr" "package:tibble" "package:tidyverse" "tools:RGUI"
[16] "package:grDevices" "package:utils" "package:datasets" "package:graphics" "package:rms"
[21] "package:SparseM" "package:Hmisc" "package:ggplot2" "package:Formula" "package:stats"
[26] "package:survival" "package:sos" "package:brew" "package:lattice" "package:methods"
[31] "Autoloads" "package:base"
> ls(envir=.env)
[1] "info"
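The "on the sidelines" point can be checked programmatically. The sketch below uses a throwaway environment called scratch (a made-up name) and confirms that it has no name, is not one of the environments on the search path, and still holds whatever is stored in it.

```r
scratch <- new.env()
scratch$info <- list(progress = 3)

# Compare scratch against every environment on the search path.
on_search_path <- any(vapply(
  search(),
  function(nm) identical(as.environment(nm), scratch),
  logical(1)
))

has_name     <- nzchar(environmentName(scratch))                   # anonymous environment
found_inside <- exists("info", envir = scratch, inherits = FALSE)  # object is stored in it

print(c(on_search_path, has_name, found_inside))
# [1] FALSE FALSE  TRUE
```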
I find myself wondering if the goal was to use a more object-oriented style, and if so would recommend looking at the ?R6 help page and the section in the R Language Definition entitled: "5 Object-oriented programming".
After navigating through the help pages looking at the code for getAnywhere, ?find, ?ls, ?objects, I found a particular use of apropos that you might find interesting:
apropos("\\.", mode="environment")
[1] ".AutoloadEnv" ".BaseNamespaceEnv" ".env" ".GenericArgsEnv" ".GlobalEnv"
[6] ".userHooksEnv"
If you use:
apropos("." , mode="environment")
..., constructed with the most generic pattern possible, you will also find the 100 or so ggproto-environments defined by ggplot2-functions, assuming you have that package loaded. I think Hadley's "Advanced Programming" may have more on this topic of interest because he defines an "environment list" class and functions to manipulate them.
I am currently trying to translate my loaded packages into a character vector to use in the pkgDep function. Does anyone have any idea how to do this? Currently my results are formatted as a list, and using the unlist() function has not worked for me. I think rapply would do the trick, but I am running into issues with how to set up the function. I have pasted my code below. Thanks!
x <- loaded_packages()
typeof(x)
#need a character vector with package names to pass into function
pkgList <- pkgDep(x, availPkgs = pkgdata, suggests=TRUE)
Use the search() function to see the packages currently attached.
x <- search()
x
# [1] ".GlobalEnv" "package:dplyr" "package:stats"
# [4] "package:graphics" "package:grDevices" "package:utils"
# [7] "package:datasets" "package:methods" "Autoloads"
# [10] "package:base"
pkgList <- pkgDep(x, availPkgs = pkgdata, suggests=TRUE)
If you can tell us what pkgDep() function does, we can get the loaded packages list in specific format.
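Since search() returns entries such as "package:dplyr" plus non-package items like ".GlobalEnv", a bit of cleanup is probably needed before the result can serve as a vector of package names. This is a sketch using base R only; whether pkgDep() accepts the result as-is is an assumption carried over from the question, so that call is left commented out.

```r
on_path <- search()

# Keep only the "package:..." entries and strip the prefix.
attached <- sub("^package:", "", grep("^package:", on_path, value = TRUE))

# .packages() returns the same names (invisibly), so either form can be used.
same_as_dot_packages <- identical(attached, .packages())

print(attached)
# pkgList <- pkgDep(attached, availPkgs = pkgdata, suggests = TRUE)  # from the question; not run here
```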
Try this function:
x <- search()
As per this link.
I am wrapping my head around this:
> .packages()
> (.packages())
[1] "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base"
How is it possible that the first command outputs nothing and the second one works? I guess this is yet another syntax gotcha of R.
From the help page for .packages
‘.packages()’ returns the names of the currently attached packages
_invisibly_ whereas ‘.packages(all.available = TRUE)’ gives
(visibly) _all_ packages available in the library location path
‘lib.loc’.
Read the help page on invisible for more info, but basically, if something is returned invisibly, it won't automatically print. It is still there, so you can store it in an object; it just won't display by default. Here are a few other examples:
> 3
[1] 3
> invisible(3)
> x <- invisible(3)
> x
[1] 3
We see that when wrapped in invisible the "3" doesn't automatically print. We still can store it into an object even when it's invisible though.
Edit: Note that invisible only suppresses printing when the result would otherwise be autoprinted by the interpreter. We can force it to print using print or pretty much any other function call (( itself counts as a function, which is why wrapping the command in parentheses prints the result).
> invisible(3) + 0
[1] 3
> I(invisible(3))
[1] 3
> (invisible(3))
[1] 3
> print(invisible(3))
[1] 3
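The visibility flag can also be inspected directly with withVisible(), which returns both the value and whether it would be autoprinted. A small sketch, with quiet() as a made-up helper:

```r
quiet <- function() invisible(42)

vis_plain  <- withVisible(quiet())$visible      # returned invisibly
vis_parens <- withVisible((quiet()))$visible    # `(` makes the result visible again
vis_pkgs   <- withVisible(.packages())$visible  # same mechanism as in the question

print(c(vis_plain, vis_parens, vis_pkgs))
# [1] FALSE  TRUE FALSE
```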
Is there any way to programmatically distinguish between package environments and non-package environment objects? For example, the objects x and y below are both environments, with the same class and attributes.
x <- as.environment(cars)
y <- getNamespace("graphics")
However judging from the print method there is a difference:
> print(x)
<environment: 0x1d38118>
> print(y)
<environment: namespace:graphics>
Now suppose I have an arbitrary object, how can I determine which of the two it is (without looking at the output of print)? I would like to know this to determine how to store the object on disk. In case of the former I need to store the list representation of the environment (and perhaps its parents), but for the latter I would just store the name and version of the package.
isNamespace ?
isNamespace(y)
# [1] TRUE
isNamespace(x)
# [1] FALSE
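Building on isNamespace(), a small classifier could decide how to store an environment. The helper name env_kind and its three labels are made up for this sketch; it treats an environment whose name starts with "package:" as an attached package environment and everything else as a plain environment.

```r
env_kind <- function(e) {
  stopifnot(is.environment(e))
  if (isNamespace(e)) {
    return("namespace")
  }
  if (startsWith(environmentName(e), "package:")) {
    return("attached package")
  }
  "other"
}

plain <- as.environment(list(speed = 4, dist = 2))
ns    <- getNamespace("stats")

env_kind(plain)                            # "other"
env_kind(ns)                               # "namespace"
env_kind(as.environment("package:stats"))  # "attached package" (stats is attached by default)

# For a namespace, the name and version are enough to identify it on disk.
ns_name    <- unname(getNamespaceName(ns))
ns_version <- as.character(getNamespaceVersion(ns))
```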
And, for future reference, apropos is often helpful when you've got a question like this.
apropos("namespace")
# [1] "..getNamespace" ".BaseNamespaceEnv" ".getNamespace"
# [4] ".methodsNamespace" "asNamespace" "assignInMyNamespace"
# [7] "assignInNamespace" "attachNamespace" "fixInNamespace"
# [10] "getFromNamespace" "getNamespace" "getNamespaceExports"
# [13] "getNamespaceImports" "getNamespaceInfo" "getNamespaceName"
# [16] "getNamespaceUsers" "getNamespaceVersion" "isBaseNamespace"
# [19] "isNamespace" "loadedNamespaces" "loadingNamespaceInfo"
# [22] "loadNamespace" "namespaceExport" "namespaceImport"
# [25] "namespaceImportClasses" "namespaceImportFrom" "namespaceImportMethods"
# [28] "packageHasNamespace" "parseNamespaceFile" "requireNamespace"
# [31] "setNamespaceInfo" "unloadNamespace"
Is there a simple way to get the list/array of retention times from a xcmsRaw object?
Example Code:
xraw <- xcmsRaw(cdfFile)
So for example getting information from it :
xraw@env$intensity
or
xraw@env$mz
You can see what slots are available in your xcmsRaw instance with
> slotNames(xraw)
[1] "env" "tic" "scantime"
[4] "scanindex" "polarity" "acquisitionNum"
[7] "profmethod" "profparam" "mzrange"
[10] "gradient" "msnScanindex" "msnAcquisitionNum"
[13] "msnPrecursorScan" "msnLevel" "msnRt"
[16] "msnPrecursorMz" "msnPrecursorIntensity" "msnPrecursorCharge"
[19] "msnCollisionEnergy" "filepath"
What you want is xraw@msnRt, which is a numeric vector.
The env slot is an environment that stores 3 variables:
> ls(xraw@env)
[1] "intensity" "mz" "profile"
More details on the class itself at class?xcmsRaw.
EDIT: The msnRt slot is populated only if you specify includeMSn = TRUE, and your input file must be in mzXML or mzML, not in cdf; if you use the faahKO example from ?xcmsRaw, you will see that
xr <- xcmsRaw(cdffiles[1], includeMSn = TRUE)
Warning message:
In xcmsRaw(cdffiles[1], includeMSn = TRUE) :
Reading of MSn spectra for NetCDF not supported
Also, xr@msnRt will only store the retention times for MSn scans, with n > 1. See xset@rt, where xset is an xcmsSet instance, for the raw/corrected MS1 retention times as provided by xcms.
EDIT2: Alternatively, have a go with the mzR package
> library(mzR)
> cdffiles[1]
[1] "/home/lgatto/R/x86_64-unknown-linux-gnu-library/2.16/faahKO/cdf/KO/ko15.CDF"
> xx <- openMSfile(cdffiles[1])
> xx
Mass Spectrometry file handle.
Filename: /home/lgatto/R/x86_64-unknown-linux-gnu-library/2.16/faahKO/cdf/KO/ko15.CDF
Number of scans: 1278
> hd <- header(xx)
> names(hd)
[1] "seqNum" "acquisitionNum"
[3] "msLevel" "peaksCount"
[5] "totIonCurrent" "retentionTime"
[7] "basePeakMZ" "basePeakIntensity"
[9] "collisionEnergy" "ionisationEnergy"
[11] "highMZ" "precursorScanNum"
[13] "precursorMZ" "precursorCharge"
[15] "precursorIntensity" "mergedScan"
[17] "mergedResultScanNum" "mergedResultStartScanNum"
[19] "mergedResultEndScanNum"
> class(hd)
[1] "data.frame"
> dim(hd)
[1] 1278 19
but you will be outside of the default xcms pipeline if you take this route (although Steffen Neumann, from xcms, does know mzR very well, obviously).
Finally, you are better off using the Bioconductor mailing list or the xcms online forum if you want to maximise your chances of getting feedback from the xcms developers.
Hope this helps.
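Good answer, but I was looking for this:
xraw <- xcmsRaw(cdfFile)
# First column of rawMat() holds the retention time of every data point
xraw_rt_dp <- rawMat(xraw)[,1]
# Keep the first occurrence of each retention time (also safe when nothing is duplicated)
xrawData.rt <- xraw_rt_dp[!duplicated(xraw_rt_dp)]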
Now :
xrawData.rt #contains the retention time.
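One caveat when dropping repeated values by negative index: if there are no duplicates at all, which() returns an empty vector and the negative index selects nothing. The sketch below illustrates this on made-up retention times (not taken from any real file); a logical mask avoids the problem.

```r
# Made-up retention times, one per data point (several points share a scan).
rt_per_point <- c(1.5, 1.5, 2.0, 2.0, 2.0, 3.1)
rt_per_scan  <- rt_per_point[!duplicated(rt_per_point)]
print(rt_per_scan)
# [1] 1.5 2.0 3.1

# With no repeats, -which(...) is an empty index and everything is dropped.
no_repeats        <- c(1.5, 2.0, 3.1)
by_negative_index <- no_repeats[-which(duplicated(no_repeats))]  # numeric(0)
by_logical_mask   <- no_repeats[!duplicated(no_repeats)]         # unchanged
```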
Observation --> Using the mzR package:
nc <- mzR:::netCDFOpen(cdfFile)
ncData <- mzR:::netCDFRawData(nc)
mzR:::netCDFClose(nc)
ncData$rt #contains the same retention time for the same input !!!