According to this post, environment() is the function that returns the current environment.
However, I found that this does not seem to hold inside eval, as the following examples show.
.env <- new.env()
.env$info$progress <- 3
.expr <- "environment()$info$progress <- 5"
eval(parse(text = .expr), envir = .env, enclos = .env)
> invalid (NULL) left side of assignment
I also tried the assign function, but that does not work either:
.env <- new.env()
.env$info$progress <- 3
.expr <- "assign(info$progress, 11, envir = environment())"
eval(parse(text = .expr), envir = .env, enclos = .env)
> Error in assign(info$progress, 11, envir = environment()) :
> invalid first argument
So it looks as if environment() fails to find the current environment inside eval.
I would appreciate it if anyone could tell me how to access the current environment in the examples above, or how to work around this issue in eval.
environment() does what you think it does. The issue is with assigning directly to the result of a function call.
> new.env()$info$progress <- 3
Error in new.env()$info$progress <- 3 :
invalid (NULL) left side of assignment
> .env <- new.env()
> .env$info$progress <- 3
> evalq(identical(environment(), .env), envir = .env)
[1] TRUE
> evalq({ e <- environment(); e$info$progress <- 5 }, envir = .env)
> .env$info
$progress
[1] 5
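The same idea carries over to the string-based eval from the question. The following is only a minimal sketch: the names env, first_result and second_result, and the use of modifyList to update the list, are illustrative assumptions rather than part of the original code.

```r
env <- new.env()
env$info <- list(progress = 3)

# Bind the current environment to a name, then assign through that name.
# Side effect: the helper variable `e` is left behind in env.
eval(parse(text = "e <- environment(); e$info$progress <- 5"), envir = env)
first_result <- env$info$progress

# assign() takes a single variable name as a string, so replace the whole list.
expr <- 'assign("info", modifyList(info, list(progress = 11)), envir = environment())'
eval(parse(text = expr), envir = env)
second_result <- env$info$progress
```

Here first_result should be 5 and second_result should be 11, which suggests that environment() inside eval does refer to env; only the direct assignment into the result of a call is rejected.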
The goal (which I thought was access to a defined environment) can be accomplished by considering the fact that no call to environment is needed. That function with a NULL argument doesn't retrieve anything useful. The .env object is an environment, so the assignment should just be into it:
.env <- new.env()
.env$info$progress <- 3
.expr <- ".env$info$progress <- 5"
eval(parse(text = .expr))
#------------
> ls(envir=.env)
[1] "info"
> ?get
> get("info", envir=.env)
$progress
[1] 5
The environment assignment operation is supposed to put values into the environment of functions. I think it's probably undefined when you make an assignment into an unbound environment. I would not have thought that environment()$info$progress <- 5 would have succeeded in placing a value into .env since the target of environment(.)<- was NULL.
Responding to your comment: I'm not sure what was meant by "a current environment". There is "the current environment" and the .env-environment was not that environment (nor was it ever that environment, even for an instant). Creating an environment with new.env does not make it the current environment. It only creates an environment which allows you to store or retrieve objects in it by referencing its name.
.env <- new.env()
environment()
#<environment: R_GlobalEnv>
It isn't even on the search path. It's kind of "on the sidelines" waiting to be referenced.
> search()
[1] ".GlobalEnv" "package:acs" "package:XML" "package:acepack" "package:abind"
[6] "package:downloader" "package:forcats" "package:stringr" "package:dplyr" "package:purrr"
[11] "package:readr" "package:tidyr" "package:tibble" "package:tidyverse" "tools:RGUI"
[16] "package:grDevices" "package:utils" "package:datasets" "package:graphics" "package:rms"
[21] "package:SparseM" "package:Hmisc" "package:ggplot2" "package:Formula" "package:stats"
[26] "package:survival" "package:sos" "package:brew" "package:lattice" "package:methods"
[31] "Autoloads" "package:base"
> ls(envir=.env)
[1] "info"
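To make the "on the sidelines" point concrete, here is a small sketch using only base R. The names side_env and current_env are made up for illustration.

```r
side_env <- new.env()
assign("info", list(progress = 3), envir = side_env)

# The object is reachable only through an explicit reference to side_env.
found_in_side_env <- exists("info", envir = side_env, inherits = FALSE)

# None of the environments on the search path is side_env.
on_search_path <- any(vapply(
  search(),
  function(nm) identical(as.environment(nm), side_env),
  logical(1)
))

# Inside a function call, environment() is that call's own frame, not side_env.
current_env <- function() environment()
is_current <- identical(current_env(), side_env)
```

found_in_side_env should be TRUE, while on_search_path and is_current should both be FALSE: creating an environment neither attaches it nor makes it current.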
I find myself wondering if the goal was to use a more object-oriented style, and if so would recommend looking at the ?R6 help page and the section in the R Language Definition entitled: "5 Object-oriented programming".
After navigating through the help pages looking at the code for getAnywhere, ?find, ?ls, ?objects, I found a particular use of apropos that you might find interesting:
apropos("\\.", mode="environment")
[1] ".AutoloadEnv" ".BaseNamespaceEnv" ".env" ".GenericArgsEnv" ".GlobalEnv"
[6] ".userHooksEnv"
If you use:
apropos(".", mode="environment")
..., constructed with the most generic pattern possible, you will also find the 100 or so ggproto-environments defined by ggplot2-functions, assuming you have that package loaded. I think Hadley's "Advanced Programming" may have more on this topic of interest because he defines an "environment list" class and functions to manipulate them.
I have found that the usual way of finding the WD of a script sourced in RStudio is "dirname(parent.frame(2)$ofile)". I tried to research the meaning of this, read lots of explanations on environments, but I am still no closer to understanding this command. I ran this script:
print(environment())
print(parent.frame(1))
print(parent.frame(2))
print(parent.frame(3))
print(parent.frame(4))
f <- function() {
  print('Do:')
  print(environment())
  print(parent.frame(1))
  print(parent.frame(2))
  print(parent.frame(3))
  print(parent.frame(4))
}
f()
print("Parent 1:")
print(ls(parent.frame(1)))
print("Parent 2:")
print(ls(parent.frame(2)))
print(identical(environment(),parent.frame(1)))
print(identical(environment(),parent.frame(2)))
print(identical(environment(),parent.frame(3)))
print(identical(environment(),parent.frame(4)))
I got the output:
<environment: R_GlobalEnv>
<environment: 0x111987ab0>
<environment: 0x107fef700>
<environment: R_GlobalEnv>
<environment: R_GlobalEnv>
[1] "Do:"
<environment: 0x1119957d8>
<environment: R_GlobalEnv>
<environment: 0x111995998>
<environment: 0x107fef700>
<environment: R_GlobalEnv>
[1] "Parent 1:"
[1] "enclos" "envir" "expr"
[1] "Parent 2:"
[1] "chdir" "continue.echo" "curr.fun" "deparseCtrl"
[5] "echo" "ei" "enc" "encoding"
[9] "envir" "exprs" "file" "filename"
[13] "from_file" "have_encoding" "i" "i.symbol"
[17] "keep.source" "lastshown" "lines" "loc"
[21] "local" "max.deparse.length" "Ne" "ofile"
[25] "print.eval" "prompt.echo" "skip.echo" "spaced"
[29] "srcfile" "srcrefs" "tail" "use_file"
[33] "verbose" "width.cutoff" "yy"
[1] FALSE
[1] FALSE
[1] TRUE
[1] TRUE
I am not sure I understand the output.
1) What exactly are parents 1 and 2 of the global environment? Where can I read more about their attributes, including the $ofile used above?
2) Why is parent.frame(1) not equivalent to parent.frame(2) from within the function? Aren't they identical, both being the parent of the global environment?
3) Why does parent.frame start returning the global environment once the numbers get sufficiently big? Is this just how the function is written, or is there some logic to this hierarchy?
When you go up the parent.frame, you can see the environments of the functions that are calling your function. The source() function has a lot of code that makes it work. It doesn't just dump the commands into your console. Basically it's running something like
source <- function(...) {
...
eval(ei, envir)
...
}
where ei is one of the expressions in your file. Then eval looks like this
eval <- function(expr, envir, enclos) {  # default arguments omitted
  .Internal(eval(expr, envir, enclos))
}
So when you call the first parent.frame() from a function that you call in a file that's sourced, it's going to see the eval() call first. If you look at formals(eval) you can see that it has those three variables that are in your first parent. The second parent lists all the variables that are created in the source() function itself, including the ei variable we just saw. So here's where those values are:
# parent.frame(4)
# parent.frame(3)
source <- function(...) {
  # parent.frame(2)
  eval(ei, envir)
}
eval <- function(expr, envir, enclos) {  # default arguments omitted
  # parent.frame(1)
  .Internal(eval(expr, envir, enclos))
  # ^^ your code
}
But variable resolution in R doesn't look in environments where functions are called from. Rather it uses lexical scoping so it looks where a function is defined (not called). If you want to get that environment, you can call parent.env(environment()) from inside the function instead. With a simple function, you should get the global environment. So really this means that parent.frame is just an unfortunate name because that's not "really" what it is.
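The difference between the calling frame and the defining environment can be seen in a short sketch. All names here (make_inner, caller_fun and so on) are hypothetical and exist only for this illustration.

```r
make_inner <- function() {
  defined_in <- environment()          # where inner() is defined
  inner <- function() {
    list(
      caller    = parent.frame(),              # frame inner() was called from
      enclosure = parent.env(environment())    # environment inner() was defined in
    )
  }
  list(inner = inner, defined_in = defined_in)
}

made <- make_inner()

caller_fun <- function() {
  called_from <- environment()
  res <- made$inner()
  c(
    caller_matches    = identical(res$caller, called_from),
    enclosure_matches = identical(res$enclosure, made$defined_in)
  )
}

result <- caller_fun()
```

Both elements of result should be TRUE: parent.frame() follows the call stack, whereas parent.env(environment()) follows lexical scoping.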
I am trying to dive into the internals of static code analysis packages like codetools and CodeDepends, and my immediate goal is to understand how to detect function calls written as package_name::function_name() or package_name:::function_name(). I would have liked to just use findGlobals() from codetools, but this is not so simple.
Example function to analyze:
f <- function(n){
  tmp <- digest::digest(n)
  stats::rnorm(n)
}
Desired functionality:
analyze_function(f)
## [1] "digest::digest" "stats::rnorm"
Attempt with codetools:
library(codetools)
f = function(n) stats::rnorm(n)
findGlobals(f, merge = FALSE)
## $functions
## [1] "::"
##
## $variables
## character(0)
CodeDepends comes closer, but I am not sure I can always use the output to match functions to packages. I am looking for an automatic rule that connects rnorm() to stats and digest() to digest.
library(CodeDepends)
getInputs(body(f))
## An object of class "ScriptNodeInfo"
## Slot "files":
## character(0)
##
## Slot "strings":
## character(0)
##
## Slot "libraries":
## [1] "digest" "stats"
##
## Slot "inputs":
## [1] "n"
##
## Slot "outputs":
## [1] "tmp"
##
## Slot "updates":
## character(0)
##
## Slot "functions":
## { :: digest rnorm
## NA NA NA NA
##
## Slot "removes":
## character(0)
##
## Slot "nsevalVars":
## character(0)
##
## Slot "sideEffects":
## character(0)
##
## Slot "code":
## {
## tmp <- digest::digest(n)
## stats::rnorm(n)
## }
EDIT To be fair to CodeDepends, there is so much customizability and power for those who understand the internals. At the moment, I am just trying to wrap my head around collectors, handlers, walkers, etc. Apparently, it is possible to modify the standard :: collector to make special note of each namespaced call. For now, here is a naive attempt at something similar.
col <- inputCollector(`::` = function(e, collector, ...){
collector$call(paste0(e[[2]], "::", e[[3]]))
})
getInputs(quote(stats::rnorm(x)), collector = col)@functions
Browse[1]> getInputs(quote(stats::rnorm(x)), collector = col)@functions
stats::rnorm rnorm
NA NA
If you want to extract namespaced functions from a function, try something like this
find_ns_functions <- function(f, found=c()) {
  if( is.function(f) ) {
    # function, begin search on body
    return(find_ns_functions(body(f), found))
  } else if (is.call(f) && deparse(f[[1]]) %in% c("::", ":::")) {
    found <- c(found, deparse(f))
  } else if (is.recursive(f)) {
    # compound object, iterate through sub-parts
    v <- lapply(as.list(f), find_ns_functions, found)
    found <- unique( c(found, unlist(v) ))
  }
  found
}
And we can test with
f <- function(n){
  tmp <- digest::digest(n)
  stats::rnorm(n)
}
find_ns_functions(f)
# [1] "digest::digest" "stats::rnorm"
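If the aim is a rule that maps each function to its package, a variant of the same walk can keep the two sides of the :: call separate instead of deparsing them into one string. This is only a sketch; ns_calls and the column names package and name are made up here. It only inspects the code, so the digest package does not need to be installed.

```r
ns_calls <- function(expr) {
  if (is.function(expr)) expr <- body(expr)
  is_ns_call <- is.call(expr) &&
    (identical(expr[[1]], quote(`::`)) || identical(expr[[1]], quote(`:::`)))
  if (is_ns_call) {
    # expr[[2]] is the package symbol, expr[[3]] the object symbol
    return(data.frame(
      package = as.character(expr[[2]]),
      name = as.character(expr[[3]]),
      stringsAsFactors = FALSE
    ))
  }
  if (is.recursive(expr)) {
    # compound object: collect matches from all sub-parts (NULLs are dropped)
    parts <- lapply(as.list(expr), ns_calls)
    return(unique(do.call(rbind, parts)))
  }
  NULL
}

f <- function(n) {
  tmp <- digest::digest(n)
  stats::rnorm(n)
}
calls <- ns_calls(f)
```

For this f, calls should contain one row pairing digest with digest and one pairing stats with rnorm. Like the function above, this sketch does not handle calls with empty arguments such as x[, 1].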
OK, so this was possible with CodeDepends previously, but a bit harder than it should have been. I've just committed version 0.5-4 to GitHub, which now makes this really "easy". Essentially you just need to modify the default colons handlers ("::" and/or ":::") as follows:
library(CodeDepends) # version >= 0.5-4
handler = function(e, collector, ..., iscall = FALSE) {
  collector$library(asVarName(e[[2]]))
  ## :: or ::: name, remove if you don't want to count those as functions called
  collector$call(asVarName(e[[1]]))
  if(iscall)
    collector$call(deparse((e))) # whole expr, i.e. stats::rnorm
  else
    collector$vars(deparse((e)), input=TRUE) # whole expr, i.e. stats::rnorm
}
getInputs(quote(stats::rnorm(x,y,z)), collector = inputCollector("::" = handler))
getInputs(quote(lapply( 1:10, stats::rnorm)), collector = inputCollector("::" = handler))
The first getInputs call above gives the result:
An object of class "ScriptNodeInfo"
Slot "files":
character(0)
Slot "strings":
character(0)
Slot "libraries":
[1] "stats"
Slot "inputs":
[1] "x" "y" "z"
Slot "outputs":
character(0)
Slot "updates":
character(0)
Slot "functions":
:: stats::rnorm
NA NA
Slot "removes":
character(0)
Slot "nsevalVars":
character(0)
Slot "sideEffects":
character(0)
Slot "code":
stats::rnorm(x, y, z)
As, I believe, desired.
One thing to note here is the iscall argument I've added to the colons handler. The default handler and applyhandlerfactory now have special logic so that when they invoke one of the colons handlers in a situation where it is a function being called, that is set to TRUE.
I haven't done extensive testing yet of what will happen when "stats::rnorm" appears in lieu of symbols, particularly in the inputs slot when calculating dependencies, but I'm hopeful that should all continue to work as well. If it doesn't let me know.
~G
Follow up to this
I want to source scripts inside a given environment, like in sys.source, but "exporting" only some functions and keeping the others private.
I created this function:
source2=function(script){
  ps=paste0(script, "_")
  assign(ps, new.env(parent=baseenv()))
  assign(script, new.env(parent=get(ps)))
  private=function(f){
    fn=deparse(substitute(f))
    assign(fn, f, parent.env(parent.frame()))
    rm(list=fn, envir=parent.frame())
  }
  assign("private", private, get(script))
  sys.source(paste0(script, ".R"), envir=get(script))
  rm(private, envir=get(script))
  attach(get(script), name=script)
}
For the most part, this function works as expected.
Consider the script:
## foo.R
f=function() g()
g=function() print('hello')
private(g)
Note the private() function, which will hide g().
If I, so to say, import the module foo:
source2("foo")
I have a new environment in the search path:
search()
## [1] ".GlobalEnv" "foo" "package:stats"
## [4] "package:graphics" "package:grDevices" "package:utils"
## [7] "package:datasets" "package:methods" "Autoloads"
## [10] "package:base"
The current environment, .GlobalEnv, shows only:
ls()
## [1] "source2"
But if I list items in foo environment:
ls("foo")
## [1] "f"
Therefore I can run:
f()
## [1] "hello"
The problem is that g() is hidden totally.
getAnywhere(g)
## no object named 'g' was found
Too much. In fact, if I want to debug f():
debug(f)
f()
debugging in: f()
## Error in f() : could not find function "g"
The question is:
Where is g()? Can I still retrieve it?
Use:
get("g", envir=environment(f))
## function ()
## print("hello")
## <environment: 0x0000000018780c30>
ls(parent.env(environment(f)))
## [1] "g"
Credit goes to Alexander Griffith for the solution.
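The layout that source2 builds can be reproduced in miniature without any script file. This is a sketch with hypothetical names (private_env, public_env); it only mirrors the parent/child arrangement described above, not the attach step.

```r
# The private environment sits between the public one and baseenv().
private_env <- new.env(parent = baseenv())
public_env  <- new.env(parent = private_env)

# f is defined in the public environment, g in its parent.
evalq(f <- function() g(), public_env)
evalq(g <- function() "hello", private_env)

f <- get("f", envir = public_env)

visible_in_public <- exists("g", envir = public_env, inherits = FALSE)
via_inheritance   <- get("g", envir = environment(f))   # walks up to private_env
f_result          <- f()
```

visible_in_public should be FALSE, yet f_result should be "hello", because f looks g up through its enclosing environment and that environment's parent.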
I am currently trying to translate my loaded packages into a character vector to use in the pkgDep function. Does anyone have any idea how to do this? Currently my results are formatted as a list, and using the unlist() function has not worked for me. I think rapply would do the trick, but I am running into issues with how to set up the function. I have pasted my code below. Thanks!
x <- loaded_packages()
typeof(x)
#need a character vector with package names to pass into function
pkgList <- pkgDep(x, availPkgs = pkgdata, suggests=TRUE)
Use the search() function to see the packages currently attached.
x <- search()
x
# [1] ".GlobalEnv" "package:dplyr" "package:stats"
# [4] "package:graphics" "package:grDevices" "package:utils"
# [7] "package:datasets" "package:methods" "Autoloads"
# [10] "package:base"
pkgList <- pkgDep(x, availPkgs = pkgdata, suggests=TRUE)
If you can tell us what the pkgDep() function does, we can get the list of loaded packages into the specific format it needs.
Try this function:
x <- search()
As per this link.
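Note that search() also returns entries such as ".GlobalEnv" and prefixes package names with "package:". If bare package names are what is needed, a sketch along these lines, using base R only, may help. The variable names are illustrative.

```r
attached <- search()

# Keep only the "package:..." entries and strip the prefix.
pkg_names <- sub("^package:", "", grep("^package:", attached, value = TRUE))

# Base R offers the same information directly through .packages().
same_as_base_helper <- setequal(pkg_names, .packages())
```

pkg_names is a plain character vector, so it could presumably be passed where x is used in the pkgDep call above; what pkgDep expects beyond that is not stated in this thread.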
I've read the documentation for parent.env() and it seems fairly straightforward - it returns the enclosing environment. However, if I use parent.env() to walk the chain of enclosing environments, I see something that I cannot explain. First, the code (taken from "R in a nutshell")
library( PerformanceAnalytics )
x = environment(chart.RelativePerformance)
while (environmentName(x) != environmentName(emptyenv()))
{
  print(environmentName(parent.env(x)))
  x <- parent.env(x)
}
And the results:
[1] "imports:PerformanceAnalytics"
[1] "base"
[1] "R_GlobalEnv"
[1] "package:PerformanceAnalytics"
[1] "package:xts"
[1] "package:zoo"
[1] "tools:rstudio"
[1] "package:stats"
[1] "package:graphics"
[1] "package:utils"
[1] "package:datasets"
[1] "package:grDevices"
[1] "package:roxygen2"
[1] "package:digest"
[1] "package:methods"
[1] "Autoloads"
[1] "base"
[1] "R_EmptyEnv"
How can we explain the "base" at the top and the "base" at the bottom? Also, how can we explain "package:PerformanceAnalytics" and "imports:PerformanceAnalytics"? Everything would seem consistent without the first two lines. That is, function chart.RelativePerformance is in the package:PerformanceAnalytics environment which is created by xts, which is created by zoo, ... all the way up (or down) to base and the empty environment.
Also, the documentation is not super clear on this - is the "enclosing environment" the environment in which another environment is created and thus walking parent.env() shows a "creation" chain?
Edit
Shameless plug: I wrote a blog post that explains environments, parent.env(), enclosures, namespace/package, etc. with intuitive diagrams.
1) Regarding how base could be there twice (given that environments form a tree), it's the fault of the environmentName function. Actually the first occurrence is .BaseNamespaceEnv and the latter occurrence is baseenv().
> identical(baseenv(), .BaseNamespaceEnv)
[1] FALSE
2) Regarding the imports:PerformanceAnalytics that is a special environment that R sets up to hold the imports mentioned in the package's NAMESPACE or DESCRIPTION file so that objects in it are encountered before anything else.
Try running this for some clarity. The str(p) and following if statements will give a better idea of what p is:
library( PerformanceAnalytics )
x <- environment(chart.RelativePerformance)
str(x)
while (environmentName(x) != environmentName(emptyenv())) {
  p <- parent.env(x)
  cat("------------------------------\n")
  str(p)
  if (identical(p, .BaseNamespaceEnv)) cat("Same as .BaseNamespaceEnv\n")
  if (identical(p, baseenv())) cat("Same as baseenv()\n")
  x <- p
}
The first few items in your results give evidence of the rules R uses to search for variables used in functions in packages with namespaces. From the R-ext manual:
The namespace controls the search strategy for variables used by functions in the package.
If not found locally, R searches the package namespace first, then the imports, then the base
namespace and then the normal search path.
Elaborating just a bit, have a look at the first few lines of chart.RelativePerformance:
head(body(chart.RelativePerformance), 5)
# {
# Ra = checkData(Ra)
# Rb = checkData(Rb)
# columns.a = ncol(Ra)
# columns.b = ncol(Rb)
# }
When a call to chart.RelativePerformance is being evaluated, each of those symbols --- whether the checkData on line 1, or the ncol on line 3 --- needs to be found somewhere on the search path. Here are the first few enclosing environments checked:
First off is namespace:PerformanceAnalytics. checkData is found there, but ncol is not.
Next stop (and the first location listed in your results) is imports:PerformanceAnalytics. This is the list of functions specified as imports in the package's NAMESPACE file. ncol is not found here either.
The base environment namespace (where ncol will be found) is the last stop before proceeding to the normal search path. Almost any R function will use some base functions, so this stop ensures that none of that functionality can be broken by objects in the global environment or in other packages. (R's designers could have left it to package authors to explicitly import the base environment in their NAMESPACE files, but adding this default pass through base does seem like the better design decision.)
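The three stops above can be checked with a package that ships with R. This is a small sketch using stats::sd as a stand-in for chart.RelativePerformance, so that no extra package is needed.

```r
env <- environment(stats::sd)   # the namespace in which sd() was defined
chain <- character(0)
for (i in 1:4) {
  chain <- c(chain, environmentName(env))
  env <- parent.env(env)
}
chain
# [1] "stats"         "imports:stats" "base"          "R_GlobalEnv"
```

The first entry is the package namespace, followed by its imports, the base namespace, and then the start of the normal search path.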
The second item in your results ("base") is .BaseNamespaceEnv, while the second-to-last item (also "base") is baseenv(). They differ in their parents: the parent of .BaseNamespaceEnv is .GlobalEnv, while that of baseenv() is emptyenv().
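These relationships can be verified directly in base R. A brief sketch, with variable names chosen only for illustration:

```r
# Both environments report the same name ...
same_name <- environmentName(.BaseNamespaceEnv) == environmentName(baseenv())

# ... but they are distinct objects with different parents.
same_object          <- identical(.BaseNamespaceEnv, baseenv())
ns_parent_is_global  <- identical(parent.env(.BaseNamespaceEnv), globalenv())
base_parent_is_empty <- identical(parent.env(baseenv()), emptyenv())
```

same_name should be TRUE and same_object FALSE, with both parent checks TRUE, which accounts for "base" showing up twice in the printed chain.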
In a package, as @Josh says, R searches the namespace of the package, then the imports, and then the base (i.e., .BaseNamespaceEnv).
you can find this by, e.g.:
> library(zoo)
> packageDescription("zoo")
Package: zoo
# ... snip ...
Imports: stats, utils, graphics, grDevices, lattice (>= 0.18-1)
# ... snip ...
> x <- environment(zoo)
> x
<environment: namespace:zoo>
> ls(x) # objects in zoo
[1] "-.yearmon" "-.yearqtr" "[.yearmon"
[4] "[.yearqtr" "[.zoo" "[<-.zoo"
# ... snip ...
> y <- parent.env(x)
> y # namespace of imported packages
<environment: 0x116e37468>
attr(,"name")
[1] "imports:zoo"
> ls(y) # objects in the imported packages
[1] "?" "abline"
[3] "acf" "acf2AR"
# ... snip ...