Finding the origin environment of the ... (dots) arguments of a call - r

I want to be able to find the environment from which the ... (dots) arguments of a call originate.
Scenario
For example, consider a function
foo <- function(x, ...) {
# do something
}
We want a function env_dots(), which we invoke from within foo(), that finds the originating environment of the ... in a call to foo(), even when the call to foo() is deeply nested. That is, if we define
foo <- function(x, ...) {
# find the originating environment of '...'
env <- env_dots()
# do something
}
and nest a call to foo, like so,
baz <- function(...) {
a <- "You found the dots"
bar(1, 2)
}
bar <- function(...)
foo(...)
then calling baz() should return the environment in which the ... in the (nested) call to foo(...) originates: this is the environment where the call bar(1, 2) is made, since the 2 (but not the 1) gets passed to the dots of foo. In particular, we should get
baz()$a
#> [1] "You found the dots"
Naive implementation of env_dots()
Update — env_dots(), as defined here, will not work in general, because the final ... may be populated by arguments that are called at multiple levels of the call stack.
Here's one possibility for env_dots():
# mc: match.call() of function from which env_dots() is called
env_dots <- function(mc) {
# Return NULL if initial call invokes no dots
if (!rlang::has_name(mc, "...")) return(NULL)
# Otherwise, climb the call stack until the dots origin is found
stack <- rlang::call_stack()[-1]
l <- length(stack)
i <- 1
while (i <= l && has_dots(stack[[i]]$expr)) i <- i + 1
# return NULL if no dots invoked
if (i <= l) stack[[i + 1]]$env else NULL
}
# Does a call have dots?
has_dots <- function(x) {
if (is.null(x))
return(FALSE)
args <- rlang::lang_tail(x)
any(vapply(args, identical, logical(1), y = quote(...)))
}
This seems to work: with
foo <- function(x, ...)
env_dots(match.call(expand.dots = FALSE))
we get
baz()$a
#> [1] "You found the dots"
bar(1, 2) # 2 gets passed down to the dots of foo()
#> <environment: R_GlobalEnv>
bar(1) # foo() captures no dots
#> NULL
Questions
The above implementation of env_dots() is not very efficient.
Is there are more skillful way to implement env_dots() in rlang and/or base R?
How can I move the match.call() invocation to within env_dots()?
match.call(sys.function(-1), call = sys.call(-1), expand.dots = FALSE) will indeed work.
Remark — One can't infer the origin environment of the dots from rlang::quos(...), because some quosures won't be endowed with the calling environment (e.g., when an expression is a literal object).

I'm sorry to dig up an old question, but I'm not sure the desired behavior is well-defined. ... is not a single expression; it's a list of expressions. In case of rlang quosures, each of those expressions has their own environment. So what should the environment of the list be?
Furthermore, the ... list itself can be modified. Consider the following example, where g takes its ..., prepends it with an (unevaluated) expression x+3 and passes it onto f.
f <- function(...) {rlang::enquos( ... )}
g <- function(...) {
a <- rlang::quo( x + 3 )
l <- rlang::list2( a, ... )
f(!!!l)
}
b <- rlang::quo( 5 * y )
g( b, 10 )
# [[1]]
# <quosure>
# expr: ^x + 3
# env: 0x7ffd1eca16f0
# [[2]]
# <quosure>
# expr: ^5 * y
# env: global
# [[3]]
# <quosure>
# expr: ^10
# env: empty
Notice that each of the three quosures that make it over to f has their own environment. (As you noted in your question, literals like 10 have an empty environment. This is because the value is the same independent of which environment it's evaluated in.)
Given this scenario, what should the hypothetical env_dots() return when called inside f()?

Related

Access function arguments with names of primitive functions

I am trying to understand the behaviour of user-defined functions like the below (based on the first answer to this question), which returns the arguments supplied to it as a named list:
function(a, b, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
Essentially, functions like the above produce unexpected behaviour when one of the argument names is also the name of a primitive function whose only parameter is ...
Below are some reproducible examples.
Example 1 - function behaves as expected, missing argument does not cause error
#define function as above
fun1 <- function(a, b, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
#run function
fun1(a = 1)
#returns the below. note that $b has the missing argument and this does not cause an error
#$a
#[1] 1
#$b
Example 2 - function returns error if 'c' is one of the explicit parameters and missing
#define function as above but with new explicit argument, called 'c'
#note that c() is a primitive function whose only parameter is ...
fun2 <- function(a, b, c, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
#run function
fun2(a = 1)
#returns error:
#Error in c(as.list(environment()), list(...)) :
# argument "c" is missing, with no default
Example 3 - replace 'c' with 'switch', a primitive function with parameters other than ...
#define function same way as fun2, but change 'c' parameter to 'switch'
#note that switch() is a primitive function that has parameters other than ...
fun3 <- function(a, b, switch, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
#run function
fun3(a = 1)
#returns the below. note that $b and $switch have the missing argument and this does not cause an error
#$a
#[1] 1
#$b
#$switch
I have tried numerous variations of the above that seem pointless to print here given that the basic pattern should be clear and thus easily reproducible without specific passages of code; suffice to say that as far as I have been able to tell, it appears that the function returns an error if one of its arguments a.) has the same name as a primitive function whose only parameter is ... and b.) is also missing. No other changes that I tested (such as removing the ... from the user-defined function's parameters; altering the order in which the arguments are specified when calling the function or when defining the function; changing the names and quantity of other arguments specified when calling the function or defining the function, etc.) had an impact on whether the behaviour was as expected.
Another point to note is that I don't see an error if I define a function with the same parameters as fun2, and with the c argument still missing, if I am not trying to access the function's arguments inside it. For example:
#define function with same parameters but different content to fun2
fun4 <- function(a, b, c, ...) {
return(a+b)
}
#run function
fun4(a = 1, b = 2)
#returns
#[1] 3
Please could somebody explain why I see this pattern of behaviour and the reason for the key role apparently played by primitive functions that only have ... as a parameter.
Please do not submit answers or comments suggesting 'workarounds' or querying the practical significance of the issue at hand. I am not asking my question in order to address a specific practical problem and there is no reason I can think of why I would ever be forced to use the name of a primitive function as a parameter; rather, I want to understand why the errors occur when they do in order to gain a clearer understanding of how functions in general, and the processes used to access their parameters in particular, work in R.
It's not the ... that's causing the problem. When you call c(), R looks for the function definition in the environment. Outside of a function it will normally find this as base::c. But within your function it first looks for the definition in the argument c in the function call, which it then can't find. This way of calling shows that it can work by telling R specifically where to find the definition of c:
fun4 <- function(a, b, c, ...) {
argg <- base::c(as.list(environment()), list(...))
print(argg)
}
#run function
fun4(a = 1)
#> $a
#> [1] 1
#>
#> $b
#>
#>
#> $c
Environments - from Advanced R
To demonstrate where things are being called you can use this tip from Advanced R by Hadley Wickham to see where R is finding each object. In the function where c isn't an argument, it finds it in base, otherwise it "finds" it in the function environment (where a and b are also defined):
library(rlang)
where <- function(name, env = caller_env()) {
if (identical(env, empty_env())) {
stop("Can't find ", name, call. = FALSE)
} else if (env_has(env, name)) {
env
} else {
where(name, env_parent(env))
}
}
fun5 <- function(a, b, ...) {
print(where("a"))
print(where("b"))
print(where("c"))
}
#run function
fun5(a = 1)
#> <environment: 0x000000001de35890>
#> <environment: 0x000000001de35890>
#> <environment: base>
fun6 <- function(a, b, c, ...) {
print(where("a"))
print(where("b"))
print(where("c"))
}
#run function
fun6(a = 1)
#> <environment: 0x000000001e1381f0>
#> <environment: 0x000000001e1381f0>
#> <environment: 0x000000001e1381f0>
Created on 2021-12-15 by the reprex package (v2.0.1)

R: last argument in elipsis

I have a function f that takes three arguments, and returns the last one. For example:
f <- function(x,y,z){
return(z)
}
f(11,22,33) #yields 33
However, later on I may have more/less than three arguments, so I want to create a function g that returns the last argument of ...
g <- function(...){
#return final argument in '...'
}
g(11,22) #should yield 22
g(11,22,33,44,'abc') #should yield 'abc'
Is there any simple way to do this?
I've looked at existing posts on using ..., but they all seem to use it to pass all the arguments to another function (which is not what I'm trying to do).
I could just make the argument into a vector, and return the last element, but I'd like to avoid that if possible.
Use ...length and ...elt like this:
f <- function(...) ...elt(...length())
f(11, 12, 13)
## [1] 13
g <- function(...) {
dots <- list(...)
if (length(dots)) dots[[length(dots)]]
}
g(11,22)
# [1] 22
g(11,22,33,44,'abc')
# [1] "abc"
g() # returns nothing, NULL invisibly
I often choose to assign to dots or similar in the function and then deal with it, though that can be both good and bad. If you want to pass the args on to other functions, then you can still use ... as before (or you can use do.call(..), though that's not your question). One side-effect is that by doing this, the ... are evaluated, which though generally fine, in some corner-cases this evaluation may be too soon.
A demonstration:
g(stop("quux"), "abc")
# Error in g(stop("quux"), "abc") : quux
If you want to avoid early evaluation of other args, you can use match.call:
g <- function(...){
cl <- match.call(expand.dots = TRUE)
if (length(cl) > 1) cl[[length(cl)]]
}
g() # nothing
g(1)
# [1] 1
g(stop("quux"), "abc")
# [1] "abc"

Manipulating enclosing environment of a function

I'm trying to get a better understanding of closures, in particular details on a function's scope and how to work with its enclosing environment(s)
Based on the Description section of the help page on rlang::fn_env(), I had the understanding, that a function always has access to all variables in its scope and that its enclosing environment belongs to that scope.
But then, why isn't it possible to manipulate the contents of the closure environment "after the fact", i.e. after the function has been created?
By means of R's lexical scoping, shouldn't bar() be able to find x when I put into its enclosing environment?
foo <- function(fun) {
env_closure <- rlang::fn_env(fun)
env_closure$x <- 5
fun()
}
bar <- function(x) x
foo(bar)
#> Error in fun(): argument "x" is missing, with no default
Ah, I think I got it down now.
It has to do with the structure of a function's formal arguments:
If an argument is defined without a default value, R will complain when you call the function without specifiying that even though it might technically be able to look it up in its scope.
One way to kick off lexical scoping even though you don't want to define a default value would be to set the defaults "on the fly" at run time via rlang::fn_fmls().
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fun()
}
# No argument at all -> lexical scoping takes over
baz <- function() x
foo(baz)
#> [1] 5
# Set defaults to desired values on the fly at run time of `foo()`
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fmls <- rlang::fn_fmls(fun)
fmls$x <- substitute(get("x", envir = env_enclosing, inherits = FALSE))
rlang::fn_fmls(fun) <- fmls
fun()
}
bar <- function(x) x
foo(bar)
#> [1] 5
I can't really follow your example as I am unfamiliar with the rlang library but I think a good example of a closure in R would be:
bucket <- function() {
n <- 1
foo <- function(x) {
assign("n", n+1, envir = parent.env(environment()))
n
}
foo
}
bar <- bucket()
Because bar() is define in the function environment of bucket then its parent environment is bucket and therefore you can carry some data there. Each time you run it you modify the bucket environment:
bar()
[1] 2
bar()
[1] 3
bar()
[1] 4

Why is this simple function not working?

I first defined new variable x, then created function that require x within its body (not as argument). See code below
x <- c(1,2,3)
f1 <- function() {
x^2
}
rm(x)
f2 <- function() {
x <- c(1,2,3)
f1()
}
f(2)
Error in f1() : object 'x' not found
When I removed x, and defined new function f2 that first define x and then execute f1, it shows objects x not found.
I just wanted to know why this is not working and how I can overcome this problem. I do not want x to be name as argument in f1.
Please provide appropriate title because I do not know what kind of problem is this.
You could use a closure to make an f1 with the desired properties:
makeF <- function(){
x <- c(1,2,3)
f1 <- function() {
x^2
}
f1
}
f1 <- makeF()
f1() #returns 1 4 9
There is no x in the global scope but f1 still knows about the x in the environment that it was defined in.
In short: Your are expecting dynamic scoping but are a victim of R's lexical scoping:
dynamic scoping = the enclosing environment of a command is determined during run-time
lexical scoping = the enclosing environment of a command is determined at "compile time"
To understand the lookup path of your variable x in the current and parent environments try this code.
It shows that both functions do not share the environment in with x is defined in f2 so it can't never be found:
# list all parent environments of an environment to show the "search path"
parents <- function(env) {
while (TRUE) {
name <- environmentName(env)
txt <- if (nzchar(name)) name else format(env)
cat(txt, "\n")
if (txt == "R_EmptyEnv") break
env <- parent.env(env)
}
}
x <- c(1,2,3)
f1 <- function() {
print("f1:")
parents(environment())
x^2
}
f1() # works
# [1] "f1:"
# <environment: 0x4ebb8b8>
# R_GlobalEnv
# ...
rm(x)
f2 <- function() {
print("f2:")
parents(environment())
x <- c(1,2,3)
f1()
}
f2() # does not find "x"
# [1] "f2:"
# <environment: 0x47b2d18>
# R_GlobalEnv
# ...
# [1] "f1:"
# <environment: 0x4765828>
# R_GlobalEnv
# ...
Possible solutions:
Declare x in the global environment (bad programming style due to lack of encapsulation)
Use function parameters (this is what functions are made for)
Use a closure if x has always the same value for each call of f1 (not for beginners). See the other answer from #JohnColeman...
I strongly propose using 2. (add x as parameter - why do you want to avoid this?).

Why must local({...}) be defined using two rounds of expression quoting?

I'm trying to understand how R's local function is working. With it, you can open a temporary local scope, which means what happens in local (most notably, variable definitions), stays in local. Only the last value of the block is returned to the outside world. So:
x <- local({
a <- 2
a * 2
})
x
## [1] 4
a
## Error: object 'a' not found
local is defined like this:
local <- function(expr, envir = new.env()){
eval.parent(substitute(eval(quote(expr), envir)))
}
As I understand it, two rounds of expression quoting and subsequent evaluation happen:
eval(quote([whatever expr input]), [whatever envir input]) is generated as an unevaluated call by substitute.
The call is evaluated in local's caller frame (which is in our case, the Global Environment), so
[whatever expr input] is evaluated in [whatever envir input]
However, I do not understand why step 2 is nessecary. Why can't I simply define local like this:
local2 <- function(expr, envir = new.env()){
eval(quote(expr), envir)
}
I would think it evaluates the expression expr in an empty environment? So any variable defined in expr should exist in envir and therefore vanish after the end of local2?
However, if I try this, I get:
x <- local2({
a <- 2
a * 2
})
x
## [1] 4
a
## [1] 2
So a leaks to the Global Environment. Why is this?
EDIT: Even more mysterious: Why does it not happen for:
eval(quote({a <- 2; a*2}), new.env())
## [1] 4
a
## Error: object 'a' not found
Parameters to R functions are passed as promises. They are not evaluated unless the value is specifically requested. So look at
# clean up first
if exists("a") rm(a)
f <- function(x) print(1)
f(a<-1)
# [1] 1
a
# Error: object 'a' not found
g <- function(x) print(x)
g(a<-1)
# [1] 1
a
# [1] 1
Note that in the g() function, we are using the value passed to the function which is that assignment to a so that creates a in the global environment. With f(), that variable is never created because that function parameter remained a promise end was never evaluated.
If you want to access a parameter without evaluating it, you need to use something like match.call() or subsititute(). The local() function does the latter.
If you remove the eval.parent(), you'll see that the substitute puts the un-evaluated expression from the parameter into a new call to eval().
h <- function(expr, envir = new.env()){
substitute(eval(quote(expr), envir))
}
h(a<-1)
# eval(quote(a <- 1), new.env())
Where as if you do
j<- function(x) {
quote(x)
}
j(a<-1)
# x
you are not really creating a new function call. Further more when you eval() that expression, you are triggering the evaluation of x from it's original calling environment (triggering the evaluation of the promise), not evaluating the expression in a new environment.
local() then uses the eval.parent() so that you can use existing variables in the environment within your block. For example
b<-5
local({
a <- b
a * 2
})
# [1] 10
Look at the behaviors here
local2 <- function(expr, envir = new.env()){
eval(quote(expr), envir)
}
local2({a<-5; a})
# [1] 5
local2({a<-5; a}, list(a=100, expr="hello"))
# [1] "hello"
See how when we use a non-empty environment, the eval() is looking up expr in the environment, it's not evaluating your code block in the environment.

Resources