How does an environment remember that it exists? - r

Take
adder <- local({
x <- 0
function() {x <<- x+1; x}
})
or equivalently
adderGen <- function(){
x <- 0
function() {x <<- x+1; x}
}
adder<-adderGen()
Calling adder() will return 1 calling it again returns 2, and so on. But how does adder keep count of this? I can't see any variables hitting the global environment, so what is actually being used to store these? Particularly in the second case, you'd expect adder to forget that it was made inside of a function call.

Every function retains the environment in which it was defined as part of the function. If f is a function then environment(f) shows it. Normally the execution environment within adderGen would be discarded when it exited but because adderGen passes a function out whose environment is the execution environment within adderGen that environment is retained as part of the function that is passed out. We can verify that by displaying the execution environment within adderGen and then verify that it is the same as the environment of adder. The trace function will insert the print statement at the beginning of the body of adderGen and will show the execution environment each time adderGen runs. environment(adder) is the same environment.
trace(adderGen, quote(print(environment())))
## [1] "adderGen"
adder <- adderGen()
## Tracing adderGen() on entry
## <environment: 0x0000000014e77780>
environment(adder)
## <environment: 0x0000000014e77780>

To see what is happening, let us define the function as follows:
adderGen <- function(){
print("Initialize")
x <- 0
function() {x <<- x+1; x}
}
When we evaluate it, we obtain:
adder <- adderGen()
# [1] "Initialize"
The object that has been assigned to adder is the inside function of adderGen (which is the output of adderGen). Note that, adder does not print "Initialize" any more.
adderGen
# function(){
# print("Initialize")
# x <- 0
# a <- function() {x <<- x+1; x}
# }
adder
# function() {x <<- x+1; x}
# <environment: 0x55cd4ebd3390>
We can see that it also creates a new calling environment, which inherits the variable x in the environment of adderGen.
ls(environment(adder))
# [1] "x"
get("x",environment(adder))
# [1] 0
The first time adder is executed, it uses the inherited value of x, i.e. 0, to redefine x as a global variable (in its calling environment). And this global variable is the one that it is used in the next executions. Since x <-0 is not part of the function adder, when adder is executed, the variable x is not initialized to 0 and it increments by one the current value of x.
adder()
# [1] 1

Related

return function inside with() block

How does return() in a with() block work?
Here is a test function
test_func <- function(df, y) {
with(df,
if(x > 10){
message('Inside step 1')
return(1)
}
)
message("After step 1")
if(y > 10){
message('In side step 2')
return(2)
}
message("After step 2")
}
The function keeps going after return(1).
df <- data.frame(x = 11)
y <- 11
test_func(df, y) ## result is 2
Output
Inside step 1
After step 1
In side step 2
[1] 2
return(1) doesn't return to the test_func() rather than get out the with() block
df <- data.frame(x = 11)
y <- 5
test_func(df, y) ## no result
Output
Inside step 1
After step 1
After step 2
I think the main point here is that return() is designed to exit the current scope to the parent scope with a particular value. In the case of running
return("hello")
# Error: no function to return from, jumping to top level
You get an error because we are calling this from the global environment and there is no parent scope you are jumping back to. Note that thanks to R's lazy evaluation, parameters passed to a function are usually evaluated in the environment where they were passed from. So in this example
f <- function(x) x
f(return("hello"))
# Error in f(return("hello")) :
# no function to return from, jumping to top level
So because we are actually passing in the return() call to the function from the global environment, that's where the return will be evaluated and will return the same error. But note that this is not what with does, instead it captures the commands you pass and re-runs them in a new environment. It's closer to something like this
f <- function(x) eval(substitute(x))
f(return("hello"))
# [1] "hello"
This eval() creates a new level of scope we can escape from, and because we aren't evaluating the parameter passed to the function directly, we are just running those commands in a different environment, the return is no longer evaluated in the global environment so there's no error. So the function we created is basically the same as calling
with(NULL, return("hello"))
# [1] "hello"
which is different from something like
print(return("hello"))
# no function to return from, jumping to top level
where the parameter is directly evaluated. So the different meanings of return() really are side effects of non-standard evaluation in this case.
When return() is used inside a with(), you are not returning from the function that you called with() from, but you are returning from the scope that with() created for you to run your command.
How to fix this particular problem is already addressed by the comments left by #IceCreamToucan. You just need to move the return outside of the with().

Why is this simple function not working?

I first defined new variable x, then created function that require x within its body (not as argument). See code below
x <- c(1,2,3)
f1 <- function() {
x^2
}
rm(x)
f2 <- function() {
x <- c(1,2,3)
f1()
}
f(2)
Error in f1() : object 'x' not found
When I removed x, and defined new function f2 that first define x and then execute f1, it shows objects x not found.
I just wanted to know why this is not working and how I can overcome this problem. I do not want x to be name as argument in f1.
Please provide appropriate title because I do not know what kind of problem is this.
You could use a closure to make an f1 with the desired properties:
makeF <- function(){
x <- c(1,2,3)
f1 <- function() {
x^2
}
f1
}
f1 <- makeF()
f1() #returns 1 4 9
There is no x in the global scope but f1 still knows about the x in the environment that it was defined in.
In short: Your are expecting dynamic scoping but are a victim of R's lexical scoping:
dynamic scoping = the enclosing environment of a command is determined during run-time
lexical scoping = the enclosing environment of a command is determined at "compile time"
To understand the lookup path of your variable x in the current and parent environments try this code.
It shows that both functions do not share the environment in with x is defined in f2 so it can't never be found:
# list all parent environments of an environment to show the "search path"
parents <- function(env) {
while (TRUE) {
name <- environmentName(env)
txt <- if (nzchar(name)) name else format(env)
cat(txt, "\n")
if (txt == "R_EmptyEnv") break
env <- parent.env(env)
}
}
x <- c(1,2,3)
f1 <- function() {
print("f1:")
parents(environment())
x^2
}
f1() # works
# [1] "f1:"
# <environment: 0x4ebb8b8>
# R_GlobalEnv
# ...
rm(x)
f2 <- function() {
print("f2:")
parents(environment())
x <- c(1,2,3)
f1()
}
f2() # does not find "x"
# [1] "f2:"
# <environment: 0x47b2d18>
# R_GlobalEnv
# ...
# [1] "f1:"
# <environment: 0x4765828>
# R_GlobalEnv
# ...
Possible solutions:
Declare x in the global environment (bad programming style due to lack of encapsulation)
Use function parameters (this is what functions are made for)
Use a closure if x has always the same value for each call of f1 (not for beginners). See the other answer from #JohnColeman...
I strongly propose using 2. (add x as parameter - why do you want to avoid this?).

Why must local({...}) be defined using two rounds of expression quoting?

I'm trying to understand how R's local function is working. With it, you can open a temporary local scope, which means what happens in local (most notably, variable definitions), stays in local. Only the last value of the block is returned to the outside world. So:
x <- local({
a <- 2
a * 2
})
x
## [1] 4
a
## Error: object 'a' not found
local is defined like this:
local <- function(expr, envir = new.env()){
eval.parent(substitute(eval(quote(expr), envir)))
}
As I understand it, two rounds of expression quoting and subsequent evaluation happen:
eval(quote([whatever expr input]), [whatever envir input]) is generated as an unevaluated call by substitute.
The call is evaluated in local's caller frame (which is in our case, the Global Environment), so
[whatever expr input] is evaluated in [whatever envir input]
However, I do not understand why step 2 is nessecary. Why can't I simply define local like this:
local2 <- function(expr, envir = new.env()){
eval(quote(expr), envir)
}
I would think it evaluates the expression expr in an empty environment? So any variable defined in expr should exist in envir and therefore vanish after the end of local2?
However, if I try this, I get:
x <- local2({
a <- 2
a * 2
})
x
## [1] 4
a
## [1] 2
So a leaks to the Global Environment. Why is this?
EDIT: Even more mysterious: Why does it not happen for:
eval(quote({a <- 2; a*2}), new.env())
## [1] 4
a
## Error: object 'a' not found
Parameters to R functions are passed as promises. They are not evaluated unless the value is specifically requested. So look at
# clean up first
if exists("a") rm(a)
f <- function(x) print(1)
f(a<-1)
# [1] 1
a
# Error: object 'a' not found
g <- function(x) print(x)
g(a<-1)
# [1] 1
a
# [1] 1
Note that in the g() function, we are using the value passed to the function which is that assignment to a so that creates a in the global environment. With f(), that variable is never created because that function parameter remained a promise end was never evaluated.
If you want to access a parameter without evaluating it, you need to use something like match.call() or subsititute(). The local() function does the latter.
If you remove the eval.parent(), you'll see that the substitute puts the un-evaluated expression from the parameter into a new call to eval().
h <- function(expr, envir = new.env()){
substitute(eval(quote(expr), envir))
}
h(a<-1)
# eval(quote(a <- 1), new.env())
Where as if you do
j<- function(x) {
quote(x)
}
j(a<-1)
# x
you are not really creating a new function call. Further more when you eval() that expression, you are triggering the evaluation of x from it's original calling environment (triggering the evaluation of the promise), not evaluating the expression in a new environment.
local() then uses the eval.parent() so that you can use existing variables in the environment within your block. For example
b<-5
local({
a <- b
a * 2
})
# [1] 10
Look at the behaviors here
local2 <- function(expr, envir = new.env()){
eval(quote(expr), envir)
}
local2({a<-5; a})
# [1] 5
local2({a<-5; a}, list(a=100, expr="hello"))
# [1] "hello"
See how when we use a non-empty environment, the eval() is looking up expr in the environment, it's not evaluating your code block in the environment.

Finding the origin environment of the ... (dots) arguments of a call

I want to be able to find the environment from which the ... (dots) arguments of a call originate.
Scenario
For example, consider a function
foo <- function(x, ...) {
# do something
}
We want a function env_dots(), which we invoke from within foo(), that finds the originating environment of the ... in a call to foo(), even when the call to foo() is deeply nested. That is, if we define
foo <- function(x, ...) {
# find the originating environment of '...'
env <- env_dots()
# do something
}
and nest a call to foo, like so,
baz <- function(...) {
a <- "You found the dots"
bar(1, 2)
}
bar <- function(...)
foo(...)
then calling baz() should return the environment in which the ... in the (nested) call to foo(...) originates: this is the environment where the call bar(1, 2) is made, since the 2 (but not the 1) gets passed to the dots of foo. In particular, we should get
baz()$a
#> [1] "You found the dots"
Naive implementation of env_dots()
Update — env_dots(), as defined here, will not work in general, because the final ... may be populated by arguments that are called at multiple levels of the call stack.
Here's one possibility for env_dots():
# mc: match.call() of function from which env_dots() is called
env_dots <- function(mc) {
# Return NULL if initial call invokes no dots
if (!rlang::has_name(mc, "...")) return(NULL)
# Otherwise, climb the call stack until the dots origin is found
stack <- rlang::call_stack()[-1]
l <- length(stack)
i <- 1
while (i <= l && has_dots(stack[[i]]$expr)) i <- i + 1
# return NULL if no dots invoked
if (i <= l) stack[[i + 1]]$env else NULL
}
# Does a call have dots?
has_dots <- function(x) {
if (is.null(x))
return(FALSE)
args <- rlang::lang_tail(x)
any(vapply(args, identical, logical(1), y = quote(...)))
}
This seems to work: with
foo <- function(x, ...)
env_dots(match.call(expand.dots = FALSE))
we get
baz()$a
#> [1] "You found the dots"
bar(1, 2) # 2 gets passed down to the dots of foo()
#> <environment: R_GlobalEnv>
bar(1) # foo() captures no dots
#> NULL
Questions
The above implementation of env_dots() is not very efficient.
Is there are more skillful way to implement env_dots() in rlang and/or base R?
How can I move the match.call() invocation to within env_dots()?
match.call(sys.function(-1), call = sys.call(-1), expand.dots = FALSE) will indeed work.
Remark — One can't infer the origin environment of the dots from rlang::quos(...), because some quosures won't be endowed with the calling environment (e.g., when an expression is a literal object).
I'm sorry to dig up an old question, but I'm not sure the desired behavior is well-defined. ... is not a single expression; it's a list of expressions. In case of rlang quosures, each of those expressions has their own environment. So what should the environment of the list be?
Furthermore, the ... list itself can be modified. Consider the following example, where g takes its ..., prepends it with an (unevaluated) expression x+3 and passes it onto f.
f <- function(...) {rlang::enquos( ... )}
g <- function(...) {
a <- rlang::quo( x + 3 )
l <- rlang::list2( a, ... )
f(!!!l)
}
b <- rlang::quo( 5 * y )
g( b, 10 )
# [[1]]
# <quosure>
# expr: ^x + 3
# env: 0x7ffd1eca16f0
# [[2]]
# <quosure>
# expr: ^5 * y
# env: global
# [[3]]
# <quosure>
# expr: ^10
# env: empty
Notice that each of the three quosures that make it over to f has their own environment. (As you noted in your question, literals like 10 have an empty environment. This is because the value is the same independent of which environment it's evaluated in.)
Given this scenario, what should the hypothetical env_dots() return when called inside f()?

Difference between <- and <<- [duplicate]

This question already has answers here:
How do you use "<<-" (scoping assignment) in R?
(7 answers)
Closed 7 years ago.
CASE 1:
rm(list = ls())
foo <- function(x = 6){
set <- function(){
x <- x*x}
set()
x}
foo()
# [1] 6
CASE 2:
rm(list = ls())
foo <- function(x = 6){
set <- function(){
x <<- x*x}
set()
x}
foo()
# [1] 36
I read that <<- operator can be used to assign a value to an object in an environment that is different from the current environment. It says that object initialization using <<- can be done to the objects that is not in the current environment. I want to ask which environment's object can be initialized using <<- . In my case the environment is environment of foo function, can <<-initialize the objects outside the function or the object in the current environment? Totally confused when to use <- and when to use <<-.
The operator <<- is the parent scope assignment operator. It is used to make assignments to variables in the nearest parent scope to the scope in which it is evaluated. These assignments therefore "stick" in the scope outside of function calls. Consider the following code:
fun1 <- function() {
x <- 10
print(x)
}
> x <- 5 # x is defined in the outer (global) scope
> fun1()
[1] 10 # x was assigned to 10 in fun1()
> x
[1] 5 # but the global value of x is unchanged
In the function fun1(), a local variable x is assigned to the value 10, but in the global scope the value of x is not changed. Now consider rewriting the function to use the parent scope assignment operator:
fun2 <- function() {
x <<- 10
print(x)
}
> x <- 5
> fun2()
[1] 10 # x was assigned to 10 in fun2()
> x
[1] 10 # the global value of x changed to 10
Because the function fun2() uses the <<- operator, the assignment of x "sticks" after the function has finished evaluating. What R actually does is to go through all scopes outside fun2() and look for the first scope containing a variable called x. In this case, the only scope outside of fun2() is the global scope, so it makes the assignment there.
As a few have already commented, the <<- operator is frowned upon by many because it can break the encapsulation of your R scripts. If we view an R function as an isolated piece of functionality, then it should not be allowed to interfere with the state of the code which calls it. Abusing the <<- assignment operator runs the risk of doing just this.
The <<- operator can be used to assign a variable to the global environment. It's better to use the assign function than <<-. You probably shouldn't need to use <<- though - outputs needed from functions should be returned as objects in R.
Here's an example
f <- function(x) {
y <<- x * 2 # y outside the function
}
f(5) # y = 10
This is equivalent to
f <- function(x) {
x * 2
}
y <- f(5) # y = 10
With the assign function,
f <- function(x) {
assign('y', x*2 envir=.GlobalEnv)
}
f(5) # y = 10

Resources