R specify function environment - r

I have a question about function environments in the R language.
I know that every time a function is called in R, a new environment E
is created in which the function body is executed. The parent link of
E points to the environment in which the function was created.
My question: Is it possible to specify the environment E somehow, i.e., can one
provide a certain environment in which function execution should happen?

A function has an environment that can be changed from outside the function, but not inside the function itself. The environment is a property of the function and can be retrieved/set with environment(). A function has at most one environment, but you can make copies of that function with different environments.
Let's set up some environments with values for x.
x <- 0
a <- new.env(); a$x <- 5
b <- new.env(); b$x <- 10
and a function foo that uses x from the environment
foo <- function(a) {
a + x
}
foo(1)
# [1] 1
Now we can write a helper function that we can use to call a function with any environment.
with_env <- function(f, e=parent.frame()) {
stopifnot(is.function(f))
environment(f) <- e
f
}
This returns a copy of the function with a different environment assigned (the calling environment if none is specified); the original foo is left unchanged. We can call the copy by just passing parameters. Observe:
with_env(foo, a)(1)
# [1] 6
with_env(foo, b)(1)
# [1] 11
foo(1)
# [1] 1
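To tie this back to the question: environment() sets the enclosing environment, i.e. the parent of E, while E itself is still created fresh on every call. A minimal sketch of that distinction; the names probe and env_a are made up for this illustration:

```r
env_a <- new.env()
env_a$x <- 5

# probe() simply returns its own execution environment E
probe <- function() environment()
environment(probe) <- env_a

e1 <- probe()
e2 <- probe()

identical(e1, e2)                 # FALSE: a new E is created for each call
identical(parent.env(e1), env_a)  # TRUE: the parent link of E is the environment we set
get("x", envir = e1)              # 5, found through the parent link
```

So what can be chosen from outside is where E looks for free variables, not E itself.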

Here's another approach to the problem, taken directly from http://adv-r.had.co.nz/Functional-programming.html
Consider the code
new_counter <- function() {
i <- 0
function() {
i <<- i + 1
i
}
}
(Updated to improve accuracy)
Calling new_counter() creates an execution environment containing i and returns the inner function, which keeps that environment as its enclosing environment. Calling the returned function runs the inner function, which updates i in that environment. (I don't want to directly copy Wickham's entire section on this, but I strongly recommend that anyone interested read the section entitled "Mutable state".) I suspect you could get fancier than this. For example, here's a modification with a reset option:
new_counter <- function() {
i <- 0
function(reset = FALSE) {
if(reset) i <<- 0
i <<- i + 1
i
}
}
counter_one <- new_counter()
counter_one()
counter_one()
counter_two <- new_counter()
counter_two()
counter_two()
counter_one(reset = TRUE)
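One way to see that each counter carries its own state is to look inside the enclosing environment of each closure. This is only an inspection sketch using the simpler counter without the reset option:

```r
new_counter <- function() {
  i <- 0
  function() {
    i <<- i + 1
    i
  }
}

counter_one <- new_counter()
counter_two <- new_counter()

counter_one()
counter_one()
counter_two()

# Each closure has its own enclosing environment with its own `i`
environment(counter_one)$i  # 2
environment(counter_two)$i  # 1
```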

I am not sure I completely follow the goal of the question. But one can set the environment that a function executes in, modify the objects in that environment, and then reference them from the global environment. Here is an illustrative example, but again I do not know if this answers the questioner's question:
e <- new.env()
e$a <- TRUE
testFun <- function(){
print(a)
}
testFun()
Results in: Error in print(a) : object 'a' not found
testFun2 <- function(){
e$a <- !(a)
print(a)
}
environment(testFun2) <- e
testFun2()
Returns: FALSE
e$a
Returns: FALSE

Related

assigning delayed variables in R

I've just read about delayedAssign(), but the way you have to do it is by passing the name of the delayed variable as the first parameter. Is there a way to do it via direct assignment?
e.g.:
x <- delayed_variable("Hello World")
rather than
delayedAssign("x","Hello World")
I want to create a variable that will throw an error if accessed (use-case is obviously more complex), so for example:
f <- function(x){
y <- delayed_variable(stop("don't use y"))
x
}
f(10)
> 10
f <- function(x){
y <- delayed_variable(stop("don't use y"))
y
}
f(10)
> Error in f(10) : don't use y
No, you can't do it that way. Your example would be fine with the current setup, though:
f <- function(x){
delayedAssign("y", stop("don't use y"))
y
}
f(10)
which gives exactly the error you want. The reason for this limitation is that delayed_variable(stop("don't use y")) would create a value which would trigger the error when evaluated, and assigning it to y would evaluate it.
Another version of the same thing would be
f <- function(x, y = stop("don't use y")) {
...
}
Internally it's very similar to the delayedAssign version.
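To make both lazy variants concrete, here is a small sketch. The function names f_unused, f_used and g are placeholders for this illustration; the point is that the stop() call is only evaluated if y is actually used:

```r
# Default-argument version: `y` is a promise that is only forced when used
f_unused <- function(x, y = stop("don't use y")) x
f_used   <- function(x, y = stop("don't use y")) y

ok  <- f_unused(10)  # 10; the default for y is never evaluated
msg <- tryCatch(f_used(10), error = function(e) conditionMessage(e))
msg                  # "don't use y"

# delayedAssign() version: same laziness, but for a local variable
g <- function(x) {
  delayedAssign("y", stop("don't use y"))
  x                  # y is never touched, so no error is raised
}
g(10)                # 10
```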
I reached a solution using makeActiveBinding() which works provided it is being called from within a function (so it doesn't work if called directly and will throw an error if it is). The main purpose of my use-case is a smaller part of this, but I generalised the code a bit for others to use.
Importantly for my use-case, this function can allow other functions to use delayed assignment within functions and can also pass R CMD Check with no Notes.
Here is the function and it gives the desired outputs from my question.
delayed_variable <- function(call){
#Get the current call
prev.call <- sys.call()
attribs <- attributes(prev.call)
# If srcref isn't there, then we're not coming from a function
if(is.null(attribs) || !"srcref" %in% names(attribs)){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Extract the call including the assignment operator
this_call <- parse(text=as.character(attribs$srcref))[[1]]
# Check if this is an assignment `<-` or `=`
if(!(identical(this_call[[1]],quote(`<-`)) ||
identical(this_call[[1]],quote(`=`)))){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Get the variable being assigned to as a symbol and a string
var_sym <- this_call[[2]]
var_str <- deparse(var_sym)
#Get the parent frame that we will be assigning into
p_frame <- parent.frame()
var_env <- new.env(parent = p_frame)
#Create a random string to be an identifier
var_rand <- paste0(sample(c(letters,LETTERS),50,replace=TRUE),collapse="")
#Put the variables into the environment
var_env[["p_frame"]] <- p_frame
var_env[["var_str"]] <- var_str
var_env[["var_rand"]] <- var_rand
# Create the function that will be bound to the variable.
# Since this is an Active Binding (AB), we have three situations
# i) It is run without input, and thus the AB is
# being called on its own (missing(input)),
# and thus it should evaluate and return the output of `call`
# ii) It is being run as the lhs of an assignment
# as part of the initial assignment phase, in which case
# we do nothing (i.e. input is the output of this function)
# iii) It is being run as the lhs of a regular assignment,
# in which case, we want to overwrite the AB
fun <- function(input){
if(missing(input)){
# No assignment: variable is being called on its own
# So, we activate the delayed assignment call:
res <- eval(call,p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
res
} else if(!inherits(input,"assign_delay") &&
input != var_rand){
# Attempting to assign to the variable
# and it is not the initial definition
# So we overwrite the active binding
res <- eval(substitute(input),p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
invisible(res)
}
# Else: We are assigning and the assignee is the output
# of this function, in which case, we do nothing!
}
#Fix the call in the above eval to be the exact call
# rather than a variable (useful for debugging)
# This is in the line res <- eval(call,p_frame)
body(fun)[[c(2,3,2,3,2)]] <- substitute(call)
#Put the function inside the environment with
# all of the variables above
environment(fun) <- var_env
# Check if the variable already exists in the calling
# environment and if so, remove it
if(exists(var_str,envir=p_frame)){
rm(list=var_str,envir=p_frame)
}
# Create the AB
makeActiveBinding(var_sym,fun,p_frame)
# Return a specific object to check for
structure(var_rand,class="assign_delay")
}
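For readers who only need the "error on access" behaviour and not the assignment syntax, a much smaller active-binding sketch may be enough. This is not the function above, just an illustration of the underlying mechanism; touch_y is a made-up switch for the demonstration:

```r
f <- function(x, touch_y = FALSE) {
  # Reading `y` calls this zero-argument function, which raises the error
  makeActiveBinding("y", function() stop("don't use y"), environment())
  if (touch_y) y else x
}

res <- f(10)  # 10; y is never read
err <- tryCatch(f(10, touch_y = TRUE), error = function(e) conditionMessage(e))
err           # "don't use y"
```

Unlike the full version, this binding function takes no argument, so assigning a new value to y inside f would fail rather than replace the binding.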

Define function in environment that changes an object in the environment

I would like to write a function that returns an environment containing a function which assigns the value of an object inside the environment. For example, what I want to do is:
makeenv <- function() {
e <- new.env(parent = .GlobalEnv)
e$x <- 0
e$setx <- function(k) { e$x <- k } # NOT OK
e
}
I would like to fix the e$setx function above. The behavior of the above is weird to me:
e1 <- makeenv()
e1$x
## [1] 0
e1$setx
## function(k) e$x <- k
## <environment: 0x7f96144d8240>
e1$setx(3) # Strangely, this works.
e1$x
## [1] 3
# --------- clone ------------
e2 <- new.env(parent = .GlobalEnv)
e2$x <- e1$x
e2$setx <- e1$setx
e2$x
## [1] 3
# ----- e2$setx() changes e1$x -----
e2$setx(7) # HERE
e2$x # e2$x is not changed.
## [1] 3
e1$x # e1$x is changed instead.
## [1] 7
Could someone please help me understand what is going on here? I especially don't understand why e2$setx(7) sets e1$x to 7 rather than issuing an error. I think I am doing something very wrong here.
I would also like to write a correct function e$setx inside the makeenv function that correctly assigns a value to the x object in the environment e. Would it be possible to have one without using S4 or R6 classes? I know that a function like setx <- function(e,k) { e$x <- k } works, but to me e1$setx(5) looks more intuitive than setx(e1,5) and I would like to investigate this possibility first. Is it possible to have something like e$setx <- function(k) { self$x <- k }, say, where self refers to the e preceding the $?
The page "The equivalent of 'this' or 'self' in R" looks relevant, but I would like to have the effect without using S4 or R6. Or am I trying to do something impossible? Thank you.
You can use local to evaluate the function definition in the environment:
local(setx <- function (k) e$x <- k, envir = e)
Alternatively, you can also change the function’s environment after the fact:
e$setx <- function(k) e$x <- k
environment(e$setx) <- e
… but neither is strictly necessary in your case, since you probably don’t need to create a brand new environment. Instead, you can reuse the current calling environment. Doing this is a very common pattern:
makeenv <- function() {
e <- environment()
x <- 0
setx <- function(k) e$x <- k
e
}
Instead of e$x <- k you could also write x <<- k; that way, you don’t need the e variable at all:
makeenv <- function() {
x <- 0
setx <- function(k) x <<- k
environment()
}
… however, I actually recommend against this: <<- is error-prone because it looks for assignment targets in all parent environments; and if it can’t find any, it creates a new variable in the global environment. It’s better to explicitly specify the assignment target.
However, note that none of the above changes the observed semantics of your code: when you copy the function into a new environment, it retains its old environment! If you want to “move over” the function, you explicitly need to reassign its environment:
e2$setx <- e1$setx
environment(e2$setx) <- e2
… of course writing that entire code manually is pretty error-prone. If you want to create value semantics (“deep copy” semantics) for an environment in R, you should wrap this functionality into a function:
copyenv <- function (e) {
new_env <- list2env(as.list(e, all.names = TRUE), parent = parent.env(e), hash = TRUE)
new_env$e <- new_env
environment(new_env$setx) <- new_env
new_env
}
e2 <- copyenv(e1)
Note that the copyenv function is not trying to be general; it needs to be adapted for other environment structures. There is no good, general way of writing a deep-copy function for environments that handles all cases, since a general function can’t know how to handle self-references (i.e. the e in the above): in your case, you want to preserve self-references (i.e. change them to point to the new environment). But in other cases, the reference might need to point to something else.
This is a general problem of deep copying. That’s why the R serialize function, for instance, has a refhook parameter that tells the function how to serialise environment references.
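The copy semantics discussed in this answer can be checked end to end. The sketch below reuses the makeenv variant that stores e <- environment() together with the copyenv function from above, so it makes the same assumptions (an e self-reference and a single function called setx):

```r
makeenv <- function() {
  e <- environment()
  x <- 0
  setx <- function(k) e$x <- k
  e
}

copyenv <- function(e) {
  new_env <- list2env(as.list(e, all.names = TRUE), parent = parent.env(e))
  new_env$e <- new_env                  # re-point the self-reference
  environment(new_env$setx) <- new_env  # re-home the function
  new_env
}

e1 <- makeenv()
e2 <- copyenv(e1)
e2$setx(7)

e1$x  # 0: the original is untouched
e2$x  # 7: only the copy changed
```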

Manipulating enclosing environment of a function

I'm trying to get a better understanding of closures, in particular details on a function's scope and how to work with its enclosing environment(s).
Based on the Description section of the help page on rlang::fn_env(), my understanding was that a function always has access to all variables in its scope and that its enclosing environment belongs to that scope.
But then, why isn't it possible to manipulate the contents of the closure environment "after the fact", i.e. after the function has been created?
By means of R's lexical scoping, shouldn't bar() be able to find x when I put it into its enclosing environment?
foo <- function(fun) {
env_closure <- rlang::fn_env(fun)
env_closure$x <- 5
fun()
}
bar <- function(x) x
foo(bar)
#> Error in fun(): argument "x" is missing, with no default
Ah, I think I got it down now.
It has to do with the structure of a function's formal arguments:
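If an argument is defined without a default value, R will complain when you call the function without specifying it, even though it might technically be able to look it up in its scope. The formal argument x is bound in the function's execution environment, so it shadows any x in the enclosing environment.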
One way to kick off lexical scoping even though you don't want to define a default value would be to set the defaults "on the fly" at run time via rlang::fn_fmls().
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fun()
}
# No argument at all -> lexical scoping takes over
baz <- function() x
foo(baz)
#> [1] 5
# Set defaults to desired values on the fly at run time of `foo()`
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fmls <- rlang::fn_fmls(fun)
fmls$x <- substitute(get("x", envir = env_enclosing, inherits = FALSE))
rlang::fn_fmls(fun) <- fmls
fun()
}
bar <- function(x) x
foo(bar)
#> [1] 5
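The same shadowing effect can be reproduced without rlang. A base-R sketch, using environment<- in place of rlang::fn_env() (the names env and bar_result are only for this illustration):

```r
env <- new.env()
env$x <- 5

# No formal named x: the free variable x is looked up in the enclosing environment
baz <- function() x
environment(baz) <- env
baz()  # 5

# A formal named x lives in the execution environment and shadows env$x
bar <- function(x) x
environment(bar) <- env
bar_result <- tryCatch(bar(), error = function(e) "missing argument")
bar_result  # "missing argument"

bar(1)  # 1: the supplied argument is used, not env$x
```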
I can't really follow your example as I am unfamiliar with the rlang library but I think a good example of a closure in R would be:
bucket <- function() {
n <- 1
foo <- function(x) {
assign("n", n+1, envir = parent.env(environment()))
n
}
foo
}
bar <- bucket()
Because bar() is defined inside bucket(), its enclosing (parent) environment is the execution environment of that bucket() call, so you can carry some data there. Each time you run it, you modify that environment:
bar()
[1] 2
bar()
[1] 3
bar()
[1] 4

Allowing R functions to directly alter the parent environment

I'm trying to figure out how to allow a function to directly alter or create variables in its parent environment, whether the parent environment is the global environment or another function.
For example if I have a function
my_fun <- function(){
a <- 1
}
I would like a call to my_fun() to produce the same results as doing a <- 1.
I know that one way to do this is by using parent.frame as per below but I would prefer a method that doesn't involve rewriting every variable assignment.
my_fun <- function(){
env = parent.frame()
env$a <- 1
}
Try with:
g <- function(env = parent.frame()) with(env, { b <- 1 })
g()
b
## [1] 1
Note that normally it is preferable to pass the variables as return values rather than directly create them in the parent frame. If you have many variables to return you can always return them in a list, e.g. h <- function() list(a = 1, b = 2); result <- h() Now result$a and result$b have the values of a and b.
Also see Function returning more than one value.
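A short sketch of the return-a-list approach, plus one way for the caller to unpack the list into its own frame if separate variables are really wanted. The function caller is hypothetical; list2env() is base R:

```r
h <- function() list(a = 1, b = 2)

# Preferred: keep the values together and access them by name
result <- h()
result$a + result$b  # 3

# If separate variables are needed, the caller can unpack the list itself
caller <- function() {
  list2env(h(), envir = environment())
  a + b
}
total <- caller()
total  # 3
```

Here the decision to create variables stays with the caller, rather than being a side effect hidden inside h().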

Convenience function for exporting objects to the global environment

UPDATE: I have added a variant of Roland's implementation to the kimisc package.
Is there a convenience function for exporting objects to the global environment, which can be called from a function to make objects available globally?
I'm looking for something like
export(obj.a, obj.b)
which would behave like
assign("obj.a", obj.a, .GlobalEnv)
assign("obj.b", obj.b, .GlobalEnv)
Rationale
I am aware of <<- and assign. I need this to refactor oldish code which is simply a concatenation of scripts:
source("script1.R")
source("script2.R")
source("script3.R")
script2.R uses results from script1.R, and script3.R potentially uses results from both 1 and 2. This creates a heavily polluted namespace, and I wanted to change each script
pollute <- the(namespace)
useful <- result
to
(function() {
pollute <- the(namespace)
useful <- result
export(useful)
})()
as a first cheap countermeasure.
Simply write a wrapper:
myexport <- function(...) {
arg.list <- list(...)
names <- all.names(match.call())[-1]
for (i in seq_along(names)) assign(names[i],arg.list[[i]],.GlobalEnv)
}
fun <- function(a) {
ttt <- a+1
ttt2 <- a+2
myexport(ttt,ttt2)
return(a)
}
print(ttt)
#object not found error
fun(2)
#[1] 2
print(ttt)
#[1] 3
print(ttt2)
#[1] 4
Not tested thoroughly and not sure how "safe" that is.
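A somewhat more defensive variant might look like the sketch below. It is not the kimisc implementation; export_to and its envir parameter are hypothetical names. It only accepts plain variable names and lets the caller choose the target environment, which also makes it testable without touching the global environment:

```r
export_to <- function(..., envir = globalenv()) {
  # Capture the unevaluated arguments, e.g. the symbols ttt and ttt2
  exprs <- as.list(substitute(list(...)))[-1]
  stopifnot(all(vapply(exprs, is.name, logical(1))))  # plain names only
  nms <- vapply(exprs, as.character, character(1))
  vals <- list(...)
  for (i in seq_along(nms)) assign(nms[[i]], vals[[i]], envir = envir)
  invisible(nms)
}

target <- new.env()
fun <- function(a) {
  ttt <- a + 1
  ttt2 <- a + 2
  export_to(ttt, ttt2, envir = target)
  a
}

out <- fun(2)
target$ttt   # 3
target$ttt2  # 4
```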
You can store an environment in a variable and use it within your export function. For example:
env <- .GlobalEnv ## better to create a new one here: new.env()
exportx <- function(x)
{
x <- x+1
env$y <- x
}
exportx(3)
y
[1] 4
For example, if you want to define global options in your package (emulating the classic R options):
my.options <- new.env()
setOption1 <- function(value) my.options$Option1 <- value
EDIT after OP clarification:
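You can use evalq, which takes these two arguments:
envir: the environment in which expr is to be evaluated
enclos: where R looks for objects not found in envir. Note that enclos is only used when envir is a list or data frame; when envir is an environment, lookup continues in that environment's parent, so set the parent when creating it.
Here is an example:
env.script1 <- new.env()
env.script2 <- new.env(parent = env.script1) # script 2 can see the results of script 1
evalq({
x <- 2
p <- 3
z <- 5
}, envir = env.script1)
evalq({
h <- x + 2
}, envir = env.script2)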
You can see that all variables are created within their environments (as with local):
env.script2$h
[1] 4
env.script1$p
[1] 3
env.script1$x
[1] 2
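Since the original use case is a chain of script files, the same idea can be applied to the files themselves: source() accepts an environment for its local argument. A sketch under the assumption that the scripts can be run unchanged; the temporary files below merely stand in for script1.R and script2.R:

```r
# Stand-ins for script1.R and script2.R, written to temporary files
script1 <- tempfile(fileext = ".R")
script2 <- tempfile(fileext = ".R")
writeLines(c("x <- 2", "p <- 3"), script1)
writeLines("h <- x + 2", script2)

env.script1 <- new.env()
env.script2 <- new.env(parent = env.script1)  # script 2 may read script 1's results

source(script1, local = env.script1)
source(script2, local = env.script2)

env.script2$h    # 4
ls(env.script1)  # "p" "x"
```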
First, given your use case, I don't see how an export function is any better than using good (?) old-fashioned <<-. You could just do
(function() {
pollute <- the(namespace)
useful <<- result
})()
which will give the same result as what's in your example.
Second, rather than anonymous functions, it seems better form to use local, which allows you to run involved computations without littering your workspace with various temporary objects.
local({
pollute <- the(namespace)
useful <<- result
})
ETA: If it's important for whatever reason to avoid modifying an existing variable called useful, put an exists check in there. The same applies to the other solutions presented.
local({
.....
useful <- result
if(!exists("useful", globalenv())) useful <<- useful
})
