Allowing R functions to directly alter the parent environment - r

I'm trying to figure out how to allow a function to directly alter or create variables in its parent environment, whether the parent environment is the global environment or another function.
For example if I have a function
my_fun <- function(){
a <- 1
}
I would like a call to my_fun() to produce the same results as doing a <- 1.
I know that one way to do this is by using parent.frame as per below but I would prefer a method that doesn't involve rewriting every variable assignment.
my_fun <- function(){
env = parent.frame()
env$a <- 1
}

Try with:
g <- function(env = parent.frame()) with(env, { b <- 1 })
g()
b
## [1] 1
Note that normally it is preferable to pass the variables as return values rather than directly create them in the parent frame. If you have many variables to return you can always return them in a list, e.g. h <- function() list(a = 1, b = 2); result <- h() Now result$a and result$b have the values of a and b.
Also see Function returning more than one value.

Related

assigning delayed variables in R

I've just read about delayedAssign(), but the way you have to do it is by passing the name of the delayed variable as the first parameter. Is there a way to do it via direct assignment?
e.g.:
x <- delayed_variable("Hello World")
rather than
delayedAssign("x","Hello World")
I want to create a variable that will throw an error if accessed (use-case is obviously more complex), so for example:
f <- function(x){
y <- delayed_variable(stop("don't use y"))
x
}
f(10)
> 10
f <- function(x){
y <- delayed_variable(stop("don't use y"))
y
}
f(10)
> Error in f(10) : don't use y
No, you can't do it that way. Your example would be fine with the current setup, though:
f <- function(x){
delayedAssign("y", stop("don't use y"))
y
}
f(10)
which gives exactly the error you want. The reason for this limitation is that delayed_variable(stop("don't use y")) would create a value which would trigger the error when evaluated, and assigning it to y would evaluate it.
Another version of the same thing would be
f <- function(x, y = stop("don't use y")) {
...
}
Internally it's very similar to the delayedAssign version.
I reached a solution using makeActiveBinding() which works provided it is being called from within a function (so it doesn't work if called directly and will throw an error if it is). The main purpose of my use-case is a smaller part of this, but I generalised the code a bit for others to use.
Importantly for my use-case, this function can allow other functions to use delayed assignment within functions and can also pass R CMD Check with no Notes.
Here is the function and it gives the desired outputs from my question.
delayed_variable <- function(call){
#Get the current call
prev.call <- sys.call()
attribs <- attributes(prev.call)
# If srcref isn't there, then we're not coming from a function
if(is.null(attribs) || !"srcref" %in% names(attribs)){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Extract the call including the assignment operator
this_call <- parse(text=as.character(attribs$srcref))[[1]]
# Check if this is an assignment `<-` or `=`
if(!(identical(this_call[[1]],quote(`<-`)) ||
identical(this_call[[1]],quote(`=`)))){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Get the variable being assigned to as a symbol and a string
var_sym <- this_call[[2]]
var_str <- deparse(var_sym)
#Get the parent frame that we will be assigining into
p_frame <- parent.frame()
var_env <- new.env(parent = p_frame)
#Create a random string to be an identifier
var_rand <- paste0(sample(c(letters,LETTERS),50,replace=TRUE),collapse="")
#Put the variables into the environment
var_env[["p_frame"]] <- p_frame
var_env[["var_str"]] <- var_str
var_env[["var_rand"]] <- var_rand
# Create the function that will be bound to the variable.
# Since this is an Active Binding (AB), we have three situations
# i) It is run without input, and thus the AB is
# being called on it's own (missing(input)),
# and thus it should evaluate and return the output of `call`
# ii) It is being run as the lhs of an assignment
# as part of the initial assignment phase, in which case
# we do nothing (i.e. input is the output of this function)
# iii) It is being run as the lhs of a regular assignment,
# in which case, we want to overwrite the AB
fun <- function(input){
if(missing(input)){
# No assignment: variable is being called on its own
# So, we activate the delayed assignment call:
res <- eval(call,p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
res
} else if(!inherits(input,"assign_delay") &&
input != var_rand){
# Attempting to assign to the variable
# and it is not the initial definition
# So we overwrite the active binding
res <- eval(substitute(input),p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
invisible(res)
}
# Else: We are assigning and the assignee is the output
# of this function, in which case, we do nothing!
}
#Fix the call in the above eval to be the exact call
# rather than a variable (useful for debugging)
# This is in the line res <- eval(call,p_frame)
body(fun)[[c(2,3,2,3,2)]] <- substitute(call)
#Put the function inside the environment with all
# all of the variables above
environment(fun) <- var_env
# Check if the variable already exists in the calling
# environment and if so, remove it
if(exists(var_str,envir=p_frame)){
rm(list=var_str,envir=p_frame)
}
# Create the AB
makeActiveBinding(var_sym,fun,p_frame)
# Return a specific object to check for
structure(var_rand,call="assign_delay")
}

Define function in environment that changes an object in the environment

I would like to write a function that returns an environment containing a function which assigns the value of an object inside the environment. For example, what I want to do is:
makeenv <- function() {
e <- new.env(parent = .GlobalEnv)
e$x <- 0
e$setx <- function(k) { e$x <- k } # NOT OK
e
}
I would like to fix the e$setx function above. The behavior of the above is weird to me:
e1 <- makeenv()
e1$x
## [1] 0
e1$setx
## function(k) e$x <- k
## <environment: 0x7f96144d8240>
e1$setx(3) # Strangely, this works.
e1$x
## [1] 3
# --------- clone ------------
e2 <- new.env(parent = .GlobalEnv)
e2$x <- e1$x
e2$setx <- e1$setx
e2$x
## [1] 3
# ----- e2$setx() changes e1$x -----
e2$setx(7) # HERE
e2$x # e2$x is not changed.
## [1] 3
e1$x # e1$x is changed instead.
## [1] 7
Could someone please help me understand what is going on here? I especially don't understand why e2$setx(7) sets e1$x to 7 rather than issuing an error. I think I am doing something very wrong here.
I would also like to write a correct function e$setx inside the makeenv function that correctly assigns a value to the x object in the environment e. Would it be possible to have one without using S4 or R6 classes? I know that a function like setx <- function(e,k) { e$x <- k } works, but to me e1$setx(5) looks more intuitive than setx(e1,5) and I would like to investigate this possibility first. Is it possible to have something like e$setx <- function(k) { self$x <- k }, say, where self refers to the e preceding the $?
This page The equivalent of 'this' or 'self' in R looks relevant, but I like to have the effect without using S4 or R6. Or am I trying to do something impossible? Thank you.
You can use local to evaluate the function definition in the environment:
local(etx <- function (k) e$x <- k, envir = e)
Alternatively, you can also change the function’s environment after the fact:
e$setx <- function(k) e$x <- k
environment(e$setx) <- e
… but neither is strictly necessary in your case, since you probably don’t need to create a brand new environment. Instead, you can reuse the current calling environment. Doing this is a very common pattern:
makeenv <- function() {
e <- environment()
x <- 0
setx <- function(k) e$x <- k
e
}
Instead of e$x <- k you could also write e <<- k; that way, you don’t need the e variable at all:
makeenv <- function() {
x <- 0
setx <- function(k) x <<- k
environment()
}
… however, I actually recommend against this: <<- is error-prone because it looks for assignment targets in all parent environments; and if it can’t find any, it creates a new variable in the global environment. It’s better to explicitly specify the assignment target.
However, note that none of the above changes the observed semantics of your code: when you copy the function into a new environment, it retains its old environment! If you want to “move over” the function, you explicitly need to reassign its environment:
e2$setx <- e1$setx
environment(e2$setx) <- e2
… of course writing that entire code manually is pretty error-prone. If you want to create value semantics (“deep copy” semantics) for an environment in R, you should wrap this functionality into a function:
copyenv <- function (e) {
new_env <- list2env(as.list(e, all.names = TRUE), parent = parent.env(e), hash = TRUE)
new_env$e <- new_env
environment(new_env$setx) <- new_env
new_env
}
e2 <- copyenv(e1)
Note that the copyenv function is not trying to be general; it needs to be adapted for other environment structures. There is no good, general way of writing a deep-copy function for environments that handles all cases, since a general function can’t know how to handle self-references (i.e. the e in the above): in your case, you want to preserve self-references (i.e. change them to point to the new environment). But in other cases, the reference might need to point to something else.
This is a general problem of deep copying. That’s why the R serialize function, for instance, has refHook parameter that tells the function how to serialise environment references.

R, pass-by-value inside a function

Suppose you define a function in R using the following code:
a <- 1
f <- function(x) x + a
If you latter redefine a you will change the function f. (So, f(1) = 2 as given but if you latter on redefine a =2 then f(1) = 3. Is there a way to force R to use the value of a at the time it compiles the function? (That is, f would not change with latter redefinitions of a).
The above is the shortest case I could thought of that embodies the problem I am having. More specifically, as requested, my situation is:
I am working with a bunch of objects I am calling "person". Each person is defined as a probability distribution that depends on a n dimensional vector $a$ and a n dimensional vector of constrains w (the share of wealth).
I want to create a "society" with N people, that is a list of N persons. To that end, I created two n by N matrices A and W. I now loop over 1 to N to create the individuals.
Society <- list()
### doesn't evaluate theta at the time, but does w...
for (i in 1:Npeople) {
w <- WealthDist[i,]
u <- function(x) prod(x^A[i,])
P <- list(u,w)
names(P) <- c("objective","w")
Society[[length(Society)+1]] <- P
}
w gets is pass-by-value, so each person gets the right amount of wealth. But A is pass-by-reference -- everybody is being assigned the same function u (namely, the function using i = N)
To finish it up, the next steps are to get the Society and, via two optimizations get an "equilibrium point".
You can create a function which uses a locked binding and creates a function to complete your purpose. The former value of a will be used for w which will be stored in the environment of the function and will not be replaced by further values changes of a.
a <- 1
j <- new.env() # create a new environment
create.func <- function () {
j$w <<- a
function (x) {
x+ j$w
}
}
f <- create.func()
a <- 2
f(2)
[1] 3 # if w was changed this should be 4
Credits to Andrew Taylor (see comments)
EDIT: BE CAREFUL: f will change if you call create.func, even if you do not store it into f. To avoid this, you could write this code (it clearly depends on what you want).
a <- 1
create.func <- function (x) {
j <- new.env()
j$w <- a
function (x) {
x + j$w
}
}
f <- create.func()
f(1)
[1] 2
a <- 2
q <- create.func()
q(1)
[1] 3
f(1)
[1] 2
EDIT 2: Lazy evaluation doesn't apply here because a is evaluated by being set to j$w. If you had used it as an argument say:
function(a)
function(x)
#use a here
you would have to use force before defining the second function, because then it wouldn't be evaluated.
EDIT 3: I removed the foo <- etc. The function will return as soon as it is declared, since you want it to be similar to the code factories defined in your link.
EDIT by OPJust to add to the accepted answer that in spirit of
Function Factory in R
the code below works:
funs.gen <- function(n) {
force(n)
function(x) {
x + n
}
}
funs = list()
for (i in seq(length(names))) {
n = names[i]
funs[[n]] = funs.gen(i)
}
R doesn't do pass by reference; everything is passed to functions by value. As you've noticed, since a is defined in the global environment, functions which reference a are referencing the global value of a, which is subject to change. To ensure that a specific value of a is used, you can use it as a parameter in the function.
f <- function(x, a = 1) {
x + a
}
This defines a as a parameter that defaults to 1. The value of a used by the function will then always be the value passed to the function, regardless of whether a is defined in the global environment.
If you're going to use lapply(), you simply pass a as a parameter to lapply().
lapply(X, f, a = <value>)
Define a within f
f <- function(x) {a<-1;x + a}

Anonymous passing of variables from current environment to subfunction calls

The function testfun1, defined below, does what I want it to do. (For the reasoning of all this, see the background info below the code example.) The question I wanted to ask you is why what I tried in testfun2 doesn't work. To me, both appear to be doing the exact same thing. As shown by the print in testfun2, the evaluation of the helper function inside testfun2 takes place in the correct environment, but the variables from the main function environment get magically passed to the helper function in testfun1, but not in testfun2. Does anyone of you know why?
helpfun <- function(){
x <- x^2 + y^2
}
testfun1 <- function(x,y){
xy <- x*y
environment(helpfun) <- sys.frame(sys.nframe())
x <- eval(as.call(c(as.symbol("helpfun"))))
return(list(x=x,xy=xy))
}
testfun1(x = 2,y = 1:3)
## works as intended
eval.here <- function(fun){
environment(fun) <- parent.frame()
print(environment(fun))
eval(as.call(c(as.symbol(fun))))
}
testfun2 <- function(x,y){
print(sys.frame(sys.nframe()))
xy <- x*y
x <- eval.here("helpfun")
return(list(x=x,xy=xy))
}
testfun2(x = 2,y = 1:3)
## helpfun can't find variable 'x' despite having the same environment as in testfun1...
Background info: I have a large R code in which I want to call helperfunctions inside my main function. They alter variables of the main function environment. The purpose of all this is mainly to unclutter my code. (Main function code is currently over 2000 lines, with many calls to various helperfunctions which themselves are 40-150 lines long...)
Note that the number of arguments to my helper functions is very high, so that the traditional explicit passing of function arguments ( "helpfun(arg1 = arg1, arg2 = arg2, ... , arg50 = arg50)") would be cumbersome and doesnt yield the uncluttering of the code that I am aiming for. Therefore, I need to pass the variables from the parent frame to the helper functions anonymously.
Use this instead:
eval.here <- function(fun){
fun <- get(fun)
environment(fun) <- parent.frame()
print(environment(fun))
fun()
}
Result:
> testfun2(x = 2,y = 1:3)
<environment: 0x0000000013da47a8>
<environment: 0x0000000013da47a8>
$x
[1] 5 8 13
$xy
[1] 2 4 6

R specify function environment

I have a question about function environments in the R language.
I know that everytime a function is called in R, a new environment E
is created in which the function body is executed. The parent link of
E points to the environment in which the function was created.
My question: Is it possible to specify the environment E somehow, i.e., can one
provide a certain environment in which function execution should happen?
A function has an environment that can be changed from outside the function, but not inside the function itself. The environment is a property of the function and can be retrieved/set with environment(). A function has at most one environment, but you can make copies of that function with different environments.
Let's set up some environments with values for x.
x <- 0
a <- new.env(); a$x <- 5
b <- new.env(); b$x <- 10
and a function foo that uses x from the environment
foo <- function(a) {
a + x
}
foo(1)
# [1] 1
Now we can write a helper function that we can use to call a function with any environment.
with_env <- function(f, e=parent.frame()) {
stopifnot(is.function(f))
environment(f) <- e
f
}
This actually returns a new function with a different environment assigned (or it uses the calling environment if unspecified) and we can call that function by just passing parameters. Observe
with_env(foo, a)(1)
# [1] 6
with_env(foo, b)(1)
# [1] 11
foo(1)
# [1] 1
Here's another approach to the problem, taken directly from http://adv-r.had.co.nz/Functional-programming.html
Consider the code
new_counter <- function() {
i <- 0
function() {
i <<- i + 1
i
}
}
(Updated to improve accuracy)
The outer function creates an environment, which is saved as a variable. Calling this variable (a function) effectively calls the inner function, which updates the environment associated with the outer function. (I don't want to directly copy Wickham's entire section on this, but I strongly recommend that anyone interested read the section entitled "Mutable state". I suspect you could get fancier than this. For example, here's a modification with a reset option:
new_counter <- function() {
i <- 0
function(reset = FALSE) {
if(reset) i <<- 0
i <<- i + 1
i
}
}
counter_one <- new_counter()
counter_one()
counter_one()
counter_two <- new_counter()
counter_two()
counter_two()
counter_one(reset = TRUE)
I am not sure I completely track the goal of the question. But one can set the environment that a function executes in, modify the objects in that environment and then reference them from the global environment. Here is an illustrative example, but again I do not know if this answers the questioners question:
e <- new.env()
e$a <- TRUE
testFun <- function(){
print(a)
}
testFun()
Results in: Error in print(a) : object 'a' not found
testFun2 <- function(){
e$a <- !(a)
print(a)
}
environment(testFun2) <- e
testFun2()
Returns: FALSE
e$a
Returns: FALSE

Resources