Following the recent discussions here (e.g. 1, 2), I am now using environments in some of my code. My question is: how do I create functions that modify environments according to their arguments? For example:
y <- new.env()
with(y, x <- 1)
f <- function(env,z) {
with(env, x+z)
}
f(y,z=1)
throws
Error in eval(expr, envir, enclos) : object 'z' not found
I am using environments to keep two concurrent sets of simulations apart (without refactoring my code, which I wrote for a single set of experiments).
The simplest solution is to use the environment when referencing the object:
y <- new.env()
y$x <- 1
f <- function(env,z) {
env$x+z
}
f(y,z=1)
You would need to assign z to your environment as well.
y <- new.env()
with(y, x <- 1)
f <- function(env,z) {
assign("z", z, envir=env)
with(env, x+z)
}
f(y,z=1)
One other option would be to attach your environment so that the variables can now be used directly.
y <- new.env()
with(y, x <- 1)
f <- function(env,z) {
attach(env)
y <- x + z
detach(env)
y
}
f(y,z=1)
This latter solution is powerful because it means you can use any object from any attached environment within your new environment, but it also means that you need to be very careful about what has been assigned globally.
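Both variants above have side effects: assign() writes z into the shared environment, and attach() changes the search path. A minimal sketch of an alternative without side effects, assuming a throwaway child environment is acceptable (the name scratch is hypothetical):

```r
y <- new.env()
y$x <- 1

f <- function(env, z) {
  # Child environment: it sees everything in `env` through its parent,
  # while `z` is stored only in the child.
  scratch <- new.env(parent = env)
  scratch$z <- z
  eval(quote(x + z), scratch)
}

result <- f(y, z = 1)
result                                      # 2
exists("z", envir = y, inherits = FALSE)    # FALSE: `y` itself was not modified
```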
Edit:
This is interesting, and I don't entirely understand the behavior (i.e. why z is not in the scope of the with call). It has something to do with where the environment was originally created, which puts z outside the scope searched by with, because this version works:
f <- function(z) {
y <- new.env()
with(y, x <- 1)
with(y, x+z)
}
f(z=1)
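A likely explanation is that new.env() uses the calling frame as the default parent, so an environment created inside the function can see z through its parent, while one created at top level cannot. The sketch below illustrates this; make_inside, outside and f_try are hypothetical names, and globalenv() is passed explicitly to mimic a top-level new.env().

```r
make_inside <- function(z) {
  e <- new.env()          # default parent: the frame of this function call
  e$x <- 1
  list(env = e, frame = environment())
}

res <- make_inside(z = 1)
inside_ok <- identical(parent.env(res$env), res$frame)
inside_ok                 # TRUE
with(res$env, x + z)      # 2: `z` is found in the parent frame

outside <- new.env(parent = globalenv())   # what a top-level new.env() amounts to
outside$x <- 1

f_try <- function(env, z) {
  # Returns the error message instead of stopping, for illustration only
  tryCatch(with(env, x + z), error = function(e) conditionMessage(e))
}
f_try(outside, z = 1)     # an error message: `z` is not reachable from `outside`
```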
You only need to make one change to make your example work - redefine your function to use substitute() to 'fix' the desired values within the scope of f():
f <- function(env,z) {
eval(substitute(x+z,list(z=z)), env)
}
This can quickly get murky especially since you can even include assignment statements within substitute() (for instance, replace x+z with y <- x+z, not that this is entirely relevant here) but that choice can be made by the developer...
Additionally, you can replace list(z=z) in the substitution expression above with environment() (e.g., substitute(x+z,environment())) as long as you don't have conflicting variable names between those passed to f() and those residing in your 'env', but you may not want to take this too far.
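To see what substitute() actually produces, it can help to inspect the expression before evaluating it. A small sketch (f_expr is a hypothetical helper that only returns the substituted call):

```r
# Return the substituted expression instead of evaluating it
f_expr <- function(z) substitute(x + z, list(z = z))

e1 <- f_expr(1)
deparse(e1)        # "x + 1": `z` has been replaced by its value, `x` is still free

y <- new.env()
y$x <- 1

f <- function(env, z) {
  eval(substitute(x + z, list(z = z)), env)   # `x` is resolved in `env`
}
f(y, z = 1)        # 2
```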
Edit: Here are two other ways, the first of which is only meant to show the flexibility in manipulating environments and the second is more reasonable to actually use.
1) modify the enclosing environment of 'env' (but change it back to original value before exiting function):
f <- function(env,z) {
e <- parent.env(env)
parent.env(env) <- environment()
output <- with(env,x+z)
parent.env(env) <- e
output
}
2) Evaluate the expression against a list copy of 'env', so that 'z' is looked up in the function's own frame rather than remaining a free variable when x+z is evaluated in 'env'. (with() consults the calling frame only when its data is a list or data frame; given an environment, lookup follows that environment's own parents and never reaches the function frame.)
f <- function(env,z) {
with(as.list(env), x+z)
}
In case of conflicting symbol-value associations - e.g., if you have 'x' defined in both your function environment and the environment you created, 'y' - this form takes 'x' from 'env'. If the function's value should win instead, fix it in the expression with substitute() as shown above.
y <- new.env()
with(y, x <- 1)
f <- function(env,z) {
with(env, x+z)
}
f(y,z=1)
Mind the parentheses :) The following will work:
with(env, x)+z
I've just read about delayedAssign(), but the way you have to do it is by passing the name of the delayed variable as the first parameter. Is there a way to do it via direct assignment?
e.g.:
x <- delayed_variable("Hello World")
rather than
delayedAssign("x","Hello World")
I want to create a variable that will throw an error if accessed (use-case is obviously more complex), so for example:
f <- function(x){
y <- delayed_variable(stop("don't use y"))
x
}
f(10)
> 10
f <- function(x){
y <- delayed_variable(stop("don't use y"))
y
}
f(10)
> Error in f(10) : don't use y
No, you can't do it that way. Your example would be fine with the current setup, though:
f <- function(x){
delayedAssign("y", stop("don't use y"))
y
}
f(10)
which gives exactly the error you want. The reason for this limitation is that delayed_variable(stop("don't use y")) would create a value which would trigger the error when evaluated, and assigning it to y would evaluate it.
Another version of the same thing would be
f <- function(x, y = stop("don't use y")) {
...
}
Internally it's very similar to the delayedAssign version.
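The laziness can be checked directly. In this sketch (the use_y argument is added purely for illustration), the stop() call only runs on the code path that actually touches y:

```r
f <- function(use_y) {
  delayedAssign("y", stop("don't use y"))
  if (use_y) y else "y was never touched"
}

f(FALSE)    # "y was never touched": the promise is never forced

msg <- tryCatch(f(TRUE), error = function(e) conditionMessage(e))
msg         # "don't use y"
```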
I reached a solution using makeActiveBinding(), which works provided it is called from within a function (it throws an error if called directly). My own use-case only needs a smaller part of this, but I generalised the code a bit for others to use.
Importantly for my use-case, this function allows other functions to use delayed assignment within functions and also passes R CMD check with no Notes.
Here is the function; it gives the desired outputs from my question.
delayed_variable <- function(call){
#Get the current call
prev.call <- sys.call()
attribs <- attributes(prev.call)
# If srcref isn't there, then we're not coming from a function
if(is.null(attribs) || !"srcref" %in% names(attribs)){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Extract the call including the assignment operator
this_call <- parse(text=as.character(attribs$srcref))[[1]]
# Check if this is an assignment `<-` or `=`
if(!(identical(this_call[[1]],quote(`<-`)) ||
identical(this_call[[1]],quote(`=`)))){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Get the variable being assigned to as a symbol and a string
var_sym <- this_call[[2]]
var_str <- deparse(var_sym)
#Get the parent frame that we will be assigning into
p_frame <- parent.frame()
var_env <- new.env(parent = p_frame)
#Create a random string to be an identifier
var_rand <- paste0(sample(c(letters,LETTERS),50,replace=TRUE),collapse="")
#Put the variables into the environment
var_env[["p_frame"]] <- p_frame
var_env[["var_str"]] <- var_str
var_env[["var_rand"]] <- var_rand
# Create the function that will be bound to the variable.
# Since this is an Active Binding (AB), we have three situations
# i) It is run without input, and thus the AB is
# being called on its own (missing(input)),
# and thus it should evaluate and return the output of `call`
# ii) It is being run as the lhs of an assignment
# as part of the initial assignment phase, in which case
# we do nothing (i.e. input is the output of this function)
# iii) It is being run as the lhs of a regular assignment,
# in which case, we want to overwrite the AB
fun <- function(input){
if(missing(input)){
# No assignment: variable is being called on its own
# So, we activate the delayed assignment call:
res <- eval(call,p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
res
} else if(!inherits(input,"assign_delay") &&
input != var_rand){
# Attempting to assign to the variable
# and it is not the initial definition
# So we overwrite the active binding
res <- eval(substitute(input),p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
invisible(res)
}
# Else: We are assigning and the assignee is the output
# of this function, in which case, we do nothing!
}
#Fix the call in the above eval to be the exact call
# rather than a variable (useful for debugging)
# This is in the line res <- eval(call,p_frame)
body(fun)[[c(2,3,2,3,2)]] <- substitute(call)
#Put the function inside the environment with
# all of the variables above
environment(fun) <- var_env
# Check if the variable already exists in the calling
# environment and if so, remove it
if(exists(var_str,envir=p_frame)){
rm(list=var_str,envir=p_frame)
}
# Create the AB
makeActiveBinding(var_sym,fun,p_frame)
# Return a specific object to check for
structure(var_rand,class="assign_delay")
}
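For readers who only need the "error on access" behaviour and can live without the direct-assignment syntax, a much smaller sketch of the underlying mechanism may be enough. It calls makeActiveBinding() directly inside the function; the touch argument is hypothetical and only there to exercise both paths.

```r
f <- function(x, touch = FALSE) {
  # Bind `y` in this function's frame to a function that runs on every access
  makeActiveBinding("y", function() stop("don't use y"), environment())
  if (touch) y
  x
}

f(10)       # 10: `y` is never accessed, so the binding function never runs

msg <- tryCatch(f(10, touch = TRUE), error = function(e) conditionMessage(e))
msg         # "don't use y"
```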
The following code is supposed to change the value of the enclosed variable some.var by calling the function set.var. The latter in turn calls the outer function g, whose environment is changed to the parent environment of set.var.
new.obj <- function(){
some.var = NULL
set.var <- function(...) {
environment(g) <- parent.frame()
g(x="some.var", ...)
}
get.var <- function(){some.var}
return(list(set.var=set.var, get.var=get.var))
}
g <- function(x) assign(x,1)
However, obj<-new.obj(); obj$set.var(); obj$get.var() returns NULL. What goes wrong here and how can the behaviour be fixed?
The reason I am considering this construction is that I would like to reuse the code within g in different closures. Hence, it should be placed outside of them.
I suspect that you are making things more complicated than necessary. Take a look at this example and comment if you need to achieve something that this cannot do:
do_the_job <- function(x) {
return(x * 1000)
}
wrapper_function <- function(y) {
return( do_the_job(y) )
}
my_value <- 5
my_new_value <- wrapper_function(my_value)
Not entirely sure what I'm doing here myself.
I hope it helps/inspires you:
new.obj <- function(){
env1 <- new.env()
env1$some.var = NULL
f <- function() {
environment(g) <- parent.frame()
g(x="some.var", envir = env1)
}
get.var <- function(){ env1$some.var }
return(list(f=f, get.var=get.var))
}
g <- function(x, ...) assign(x, "hihi_changed", envir = ...)
obj<-new.obj(); obj$get.var() returns NULL, as we expect.
After obj$f(), obj$get.var() returns the changed value, because f eventually calls g, which changes some.var.
My trick is to add the variable some.var to a new environment and always refer to that env1 environment.
So in function g(), always use the ellipsis to pass the new.obj environment env1, where some.var currently lives.
Hope this keeps you going.
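As for what goes wrong in the original code: assign(x, 1) without an envir argument writes into the frame of g itself, regardless of what environment(g) has been set to, so some.var in the closure is never touched. A minimal sketch that keeps g outside the closures and passes the target environment explicitly (the envir parameter of g is an addition, not part of the original code):

```r
# Reusable helper that lives outside any closure
g <- function(x, envir) assign(x, 1, envir = envir)

new.obj <- function() {
  some.var <- NULL
  set.var <- function() {
    # parent.env(environment()) is the frame of new.obj(), where some.var lives
    g("some.var", envir = parent.env(environment()))
  }
  get.var <- function() some.var
  list(set.var = set.var, get.var = get.var)
}

obj <- new.obj()
before <- obj$get.var()    # NULL
obj$set.var()
after <- obj$get.var()     # 1
```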
I'm trying to use the curve3d function in the emdbook package to create a contour plot of a function defined locally inside another function, as shown in the following minimal example:
library(emdbook)
testcurve3d <- function(a) {
fn <- function(x,y) {
x*y*a
}
curve3d(fn(x,y))
}
Unexpectedly, this generates the error
> testcurve3d(2)
Error in fn(x, y) : could not find function "fn"
whereas the same idea works fine with the more basic curve function from the graphics package:
testcurve <- function(a) {
fn <- function(x) {
x*a
}
curve(a*x)
}
testcurve(2)
The question is how curve3d can be rewritten such that it behaves as expected.
You can temporarily attach the function environment to the search path to get it to work:
testcurve3d <- function(a) {
fn <- function(x,y) {
x*y*a
}
e <- environment()
attach(e)
curve3d(fn(x,y))
detach(e)
}
Analysis
The problem comes from this line in curve3d:
eval(expr, envir = env, enclos = parent.frame(2))
At this point, we appear to be 10 frames deep, and fn is defined in parent.frame(8). So you can edit the line in curve3d to use that, but I'm not sure how robust this is. Perhaps parent.frame(sys.nframe()-2) might be more robust, but as ?sys.parent warns there can be some strange things going on:
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
Beware of the effect of lazy evaluation: these two functions look at
the call stack at the time they are evaluated, not at the time they
are called. Passing calls to them as function arguments is unlikely to
be a good idea.
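The role of enclos can be illustrated without emdbook. In the sketch below, grid_eval is a hypothetical stand-in for the eval() line quoted above, not curve3d's actual code: the data list supplies x and y, and anything else, such as fn, is looked up in enclos.

```r
grid_eval <- function(expr, enclos) {
  # x and y come from the list; other symbols are resolved via `enclos`
  eval(expr, list(x = 2, y = 3), enclos = enclos)
}

test_local <- function(a) {
  fn <- function(x, y) x * y * a
  # Passing the frame where `fn` is defined makes the lookup succeed
  grid_eval(quote(fn(x, y)), enclos = environment())
}

ok <- test_local(2)
ok          # 12

# With an enclosure that does not contain `fn`, the lookup fails
failed <- tryCatch(grid_eval(quote(fn(x, y)), enclos = baseenv()),
                   error = function(e) "lookup failed")
failed      # "lookup failed"
```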
The eval - parse solution bypasses some worries about variable scope. This passes the value of both the variable and function directly as opposed to passing the variable or function names.
library(emdbook)
testcurve3d <- function(a) {
fn <- eval(parse(text = paste0(
"function(x, y) {",
"x*y*", a,
"}"
)))
eval(parse(text = paste0(
"curve3d(", deparse(fn)[3], ")"
)))
}
testcurve3d(2)
I have found another solution that I do not like very much, but maybe it will help you.
You can create the function fn as a call object and evaluate it in curve3d:
fn <- quote((function(x, y) {x*y*a})(x, y))
eval(call("curve3d", fn))
Inside another function the problem persists, because a would have to be in the global environment, but this can be fixed with substitute.
Example:
testcurve3d <- function(a) {
fn <- substitute((function(x, y) {
c <- cos(a*pi*x)
s <- sin(a*pi*y/3)
return(c + s)
})(x, y), list(a = a))
eval(call("curve3d", fn, zlab = "fn"))
}
par(mfrow = c(1, 2))
testcurve3d(2)
testcurve3d(5)
I am working with some large data sets and have constructed a negative log likelihood function and associated gradient to pass to an optimisation routine. Both the functions require a vector of parameters and the passing of the large data sets into them.
The optimisation routine will call the two functions multiple times, and their execution speed is the main bottleneck in the process. I don't want to pass the data directly to the functions, as I was under the impression that R may copy it.
I have considered:
# some large data sets
a<-1; b<-2
# place the data sets in an environment
varSpace <- new.env()
assign('c', a, envir = varSpace)
assign('d', b, envir = varSpace)
dFunA <- function(x){
x <- x + a+b
x
}
dFunB <- function(x, envir = varSpace){
x <- x + get('c', envir) + get('d', envir)
x
}
dFunC <- function(x, envir = varSpace){
with(envir,{
x <- x + c + d
})
x
}
dFunD <- function(x, envir = varSpace){
attach(envir)
on.exit({detach(envir)})
x <- x + c + d
x
}
> dFunA(1)
[1] 4
> dFunB(1)
[1] 4
> dFunC(1)
Error in eval(expr, envir, enclos) : object 'x' not found
> dFunD(1)
[1] 4
Approach A requires the data sets to be further up the calling stack. It works but I would like a tidier approach.
Approach B requires the use of get and calling the environment where the data has been placed.
Approach C doesn't work.
Approach D appears to work, but I am mindful of ?detach, which carries the good-practice comment "Use of attach/detach is best avoided in functions."
Any help and advice would be appreciated.
You don't need to fiddle around with assign, get or attach. Just set the environment for your functions to the one that you've created.
dFunA <- function(x)
x + a + b
varSpace <- new.env()
varSpace$a <- 1
varSpace$b <- 2
environment(dFunA) <- varSpace
... assuming that this is necessary in the first place. As Aaron commented, R is copy-on-write, so unless you're modifying a or b they're not likely to be copied.
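A closely related pattern is a function factory, where the closure's own environment holds the data and no explicit environment object is needed. This is only a sketch; make_dFun and dFunE are hypothetical names, and the small numbers stand in for the large data sets.

```r
make_dFun <- function(a, b) {
  force(a)
  force(b)                     # evaluate the arguments once, up front
  function(x) x + a + b        # `a` and `b` are found in the enclosing frame
}

dFunE <- make_dFun(1, 2)
dFunE(1)    # 4
```

The returned function can then be handed to the optimiser directly, with the parameter vector as its only argument.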
Suppose we have these functions in an R package.
prova <- function() {
print(attr(prova, 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
'myattr<-' <- function(x, value) {
attr(x, 'myattr') <- value
x
}
myattr <- function(x) attr(x, 'myattr')
So, I install the package and then I test it. This is the result:
prova()
# NULL
# NULL
myattr(prova) <- 'ciao' # setting 'ciao' for 'myattr' attribute
prova()
# NULL
# NULL # Why NULL here ?
myattr(prova)
# [1] "ciao"
attr(prova, 'myattr')
# [1] "ciao"
The question is: how to get the attribute of the function from within itself?
Inside the function itself I cannot get its attribute, as demonstrated by the example.
I suppose that the solution will be along the lines of "computing on the language" (match.call()[[1L]], substitute, environments and friends). Am I wrong?
I think that the important point here is that this function is in a package (so, it has its environment and namespace) and I need its attribute inside itself, in the package, not outside.
You can use get with the envir argument.
prova <- function() {
print(attr(get("prova", envir=envir.prova), 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
eg:
envir.prova <- environment()
prova()
# NULL
# NULL
myattr(prova) <- 'ciao'
prova()
# [1] "ciao"
# [1] "ciao"
Where envir.prova is a variable whose value you set to the environment in which prova is defined.
Alternatively you can use get(.. envir=parent.frame()), but that is less reliable as then you have to track the calls too, and ensure against another object with the same name between the target environment and the calling environment.
Update regarding question in the comments:
Regarding parent.frame() versus an explicit environment name: parent.frame, as the name suggests, goes "up one level." Often, that is exactly where you want to go, so that works fine. And yet, even when your goal is to get an object in an environment further up, R searches up the call stack until it finds the object with the matching name. So very often, parent.frame() is just fine.
HOWEVER if there are multiple calls between where you are invoking parent.frame() and where the object is located AND in one of the intermediary environments there exists another object with the same name, then R will stop at that intermediary environment and return its object, which is not the object you were looking for.
Therefore, parent.frame() has an argument n (which defaults to 1), so that you can tell R to begin its search n levels back.
This is the "keeping track" that I refer to, where the developer has to be mindful of the number of calls in between. The straightforward way to go about this is to have an n argument in every function that is calling the function in question, and have that value default to 1. Then for the envir argument, you use: get/assign/eval/etc (.. , envir=parent.frame(n=n) )
Then if you call Func2 from Func1, (both Func1 and Func2 have an n argument), and Func2 is calling prova, you use:
Func1 <- function(x, y, ..., n=1) {
... some stuff ...
Func2( <some, parameters, etc,> n=n+1)
}
Func2 <- function(a, b, c, ..., n=1) {
.... some stuff....
eval(quote(prova()), envir=parent.frame(n=n) )
}
As you can see, it is not complicated, but it is tedious, and sometimes what seems like a bug creeps in, which is simply forgetting to carry the n over.
Therefore, I prefer to use a fixed variable with the environment name.
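A likely reason the attribute is invisible from inside a packaged function is that myattr(prova) <- 'ciao' creates a modified copy in the user's workspace, while the body of prova still resolves the name prova in the package namespace, where the original is unchanged. The sketch below imitates this with a plain environment standing in for the namespace (ns is an assumption for illustration, not a real namespace):

```r
ns <- new.env()                     # stand-in for the package namespace
local({
  prova <- function() attr(prova, "myattr")
}, envir = ns)

prova <- ns$prova                   # roughly what the user sees once the package is attached
attr(prova, "myattr") <- "ciao"     # modifies only this outer copy

inside <- prova()                   # the body finds the unmodified `prova` in `ns`
inside                              # NULL
attr(prova, "myattr")               # "ciao"
attr(ns$prova, "myattr")            # NULL
```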
The solution that I found is:
myattr <- function(x) attr(x, 'myattr')
'myattr<-' <- function(x, value) {
# check that x is a function (e.g. the prova function)
# checks on value (e.g. also value is a function with a given precise signature)
attr(x, 'myattr') <- value
x
}
prova <- function(..., env = parent.frame()) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], env)
# print(eval(as.call(c(myattr, this)), env)) # alternative
print(myattr(this))
# print(attr(this, 'myattr'))
invisible(TRUE)
}
I want to thank @RicardoSaporta for the help and the clarification about keeping track of the calls.
This solution doesn't work when e.g. myattr(prova) <- function() TRUE is nested in func1 while prova is called in func2 (which is called by func1), unless you properly update its env parameter.
For completeness, following the suggestion of @RicardoSaporta, I slightly modified the prova function:
prova <- function(..., pos = 1L) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], parent.frame(n = pos))
print(myattr(this))
# ...
}
This way, it also works when nested, if the correct pos parameter is passed in.
With this modification it is easier to fish out the environment in which you set the attribute on the function prova.
myfun1 <- function() {
myattr(prova) <- function() print(FALSE)
myfun2(n = 2)
}
myfun2 <- function(n) {
prova(pos = n)
}
myfun1()
# function() print(FALSE)
# <environment: 0x22e8208>