Recursive default argument reference [duplicate] - r

Can anyone explain to me what is wrong with the code below? What I thought I was doing here is:
a declaration of a global variable a=5
a definition of a function fun which takes one argument which defaults to the aforementioned global variable a
And when I call fun() without any parameters, the local variable a becomes a copy of the global variable a, and at any point in the function code it takes precedence over the global a (unless I specifically use get("a", envir=parent.frame())).
But I must be wrong. Why isn't it allowed?
> a = 5
> fun = function(a=a) { a + 1 }
> fun(4)
[1] 5
> fun()
Error in fun() :
promise already under evaluation: recursive default argument reference or earlier problems?

"And when I call fun() without any parameters the local variable a becomes a copy of the global variable a"
No: default arguments are evaluated inside the scope of the function. Your code is similar to the following code:
fun = function(a) {
  if (missing(a)) a = a
  a + 1
}
This makes the scoping clearer and explains why your code doesn’t work.
Note that this is only true for default arguments; arguments that are explicitly passed are (of course) evaluated in the scope of the caller.
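A minimal sketch of the scoping rule described above. The names fun_self, fun_renamed, scoped and local_value are illustrative and not from the original post. It suggests that a default can see names defined inside the function, and that giving the argument a name different from the global one avoids the self-reference:

```r
a <- 5

# The default `a = a` refers to the argument itself, so evaluating it fails.
fun_self <- function(a = a) a + 1
err <- tryCatch(fun_self(), error = function(e) conditionMessage(e))

# With a different argument name, `a` is not found among the function's own
# variables, so lookup continues in the enclosing (here: global) environment.
fun_renamed <- function(x = a) x + 1
renamed_result <- fun_renamed()   # 6

# Defaults are evaluated inside the function, on first use, so they can even
# see a local variable created in the body before the argument is touched.
scoped <- function(n = local_value) {
  local_value <- 10
  n * 2
}
scoped_result <- scoped()         # 20
```

Renaming the argument (or the global) is usually the simplest fix when a default is meant to pick up a variable from the calling context.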

Related

Where are function constants stored if a function is created inside another function?

I am using a parent function to generate a child function by returning the function from the parent function call. The purpose of the parent function is to set a constant (y) in the child function. Below is a MWE. When I try to debug the child function, I cannot figure out which environment the variable is stored in.
power=function(y){
  return(function(x){return(x^y)})
}
square=power(2)
debug(square)
square(3)
debugging in: square(3)
debug at #2: {
return(x^y)
}
Browse[2]> x
[1] 3
Browse[2]> y
[1] 2
Browse[2]> ls()
[1] "x"
Browse[2]> find('y')
character(0)
If you inspect the type of an R function, you’ll observe the following:
> typeof(square)
[1] "closure"
And that is, in fact, exactly the answer to your question: a closure is a function that carries an environment around.
R also tells you which environment this is (albeit not in a terribly useful way):
> square
function(x){return(x^y)}
<environment: 0x7ffd9218e578>
(The exact number will differ with each run — it’s just a memory address.)
Now, which environment does this correspond to? It corresponds to a local environment that was created when we executed power(2) (a “stack frame”). As the other answer says, it’s now the parent environment of the square function (in fact, in R every function, except for certain builtins, is associated with a parent environment):
> ls(environment(square))
[1] "y"
> environment(square)$y
[1] 2
You can read more about environments in the chapter in Hadley’s Advanced R book.
Incidentally, closures are a core feature of functional programming languages. Another core feature of functional languages is that every expression is a value and, by implication, a function’s (return) value is the value of its last expression. This means that using the return function in R is both unnecessary and misleading (see footnote 1). You should therefore leave it out; this results in shorter, more readable code:
power = function (y) {
  function (x) x ^ y
}
There’s another R-specific subtlety here: since arguments are evaluated lazily, your function definition is error-prone:
> two = 2
> square = power(two)
> two = 10
> square(5)
[1] 9765625
Oops! Subsequent modifications of the variable two are reflected inside square (but only the first time! Further redefinitions won’t change anything). To guard against this, use the force function:
power = function (y) {
  force(y)
  function (x) x ^ y
}
force simply forces the evaluation of an argument name, nothing more.
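As a rough side-by-side check of the lazy and forced variants, assuming the illustrative names power_lazy and power_forced (not from the original answer):

```r
# Lazy variant: `y` stays an unevaluated promise until the closure first uses it.
power_lazy <- function(y) {
  function(x) x ^ y
}

# Forced variant: `y` is evaluated as soon as power_forced() is called.
power_forced <- function(y) {
  force(y)
  function(x) x ^ y
}

two <- 2
square_lazy <- power_lazy(two)
square_forced <- power_forced(two)

two <- 10                          # reassigned before either closure is called

lazy_result <- square_lazy(5)      # 9765625, i.e. 5^10
forced_result <- square_forced(5)  # 25, i.e. 5^2
```

Only the lazy version picks up the later assignment, which is the behaviour force is meant to guard against.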
Footnote 1: Misleading, because return is a function in R and carries a slightly different meaning compared to procedural languages: it aborts the current function execution.
The variable y is stored in the parent environment of the function. The environment() function returns the current environment, and we use parent.env() to get the parent environment of a particular environment.
ls(envir=parent.env(environment())) #when using the browser
The find() function doesn't seem helpful in this case because it seems to only search objects that have been attached to the global search path (search()). It doesn't try to resolve variable names in the current scope.
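Since find() only consults the search path, one possible workaround is a small helper that walks up the chain of enclosing environments. The helper name where_defined is hypothetical; this is a sketch built from exists() and parent.env(), not a built-in facility:

```r
# Hypothetical helper: return the first environment, starting at `env` and
# moving up through its parents, in which `name` is bound (NULL if none).
where_defined <- function(name, env = parent.frame()) {
  while (!identical(env, emptyenv())) {
    if (exists(name, envir = env, inherits = FALSE)) {
      return(env)
    }
    env <- parent.env(env)
  }
  NULL
}

power <- function(y) {
  force(y)
  function(x) x ^ y
}
square <- power(2)

# Start the search at the closure's environment, where `y` lives.
found <- where_defined("y", environment(square))
found_value <- get("y", envir = found)   # 2
```

Inside the browser, calling where_defined("y") without a second argument would start from the frame being debugged, under the same assumptions.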

Call Arguments of Function inside Function / R language

I have a function:
func <- function (x)
{
  arguments <- match.call()
  return(arguments)
}
1) If I call my function with specifying argument in the call:
func("value")
I get:
func(x = "value")
2) If I call my function by passing a variable:
my_variable <-"value"
func(my_variable)
I get:
func(x = my_variable)
Why are the first and the second results different?
Can I somehow get "func(x = "value")" from the second call?
I'm thinking my problem is that the Environment inside a function simply doesn't contain values if they were passed by variables. The Environment contains only names of variables for further lookup. Is there a way to follow such reference and get value from inside a function?
In R, when you pass my_variable as formal argument x into a function, the value of my_variable will only be retrieved when the function tries to read x (if it does not use x, my_variable will not be read at all). The same applies when you pass more complicated arguments, such as func(x = compute_my_variable()) -- the call to compute_my_variable will take place when func tries to read x (this is referred to as lazy evaluation).
Given lazy evaluation, what you are trying to do is not well defined because of side effects - in which order would you like to evaluate the arguments? Which arguments would you like to evaluate at all? (note a function can just take an expression for its argument using substitute, but not evaluate it). As a side effect, compute_my_variable could modify something that would impact the result of another argument of func. This can happen even when you only passed variables and constants as arguments (function func could modify some of the variables that will be later read, or even reading a variable such as my_variable could trigger code that would modify some of the variables that will be read later, e.g. with active bindings or delayed assignment).
So, if all you want to do is to log how a function was called, you can use sys.call (or match.call but that indeed expands argument names, etc). If you wanted a more complete stacktrace, you can use e.g. traceback(1).
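A small sketch of the difference between the two logging options mentioned above, reusing the question's func and my_variable names (the list elements as_written and matched are illustrative):

```r
func <- function(x) {
  as_written <- sys.call()    # the call exactly as the caller typed it
  matched <- match.call()     # the same call with argument names filled in
  list(as_written = as_written, matched = matched)
}

my_variable <- "value"
calls <- func(my_variable)

deparse(calls$as_written)     # "func(my_variable)"
deparse(calls$matched)        # "func(x = my_variable)"
```

Neither call evaluates the argument, so both are free of side effects.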
If for some reason you really wanted values of all arguments, say as if they were all read in the order of match.call, which is the order in which they are declared, you can do it using eval (returns them as list):
lapply(as.list(match.call())[-1], eval)
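A slightly more explicit variant of the same idea, as a sketch: it evaluates the argument expressions in the caller's environment (captured up front) and rebuilds the call with the values substituted. The local names caller, matched, values and rebuilt are assumptions for illustration, and the caveat about side effects from evaluating arguments still applies:

```r
func <- function(x) {
  caller <- parent.frame()    # environment the call was made from
  matched <- match.call()
  # Evaluate each supplied argument expression where the caller wrote it.
  values <- lapply(as.list(matched)[-1], eval, envir = caller)
  # Rebuild the call with values in place of the original expressions.
  rebuilt <- as.call(c(as.list(matched)[1], values))
  list(values = values, rebuilt = rebuilt)
}

my_variable <- "value"
res <- func(my_variable)

res$values$x           # "value"
deparse(res$rebuilt)   # "func(x = \"value\")"
```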
Can't you simply do the following?
return(paste('func(x =', x, ')'))

Inherit R function argument from global variable

I am trying to create a function where the default argument is given by a variable that only exists in the environment temporarily, e.g.:
arg=1:10
test=function(x=arg[3]){2*x}
> test()
[1] 6
The above works fine, as long as arg exists in the function environment. However, if I remove arg:
> rm(arg)
> test()
Error in test() : object 'arg' not found
Is there a way such that the default argument is taken as 3, even when arg ceases to exist? I have a feeling the correct answer involves some mixture of eval, quote and/or substitute, but I can't seem to find the correct incantation.
The proper way to do it in my opinion would be:
test <- function(x=3) { 2 *x }
and then call it with an argument:
arg<-1:10
test(arg[3])
This way the default value is 3, and you pass the argument you wish at runtime. If you call it without an argument, as test(), it will use the default.
The post above got me on the right track. Using formals:
arg=1:10
test=function(x){x*2}
formals(test)$x=eval(arg[3])
rm(arg)
test()
[1] 6
And that is what I was looking to achieve.
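Another way to freeze the default, sketched here under the assumption that a small function factory is acceptable (make_test and its default parameter are hypothetical names), is to capture the value in a closure:

```r
# Hypothetical factory: the value of `default` is fixed when make_test() runs.
make_test <- function(default) {
  force(default)                 # evaluate now, while `arg` still exists
  function(x = default) 2 * x
}

arg <- 1:10
test <- make_test(arg[3])
rm(arg)

default_result <- test()         # 6, even though `arg` is gone
explicit_result <- test(5)       # 10
```

Unlike modifying formals, this leaves the function definition itself untouched and keeps the captured value in the closure's environment.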

Mutating a variable in a closure [duplicate]

I'm pretty new to R, but coming from Scheme—which is also lexically scoped and has closures—I would expect being able to mutate outer variables in a closure.
E.g., in
foo <- function() {
  s <- 100
  add <- function() {
    s <- s + 1
  }
  add()
  s
}
cat(foo(), "\n") # prints 100 and not 101
I would expect foo() to return 101, but it actually returns 100:
$ Rscript foo.R
100
I know that Python has the global keyword to declare scope of variables (doesn't work with this example, though). Does R need something similar?
What am I doing wrong?
Update
Ah, is the problem that in add I am creating a new, local variable s that shadows the outer s? If so, how can I mutate s without creating a local variable?
Use the <<- operator for assignment in the add() function.
From ?"<<-":
The operators <<- and ->> are normally only used in functions, and cause a search to be made through parent environments for an existing definition of the variable being assigned. If such a variable is found (and its binding is not locked) then its value is redefined, otherwise assignment takes place in the global environment. Note that their semantics differ from that in the S language, but are useful in conjunction with the scoping rules of R. See ‘The R Language Definition’ manual for further details and examples.
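Applied to the question's example, the change might look like the sketch below. The counter at the end (make_counter, count) is an added illustration of the same mechanism, not part of the original question:

```r
foo <- function() {
  s <- 100
  add <- function() {
    s <<- s + 1        # finds `s` in foo's environment and updates it there
  }
  add()
  s
}
foo_result <- foo()    # 101

# The same mechanism gives a closure with private, mutable state.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}
counter <- make_counter()
counter()
second_count <- counter()   # 2
```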
You can also use assign and define the scope precisely using the envir argument. In this case it works the same way as <<- in your add function, but makes your intention a little clearer:
foo <- function() {
  s <- 100
  add <- function() {
    assign("s", s + 1, envir = parent.frame())
  }
  add()
  s
}
cat(foo(), "\n")
Of course, the better way to do this kind of thing in R is to have your function return the variable (or variables) it modifies and to explicitly reassign them to the original variable:
foo <- function() {
  s <- 100
  add <- function(x) x + 1
  s <- add(s)
  s
}
cat(foo(), "\n")
Here is one more approach that can be a little safer than the assign or <<- approaches:
foo <- function() {
  e <- environment()
  s <- 100
  add <- function() {
    e$s <- e$s + 1
  }
  add()
  s
}
foo()
The <<- assignment can cause problems if you accidentally misspell your variable name: it will still do something, but not what you expect, and the source of the problem can be hard to find. The assign approach can be tricky if you later want to move your add function inside another function, or call it from another function. The best approach overall is to not have functions modify variables outside their own scope, and instead have the function return anything that is important. But when that is not possible, the above method uses lexical scoping to access the environment e and then assigns into that environment, so it will always assign specifically into that function's environment, never above or below.

What is the mechanism that makes `+` work when defined by + in empty environment?

Here I create an unevaluated expression:
e2 <- expression(x+10)
If I supply an environment in which x is defined like
env <- as.environment(list(x=20))
eval(e2,env)
R will report an error:
Error in eval(expr, envir, enclos) : could not find function "+"
It is understandable since env is an environment created from scratch, that is, it has no parent environment where + is defined.
However, if I supply + in the list to be converted to an environment like this
env <- as.environment(list(x=20,`+`=function(a,b) {a+b}))
eval(e2,env)
The evaluation works correctly and yields 30.
However, when I define + in the list, it is a binary function whose body also uses +, which is defined in {base}. I know that function returns are lazily evaluated in R, but why does this work? If a+b in the function body is lazily evaluated, then when I call eval for e2 within env, even though + is defined in this environment, which has no parent environment, it should still call + itself, which should end up in an endless loop. Why does this not happen? What is the mechanism here?
When you define the environment here:
env <- as.environment(list(x=20,`+`=function(a,b) {a+b}))
then the function is actually defined in the .GlobalEnv (namely, where the definition is executed). You can verify this:
> environment(env$`+`)
<environment: R_GlobalEnv>
This observation is worth pondering a little: a function can be a member of environment x, yet belong to y (where “belong to” means that its object lookup uses y rather than x).
And .GlobalEnv knows about +, since it is defined somewhere in its parents (accessible via search()).
Incidentally, if you had used list2env instead of as.environment, your initial code would have worked:
> env = list2env(list(x = 20))
> eval(e2, env)
[1] 30
The reason is that, unlike as.environment, list2env by default uses the current environment as the new environment’s parent (this can be controlled via the parent argument). as.environment by contrast uses the empty environment (when creating an environment from a list).
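The following sketch puts the three cases side by side. The variable names (env_empty, env_child, env_plus and the result variables) are illustrative assumptions, not from the original thread:

```r
e2 <- expression(x + 10)

env_empty <- as.environment(list(x = 20))   # parent: the empty environment
env_child <- list2env(list(x = 20))         # parent: the calling environment

empty_parent <- identical(parent.env(env_empty), emptyenv())   # TRUE

# Without any parent, `+` cannot be found; capture the error message.
err <- tryCatch(eval(e2, env_empty), error = function(e) conditionMessage(e))

# With a parent chain that reaches base, `+` is found as usual.
child_result <- eval(e2, env_child)         # 30

# A function stored in an environment still looks names up in the
# environment where it was created, not in the one that holds it.
env_plus <- as.environment(list(x = 20, `+` = function(a, b) a + b))
plus_result <- eval(e2, env_plus)           # 30
same_env <- identical(environment(env_plus$`+`), environment())  # TRUE
```

The last check is what prevents the endless loop: the inner a + b is resolved from the function's own enclosing environment, so it reaches the base definition of + rather than the wrapper stored in env_plus.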
