I just ran into something odd, which hopefully someone here can shed some light on. Basically, when a function has an argument whose default value is the argument's name, strange things happen (well, strange to me anyway).
For example:
y <- 5
f <- function(x=y) x^2
f2 <- function(y=y) y^2
I would consider f and f2 to be equivalent; although they use different variable names internally, they should both pick up the y object in the global environment to use as the default. However:
> f()
[1] 25
> f2()
Error in y^2 : 'y' is missing
Not sure why that is happening.
Just to make things even more interesting:
f3 <- function(y=y) y$foo
> f3()
Error in f3() :
promise already under evaluation: recursive default argument reference or earlier problems?
I expected f3 to throw an error, but not that one!
This was tested on R 2.11.1, 2.12.2, and 2.14, on 32-bit Windows XP SP3. Only the standard packages loaded.
Default arguments are evaluated inside the scope of the function. Your f2 is similar (almost equivalent) to the following code:
f2 = function(y) {
if (missing(y)) y = y
y^2
}
This makes the scoping clearer and explains why your code doesn’t work.
Note that this is only true for default arguments; arguments that are explicitly passed are (of course) evaluated in the scope of the caller.
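A minimal sketch of that difference, with made-up names (g and the local value 100 are only for illustration): the default expression is looked up inside the function's own frame, while an explicitly passed argument is looked up where the call was made.

```r
y <- 5

g <- function(x = y) {
  y <- 100  # local y, created before x is first used
  x         # forcing x now evaluates the default `y` inside g's own frame
}

default_result  <- g()   # default evaluated in g's frame: finds the local y
explicit_result <- g(y)  # supplied argument evaluated in the caller: finds the global y

print(default_result)   # 100
print(explicit_result)  # 5
```

This is also why y=y cannot work as a default: the only y visible from inside the function is the argument itself.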
Lazy evaluation, on the other hand, has nothing to do with this: all arguments are lazily evaluated, but calling f2(y) works without complaint. To show that lazy evaluation always happens, consider this:
f3 = function (x) {
message("x has not been evaluated yet")
x
}
f3(message("NOW x has been evaluated"))
This will print, in this order:
x has not been evaluated yet
NOW x has been evaluated
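A related sketch (ignores_x is a hypothetical name): if the function body never touches the argument, its expression is never evaluated at all, so even a call to stop() in argument position does nothing.

```r
ignores_x <- function(x) {
  # x is never referenced, so its promise is never forced
  "x was never touched"
}

result <- ignores_x(stop("this error is never raised"))
print(result)
```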
I have been grappling with the application of promise since I first read about it on Advanced R. It is mentioned that a promise is a data structure that powers lazy evaluation. The concept of lazy evaluation is quite clear as function arguments are only evaluated whenever they are accessed. However, in some examples I just cannot discover the presence of a promise and how/where it is evaluated. Consider the following example from Advanced R:
y <- 10
h02 <- function(x) {
y <- 100
x + 1
}
h02(y)
[1] 11
It returns 11 instead of 101, apparently because when we pass a variable like y, which already exists in the global environment, as x, it is bound and evaluated outside of the function.
So I would like to know: does a promise always involve some sort of assignment, or can any expression be a promise, and how can we detect their presence?
It is mentioned that they are evaluated in the calling environment of a function. So the second question is: are their evaluation environments different from those of normal arguments, given that user-supplied arguments are evaluated outside of the function?
There is also another example which I cannot understand why it involves lazy evaluation, and we only see Calculating... once.
double <- function(x) {
message("Calculating...")
x * 2
}
h03 <- function(x) {
c(x, x)
}
h03(double(20))
Calculating...
[1] 40 40
I am so sorry if I sound a little bit confused here, I got the point but it has never quite sunk in and I wanted to ask for a little bit of explanation for which I am very grateful.
Thank you very much in advance
When an object such as y within h02 is created in a function it is created in the local execution frame/environment of that function (a new frame is created each time the function is run). The created object is distinct from an object of the same name in any other environment.
Regarding h03 once a promise is forced, i.e. evaluated, its value is stored in the promise's value component and its evaled component is set to TRUE so that upon further accesses it does not have to be evaluated again.
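The "forced only once" behaviour can be checked without any extra packages. This is just a sketch; the counter calls and the helper names are invented for the illustration.

```r
calls <- 0

noisy_value <- function() {
  calls <<- calls + 1  # count how often the argument expression actually runs
  42
}

use_twice <- function(x) c(x, x)  # x is accessed twice

out <- use_twice(noisy_value())
print(out)    # 42 42
print(calls)  # 1: the promise was evaluated on first access and cached afterwards
```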
Arguments of functions are promises but normally not other objects. Use pryr to inspect objects.
library(pryr)
f <- function(x) {
z <- 1
cat("is_promise(z):", is_promise(z), "\n")
cat("is_promise(x):", is_promise(x), "\n")
cat("before forcing - promise_info(x):\n")
print(promise_info(x))
force(x)
cat("after forcing - promise_info(x):\n")
print(promise_info(x))
delayedAssign("w", 3)
cat("is_promise(w):", is_promise(w), "\n")
invisible()
}
a <- 3
f(a)
giving:
is_promise(z): FALSE
is_promise(x): TRUE
before forcing - promise_info(x):
$code
a
$env
<environment: R_GlobalEnv>
$evaled
[1] FALSE
$value
NULL
after forcing - promise_info(x):
$code
a
$env
NULL
$evaled
[1] TRUE
$value
[1] 3
is_promise(w): TRUE
I am trying to understand how R scopes variables inside a function. Why is the output 12 and not 4? How are a and b assigned here?
I am learning R. Please explain with some references
f1 <- function(a = {b <- 10; 2}, b = 2) {
a+b
}
f1()
This is explained in section 4.3.3 of the R Language manual.
When a function is called, each formal argument is assigned a promise
in the local environment of the call with the expression slot
containing the actual argument (if it exists) and the environment slot
containing the environment of the caller. If no actual argument for a
formal argument is given in the call and there is a default
expression, it is similarly assigned to the expression slot of the
formal argument, but with the environment set to the local
environment.
The process of filling the value slot of a promise by evaluating the
contents of the expression slot in the promise’s environment is called
forcing the promise. A promise will only be forced once, the value
slot content being used directly later on.
Nothing has a value until the sum starts getting computed. First a is required and so its expression is evaluated. The promise for b is lost as it gets assigned a value directly during the forcing of a and so the actual b assignment promise in the function definition is not evaluated at all.
If the order is the other way round, you see a different result:
f2 <- function(a = 2, b = {a <- 10; 2}) {
a+b
}
f2()
[1] 4
However, note that the value of a will be 10 at end of the function, but 2 when it is required during the sum. Both promises get evaluated here.
If the order of the sum is reversed in f1 to instead be b+a you would find similar behaviour to f2.
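A sketch of that reversed version (f1_rev and the returned list are only for illustration), which also shows what b holds once both promises have been forced:

```r
f1_rev <- function(a = {b <- 10; 2}, b = 2) {
  # b is forced first (value 2); forcing a then reassigns b to 10 as a side effect
  total <- b + a
  list(total = total, b_afterwards = b)
}

res <- f1_rev()
print(res$total)         # 4
print(res$b_afterwards)  # 10
```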
Earlier in that section there is a general warning that side-effects should be avoided in arguments because there is no guarantee they will be evaluated.
R has a form of lazy evaluation of function arguments. Arguments are
not evaluated until needed. It is important to realize that in some
cases the argument will never be evaluated. Thus, it is bad style to
use arguments to functions to cause side-effects. While in C it is
common to use the form, foo(x = y) to invoke foo with the value of y
and simultaneously to assign the value of y to x this same style
should not be used in R. There is no guarantee that the argument will
ever be evaluated and hence the assignment may not take place.
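To make the warning concrete, here is a small sketch (never_looks and flag are hypothetical names) in which an assignment written as an argument simply never happens:

```r
flag <- FALSE

never_looks <- function(x) invisible(NULL)  # body never references x

never_looks(flag <- TRUE)  # the assignment is an unevaluated argument expression
print(flag)                # still FALSE
```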
Refer to https://www.rdocumentation.org/packages/base/versions/3.5.2/topics/assignOps
Try this
f1 <- function(a = {b <= 10; 2}, b = 2) {
a+b
}
f1()
or
f1 <- function(a = {b <<- 10; 2}, b = 2) {
a+b
}
f1()
I'm currently having some issues understanding the behavior of the eval function, specifically the enclos (third) argument when nothing is supplied for it and the default parent.frame() is used.
name <- function(x){
print(substitute(x))
t <- substitute(x)
eval(t, list(a=7), parent.frame())
}
z <-5
name(a+z)
# returns 12, makes sense because this amounts to
# eval(a+z, list(a=7), globalenv())
# however the return here makes no sense to me
name2 <- function(x){
print(substitute(x))
t <- substitute(x)
eval(t, list(a=7)) # third/enclosure argument is left missing
}
z <-5
name2(a+z)
# Also returns 12
I'm having trouble understanding why the second call returns 12. According to my understanding of R, the second call should result in an error because
1) eval's default third argument enclos= parent.frame(), which is not specified.
2) Therefore, parent.frame() is evaluated in the local environment of eval. This is confirmed by Hadley in When/how/where is parent.frame in a default argument interpreted?
3) Thus, the last expression ought to resolve to eval(a+z, list(a=7), executing environment of name)
4) This should return an error because z is not defined in the executing environment of name nor in list(a=7).
Can someone please explain what's wrong with this logic?
z will be available inside the function since it's defined in .GlobalEnv.
Simply put,
name <- function(x) {
print(z)
}
z <- 5
name(z)
# [1] 5
So while a is still unknown until eval(t, list(a=7)), z is already available. If z is not defined inside name, it will be looked for in .GlobalEnv, because name was defined there and .GlobalEnv is therefore its enclosing environment. What might be counterintuitive is that (a+z) is undefined unless you specify an environment for a. But for z, there is no need to do so.
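One way to see where the lookup really goes is to call both variants from a function that has its own z. This is only a sketch with invented names (with_default, with_explicit, caller). With the default enclos, the search continues from the frame of the function that called eval and then follows that function's enclosing environments. With an explicit parent.frame(), it continues from the frame of whoever called the function.

```r
z <- 5

with_default <- function(x) {
  t <- substitute(x)
  eval(t, list(a = 7))                  # enclos defaults to parent.frame(), resolved from inside eval
}

with_explicit <- function(x) {
  t <- substitute(x)
  eval(t, list(a = 7), parent.frame())  # parent.frame() resolved from inside with_explicit
}

caller <- function() {
  z <- 100  # a z that exists only in the caller's frame
  c(default = with_default(a + z), explicit = with_explicit(a + z))
}

res <- caller()
print(res)  # default = 12 (global z), explicit = 107 (caller's z)
```

So step 4 of the question is the part that goes wrong: z is not in the executing environment of the function, but that environment's parent is the global environment, where z is found.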
Background
I’m in the process of creating a shortcut for lambdas, since the repeated use of function (…) … clutters my code considerably. As a remedy, I’m trying out alternative syntaxes inspired by other languages such as Haskell, as far as this is possible in R. Simplified, my code looks like this:
f <- function (...) {
args <- match.call(expand.dots = FALSE)$...
last <- length(args)
params <- c(args[-last], names(args)[[last]])
function (...)
eval(args[[length(args)]],
envir = setNames(list(...), params),
enclos = parent.frame())
}
This allows the following code:
f(x = x * 2)(5) # => 10
f(x, y = x + y)(1, 2) # => 3
etc.
Of course the real purpose is to use this with higher-order functions [1]:
Map(f(x = x * 2), 1 : 10)
The problem
Unfortunately, I sometimes have to nest higher-order functions and then it stops working:
f(x = Map(f(y = x + y), 1:2))(10)
yields “Error in eval(expr, envir, enclos): object x not found”. The conceptually equivalent code using function instead of f works. Furthermore, other nesting scenarios also work:
f(x = f(y = x + y)(2))(3) # => 5
I’m suspecting that the culprit is the parent environment of the nested f inside the map: it’s the top-level environment rather than the outer f’s. But I have no idea how to fix this, and it also leaves me puzzled that the second scenario above works. Related questions (such as this one) suggest workarounds which are not applicable in my case.
Clearly I have a gap in my understanding of environments in R. Is what I want possible at all?
[1] Of course this example could simply be written as (1 : 10) * 2. The real application is with more complex objects / operations.
The answer is to attach parent.frame() to the output function's environment:
f <- function (...) {
args <- match.call(expand.dots = FALSE)$...
last <- length(args)
params <- c(args[-last], names(args)[[last]])
e <- parent.frame()
function (...)
eval(args[[length(args)]],
envir = setNames(list(...), params),
enclos = e)
}
Hopefully someone can explain well why this works and not yours. Feel free to edit.
Great question.
Why your code fails
Your code fails because eval()'s supplied enclos= argument does not point far enough up the call stack to reach the environment in which you are wanting it to next search for unresolved symbols.
Here is a partial diagram of the call stack from the bottom of which your call to parent.frame() occurs. (To make sense of this, it's important to keep in mind that the function call from which parent.frame() is here being called is not f(), but a call to the anonymous function returned by f() (let's call it fval)).
## Note: E.F. = "Evaluation Frame"
## fval = anonymous function returned as value of nested call to f()
f( <------------------------- ## E.F. you want, ptd to by parent.frame(n=3)
Map(
mapply( <-------------------- ## E.F. pointed to by parent.frame(n=1)
fval( |
parent.frame(n=1 |
In this particular case, redefining the function returned by f() to call parent.frame(n=3) rather than parent.frame(n=1) produces working code, but that's not a good general solution. For instance, if you wanted to call f(x = mapply(f(y = x + y), 1:2))(10), the call stack would then be one step shorter, and you'd instead need parent.frame(n=2).
Why flodel's code works
flodel's code provides a more robust solution by calling parent.frame() during evaluation of the inner call to f in the nested chain f(Map(f(), ...)) (rather than during the subsequent evaluation of the anonymous function fval returned by f()).
To understand why his parent.frame(n=1) points to the appropriate environment, it's important to recall that in R, supplied arguments are evaluated in the evaluation frame of the calling function. In the OP's example of nested code, the inner f() is evaluated during the processing of Map()'s supplied arguments, so its evaluation environment is that of the function calling Map(). Here, the function calling Map() is the outer call to f(), and its evaluation frame is exactly where you want eval() to next be looking for symbols:
f( <--------------------- ## Evaluation frame of the nested call to f()
Map(f( |
parent.frame(n=1 |
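The same distinction can be reproduced in a stripped-down sketch. None of these names come from the original code: make_late and make_early are invented, and call_via stands in for the extra call layer that Map()/mapply() adds. Looking up parent.frame() when the closure is called finds the intermediate frame, while looking it up when the closure is created finds the frame you actually want.

```r
make_late <- function() function() parent.frame()  # looked up each time the closure is called

make_early <- function() {
  e <- parent.frame()  # looked up once, while make_early itself is running
  function() e
}

outer <- function() {
  marker <- "outer frame"
  call_via <- function(g) g()  # one extra call layer, standing in for Map/mapply
  late_env  <- call_via(make_late())
  early_env <- call_via(make_early())
  # Is `marker` directly in the frame each closure reported?
  c(late  = exists("marker", envir = late_env,  inherits = FALSE),
    early = exists("marker", envir = early_env, inherits = FALSE))
}

found <- outer()
print(found)  # late = FALSE, early = TRUE
```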
Consider the following simple function:
f <- function(x, value){print(x);print(substitute(value))}
Argument x will eventually be evaluated by print, but value never will. So we can get results like this:
> f(a, a)
Error in print(x) : object 'a' not found
> f(3, a)
[1] 3
a
> f(1+1, 1+1)
[1] 2
1 + 1
> f(1+1, 1+"one")
[1] 2
1 + "one"
Everything as expected.
Now consider the same function body in a replacement function:
'g<-' <- function(x, value){print(x);print(substitute(value))}
(the single quotes should be fancy quotes)
Let's try it:
> x <- 3
> g(x) <- 4
[1] 3
[1] 4
Nothing unusual so far...
> g(x) <- a
Error: object 'a' not found
This is unexpected. Name a should be printed as a language object.
> g(x) <- 1+1
[1] 4
1 + 1
This is ok, as x's former value is 4. Notice the expression passed unevaluated.
The final test:
> g(x) <- 1+"one"
Error in 1 + "one" : non-numeric argument to binary operator
Wait a minute... Why did it try to evaluate this expression?
Well the question is: bug or feature? What is going on here? I hope some guru users will shed some light about promises and lazy evaluation on R. Or we may just conclude it's a bug.
We can reduce the problem to a slightly simpler example:
g <- function(x, value) x
'g<-' <- function(x, value) x
x <- 3
# Works
g(x, a)
`g<-`(x, a)
# Fails
g(x) <- a
This suggests that R is doing something special when evaluating a replacement function: I suspect it evaluates all arguments. I'm not sure why, but the comments in the C code (https://github.com/wch/r-source/blob/trunk/src/main/eval.c#L1656 and https://github.com/wch/r-source/blob/trunk/src/main/eval.c#L1181) suggest it may be to make sure other intermediate variables are not accidentally modified.
Luke Tierney has a long comment about the drawbacks of the current approach, and illustrates some of the more complicated ways replacement functions can be used:
There are two issues with the approach here:
A complex assignment within a complex assignment, like
f(x, y[] <- 1) <- 3, can cause the value temporary
variable for the outer assignment to be overwritten and
then removed by the inner one. This could be addressed by
using multiple temporaries or using a promise for this
variable as is done for the RHS. Printing of the
replacement function call in error messages might then need
to be adjusted.
With assignments of the form f(g(x, z), y) <- w the value
of z will be computed twice, once for a call to g(x, z)
and once for the call to the replacement function g<-. It
might be possible to address this by using promises.
Using more temporaries would not work as it would mess up
replacement functions that use substitute and/or
nonstandard evaluation (and there are packages that do
that -- igraph is one).
I think the key may be found in this comment beginning at line 1682 of "eval.c" (and immediately followed by the evaluation of the assignment operation's RHS):
/* It's important that the rhs get evaluated first because
assignment is right associative i.e. a <- b <- c is parsed as
a <- (b <- c). */
PROTECT(saverhs = rhs = eval(CADR(args), rho));
We expect that if we do g(x) <- a <- b <- 4 + 5, both a and b will be assigned the value 9; this is in fact what happens.
Apparently, the way that R ensures this consistent behavior is to always evaluate the RHS of an assignment first, before carrying out the rest of the assignment. If that evaluation fails (as when you try something like g(x) <- 1 + "a"), an error is thrown and no assignment takes place.
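A small sketch of that difference, with an illustrative replacement function that never uses value: calling `g<-` directly leaves the second argument as an unforced promise, while the assignment form evaluates the right-hand side before the replacement function is even called.

```r
`g<-` <- function(x, value) x  # replacement function that never uses `value`
x <- 3

# Direct call: `value` stays an unforced promise, so stop() never runs
direct <- tryCatch({
  `g<-`(x, stop("RHS was evaluated"))
  "RHS was not evaluated"
}, error = function(e) conditionMessage(e))

# Assignment form: the right-hand side is evaluated first, so stop() runs
assigned <- tryCatch({
  g(x) <- stop("RHS was evaluated")
  "RHS was not evaluated"
}, error = function(e) conditionMessage(e))

print(direct)    # "RHS was not evaluated"
print(assigned)  # "RHS was evaluated"
```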
I'm going to go out on a limb here, so please, folks with more knowledge feel free to comment/edit.
Note that when you run
'g<-' <- function(x, value){print(x);print(substitute(value))}
x <- 1
g(x) <- 5
a side effect is that 5 is assigned to x. Hence, both must be evaluated. But if you then run
'g<-'(x,10)
both the values of x and 10 are printed, but the value of x remains the same.
Speculation:
So the parser is distinguishing between whether you call g<- in the course of making an actual assignment, and when you simply call g<- directly.