Lazy evaluation and the promise data structure in R

I have been grappling with the application of promises since I first read about them in Advanced R. It is mentioned that a promise is the data structure that powers lazy evaluation. The concept of lazy evaluation itself is quite clear: function arguments are only evaluated when they are accessed. However, in some examples I just cannot pinpoint the presence of a promise or how/where it is evaluated. Consider the following example from Advanced R:
y <- 10
h02 <- function(x) {
  y <- 100
  x + 1
}
h02(y)
[1] 11
It returns 11 instead of 101; apparently, when a variable like y that already exists in the global environment is passed as x, it is bound and evaluated outside of the function.
So I would like to know: does a promise always involve some sort of assignment, or could every expression be a promise, and how can we detect their presence?
It is mentioned that they are evaluated in the calling environment of the function. So the second question is: are their evaluation environments different from those of normal arguments, given that user-supplied arguments are evaluated outside of the function?
There is also another example where I cannot understand why it involves lazy evaluation and why we see Calculating... only once.
double <- function(x) {
  message("Calculating...")
  x * 2
}
h03 <- function(x) {
  c(x, x)
}
h03(double(20))
Calculating...
[1] 40 40
I am sorry if I sound a bit confused here; I get the gist, but it has never quite sunk in, and I would be grateful for a little more explanation.
Thank you very much in advance.

When an object such as y within h02 is created in a function, it is created in the local execution frame/environment of that function (a new frame is created each time the function is run). The created object is distinct from an object of the same name in any other environment.
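To make this concrete, here is a minimal sketch (using pryr's promise_info(), as in the code further below, and a hypothetical variant h02_peek of the questioner's h02) showing that the promise for x carries the global environment, which is why y evaluates to 10 there and not to the local 100:
library(pryr)
y <- 10
h02_peek <- function(x) {
  y <- 100
  print(promise_info(x)$env)  # environment in which the expression `y` will be evaluated
  x + 1
}
h02_peek(y)
<environment: R_GlobalEnv>
[1] 11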
Regarding h03: once a promise is forced, i.e. evaluated, its value is stored in the promise's value component and its evaled component is set to TRUE, so on further accesses it does not have to be evaluated again.
Function arguments are promises, but other objects normally are not. pryr can be used to inspect them:
library(pryr)
f <- function(x) {
  z <- 1
  cat("is_promise(z):", is_promise(z), "\n")
  cat("is_promise(x):", is_promise(x), "\n")
  cat("before forcing - promise_info(x):\n")
  print(promise_info(x))
  force(x)
  cat("after forcing - promise_info(x):\n")
  print(promise_info(x))
  delayedAssign("w", 3)
  cat("is_promise(w):", is_promise(w), "\n")
  invisible()
}
a <- 3
f(a)
giving:
is_promise(z): FALSE
is_promise(x): TRUE
before forcing - promise_info(x):
$code
a
$env
<environment: R_GlobalEnv>
$evaled
[1] FALSE
$value
NULL
after forcing - promise_info(x):
$code
a
$env
NULL
$evaled
[1] TRUE
$value
[1] 3
is_promise(w): TRUE
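To see the forced-once behaviour from the h03 example outside of a function call, base R's delayedAssign() creates a promise directly (a minimal sketch):
delayedAssign("p", {
  message("Calculating...")
  20 * 2
})
p  # first access forces the promise and prints the message
Calculating...
[1] 40
p  # the cached value is reused; no message this time
[1] 40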

Related

Scoping order of '=' and '<-' inside a function in R

I am trying to understand how R scopes variables inside a function. Why is the output 12? Why not 4? How are a and b assigned here?
I am learning R; please explain with some references.
f1 <- function(a = {b <- 10; 2}, b = 2) {
  a + b
}
f1()
This is explained in section 4.3.3 of the R Language Definition manual.
When a function is called, each formal argument is assigned a promise
in the local environment of the call with the expression slot
containing the actual argument (if it exists) and the environment slot
containing the environment of the caller. If no actual argument for a
formal argument is given in the call and there is a default
expression, it is similarly assigned to the expression slot of the
formal argument, but with the environment set to the local
environment.
The process of filling the value slot of a promise by evaluating the
contents of the expression slot in the promise’s environment is called
forcing the promise. A promise will only be forced once, the value
slot content being used directly later on.
Nothing has a value until the sum starts getting computed. First a is required, so its expression is evaluated. The promise for b is lost because b is assigned a value directly during the forcing of a, so the default promise for b from the function definition is never evaluated at all.
If the order is the other way round, you see a different result:
f2 <- function(a = 2, b = {a <- 10; 2}) {
  a + b
}
f2()
[1] 4
However, note that the value of a will be 10 at the end of the function, but 2 when it is required during the sum. Both promises get evaluated here.
If the order of the sum in f1 is reversed to instead be b+a, you would find behaviour similar to that of f2.
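For instance (a minimal sketch of that reversed variant, under a hypothetical name):
f1_rev <- function(a = {b <- 10; 2}, b = 2) {
  b + a
}
f1_rev()
[1] 4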
Earlier in that section there is a general warning that side effects in arguments should be avoided because there is no guarantee they will ever be evaluated.
R has a form of lazy evaluation of function arguments. Arguments are
not evaluated until needed. It is important to realize that in some
cases the argument will never be evaluated. Thus, it is bad style to
use arguments to functions to cause side-effects. While in C it is
common to use the form, foo(x = y) to invoke foo with the value of y
and simultaneously to assign the value of y to x this same style
should not be used in R. There is no guarantee that the argument will
ever be evaluated and hence the assignment may not take place.
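A minimal sketch of that warning (using a hypothetical function g): because the body never uses a, the default for a is never forced and its side effect never happens:
g <- function(a = {message("assigning..."); 1}, b = 2) b
g()
[1] 2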
Refer to https://www.rdocumentation.org/packages/base/versions/3.5.2/topics/assignOps
Try this:
f1 <- function(a = {b <= 10; 2}, b = 2) {
  a + b
}
f1()
or
f1 <- function(a = {b <<- 10; 2}, b = 2) {
  a + b
}
f1()

What's the real meaning about 'Everything that exists is an object' in R?

I saw:
“To understand computations in R, two slogans are helpful:
• Everything that exists is an object.
• Everything that happens is a function call."
— John Chambers
But I just found:
a <- 2
is.object(a)
# FALSE
Actually, if a variable is of a pure base type, the result of is.object() would be FALSE. So it should not be an object.
So what's the real meaning about 'Everything that exists is an object' in R?
The function is.object only seems to check whether the object has a "class" attribute, so it does not have the same meaning as in the slogan.
For instance:
x <- 1
attributes(x) # it does not have a class attribute
NULL
is.object(x)
[1] FALSE
class(x) <- "my_class"
attributes(x) # now it has a class attribute
$class
[1] "my_class"
is.object(x)
[1] TRUE
Now, trying to answer your real question about the slogan, this is how I would put it. Everything that exists in R is an object in the sense that it is a kind of data structure that can be manipulated. I think this is better understood with functions and expressions, which are not usually thought of as data.
Taking a quote from Chambers (2008):
The central computation in R is a function call, defined by the
function object itself and the objects that are supplied as the
arguments. In the functional programming model, the result is defined
by another object, the value of the call. Hence the traditional motto
of the S language: everything is an object—the arguments, the value,
and in fact the function and the call itself: All of these are defined
as objects. Think of objects as collections of data of all kinds. The data contained and the way the data is organized depend on the class from which the object was generated.
Take this expression for example: mean(rnorm(100), trim = 0.9). Until it is evaluated, it is an object very much like any other, so you can change its elements just like you would with a list. For instance:
call <- substitute(mean(rnorm(100), trim = 0.9))
call[[2]] <- substitute(rt(100, 2))
call
mean(rt(100, 2), trim = 0.9)
Or take a function, like rnorm:
rnorm
function (n, mean = 0, sd = 1)
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
You can change its default arguments just like a simple object, like a list, too:
formals(rnorm)[2] <- 100
rnorm
function (n, mean = 100, sd = 1)
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
Quoting Chambers (2008) once more:
The key concept is that expressions for evaluation are themselves
objects; in the traditional motto of the S language, everything is an
object. Evaluation consists of taking the object representing an
expression and returning the object that is the value of that
expression.
So going back to our call example, the call is an object which represents another object. When evaluated, it becomes that other object, which in this case is the numeric vector with one number: -0.008138572.
set.seed(1)
eval(call)
[1] -0.008138572
And that would take us to the second slogan, which you did not mention, but usually comes together with the first one: "Everything that happens is a function call".
Quoting again from Chambers (2008), he actually qualifies this statement a little bit:
Nearly everything that happens in R results from a function call.
Therefore, basic programming centers on creating and refining
functions.
So what that means is that almost every transformation of data that happens in R is a function call. Even a simple thing, like a parenthesis, is a function in R.
So taking the parenthesis as an example, you can actually redefine it to do things like this:
`(` <- function(x) x + 1
(1)
[1] 2
Which is not a good idea, but it illustrates the point. So I guess this is how I would sum it up: everything that exists in R is an object because everything is data that can be manipulated. And (almost) everything that happens is a function call, which is an evaluation of such an object and gives you another object.
I love that quote.
In another (as of now unpublished) write-up, the author continues with
R has a uniform internal structure for representing all objects. The evaluation process keys off that structure, in a simple form that is essentially
composed of function calls, with objects as arguments and an object as the
value. Understanding the central role of objects and functions in R makes
use of the software more effective for any challenging application, even those where extending R is not the goal.
but then spends several hundred pages expanding on it. It will be a great read once finished.
Objects. For x to be an object means that it has a class; thus class(x) returns a class for every object. Even functions have a class, as do environments and other objects one might not expect:
class(sin)
## [1] "function"
class(.GlobalEnv)
## [1] "environment"
I would not pay too much attention to is.object. is.object(x) has a slightly different meaning than what we are using here -- it returns TRUE if x has a class name internally stored along with its value. If the class is stored then class(x) returns the stored value, and if not then class(x) will compute it from the type. From a conceptual perspective it does not matter how the class is stored internally (stored or computed) -- what matters is that in both cases x is still an object and still has a class.
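A minimal sketch of that stored-versus-computed distinction:
x <- 1
attributes(x)   # no class attribute is stored ...
## NULL
class(x)        # ... yet class() computes one from the basic type
## [1] "numeric"
is.object(x)    # which is why is.object() reports FALSE here
## [1] FALSE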
Functions. That all computation occurs through functions refers to the fact that even things you might not expect to be functions actually are. For example, when we write:
{ 1; 2 }
## [1] 2
if (pi > 0) 2 else 3
## [1] 2
1+2
## [1] 3
we are actually making invocations of the {, if and + functions:
`{`(1, 2)
## [1] 2
`if`(pi > 0, 2, 3)
## [1] 2
`+`(1, 2)
## [1] 3

Which function will identify the name of an R variable's enclosing environment?

I've been reading about R environments, and I'm trying to test my understanding with a simple example:
> f <- function() {
+ x <- 1
+ environment(x)
+ }
>
> f()
NULL
I'm assuming this means that the object x is enclosed by the environment named NULL, but when I try to list all the objects in that environment, R displays an error message:
> ls(NULL)
Error in as.environment(pos) : using 'as.environment(NULL)' is defunct
So I'm wondering if there's a built-in function I can use on the command line that will return the environment name given the object name. I tried this:
> environment(x)
Error in environment(x) : object 'x' not found
but that returned an error as well. Any help will be greatly appreciated.
Variables created in function calls are destroyed when the function finishes executing (unless you specifically create them in other, persistent environments). As #joran pointed out, when a function is called, a temporary environment is created in which local variables are defined, and it is destroyed when the function is done executing (that memory is freed). However, as #MrFlick pointed out, if the function returns a function, the returned function maintains a reference to the environment it was created in (see the sketch after the code below). You can read more about 'scope', 'stack', and 'heap'. In R there are various ways to define your variables in specific environments.
f <- function() {
  x <<- 1  # create x in the global environment (or change it if it's there)
  ## or `assign` x a value:
  ## assign("x", value = 1, envir = .GlobalEnv)
}
environment(f)  # where was f defined?
f()             # call it so the assignment actually happens
exists("x", envir = .GlobalEnv)
# [1] TRUE
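And a minimal sketch (hypothetical names) of the point above, that a returned function keeps its creation environment alive:
make_f <- function() {
  x <- 1
  function() x  # the returned function keeps a reference to this frame
}
g <- make_f()
get("x", envir = environment(g))  # the frame of make_f() still exists
# [1] 1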
The package pryr has some nice functions to do these kinds of things. For example, there is a function called where which will give you the environment of an object:
library(pryr)
f <- function() {
  x <- 1
  where("x")
}
f()
<environment: 0x0000000013356f50>
So the environment of x was the temporary environment created by the function f(). As people have said before, this environment is destroyed after you run the function, so it will give you a different result each time you run f().

Use of the <<- operator in R [duplicate]

I just finished reading about scoping in the R intro, and am very curious about the <<- assignment.
The manual showed one (very interesting) example for <<-, which I feel I understood. What I am still missing is the context of when this can be useful.
So what I would love to read from you are examples (or links to examples) of when the use of <<- can be interesting/useful. What might be the dangers of using it (it looks easy to lose track of), and any tips you might feel like sharing.
<<- is most useful in conjunction with closures to maintain state. Here's a section from a recent paper of mine:
A closure is a function written by another function. Closures are
so-called because they enclose the environment of the parent
function, and can access all variables and parameters in that
function. This is useful because it allows us to have two levels of
parameters. One level of parameters (the parent) controls how the
function works. The other level (the child) does the work. The
following example shows how we can use this idea to generate a family of
power functions. The parent function (power) creates child functions
(square and cube) that actually do the hard work.
power <- function(exponent) {
  function(x) x ^ exponent
}
square <- power(2)
square(2) # -> [1] 4
square(4) # -> [1] 16
cube <- power(3)
cube(2) # -> [1] 8
cube(4) # -> [1] 64
The ability to manage variables at two levels also makes it possible to maintain the state across function invocations by allowing a function to modify variables in the environment of its parent. The key to managing variables at different levels is the double arrow assignment operator <<-. Unlike the usual single arrow assignment (<-) that always works on the current level, the double arrow operator can modify variables in parent levels.
This makes it possible to maintain a counter that records how many times a function has been called, as the following example shows. Each time new_counter is run, it creates an environment, initialises the counter i in this environment, and then creates a new function.
new_counter <- function() {
  i <- 0
  function() {
    # do something useful, then ...
    i <<- i + 1
    i
  }
}
The new function is a closure, and its enclosing environment is the environment created when new_counter is run. When the closures counter_one and counter_two are run, each one modifies the counter in its own enclosing environment and then returns the current count.
counter_one <- new_counter()
counter_two <- new_counter()
counter_one() # -> [1] 1
counter_one() # -> [1] 2
counter_two() # -> [1] 1
It helps to think of <<- as equivalent to assign (if you set the inherits parameter in that function to TRUE). The benefit of assign is that it allows you to specify more parameters (e.g. the environment), so I prefer to use assign over <<- in most cases.
Using <<- and assign(x, value, inherits=TRUE) means that "enclosing environments of the supplied environment are searched until the variable 'x' is encountered." In other words, it will keep going through the environments in order until it finds a variable with that name, and it will assign the value to that variable. This can be within the scope of a function, or in the global environment.
In order to understand what these functions do, you need to also understand R environments (e.g. using search).
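For example, the counter closure from the earlier answer could be written with assign() instead of <<- (a minimal sketch under a hypothetical name):
new_counter2 <- function() {
  i <- 0
  function() {
    assign("i", i + 1, inherits = TRUE)  # behaves like i <<- i + 1 here
    i
  }
}
cnt <- new_counter2()
cnt()
# [1] 1
cnt()
# [1] 2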
I regularly use these functions when I'm running a large simulation and I want to save intermediate results. This allows you to create the object outside the scope of the given function or apply loop. That's very helpful, especially if you have any concern about a large loop ending unexpectedly (e.g. a database disconnection), in which case you could lose everything in the process. This would be equivalent to writing your results out to a database or file during a long running process, except that it's storing the results within the R environment instead.
My primary warning with this: be careful, because you're now working with global variables, especially when using <<-. That means that you can end up with situations where a function is using an object value from the environment when you expected it to be using one that was supplied as a parameter. This is one of the main things that functional programming tries to avoid (see side effects). I avoid this problem by assigning my values to unique variable names (using paste with a set of unique parameters) that are never used within the function, but only for caching and in case I need to recover them later on (or do some meta-analysis on the intermediate results).
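As a minimal sketch of that caching pattern (all names here are hypothetical):
run_step <- function(i) i^2  # stand-in for one expensive simulation step
run_all <- function() {
  for (i in 1:3) {
    res <- run_step(i)
    # store each intermediate result in the global environment so it survives
    # even if the loop dies before finishing
    assign(paste0("sim_result_", i), res, envir = .GlobalEnv)
  }
}
run_all()
ls(pattern = "^sim_result_")
# [1] "sim_result_1" "sim_result_2" "sim_result_3"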
One place where I used <<- was in simple GUIs using tcl/tk. Some of the initial examples have it -- as you need to make a distinction between local and global variables for statefulness. See for example
library(tcltk)
demo(tkdensity)
which uses <<-. Otherwise I concur with Marek :) -- a Google search can help.
On this subject I'd like to point out that the <<- operator will behave strangely when applied (incorrectly) within a for loop (there may be other cases too). Given the following code:
fortest <- function() {
  mySum <- 0
  for (i in c(1, 2, 3)) {
    mySum <<- mySum + i
  }
  mySum
}
you might expect that the function would return the expected sum, 6, but instead it returns 0, with a global variable mySum being created and assigned the value 3. The body of a for loop is not a new scope 'level'; what happens is that <<- never touches the local mySum. R looks outside the fortest function, can't find a mySum variable to assign to, and so creates one in the global environment the first time through the loop. On every iteration, the RHS of the assignment refers to the (unchanged) local mySum variable, which stays at 0, whereas the LHS refers to the global variable. Therefore each iteration overwrites the value of the global variable with 0 + i, hence it has the value 3 on exit from the function, while the local mySum that the function returns is still 0.
Hope this helps someone - this stumped me for a couple of hours today! (BTW, just replace <<- with <- and the function works as expected).
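For completeness, the variant with ordinary assignment behaves as described (hypothetical name):
fortest2 <- function() {
  mySum <- 0
  for (i in c(1, 2, 3)) {
    mySum <- mySum + i
  }
  mySum
}
fortest2()
# [1] 6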
# a random walk: <<- updates x in the enclosing frame on each replicate() step
f <- function(n, x0) {x <- x0; replicate(n, (function() {x <<- x + rnorm(1)})())}
plot(f(1000, 0), type = "l")
The <<- operator can also be useful for Reference Classes when writing Reference Methods. For example:
myRFclass <- setRefClass(Class = "RF",
                         fields = list(A = "numeric",
                                       B = "numeric",
                                       C = function() A + B))
myRFclass$methods(show = function() cat("A =", A, "B =", B, "C =", C))
myRFclass$methods(changeA = function() A <<- A * B)  # note the <<-
obj1 <- myRFclass(A = 2, B = 3)
obj1
# A = 2 B = 3 C = 5
obj1$changeA()
obj1
# A = 6 B = 3 C = 9
I use it to modify an object in the global environment from inside purrr::map().
a = c(1,0,0,1,0,0,0,0)
Say I want to obtain the vector c(1,2,3,1,2,3,4,5), that is, if there is a 1, keep it as 1; otherwise keep adding 1 until the next 1.
library(purrr)  # map() comes from purrr
map(
  .x = seq(1, length(a)),
  .f = function(x) {
    a[x] <<- ifelse(a[x] == 1, a[x], a[x - 1] + 1)
  })
a
[1] 1 2 3 1 2 3 4 5

What’s the environment and enclosure of nested `eval`?

Background
I’m in the process of creating a shortcut for lambdas, since the repeated use of function (…) … clutters my code considerably. As a remedy, I’m trying out alternative syntaxes inspired by other languages such as Haskell, as far as this is possible in R. Simplified, my code looks like this:
f <- function (...) {
  args <- match.call(expand.dots = FALSE)$...
  last <- length(args)
  params <- c(args[-last], names(args)[[last]])
  function (...)
    eval(args[[length(args)]],
         envir = setNames(list(...), params),
         enclos = parent.frame())
}
This allows the following code:
f(x = x * 2)(5) # => 10
f(x, y = x + y)(1, 2) # => 3
etc.
Of course the real purpose is to use this with higher-order functions¹:
Map(f(x = x * 2), 1 : 10)
The problem
Unfortunately, I sometimes have to nest higher-order functions and then it stops working:
f(x = Map(f(y = x + y), 1:2))(10)
yields “Error in eval(expr, envir, enclos): object x not found”. The conceptually equivalent code using function instead of f works. Furthermore, other nesting scenarios also work:
f(x = f(y = x + y)(2))(3) # => 5
I’m suspecting that the culprit is the parent environment of the nested f inside the map: it’s the top-level environment rather than the outer f’s. But I have no idea how to fix this, and it also leaves me puzzled that the second scenario above works. Related questions (such as this one) suggest workarounds which are not applicable in my case.
Clearly I have a gap in my understanding of environments in R. Is what I want possible at all?
¹ Of course this example could simply be written as (1 : 10) * 2. The real application is with more complex objects / operations.
The answer is to capture parent.frame() when f is called and use that captured environment as the enclosure in the returned function:
f <- function (...) {
  args <- match.call(expand.dots = FALSE)$...
  last <- length(args)
  params <- c(args[-last], names(args)[[last]])
  e <- parent.frame()
  function (...)
    eval(args[[length(args)]],
         envir = setNames(list(...), params),
         enclos = e)
}
Hopefully someone can explain well why this works and yours does not. Feel free to edit.
Great question.
Why your code fails
Your code fails because eval()'s supplied enclos= argument does not point far enough up the call stack to reach the environment in which you want it to search next for unresolved symbols.
Here is a partial diagram of the call stack at the bottom of which your call to parent.frame() occurs. (To make sense of this, it's important to keep in mind that the function call from which parent.frame() is being called here is not f(), but a call to the anonymous function returned by f() (let's call it fval).)
## Note: E.F. = "Evaluation Frame"
## fval = anonymous function returned as value of the nested call to f()
f(                            <-- E.F. you want, pointed to by parent.frame(n=3)
    Map(
        mapply(               <-- E.F. pointed to by parent.frame(n=1)
            fval(
                parent.frame(n=1)
In this particular case, redefining the function returned by f() to call parent.frame(n=3) rather than parent.frame(n=1) produces working code, but that's not a good general solution. For instance, if you wanted to call f(x = mapply(f(y = x + y), 1:2))(10), the call stack would then be one step shorter, and you'd instead need parent.frame(n=2).
Why flodel's code works
flodel's code provides a more robust solution by calling parent.frame() during evaluation of the inner call to f in the nested chain f(Map(f(), ...)) (rather than during the subsequent evaluation of the anonymous function fval returned by f()).
To understand why his parent.frame(n=1) points to the appropriate environment, it's important to recall that in R, supplied arguments are evaluated in the evaluation frame of the calling function. In the OP's example of nested code, the inner f() is evaluated during the processing of Map()'s supplied arguments, so its evaluation environment is that of the function calling Map(). Here, the function calling Map() is the outer call to f(), and its evaluation frame is exactly where you want eval() to look next for symbols:
f(                            <-- Evaluation frame of the nested call to f()
    Map(f(
            parent.frame(n=1)
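A minimal sketch (hypothetical names) of that rule, i.e. that a supplied argument is evaluated in the evaluation frame of the calling function:
callee <- function(e) e  # forcing the argument returns whatever it evaluates to
caller <- function() {
  env_seen <- callee(environment())  # the expression environment() runs in caller's frame
  identical(env_seen, environment())
}
caller()
# [1] TRUE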
