meta-programming and the substitute function in R - r

f <- function() {
x <- 6 + 4
substitute(x)
}
f()
The above will output:
[1] 10
However, the below:
x <- 6 + 4
substitute(x)
outputs:
x
Why are they different?

#akrun's answer demonstrates how to get it to resolve, but I think the answer to your question of "Why?" is in ?substitute, where it says in the Details:
If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.
(Emphasis mine.) When you are executing this on the default prompt >, you are in the global environment. Not so in your first example, within the function's namespace. (As to "Why did R-core decide on this behavior?", I do not think I am qualified to answer or even speculate.)

The evaluation is not happening
eval(substitute(x))
#[1] 10
As #r2evans showed the documentation description, we can test it on a new environment to see this in action
# create the object in another environment
e1 <- new.env()
e1$x <- 6 + 4
substitute(x) # here x is looked in the global space
#x
substitute(x, env = e1) # specify the `env` and looks for the local env
#[1] 10

Related

Manipulating enclosing environment of a function

I'm trying to get a better understanding of closures, in particular details on a function's scope and how to work with its enclosing environment(s)
Based on the Description section of the help page on rlang::fn_env(), I had the understanding, that a function always has access to all variables in its scope and that its enclosing environment belongs to that scope.
But then, why isn't it possible to manipulate the contents of the closure environment "after the fact", i.e. after the function has been created?
By means of R's lexical scoping, shouldn't bar() be able to find x when I put into its enclosing environment?
foo <- function(fun) {
env_closure <- rlang::fn_env(fun)
env_closure$x <- 5
fun()
}
bar <- function(x) x
foo(bar)
#> Error in fun(): argument "x" is missing, with no default
Ah, I think I got it down now.
It has to do with the structure of a function's formal arguments:
If an argument is defined without a default value, R will complain when you call the function without specifiying that even though it might technically be able to look it up in its scope.
One way to kick off lexical scoping even though you don't want to define a default value would be to set the defaults "on the fly" at run time via rlang::fn_fmls().
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fun()
}
# No argument at all -> lexical scoping takes over
baz <- function() x
foo(baz)
#> [1] 5
# Set defaults to desired values on the fly at run time of `foo()`
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fmls <- rlang::fn_fmls(fun)
fmls$x <- substitute(get("x", envir = env_enclosing, inherits = FALSE))
rlang::fn_fmls(fun) <- fmls
fun()
}
bar <- function(x) x
foo(bar)
#> [1] 5
I can't really follow your example as I am unfamiliar with the rlang library but I think a good example of a closure in R would be:
bucket <- function() {
n <- 1
foo <- function(x) {
assign("n", n+1, envir = parent.env(environment()))
n
}
foo
}
bar <- bucket()
Because bar() is define in the function environment of bucket then its parent environment is bucket and therefore you can carry some data there. Each time you run it you modify the bucket environment:
bar()
[1] 2
bar()
[1] 3
bar()
[1] 4

Why must local({...}) be defined using two rounds of expression quoting?

I'm trying to understand how R's local function is working. With it, you can open a temporary local scope, which means what happens in local (most notably, variable definitions), stays in local. Only the last value of the block is returned to the outside world. So:
x <- local({
a <- 2
a * 2
})
x
## [1] 4
a
## Error: object 'a' not found
local is defined like this:
local <- function(expr, envir = new.env()){
eval.parent(substitute(eval(quote(expr), envir)))
}
As I understand it, two rounds of expression quoting and subsequent evaluation happen:
eval(quote([whatever expr input]), [whatever envir input]) is generated as an unevaluated call by substitute.
The call is evaluated in local's caller frame (which is in our case, the Global Environment), so
[whatever expr input] is evaluated in [whatever envir input]
However, I do not understand why step 2 is nessecary. Why can't I simply define local like this:
local2 <- function(expr, envir = new.env()){
eval(quote(expr), envir)
}
I would think it evaluates the expression expr in an empty environment? So any variable defined in expr should exist in envir and therefore vanish after the end of local2?
However, if I try this, I get:
x <- local2({
a <- 2
a * 2
})
x
## [1] 4
a
## [1] 2
So a leaks to the Global Environment. Why is this?
EDIT: Even more mysterious: Why does it not happen for:
eval(quote({a <- 2; a*2}), new.env())
## [1] 4
a
## Error: object 'a' not found
Parameters to R functions are passed as promises. They are not evaluated unless the value is specifically requested. So look at
# clean up first
if exists("a") rm(a)
f <- function(x) print(1)
f(a<-1)
# [1] 1
a
# Error: object 'a' not found
g <- function(x) print(x)
g(a<-1)
# [1] 1
a
# [1] 1
Note that in the g() function, we are using the value passed to the function which is that assignment to a so that creates a in the global environment. With f(), that variable is never created because that function parameter remained a promise end was never evaluated.
If you want to access a parameter without evaluating it, you need to use something like match.call() or subsititute(). The local() function does the latter.
If you remove the eval.parent(), you'll see that the substitute puts the un-evaluated expression from the parameter into a new call to eval().
h <- function(expr, envir = new.env()){
substitute(eval(quote(expr), envir))
}
h(a<-1)
# eval(quote(a <- 1), new.env())
Where as if you do
j<- function(x) {
quote(x)
}
j(a<-1)
# x
you are not really creating a new function call. Further more when you eval() that expression, you are triggering the evaluation of x from it's original calling environment (triggering the evaluation of the promise), not evaluating the expression in a new environment.
local() then uses the eval.parent() so that you can use existing variables in the environment within your block. For example
b<-5
local({
a <- b
a * 2
})
# [1] 10
Look at the behaviors here
local2 <- function(expr, envir = new.env()){
eval(quote(expr), envir)
}
local2({a<-5; a})
# [1] 5
local2({a<-5; a}, list(a=100, expr="hello"))
# [1] "hello"
See how when we use a non-empty environment, the eval() is looking up expr in the environment, it's not evaluating your code block in the environment.

Unevaluated argument in R

I still a novice in R, and still understanding lazy evaluation. I read quite a few threads on SO (R functions that pass on unevaluated arguments to other functions), but I am still not sure.
Question 1:
Here's my code:
f <- function(x = ls()) {
a<-1
#x ##without x
}
f(x=ls())
When I execute this code i.e. f(), nothing returns. Specifically, I don't see the value of a. Why is it so?
Question 2:
Moreover, I do see the value of a in this code:
f <- function(x = ls()) {
a<-1
x ##with x
}
f(x=ls())
When I execute the function by f() I get :
[1] "a" "x"
Why is it so? Can someone please help me?
Question 1
This has nothing to do with lazy evaluation.
A function returns the result of the last statement it executed. In this case the last statement was a <- 1. The result of a <- 1 is one. You could for example do b <- a <- 1 which would result in b being equal to 1. So, in this case you function returns 1.
> f <- function(x = ls()) {
+ a<-1
+ }
> b <- f(x=ls())
> print(b)
[1] 1
The argument x is nowhere used, and so doesn't play any role.
Functions can return values visibly (the default) or invisibly. In order to return invisibly the function invisible can be used. An example:
> f1 <- function() {
+ 1
+ }
> f1()
[1] 1
>
> f2 <- function() {
+ invisible(1)
+ }
> f2()
>
In this case f2 doesn't seem to return anything. However, it still returns the value 1. What the invisible does, is not print anything when the function is called and the result is not assigned to anything. The relevance to your example, is that a <- 1 also returns invisibly. That is the reason that your function doesn't seem to return anything. But when assigned to b above, b still gets the value 1.
Question 2
First, I'll explain why you see the results you see. The a you see in your result, was caused some previous code. If we first clean the workspace, we only see f. This makes sense as we create a variable f (a function is also a variable in R) and then do a ls().
> rm(list = ls())
>
> f <- function(x = ls()) {
+ a<-1
+ x
+ }
> f(x=ls())
[1] "f"
What the function does (at least what you would expect), if first list all variables ls() then pass the result to the function as x. This function then returns x, which is the list of all variables, which then gets printed.
How this can be modified to show lazy evaluation at work
> rm(list = ls())
>
> f <- function(x) {
+ a <<- 1
+ x
+ }
>
> f(x = ls())
[1] "a" "f"
>
In this case the global assignment is used (a <<- 1), which creates a new variable a in the global workspace (not something you normally want to do).
In this case, one would still expect the result of the function call to be just f. The fact that it also shows a is caused by lazy evaluation.
Without lazy evaluation, it would first evaluate ls() (at that time only f exists in the workspace), copy that into the function with the name x. The function then returns x. In this case the ls() is evaluated before a is created.
However, with lazy evaluation, the expression ls() is only evaluated when the result of the expression is needed. In this case that is when the function returns and the result is printed. At that time the global environment has changed (a is created), which means that ls() also shows a.
(This is also one of the reasons why you don't want functions to change the global workspace using <<-.)

R RStudio Resetting debug / function environment

I am trying to stop R from displaying function code and environment information when I call a function. This function is part of an assignment for Coursera R Programming that was provided by the instructor. Here is the behavior:
R script:
makeVector <- function(x = numeric()) {
m <- NULL
set <- function(y) {
x <<- y
m <<- NULL
}
get <- function() x
setmean <- function(mean) m <<- mean
getmean <- function() m
list(set = set, get = get,
setmean = setmean,
getmean = getmean)
}
I run the following in the console:
> x <- 1:10
> makeVector(x)
And get:
$set
function (y)
{
x <<- y
m <<- NULL
}
<environment: 0x000000000967dd58>
$get
function ()
x
<environment: 0x000000000967dd58>
$setmean
function (mean)
m <<- mean
<environment: 0x000000000967dd58>
$getmean
function ()
m
<environment: 0x000000000967dd58>
It appears RStudio is returning function code and environment information rather than executing the function. Previously I ran debug(ls) and undebug(ls) as part of a quiz - it is my hunch that the debug() command has something to do with the behavior.
To fix the problem, I have already tried:
deleting the RStudio-Desktop folder that contains RStudio settings.
This reverted my appearance and global options to default, but the
function calling behavior still happens.
uninstalling and reinstalling both R and RStudio. The behavior still happens as
above.
Does anyone know why RStudio is displaying function code and environment rather than executing the function?
I really appreciate the help! Thanks!
First of all, this has nothing to do with Rstudio: Rstudio is just an IDE, it would be very strange if it somehow managed to mess with your code, wouldn't it? The behaviour you see is completely fine and does exactly what it should. If you are familiar with OOP, what you get is an "object" with several methods. Here's a small demo that shows the intended usage:
x <- 1:10
xx <- makeVector(x)
xx$get()
# [1] 1 2 3 4 5 6 7 8 9 10
xx$getmean()
#NULL
xx$setmean(mean(x))
xx$getmean()
#[1] 5.5
xx$setmean("Hi, I am a mean")
xx$getmean()
#[1] "Hi, I am a mean"

lexical scope in R

I recently learned that R has both lexical and dynamical scoping available, but that it uses lexical scope by default. The next case really confused me:
> x <- 1
> f <- function(y) { x + y }
> f(5) # we expect 6
[1] 6
> x <- 10
> f(5) # shouldn't we again expect 6?
[1] 15
Shouldn't f be evaluated using the environment where (and at the time!) it was defined and not where it was called ? How is this lexical scope? Thanks!
f <- function(y) { x + y }
was defined in the global environment and so for the parts not defined in the function itself (i.e.x), R looks to the global environment for them.
a=1
b=2
f<-function(x)
{
a*x + b
}
g<-function(x)
{
a=2
b=1
f(x)
}
# compare f(2) and g(2)
This example above is from here and gives a good discussion. Main point being, f() within g() ignores the definitions of a and b in g().
From the wiki on "Scope"
In object-oriented programming, dynamic dispatch selects an object method at runtime, though whether the actual name binding is done at compile time or run time depends on the language.

Resources