Name masking in R function - Advanced R by Hadley - r

I came across this example in Advanced R by Hadley. My question is: after defining the function, why does j(1) output the inner function definition, as opposed to what j(1)() outputs? Intuitively, I think j(1) should output [1] 1 2
Could anyone explain what's going on actually? What's the difference between j(1) and j(1)() ?
> j <- function(x) {
+ y <- 2
+ function() {
+ c(x,y)
+ }
+ }
> k <- j(1)
> k()
[1] 1 2
> j(1)
function() {
c(x,y)
}
<environment: 0x7fa184353bf8>
> j()
function() {
c(x,y)
}
<environment: 0x7fa18b5ad0d0>
> j(1)()
[1] 1 2

tl;dr In R, the return value of a function can also be a function. That's the case here. j(1) returns a function, whereas j(1)() returns a numeric vector.
The difference between j(1) and j(1)() is that j(1) returns a function, because that is the last expression in the definition of j. Functions return the value of their last expression (or the value passed to a return() call that is reached), which in this case is itself a function. j(1)() calls the function returned by j(1). That function takes no arguments, so the empty parentheses () are its argument list.
It might become a bit more clear if we have a closer look at j and some of its properties.
j <- function(x) {
y <- 2
function() {
c(x, y)
}
}
The difference between the calls becomes quite apparent when we look at their classes.
class(j(1))
# [1] "function"
class(j(1)())
# [1] "numeric"
When you defined j, the value 2 was effectively hard-coded into the function it returns, as the second element of the vector that function produces. We can see the precise return value of a call to j(1) with
library(pryr)
unenclose(j(1))
# function ()
# {
# c(1, 2)
# }
So a call to j(1)() (or k()) will deliver the vector c(1, 2). Similarly, if we call j(5), the return value of j(5)() is c(5, 2)
unenclose(j(5))
# function ()
# {
# c(5, 2)
# }
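As a rough cross-check that does not need pryr, the same values can be read from the environment attached to the returned function. This is only a sketch; the name env is introduced here for illustration.

```r
j <- function(x) {
  y <- 2
  function() {
    c(x, y)
  }
}

k <- j(1)

# The closure keeps a reference to the environment created by the call j(1)
env <- environment(k)

sort(ls(env))          # names bound in that environment: x and y
get("y", envir = env)  # the local variable y
get("x", envir = env)  # the argument x (reading it forces the promise)

is.function(j(1))      # j(1) is a function ...
k()                    # ... and calling it yields the numeric vector
```

Each call to j() creates a new environment, so j(5) would hold x = 5 in a different environment from the one attached to k.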
Hope that helps.
Credit to @Khashaa for mentioning the unenclose() function (comment deleted).

Related

R - modify an unknown function

I have a function f of some vector x
The function in R is written as :
f <- function(x) {
  # insert some function of x here
}
I would like to return (-f), which denotes the negative of the function. In the case the function itself is known beforehand, this is a simple exercise.
However, in this case, I don't know what this function is
Could someone please help me with the R code to carry this out? (The output needs to be a function in the vector x.)
An example would be - f(x) = x + 1, then -f(x) = -x - 1
Thank you!
The following function getNegFn() takes a function fn and returns a function which returns the negative return value of fn:
getNegFn <- function(fn){
fnOut <- function(){
- do.call(what=fn, args=as.list(match.call())[-1])
}
formals(fnOut) <- formals(fn)
fnOut
}
An example:
fn <- function(x) x + 1
nFn <- getNegFn(fn=fn)
fn(1)
[1] 2
nFn(1)
[1] -2
Also works if the input function has ... arguments:
fn2 <- function(x, ...) x + sum(unlist(list(...)))
nFn2 <- getNegFn(fn=fn2)
fn2(x=1, y=2)
[1] 3
nFn2(x=1, y=2)
[1] -3
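A shorter variant, offered as a sketch rather than a replacement for getNegFn(): if the returned function does not need to keep the formals of fn, forwarding ... is enough. The name negate_fn is made up for this example.

```r
# Hypothetical helper: returns a function that calls fn and negates the result
negate_fn <- function(fn) {
  force(fn)  # evaluate fn now, so later changes to the caller's variable don't matter
  function(...) -fn(...)
}

fn <- function(x) x + 1
neg_fn <- negate_fn(fn)

fn(1)
neg_fn(1)

# Extra arguments are passed through unchanged
fn2 <- function(x, ...) x + sum(unlist(list(...)))
neg_fn2 <- negate_fn(fn2)
neg_fn2(x = 1, y = 2)
```

The trade-off is that args(neg_fn) only shows ..., whereas getNegFn() preserves the original argument names.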
Look into do.call(). You can give it a function's name as a string, and that function will then be called with the arguments you provide (in a list()). So, for example, you can do:
funcname<-"mean"
dat<-1:5
do.call(funcname, list(dat))*(-1)
This runs as if you had called mean(dat)*(-1), which gives:
[1] -3
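For illustration, do.call() also accepts the function object itself, and named list elements become named arguments. A small sketch (the variable names are chosen here, not taken from the answer):

```r
dat <- 1:5

# Function given by name (a string) or as an object: both call mean(dat)
by_name   <- do.call("mean", list(dat))
by_object <- do.call(mean, list(dat))

# Named list elements are matched to argument names, here x and na.rm
with_named <- do.call(mean, list(x = c(dat, NA), na.rm = TRUE))

-by_name
-by_object
-with_named
```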

Simplify test where one of pregiven list element values is not missing and is TRUE

I would like to ask this, because it is hard to search for. Is there a more efficient way to write the following:
a <- list(x=FALSE,z=TRUE,l=list()) # a$y is not defined, list contains also lists
f <- function() 1
if(!is.null(a$x)) { if(a$x==TRUE) f() }
if(!is.null(a$y)) { if(a$y==TRUE) f() }
if(!is.null(a$z)) { if(a$z==TRUE) f() }
[1] 1
The idea is that if any of the pre-given list elements x, y or z has the value TRUE, function f() is called, and otherwise not.
The aim is to run function f() only once, and to write the function call f() only once in the code. Function f() is run if one of the conditions x, y or z holds. The conditions are stored in the list a, which also contains other elements. However, list a might not contain all conditions, only some of them, in which case the missing conditions count as false.
EDIT:
I found a quite convenient solution:
for (b in c("x","y","z")) {
if (!is.null(a[[b]]) & c(a[[b]], F)[1] == T) {
print(f())
break
}
}
but in order to prevent this error:
if(!is.null(a[["y"]]) & a[["y"]] == T) 1
Error in if (!is.null(a[["y"]]) & a[["y"]] == T) 1 :
argument is of length zero
I had to make a coalesce-like solution c(a[["y"]],F)[1]:
if(!is.null(a[["y"]]) & c(a[["y"]],F)[1] == T) 1
which works, but does not look so nice, and I am not sure whether the following condition will always work, even if it does here (?):
> c(NULL,1) == c(1)
[1] TRUE
Since there is a list embedded inside another list, i.e. nested lists, we would rather use rapply:
a <- list(x=FALSE,z=TRUE,l=list())
f <- function() 1
rapply(a,function(k)if(!is.null(k)&k==T)f())
z
1
b=list(x=FALSE,Z=TRUE,l=list(x=TRUE,y=TRUE))
rapply(b,function(k)if(!is.null(k)&k==T)f())
Z l.x l.y
1 1 1
With this kind of data:
a <- list(x=FALSE,z=TRUE,l=list()) # a$y is not defined, list contains also lists
f <- function() 1
the aim was to run function f() if one of the conditions x, y or z holds. Thanks to a hint from @Onyambu, it is only necessary to test the conditions that are found in the list, because conditions that are not found count as false. Below are the steps used in the final solution (the last line is the one that is actually executed):
> intersect(names(a),c("x","y","z"))
[1] "x" "z"
> a[intersect(names(a),c("x","y","z"))]
$x
[1] FALSE
$z
[1] TRUE
> unlist(a[intersect(names(a),c("x","y","z"))])
x z
FALSE TRUE
> any(unlist(a[intersect(names(a),c("x","y","z"))]))
[1] TRUE
> if (any(unlist(a[intersect(names(a),c("x","y","z"))]))) f()
[1] 1
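The final line could be wrapped in a small helper, sketched below. The function name any_flag_set and the use of isTRUE() are assumptions of this example, not part of the solution above; isTRUE() returns FALSE for NULL, NA and anything that is not a single TRUE, which avoids the "argument is of length zero" error from the question.

```r
# Hypothetical helper: TRUE if any of the named elements exists and is TRUE
any_flag_set <- function(lst, keys) {
  present <- intersect(names(lst), keys)          # keys that are actually in the list
  any(vapply(lst[present], isTRUE, logical(1)))   # FALSE when nothing is present
}

a <- list(x = FALSE, z = TRUE, l = list())  # a$y is not defined
f <- function() 1

isTRUE(a[["y"]])   # a missing element is simply not TRUE
isTRUE(a[["z"]])

if (any_flag_set(a, c("x", "y", "z"))) f()
```

For a single test, the short-circuiting operator && would also avoid the error, because the right-hand side is not evaluated when !is.null(a[["y"]]) is FALSE.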

Unevaluated argument in R

I am still a novice in R, and still trying to understand lazy evaluation. I read quite a few threads on SO (R functions that pass on unevaluated arguments to other functions), but I am still not sure.
Question 1:
Here's my code:
f <- function(x = ls()) {
a<-1
#x ##without x
}
f(x=ls())
When I execute this code, i.e. f(), nothing is returned. Specifically, I don't see the value of a. Why is that?
Question 2:
Moreover, I do see the value of a in this code:
f <- function(x = ls()) {
a<-1
x ##with x
}
f(x=ls())
When I execute the function by f() I get :
[1] "a" "x"
Why is it so? Can someone please help me?
Question 1
This has nothing to do with lazy evaluation.
A function returns the result of the last statement it executed. In this case the last statement was a <- 1. The value of the expression a <- 1 is 1. You could for example do b <- a <- 1, which would result in b being equal to 1. So, in this case your function returns 1.
> f <- function(x = ls()) {
+ a<-1
+ }
> b <- f(x=ls())
> print(b)
[1] 1
The argument x is nowhere used, and so doesn't play any role.
Functions can return values visibly (the default) or invisibly. In order to return invisibly the function invisible can be used. An example:
> f1 <- function() {
+ 1
+ }
> f1()
[1] 1
>
> f2 <- function() {
+ invisible(1)
+ }
> f2()
>
In this case f2 doesn't seem to return anything. However, it still returns the value 1. What invisible() does is suppress printing when the function is called and the result is not assigned to anything. The relevance to your example is that a <- 1 also returns its value invisibly. That is the reason your function doesn't seem to return anything. But when assigned to b above, b still gets the value 1.
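One way to check this, sketched with base R's withVisible(), which reports both the value of an expression and whether it would have been printed:

```r
f <- function(x = ls()) {
  a <- 1   # an assignment as the last expression returns its value invisibly
}

res <- withVisible(f())
res$value     # the value that f() returns
res$visible   # whether R would auto-print it at the console

b <- f()      # the value is still there when captured
b
```

Wrapping the call in parentheses, as in (f()), is another way to make the console print an invisible result.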
Question 2
First, I'll explain why you see the results you see. The a you see in your result was caused by some previous code. If we first clean the workspace, we only see f. This makes sense, as we create a variable f (a function is also a variable in R) and then call ls().
> rm(list = ls())
>
> f <- function(x = ls()) {
+ a<-1
+ x
+ }
> f(x=ls())
[1] "f"
What the function call does (at least what you would expect) is first list all variables with ls(), then pass the result to the function as x. The function then returns x, the list of all variables, which then gets printed.
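Note that the question also mentions calling f() with no argument, which behaves differently from f(x = ls()): a default argument is evaluated inside the function, whereas a supplied argument is evaluated where the call is made. A small sketch; the wrapper g and the variable b are made up for the illustration.

```r
f <- function(x = ls()) {
  a <- 1
  x
}

# Default used: ls() runs inside f, after a has been created,
# so it lists the function's own variables
from_default <- f()
from_default

# Argument supplied: ls() runs in the frame of the caller (here, g),
# so it lists the caller's variables instead
g <- function() {
  b <- 0
  f(x = ls())
}
from_caller <- g()
from_caller
```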
How this can be modified to show lazy evaluation at work
> rm(list = ls())
>
> f <- function(x) {
+ a <<- 1
+ x
+ }
>
> f(x = ls())
[1] "a" "f"
>
In this case the global assignment is used (a <<- 1), which creates a new variable a in the global workspace (not something you normally want to do).
In this case, one would still expect the result of the function call to be just f. The fact that it also shows a is caused by lazy evaluation.
Without lazy evaluation, it would first evaluate ls() (at that time only f exists in the workspace), copy that into the function with the name x. The function then returns x. In this case the ls() is evaluated before a is created.
However, with lazy evaluation, the expression ls() is only evaluated when the result of the expression is needed. In this case that is when x is evaluated as the last expression of the function body, after a <<- 1 has already run. At that time the global environment has changed (a has been created), which means that ls() also shows a.
(This is also one of the reasons why you don't want functions to change the global workspace using <<-.)
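Lazy evaluation can also be seen without touching the global workspace. In the sketch below (function names invented for the example), an argument that would raise an error is harmless as long as it is never used, and force() makes the evaluation happen at a chosen point.

```r
# x is never used, so the expression passed for it is never evaluated
ignores_x <- function(x) {
  "x was never needed"
}
ignores_x(stop("this is never evaluated"))

# force(x) evaluates the argument, so the error surfaces here
forces_x <- function(x) {
  force(x)
  "x was evaluated"
}
outcome <- tryCatch(
  forces_x(stop("boom")),
  error = function(e) conditionMessage(e)
)
outcome
```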

R programming functions

I'm learning R programming. I'm unable to understand how function within function works in R. Example:
f <- function(y) {
function() { y }
}
f()
f(2)()
I'm not able to understand why f() is not working and shows the following message:
function() { y }
<environment: 0x0000000015e8d470>
but when I use f(4)() then it shows the answer as 4.
Please explain your answer in brief so that I can understand it easily.
For a more general case, let's change the "inner" function a little bit:
f <- function(y) {
function(x) { y + x}
}
Now f(2) returns a function that adds 2 to its argument. The value 2 is kept in the function's environment:
> f(2)
function(x) { y + x}
<environment: 0x0000000015a690f0>
> environment(f(2))$y
[1] 2
... and you can change it if you really want to, but for this you need to assign a name to the output of f():
> g <- f(2)
> environment(g)$y
[1] 2
> environment(g)$y <- 3
> g(1)
[1] 4
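Why do you need a g here? Because R can only modify a named object in place: the target of a replacement such as environment(...)$y <- value must ultimately be a variable name, and f(2) is a call that creates a fresh, unnamed environment each time, so there is nothing to assign into: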
> environment(f(2))$y<-4
Error in environment(f(2))$y <- 4 :
target of assignment expands to non-language object
This won't affect the case when you use, say, f(2) only once:
> f(2)(3)
[1] 5
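To illustrate that every call to f() produces its own environment, here is a short sketch. The names add2 and add10 are invented, and the force(y) line is an addition to the original f (a common precaution in function factories, not something the answer uses).

```r
f <- function(y) {
  force(y)             # evaluate y now rather than at the first call of the inner function
  function(x) y + x
}

add2  <- f(2)
add10 <- f(10)

add2(1)
add10(1)

# Each returned function carries a different environment, each with its own y
identical(environment(add2), environment(add10))
environment(add2)$y
environment(add10)$y
```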
The reason is that the inner function is the return value of the outer one, so by just calling f(), the result is a function. To get the value, call it twice, as in f(3)(), with empty parentheses for the second call because the inner function doesn't take any arguments.

Using the same argument names for a function defined inside another function

Why does
f <- function(a) {
g <- function(a=a) {
return(a + 2)
}
return(g())
}
f(3) # Error in a + 2: 'a' is missing
cause an error? It has something to do with the a=a argument, particularly with the fact that the variable names are the same. What exactly is going on?
Here are some similar pieces of code that work as expected:
f <- function(a) {
g <- function(a) {
return(a + 2)
}
return(g(a))
}
f(3) # 5
f <- function(a) {
g <- function(g_a=a) {
return(g_a + 2)
}
return(g())
}
f(3) # 5
g <- function(a) a + 2
f <- function(a) g(a)
f(3) # 5
The problem is that, as explained in the R language definition:
The default arguments to a function are evaluated in the evaluation frame of the function.
In your first code block, when you call g() without any arguments, it falls back on its default value of a, which is a. Evaluating that in the "frame of the function" (i.e. the environment created by the call to g()), it finds an argument whose name matches the symbol a, and its value is a. When it looks for the value of that a, it finds an argument whose name matches that symbol, and whose value is a. When...
As you can see, you're stuck in a loop, which is what the error message is trying to tell you:
Error in g() :
promise already under evaluation: recursive default argument reference or
earlier problems?
Your second attempt, which calls g(a) works as you expected, because you've supplied an argument, and, as explained in the same section of R-lang:
The supplied arguments to a function are evaluated in the evaluation frame of the calling function.
There it finds a symbol a, which is bound to whatever value you passed in to the outer function's formal argument a, and all is well.
The problem is the a=a part. An argument can't be its own default. That is a circular reference.
This example may help clarify how it works:
x <- 1
f <- function(a = x) { x <- 2; a }
f()
## [1] 2
Note that a does not have the default 1; it has the default 2. It looks first in the function itself for the default. In a similar way a=a would cause a to be its own default which is circular.
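One possible way around the circular default, shown as a sketch: drop the argument and let the inner function find a through lexical scoping. The names f_bad and f_scoped are made up here; the exact error text is not checked because it may vary with locale.

```r
# The original problem: the default of a refers to a itself
f_bad <- function(a) {
  g <- function(a = a) a + 2
  g()
}
res <- tryCatch(f_bad(3), error = function(e) e)
inherits(res, "error")   # the call fails with an error condition

# Without the argument, a is looked up in the enclosing function f_scoped
f_scoped <- function(a) {
  g <- function() a + 2
  g()
}
f_scoped(3)
```

Renaming the inner argument, as in the g_a = a version from the question, works for the same reason: the default a is then looked up in g's frame, not found there, and resolved in the enclosing frame of f.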

Resources