Creating a list of functions using a loop in R - r

In this introduction to functional programming, the author Hadley Wickham creates the following function factory:
power <- function(exponent) {
function(x) {
x ^ exponent
}
}
He then shows how this function can be used to define other functions, such as
square <- power(2)
cube <- power(3)
Now suppose I wanted to create these functions simultaneously via the following loop:
ftns <- lapply(2:3, power)
This doesn't seem to work, as 3 gets assigned to the exponent for all entries of the list:
as.list(environment(ftns[[1]]))
$exponent
[1] 3
Can someone please help me understand what's wrong with this code?
Thanks!

What you're seeing is a consequence of R's use of promises to implement lazy argument evaluation. See Promise objects.
The problem is that in the power() function the exponent argument is never evaluated, meaning the underlying promise is never called (at least not until the generated function is evaluated).
You can force the promise to be evaluated like this:
power <- function(exponent) { exponent; function(x) x^exponent; };
ftns <- lapply(2:3,power);
sapply(ftns,function(ftn) environment(ftn)$exponent);
## [1] 2 3
Without the exponent; statement to force evaluation of the promise, we see the issue:
power <- function(exponent) { function(x) x^exponent; };
ftns <- lapply(2:3,power);
sapply(ftns,function(ftn) environment(ftn)$exponent);
## [1] 3 3
Aha! The R changelog shows that this behavior was changed in 3.2.0:
* Higher order functions such as the apply functions and Reduce()
now force arguments to the functions they apply in order to
eliminate undesirable interactions between lazy evaluation and
variable capture in closures. This resolves PR#16093.
It's classified as a new feature.

Related

rlang: Error: Can't convert a function to a string

I created a function to convert a function name to string. Version 1 func_to_string1 works well, but version 2 func_to_string2 doesn't work.
func_to_string1 <- function(fun){
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string2 <- function(fun){
is.function(fun)
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string1 works:
> func_to_string1(sum)
[1] "sum"
func_to_string2 doesn't work.
> func_to_string2(sum)
Error: Can't convert a primitive function to a string
Call `rlang::last_error()` to see a backtrace
My guess is that by calling the fun before converting it to a string, it gets evaluated inside function and hence throw the error message. But why does this happen since I didn't do any assignments?
My questions are why does it happen and is there a better way to convert function name to string?
Any help is appreciated, thanks!
This isn't a complete answer, but I don't think it fits in a comment.
R has a mechanism called pass-by-promise,
whereby a function's formal arguments are lazy objects (promises) that only get evaluated when they are used.
Even if you didn't perform any assignment,
the call to is.function uses the argument,
so the promise is "replaced" by the result of evaluating it.
Nevertheless, in my opinion, this seems like an inconsistency in rlang*,
especially given cory's answer,
which implies that R can still find the promise object even after a given parameter has been used;
the mechanism to do so might not be part of R's public API though.
*EDIT: see coments.
Regardless, you could treat enexpr/enquo/ensym like base::missing,
in the sense that you should only use them with parameters you haven't used at all in the function's body.
Maybe use this instead?
func_to_string2 <- function(fun){
is.function(fun)
deparse(substitute(fun))
#print(rlang::as_string(rlang::enexpr(fun)))
}
> func_to_string2(sum)
[1] "sum"
This question brings up an interesting point on lazy evaluations.
R arguments are lazily evaluated, meaning the arguments are not evaluated until its required.
This is best understood in the Advanced R book which has the following example,
f <- function(x) {
10
}
f(stop("This is an error!"))
the result is 10, which is surprising because x is never called and hence never evaluated. We can force x to be evaluated by using force()
f <- function(x) {
force(x)
10
}
f(stop("This is an error!"))
This behaves as expected. In fact we dont even need force() (Although it is good to be explicit).
f <- function(x) {
x
10
}
f(stop("This is an error!"))
This what is happening with your call here. The function sum which is a symbol initially is being evaluated with no arguments when is.function() is being called. In fact, even this will fail.
func_to_string2 <- function(fun){
fun
print(rlang::as_string(rlang::ensym(fun)))
}
Overall, I think its best to use enexpr() at the very beginning of the function.
Source:
http://adv-r.had.co.nz/Functions.html

Where are function constants stored if a function is created inside another function?

I am using a parent function to generate a child function by returning the function in the parent function call. The purpose of the parent function is to set a constant (y) in the child function. Below is a MWE. When I try to debug the child function I cannot figure out in which environment the variable is stored in.
power=function(y){
return(function(x){return(x^y)})
}
square=power(2)
debug(square)
square(3)
debugging in: square(3)
debug at #2: {
return(x^y)
}
Browse[2]> x
[1] 3
Browse[2]> y
[1] 2
Browse[2]> ls()
[1] "x"
Browse[2]> find('y')
character(0)
If you inspect the type of an R function, you’ll observe the following:
> typeof(square)
[1] "closure"
And that is, in fact, exactly the answer to your question: a closure is a function that carries an environment around.
R also tells you which environment this is (albeit not in a terribly useful way):
> square
function(x){return(x^y)}
<environment: 0x7ffd9218e578>
(The exact number will differ with each run — it’s just a memory address.)
Now, which environment does this correspond to? It corresponds to a local environment that was created when we executed power(2) (a “stack frame”). As the other answer says, it’s now the parent environment of the square function (in fact, in R every function, except for certain builtins, is associated with a parent environment):
> ls(environment(square))
[1] "y"
> environment(square)$y
[1] 2
You can read more about environments in the chapter in Hadley’s Advanced R book.
Incidentally, closures are a core feature of functional programming languages. Another core feature of functional languages is that every expression is a value — and, by implication, a function’s (return) value is the value of its last expression. This means that using the return function in R is both unnecessary and misleading!1 You should therefore leave it out: this results in shorter, more readable code:
power = function (y) {
function (x) x ^ y
}
There’s another R specific subtlety here: since arguments are evaluated lazily, your function definition is error-prone:
> two = 2
> square = power(two)
> two = 10
> square(5)
[1] 9765625
Oops! Subsequent modifications of the variable two are reflected inside square (but only the first time! Further redefinitions won’t change anything). To guard against this, use the force function:
power = function (y) {
force(y)
function (x) x ^ y
}
force simply forces the evaluation of an argument name, nothing more.
1 Misleading, because return is a function in R and carries a slightly different meaning compared to procedural languages: it aborts the current function exectuion.
The variable y is stored in the parent environment of the function. The environment() function returns the current environment, and we use parent.env() to get the parent environment of a particular environment.
ls(envir=parent.env(environment())) #when using the browser
The find() function doesn't seem helpful in this case because it seems to only search objects that have been attached to the global search path (search()). It doesn't try to resolve variable names in the current scope.

global variable

I have one question about using global variable in R. I write two examples
First version:
a <- 1
fun <- function(b){
return(a+b)
}
fun(b)
Second version:
a <- 1
fun <- function(a,b){
return(a+b)
}
fun(a,b)
I want to know which version is correct or recommand.
Relying on global state inside functions is frowned upon for various reasons (which can be roughly grouped under the caption encapsulation. So the second version would be superior in most scenarios.
However, the situation changes once the variable is defined inside a non-global environment. In that case, you’ve encapsulated your state into something that’s put away neatly. This is sometimes useful, because it allows you to create functions based on some input.
The classical example is something like this:
adder = function (value_to_add) {
function (x) {
x + value_to_add
}
}
This may look obscure but it’s simply a function that returns another function: you can use it to create functions. Here, for example, we create a function that takes one argument and adds the value 5 to it:
add5 = adder(5)
And here’s one that adds π to its argument:
add_pi = adder(pi)
Both of these are normal functions:
> add5(10)
[1] 15
> add_pi(10)
[1] 13.14159
Both of these functions, add5 and add_pi, access a variable, value_to_add, that’s outside of the function itself, in a separate environment. And it’s important to realise that these are different environments from each other: add5’s value_to_add is a different value, in a different environment, from add_pi’s value_to_add:
> environment(add5)$value_to_add
[1] 5
> environment(add_pi)$value_to_add
[1] 3.141593
(environment(f) allows you to inspect the environment to which a function f belongs. $ is used to access names inside that environment.)

find all functions in a package that use a function

I would like to find all functions in a package that use a function. By functionB "using" functionA I mean that there exists a set of parameters such that functionA is called when functionB is given those parameters.
Also, it would be nice to be able to control the level at which the results are reported. For example, if I have the following:
outer_fn <- function(a,b,c) {
inner_fn <- function(a,b) {
my_arg <- function(a) {
a^2
}
my_arg(a)
}
inner_fn(a,b)
}
I might or might not care to have inner_fn reported. Probably in most cases not, but I think this might be difficult to do.
Can someone give me some direction on this?
Thanks
A small step to find uses of functions is to find where the function name is used. Here's a small example of how to do that:
findRefs <- function(pkg, fn) {
ns <- getNamespace(pkg)
found <- vapply(ls(ns, all.names=TRUE), function(n) {
f <- get(n, ns)
is.function(f) && fn %in% all.names(body(f))
}, logical(1))
names(found[found])
}
findRefs('stats', 'lm.fit')
#[1] "add1.lm" "aov" "drop1.lm" "lm" "promax"
...To go further you'd need to analyze the body to ensure it is a function call or the FUN argument to an apply-like function or the f argument to Map etc... - so in the general case, it is nearly impossible to find all legal references...
Then you should really also check that getting the name from that function's environment returns the same function you are looking for (it might use a different function with the same name)... This would actually handle your "inner function" case.
(Upgraded from a comment.) There is a very nice foodweb function in Mark Bravington's mvbutils package with a lot of this capability, including graphical representations of the resulting call graphs. This blog post gives a brief description.

Can an R function access its own name?

Can you write a function that prints out its own name?
(without hard-coding it in, obviously)
You sure can.
fun <- function(x, y, z) deparse(match.call()[[1]])
fun(1,2,3)
# [1] "fun"
You can, but just in case it's because you want to call the function recursively see ?Recall which is robust to name changes and avoids the need to otherwise process to get the name.
Recall package:base R Documentation
Recursive Calling
Description:
‘Recall’ is used as a placeholder for the name of the function in
which it is called. It allows the definition of recursive
functions which still work after being renamed, see example below.
As you've seen in the other great answers here, the answer seems to be "yes"...
However, the correct answer is actually "yes, but not always". What you can get is actually the name (or expression!) that was used to call the function.
First, using sys.call is probably the most direct way of finding the name, but then you need to coerce it into a string. deparse is more robust for that.
myfunc <- function(x, y=42) deparse(sys.call()[[1]])
myfunc (3) # "myfunc"
...but you can call a function in many ways:
lapply(1:2, myfunc) # "FUN"
Map(myfunc, 1:2) # (the whole function definition!)
x<-myfunc; x(3) # "x"
get("myfunc")(3) # "get(\"myfunc\")"
The basic issue is that a function doesn't have a name - it's just that you typically assign the function to a variable name. Not that you have to - you can have anonymous functions - or assign many variable names to the same function (the x case above).

Resources