Can we pass a function as an argument - r

I'm using R to build a mathematical model. I want to write a function f(a, b, g) that takes three arguments, the last of which is a function. Can I pass a function as an argument to another function? If so, could you give me a simple example?

It is certainly legitimate to pass a function as an argument to another function. Many elementary R functions do this. For example,
tapply(..., FUN)
You can check them by ?tapply.
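The key point is that you pass the function itself, referring to it by its name (a symbol) without parentheses, so it is not called until the receiving function calls it. For example, in the toy example below: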
foo1 <- function () print("this is function foo1!")
foo2 <- function () print("this is function foo2!")
test <- function (FUN) {
  if (!is.function(FUN)) stop("argument FUN is not a function!")
  FUN()
}
## let's have a go!
test(FUN = foo1)
test(FUN = foo2)
It is also possible to pass arguments for foo1 or foo2 through test by using .... I leave that for you to explore.
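The ... mechanism mentioned above can be sketched as follows. This is a minimal illustration, not the answerer's own code; foo3 and test2 are hypothetical names introduced here.

```r
# foo3 is a hypothetical function that takes arguments of its own
foo3 <- function(x, times = 1) paste(rep(x, times), collapse = " ")

# test2 forwards everything after FUN to FUN through ...
test2 <- function(FUN, ...) {
  if (!is.function(FUN)) stop("argument FUN is not a function!")
  FUN(...)
}

res <- test2(foo3, "hi", times = 3)
print(res)
```

Anything test2 does not name itself ends up in ..., so foo3 receives "hi" and times = 3 exactly as if it had been called directly.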
If you are familiar with C, it is not difficult to understand why this is legitimate. R is written in C (though its syntax comes from the S language), so conceptually this is similar to passing a pointer to a function; at the R level, a function is simply an object that can be assigned and passed around like any other value. In case you want to learn more on this, see How do function pointers in C work?

Here's a really simple example from Hadley's text:
randomise <- function(f) f(runif(1e3))
randomise(mean)
#> [1] 0.5059199
randomise(mean)
#> [1] 0.5029048
randomise(sum)
#> [1] 504.245
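Applied to the f(a, b, g) signature from the question, the same idea might look like the sketch below. The body of f is an assumption made for illustration, since the question does not say what the model should do with g.

```r
# f applies the user-supplied function g to the two numeric inputs
f <- function(a, b, g) {
  stopifnot(is.function(g))
  g(a, b)
}

r1 <- f(2, 3, `+`)                        # a built-in operator passed as g
r2 <- f(2, 3, function(x, y) x^2 + y^2)   # an anonymous function passed as g
print(c(r1, r2))
```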

Related

Provide multiple function arguments by one variable

When working with packages like openxlsx, I often find myself writing repetitive code such as defining the wb and sheet arguments with the same values.
To respect the DRY principle, I would like to define one variable that contains multiple arguments. Then, when I call a function, I should be able to provide said variable to define multiple arguments.
Example:
foo <- list(a=1,b=2,c=3)
bar <- function(a,b,c,d) {
  return(a+b+c+d)
}
bar(foo, d=4) # should return 10
How should the foo() function be defined to achieve this?
Apparently you are just looking for do.call, which allows you to create and evaluate a call from a function and a list of arguments.
do.call(bar, c(foo, d = 4))
#[1] 10
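A slightly expanded sketch of the same do.call idea, reusing foo and bar from the question. The override step with modifyList is an addition for illustration and is not part of the original answer.

```r
bar <- function(a, b, c, d) a + b + c + d
foo <- list(a = 1, b = 2, c = 3)

# c() on a list and a named value yields one longer named list
args_all <- c(foo, d = 4)
res <- do.call(bar, args_all)            # same as bar(a = 1, b = 2, c = 3, d = 4)

# Overriding one of the stored arguments for a single call
res2 <- do.call(bar, utils::modifyList(args_all, list(a = 10)))
print(c(res, res2))
```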
"How should the foo() function be defined to achieve this?"
You've got it slightly backwards. Rather than trying to wrangle the output of foo into something that bar can accept, write foo so that it takes input in a form that is convenient to you. That is, create a wrapper function that provides all the boilerplate arguments that bar requires, without you having to specify them manually.
Example:
bar <- function(a, b, c, d) {
  return(a+b+c+d)
}
call_bar <- function(d=4) {
  bar(1, 2, 3, d)
}
call_bar(42) # shorter than writing bar(1, 2, 3, 42)
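If many such wrappers are needed, the wrapper itself can be generated. The following is a sketch only; preset is a hypothetical helper name, not a base R function.

```r
bar <- function(a, b, c, d) a + b + c + d

# preset() stores some arguments now and returns a function
# that accepts the remaining ones later
preset <- function(FUN, ...) {
  force(FUN)
  fixed <- list(...)
  function(...) do.call(FUN, c(fixed, list(...)))
}

bar_abc <- preset(bar, a = 1, b = 2, c = 3)
res <- bar_abc(d = 4)
print(res)
```

The returned function is a closure: it remembers FUN and fixed from the call to preset, so only d has to be supplied at the call site.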
I discovered a solution using rlang::exec.
First, we must have a function to structure the dots:
getDots <- function(...) {
  out <- sapply(as.list(match.call())[-1], function(x) eval(parse(text=deparse(x))))
  return(out)
}
Then we must have a function that executes our chosen function, feeding in our static parameters as a list (a, b, and c), in addition to d.
library(magrittr)  # provides the %>% pipe used below
execute <- function(FUN, ...) {
  dots <-
    getDots(...) %>%
    rlang::flatten()
  out <- rlang::exec(FUN, !!!dots)
  return(out)
}
Then, assuming abc holds the static arguments (the list named foo in the question), calling execute(bar, abc, d=4) returns 10, as it should.
Alternatively, we can write bar %>% execute(abc, d=4).
Let me give you an example!
How to get two or more return values from a function
Method 1: Use global variables. A function can modify several global variables, and those changes remain visible to the caller afterwards, which is equivalent to returning multiple values.
Method 2: Pass an array name as the parameter. Changes the function makes to the array's contents, such as sorting it or adding to and subtracting from its elements, are still in effect after the function returns, so this also hands back a set of values.
Method 3: Use pointer variables. The principle is the same as in Method 2, because an array name is itself the address of the array's first element.
Method 4: If you have learned C++, you can use reference parameters.
You can try these four methods. The problem seemed similar to yours, so I am offering them in the hope that they help.

Evaluating a function that is an argument in another function using quo() in R

I have made a function that takes another function as an argument. The argument function in turn takes as its argument some object (a vector in the example) which is supplied by the original function. It has been challenging to make the function call in the right way. Below are three approaches I have tried after reading Programming with dplyr.
Only option three works.
I would like to know if this is in fact the best way to evaluate a function within a function.
library(dplyr);library(rlang)
#Function that will be passed as an argument
EvaluateThis1 <- quo(mean(vector))
EvaluateThis2 <- ~mean(vector)
EvaluateThis3 <- quo(mean)
#First function that will receive a function as an argument
MyFunc <- function(vector, TheFunction){
  print(TheFunction)
  eval_tidy(TheFunction)
}
#Second function that will receive a function as an argument
MyFunc2 <- function(vector, TheFunction){
  print(TheFunction)
  quo(UQ(TheFunction)(vector)) %>%
    eval_tidy
}
#Option 1
#This is evaluating vector in the global environment where
#EvaluateThis1 was captured
MyFunc(1:4, EvaluateThis1)
#Option 2
#I don't know what is going on here
MyFunc(1:4, EvaluateThis2)
MyFunc2(1:4, EvaluateThis2)
#Option 3
#I think this Unquotes the function splices in the argument then
#requotes before evaluating.
MyFunc2(1:4, EvaluateThis3)
My questions are:
1. Is option 3 the best or simplest way to perform this evaluation?
2. What is actually happening in each option?
Edit
After reading @Rui Barradas's very clear and concise answer, I realised that I am actually trying to do something similar to the code below. I didn't manage to make it work using Rui's method, but solved it by setting the environment.
OtherStuff <- c(10, NA)
EvaluateThis4 <- quo(mean(c(vector,OtherStuff), na.rm = TRUE))
MyFunc3 <- function(vector, TheFunction){
  #uses the capture environment, which doesn't contain the object vector
  print(get_env(TheFunction))
  #Reset the environment of TheFunction to the current environment where vector exists
  TheFunction <- set_env(TheFunction, get_env())
  print(get_env(TheFunction))
  print(TheFunction)
  TheFunction %>%
    eval_tidy
}
MyFunc3(1:4, EvaluateThis4)
The function is evaluated within the current environment not the capture environment. Because there is no object "OtherStuff" within that environment, the parent environments are searched finding "OtherStuff" in the Global environment.
I will try to answer question 1.
I believe that the best and simplest way to perform this kind of evaluation is to do without any sort of fancy evaluation techniques. Calling the function directly usually works. Using your example, try the following.
EvaluateThis4 <- mean # simple
MyFunc4 <- function(vector, TheFunction){
  print(TheFunction)
  TheFunction(vector) # just call it with the appropriate argument(s)
}
MyFunc4(1:4, EvaluateThis4)
function (x, ...)
UseMethod("mean")
<bytecode: 0x000000000489efb0>
<environment: namespace:base>
[1] 2.5
There are examples of this in base R. For instance approxfun and ecdf both return functions that you can use directly in your code to perform subsequent calculations. That's why I've defined EvaluateThis4 like that.
As for functions that use functions as arguments, there are the optimization ones, and, of course, *apply, by and ave.
As for question 2, I must admit to my complete ignorance.
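For the case in the question's Edit, where the computation also involves OtherStuff and na.rm = TRUE, one possible approach without quosures is to pass an anonymous function. This is a sketch under the assumption that OtherStuff is visible where the anonymous function is created; MyFunc4 is redefined here without the print call to keep the output short.

```r
OtherStuff <- c(10, NA)

MyFunc4 <- function(vector, TheFunction) {
  TheFunction(vector)   # the supplied function receives the local vector
}

# v is the vector handed over by MyFunc4; OtherStuff is found through
# lexical scoping in the environment where this function was written
res <- MyFunc4(1:4, function(v) mean(c(v, OtherStuff), na.rm = TRUE))
print(res)
```

The mean is taken over 1, 2, 3, 4 and 10 (the NA is dropped), so no environment resetting is needed.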

How to retrieve formals of a primitive function?

For the moment, at least, this is an exercise in learning for me, so the actual functions or their complexity is not the issue. Suppose I write a function whose argument list includes some input variables and a function name, passed as a string. This function then calculates some variables internally and "decides" how to feed them to the function name I've passed in.
For nonprimitive functions, I can do the following (for this example, assume none of my funcname functions have any arguments other than at most (x,y,z). If they did, I'd have to write some code to search for matching names(formals(get(funcname))) so as not to delete the other arguments):
foo <- function (a,b,funcname) {
  x <- 2*a
  y <- a+3*b
  z <- -b
  # formals<- needs a variable to assign into, not a call to get()
  fn <- get(funcname)
  formals(fn) <- list(x=x, y=y, z=z)
  bar <- fn()
  return(bar)
}
And the nice thing is that the function funcname will execute without error even if it doesn't use x, y or z (so long as there are no other args that don't have defaults).
The problem with "primitive" functions is I don't know any way to find or modify their formals. Other than writing a wrapper, e.g. foosin <-function(x) sin(x), is there a way to set up my foo function to work with both primitive and nonprimitive function names as input arguments?
formals(args(FUN)) can be used to get the formals of a primitive function.
You could add an if statement to your existing function.
> formals(sum)
# NULL
> foo2 <- function(x) {
if(is.primitive(x)) formals(args(x)) else formals(x)
## formals(if(is.primitive(x)) args(x) else x) is another option
}
> foo2(sum)
# $...
#
#
# $na.rm
# [1] FALSE
#
> foo2(with)
# $data
#
#
# $expr
#
#
# $...
Building on Richard S' response, I ended up doing the following. Posted just in case anyone else ever tries to do things as weird as I do.
EDIT: I think more type-checking needs to be done. It's possible that coleqn could be the name of an object, in which case get(coleqn) will return some data. Probably I need to add an if(is.function(rab)) right after the if(!is.null(rab)). (Of course, given that I wrote the function for my own needs, if I was stupid enough to pass an object, I deserve what I get :-) ).
# "coleqn" is the input argument, which is a string that could be either a function
# name or an expression.
rab <- tryCatch(get(coleqn), error = function(e) NULL)
#oops, rab can easily be neither NULL nor a closure. Damn.
if(!is.null(rab)) {
  # I believe this means it must be a function
  # thanks to Richard Scriven of SO for this fix to handle primitives
  # we are not allowed to redefine primitive's formals.
  qq <- list(x=x,y=y,z=z)
  # match up the actual formals names
  # by building a list of valid arguments to pass to do.call:
  # keep only the entries of qq whose names are formals of the function
  argnames <- names(formals(args(coleqn)))
  arglist <- qq[names(qq) %in% argnames]
  colvar <- do.call(coleqn,arglist)
} else {
  # the input is just an expression (string), not a function
  colvar <- eval(parse(text=coleqn))
}
The result is an object generated either by the expression or the function just created, using variables internal to the main function (which is not shown in this snippet)
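The approach in this thread can be condensed into a small helper. This is a sketch, not the poster's code: call_with_known_args is a hypothetical name, and it assumes the candidate arguments are supplied as a named list.

```r
# Call a function (given by name or as an object) with only those entries
# of `available` whose names appear among the function's formals
call_with_known_args <- function(fname, available) {
  f <- match.fun(fname)                        # fails early if this is not a function
  src <- if (is.primitive(f)) args(f) else f   # args() exposes formals of primitives
  keep <- names(available) %in% names(formals(src))
  do.call(f, available[keep])
}

vals <- list(x = 4, y = 5, z = 6)
r1 <- call_with_known_args("sqrt", vals)                  # primitive, takes x only
r2 <- call_with_known_args(function(x, z) x * z, vals)    # closure, takes x and z
print(c(r1, r2))
```

Functions whose only formal is ... (or whose arguments are named differently) receive nothing from this helper, so it suits the (x, y, z) convention described above rather than arbitrary functions.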

Nested function environment selection

I am writing some functions for doing repeated tasks, but I am trying to minimize the number of times I load the data. Basically I have one function that takes some information and makes a plot. Then I have a second function that will loop through and output multiple plots to a .pdf. In both functions I have the following line of code:
if(load.dat) load("myworkspace.RData")
where load.dat is a logical and the data I need is stored in myworkspace.RData. When I am calling the wrapper function that loops through and outputs multiple plots I do not want to reload the workspace in every call to the inner function. I thought I could just load the workspace once in the wrapper function, then the inner function could access that data, but I got an error stating otherwise.
So my understanding was when a function cannot find the variable in its local environment (created when the function gets called), the function will look to the parent environment for the variable.
I assumed the parent environment to the inner function call would be the outer function call. Obviously this is not true:
func1 <- function(...){
  print(var1)
}
func2 <- function(...){
  var1 <- "hello"
  func1(...)
}
> func2()
Error in print(var1) : object 'var1' not found
After reading numerous questions, the language manual, and this really helpful blog post, I came up with the following:
var1 <- "hello"
save(list="var1",file="test.RData")
rm(var1)
func3 <- function(...){
  attach("test.RData")
  func1(...)
  detach("file:test.RData")
}
> func3()
[1] "hello"
Is there a better way to do this? Why doesn't func1 look for undefined variables in the local environment created by func2, when it was func2 that called func1?
Note: I did not know how to name this question. If anyone has better suggestions I will change it and edit this line out.
To illustrate lexical scoping, consider the following:
First let's create a sandbox environment, only to avoid the oh-so-common R_GlobalEnv:
sandbox <-new.env()
Now we put two functions inside it: f, which looks for a variable named x; and g, which defines a local x and calls f:
sandbox$f <- function()
{
  value <- if(exists("x")) x else "not found."
  cat("This is function f looking for symbol x:", value, "\n")
}
sandbox$g <- function()
{
  x <- 123
  cat("This is function g. ")
  f()
}
Technicality: entering function definitions in the console causes them to have the enclosing environment set to R_GlobalEnv, so we manually force the enclosures of f and g to match the environment where they "belong":
environment(sandbox$f) <- sandbox
environment(sandbox$g) <- sandbox
Calling g. The local variable x=123 is not found by f:
> sandbox$g()
This is function g. This is function f looking for symbol x: not found.
Now we create an x in the global environment and call g. The function f will look for x first in sandbox, and then in the parent of sandbox, which happens to be R_GlobalEnv:
> x <- 456
> sandbox$g()
This is function g. This is function f looking for symbol x: 456
Just to check that f looks for x first in its enclosure, we can put an x there and call g:
> sandbox$x <- 789
> sandbox$g()
This is function g. This is function f looking for symbol x: 789
Conclusion: symbol lookup in R follows the chain of enclosing environments, not the evaluation frames created during execution of nested function calls.
EDIT: Just adding a link to this very interesting answer from Martin Morgan on the related subject of parent.frame() vs parent.env()
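The difference between the enclosing environment and the calling frame can be made visible in a few lines. This is a sketch written for illustration; show_x and call_it are hypothetical names.

```r
x <- "defined next to show_x"

show_x <- function() {
  enclosing <- parent.env(environment())  # where show_x was defined
  calling   <- parent.frame()             # the frame show_x was called from
  c(lexical = get("x", envir = enclosing),
    caller  = get("x", envir = calling))
}

call_it <- function() {
  x <- "local to call_it"
  show_x()
}

res <- call_it()
print(res)
```

Ordinary symbol lookup behaves like the lexical entry; reaching the caller's variable requires asking for parent.frame() explicitly.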
You could use closures:
f2 <- function(...){
  f1 <- function(...){
    print(var1)
  }
  var1 <- "hello"
  f1(...)
}
f2()
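Another option, sketched here under the assumption that the inner function can be given an extra argument, is to load the workspace once into its own environment and pass that environment along. The temporary file stands in for myworkspace.RData, and the argument name dat is made up for this example.

```r
# Create a small saved workspace to work with
path <- tempfile(fileext = ".RData")
var1 <- "hello"
save(var1, file = path)
rm(var1)

# The inner function receives the loaded data explicitly
func1 <- function(dat) dat$var1

# The outer function loads the file once and reuses it for every inner call
func2 <- function(file, n = 3) {
  dat <- new.env()
  load(file, envir = dat)
  vapply(seq_len(n), function(i) func1(dat), character(1))
}

res <- func2(path)
print(res)
```

This avoids attach and detach, and it keeps the loaded objects out of the global environment.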

passing several arguments to FUN of lapply (and others *apply)

I have a question regarding passing multiple arguments to a function, when using lapply in R.
When I use lapply with the syntax of lapply(input, myfun); - this is easily understandable, and I can define myfun like that:
myfun <- function(x) {
  # doing something here with x
}
lapply(input, myfun);
and elements of input are passed as x argument to myfun.
But what if I need to pass some more arguments to myfun? For example, it is defined like that:
myfun <- function(x, arg1) {
  # doing something here with x and arg1
}
How can I use this function with passing both input elements (as x argument) and some other argument?
If you look up the help page, one of the arguments to lapply is the mysterious .... When we look at the Arguments section of the help page, we find the following line:
...: optional arguments to ‘FUN’.
So all you have to do is include your other argument in the lapply call as an argument, like so:
lapply(input, myfun, arg1=6)
and lapply, recognizing that arg1 is not an argument it knows what to do with, will automatically pass it on to myfun. All the other apply functions can do the same thing.
An addendum: You can use ... when you're writing your own functions, too. For example, say you write a function that calls plot at some point, and you want to be able to change the plot parameters from your function call. You could include each parameter as an argument in your function, but that's annoying. Instead you can use ... (as an argument to both your function and the call to plot within it), and have any argument that your function doesn't recognize be automatically passed on to plot.
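A minimal sketch of forwarding ... from your own function follows. It uses mean instead of plot so the result can be checked without a graphics device; summarise_vec is a hypothetical name.

```r
# Anything summarise_vec does not name itself is forwarded to mean() via ...
summarise_vec <- function(x, label = "mean", ...) {
  paste0(label, ": ", mean(x, ...))
}

r1 <- summarise_vec(c(1, 2, NA))                 # mean() sees no extra arguments
r2 <- summarise_vec(c(1, 2, NA), na.rm = TRUE)   # na.rm travels through ...
print(c(r1, r2))
```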
As suggested by Alan, the function mapply applies a function to multiple list or vector arguments:
mapply(myfun, arg1, arg2)
See man page:
https://stat.ethz.ch/R-manual/R-devel/library/base/html/mapply.html
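A short sketch of mapply with two vectors that are walked in parallel, plus a constant argument supplied through MoreArgs. The body of myfun and the scale argument are made up for this example.

```r
myfun <- function(x, arg1, scale = 1) (x + arg1) * scale

# Element-wise over both vectors: myfun(1, 4), myfun(2, 5), myfun(3, 6)
r1 <- mapply(myfun, 1:3, 4:6)

# MoreArgs holds arguments that stay the same for every call
r2 <- mapply(myfun, 1:3, 4:6, MoreArgs = list(scale = 10))
print(r1)
print(r2)
```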
You can do it in the following way:
myfxn <- function(var1,var2,var3){
  var1*var2*var3
}
lapply(1:3,myfxn,var2=2,var3=100)
and you will get the answer:
[[1]]
[1] 200
[[2]]
[1] 400
[[3]]
[1] 600
myfun <- function(x, arg1) {
  # doing something here with x and arg1
}
x is a vector or a list and myfun in lapply(x, myfun) is called for each element of x separately.
Option 1
If you'd like to use the whole of arg1 in each myfun call (myfun(x[1], arg1), myfun(x[2], arg1) etc.), use lapply(x, myfun, arg1) (as stated above).
Option 2
If, however, you'd like to call myfun on each element of arg1 separately alongside the elements of x (myfun(x[1], arg1[1]), myfun(x[2], arg1[2]) etc.), lapply over x alone cannot do this. Instead, use mapply(myfun, x, arg1) (as stated above) or apply. Because apply hands each row or column to the function as a single vector, the two values have to be unpacked before calling myfun:
apply(cbind(x,arg1), 1, function(r) myfun(r[1], r[2]))
or
apply(rbind(x,arg1), 2, function(r) myfun(r[1], r[2])).
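Two further ways to pair the elements of x and arg1 are sketched below: Map, which is mapply without simplification, and an lapply over indices rather than over x itself. The body of myfun and the sample values are made up for this example.

```r
myfun <- function(x, arg1) x * arg1
x    <- c(1, 2, 3)
arg1 <- c(10, 20, 30)

# Map always returns a list, one element per pair
r_map <- Map(myfun, x, arg1)

# lapply over positions, indexing both vectors inside the function
r_idx <- lapply(seq_along(x), function(i) myfun(x[i], arg1[i]))

print(unlist(r_map))
print(unlist(r_idx))
```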
