I would like to write a wrapper function for two functions that take optional arguments.
Here is an example of a function fun to wrap funA and funB
funA <- function(x = 1, y = 1) return(x+y)
funB <- function(z = c(1, 1)) return(sum(z))
fun <- function(x, y, z)
I would like fun to return x+y if x and y are provided, and sum(z) if a vector z is provided.
I have tried to see how the lm function takes such optional arguments, but it is not clear exactly how, e.g., match.call is being used here.
After finding related questions (e.g. "How to use R's ellipsis feature when writing your own function?" and "using substitute to get argument name with ..."), I have come up with a workable solution.
My solution has just been to use
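fun <- function(...){
  inputs <- list(...)
  ans <- NULL
  # dispatch on the argument names, so this only works with named arguments
  if (all(c("x", "y") %in% names(inputs))){
    ans <- funA(inputs$x, inputs$y)
  } else if ("z" %in% names(inputs)){
    ans <- funB(inputs$z)
  }
  ans
}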
Is there a better way?
Note: Perhaps this question can be closed as a duplicate, but hopefully it can serve a purpose in guiding other users to a good solution: it would have been helpful to have expanded my search to variously include ellipsis, substitute, in addition to match.call.
Use missing. The function below returns funA(x, y) if both x and y are provided; otherwise it returns funB(z) if z is provided; and if none of them are provided it returns NULL:
fun <- function(x, y, z) {
  if (!missing(x) && !missing(y)) {
    funA(x, y)
  }
  else if (!missing(z)) {
    funB(z)
  }
}
This seems to answer your question as stated but note that the default arguments in funA and funB are never used so perhaps you really wanted something different?
Note the fun that is provided in the question only works if the arguments are named whereas the fun here works even if they are provided positionally.
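As a quick check of the behaviour described above, here is a self-contained sketch that repeats funA and funB from the question (with the missing parenthesis restored) and calls fun with positional arguments, with a named z, and with nothing at all:

```r
funA <- function(x = 1, y = 1) x + y
funB <- function(z = c(1, 1)) sum(z)

fun <- function(x, y, z) {
  if (!missing(x) && !missing(y)) {
    funA(x, y)
  } else if (!missing(z)) {
    funB(z)
  }
}

fun(1, 2)        # 3, positional arguments work
fun(z = 1:10)    # 55
is.null(fun())   # TRUE, neither branch is taken
is.null(fun(1))  # TRUE, x alone is not enough and z is missing
```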
I would do something like this, using match.call. It is similar to your solution but more robust.
fun <- function(...){
arg <- as.list(match.call())[-1]
f <- ifelse(length(arg)>1,"funA","funB")
do.call(f,arg)
}
> fun(x=1,y=2) ## or fun(1,2), no need to give named arguments
[1] 3
> fun(z=1:10) ## fun(1:10)
[1] 55
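To see what the match.call line actually hands to do.call, it can help to look at the captured list on its own. The helper name show_args below is made up for illustration; it is only a sketch of the capture step, not part of the answer's code:

```r
# Hypothetical helper: returns the arguments of the call as a list,
# dropping the first element (the function name itself).
show_args <- function(...) as.list(match.call())[-1]

a <- show_args(x = 1, y = 2)
names(a)     # "x" "y"
length(a)    # 2, so the answer's fun would pick funA

b <- show_args(z = 1:10)
length(b)    # 1, so the answer's fun would pick funB
class(b$z)   # "call": the argument is captured unevaluated
```

Note that the dispatch in the answer depends only on how many arguments were supplied, not on their names.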
Related
I have a series of similar functions that all need to extract some values from a data frame. Something like this:
foo_1 <- function(data, ...) {
x <- data$x
y <- data$y
# some preparatory code common to all foo_X functions
# .. do some boring stuff with x and y
# pack and process the result into 'ret'
return(ret)
}
These functions are then provided as arguments to some other function (let us call it "the master function"). I cannot modify the master function.
However, I wish I could avoid re-writing the same preparatory code in each of these functions. For example, I don't want to use data$x instead of assigning it to x and using x because it makes the boring stuff hard to read. Presently, I need to write x <- data$x (etc.) in all of the foo_1, foo_2... functions. Which is annoying and clutters the code. Also, packing and processing is common for all the foo_N functions. Other preparatory code includes scaling of variables or regularization of IDs.
What would be an elegant and terse way of doing this?
One possibility is to attach() the data frame (or use with(), as Hong suggested in the answer below), but I don't know what other variables would then be in my name space: attaching data can mask other variables I use in foo_1. Also, preferably the foo_N functions should be called with explicit parameters, so it is easier to see what they need and what they are doing.
Next possibility I was thinking of was a construction like this:
foo_generator <- function(number) {
  # with a numeric first argument, switch() picks the alternative by position
  tocall <- switch(number, foo_1, foo_2, foo_3) # etc.
  function(data, ...) {
    x <- data$x
    y <- data$y
    ret <- tocall(x, y, ...)
    # process and pack into ret
    return(ret)
  }
}
foo_1 <- function(x, y, ...) {
  # do some boring stuff
}
Then I can use foo_generator(1) instead of foo_1 as the argument for the master function.
Is there a better or more elegant way? I feel like I am overlooking something obvious here.
You might be overthinking it. You say that the code dealing with preparation and packing are common to all foo_n functions. I assume, then, that # .. do some boring stuff with x and y is where each function differs. If that's the case then just create a single prep_and_pack function which takes a function name as a parameter, and then pass in foo_1, foo_2, etc. For example:
prep_and_pack <- function(data, func){
x <- data$x
y <- data$y
# preparatory code here
xy_output <- func(x, y) # do stuff with x and y
# code to pack and process into "ret"
return(ret)
}
Now you can create your foo_n functions that do different things with x and y:
foo_1 <- function(x, y) {
# .. do some boring stuff with x and y
}
foo_2 <- function(x, y) {
# .. do some boring stuff with x and y
}
foo_3 <- function(x, y) {
# .. do some boring stuff with x and y
}
Finally, you can pass multiple calls to prep_and_pack into your master function, where foo_1 etc. are passed in via the func argument:
master_func(prep_and_pack(data = df, func = foo_1),
prep_and_pack(data = df, func = foo_2),
prep_and_pack(data = df, func = foo_3)
)
You could also use switch in prep_and_pack and/or forgo the foo_n functions completely in favor of if-else conditionals to deal with the various cases, but I think the above keeps things nice and clean.
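A runnable sketch of this pattern, under some assumptions: the preparation step (centering x) and the packing step (wrapping the result in a list) are invented placeholders, as are the bodies of foo_1 and foo_2. If the master function expects a function rather than a result, a thin wrapper such as foo_1_wrapped (also a made-up name) can be passed instead:

```r
prep_and_pack <- function(data, func) {
  x <- data$x - mean(data$x)   # hypothetical preparation: center x
  y <- data$y
  xy_output <- func(x, y)      # the part that differs between foo_n
  # hypothetical packing step
  list(n = length(x), value = xy_output)
}

foo_1 <- function(x, y) sum(x * y)
foo_2 <- function(x, y) max(y) - min(x)

df <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))
prep_and_pack(df, foo_1)$value   # 2
prep_and_pack(df, foo_2)$value   # 7

# A function of data only, suitable for passing where a function is expected
foo_1_wrapped <- function(data, ...) prep_and_pack(data, foo_1)
foo_1_wrapped(df)$value          # 2
```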
The requirements still seem a bit vague to me, but if your code is so similar that you can simply wrap it around a helper function like tocall in your example, and your input is in a list-like structure (like a data frame, which is just a list of columns), then just write all your foo_* functions to take the "spliced" parameters as in your proposed solution, and then use do.call:
foo_1 <- function(x, y) {
x + y
}
foo_2 <- function(x, y) {
x - y
}
data <- list(x = 1:2, y = 3:4)
do.call(foo_1, data)
# [1] 4 6
do.call(foo_2, data)
# [1] -2 -2
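One caveat with do.call on a whole data frame is that any extra column is passed as an argument too, which fails with an "unused argument" error. A possible workaround, sketched here with a made-up helper name call_with_columns, is to pass only the columns that match the function's formal arguments:

```r
foo_1 <- function(x, y) {
  x + y
}

# Hypothetical helper: keep only the columns named like f's arguments
call_with_columns <- function(f, data) {
  needed <- names(formals(f))
  do.call(f, as.list(data[needed]))
}

df <- data.frame(x = 1:2, y = 3:4, id = c("a", "b"))
call_with_columns(foo_1, df)
# [1] 4 6
```

This assumes every formal argument of f has a matching column and that f does not take `...`.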
I'm not sure the following is a good idea. It reminds me a bit of programming with macros. I don't think I would do this. You'd need to carefully document because it is unexpected, confusing and not self-explanatory.
If you want to reuse the same code in different functions, it might be an option to create it as an unevaluated call and evaluate that call in the different functions:
prepcode <- quote({
x <- data$x
y <- data$y
}
)
foo_1 <- function(data, ...) {
eval(prepcode)
# some preparatory code common to all foo_X functions
# .. do some boring stuff with x and y
# pack and process the result into 'ret'
return(list(x, y))
}
L <- list(x = 1, y = "a")
foo_1(L)
#[[1]]
#[1] 1
#
#[[2]]
#[1] "a"
It might be better to pass prepcode as an argument to foo_1, to make sure there won't be any scoping issues.
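One way to read that last suggestion, as a sketch only: give foo_1 a parameter (here called prep, a name chosen for illustration) whose default is the quoted code, so the function no longer depends silently on a global object.

```r
prepcode <- quote({
  x <- data$x
  y <- data$y
})

foo_1 <- function(data, prep = prepcode) {
  eval(prep)   # evaluated in foo_1's own frame, so x and y become local variables
  list(x, y)
}

res <- foo_1(list(x = 1, y = "a"))
res[[1]]   # 1
res[[2]]   # "a"
```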
Use with inside the function:
foo_1 <- function(data, ...) {
  with(data, {
    # .. in here, x and y refer to data$x and data$y
  })
}
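A small runnable illustration of the with approach, including the masking concern raised in the question. The function foo_mask and the variable scale_factor are invented for this sketch:

```r
foo_1 <- function(data, ...) {
  with(data, {
    x + y
  })
}
foo_1(data.frame(x = 1:2, y = 3:4))   # 4 6

# Names found in data take precedence over the function's local variables
foo_mask <- function(data) {
  scale_factor <- 10
  with(data, x * scale_factor)
}
foo_mask(list(x = 1))                     # 10, the local scale_factor is used
foo_mask(list(x = 1, scale_factor = 2))   # 2, the element in data wins
```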
I'm not sure I understand fully, but can't you simply use a function for all common stuff, and then unpack that into the foo_N functions using list2env? For example:
prepFun <- function(data, ...) {
x <- data$x
y <- data$y
tocall(x, y, ...)
# process and pack into a named list called ret
# so you know specifically what the elements of ret are called
return(ret)
}
foo_1 <- function(data, ...) {
# Get prepFun to do the prepping, then use list2env to get the result in foo_1
# You know which are the named elements, so there should be no confusion about
# what does or does not get masked.
prepResult <- prepFun(data, ...)
list2env(prepResult, envir = environment()) # without envir, list2env() fills a new environment, not foo_1's
# .. do some boring stuff with x and y
# pack and process the result into 'ret'
return(ret)
}
Hope this is what you're looking for!
I think defining a function factory for this task is a bit overkill and confusing. You can define a general function and use purrr::partial() on it when passing it to your master function.
Something like :
foo <- function(data, ..., number, foo_funs) {
tocall <- foo_funs[[number]]
ret <- with(data[c("x", "y")], tocall(x, y, ...))
# process and pack into ret
return(ret)
}
foo_1 <- function(x, y, ...) {
# do some boring stuff
}
foo_funs <- list(foo_1, foo_2, ...)
Then call master_fun(fun = purrr::partial(foo, number =1) , ...)
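If purrr is not available, a plain closure can play the same role as partial(). The sketch below assumes simple bodies for foo_1 and foo_2 and uses a made-up wrapper name foo_first; it is meant only to show the shape of the idea:

```r
foo_1 <- function(x, y, ...) x + y   # placeholder body
foo_2 <- function(x, y, ...) x * y   # placeholder body
foo_funs <- list(foo_1, foo_2)

foo <- function(data, ..., number, foo_funs) {
  tocall <- foo_funs[[number]]
  with(data[c("x", "y")], tocall(x, y, ...))
}

# Base R stand-in for partial(): number and foo_funs are fixed in advance
foo_first <- function(data, ...) foo(data, ..., number = 1, foo_funs = foo_funs)

df <- data.frame(x = 1:2, y = 3:4)
foo_first(df)                                # 4 6
foo(df, number = 2, foo_funs = foo_funs)     # 3 8
```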
Another possibility is to use list2env, which saves the components of a list into a specified environment:
foo_1 <- function(data){
list2env(data, envir = environment())
x + y
}
foo_1(data.frame(x = 1:2, y = 3:4))
See also this question.
I'm confused how ... works.
tt = function(...) {
return(x)
}
Why doesn't tt(x = 2) return 2?
Instead it fails with the error:
Error in tt(x = 2) : object 'x' not found
Even though I'm passing x as argument ?
Because everything you pass in the ... stays in the .... Variables you pass that aren't explicitly captured by a parameter are not expanded into the local environment. The ... should be used for values your current function doesn't need to interact with at all, but that some later function does need, so they can be easily passed along inside the .... It's meant for a scenario like
ss <- function(x) {
x
}
tt <- function(...) {
return(ss(...))
}
tt(x=2)
If your function needs the variable x to be defined, it should be a parameter
tt <- function(x, ...) {
return(x)
}
If you really want to expand the dots into the current environment (and I strongly suggest that you do not), you can do something like
tt <- function(...) {
list2env(list(...), environment())
return(x)
}
If you define three dots as an argument of your function and want it to work, you need to tell your function where the dots actually go. In your example you neither define x as an argument nor use ... anywhere in the body of your function. An example that actually works is:
tt <- function(x, ...){
mean(x, ...)
}
x <- c(1, 2, 3, NA)
tt(x)
#[1] NA
tt(x, na.rm = TRUE)
#[1] 2
Here ... refers to any other arguments that the function mean might take. Additionally, you have a regular argument x. In the first example tt(x) just returns mean(x), whilst the second example, tt(x, na.rm = TRUE), passes the second argument na.rm = TRUE on to mean, so tt returns mean(x, na.rm = TRUE).
Another way that the programmers of R use a lot is list(...) as in
tt <- function(...) {
args <- list(...) # As in this
if("x" %in% names(args))
return(args$x)
else
return("Something else.")
}
tt(x = 2)
#[1] 2
tt(y = 1, 2)
#[1] "Something else."
I believe that this is one of their favorite, if not the favorite, way of handling the dots arguments.
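Besides list(...), base R offers a few other ways to look inside the dots without expanding them into the environment. The sketch below uses a made-up function name inspect_dots; `..1` refers to the first element of the dots, and `...length()` assumes R 3.5.0 or later:

```r
inspect_dots <- function(...) {
  list(
    n     = ...length(),                      # how many arguments were passed
    names = names(list(...)),                 # their names ("" when unnamed)
    first = if (...length() > 0) ..1 else NULL
  )
}

r <- inspect_dots(x = 2, 5)
r$n       # 2
r$names   # "x" ""
r$first   # 2
```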
I am having a problem with the apply function passing arguments on to a function that does not need them. I understand that apply doesn't know what to do with the optional arguments and just passes them on to the function.
But anyhow, here is what I would like to do:
First I want to specify a list of functions that I would like to use.
functions <- list(length, sum)
Then I would like to create a function which applies these specified functions to a data set.
myFunc <- function(data, functions) {
for (i in 1:length(functions)) print(apply(X=data, MARGIN=2, FUN=functions[[i]]))
}
This works fine.
data <- cbind(rnorm(100), rnorm(100))
myFunc(data, functions)
[1] 100 100
[1] -0.5758939 -5.1311173
But I would also like to use additional arguments for some functions, e.g.
power <- function(x, p) x^p
This doesn't work as I want it to. If I modify myFunc to:
myFunc <- function(data, functions, ...) {
for (i in 1:length(functions)) print(apply(X=data, MARGIN=2, FUN=functions[[i]], ...))
}
and functions to
functions <- list(length, sum, power)
and then try my function I get
myFunc(data, functions, p=2)
Error in FUN(newX[, i], ...) :
2 arguments passed to 'length' which requires 1
How may I solve this issue?
Sorry for the wall of text. Thank you!
You can use Curry from the functional package to fix the parameter you want, put the resulting function in the list of functions you want to apply, and finally iterate over this list of functions:
library(functional)
power <- function(x, p) x^p
funcs = list(length, sum, Curry(power, p=2), Curry(power, p=3))
lapply(funcs, function(f) apply(data, 2 , f))
With your code you can use:
functions <- list(length, sum, Curry(power, p=2))
myFunc(data, functions)
I'd advocate using Colonel's Curry approach, but if you want to stick to base R you can always:
funcs <- list(length, sum, function(x) power(x, 2))
which is roughly what Curry ends up doing.
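For readers who want the currying idea without the functional package, a minimal base R stand-in might look like the sketch below. The helper name curry is made up here and is not claimed to match the package's implementation in every detail:

```r
# Hypothetical base R helper: fix some arguments of f now, supply the rest later
curry <- function(f, ...) {
  fixed <- list(...)
  function(...) do.call(f, c(fixed, list(...)))
}

power <- function(x, p) x^p
square <- curry(power, p = 2)
square(3)   # 9

d <- matrix(1:6, nrow = 2)
funcs <- list(length, sum, square)
lapply(funcs, function(f) apply(d, 2, f))
```

Because p is fixed by name, the value passed later is matched to x, so each function in the list can be called with a single argument.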
One option is to pass the parameters in a list with the arguments needed for each function. You can add those parameters to the others needed for apply using c and then use do.call to call the function. Something like this. I also wrap all the output in a list here rather than using print; your usage may vary.
power <- function(x, p) x^p
myFunc <- function(data, functions, parameters) {
lapply(seq_along(functions), function(i) {
p0 <- list(X=data, MARGIN=2, FUN=functions[[i]])
do.call(apply, c(p0, parameters[[i]]))
})
}
d <- matrix(1:6, nrow=2)
functions <- list(length, sum, power)
parameters <- list(NULL, NULL, list(p=3)) # one list of extra arguments per function; wrapping in list() keeps the name p
myFunc(d, functions, parameters)
You can use lazyeval package:
library(lazyeval)
my_evaluate <- function(data, expressions, ...) {
lapply(expressions, function(e) {
apply(data, MARGIN=2, FUN=function(x) {
lazy_eval(e, c(list(x=x), list(...)))
})
})
}
And use it like this:
my_expressions <- lazy_dots(sum = sum(x), sumpow = sum(x^p), length_k = length(x)*k )
data <- cbind(rnorm(100), rnorm(100))
my_evaluate(data, my_expressions, p = 2, k = 2)
This is not really a problem, but I'm wondering if there is a more elegant solution:
Let's say I have a vector vec <- rlnorm(10) and I want to apply a non-vectorized function to it, e.g. exp (ignore for the moment that it is vectorized). I can do
sapply( vec, exp )
But when the function I want to apply is nested, the expression immediately becomes less simple:
sapply( vec, function(x) exp( sqrt(x) ) )
This happens to me all the time with the apply and plyr family.
So my question is, is there in general an elegant way to nest (or pipe) functions without defining explicitly an (anonymous) function function(x){...}? Something like
# notrun
sapply( vec, sqrt | exp )
or similar.
See the examples for ?Reduce:
## Iterative function application:
Funcall <- function(f, ...) f(...)
## Compute log(exp(acos(cos(0))))
Reduce(Funcall, list(log, exp, acos, cos), 0, right = TRUE)
Here's a more bare-bones implementation with a slightly different interface:
Compose <- function(x, ...)
{
lst <- list(...)
for(i in rev(seq_along(lst)))
x <- lst[[i]](x)
x
}
sapply(0, Compose, log, exp, acos, cos)
The package functional includes a Compose function.
library(functional)
id <- Compose(exp, log)
id(2) # 2
Its implementation is simple enough to include in your source, if, say, you don't need the rest of the stuff in the functional package.
R> Compose
function (...)
{
fs <- list(...)
if (!all(sapply(fs, is.function)))
stop("Argument is not a function")
function(...) Reduce(function(x, f) f(x), fs, ...)
}
<environment: namespace:functional>
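Applied to the original question, the same idea can be written in a few lines of base R. The name compose (lower case, to avoid clashing with the package's Compose) is chosen for this sketch; functions are applied left to right, so compose(sqrt, exp) computes exp(sqrt(x)):

```r
compose <- function(...) {
  fs <- list(...)
  # start from x and feed the running result through each function in turn
  function(x) Reduce(function(acc, f) f(acc), fs, x)
}

vec <- c(1, 4, 9)
sapply(vec, compose(sqrt, exp))   # same values as exp(sqrt(vec))
```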
Does anyone know how to write a function F which takes a function call (say, mean(x = 1:10)) as an argument, and returns just the name of the function being invoked (mean)?
My best attempts so far are summarised below
(function(x1){
return(deparse(substitute(x1)))
})(mean(x = 1:10))
### 'mean(x = 1:10)'
Changing x1 (the function call) to an expression before de-parsing doesn't seem to help much: that returns
(function(x1){
return(deparse(as.expression(substitute(x1))))
})(mean(x = 1:10))
# "expression(mean(x = 1:10))"
If at all possible, I'd like to be able to use anonymous functions as an argument too, so F should return (function(x) print (x)) for (function(x) print (x))(1). If you need any clarification feel free to comment. Thanks.
edit1: just to note, I'd like to avoid checking for the first parenthesis and excising the code before it (for "mean(x = 1:10)" that would return "mean"), as "bad(Fun_nAme" is actually a legal function name in R.
Question Answered: Josh O'Brien's answer was perfect: the function F that satisfies the above conditions is
F <- function(x) deparse(substitute(x)[[1]])
It works nicely for binary operators, standard functions and anonymous functions.
Here's a simple function that does what you want:
F <- function(x) deparse(substitute(x)[[1]])
F(mean(x=1:10))
# [1] "mean"
F((function(x) print (x))(1))
# [1] "(function(x) print(x))"
F(9+7)
# [1] "+"
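The reason this works is that an unevaluated call behaves like a list whose first element is the function being called. The sketch below shows that structure directly; the name fun_name is used instead of F only to avoid masking R's built-in shorthand F for FALSE:

```r
cl <- quote(mean(x = 1:10))
cl[[1]]          # mean
class(cl[[1]])   # "name", i.e. a symbol rather than a string
names(cl)        # "" "x": the function slot is unnamed, the argument is x

fun_name <- function(x) deparse(substitute(x)[[1]])
fun_name(mean(x = 1:10))    # "mean"
fun_name(stats::sd(1:3))    # "stats::sd"
fun_name(9 + 7)             # "+"
```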
I don't know what you're trying to do or if it's a good idea or if this is what you want but here's a whack at it with regex:
FUN <- function(x1){
z <- deparse(substitute(x1))
list(fun=strsplit(z, "\\(")[[c(1, 1)]],
eval=x1)
}
FUN(mean(x = 1:10))