Passing optional arguments with `...` to multiple functions; control argument matching? - r

I'm trying to write a parent function that calls a bunch of sub-functions that all have pretty sensible defaults and are well documented. Based on the value of one parameter, there are potentially different arguments I'd like to pass down to different sub-functions for customization. Is there a way to pass arguments to multiple functions using elipsis or another strategy?
Here's a simple example; here the challenge is being able to pass na.rm and/or base when a user wants, but otherwise use the existing defaults:
dat <- c(NA, 1:5)
# I want a flexible function that uses sensible defaults unless otherwise specified
meanLog<-function(x, ...){
y <- log(x, ...)
z <- mean(y, ...)
return(z)
}
# I know I can pass ... to one function wrapped inside this one.
justLog <- function(x, ...){
log(x, ...)
}
justLog(dat)
justLog(dat, base = 2)
# or another
justMean <- function(x, ...){
mean(x, ...)
}
justMean(dat)
justMean(dat, na.rm =T)
# but I can't pass both arguments
meanLog(dat) # works fine, but I want to customize a few things
meanLog(dat, na.rm =T, base = 2)
justMean(dat, base =2)
# In this case that is because justLog breaks if it gets an unused na.rm
justLog(dat, na.rm =T)

1) Define do.call2 which is like do.call except that it accepts unnamed arguments as well as named argument in the character vector accepted which defaults to the formals in the function.
Note that the arguments of mean do not include na.rm -- it is slurped up by the dot dot dot argument -- but the mean.default method does. Also primitive functions do not have formals so the accepted argument must be specified explicitly for those rather than defaulted.
do.call2 <- function(what, args, accepted = formalArgs(what)) {
ok <- names(args) %in% c("", accepted)
do.call(what, args[ok])
}
# test
dat <- c(NA, 1:5)
meanLog <- function(x, ...){
y <- do.call2("log", list(x, ...), "base")
z <- do.call2("mean.default", list(y, ...))
return(z)
}
meanLog(dat, na.rm = TRUE, base = 2)
## [1] 1.381378
# check
mean(log(dat, base = 2), na.rm = TRUE)
## [1] 1.381378
2) Another possibility is to provide separate arguments for mean and log.
(A variation of that is to use dot dot dot for one of the functions and argument lists for the others. For example nls in R uses dot dot dot but also uses a control argument to specify other arguments.)
# test
dat <- c(NA, 1:5)
meanLog <- function(x, logArgs = list(), meanArgs = list()) {
y <- do.call("log", c(list(x), logArgs))
z <- do.call("mean", c(list(y), meanArgs))
return(z)
}
meanLog(dat, logArgs = list(base = 2), meanArgs = list(na.rm = TRUE))
## [1] 1.381378
# check
mean(log(dat, base = 2), na.rm = TRUE)
## [1] 1.381378

Related

Non-standard evaluation in a user-defined function with lapply or with in R

I wrote a wrapper around ftable because I need to compute flat tables with frequency and percentage for many variables. As ftable method for class "formula" uses non-standard evaluation, the wrapper relies on do.call and match.call to allow the use of the subset argument of ftable (more details in my previous question).
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]))
# etc
}
However, I cannot use this wrapper with lapply nor with:
# example 1: error with "lapply"
lapply(X = warpbreaks[c("breaks",
"wool",
"tension")],
FUN = mytable,
row.vars = 1)
Error in (function (x, ...) : object 'X' not found
# example 2: error with "with"
with(data = warpbreaks[warpbreaks$tension == "L", ],
expr = mytable(wool))
Error in (function (x, ...) : object 'wool' not found
These errors seem to be due to match.call not being evaluated in the right environment.
As this question is closely linked to my previous one, here is a sum up of my problems:
The wrapper with do.call and match.call cannot be used with lapply or with.
The wrapper without do.call and match.call cannot use the subset argument of ftable.
And a sum up of my questions:
How can I write a wrapper which allows both to use the subset argument of ftable and to be used with lapply and with? I have ideas to avoid the use of lapply and with, but I am looking to understand and correct these errors to improve my knowledge of R.
Is the error with lapply related to the following note from ?lapply?
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g., bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. This is not
normally a problem, but it can be if FUN uses sys.call or match.call
or if it is a primitive function that makes use of the call. This
means that it is often safer to call primitive functions with a
wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is
required to ensure that method dispatch for is.numeric occurs
correctly.
The problem with using match.call with lapply is that match.call returns the literal call that passed into it, without any interpretation. To see what's going on, let's make a simpler function which shows exactly how your function is interpreting the arguments passed into it:
match_call_fun <- function(...) {
call = as.list(match.call()[-1])
print(call)
}
When we call it directly, match.call correctly gets the arguments and puts them in a list that we can use with do.call:
match_call_fun(iris['Species'], 9)
[[1]]
iris["Species"]
[[2]]
[1] 9
But watch what happens when we use lapply (I've only included the output of the internal print statement):
lapply('Species', function(x) match_call_fun(iris[x], 9))
[[1]]
iris[x]
[[2]]
[1] 9
Since match.call gets the literal arguments passed to it, it receives iris[x], not the properly interpreted iris['Species'] that we want. When we pass those arguments into ftable with do.call, it looks for an object x in the current environment, and then returns an error when it can't find it. We need to interpret
As you've seen, adding envir = parent.frame() fixes the problem. This is because, adding that argument tells do.call to evaluate iris[x] in the parent frame, which is the anonymous function in lapply where x has it's proper meaning. To see this in action, let's make another simple function that uses do.call to print ls from 3 different environmental levels:
z <- function(...) {
print(do.call(ls, list()))
print(do.call(ls, list(), envir = parent.frame()))
print(do.call(ls, list(), envir = parent.frame(2)))
}
When we call z() from the global environment, we see the empty environment inside the function, then the Global Environment:
z()
character(0) # Interior function environment
[1] "match_call_fun" "y" "z" # GlobalEnv
[1] "match_call_fun" "y" "z" # GlobalEnv
But when we call from within lapply, we see that one level of parent.frame up is the anonymous function in lapply:
lapply(1, z)
character(0) # Interior function environment
[1] "FUN" "i" "X" # lapply
[1] "match_call_fun" "y" "z" # GlobalEnv
So, by adding envir = parent.frame(), do.call knows to evaluate iris[x] in the lapply environment where it knows that x is actually 'Species', and it evaluates correctly.
mytable_envir <- function(...) {
tab <- do.call(what = ftable,
args = as.list(match.call()[-1]),
envir = parent.frame())
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
round(x = margin,
digits = 1)
}
# This works!
lapply(X = c("breaks","wool","tension"),
FUN = function(x) mytable_envir(warpbreaks[x],row.vars = 1))
As for why adding envir = parent.frame() makes a difference since that appears to be the default option. I'm not 100% sure, but my guess is that when the default argument is used, parent.frame is evaluated inside the do.call function, returning the environment in which do.call is run. What we're doing, however, is calling parent.frame outside do.call, which means it returns one level higher than the default version.
Here's a test function that takes parent.frame() as a default value:
fun <- function(y=parent.frame()) {
print(y)
print(parent.frame())
print(parent.frame(2))
print(parent.frame(3))
}
Now look at what happens when we call it from within lapply both with and without passing in parent.frame() as an argument:
lapply(1, function(y) fun())
<environment: 0x12c5bc1b0> # y argument
<environment: 0x12c5bc1b0> # parent.frame called inside
<environment: 0x12c5bc760> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
lapply(1, function(y) fun(y = parent.frame()))
<environment: 0x104931358> # y argument
<environment: 0x104930da8> # parent.frame called inside
<environment: 0x104931358> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
In the first example, the value of y is the same as what you get when you call parent.frame() inside the function. In the second example, the value of y is the same as the environment one level up (inside lapply). So, while they look the same, they're actually doing different things: in the first example, parent.frame is being evaluated inside the function when it sees that there is no y= argument, in the second, parent.frame is evaluated in the lapply anonymous function first, before calling fun, and then is passed into it.
As you only want to pass all the arguments passed to ftable u do not need the do.call().
mytable <- function(...) {
tab <- ftable(...)
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
return(round(x = margin,
digits = 1))
}
The following lapply creates a table for every Variable separatly i don't know if that is what you want.
lapply(X = c("breaks",
"wool",
"tension"),
FUN = function(x) mytable(warpbreaks[x],
row.vars = 1))
If you want all 3 variables in 1 table
warpbreaks$newVar <- LETTERS[3:4]
lapply(X = cbind("c(\"breaks\", \"wool\", \"tension\")",
"c(\"newVar\", \"tension\",\"wool\")"),
FUN = function(X)
eval(parse(text=paste("mytable(warpbreaks[,",X,"],
row.vars = 1)")))
)
Thanks to this issue, the wrapper became:
# function 1
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]),
envir = parent.frame())
# etc
}
Or:
# function 2
mytable <- function(...) {
mc <- match.call()
mc[[1]] <- quote(expr = ftable)
eval.parent(expr = mc)
# etc
}
I can now use the subset argument of ftable, and use the wrapper in lapply:
lapply(X = warpbreaks[c("wool",
"tension")],
FUN = function(x) mytable(formula = x ~ breaks,
data = warpbreaks,
subset = breaks < 15))
However I do not understand why I have to supply envir = parent.frame() to do.call as it is a default argument.
More importantly, these methods do not resolve another issue: I can not use the subset argument of ftable with mapply.

ISO a good way to let a function accept a mix of supplied arguments, arguments from a list, and defaults

I would like to have a function accept arguments in the usual R way, most of which will have defaults. But I would also like it to accept a list of named arguments corresponding to some or some or all of the formals. Finally, I would like arguments supplied to the function directly, and not through the list, to override the list arguments where they conflict.
I could do this with a bunch of nested if-statements. But I have a feeling there is some elegant, concise, R-ish programming-on-the-language solution -- probably multiple such solutions -- and I would like to learn to use them. To show the kind of solution I am looking for:
> arg_lst <- list(x=0, y=1)
> fn <- function(a_list = NULL, x=2, y=3, z=5, ...){
<missing code>
print(c(x, y, z))
}
> fn(a_list = arg_list, y=7)
Desired output:
x y z
0 7 5
I like a lot about #jdobres's approach, but I don't like the use of assign and the potential scoping breaks.
I also don't like the premise, that a function should be written in a special way for this to work. Wouldn't it be better to write a wrapper, much like do.call, to work this way with any function? Here is that approach:
Edit: solution based off of purrr::invoke
Thinking a bit more about this, purrr::invoke almost get's there - but it will result in an error if a list argument is also passed to .... But we can make slight modifications to the code and get a working version more concisely. This version seems more robust.
library(purrr)
h_invoke = function (.f, .x = NULL, ..., .env = NULL) {
.env <- .env %||% parent.frame()
args <- c(list(...), as.list(.x)) # switch order so ... is first
args = args[!duplicated(names(args))] # remove duplicates
do.call(.f, args, envir = .env)
}
h_invoke(fn, arg_list, y = 7)
# [1] 0 7 5
Original version borrowing heavily from jdobres's code:
hierarchical_do_call = function(f, a_list = NULL, ...){
formal_args = formals() # get the function's defined inputs and defaults
formal_args[names(formal_args) %in% c('f', 'a_list', '...')] = NULL # remove these two from formals
supplied_args <- as.list(match.call())[-1] # get the supplied arguments
supplied_args[c('f', 'a_list')] = NULL # ...but remove the argument list and the function
a_list[names(supplied_args)] = supplied_args
do.call(what = f, args = a_list)
}
fn = function(x=2, y=3, z=5) {
print(c(x, y, z))
}
arg_list <- list(x=0, y=1)
hierarchical_do_call(f = fn, a_list = arg_list, y=7)
# x y z
# 0 7 5
I'm not sure how "elegant" this is, but here's my best attempt to satisfy the OP's requirements. The if/else logic is actually pretty straightforward (no nesting needed, per se). The real work is in collecting and sanitizing the three different input types (formal defaults, the list object, and any supplied arguments).
fn <- function(a_list = NULL, x = 2, y = 3, z = 5, ...) {
formal_args <- formals() # get the function's defined inputs and defaults
formal_args[names(formal_args) %in% c('a_list', '...')] <- NULL # remove these two from formals
supplied_args <- as.list(match.call())[-1] # get the supplied arguments
supplied_args['a_list'] <- NULL # ...but remove the argument list
# for each uniquely named item among the 3 inputs (argument list, defaults, and supplied args):
for (i in unique(c(names(a_list), names(formal_args), names(supplied_args)))) {
if (!is.null(supplied_args[[i]])) {
assign(i, supplied_args[[i]])
} else if (!is.null(a_list[[i]])) {
assign(i, a_list[[i]])
}
}
print(c(x, y, z))
}
arg_lst <- list(x = 0, y = 1)
fn(a_list = arg_lst, y=7)
[1] 0 7 5
With a little more digging into R's meta-programming functions, it's actually possible to pack this hierarchical assignment into its own function, which is designed to operate on the function environment that called it. This makes it easier to reuse this functionality, but it definitely breaks scope and should be considered dangerous.
The "hierarchical assignment" function, mostly the same as before:
hierarchical_assign <- function(a_list) {
formal_args <- formals(sys.function(-1)) # get the function's defined inputs and defaults
formal_args[names(formal_args) %in% c('a_list', '...')] <- NULL # remove these two from formals
supplied_args <- as.list(match.call(sys.function(-1), sys.call(-1)))[-1] # get the supplied arguments
supplied_args['a_list'] <- NULL # ...but remove the argument list
# for each uniquely named item among the 3 inputs (argument list, defaults, and supplied args):
for (i in unique(c(names(a_list), names(formal_args), names(supplied_args)))) {
if (!is.null(supplied_args[[i]])) {
assign(i, supplied_args[[i]], envir = parent.frame())
} else if (!is.null(a_list[[i]])) {
assign(i, a_list[[i]], envir = parent.frame())
}
}
}
And the usage. Note that the the calling function must have an argument named a_list, and it must be passed to hierarchical_assign.
fn <- function(a_list = NULL, x = 2, y = 3, z = 5, ...) {
hierarchical_assign(a_list)
print(c(x, y, z))
}
[1] 0 7 5
I think do.call() does exactly what you want. It accepts a function and a list as arguments, the list being arguments for the functions. I think you will need a wrapper function to create this behavior of "overwriting defaults"

Pass optional arguments to function, three dots

I'm confused how ... works.
tt = function(...) {
return(x)
}
Why doesn't tt(x = 2) return 2?
Instead it fails with the error:
Error in tt(x = 2) : object 'x' not found
Even though I'm passing x as argument ?
Because everything you pass in the ... stays in the .... Variables you pass that aren't explicitly captured by a parameter are not expanded into the local environment. The ... should be used for values your current function doesn't need to interact with at all, but some later function does need to use do they can be easily passed along inside the .... It's meant for a scenario like
ss <- function(x) {
x
}
tt <- function(...) {
return(ss(...))
}
tt(x=2)
If your function needs the variable x to be defined, it should be a parameter
tt <- function(x, ...) {
return(x)
}
If you really want to expand the dots into the current environment (and I strongly suggest that you do not), you can do something like
tt <- function(...) {
list2env(list(...), environment())
return(x)
}
if you define three dots as an argument for your function and want it to work, you need to tell your function where the dots actually go. in your example you are neither defining x as an argument, neither ... feature elsewhere in the body of your function. an example that actually works is:
tt <- function(x, ...){
mean(x, ...)
}
x <- c(1, 2, 3, NA)
tt(x)
#[1] NA
tt(x, na.rm = TRUE)
#[1] 2
here ... is referring to any other arguments that the function mean might take. additionally you have a regular argument x. in the first example tt(x) just returns mean(x), whilst in the second example tt(x, na.rm = TRUE), passes the second argument na.rm = TRUE to mean so tt returns mean(x, na.rm = TRUE).
Another way that the programmers of R use a lot is list(...) as in
tt <- function(...) {
args <- list(...) # As in this
if("x" %in% names(args))
return(args$x)
else
return("Something else.")
}
tt(x = 2)
#[1] 2
tt(y = 1, 2)
#[1] "Something else."
I believe that this is one of their favorite, if not the favorite, way of handling the dots arguments.

Return a function object with arguments partially filled

I want to define an R function that uses an existing function, with some arguments filled in. For instance, suppose I want to define a function meanNA as follows:
meanNA <- mean(na.rm = TRUE)
The idea is that when I call meanNA(x, trim = 0) I will call mean(x, trim = 0, na.rm = TRUE). Is this even possible in R? If so, what's the right way to implement this?
Just take your example, we can write:
meanNA <- function (...) mean(..., na.rm = TRUE)
Example
x <- c(1:4, NA)
meanNA(x)
# [1] 2.5

Pass arguments in nested function to update default arguments

I have nested functions and wish to pass arguments to the deepest function. That deepest function will already have default arguments, so I will be updating those argument values.
My mwe is using plot(), but in reality I'm working with png(), with default height and width arguments.
Any suggestions?
f1 <- function(...){ f2(...)}
f2 <- function(...){ f3(...)}
f3 <- function(...){ plot(xlab="hello1", ...)}
#this works
f1(x=1:10,y=rnorm(10),type='b')
# I want to update the default xlab value, but it fails:
f1(x=1:10,y=rnorm(10),type='b', xlab='hello2')
In your f3(), "hello1" is not a default value for xlab in the list of function's formal arguments. It is instead the supplied value in the function body, so there's no way to override it:
f3 <- function(...){ plot(xlab="hello1", ...)}
I suspect you meant instead to do something like this.
f1 <- function(...){ f2(...)}
f2 <- function(...){ f3(...)}
f3 <- function(..., xlab="hello1") plot(..., xlab=xlab)
## Then check that it works
par(mfcol=c(1,2))
f1(x=1:10,y=rnorm(10),type='b')
f1(x=1:10,y=rnorm(10),type='b', xlab='hello2')
(Do notice that the formal argument xlab must follow the ... argument here, so that it can only be matched exactly (and not by partial matching). Otherwise, in the absence of an argument named xlab, it'll get matched by an argument named x, potentially (and actually here) causing you a lot of grief.)
My usual approach for modifying arguments in ... is as follows:
f1 = function(...) {
dots = list(...)
if (!('ylab' %in% names(dots))) {
dots$ylab = 'hello'
}
do.call(plot, dots)
}
# check results
f1(x = 1:10, y = rnorm(10))
f1(x = 1:10, y = rnorm(10), ylab = 'hi')
What happens here is that ... is captured in a list called dots. Next, R checks if this list dots contains any information about ylab. If there is no information, we set it to a specified value. If there is information, we do nothing. Last, do.call(a, b) is a function that basically stands voor execute function a with arguments b.
edit
This works better with multiple default arguments (and probably also better in general).
f1 = function(...) {
# capture ... in a list
dots = list(...)
# default arguments with their values
def.vals = list(bty = 'n', xlab = 'hello', las = 1)
# find elements in dots by names of def.vals. store those that are NULL
ind = unlist(lapply(dots[names(def.vals)], is.null))
# fill empty elements with default values
dots[names(def.vals)[ind]] = def.vals[ind]
# do plot
do.call(plot, dots)
}
f1(x = 1:10, y = rnorm(10), ylab = 'hi', bty = 'l')

Resources