Unexpected behavior from `...` in a function's definition - r

NB: This question is not a duplicate of How to use R's ellipsis feature when writing your own function?. That question asks how to use ellipses, and in particular, "how can [one] convert the ellipsis from the function's signature into, for example, a list?" Below I don't ask anything of the sort. I am asking why my use of ellipses fails to produce the expected result.
As an example, the function my.ls below is meant to be a wrapper around base::ls that makes all.names = TRUE the default:
my.ls <- function (...) base::ls(..., all.names = TRUE)
I had expected that, with this definition, my.ls() would produce the same value as base::ls(all.names = TRUE) does. To my surprise, the values from these two expressions are not even close.
For example:
% /usr/bin/R --quiet --vanilla
> x <- 3; .y <- 1; z <- 4
> base::ls(all.names = TRUE)
[1] "x" ".y" "z"
> my.ls <- function (...) base::ls(..., all.names = TRUE)
> my.ls()
[1] "..."
What am I doing wrong?

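Is it because the default argument for envir in base::ls is as.environment(pos)? With the defaults, ls() lists the environment it was called from, which here is the body of my.ls, whose only binding is `...`.
Adding envir = globalenv() seems to work for me, assuming you are working from the global environment.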

OK, I figured out the problem. The following implementation of my.ls is closer to what I'm after:
my.ls <- function (name, all.names = TRUE, ...) {
  if (missing(name))
    return(my.ls(parent.frame(1), all.names = all.names, ...))
  base::ls(name, all.names = all.names, ...)
}
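The behaviour can be reproduced, and worked around for the no-argument case, with a sketch like the one below. It assumes that listing the caller's environment is what the wrapper should do; the names naive_ls, my_ls and demo are made up for illustration.

```r
# The wrapper from the question: ls() sees the wrapper's own frame.
naive_ls <- function(...) base::ls(..., all.names = TRUE)

# Assumption: we want the environment of whoever called the wrapper.
# parent.frame() as a default is evaluated inside my_ls, so it points
# at my_ls's caller.
my_ls <- function(..., envir = parent.frame()) {
  base::ls(..., envir = envir, all.names = TRUE)
}

demo <- function() {
  x <- 3; .y <- 1; z <- 4
  list(naive = naive_ls(), fixed = my_ls())
}

res <- demo()
res$naive   # only "...", the single binding in naive_ls's frame
res$fixed   # the three variables created inside demo()
```

Because envir is supplied explicitly, a name passed positionally through ... would not change which environment is listed, so this sketch only covers calls without a name argument.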

Related

Non-standard evaluation in a user-defined function with lapply or with in R

I wrote a wrapper around ftable because I need to compute flat tables with frequency and percentage for many variables. As ftable method for class "formula" uses non-standard evaluation, the wrapper relies on do.call and match.call to allow the use of the subset argument of ftable (more details in my previous question).
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]))
# etc
}
However, I cannot use this wrapper with either lapply or with:
# example 1: error with "lapply"
lapply(X = warpbreaks[c("breaks",
"wool",
"tension")],
FUN = mytable,
row.vars = 1)
Error in (function (x, ...) : object 'X' not found
# example 2: error with "with"
with(data = warpbreaks[warpbreaks$tension == "L", ],
expr = mytable(wool))
Error in (function (x, ...) : object 'wool' not found
These errors seem to be due to match.call not being evaluated in the right environment.
As this question is closely linked to my previous one, here is a summary of my problems:
The wrapper with do.call and match.call cannot be used with lapply or with.
The wrapper without do.call and match.call cannot use the subset argument of ftable.
And a summary of my questions:
How can I write a wrapper which allows both to use the subset argument of ftable and to be used with lapply and with? I have ideas to avoid the use of lapply and with, but I am looking to understand and correct these errors to improve my knowledge of R.
Is the error with lapply related to the following note from ?lapply?
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g., bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. This is not
normally a problem, but it can be if FUN uses sys.call or match.call
or if it is a primitive function that makes use of the call. This
means that it is often safer to call primitive functions with a
wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is
required to ensure that method dispatch for is.numeric occurs
correctly.
The problem with using match.call with lapply is that match.call returns the literal call that was passed into it, without any interpretation. To see what's going on, let's make a simpler function which shows exactly how your function is interpreting the arguments passed into it:
match_call_fun <- function(...) {
call = as.list(match.call()[-1])
print(call)
}
When we call it directly, match.call correctly gets the arguments and puts them in a list that we can use with do.call:
match_call_fun(iris['Species'], 9)
[[1]]
iris["Species"]
[[2]]
[1] 9
But watch what happens when we use lapply (I've only included the output of the internal print statement):
lapply('Species', function(x) match_call_fun(iris[x], 9))
[[1]]
iris[x]
[[2]]
[1] 9
Since match.call gets the literal arguments passed to it, it receives iris[x], not the properly interpreted iris['Species'] that we want. When we pass those arguments into ftable with do.call, it looks for an object x in the current environment, and then returns an error when it can't find it. We need to interpret
As you've seen, adding envir = parent.frame() fixes the problem. This is because adding that argument tells do.call to evaluate iris[x] in the parent frame, which is the anonymous function in lapply where x has its proper meaning. To see this in action, let's make another simple function that uses do.call to print ls from three different environment levels:
z <- function(...) {
print(do.call(ls, list()))
print(do.call(ls, list(), envir = parent.frame()))
print(do.call(ls, list(), envir = parent.frame(2)))
}
When we call z() from the global environment, we see the empty environment inside the function, then the Global Environment:
z()
character(0) # Interior function environment
[1] "match_call_fun" "y" "z" # GlobalEnv
[1] "match_call_fun" "y" "z" # GlobalEnv
But when we call it from within lapply, we see that one level up from z is the frame of the lapply call itself, which holds FUN, i and X:
lapply(1, z)
character(0) # Interior function environment
[1] "FUN" "i" "X" # lapply
[1] "match_call_fun" "y" "z" # GlobalEnv
So, by adding envir = parent.frame(), do.call knows to evaluate iris[x] in the lapply environment where it knows that x is actually 'Species', and it evaluates correctly.
mytable_envir <- function(...) {
tab <- do.call(what = ftable,
args = as.list(match.call()[-1]),
envir = parent.frame())
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
round(x = margin,
digits = 1)
}
# This works!
lapply(X = c("breaks","wool","tension"),
FUN = function(x) mytable_envir(warpbreaks[x],row.vars = 1))
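The effect of the envir argument can be isolated from ftable with a smaller sketch. The helpers shout and shout_broken are hypothetical and simply forward their unevaluated arguments to toupper(); the only difference between them is whether envir = parent.frame() is supplied.

```r
# Forwards the unevaluated arguments and evaluates them in the caller.
shout <- function(...) {
  args <- as.list(match.call()[-1])   # unevaluated expressions, e.g. `elem`
  do.call(toupper, args, envir = parent.frame())
}

# Same thing without envir: the expressions are evaluated in the frame
# of shout_broken, where `elem` does not exist.
shout_broken <- function(...) {
  args <- as.list(match.call()[-1])
  do.call(toupper, args)
}

ok <- lapply(c("a", "b"), function(elem) shout(elem))
bad <- tryCatch(
  lapply(c("a", "b"), function(elem) shout_broken(elem)),
  error = function(e) conditionMessage(e)
)

ok    # a list holding "A" and "B"
bad   # the error message raised because `elem` could not be found
```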
As for why adding envir = parent.frame() makes a difference when that appears to be the default value anyway: I'm not 100% sure, but my guess is that when the default argument is used, parent.frame() is evaluated inside the do.call function, returning the environment from which do.call was called (the body of our wrapper). What we're doing, however, is calling parent.frame() outside do.call, in the wrapper itself, which means it returns one level higher than the default version.
Here's a test function that takes parent.frame() as a default value:
fun <- function(y=parent.frame()) {
print(y)
print(parent.frame())
print(parent.frame(2))
print(parent.frame(3))
}
Now look at what happens when we call it from within lapply both with and without passing in parent.frame() as an argument:
lapply(1, function(y) fun())
<environment: 0x12c5bc1b0> # y argument
<environment: 0x12c5bc1b0> # parent.frame called inside
<environment: 0x12c5bc760> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
lapply(1, function(y) fun(y = parent.frame()))
<environment: 0x104931358> # y argument
<environment: 0x104930da8> # parent.frame called inside
<environment: 0x104931358> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
In the first example, the value of y is the same as what you get when you call parent.frame() inside the function. In the second example, the value of y is the same as the environment one level up (inside lapply). So, while they look the same, they're actually doing different things: in the first example, parent.frame is being evaluated inside the function when it sees that there is no y= argument, in the second, parent.frame is evaluated in the lapply anonymous function first, before calling fun, and then is passed into it.
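The same rule (default arguments are evaluated inside the function, supplied arguments in the caller) can be checked without looking at environment addresses. This is a minimal sketch; where and caller are hypothetical names.

```r
# Returns whatever environment its argument evaluates to.
where <- function(env = environment()) env

caller <- function() {
  here <- environment()
  c(
    # default: environment() runs inside where(), so it is where's frame
    default_is_caller  = identical(where(), here),
    # supplied: environment() runs here, before being passed in
    supplied_is_caller = identical(where(environment()), here)
  )
}

res <- caller()
res
# default_is_caller is FALSE, supplied_is_caller is TRUE
```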
As you only want to pass all the arguments on to ftable, you do not need do.call().
mytable <- function(...) {
tab <- ftable(...)
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
return(round(x = margin,
digits = 1))
}
The following lapply creates a table for every variable separately; I don't know if that is what you want.
lapply(X = c("breaks",
"wool",
"tension"),
FUN = function(x) mytable(warpbreaks[x],
row.vars = 1))
If you want all 3 variables in 1 table:
warpbreaks$newVar <- LETTERS[3:4]
lapply(X = cbind("c(\"breaks\", \"wool\", \"tension\")",
"c(\"newVar\", \"tension\",\"wool\")"),
FUN = function(X)
eval(parse(text=paste("mytable(warpbreaks[,",X,"],
row.vars = 1)")))
)
Thanks to this issue, the wrapper became:
# function 1
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]),
envir = parent.frame())
# etc
}
Or:
# function 2
mytable <- function(...) {
mc <- match.call()
mc[[1]] <- quote(expr = ftable)
eval.parent(expr = mc)
# etc
}
I can now use the subset argument of ftable, and use the wrapper in lapply:
lapply(X = warpbreaks[c("wool",
"tension")],
FUN = function(x) mytable(formula = x ~ breaks,
data = warpbreaks,
subset = breaks < 15))
However I do not understand why I have to supply envir = parent.frame() to do.call as it is a default argument.
More importantly, these methods do not resolve another issue: I cannot use the subset argument of ftable with mapply.

R: S3 Method dispatch depending on arguments

I have a generic function foo that I want to call three different ways depending on the arguments given to it.
foo <- function(...) UseMethod("foo")
#default
foo.default <- function(x, y, ...) {
#does some magic
print("this is the default method")
}
#formula
foo.formula <- function(formula, data = list(), ...) {
print("this is the formula method")
}
#data.frame
foo.data.frame <- function(data, x, y, ...) {
print("this is the data.frame method")
}
In the following I'm going to show how I am expecting the method dispatch to work but the outputs are presented under each call...
mydata <- data.frame(x=c(1,2,3,4),y=c(5,6,7,8))
#ways to call default function
foo(x = mydata$x, y = mydata$y)
#[1] "this is the default method"
#ways to call formula
foo(formula = mydata$x~mydata$y)
#[1] "this is the formula method"
foo(formula = x~y, data = mydata)
#[1] "this is the formula method"
foo(data = mydata, formula = x~y) #ERROR
#[1] "this is the data.frame method"
#ways to call data.frame method
foo(data = mydata, x = x, y = y)
#[1] "this is the data.frame method"
foo(x = x, y = y, data = mydata) #ERROR
#Error in foo(x = x, y = y, data = mydata) : object 'x' not found
From what I can tell, the method used depends on the class of the first argument. Essentially, I would like the method dispatch to depend on which arguments are passed to the generic function foo, and not on the first argument.
I would like the dispatch to have the following priority:
If the formula argument is present the formula method is used (data argument should be optional here)
Then, if no formula argument is found, if data argument is present use data.frame method (which requires x and y arguments)
else foo expects the x and y arguments or it will fail.
Note
I would like to avoid defining the generic function foo as follows
foo <- function(formula, data,...) UseMethod("foo")
While this would fix all my issues (I believe all except the last case), it will cause a devtools::check() warning because some of the S3 methods will not have the same arguments as the generic function and will no longer be consistent (specifically foo.default and foo.data.frame). And I would rather not include the missing arguments, because those methods have no use for them.
As Thomas has pointed out, this is not the standard behavior for S3 classes. If you really want to stick to S3, however, you could write your functions so as to "mimic" UseMethod, even though it won't be pretty and is probably not what you want to do. Nevertheless, here is an idea that is based on capturing all arguments first, and then checking for the presence of your "preferred" argument type:
Get some objects first:
a <- 1; class(a) <- "Americano"
b <- 2; class(b) <- "Espresso"
Let the function in question capture all arguments with dots, and then check for the presence of an argument type in order of your preference:
drink <- function(...) {
  dots <- list(...)
  # inherits() also copes with objects that carry more than one class
  if (any(vapply(dots, inherits, logical(1), what = "Americano"))) {
    drink.Americano(...)
  } else { # you can add more checks here to get a hierarchy
    # try to find an appropriate method first if one exists,
    # using the first element of the arguments as usual
    tryCatch(get(paste0("drink.", class(dots[[1]])[1]))(),
             # if no appropriate method is found, try the default method:
             error = function(e) drink.default(...))
  }
}
drink.Americano <- function(...) print("Hmm, gimme more!")
drink.Espresso <- function(...) print("Tripple, please!")
drink.default <- function(...) print("Any caffeine in there?")
drink(a) # "Americano", dispatch hard-coded.
# [1] "Hmm, gimme more!"
drink(b) # "Espresso", not hard-coded, but correct dispatch anyway
# [1] "Tripple, please!"
drink("sthelse") # Dispatches to default method
# [1] "Any caffeine in there?"
drink(a,b,"c")
# [1] "Hmm, gimme more!"
drink(b,"c", a)
# [1] "Hmm, gimme more!"
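Applied to the priority order in the question (formula first, then data, then the default), the same idea can dispatch on argument names instead of classes. This is only a sketch, not S3 dispatch: foo2 and its three helpers are hypothetical plain functions, and the argument names are read without evaluating the arguments.

```r
foo2_formula    <- function(formula, data = list(), ...) "formula"
foo2_data_frame <- function(data, x, y, ...) "data.frame"
foo2_default    <- function(x, y, ...) "default"

foo2 <- function(...) {
  # Names of the supplied arguments, taken from the unevaluated call
  arg_names <- names(substitute(list(...)))[-1]
  if ("formula" %in% arg_names) {
    foo2_formula(...)
  } else if ("data" %in% arg_names) {
    foo2_data_frame(...)
  } else {
    foo2_default(...)
  }
}

mydata <- data.frame(x = c(1, 2, 3, 4), y = c(5, 6, 7, 8))
r1 <- foo2(data = mydata, formula = x ~ y)   # formula wins regardless of order
r2 <- foo2(y = 2, data = mydata, x = 1)      # data present, no formula
r3 <- foo2(x = mydata$x, y = mydata$y)       # neither formula nor data
```

Only named arguments are recognised here, so a formula passed positionally would fall through to the default branch.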

Pass optional arguments to function, three dots

I'm confused how ... works.
tt = function(...) {
return(x)
}
Why doesn't tt(x = 2) return 2?
Instead it fails with the error:
Error in tt(x = 2) : object 'x' not found
Even though I'm passing x as argument ?
Because everything you pass in the ... stays in the .... Variables you pass that aren't explicitly captured by a parameter are not expanded into the local environment. The ... should be used for values your current function doesn't need to interact with at all, but that some later function does need, so they can be easily passed along inside the .... It's meant for a scenario like
ss <- function(x) {
x
}
tt <- function(...) {
return(ss(...))
}
tt(x=2)
If your function needs the variable x to be defined, it should be a parameter
tt <- function(x, ...) {
return(x)
}
If you really want to expand the dots into the current environment (and I strongly suggest that you do not), you can do something like
tt <- function(...) {
list2env(list(...), environment())
return(x)
}
If you define three dots as an argument for your function and want it to work, you need to tell your function where the dots actually go. In your example you neither define x as an argument nor use ... anywhere in the body of your function. An example that actually works is:
tt <- function(x, ...){
mean(x, ...)
}
x <- c(1, 2, 3, NA)
tt(x)
#[1] NA
tt(x, na.rm = TRUE)
#[1] 2
Here ... refers to any other arguments that the function mean might take. Additionally, you have a regular argument x. In the first example, tt(x) just returns mean(x), whilst in the second example, tt(x, na.rm = TRUE) passes the second argument na.rm = TRUE to mean, so tt returns mean(x, na.rm = TRUE).
Another way that the programmers of R use a lot is list(...) as in
tt <- function(...) {
args <- list(...) # As in this
if("x" %in% names(args))
return(args$x)
else
return("Something else.")
}
tt(x = 2)
#[1] 2
tt(y = 1, 2)
#[1] "Something else."
I believe that this is one of their favorite, if not the favorite, way of handling the dots arguments.
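The list(...) idiom also makes it possible to give a forwarded argument a different default while still letting the caller override it. The sketch below assumes a hypothetical wrapper mean_safe around mean, with na.rm = TRUE as the new default.

```r
mean_safe <- function(x, ...) {
  defaults <- list(na.rm = TRUE)
  # Anything the caller passes by name replaces the matching default
  args <- modifyList(defaults, list(...))
  do.call(mean, c(list(x), args))
}

v <- c(1, 2, 3, NA)
a <- mean_safe(v)                  # 2
b <- mean_safe(v, na.rm = FALSE)   # NA
```

This works for named extras only; modifyList matches list elements by name.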

R: how to find what S3 method will be called on an object?

I know about methods(), which returns all methods for a given class. Suppose I have x and I want to know what method will be called when I call foo(x). Is there a oneliner or package that will do this?
The shortest I can think of is:
sapply(class(x), function(y) try(getS3method('foo', y), silent = TRUE))
and then to check the class of the results... but is there not a builtin for this?
Update
The full one liner would be:
fm <- function (x, method) {
cls <- c(class(x), 'default')
results <- lapply(cls, function(y) try(getS3method(method, y), silent = TRUE))
Find(function (x) class(x) != 'try-error', results)
}
This will work with most things but be aware that it might fail with some complex objects. For example, according to ?S3Methods, calling foo on matrix(1:4, 2, 2) would try foo.matrix, then foo.numeric, then foo.default; whereas this code will just look for foo.matrix and foo.default.
findMethod defined below is not a one-liner but its body has only 4 lines of code (and if we required that the generic be passed as a character string it could be reduced to 3 lines of code). It will return a character string representing the name of the method that would be dispatched by the input generic given that generic and its arguments. (Replace the last line of the body of findMethod with get(X(...)) if you want to return the method itself instead.) Internally it creates a generic X and an X method corresponding to each method of the input generic such that each X method returns the name of the method of the input generic that would be run. The X generic and its methods are all created within the findMethod function so they disappear when findMethod exits. To get the result we just run X with the input argument(s) as the final line of the findMethod function body.
findMethod <- function(generic, ...) {
ch <- deparse(substitute(generic))
f <- X <- function(x, ...) UseMethod("X")
for(m in methods(ch)) assign(sub(ch, "X", m, fixed = TRUE), "body<-"(f, value = m))
X(...)
}
Now test it. (Note that the one-liner in the question fails with an error in several of these tests but findMethod gives the expected result.)
findMethod(as.ts, iris)
## [1] "as.ts.default"
findMethod(print, iris)
## [1] "print.data.frame"
findMethod(print, Sys.time())
## [1] "print.POSIXct"
findMethod(print, 22)
## [1] "print.default"
# in this example it looks at 2nd component of class vector as no print.ordered exists
class(ordered(3))
## [1] "ordered" "factor"
findMethod(print, ordered(3))
## [1] "print.factor"
findMethod(`[`, BOD, 1:2, "Time")
## [1] "[.data.frame"
I use this:
s3_method <- function(generic, class, env = parent.frame()) {
fn <- get(generic, envir = env)
ns <- asNamespace(topenv(fn))
tbl <- ns$.__S3MethodsTable__.
for (c in class) {
name <- paste0(generic, ".", c)
if (exists(name, envir = tbl, inherits = FALSE)) {
return(get(name, envir = tbl))
}
if (exists(name, envir = globalenv(), inherits = FALSE)) {
return(get(name, envir = globalenv()))
}
}
NULL
}
For simplicity this doesn't return methods defined by assignment in the calling environment. The global environment is checked for convenience during development. These are the same rules used in r-lib packages.
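A variant of the approach in the question avoids try() by using the optional argument of utils::getS3method, which returns NULL when no method is found. This is a sketch under the same limitation noted above: it walks the explicit class vector plus "default" and ignores implicit classes. The name find_s3 is made up.

```r
find_s3 <- function(generic, x) {
  for (cls in c(class(x), "default")) {
    m <- utils::getS3method(generic, cls, optional = TRUE)
    if (!is.null(m)) {
      return(paste0(generic, ".", cls))   # name of the first match
    }
  }
  NULL   # no method at all, not even a default
}

m1 <- find_s3("print", iris)
m2 <- find_s3("print", Sys.time())
```

In a plain R session these give the same answers as the findMethod calls shown earlier for iris and Sys.time().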

The arguments of rm() in language R

I am new to R, and I found something odd.
When using rm(), I wonder why I can't pass ls() as an argument,
while rm(list = ls()) works.
ls() returns a list, doesn't it?
This is the first time I have asked a question on a foreign website, and my English is terrible, sorry! Waiting for your answers!
It has to do with the ... special argument in R (AKA "dot-dot-dot" or "ellipsis"). ... captures all unnamed arguments (as well as named arguments that match none of the function's formal arguments), positionally.
See ?rm for its arguments: rm(..., list = character(), pos = -1, envir = as.environment(pos), inherits = FALSE).
Since ... is the first argument, it captures ls() in rm(ls()).
But there are expectations on ... as you can see in the source code of rm (simply type rm at the command line):
function (..., list = character(), pos = -1, envir = as.environment(pos),
inherits = FALSE)
{
dots <- match.call(expand.dots = FALSE)$...
if (length(dots) && !all(vapply(dots, function(x) is.symbol(x) ||
is.character(x), NA, USE.NAMES = FALSE)))
stop("... must contain names or character strings")
names <- vapply(dots, as.character, "")
if (length(names) == 0L)
names <- character()
list <- .Primitive("c")(list, names)
.Internal(remove(list, envir, inherits))
}
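Here the check fails because ls() is a call: it is neither a symbol nor a character string, so both is.symbol() and is.character() return FALSE.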
Maybe it will be easier with an example:
foo <- 1L
bar <- 2L
rm(ls())
# Error
ls()
# [1] "bar" "foo"
rm(c("foo", "bar"))
# Same error
rm("foo", "bar")
# OK
If you want to investigate further, I suggest: debugonce(rm) then rm(ls()) then step by step execution (easier in an IDE like RStudio).
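The check that rm performs on its dots can be reproduced outside rm. The sketch below uses a hypothetical function dots_ok that captures ... unevaluated in the same way, and then shows rm(list = ...) on a throwaway environment so that nothing in the workspace is touched.

```r
# TRUE for each argument that rm() would accept in `...`
dots_ok <- function(...) {
  dots <- match.call(expand.dots = FALSE)$...
  vapply(dots, function(d) is.symbol(d) || is.character(d), NA)
}

checks <- dots_ok(foo, "bar", ls())
checks   # TRUE TRUE FALSE: a bare name, a string, and a call

# rm(list = ...) takes a character vector of names, which is what ls() returns
e <- new.env()
assign("foo", 1L, envir = e)
assign("bar", 2L, envir = e)
rm(list = ls(e), envir = e)
remaining <- ls(e)   # character(0)
```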
