How does the envir option of do.call work? - r

The documentation of do.call states:
If quote is FALSE, the default, then the arguments are evaluated (in the calling environment, not in envir).
This sentence would suggest to me that when quote = FALSE, specifying envir makes no difference. However, that's not the case, and in fact I've encountered cases where I need to specify envir to get the function to work.
Simplest reproducible example:
g1 <- function(x) {
args <- as.list(match.call())
args[[1]] <- NULL # remove the function call
do.call(print, args, quote = FALSE) # call print()
}
g2 <- function(x) {
args <- as.list(match.call())
args[[1]] <- NULL # remove the function call
do.call(print, args, quote = FALSE, envir = parent.frame()) # call print(), specifying envir
}
h1 <- function(x, y) {
g1(x*y)
}
h2 <- function(x, y) {
g2(x*y)
}
With these functions, h2() behaves as one would think but h1() does not:
h1(2, 3)
#Error in print(x) : object 'y' not found
h2(2, 3)
#[1] 6
y <- 100
h1(2, 3)
#[1] 600
## Looks like g1() took the value of y from the global environment
h2(2, 3)
#[1] 6
Can anybody explain to me what's going on here?
Note: There's a related post here but by my reading the answers don't specifically state what do.call() does with the envir variable.

?do.call says:
envir
an environment within which to evaluate the call. This will be
most useful if what is a character string and the arguments are
symbols or quoted expressions.
We can easily illustrate this if the what= argument of do.call is a character string. Then envir= determines where it is looked up.
e <- new.env()
e$f <- function() 2
f <- function() 3
do.call("f", list())
## [1] 3
do.call("f", list(), envir = e)
## [1] 2
The same is true for the arguments as the code in the question shows. Note that the arguments are already quoted since match.call() is being used.
What is happening in the case of h1 and g1 is that this is effectively run within g1
do.call(print, list(call("*", quote(x), quote(y))), quote = FALSE)
Now it finds x in g1 (since g1 has one argument x) but there is no y in g1 so it looks to the parent environment of g1 which is the global environment where it finds y.
In the case of h2 and g2 it runs this in g2:
do.call(print, list(call("*", quote(x), quote(y))), quote = FALSE, envir = parent.frame())
and it finds x and y in h2 which is the parentframe of g2.
Note that the parent environment is not the same as the parent frame:
the parent environment is determined by where the function was defined so if the function was defined in the global environment then its parent environment is the global environment.
the parent frame is the environment of the caller

Related

Non-standard evaluation in a user-defined function with lapply or with in R

I wrote a wrapper around ftable because I need to compute flat tables with frequency and percentage for many variables. As ftable method for class "formula" uses non-standard evaluation, the wrapper relies on do.call and match.call to allow the use of the subset argument of ftable (more details in my previous question).
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]))
# etc
}
However, I cannot use this wrapper with lapply nor with:
# example 1: error with "lapply"
lapply(X = warpbreaks[c("breaks",
"wool",
"tension")],
FUN = mytable,
row.vars = 1)
Error in (function (x, ...) : object 'X' not found
# example 2: error with "with"
with(data = warpbreaks[warpbreaks$tension == "L", ],
expr = mytable(wool))
Error in (function (x, ...) : object 'wool' not found
These errors seem to be due to match.call not being evaluated in the right environment.
As this question is closely linked to my previous one, here is a sum up of my problems:
The wrapper with do.call and match.call cannot be used with lapply or with.
The wrapper without do.call and match.call cannot use the subset argument of ftable.
And a sum up of my questions:
How can I write a wrapper which allows both to use the subset argument of ftable and to be used with lapply and with? I have ideas to avoid the use of lapply and with, but I am looking to understand and correct these errors to improve my knowledge of R.
Is the error with lapply related to the following note from ?lapply?
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g., bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. This is not
normally a problem, but it can be if FUN uses sys.call or match.call
or if it is a primitive function that makes use of the call. This
means that it is often safer to call primitive functions with a
wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is
required to ensure that method dispatch for is.numeric occurs
correctly.
The problem with using match.call with lapply is that match.call returns the literal call that passed into it, without any interpretation. To see what's going on, let's make a simpler function which shows exactly how your function is interpreting the arguments passed into it:
match_call_fun <- function(...) {
call = as.list(match.call()[-1])
print(call)
}
When we call it directly, match.call correctly gets the arguments and puts them in a list that we can use with do.call:
match_call_fun(iris['Species'], 9)
[[1]]
iris["Species"]
[[2]]
[1] 9
But watch what happens when we use lapply (I've only included the output of the internal print statement):
lapply('Species', function(x) match_call_fun(iris[x], 9))
[[1]]
iris[x]
[[2]]
[1] 9
Since match.call gets the literal arguments passed to it, it receives iris[x], not the properly interpreted iris['Species'] that we want. When we pass those arguments into ftable with do.call, it looks for an object x in the current environment, and then returns an error when it can't find it. We need to interpret
As you've seen, adding envir = parent.frame() fixes the problem. This is because, adding that argument tells do.call to evaluate iris[x] in the parent frame, which is the anonymous function in lapply where x has it's proper meaning. To see this in action, let's make another simple function that uses do.call to print ls from 3 different environmental levels:
z <- function(...) {
print(do.call(ls, list()))
print(do.call(ls, list(), envir = parent.frame()))
print(do.call(ls, list(), envir = parent.frame(2)))
}
When we call z() from the global environment, we see the empty environment inside the function, then the Global Environment:
z()
character(0) # Interior function environment
[1] "match_call_fun" "y" "z" # GlobalEnv
[1] "match_call_fun" "y" "z" # GlobalEnv
But when we call from within lapply, we see that one level of parent.frame up is the anonymous function in lapply:
lapply(1, z)
character(0) # Interior function environment
[1] "FUN" "i" "X" # lapply
[1] "match_call_fun" "y" "z" # GlobalEnv
So, by adding envir = parent.frame(), do.call knows to evaluate iris[x] in the lapply environment where it knows that x is actually 'Species', and it evaluates correctly.
mytable_envir <- function(...) {
tab <- do.call(what = ftable,
args = as.list(match.call()[-1]),
envir = parent.frame())
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
round(x = margin,
digits = 1)
}
# This works!
lapply(X = c("breaks","wool","tension"),
FUN = function(x) mytable_envir(warpbreaks[x],row.vars = 1))
As for why adding envir = parent.frame() makes a difference since that appears to be the default option. I'm not 100% sure, but my guess is that when the default argument is used, parent.frame is evaluated inside the do.call function, returning the environment in which do.call is run. What we're doing, however, is calling parent.frame outside do.call, which means it returns one level higher than the default version.
Here's a test function that takes parent.frame() as a default value:
fun <- function(y=parent.frame()) {
print(y)
print(parent.frame())
print(parent.frame(2))
print(parent.frame(3))
}
Now look at what happens when we call it from within lapply both with and without passing in parent.frame() as an argument:
lapply(1, function(y) fun())
<environment: 0x12c5bc1b0> # y argument
<environment: 0x12c5bc1b0> # parent.frame called inside
<environment: 0x12c5bc760> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
lapply(1, function(y) fun(y = parent.frame()))
<environment: 0x104931358> # y argument
<environment: 0x104930da8> # parent.frame called inside
<environment: 0x104931358> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
In the first example, the value of y is the same as what you get when you call parent.frame() inside the function. In the second example, the value of y is the same as the environment one level up (inside lapply). So, while they look the same, they're actually doing different things: in the first example, parent.frame is being evaluated inside the function when it sees that there is no y= argument, in the second, parent.frame is evaluated in the lapply anonymous function first, before calling fun, and then is passed into it.
As you only want to pass all the arguments passed to ftable u do not need the do.call().
mytable <- function(...) {
tab <- ftable(...)
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
return(round(x = margin,
digits = 1))
}
The following lapply creates a table for every Variable separatly i don't know if that is what you want.
lapply(X = c("breaks",
"wool",
"tension"),
FUN = function(x) mytable(warpbreaks[x],
row.vars = 1))
If you want all 3 variables in 1 table
warpbreaks$newVar <- LETTERS[3:4]
lapply(X = cbind("c(\"breaks\", \"wool\", \"tension\")",
"c(\"newVar\", \"tension\",\"wool\")"),
FUN = function(X)
eval(parse(text=paste("mytable(warpbreaks[,",X,"],
row.vars = 1)")))
)
Thanks to this issue, the wrapper became:
# function 1
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]),
envir = parent.frame())
# etc
}
Or:
# function 2
mytable <- function(...) {
mc <- match.call()
mc[[1]] <- quote(expr = ftable)
eval.parent(expr = mc)
# etc
}
I can now use the subset argument of ftable, and use the wrapper in lapply:
lapply(X = warpbreaks[c("wool",
"tension")],
FUN = function(x) mytable(formula = x ~ breaks,
data = warpbreaks,
subset = breaks < 15))
However I do not understand why I have to supply envir = parent.frame() to do.call as it is a default argument.
More importantly, these methods do not resolve another issue: I can not use the subset argument of ftable with mapply.

ISO a good way to let a function accept a mix of supplied arguments, arguments from a list, and defaults

I would like to have a function accept arguments in the usual R way, most of which will have defaults. But I would also like it to accept a list of named arguments corresponding to some or some or all of the formals. Finally, I would like arguments supplied to the function directly, and not through the list, to override the list arguments where they conflict.
I could do this with a bunch of nested if-statements. But I have a feeling there is some elegant, concise, R-ish programming-on-the-language solution -- probably multiple such solutions -- and I would like to learn to use them. To show the kind of solution I am looking for:
> arg_lst <- list(x=0, y=1)
> fn <- function(a_list = NULL, x=2, y=3, z=5, ...){
<missing code>
print(c(x, y, z))
}
> fn(a_list = arg_list, y=7)
Desired output:
x y z
0 7 5
I like a lot about #jdobres's approach, but I don't like the use of assign and the potential scoping breaks.
I also don't like the premise, that a function should be written in a special way for this to work. Wouldn't it be better to write a wrapper, much like do.call, to work this way with any function? Here is that approach:
Edit: solution based off of purrr::invoke
Thinking a bit more about this, purrr::invoke almost get's there - but it will result in an error if a list argument is also passed to .... But we can make slight modifications to the code and get a working version more concisely. This version seems more robust.
library(purrr)
h_invoke = function (.f, .x = NULL, ..., .env = NULL) {
.env <- .env %||% parent.frame()
args <- c(list(...), as.list(.x)) # switch order so ... is first
args = args[!duplicated(names(args))] # remove duplicates
do.call(.f, args, envir = .env)
}
h_invoke(fn, arg_list, y = 7)
# [1] 0 7 5
Original version borrowing heavily from jdobres's code:
hierarchical_do_call = function(f, a_list = NULL, ...){
formal_args = formals() # get the function's defined inputs and defaults
formal_args[names(formal_args) %in% c('f', 'a_list', '...')] = NULL # remove these two from formals
supplied_args <- as.list(match.call())[-1] # get the supplied arguments
supplied_args[c('f', 'a_list')] = NULL # ...but remove the argument list and the function
a_list[names(supplied_args)] = supplied_args
do.call(what = f, args = a_list)
}
fn = function(x=2, y=3, z=5) {
print(c(x, y, z))
}
arg_list <- list(x=0, y=1)
hierarchical_do_call(f = fn, a_list = arg_list, y=7)
# x y z
# 0 7 5
I'm not sure how "elegant" this is, but here's my best attempt to satisfy the OP's requirements. The if/else logic is actually pretty straightforward (no nesting needed, per se). The real work is in collecting and sanitizing the three different input types (formal defaults, the list object, and any supplied arguments).
fn <- function(a_list = NULL, x = 2, y = 3, z = 5, ...) {
formal_args <- formals() # get the function's defined inputs and defaults
formal_args[names(formal_args) %in% c('a_list', '...')] <- NULL # remove these two from formals
supplied_args <- as.list(match.call())[-1] # get the supplied arguments
supplied_args['a_list'] <- NULL # ...but remove the argument list
# for each uniquely named item among the 3 inputs (argument list, defaults, and supplied args):
for (i in unique(c(names(a_list), names(formal_args), names(supplied_args)))) {
if (!is.null(supplied_args[[i]])) {
assign(i, supplied_args[[i]])
} else if (!is.null(a_list[[i]])) {
assign(i, a_list[[i]])
}
}
print(c(x, y, z))
}
arg_lst <- list(x = 0, y = 1)
fn(a_list = arg_lst, y=7)
[1] 0 7 5
With a little more digging into R's meta-programming functions, it's actually possible to pack this hierarchical assignment into its own function, which is designed to operate on the function environment that called it. This makes it easier to reuse this functionality, but it definitely breaks scope and should be considered dangerous.
The "hierarchical assignment" function, mostly the same as before:
hierarchical_assign <- function(a_list) {
formal_args <- formals(sys.function(-1)) # get the function's defined inputs and defaults
formal_args[names(formal_args) %in% c('a_list', '...')] <- NULL # remove these two from formals
supplied_args <- as.list(match.call(sys.function(-1), sys.call(-1)))[-1] # get the supplied arguments
supplied_args['a_list'] <- NULL # ...but remove the argument list
# for each uniquely named item among the 3 inputs (argument list, defaults, and supplied args):
for (i in unique(c(names(a_list), names(formal_args), names(supplied_args)))) {
if (!is.null(supplied_args[[i]])) {
assign(i, supplied_args[[i]], envir = parent.frame())
} else if (!is.null(a_list[[i]])) {
assign(i, a_list[[i]], envir = parent.frame())
}
}
}
And the usage. Note that the the calling function must have an argument named a_list, and it must be passed to hierarchical_assign.
fn <- function(a_list = NULL, x = 2, y = 3, z = 5, ...) {
hierarchical_assign(a_list)
print(c(x, y, z))
}
[1] 0 7 5
I think do.call() does exactly what you want. It accepts a function and a list as arguments, the list being arguments for the functions. I think you will need a wrapper function to create this behavior of "overwriting defaults"

How can I add additional arguments to methods for internal generics?

I want to implement an inset method for my class myClass for the internal generic [<- (~ help(Extract)).
This method should run a bunch of tests, before passing on the actual insetting off to [<- via NextMethod().
I understand that:
any method has to include at least the arguments of the generic (mine does, I think)
the NextMethod() call does not usually need any arguments (though supplying them manually doesn't seem to help either).
Here's my reprex:
x <- c(1,2)
class(x) <- c("myClass", "numeric")
`[<-.myClass` <- function(x, i, j, value, foo = TRUE, ...) {
if (foo) {
stop("'foo' must be false!")
}
NextMethod()
}
x[1] <- 3 # this errors out with *expected* error message, so dispatch works
x[1, foo = FALSE] <- 3 # this fails with "incorrect number of subscripts
What seems to be happening is that NextMethod() also passes on foo to the internal generic [<-, which mistakes foo for another index, and, consequently errors out (because, in this case, x has no second dimension to index on).
I also tried supplying the arguments explicitly no NextMethod(), but this also fails (see reprex below the break).
How can I avoid choking up NextMethod() with additional arguments to my method?
(Bonus: Does anyone know good resources for building methods for internal generics? #Hadleys adv-r is a bit short on the matter).
Reprex with explicit arguments:
x <- c(1,2)
class(x) <- c("myClass", "numeric")
`[<-.myClass` <- function(x, i = NULL, j = NULL, value, foo = TRUE, ...) {
if (foo) {
stop("'foo' must be false!")
}
NextMethod(generic = "`[<-`", object = x, i = i, j = j, value = value, ...)
}
x[1] <- 3 # this errors out with expected error message, so dispatch works
x[1, foo = FALSE] <- 3 # this fails with "incorrect number of subscripts
I don't see an easy way around this except to strip the class (which makes a copy of x)
`[<-.myClass` <- function(x, i, value, ..., foo = TRUE) {
if (foo) {
cat("hi!")
x
} else {
class_x <- class(x)
x <- unclass(x)
x[i] <- value
class(x) <- class_x
x
}
}
x <- structure(1:2, class = "myClass")
x[1] <- 3
#> hi!
x[1, foo = FALSE] <- 3
x
#> [1] 3 2
#> attr(,"class")
#> [1] "myClass"
This is not a general approach - it's only needed for [, [<-, etc because they don't use the regular rules for argument matching:
Note that these operations do not match their index arguments in the standard way: argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is equivalent to m[2, 1] and not to m[1, 2].
(from the "Argument matching" section in ?`[`)
That means your x[1, foo = FALSE] is equivalent to x[1, FALSE] and then you get an error message because x is not a matrix.
Approaches that don't work:
Supplying additional arguments to NextMethod(): this can only increase the number of arguments, not decrease it
Unbinding foo with rm(foo): this leads to an error about undefined foo.
Replacing foo with a missing symbol: this leads to an error that foo is not supplied with no default argument.
Here's how I understand it, but I don't know so much about that subject so I hope I don't say too many wrong things.
From ?NextMethod
NextMethod invokes the next method (determined by the class vector,
either of the object supplied to the generic, or of the first argument
to the function containing NextMethod if a method was invoked
directly).
Your class vector is :
x <- c(1,2)
class(x) <- "myClass" # note: you might want class(x) <- c("myClass", class(x))
class(x) # [1] "myClass"
So you have no "next method" here, and [<-.default, doesn't exist.
What would happen if we define it ?
`[<-.default` <- function(x, i, j, value, ...) {print("default"); value}
x[1, foo = FALSE] <- 3
# [1] "default"
x
# [1] 3
If there was a default method with a ... argument it would work fine as the foo argument would go there, but it's not the case so I believe NextMethod just cannot be called as is.
You could do the following to hack around the fact that whatever is called doesn't like to be fed a foo argument:
`[<-.myClass` <- function(x, i, j, value, foo = FALSE, ...) {
if (foo) {
stop("'foo' must be false!")
}
`[<-.myClass` <- function(x, i, j, value, ...) NextMethod()
args <- as.list(match.call())[-1]
args <- args[names(args) %in% c("","x","i","j","value")]
do.call("[<-",args)
}
x[1, foo = FALSE] <- 3
x
# [1] 3 2
# attr(,"class")
# [1] "myClass"
Another example, with a more complex class :
library(data.table)
x <- as.data.table(iris[1:2,1:2])
class(x) <- c("myClass",class(x))
x[1, 2, foo = FALSE] <- 9999
# Sepal.Length Sepal.Width
# 1: 5.1 9999
# 2: 4.9 3
class(x)
# [1] "myClass" "data.table" "data.frame"
This would fail if the next method had other arguments than x, i, j and value, in that case better to be explicit about our additional arguments and run args <- args[! names(args) %in% c("foo","bar")]. Then it might work (as long as arguments are given explicitly as match.call doesn't catch default arguments). I couldn't test this though as I don't know such method for [<-.

R: S3 Method dispatch depending on arguments

I have a generic function foo that I want to call three different ways depending on the arguments given to it.
foo <- function(...) UseMethod("foo")
#default
foo.default <- function(x, y, ...) {
#does some magic
print("this is the default method")
}
#formula
foo.formula <- function(formula, data = list(), ...) {
print("this is the formula method")
}
#data.frame
foo.data.frame <- function(data, x, y, ...) {
print("this is the data.frame method")
}
In the following I'm going to show how I am expecting the method dispatch to work but the outputs are presented under each call...
mydata <- data.frame(x=c(1,2,3,4),y=c(5,6,7,8))
#ways to call default function
foo(x = mydata$x, y = mydata$y)
#[1] "this is the default method"
#ways to call formula
foo(formula = mydata$x~mydata$y)
#[1] "this is the formula method"
foo(formula = x~y, data = mydata)
#[1] "this is the formula method"
foo(data = mydata, formula = x~y) #ERROR
#[1] "this is the data.frame method"
#ways to call data.frame method
foo(data = mydata, x = x, y = y)
#[1] "this is the data.frame method"
foo(x = x, y = y, data = mydata) #ERROR
#Error in foo(x = x, y = y, data = mydata) : object 'x' not found
from what I can tell, the method used depends on the class of the first argument. Essentially, I would like for the method dispatch to depend on the arguments passed to the generic function foo and not the first argument.
I would like the dispatch to have the following priority:
If the formula argument is present the formula method is used (data argument should be optional here)
Then, if no formula argument is found, if data argument is present use data.frame method (which requires x and y arguments)
else foo expects the x and y arguments or it will fail.
Note
I would like to avoid defining the generic function foo as follows
foo <- function(formula, data,...) UseMethod("foo")
while this would fix all my issues (I believe all except the last case), this will cause a devtools::check() warning because the some of S3 functions will not have the same arguments as the generic function and will no longer be consistent (specifically foo.default and foo.data.frame). And I wouldn't like to include the missing arguments because those methods do not have use for those arguments.
As Thomas has pointed out, this is not the standard behavior for S3 classes. If you really want to stick to S3, however, you could write your functions so as to "mimick" UseMethod, even though it won't be pretty and is probably not what you want to do. Nevertheless, here an idea that is based on capturing all arguments first, and then checking for the presence of your "preferred" argument type:
Get some objects first:
a <- 1; class(a) <- "Americano"
b <- 2; class(b) <- "Espresso"
Let the function in question capture all arguments with dots, and then check for the presence of an argument type in order of your preference:
drink <- function(...){
dots <- list(...)
if(any(sapply(dots, function(cup) class(cup)=="Americano"))){
drink.Americano(...)
} else { # you can add more checks here to get a hierarchy
# try to find appropriate method first if one exists,
# using the first element of the arguments as usual
tryCatch(get(paste0("drink.", class(dots[[1]])))(),
# if no appropriate method is found, try the default method:
error = function(e) drink.default(...))
}
}
drink.Americano <- function(...) print("Hmm, gimme more!")
drink.Espresso <- function(...) print("Tripple, please!")
drink.default <- function(...) print("Any caffeine in there?")
drink(a) # "Americano", dispatch hard-coded.
# [1] "Hmm, gimme more!"
drink(b) # "Espresso", not hard-coded, but correct dispatch anyway
# [1] "Tripple, please!"
drink("sthelse") # Dispatches to default method
# [1] "Any caffeine in there?"
drink(a,b,"c")
# [1] "Hmm, gimme more!"
drink(b,"c", a)
# [1] "Hmm, gimme more!"

Function to assign a value in a new environment in R

I'm trying to write a function 'exported' in R that will assign a value to a name in a desired environment (say .GlobalEnv). I'd like to use the following syntax.
# desired semantics: x <- 60
exported(x) <- 60
# ok if quotes prove necessary.
exported("x") <- 60
I've tried several variations. Most basically:
`export<-` <- function(x, obj) {
call <- as.list(match.call())
elem <- as.character(call[[2]])
assign(elem, obj, .GlobalEnv)
get(elem, .GlobalEnv)
}
exported(x) <- 50
The foregoing gives an error about the last argument being unused. The following complains that "object 'x' is not found."
setGeneric("exported<-", function(object, ...) {
standardGeneric("exported<-")
})
setReplaceMethod("exported", "ANY", function(object, func) {
call <- as.list(match.call())
name <- as.character(call$object)
assign(name, func, other.env)
assign(name, func, .GlobalEnv)
get(name, .GlobalEnv)
})
exported(x) <- 50
The above approach using a character vector in place of a name yields "target of assignment expands to non-language object."
Is this possible in R?
EDIT: I would actually like to do more work inside 'exported.' Code was omitted for brevity. I also realize I can use do something like:
exported(name, func) {
...
}
but am interested in seeing if my syntax is possible.
I can't understand why you wouldn't use assign?
assign( "x" , 60 , env = .GlobalEnv )
x
[1] 60
The env argument specifies the environment into which to assign the variable.
e <- new.env()
assign( "y" , 50 , env = e )
ls( env = e )
[1] "y"

Resources