Difference between the def and useAsDefault parameters of setGeneric - r

In the documentation for the S4 function setGeneric, the difference between the def and useAsDefault parameters is poorly explained: the two seem to do the same thing, and the text that distinguishes them is confusing and lacks context. Could anybody provide a practical example where the two arguments behave differently?

Some context
For better or worse, the behaviour of setGeneric differs in quite subtle ways depending on its arguments. It is possible that some of these quirks are not intentional. The help page accessed by ?setGeneric and even the reference book Extending R (Chambers, 2016) follow the Pareto principle by spelling out the most frequent use cases and only sketching out the less frequent ones. In particular, they do not expound on useAsDefault, whose use is intended only for nonstandard generic functions.
S4 generic functions are designated by the formal class genericFunction, which has three direct subclasses: standardGeneric, nonstandardGenericFunction, and groupGenericFunction. To give context to the behaviour of setGeneric, we should distinguish between standard and nonstandard generic functions. (Group generic functions are defined by setGroupGeneric, not setGeneric, so we ignore them here.)
Standard generic functions only perform method dispatch: arguments are passed "as is" to the dispatched method, and the return value of the method is returned "as is" by the generic. Nonstandard generic functions perform method dispatch while allowing for pre-processing of arguments and post-processing of return values.
Hence if we define a generic function with setGeneric("zzz", def=), then in the standard case body(zzz) is just the call standardGeneric("zzz"), whereas in the nonstandard case it is a call containing the call standardGeneric("zzz"), typically of the form:
{
## optionally do some stuff with the arguments
val <- standardGeneric("zzz")
## optionally do some stuff with 'val', then return
}
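To make the nonstandard form concrete, here is a minimal sketch of a generic whose body post-processes the value of whichever method is dispatched. The names shapeArea and Circle are invented for illustration and are not part of the methods package.

```r
library(methods)

# Nonstandard generic: the body wraps the call to standardGeneric()
setGeneric("shapeArea", def = function(shape, ...) {
    val <- standardGeneric("shapeArea")  # dispatch to the selected method
    round(val, 2)                        # post-processing applied to every method's value
})

# Hypothetical class and method, used only for this illustration
setClass("Circle", representation(r = "numeric"))
setMethod("shapeArea", "Circle", function(shape, ...) pi * shape@r^2)

unit_circle_area <- shapeArea(new("Circle", r = 1))
print(unit_circle_area)                         # pi rounded to two decimal places
print(is(shapeArea, "nonstandardGenericFunction"))
```

The rounding happens in the generic, so it applies to any method defined later, not only to this one.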
To answer your question, let's restrict attention to calls to setGeneric of the form
setGeneric("zzz", def=, useAsDefault=)
where def is a function and useAsDefault is either a function or missing, and let's further assume that there is no existing function zzz.
Then setGeneric behaves in one of three ways, depending on body(def):
1. If body(def) does not contain the call standardGeneric("zzz"), then setGeneric constructs a standard generic function and assigns it to the symbol zzz in the calling environment. It associates with zzz a default method (of formal class derivedDefaultMethod), which is retrieved as zzz@default. The default method is useAsDefault if not missing and def otherwise.
setGeneric("zzz", def = function(x, ...) 1 + x, useAsDefault = function(x, ...) x)
## [1] "zzz"
zzz
## standardGeneric for "zzz" defined from package ".GlobalEnv"
##
## function (x, ...)
## standardGeneric("zzz")
## <environment: 0x14f048860>
## Methods may be defined for arguments: x
## Use showMethods(zzz) for currently available ones.
zzz@default
## Method Definition (Class "derivedDefaultMethod"):
##
## function (x, ...)
## x
##
## Signatures:
## x
## target "ANY"
## defined "ANY"
zzz(0)
## [1] 0
2. If body(def) is precisely the call standardGeneric("zzz"), then, as in Case 1, setGeneric constructs and assigns to zzz a standard generic function. However, here, the default method is determined entirely by useAsDefault. If useAsDefault is missing, then zzz gets no default method.
setGeneric("zzz", def = function(x, ...) standardGeneric("zzz"))
## [1] "zzz"
zzz
## standardGeneric for "zzz" defined from package ".GlobalEnv"
##
## function (x, ...)
## standardGeneric("zzz")
## <environment: 0x112010f60>
## Methods may be defined for arguments: x
## Use showMethods(zzz) for currently available ones.
zzz@default
## NULL
zzz(0)
## Error in (function (classes, fdef, mtable) :
## unable to find an inherited method for function 'zzz' for signature '"numeric"'
3. If body(def) is not precisely the call standardGeneric("zzz"), but does contain it, then setGeneric constructs and assigns to zzz a nonstandard generic function. As in Case 2, useAsDefault specifies the default method. However, here, the generic function inherits its body from def.
setGeneric("zzz", def = function(x, ...) 1 + standardGeneric("zzz"), useAsDefault = function(x, ...) x)
## [1] "zzz"
zzz
## nonstandardGenericFunction for "zzz" defined from package ".GlobalEnv"
##
## function (x, ...)
## 1 + standardGeneric("zzz")
## <environment: 0x123fd5410>
## Methods may be defined for arguments: x
## Use showMethods(zzz) for currently available ones.
zzz@default
## Method Definition (Class "derivedDefaultMethod"):
##
## function (x, ...)
## x
##
## Signatures:
## x
## target "ANY"
## defined "ANY"
zzz(0)
## [1] 1
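The examples above leave two variants unshown: Case 1 without useAsDefault, and Case 2 with it. A small sketch, using the placeholder names aaa and bbb (assumed not to exist already), suggests how they behave:

```r
library(methods)

# Case 1 without useAsDefault: def itself becomes the default method,
# and the generic's body is replaced by standardGeneric("aaa")
setGeneric("aaa", def = function(x, ...) 1 + x)
res_aaa <- aaa(0)

# Case 2 with useAsDefault: still a standard generic, now with a default method
setGeneric("bbb", def = function(x, ...) standardGeneric("bbb"),
           useAsDefault = function(x, ...) x)
res_bbb <- bbb(0)

print(res_aaa)   # value computed by def, now acting as the default method
print(res_bbb)   # value computed by useAsDefault
```

Read together with the three cases, this suggests that def supplies a default method only when its body never calls standardGeneric; otherwise the default has to come from useAsDefault.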

Related

Using a dot before a function in R [duplicate]

I am (probably) NOT referring to the "all other variables" meaning like var1~. here.
I was pointed to plyr once again and looked into mlply and wondered why parameters are defined with a leading dot like this:
function (.data, .fun = NULL, ..., .expand = TRUE, .progress = "none",
.parallel = FALSE)
{
if (is.matrix(.data) & !is.list(.data))
.data <- .matrix_to_df(.data)
f <- splat(.fun)
alply(.data = .data, .margins = 1, .fun = f, ..., .expand = .expand,
.progress = .progress, .parallel = .parallel)
}
<environment: namespace:plyr>
What's the use of that? Is it just personal preference, naming convention or more? Often R is so functional that I miss a trick that's long been done before.
A dot in a function name can mean any of the following:
nothing at all
a separator between method and class in S3 methods
to hide the function name
Possible meanings
1. Nothing at all
The dot in data.frame doesn't separate data from frame, other than visually.
2. Separation of methods and classes in S3 methods
plot is one example of an S3 generic function. Thus plot.lm and plot.glm are the underlying methods that are used when calling plot(lm(...)) or plot(glm(...)).
3. To hide internal functions
When writing packages, it is sometimes useful to use leading dots in function names because these functions are somewhat hidden from general view. Functions that are meant to be purely internal to a package sometimes use this.
In this context, "somewhat hidden" simply means that the variable (or function) won't normally show up when you list objects with ls(). To force ls to show these variables, use ls(all.names=TRUE). A leading dot affects only this listing; the variable is still looked up and used like any other. For example:
x <- 3
.x <- 4
ls()
[1] "x"
ls(all.names=TRUE)
[1] ".x" "x"
x
[1] 3
.x
[1] 4
4. Other possible reasons
In Hadley's plyr package, the convention is to use leading dots in argument names, as in .data and .fun above. This is a mechanism to try to ensure that, when variable names are resolved, they resolve to the user's variables rather than to the function's internal variables.
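One way to read this convention is as protection against clashes between a function's own arguments and arguments the user passes through `...`. The sketch below illustrates that reading; apply_plain, apply_dotted, and scale_by are hypothetical helpers written for this example, not plyr functions.

```r
# Hypothetical helpers: both forward extra arguments to the user's function
apply_plain  <- function(data, fun, ...) lapply(data, fun, ...)
apply_dotted <- function(.data, .fun, ...) lapply(.data, .fun, ...)

# A user function that happens to have an argument called `data`
scale_by <- function(x, data) x * data

# With dotted names, `data = 10` falls into ... and reaches scale_by
ok <- unlist(apply_dotted(1:3, scale_by, data = 10))
print(ok)

# With plain names, `data = 10` is matched to the helper's own `data`
# argument instead, so the call no longer does what the user intended
bad <- try(apply_plain(1:3, scale_by, data = 10), silent = TRUE)
print(inherits(bad, "try-error"))
```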
Complications
This mishmash of different uses can lead to very confusing situations, because these different uses can all get mixed up in the same function name.
For example, to convert a data.frame to a list you use as.list():
as.list(iris)
In this case as.list is an S3 generic, and you are passing a data.frame to it. Thus the S3 method that gets called is as.list.data.frame:
> as.list.data.frame
function (x, ...)
{
x <- unclass(x)
attr(x, "row.names") <- NULL
x
}
<environment: namespace:base>
And for something truly spectacular, load the data.table package and look at the function as.data.table.data.frame:
> library(data.table)
> methods(as.data.table)
[1] as.data.table.data.frame* as.data.table.data.table* as.data.table.matrix*
Non-visible functions are asterisked
> data.table:::as.data.table.data.frame
function (x, keep.rownames = FALSE)
{
if (keep.rownames)
return(data.table(rn = rownames(x), x, keep.rownames = FALSE))
attr(x, "row.names") = .set_row_names(nrow(x))
class(x) = c("data.table", "data.frame")
x
}
<environment: namespace:data.table>
At the start of a name it works like the UNIX filename convention to keep objects hidden by default.
ls()
character(0)
.a <- 1
ls()
character(0)
ls(all.names = TRUE)
[1] ".a"
It can be just a token with no special meaning; it does nothing more than any other allowed token.
my.var <- 1
my_var <- 1
myVar <- 1
It's used for S3 method dispatch. So, if I define simple class "myClass" and create objects with that class attribute, then generic functions such as print() will automatically dispatch to my specific print method.
myvar <- 1
print(myvar)
class(myvar) <- c("myClass", class(myvar))
print.myClass <- function(x, ...) {
print(paste("a special message for myClass objects, this one has length", length(x)))
return(invisible(NULL))
}
print(myvar)
There is an ambiguity in the syntax for S3, since you cannot tell from a function's name whether it is an S3 method or just a dot in the name. But, it's a very simple mechanism that is very powerful.
There's a lot more to each of these three aspects, and you should not take my examples as good practice, but they are the basic differences.
If a user defines a function .doSomething and is too lazy to write roxygen documentation for all of its parameters, this will not generate errors when compiling the package.

Use `callNextMethod()` with dotsMethods

I would like to define some S4 generics dispatching on the ... argument such that the more specialized methods call the inherited method through callNextMethod(). However, as illustrated by the MWE, this fails with the following error.
# sample function which returns the number of its arguments
f <- function(...) length(list(...))
setGeneric("f")
## [1] "f"
setMethod("f", "character", function(...){ print("character"); callNextMethod() })
## [1] "f"
f(1, 2, 3)
## [1] 3
f("a", "b", "c")
## [1] "character"
## Error in callNextMethod(): a call to callNextMethod() appears in a call to '.Method', but the call does not seem to come from either a generic function or another 'callNextMethod'
This behavior doesn't seem right to me, but maybe I'm missing something here. I would expect the failing callNextMethod() to dispatch to the inherited default method function(...) length(list(...)) effectively returning:
## [1] "character"
## [1] 3
Any thoughts on this?
Update
Additionally, I've found the following difference in behavior between S4 methods dispatching on formal arguments and ones dispatching on .... Consider the following example where switching the signature from x to ... changes the way objects are resolved.
f = function(x, ..., a = b) {
b = "missing 'a'"
cat(a)
}
f()
## missing 'a'
f(a = 1)
## 1
setGeneric("f", signature = "x")
f()
## missing 'a'
setGeneric("f", signature = "...")
f()
## Error in cat(a) : object 'b' not found
According to ?dotsMethods the dispatch on ... is implemented differently, but as suggested in the last sentence, this shouldn't cause any differences in behavior compared to regular generics. However, the above findings seem to prove the opposite.
Methods dispatching on “...” were introduced in version 2.8.0 of R. The initial implementation of the corresponding selection and dispatch is in an R function, for flexibility while the new mechanism is being studied. In this implementation, a local version of standardGeneric is inserted in the generic function's environment. The local version selects a method according to the criteria above and calls that method, from the environment of the generic function. This is slightly different from the action taken by the C implementation when “...” is not involved. Aside from the extra computing time required, the method is evaluated in a true function call, as opposed to the special context constructed by the C version (which cannot be exactly replicated in R code.) However, situations in which different computational results would be obtained have not been encountered so far, and seem very unlikely.

Method dispatching in R based on presence of specific parameters

I have two functions, f1(...) and f2(...). I would like to group them under a single function f(...) and conditionally pass the parameters of f(...) to either f1 or f2. If f(...) is passed a parameter called special.param, then I will call f2(...). Otherwise I will call f1(...). I don't believe UseMethod can handle this since it will check for the class of the first parameter rather than the presence of a certain parameter. What is the correct way to do this? Is there a way to check the names of the parameters in ...?
If these are your functions and p is your special parameter
f1 <- function(..., p) "f1"
f2 <- function(..., p) "f2"
In S4 (maybe that's not what you're looking for...) you could write a generic that dispatches on the special parameter
setGeneric("f", function(..., p) standardGeneric("f"),
signature="p", useAsDefault=f1)
and implement a method that is invoked when the parameter is missing
setMethod("f", "missing", f2)
A more symmetric implementation with the same consequence would be
setGeneric("f", function(..., p) standardGeneric("f"), signature="p")
setMethod("f", "ANY", f1)
setMethod("f", "missing", f2)
with
> f(p=1)
[1] "f1"
> f()
[1] "f2"
A simpler base R alternative (implied by a comment and deleted answer) is
f <- function(..., p) {
if (missing(p))
f2(...)
else
f1(..., p=p)
}
This would become tedious and error prone if there were more than two alternatives for p (e.g., missing vs. numeric vs. logical) or if dispatch were on more than 1 argument f(x, ..., p). The S4 approach also means that available methods are discoverable (showMethods(f)) but carry additional documentation and NAMESPACE burdens when implemented in a package.
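The question also asks whether the names of the arguments in ... can be inspected. One way, sketched here with the question's special.param and with f1 and f2 redefined as stand-ins, is to test names(list(...)). Note that this evaluates the arguments and matches the name exactly.

```r
# Stand-ins for the two underlying functions
f1 <- function(...) "f1"
f2 <- function(...) "f2"

f <- function(...) {
    # names(list(...)) evaluates the arguments and returns their names
    # (NULL when no argument is named)
    if ("special.param" %in% names(list(...))) f2(...) else f1(...)
}

without_param <- f(1, 2)
with_param <- f(1, special.param = TRUE)
print(c(without_param, with_param))
```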

How do I see code for methods written with the methods/RMethodUtils package?

I recently ran across what may be yet another OO paradigm in R.
library(RSQLite)
> dbReadTable
standardGeneric for "dbReadTable" defined from package "DBI"
defined with value class: "data.frame"
function (conn, name, ...)
.valueClassTest(standardGeneric("dbReadTable"), "data.frame",
"dbReadTable")
<environment: 0x1d252198>
Methods may be defined for arguments: conn, name
Use showMethods("dbReadTable") for currently available ones.
> showMethods('dbReadTable')
Function: dbReadTable (package DBI)
conn="SQLiteConnection", name="character"
Two questions:
Does this correspond to a new paradigm not listed here? Or is this just a way of manipulating e.g. S4 classes?
How do I see the source for dbReadTable's methods?
As usual for S4 methods, just call getMethod() with the signature of the method you're interested in examining:
## Use showMethods to view signatures of dbReadTable's methods
showMethods('dbReadTable')
# Function: dbReadTable (package DBI)
# conn="SQLiteConnection", name="character"
## getMethod's 2nd argument is a character vector containing method's signature
getMethod("dbReadTable", c("SQLiteConnection", "character"))
# Method Definition:
#
# function (conn, name, ...)
# sqliteReadTable(conn, name, ...)
# <environment: namespace:RSQLite>
#
# Signatures:
# conn name
# target "SQLiteConnection" "character"
# defined "SQLiteConnection" "character"
And then after seeing the above, you'll probably want to have a look at the code returned by:
sqliteReadTable
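The same lookup can be tried without RSQLite installed. The following is a small self-contained sketch; the generic describeThing and the class Thing are hypothetical and exist only to show getMethod() retrieving a method by its signature.

```r
library(methods)

# Hypothetical generic, class, and method, defined only for this illustration
setGeneric("describeThing", function(obj, label, ...) standardGeneric("describeThing"))
setClass("Thing", representation(id = "numeric"))
setMethod("describeThing", signature("Thing", "character"),
          function(obj, label, ...) paste(label, obj@id))

# Retrieve the method by its signature, as with dbReadTable above
m <- getMethod("describeThing", c("Thing", "character"))
print(is(m, "MethodDefinition"))
print(body(m))   # the code of the method
```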
