Method dispatching in R based on presence of specific parameters - r

I have two function, f1(...) and f2(...). I would like to group them under a single function f(...) and conditionally pass the parameters of f(...) to either f1 or f2. If f(...) is passed a parameter called special.param, then I will call f2(...). Otherwise I will call f1(...). I don't believe UseMethod can handle this since it will check for the class of the first parameter rather than the presence of a certain parameter. What is the correct way to do this? Is there a way to check the names of the parameters in ...?

If these are your functions and p is your special parameter
f1 <- function(..., p) "f1"
f2 <- function(..., p) "f2"
In S4 (maybe that's not what you're looking for...) you could write a generic that dispatches on the special parameter
setGeneric("f", function(..., p) standardGeneric("f"),
signature="p", useAsDefault=f1)
and implement a method that is invoked when the parameter is missing
setMethod("f", "missing", f2)
A more symmetric implementation with the same consequence would be
setGeneric("f", function(..., p) standardGeneric("f"), signature="p")
setMethod("f", "ANY", f1)
setMethod("f", "missing", f2)
with
> f(p=1)
[1] "f1"
> f()
[1] "f2"
A simpler base R alternative (implied by a comment and deleted answer) is
f <- function(..., p) {
if (missing(p))
f2(...)
else
f1(..., p=p)
}
This would become tedious and error prone if there were more than two alternatives for p (e.g., missing vs. numeric vs. logical) or if dispatch were on more than 1 argument f(x, ..., p). The S4 approach also means that available methods are discoverable (showMethods(f)) but carry additional documentation and NAMESPACE burdens when implemented in a package.

Related

An imported package is an object of what kind? [duplicate]

I am (probably) NOT referring to the "all other variables" meaning like var1~. here.
I was pointed to plyr once again and looked into mlplyand wondered why parameters are defined with leading dot like this:
function (.data, .fun = NULL, ..., .expand = TRUE, .progress = "none",
.parallel = FALSE)
{
if (is.matrix(.data) & !is.list(.data))
.data <- .matrix_to_df(.data)
f <- splat(.fun)
alply(.data = .data, .margins = 1, .fun = f, ..., .expand = .expand,
.progress = .progress, .parallel = .parallel)
}
<environment: namespace:plyr>
What's the use of that? Is it just personal preference, naming convention or more? Often R is so functional that I miss a trick that's long been done before.
A dot in function name can mean any of the following:
nothing at all
a separator between method and class in S3 methods
to hide the function name
Possible meanings
1. Nothing at all
The dot in data.frame doesn't separate data from frame, other than visually.
2. Separation of methods and classes in S3 methods
plot is one example of a generic S3 method. Thus plot.lm and plot.glm are the underlying function definitions that are used when calling plot(lm(...)) or plot(glm(...))
3. To hide internal functions
When writing packages, it is sometimes useful to use leading dots in function names because these functions are somewhat hidden from general view. Functions that are meant to be purely internal to a package sometimes use this.
In this context, "somewhat hidden" simply means that the variable (or function) won't normally show up when you list object with ls(). To force ls to show these variables, use ls(all.names=TRUE). By using a dot as first letter of a variable, you change the scope of the variable itself. For example:
x <- 3
.x <- 4
ls()
[1] "x"
ls(all.names=TRUE)
[1] ".x" "x"
x
[1] 3
.x
[1] 4
4. Other possible reasons
In Hadley's plyr package, he uses the convention to use leading dots in function names. This as a mechanism to try and ensure that when resolving variable names, the values resolve to the user variables rather than internal function variables.
Complications
This mishmash of different uses can lead to very confusing situations, because these different uses can all get mixed up in the same function name.
For example, to convert a data.frame to a list you use as.list(..)
as.list(iris)
In this case as.list is a S3 generic method, and you are passing a data.frame to it. Thus the S3 function is called as.list.data.frame:
> as.list.data.frame
function (x, ...)
{
x <- unclass(x)
attr(x, "row.names") <- NULL
x
}
<environment: namespace:base>
And for something truly spectacular, load the data.table package and look at the function as.data.table.data.frame:
> library(data.table)
> methods(as.data.table)
[1] as.data.table.data.frame* as.data.table.data.table* as.data.table.matrix*
Non-visible functions are asterisked
> data.table:::as.data.table.data.frame
function (x, keep.rownames = FALSE)
{
if (keep.rownames)
return(data.table(rn = rownames(x), x, keep.rownames = FALSE))
attr(x, "row.names") = .set_row_names(nrow(x))
class(x) = c("data.table", "data.frame")
x
}
<environment: namespace:data.table>
At the start of a name it works like the UNIX filename convention to keep objects hidden by default.
ls()
character(0)
.a <- 1
ls()
character(0)
ls(all.names = TRUE)
[1] ".a"
It can be just a token with no special meaning, it's not doing anything more than any other allowed token.
my.var <- 1
my_var <- 1
myVar <- 1
It's used for S3 method dispatch. So, if I define simple class "myClass" and create objects with that class attribute, then generic functions such as print() will automatically dispatch to my specific print method.
myvar <- 1
print(myvar)
class(myvar) <- c("myClass", class(myvar))
print.myClass <- function(x, ...) {
print(paste("a special message for myClass objects, this one has length", length(x)))
return(invisible(NULL))
}
print(myvar)
There is an ambiguity in the syntax for S3, since you cannot tell from a function's name whether it is an S3 method or just a dot in the name. But, it's a very simple mechanism that is very powerful.
There's a lot more to each of these three aspects, and you should not take my examples as good practice, but they are the basic differences.
If a user defines a function .doSomething and is lazy to specify all the roxygen documentation for parameters, it will not generate errors for compiling the package

Using a dot before a function in R [duplicate]

I am (probably) NOT referring to the "all other variables" meaning like var1~. here.
I was pointed to plyr once again and looked into mlplyand wondered why parameters are defined with leading dot like this:
function (.data, .fun = NULL, ..., .expand = TRUE, .progress = "none",
.parallel = FALSE)
{
if (is.matrix(.data) & !is.list(.data))
.data <- .matrix_to_df(.data)
f <- splat(.fun)
alply(.data = .data, .margins = 1, .fun = f, ..., .expand = .expand,
.progress = .progress, .parallel = .parallel)
}
<environment: namespace:plyr>
What's the use of that? Is it just personal preference, naming convention or more? Often R is so functional that I miss a trick that's long been done before.
A dot in function name can mean any of the following:
nothing at all
a separator between method and class in S3 methods
to hide the function name
Possible meanings
1. Nothing at all
The dot in data.frame doesn't separate data from frame, other than visually.
2. Separation of methods and classes in S3 methods
plot is one example of a generic S3 method. Thus plot.lm and plot.glm are the underlying function definitions that are used when calling plot(lm(...)) or plot(glm(...))
3. To hide internal functions
When writing packages, it is sometimes useful to use leading dots in function names because these functions are somewhat hidden from general view. Functions that are meant to be purely internal to a package sometimes use this.
In this context, "somewhat hidden" simply means that the variable (or function) won't normally show up when you list object with ls(). To force ls to show these variables, use ls(all.names=TRUE). By using a dot as first letter of a variable, you change the scope of the variable itself. For example:
x <- 3
.x <- 4
ls()
[1] "x"
ls(all.names=TRUE)
[1] ".x" "x"
x
[1] 3
.x
[1] 4
4. Other possible reasons
In Hadley's plyr package, he uses the convention to use leading dots in function names. This as a mechanism to try and ensure that when resolving variable names, the values resolve to the user variables rather than internal function variables.
Complications
This mishmash of different uses can lead to very confusing situations, because these different uses can all get mixed up in the same function name.
For example, to convert a data.frame to a list you use as.list(..)
as.list(iris)
In this case as.list is a S3 generic method, and you are passing a data.frame to it. Thus the S3 function is called as.list.data.frame:
> as.list.data.frame
function (x, ...)
{
x <- unclass(x)
attr(x, "row.names") <- NULL
x
}
<environment: namespace:base>
And for something truly spectacular, load the data.table package and look at the function as.data.table.data.frame:
> library(data.table)
> methods(as.data.table)
[1] as.data.table.data.frame* as.data.table.data.table* as.data.table.matrix*
Non-visible functions are asterisked
> data.table:::as.data.table.data.frame
function (x, keep.rownames = FALSE)
{
if (keep.rownames)
return(data.table(rn = rownames(x), x, keep.rownames = FALSE))
attr(x, "row.names") = .set_row_names(nrow(x))
class(x) = c("data.table", "data.frame")
x
}
<environment: namespace:data.table>
At the start of a name it works like the UNIX filename convention to keep objects hidden by default.
ls()
character(0)
.a <- 1
ls()
character(0)
ls(all.names = TRUE)
[1] ".a"
It can be just a token with no special meaning, it's not doing anything more than any other allowed token.
my.var <- 1
my_var <- 1
myVar <- 1
It's used for S3 method dispatch. So, if I define simple class "myClass" and create objects with that class attribute, then generic functions such as print() will automatically dispatch to my specific print method.
myvar <- 1
print(myvar)
class(myvar) <- c("myClass", class(myvar))
print.myClass <- function(x, ...) {
print(paste("a special message for myClass objects, this one has length", length(x)))
return(invisible(NULL))
}
print(myvar)
There is an ambiguity in the syntax for S3, since you cannot tell from a function's name whether it is an S3 method or just a dot in the name. But, it's a very simple mechanism that is very powerful.
There's a lot more to each of these three aspects, and you should not take my examples as good practice, but they are the basic differences.
If a user defines a function .doSomething and is lazy to specify all the roxygen documentation for parameters, it will not generate errors for compiling the package

Use `callNextMethod()` with dotsMethods

I would like to define some S4 generics dispatching on the ... argument such that the more specialized methods call the inherited method through callNextMethod(). However, as illustrated by the MWE, this fails with the following error.
# sample function which returns the number of its arguments
f <- function(...) length(list(...))
setGeneric("f")
## [1] "f"
setMethod("f", "character", function(...){ print("character"); callNextMethod() })
## [1] "f"
f(1, 2, 3)
## [1] 3
f("a", "b", "c")
## [1] "character"
## Error in callNextMethod(): a call to callNextMethod() appears in a call to '.Method', but the call does not seem to come from either a generic function or another 'callNextMethod'
This behavior doesn't seem right to me, but maybe I'm missing something here. I would expect the failing callNextMethod() to dispatch to the inherited default method function(...) length(list(...)) effectively returning:
## [1] "character"
## [1] 3
Any thoughts on this?
Update
Additionally, I've found the following difference in behavior between S4 methods dispatching on formal arguments and ones dispatching on .... Consider the following example where switching the signature from x to ... changes the way objects are resolved.
f = function(x, ..., a = b) {
b = "missing 'a'"
cat(a)
}
f()
## missing 'a'
f(a = 1)
## 1
setGeneric("f", signature = "x")
f()
## missing 'a'
setGeneric("f", signature = "...")
f()
## Error in cat(a) : object 'b' not found
According to ?dotsMethods the dispatch on ... is implemented differently, but as suggested in the last sentence, this shouldn't cause any differences in behavior compared to regular generics. However, the above findings seem to prove the opposite.
Methods dispatching on “...” were introduced in version 2.8.0 of R. The initial implementation of the corresponding selection and dispatch is in an R function, for flexibility while the new mechanism is being studied. In this implementation, a local version of setGeneric is inserted in the generic function's environment. The local version selects a method according to the criteria above and calls that method, from the environment of the generic function. This is slightly different from the action taken by the C implementation when “...” is not involved. Aside from the extra computing time required, the method is evaluated in a true function call, as opposed to the special context constructed by the C version (which cannot be exactly replicated in R code.) However, situations in which different computational results would be obtained have not been encountered so far, and seem very unlikely.

how to get variable names from list

I have list of functions which also contains one user defined function:
> fun <- function(x) {x}
> funs <- c(median, mean, fun)
Is it possible to get function names as strings from this list? My only workaround so far was to create vector which contains function names as strings:
> fun.names <- c("median", "mean", "fun")
When I want to get variable name I use to do this trick (if this is not correct correct me please) but as you can see it only work for one variable not for list:
> as.character(substitute(mean))
[1] "mean"
> as.character(substitute(funs))
[1] "funs"
Is there something that will work also for list? Is there any difference if list contains functions or data types?
EDIT:
I need to pass this list of functions (plus another data) to another function. Then those functions from list will be applied to dataset. Function names are needed because if there are several functions passed in list I want to being able to determine which function was applied. So far I've been using this:
window.size <- c(1,2,3)
combinations <- expand.grid(window.size, c(median, mean))
combinations <- cbind(combinations, rep(c("median","mean"), each = length(window.size)))
Generally speaking, this is not possible. Consider this definition of funs:
funs <- c(median,mean,function(x) x);
In this case, there's no name associated with the user-defined function at all. There's no rule in R that says all functions must be bound to a name at any point in time.
If you want to start making some assumptions about whether and where all such lambdas are defined, then possibilities open up.
One idea is to search the closure environment of each function for an entry that matches (identically) to the function itself, and then use that name. This will incur a performance penalty due to the comparison work, but may be tolerable if you don't have to run it repetitively:
getFunNameFromClosure <- function(fun) names(which(do.call(c,eapply(environment(fun),identical,fun)))[1L]);
Demo:
fun <- function(x) x;
funs <- c(median,mean,fun);
sapply(funs,getFunNameFromClosure);
## [1] "median" "mean" "fun"
Caveats:
1: As explained earlier, this will not work on functions that were never bound to a name. Furthermore, it will not work on functions whose closure environment does not contain a binding to the function. This could happen if the function was bound to a name in a different environment than its closure (via a return value, superassignment, or assign() call) or if its closure environment was explicitly changed.
2: It is possible to bind a function to multiple names. Thus, the name you get as a result of the eapply() search may not be the one you expect. Here's a good demonstration of this:
getFunNameFromClosure(ls); ## gets wrong name
## [1] "objects"
identical(ls,objects); ## this is why
## [1] TRUE
Here is a hacky approach:
funs <- list(median, mean)
fun_names = sapply(funs, function(x) {
s = as.character(deparse(eval(x)))[[2]]
gsub('UseMethod\\(|[[:punct:]]', '', s)
})
names(funs) <- fun_names
funs
$median
function (x, na.rm = FALSE)
UseMethod("median")
<bytecode: 0x103252878>
<environment: namespace:stats>
$mean
function (x, ...)
UseMethod("mean")
<bytecode: 0x103ea11b8>
<environment: namespace:base>
combinations <- expand.grid(window.size, fun_names, c(median, mean))

What does the dot mean in R – personal preference, naming convention or more?

I am (probably) NOT referring to the "all other variables" meaning like var1~. here.
I was pointed to plyr once again and looked into mlplyand wondered why parameters are defined with leading dot like this:
function (.data, .fun = NULL, ..., .expand = TRUE, .progress = "none",
.parallel = FALSE)
{
if (is.matrix(.data) & !is.list(.data))
.data <- .matrix_to_df(.data)
f <- splat(.fun)
alply(.data = .data, .margins = 1, .fun = f, ..., .expand = .expand,
.progress = .progress, .parallel = .parallel)
}
<environment: namespace:plyr>
What's the use of that? Is it just personal preference, naming convention or more? Often R is so functional that I miss a trick that's long been done before.
A dot in function name can mean any of the following:
nothing at all
a separator between method and class in S3 methods
to hide the function name
Possible meanings
1. Nothing at all
The dot in data.frame doesn't separate data from frame, other than visually.
2. Separation of methods and classes in S3 methods
plot is one example of a generic S3 method. Thus plot.lm and plot.glm are the underlying function definitions that are used when calling plot(lm(...)) or plot(glm(...))
3. To hide internal functions
When writing packages, it is sometimes useful to use leading dots in function names because these functions are somewhat hidden from general view. Functions that are meant to be purely internal to a package sometimes use this.
In this context, "somewhat hidden" simply means that the variable (or function) won't normally show up when you list object with ls(). To force ls to show these variables, use ls(all.names=TRUE). By using a dot as first letter of a variable, you change the scope of the variable itself. For example:
x <- 3
.x <- 4
ls()
[1] "x"
ls(all.names=TRUE)
[1] ".x" "x"
x
[1] 3
.x
[1] 4
4. Other possible reasons
In Hadley's plyr package, he uses the convention to use leading dots in function names. This as a mechanism to try and ensure that when resolving variable names, the values resolve to the user variables rather than internal function variables.
Complications
This mishmash of different uses can lead to very confusing situations, because these different uses can all get mixed up in the same function name.
For example, to convert a data.frame to a list you use as.list(..)
as.list(iris)
In this case as.list is a S3 generic method, and you are passing a data.frame to it. Thus the S3 function is called as.list.data.frame:
> as.list.data.frame
function (x, ...)
{
x <- unclass(x)
attr(x, "row.names") <- NULL
x
}
<environment: namespace:base>
And for something truly spectacular, load the data.table package and look at the function as.data.table.data.frame:
> library(data.table)
> methods(as.data.table)
[1] as.data.table.data.frame* as.data.table.data.table* as.data.table.matrix*
Non-visible functions are asterisked
> data.table:::as.data.table.data.frame
function (x, keep.rownames = FALSE)
{
if (keep.rownames)
return(data.table(rn = rownames(x), x, keep.rownames = FALSE))
attr(x, "row.names") = .set_row_names(nrow(x))
class(x) = c("data.table", "data.frame")
x
}
<environment: namespace:data.table>
At the start of a name it works like the UNIX filename convention to keep objects hidden by default.
ls()
character(0)
.a <- 1
ls()
character(0)
ls(all.names = TRUE)
[1] ".a"
It can be just a token with no special meaning, it's not doing anything more than any other allowed token.
my.var <- 1
my_var <- 1
myVar <- 1
It's used for S3 method dispatch. So, if I define simple class "myClass" and create objects with that class attribute, then generic functions such as print() will automatically dispatch to my specific print method.
myvar <- 1
print(myvar)
class(myvar) <- c("myClass", class(myvar))
print.myClass <- function(x, ...) {
print(paste("a special message for myClass objects, this one has length", length(x)))
return(invisible(NULL))
}
print(myvar)
There is an ambiguity in the syntax for S3, since you cannot tell from a function's name whether it is an S3 method or just a dot in the name. But, it's a very simple mechanism that is very powerful.
There's a lot more to each of these three aspects, and you should not take my examples as good practice, but they are the basic differences.
If a user defines a function .doSomething and is lazy to specify all the roxygen documentation for parameters, it will not generate errors for compiling the package

Resources