Why is `predict` not a generic function? - r

Why is predict not a generic function? isGeneric('predict') is FALSE, but isGeneric('summary') and isGeneric('print') are TRUE. All of them have methods, which can be listed with methods('predict') etc. So predict is not a generic function as described here (also obvious from looking at its class), but it still dispatches methods depending on the object passed to it (e.g. predict.lm or predict.glm). So is there a different way R dispatches methods? How can I test whether a function has methods so that the test is TRUE for all of the examples above? Yes, I can test the length of methods('predict'), but that produces a warning for functions without methods.

For starters, none of those functions are generic by your test:
> isGeneric('predict')
[1] FALSE
> isGeneric('summary')
[1] FALSE
> isGeneric('print')
[1] FALSE
Let's try again...
> isGeneric("summary")
[1] FALSE
> require(sp)
Loading required package: sp
> isGeneric("summary")
[1] TRUE
What's going on here? Well, isGeneric only tests for S4 generic functions, and when I start R summary is an S3 generic function. If a package wants to use S4 methods and classes and there already exists an S3 generic function then it can create an S4 generic.
So, initially summary is:
> summary
function (object, ...)
UseMethod("summary")
<bytecode: 0x9e4fc08>
<environment: namespace:base>
which is an S3 generic. I get the sp package...
> require(sp)
Loading required package: sp
> summary
standardGeneric for "summary" defined from package "base"
function (object, ...)
standardGeneric("summary")
<environment: 0x9f9d428>
Methods may be defined for arguments: object
Use showMethods("summary") for currently available ones.
and now summary is an S4 standard generic.
S3 generic methods despatch (usually) by calling {generic}.{class}, and this is what UseMethod("summary") does in the S3 summary generic function.
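As a minimal sketch of that naming convention, the following defines a made-up generic called describe (a hypothetical name, not part of base R) with a default method and an lm method. It assumes a plain R session in which the stats and datasets packages are attached, as they are by default.

```r
# Hypothetical S3 generic: UseMethod() looks for describe.<class>,
# trying each element of class(x) in turn, then describe.default.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "something else"
describe.lm <- function(x, ...) "a linear model"

fit_lm  <- lm(dist ~ speed, data = cars)
fit_glm <- glm(dist ~ speed, data = cars)

describe(1:3)      # no method for integer vectors, so describe.default is used
describe(fit_lm)   # class "lm" matches describe.lm
describe(fit_glm)  # class c("glm", "lm"): no describe.glm, falls back to describe.lm
```

This is the same mechanism that lets predict pick predict.lm or predict.glm.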
If you want to test if a function has methods for a particular class, then you probably have to test that it has an S4 method (using the functions for S4 method metadata) and an S3 method (by looking for a function called {generic}.{class}, such as summary.glm).
Great eh?
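The two-part test described above could be sketched roughly as follows. The helper name has_method is hypothetical; it assumes the generic name refers to an existing function and, for the S4 part, a generic that dispatches on a single argument. It relies on utils::getS3method() with optional = TRUE (which returns NULL instead of signalling an error) and methods::existsMethod().

```r
library(methods)

# Hypothetical helper: TRUE if an S3 or an S4 method for `class`
# is defined directly on the generic named `generic`.
has_method <- function(generic, class) {
  # S3: look for generic.class, including registered but unexported methods
  s3 <- !is.null(utils::getS3method(generic, class, optional = TRUE))
  # S4: only meaningful when an S4 generic of that name currently exists
  s4 <- isGeneric(generic) && existsMethod(generic, class)
  s3 || s4
}

has_method("summary", "glm")          # summary.glm lives in stats
has_method("summary", "nosuchclass")  # no such method
```

Note that this checks for a method defined for exactly that class; it does not follow inheritance.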

My general understanding of S3 generics comes from the R Language manual, where I'm led to believe that a function func cannot be a generic without calling UseMethod in its body. So, without any fancy packages, you can use the following gist:
isFuncS3Generic <- function(func) {
  funcBody <- body(func)
  sum(grepl(pattern="UseMethod", x=funcBody)) != 0
}
edit: using your litmus test:
> isFuncS3Generic(print)
[1] TRUE
> isFuncS3Generic(predict)
[1] TRUE
> isFuncS3Generic(summary)
[1] TRUE
> isFuncS3Generic(glm)
[1] FALSE
@Hadley, I'm not really sure why this is a hard problem. Would you mind elaborating?
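One possible variation on the gist above, offered only as a sketch: instead of pattern matching on the coerced body, inspect the symbols in the body with all.names(). The name is_s3_generic is hypothetical. It also shows one limitation of any body-based test: internal generics such as length dispatch from C code and never call UseMethod.

```r
# Hypothetical variant: look for the symbol UseMethod anywhere in the body.
is_s3_generic <- function(func) {
  # body() is NULL for primitives, and all.names(NULL) is character(0)
  "UseMethod" %in% all.names(body(func))
}

is_s3_generic(print)    # TRUE
is_s3_generic(predict)  # TRUE
is_s3_generic(glm)      # FALSE
is_s3_generic(length)   # FALSE, although length dispatches internally
```

A function that merely mentions UseMethod inside a nested helper would also be flagged, so this remains a heuristic.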

Related

How to define an S3 generic with the same name as a primitive function?

I have a class myclass in an R package for which I would like to define a method as.raw, i.e. with the same name as the primitive function as.raw(). If constructor, generic and method are defined as follows...
new_obj <- function(n) structure(n, class = "myclass") # constructor
as.raw <- function(obj) UseMethod("as.raw") # generic
as.raw.myclass <- function(obj) obj + 1 # method (dummy example here)
... then R CMD check leads to:
Warning: declared S3 method 'as.raw.myclass' not found
See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.
If the generic is as_raw instead of as.raw, then there's no problem, so I assume this comes from the fact that the primitive function as.raw already exists. Is it possible to 'overload' as.raw by defining it as a generic (or would one necessarily need to use a different name?)?
Update: NAMESPACE contains
export("as.raw") # export the generic
S3method("as.raw", "myclass") # export the method
This seems somewhat related, but dimnames there is a generic and so there is a solution (just don't define your own generic), whereas above it is unclear (to me) what the solution is.
The problem here appears to be that as.raw is a primitive function (is.primitive(as.raw)). The ?setGeneric help page says:
A number of the basic R functions are specially implemented as primitive functions, to be evaluated directly in the underlying C code rather than by evaluating an R language definition. Most have implicit generics (see implicitGeneric), and become generic as soon as methods (including group methods) are defined on them.
And according to the ?InternalMethods help page, as.raw is one of these primitive generics. So in this case, you just need to export the S3method. And you want to make sure your function signature matches the signature of the existing primitive function.
So if I have the following R code
new_obj <- function(n) structure(n, class = "myclass")
as.raw.myclass <- function(x) x + 1
and a NAMESPACE file of
S3method(as.raw,myclass)
export(new_obj)
Then this passes the package checks for me (on R 4.0.2). And I can run the code with
as.raw(new_obj(4))
# [1] 5
# attr(,"class")
# [1] "myclass"
So in this particular case, you need to leave the as.raw <- function(obj) UseMethod("as.raw") part out.
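The same behaviour can be tried interactively, outside a package. This is a small sketch that reuses the names new_obj and myclass from the question and assumes the method is defined in the environment from which as.raw is called.

```r
new_obj <- function(n) structure(n, class = "myclass")

# Only the method is defined; there is no user-level as.raw generic.
as.raw.myclass <- function(x) x + 1

is.primitive(as.raw)        # TRUE: as.raw is implemented in C
res <- as.raw(new_obj(4))   # internal dispatch finds as.raw.myclass
unclass(res)                # 5
```

In a package, the S3method(as.raw, myclass) directive plays the role that the visible definition plays here.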

Why/how some packages define their functions in nameless environment?

In my code, I needed to check which package a function is defined in (in my case it was exprs(): I needed it from Biobase but it turned out to be overridden by rlang).
From this SO question, I thought I could use simply environmentName(environment(functionname)). But for exprs from Biobase that expression returned empty string:
environmentName(environment(exprs))
# [1] ""
After checking the structure of environment(exprs) I noticed that it has .Generic member which contains package name as an attribute:
environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"
So, for now I made this helper function:
pkgparent <- function(functionObj) {
  functionEnv <- environment(functionObj)
  envName <- environmentName(functionEnv)
  if (envName != "")
    return(envName)
  else
    return(attr(functionEnv$.Generic, 'package'))
}
It does the job and correctly returns package name for the function if it is loaded, for example:
pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found
library(Biobase)
pkgparent(exprs)
# [1] "Biobase"
library(rlang)
# The following object is masked from ‘package:Biobase’:
# exprs
pkgparent(exprs)
# [1] "rlang"
But I would still like to learn why some packages define their functions in an "unnamed" environment while functions from other packages show up as <environment: namespace:packagename>.
What you’re seeing here is part of how S4 method dispatch works. In fact, .Generic is part of the R method dispatch mechanism.
The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
But more generally your resolution strategy might fail in other situations, because there are other reasons (albeit rarely) why packages might define functions inside a separate environment. The reason for this is generally to define a closure over some variable.
For example, it’s generally impossible to modify variables defined inside a package at the namespace level, because the namespace gets locked when loaded. There are multiple ways to work around this. A simple way, if a package needs a stateful function, is to define this function inside an environment. For example, you could define a counter function that increases its count on each invocation as follows:
counter = local({
  current = 0L
  function () {
    current <<- current + 1L
    current
  }
})
local defines an environment in which the function is wrapped.
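A short usage sketch, with the definition repeated so that it runs on its own; the variable names first and second are only for illustration:

```r
counter <- local({
  current <- 0L
  function() {
    current <<- current + 1L   # modifies `current` in the enclosing local() environment
    current
  }
})

first  <- counter()
second <- counter()
c(first, second)                        # 1 2: the state persists between calls
environmentName(environment(counter))   # "": the local() environment has no name
```

This is why environmentName(environment(f)) can come back empty for functions that are nonetheless defined by a package.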
To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):
pkgparent = function (fun) {
nsenv = topenv(environment(fun))
environmentName(nsenv)
}
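To see the difference without installing Biobase, here is a sketch using a closure returned by stats::ecdf(), whose environment is the frame of a function call inside stats rather than the namespace itself. The definition of pkgparent is repeated so the block is self-contained; step_fun is a hypothetical variable name.

```r
pkgparent <- function(fun) {
  nsenv <- topenv(environment(fun))   # walk up to the enclosing namespace (or global env)
  environmentName(nsenv)
}

step_fun <- stats::ecdf(c(1, 2, 3))

environmentName(environment(step_fun))  # "": an anonymous function-call environment
pkgparent(step_fun)                     # "stats"
pkgparent(stats::lm)                    # "stats"
```

Functions defined at the top level of an interactive session resolve to the global environment with this approach.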

R - Redefining utils::View as generic without conflicting with RStudio

I have successfully redefined utils::View as a generic function so that I can use it in my package. However, it so happens that RStudio also defines some kind of hook on this function.
Before loading my package, I see:
> View
function (...)
.rs.callAs(name, hook, original, ...)
<environment: 0x000001f74d5ff0b0>
And looking for that .rs.callAs function, I get:
> findFunction('.rs.callAs')
[[1]]
<environment: 0x000001f74eb94598>
attr(,"name")
[1] "tools:rstudio"
After loading my package, I see:
> View
standardGeneric for "View" defined from package "summarytools"
function (...)
standardGeneric("View")
<bytecode: 0x000001f752ecb7e0>
<environment: 0x000001f754a8e678>
Methods may be defined for arguments: ...
Use showMethods("View") for currently available ones.
Since tools:rstudio is not directly visible, I'm not sure there's anything I can do about it. And if I can somehow include its definition in my package, I'm not sure at all that I can redefine View differently depending on whether the R session is running in RStudio or not.
I'm clearly not very optimistic about this, but I thought of asking here before giving up!

type/origin of R's 'as' function

R's S3 OO system is centered around generic functions that call methods depending on the class of the object the generic function is being called on. The crux is that the generic function calls the appropriate method, as opposed to other OO programming languages in which the method is defined within the class.
For example, the mean function is a generic function.
isGeneric("mean")
methods(mean)
This will print
TRUE
[1] mean,ANY-method mean.Date mean.default mean.difftime
[5] mean.IDate* mean,Matrix-method mean.POSIXct mean.POSIXlt
[9] mean,sparseMatrix-method mean,sparseVector-method
see '?methods' for accessing help and source code
I was exploring R a bit and found the as function. I am confused by the fact that R says the function is not generic, but it still has methods.
isGeneric("as")
methods(as)
TRUE
[1] as.AAbin as.AAbin.character
[3] as.alignment as.allPerms
[5] as.array as.array.default
[7] as.binary as.bitsplits
[9] as.bitsplits.prop.part as.call
...
At the end there is a warning that says that as is not a generic.
Warning message:
In .S3methods(generic.function, class, parent.frame()) :
function 'as' appears not to be S3 generic; found functions that look like S3 methods
Could someone explain to me what the as function is and how it is connected to as.list, as.data.frame etc.? R says that as.list is a generic (where I am tempted to get a bit mad at the inconsistencies within R, because I would expect as.list to be a method for a list object from the as generic function). Please help.
as is not an S3 generic, but notice that you got a TRUE. (I got a FALSE.) That means you have loaded a package that defines as as an S4 generic. S3 generics work via class dispatch that employs a *.default function and the UseMethod function. The FALSE I get means there is no method defined for a generic as that would get looked up. One arguable reason for the lack of a generic as is that calling such a function with only one data object would not specify a "coercion destination". That means the destination needs to be built into the function name.
After declaring as to be Generic (note the capitalization, which is a hint that this applies to S4 features):
setGeneric("as") # note that I didn't really even need to define any methods
get('as')
#--- output----
standardGeneric for "as" defined from package "methods"
function (object, Class, strict = TRUE, ext = possibleExtends(thisClass,
Class))
standardGeneric("as")
<environment: 0x7fb1ba501740>
Methods may be defined for arguments: object, Class, strict, ext
Use showMethods("as") for currently available ones.
If I reboot R (and don't load any libraries that call setGeneric for 'as') I get:
get('as')
#--- output ---
function (object, Class, strict = TRUE, ext = possibleExtends(thisClass,
Class))
{
if (.identC(Class, "double"))
Class <- "numeric"
thisClass <- .class1(object)
if (.identC(thisClass, Class) || .identC(Class, "ANY"))
return(object)
where <- .classEnv(thisClass, mustFind = FALSE)
coerceFun <- getGeneric("coerce", where = where)
coerceMethods <- .getMethodsTable(coerceFun, environment(coerceFun),
inherited = TRUE)
asMethod <- .quickCoerceSelect(thisClass, Class, coerceFun,
coerceMethods, where)
.... trimmed the rest of the code
But you ask "why", always a dangerous question when discussing language design, of course. I've flipped through the last chapter of Statistical Models in S which is the cited reference for most of the help pages that apply to S3 dispatch and find no discussion of either coercion or the as function. There is an implicit definition of "S3 generic" requiring the use of UseMethod but no mention of why as was left out of that strategy. I think of two possibilities: it is to prevent any sort of inheritance ambiguity in the application of the coercion, or it is an efficiency decision.
I should probably add that there is an S4 setAs-function and that you can find all the S4-coercion functions with showMethods("coerce").
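As a minimal sketch of how setAs feeds the as function, the following defines two throwaway S4 classes (Celsius and Fahrenheit are hypothetical names chosen for illustration) and a coercion between them:

```r
library(methods)

# Two hypothetical classes, each holding a single numeric value
setClass("Celsius", representation(value = "numeric"))
setClass("Fahrenheit", representation(value = "numeric"))

# setAs() defines a method for the S4 generic "coerce",
# which is what as() looks up behind the scenes.
setAs("Celsius", "Fahrenheit", function(from) {
  new("Fahrenheit", value = from@value * 9 / 5 + 32)
})

boiling <- as(new("Celsius", value = 100), "Fahrenheit")
boiling@value   # 212
```

Here the coercion destination is passed as the second argument of as, which is the piece of information an S3-style as.something function has to carry in its name.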

How can a non-imported method in a not-attached package be found by calls to functions not having it in their namespace?

An R namespace acts as the immediate environment for all functions in its associated package. In other words, when function bar() from package foo calls another function, the R evaluator first searches for the other function in <environment: namespace:foo>, then in "imports.foo", <environment: namespace:base>, <environment: R_GlobalEnv>, and so on down the search list returned by typing search().
One nice aspect of namespaces is that they can make packages act like better citizens: unexported functions in <environment: namespace:foo> and functions in imports:foo are available only: (a) to functions in foo; (b) to other packages that import from foo; or (c) via fully qualified function calls like foo:::bar().
Or so I thought until recently...
The behavior
This recent SO question highlighted a case in which a function well-hidden in its package's namespace was nonetheless found by a call to a seemingly unrelated function:
group <- c("C","F","D","B","A","E")
num <- c(12,11,7,7,2,1)
data <- data.frame(group,num)
## Evaluated **before** attaching 'gmodels' package
T1 <- transform(data, group = reorder(group,-num))
## Evaluated **after** attaching 'gmodels'
library(gmodels)
T2 <- transform(data, group = reorder(group,-num))
identical(T1, T2)
# [1] FALSE
Its immediate cause
@Andrie answered the original question by pointing out that gmodels imports from the package gdata, which includes a function reorder.factor that gets dispatched to inside the second call to transform(). T1 differs from T2 because the first is calculated by stats:::reorder.default() and the second by gdata:::reorder.factor().
My question
How is it that in the above call to transform(data, group=reorder(...)), the dispatching mechanism for reorder finds and then dispatches to gdata:::reorder.factor()?
(An answer should include an explanation of the scoping rules that lead from a call involving functions in the stats and base packages to a seemingly well-hidden method in gdata.)
Further possibly helpful details
Neither gdata:::reorder.factor, nor the gdata package as a whole are explicitly imported by gmodels. Here are the import* directives in gmodels' NAMESPACE file:
importFrom(MASS, ginv)
importFrom(gdata, frameApply)
importFrom(gdata, nobs)
There are no methods for reorder() or transform() in <environment: namespace:gmodels>, nor in "imports:gmodels":
ls(getNamespace("gmodels"))
ls(parent.env(getNamespace("gmodels")))
Detaching gmodels does not revert reorder()'s behavior: gdata:::reorder.factor() still gets dispatched:
detach("package:gmodels")
T3 <- transform(data, group=reorder(group,-num))
identical(T3, T2)
# [1] TRUE
reorder.factor() is not stored in the list of S3 methods in the base environment:
grep("reorder", ls(.__S3MethodsTable__.))
# integer(0)
R chat threads from the last couple of days include some additional ideas. Thanks to Andrie, Brian Diggs, and Gavin Simpson, who (with others) should feel free to edit or add possibly important details to this question.
I'm not sure if I correctly understand your question, but the main point is that group is a character vector while data$group is a factor.
After attaching gmodels, a call to reorder on a factor dispatches to gdata:::reorder.factor,
so reorder(factor(group)) calls it.
In transform, the expressions are evaluated with the data frame passed as the first argument as their environment, so in T2 <- transform(data, group = reorder(group,-num)), group is the factor column.
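The character-versus-factor point can be checked directly. This sketch reuses the data from the question and sets stringsAsFactors = TRUE explicitly, which is an assumption needed to reproduce the old behaviour because data.frame() no longer converts strings to factors by default since R 4.0.0. The column name cls and the variable seen are hypothetical.

```r
group <- c("C", "F", "D", "B", "A", "E")
num <- c(12, 11, 7, 7, 2, 1)
data <- data.frame(group, num, stringsAsFactors = TRUE)

class(group)        # "character": the vector in the calling environment
class(data$group)   # "factor": the column inside the data frame

# Inside transform(), `group` resolves to the column, not the outer vector
seen <- transform(data, cls = class(group))$cls
unique(as.character(seen))   # "factor"
```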
UPDATED
library also loads the namespaces of the packages that gmodels imports from (they are loaded, not attached).
> loadedNamespaces()
[1] "RCurl" "base" "datasets" "devtools" "grDevices" "graphics" "methods"
[8] "stats" "tools" "utils"
> library(gmodels) # here, namespace:gdata is loaded
> loadedNamespaces()
[1] "MASS" "RCurl" "base" "datasets" "devtools" "gdata" "gmodels"
[8] "grDevices" "graphics" "gtools" "methods" "stats" "tools" "utils"
Just in case, the reorder generic exists in namespace:stats:
> r <- ls(.__S3MethodsTable__., envir = asNamespace("stats"))
> r[grep("reorder", r)]
[1] "reorder" "reorder.default" "reorder.dendrogram"
And for more details
A call to reorder searches for S3 methods in two places (see ?UseMethod):
first in the environment in which the generic function is called, and then in the registration data base for the environment in which the generic is defined (typically a namespace).
Then, loadNamespace registers the S3 methods declared by the package being loaded.
So, in your case, library(gmodels) -> loadNamespace(gdata) -> registerS3methods(gdata).
After this, you can find it by:
> methods(reorder)
[1] reorder.default* reorder.dendrogram* reorder.factor*
Non-visible functions are asterisked
However, because reorder.factor is not on your search path, you cannot access it directly:
> reorder.factor
Error: object 'reorder.factor' not found
That is probably the whole scenario.
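The effect of registration can be reproduced without gmodels or gdata. The sketch below registers a method for the stats generic reorder by hand with base::registerS3method(); the class name hiddenclass and the function hidden_method are hypothetical, and the example assumes nothing else in the session defines reorder.hiddenclass.

```r
# A method that is never assigned under the name reorder.hiddenclass
hidden_method <- function(x, ...) "dispatched to hidden method"

# Put it into the S3 registry of the namespace where reorder is defined (stats)
registerS3method("reorder", "hiddenclass", hidden_method,
                 envir = asNamespace("stats"))

obj <- structure(list(), class = "hiddenclass")

result <- reorder(obj)          # found via the registration data base
result
exists("reorder.hiddenclass")   # FALSE: not visible by name on the search path
```

Loading a namespace that declares S3method(reorder, factor) has the same effect on dispatch, which is why detaching gmodels afterwards does not undo it.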
