In a package, I would like to call an S3 method "compact" for object foobar.
There would therefore be a compact.foobar function in my package, along with the compact function itself:
compact = function(x, ...) {
  UseMethod("compact", x)
}
However, the latter would conflict with purrr::compact.
I could default the method to use purrr (compact.default = purrr::compact, or maybe
compact.list = purrr::compact), but that would make little sense if the user does not have purrr loaded.
How can I make my default method fall back to whichever version of compact is loaded in the user's environment? (So that it uses purrr::compact or any other declared compact function, or fails if no such function exists.)
Unfortunately S3 does not deal with this situation well. You have to search for suitable functions manually. The following works, more or less:
get_defined_function = function (name) {
  matches = getAnywhere(name)
  # Filter out invisible objects and duplicates
  objs = matches$objs[matches$visible & ! matches$dups]
  # Filter out non-function objects
  funs = objs[vapply(objs, is.function, logical(1L))]
  # Filter out function defined in own package.
  envs = lapply(funs, environment)
  funs = funs[! vapply(envs, identical, logical(1L), topenv())]
  funs[1L][[1L]] # Return `NULL` if no function exists.
}
compact.default = function (...) {
  # Maybe add error handling for functions not found.
  get_defined_function('compact')(...)
}
This uses getAnywhere to find all objects named compact that R knows about. It then filters out those that are not visible because they’re not inside attached packages, and those that are duplicate (this is probably redundant, but we do it anyway).
Next, it filters out anything that’s not a function. And finally it filters out the compact S3 generic that our own package defines. To do this, it compares each function’s environment to the package environment (given by topenv()).
This should work, but it has no rule for which function to prefer if multiple functions with the same name are defined in different locations (it just picks the first one found), and it doesn’t check whether the function signature matches (doing this is hard in R, since function calling and parameter matching are very flexible).
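One way to get a predictable preference order is to walk the search path yourself instead of relying on the order in which getAnywhere returns its matches. The following is only a sketch under that assumption: find_attached_function is a hypothetical helper name, not part of the answer above, and it prefers whatever was attached most recently (the global environment first).

```r
# Hypothetical helper (an illustration, not an established API): return the
# first function called `name` on the search path that does not belong to
# `own_env`, and fail with an explicit error if there is none.
find_attached_function = function (name, own_env = topenv()) {
  for (pos in seq_along(search())) {
    env = as.environment(pos)
    if (! exists(name, envir = env, mode = 'function', inherits = FALSE)) next
    fun = get(name, envir = env, mode = 'function', inherits = FALSE)
    # Skip our own generic, mirroring the environment comparison used above.
    if (! identical(environment(fun), own_env)) return(fun)
  }
  stop('no function named "', name, '" is attached', call. = FALSE)
}

compact.default = function (...) {
  find_attached_function('compact')(...)
}
```

Like the original, this does not check that the function found has a compatible signature.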
I am trying to write a function (within a package) which involves allowing the user to specify a function from a specific package to execute and provide additional arguments for that function.
I can do this in two steps with rlang::call2 as follows:
# `my_pkg` = "foo" (user-input)
# `my_fun` = "bar" (user-input)
# `args` is a named list (user-input)
the_call <- rlang::call2(.fn = my_fun , !!!args , .ns = my_pkg)
base::eval(the_call)
I have a feeling that the "correct" way to do this is to just use rlang::exec, but rlang::exec does not have a .ns argument like rlang::call2 does. Instead, it has an .env argument.
How do you specify the package (or namespace) of a function to rlang::exec? The .env argument to rlang::exec doesn't appear to be the answer, because the user should be able to specify a function in a specific package (that is installed) without first loading that package.
Or is rlang::exec not meant for this purpose?
Try
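real_fun <- getExportedValue(my_pkg, my_fun)
rlang::exec(real_fun, !!!args)
Since "packagename::functionname" (string) cannot be passed to rlang::exec, we can pass the real function. It can be retrieved with getExportedValue, which looks up an exported object in the package namespace and loads that namespace if necessary, so the package does not have to be attached first. (The more obvious get(my_fun, envir = as.environment(paste0("package:", my_pkg))) only works when the package is already attached.)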
(This doesn't alleviate the need for two expressions to effect the exec, but it does allow one to use exec directly.)
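If a single call is preferred, the two steps can be wrapped in a small helper. This is a base-R sketch only: exec_ns is a hypothetical name, and do.call is used as a rough base-R counterpart of rlang::exec (it takes the argument list directly rather than splicing with !!!).

```r
# Hypothetical wrapper: look up an exported function by package and name,
# then call it with a named list of arguments.
exec_ns <- function(pkg, fun, args = list()) {
  # getExportedValue() loads the namespace if needed; no library() call required.
  real_fun <- getExportedValue(pkg, fun)
  do.call(real_fun, args)
}

# Example with a package that ships with R:
exec_ns("stats", "median", list(x = c(1, 5, 9)))
```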
I have a class myclass in an R package for which I would like to define a method as.raw, i.e. one with the same name as the primitive function as.raw(). If constructor, generic and method are defined as follows...
new_obj <- function(n) structure(n, class = "myclass") # constructor
as.raw <- function(obj) UseMethod("as.raw") # generic
as.raw.myclass <- function(obj) obj + 1 # method (dummy example here)
... then R CMD check leads to:
Warning: declared S3 method 'as.raw.myclass' not found
See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.
If the generic is as_raw instead of as.raw, then there's no problem, so I assume this comes from the fact that the primitive function as.raw already exists. Is it possible to 'overload' as.raw by defining it as a generic (or would one necessarily need to use a different name?)?
Update: NAMESPACE contains
export("as.raw") # export the generic
S3method("as.raw", "myclass") # export the method
This seems somewhat related, but dimnames there is a generic and so there is a solution (just don't define your own generic), whereas above it is unclear (to me) what the solution is.
The problem here appears to be that as.raw is a primitive function (is.primitive(as.raw) returns TRUE). The ?setGeneric help page says:
A number of the basic R functions are specially implemented as primitive functions, to be evaluated directly in the underlying C code rather than by evaluating an R language definition. Most have implicit generics (see implicitGeneric), and become generic as soon as methods (including group methods) are defined on them.
And according to the ?InternalMethods help page, as.raw is one of these primitive generics. So in this case, you just need to export the S3method. And you want to make sure your function signature matches the signature of the existing primitive function.
So if I have the following R code
new_obj <- function(n) structure(n, class = "myclass")
as.raw.myclass <- function(x) x + 1
and a NAMESPACE file of
S3method(as.raw,myclass)
export(new_obj)
Then this passes the package checks for me (on R 4.0.2). And I can run the code with
as.raw(new_obj(4))
# [1] 5
# attr(,"class")
# [1] "myclass"
So in this particular case, you need to leave the as.raw <- function(obj) UseMethod("as.raw") part out.
In my code, I needed to check which package a function comes from (in my case it was exprs(): I needed the one from Biobase, but it turned out to be masked by rlang).
From this SO question, I thought I could simply use environmentName(environment(functionname)). But for exprs from Biobase that expression returned an empty string:
environmentName(environment(exprs))
# [1] ""
After checking the structure of environment(exprs), I noticed that it has a .Generic member which carries the package name as an attribute:
environment(exprs)$.Generic
# [1] "exprs"
# attr(,"package")
# [1] "Biobase"
So, for now I made this helper function:
pkgparent <- function(functionObj) {
  functionEnv <- environment(functionObj)
  envName <- environmentName(functionEnv)
  if (envName != "") {
    return(envName)
  }
  attr(functionEnv$.Generic, 'package')
}
It does the job and correctly returns the package name for the function once its package is attached, for example:
pkgparent(exprs)
# Error in environment(functionObj) : object 'exprs' not found
library(Biobase)
pkgparent(exprs)
# [1] "Biobase"
library(rlang)
# The following object is masked from ‘package:Biobase’:
# exprs
pkgparent(exprs)
# [1] "rlang"
But I would still like to learn why some packages have their functions defined in an "unnamed" environment, while for others the environment looks like <environment: namespace:packagename>.
What you’re seeing here is part of how S4 method dispatch works. In fact, .Generic is part of the R method dispatch mechanism.
The rlang package is a red herring, by the way: the issue presents itself purely due to Biobase’s use of S4.
But more generally, your resolution strategy might fail in other situations, because there are other (albeit rare) reasons why packages might define functions inside a separate environment. The usual reason is to define a closure over some variable.
For example, it’s generally impossible to modify variables defined inside a package at the namespace level, because the namespace gets locked when loaded. There are multiple ways to work around this. A simple way, if a package needs a stateful function, is to define this function inside an environment. For example, you could define a counter function that increases its count on each invocation as follows:
counter = local({
  current = 0L
  function () {
    current <<- current + 1L
    current
  }
})
local defines an environment in which the function is wrapped.
To cope with this kind of situation, what you should do instead is to iterate over parent environments until you find a namespace environment. But there’s a simpler solution, because R already provides a function to find a namespace environment for a given environment (by performing said iteration):
pkgparent = function (fun) {
  nsenv = topenv(environment(fun))
  environmentName(nsenv)
}
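To see the difference between the two approaches without installing Biobase, the situation can be imitated with base R only. In this sketch, wrapped is a hypothetical function whose enclosing environment is an anonymous child of the stats namespace, roughly what a package gets when it wraps a function in local(). The definition of pkgparent is repeated so that the snippet runs on its own.

```r
pkgparent = function (fun) {
  environmentName(topenv(environment(fun)))
}

# Hypothetical closure living in an unnamed environment whose parent is the
# stats namespace.
wrapped = local(function () NULL, envir = new.env(parent = asNamespace('stats')))

environmentName(environment(wrapped))  # "" because the environment is anonymous
pkgparent(wrapped)                     # "stats", found by walking up to the namespace
pkgparent(stats::median)               # "stats", ordinary namespace-level function
```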
help(unique) shows that unique function is present in two packages - base and data.table. I would like to use this function from data.table package. I thought that the following syntax - data <- data.table::unique(data) indicates the package to be used. But I get the following error -
'unique' is not an exported object from 'namespace:data.table'
But data <- unique(data) works well.
What is wrong here?
The function in question is really unique.data.table, an S3 method defined in the data.table package. That method is not really intended to be called directly, so it isn't exported. This is typically the case with S3 methods. Instead, the package registers the method as an S3 method, which then allows the S3 generic, base::unique in this case, to dispatch on it. So the right way to call the function is:
library(data.table)
irisDT <- data.table(iris)
unique(irisDT)
We use base::unique, which is exported, and it dispatches to data.table:::unique.data.table, which is not exported. The function data.table:::unique does not actually exist (nor does it need to).
As eddi points out, base::unique dispatches based on the class of the object it is called with. So base::unique will call data.table:::unique.data.table only if the object is a data.table. You can force a call to that method directly with something like data.table:::unique.data.table(iris), but internally that will most likely result in the next method getting called unless your object is actually a data.table.
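The same mechanism can be tried without data.table. The following sketch defines a method for a made-up class, temperature (purely illustrative), and shows that the generic unique picks the method from the class of its argument:

```r
# Hypothetical S3 method for an illustrative class 'temperature'.
unique.temperature = function (x, incomparables = FALSE, ...) {
  message('unique.temperature called')
  # Delegate to the next method (unique.default) and restore the class.
  structure(NextMethod(), class = class(x))
}

readings = structure(c(20, 20, 25), class = 'temperature')

unique(readings)       # dispatches to unique.temperature
unique(c(20, 20, 25))  # plain numeric vector: handled by unique.default
```

Inside a package, such a method would be registered with S3method(unique, temperature) in NAMESPACE rather than exported, which is the situation described above for unique.data.table.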
There are actually two infix operators in R that pull functions from particular package namespaces. You used ::, but there is also :::, which retrieves "unexported" objects. unique is actually a family of functions, and its behavior depends on both the class of its argument and the packages that have been loaded. The R term for this is "generic". The method itself is named unique.data.table, so try:
data <- data.table:::unique.data.table(data) # assuming 'data' is a data.table
The other tool that lets you peek behind the curtain of unexported objects is getAnywhere. It lets you see the code at the console:
> unique.data.table
Error: object 'unique.data.table' not found
> getAnywhere(unique.data.table)
A single object matching ‘unique.data.table’ was found
It was found in the following places
registered S3 method for unique from namespace data.table
namespace:data.table
with value
function (x, incomparables = FALSE, fromLast = FALSE, by = key(x),
...)
{
if (!cedta())
return(NextMethod("unique"))
dups <- duplicated.data.table(x, incomparables, fromLast,
by, ...)
.Call(CsubsetDT, x, which_(dups, FALSE), seq_len(ncol(x)))
}
<bytecode: 0x2ff645950>
<environment: namespace:data.table>
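When the generic and the class are already known, utils::getS3method is a more targeted alternative to getAnywhere: it looks a method up by generic and class, whether or not the method is exported. A small sketch using only base R classes (the class name no_such_class is made up to show the failure case):

```r
# Fetch the method that unique() dispatches to for data frames.
method = utils::getS3method('unique', 'data.frame')
is.function(method)

# With optional = TRUE, a missing method yields NULL instead of an error.
utils::getS3method('unique', 'no_such_class', optional = TRUE)
```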