S3 style dispatching for S3 objects using formal method definitions - r

Related to this question, but slightly different and hopefully more clear.
I am looking for a clean way to formally register methods for both S4 and S3 classes, but without relying on the terrible S3-dot-naming-scheme for dispatching. An example:
setClass("foo");
setClass("bar");
setGeneric("test", function(x, ...){
standardGeneric("test");
});
setMethod("test", "bar", function(x, ...){
return("success (bar).");
});
obj1 <- 123;
class(obj1) <- "bar";
test(obj1);
This example shows how we can register a test method for S3 objects of class bar, without the need to name the function test.bar, which is great. However, the limitation is if we register methods this way, they will only be dispatched to the first S3 class of the object. E.g:
obj2 <- 123;
class(obj2) <- c("foo", "bar");
test(obj2);
This doesn't work, because S4 method dispatching will only try class foo and its superclasses. How could this example be extended so that it will automatically select the test method for bar when no appropriate method for foo was found? E.g. S3 style dispatching but without having to go back to naming everything test.foo and test.bar?
So in summary: how can I create a generic function that uses formal method dispatch but, for S3 objects with multiple classes, also falls back on the second, third, etc. class of the object?

?setOldClass will give the answer:
setOldClass(c("foo", "bar"))
setGeneric("test", function(x, ...)standardGeneric("test"))
setMethod("test", "bar", function(x, ...)return("success (bar)."))
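As a minimal sketch of how this plays out with the two-class object from the question: it assumes a fresh session, so that foo and bar have not already been defined with setClass. Registering the S3 class vector lets the method for bar be found for an object whose first class is foo.

```r
library(methods)

# Register the S3 class vector: "foo" is declared to inherit from "bar".
setOldClass(c("foo", "bar"))

setGeneric("test", function(x, ...) standardGeneric("test"))
setMethod("test", "bar", function(x, ...) "success (bar).")

# S3 object whose first class has no method of its own.
obj2 <- structure(123, class = c("foo", "bar"))
result <- test(obj2)
print(result)
```

This only covers class vectors that were registered with setOldClass. A combination that was never registered would presumably still be dispatched on its first class alone, as described in the question.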

You could write a method
test = function(x, ...) UseMethod("test")
setGeneric("test")
.redispatch = function(x, ...)
{
if (is.object(x) && !isS4(x) && length(class(x)) != 1L) {
class(x) = class(x)[-1]
callGeneric(x, ...)
} else callNextMethod(x, ...)
}
setMethod(test, "ANY", .redispatch)
But I personally wouldn't mix S3 and S4 in this way.

Related

How to handle unknown methods/generics in R

Many languages have special ways to handle unknown methods (examples). The one I'm most familiar with is Python's __getattr__. If someone calls a method you haven't defined for the class, __getattr__ acts as a catch-all and does something.
I've been reading up on S4 and a little on R6, but I haven't found how to do it in R. Is it possible?
No, there is no standard way of doing this from inside your class definition as you would do in Python.
In Python you would write something like MyObject.my_method(), while in R with S3 or S4 this would be my_method(MyObject), so it looks exactly like my_function(MyObject). The only difference is that, under the hood, the function you called dispatches the call to the appropriate method. Defining these methods for multiple classes is done as follows:
mean <- function (x, ...) UseMethod("mean", x)
mean.numeric <- function(x, ...) sum(x) / length(x)
mean.data.frame <- function(x, ...) sapply(x, mean, ...)
mean.matrix <- function(x, ...) apply(x, 2, mean)
mean.default <- function(x, ...) {
# do something
}
However, if you call the mean function on a class for which no method has been defined, it is up to the function to handle this, not to the class.
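A small sketch of that point, using a hypothetical generic called describe (not part of any package): the default method is the place where a generic decides what happens for classes it does not know about.

```r
# Hypothetical generic, used only for illustration.
describe <- function(x, ...) UseMethod("describe")

describe.numeric <- function(x, ...) "a numeric vector"

# Catch-all: reached whenever no class-specific method exists.
describe.default <- function(x, ...) {
  paste0("no describe() method for class '", class(x)[1], "'")
}

known <- describe(1.5)       # dispatches to describe.numeric
unknown <- describe("text")  # falls through to describe.default
print(known)
print(unknown)
```

The fallback is chosen per generic, so every generic needs its own default method if it is to behave this way.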
Then you have RC and R6 objects, which have a more Python-like syntax (MyObject$my_method()); however, they would just throw an error saying that there is no corresponding field or method for the class you used.
Error in envRefInferField(x, what, getClass(class(x)), selfEnv) :
"my_method" is not a valid field or method name for reference class “MyObject”
Here is some information about OO programming in R.
Winston Chang provided great info here:
https://github.com/r-lib/R6/issues/189#issuecomment-506405998
He explains how you can create an S3 generic function $ for your class to catch unknown methods. Read his full reply for more details, but the key function is below (Counter is the name of the class).
`$.Counter` <- function(x, name) {
if (name %in% names(x)) {
.subset2(x, name)
} else {
function(...) {
.subset2(x, "do")(name, ...)
}
}
}
"If name is in the class, do that. If not, send name (and any arguments) to a function called do() defined in the class."
While I've marked this as the answer (because it solves the problem), jkd is still correct:
No, there is no standard way of doing this from inside your class definition as you would do in Python.
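To make the idea concrete without depending on R6, here is a hedged sketch that applies the same `$` method to a plain list-based S3 object. The constructor new_counter and the body of do() are assumptions made for illustration; they are not taken from the linked reply.

```r
# Hypothetical constructor: a list with one field and a fallback handler.
new_counter <- function() {
  structure(
    list(
      count = 0,
      # Receives the unknown method name plus any arguments passed to it.
      do = function(name, ...) {
        paste0("unknown method '", name, "' called with ",
               length(list(...)), " argument(s)")
      }
    ),
    class = "Counter"
  )
}

# S3 method for `$`: known names are returned as-is, anything else
# becomes a function that forwards to do().
`$.Counter` <- function(x, name) {
  if (name %in% names(x)) {
    .subset2(x, name)
  } else {
    function(...) .subset2(x, "do")(name, ...)
  }
}

cnt <- new_counter()
known <- cnt$count
fallback <- cnt$anything(1, 2)
print(fallback)
```

Using .subset2 rather than `$` or `[[` inside the method avoids dispatching back into `$.Counter` itself.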

S4 class constructor and validation

Here is a short piece of code that creates an S4 class myclass and ensures that objects are only created if they satisfy a condition given by a parameter param:
setClass("myclass", slots = c(x = "numeric"))
# validity function with an extra parameter
ValidmyClass <- function(object, param = 1)
{
  if (object@x == param) return(TRUE)
  else return("problem")
}
setValidity("myclass", ValidmyClass)
setMethod("initialize","myclass", function(.Object,...){
.Object <- callNextMethod()
validObject(.Object,...)
.Object
})
For which I get the following error message:
Error in substituteFunctionArgs(validity, "object", functionName = sprintf("validity method for class '%s'", :
trying to change the argument list of for validity method for class 'myclass' with 2 arguments to have arguments (object)
I understand the issue with the arguments, but I cannot find a way to solve it. The documentation for setValidity says that the argument method should be a "validity method; that is, either NULL or a function of one argument (object)", which as I understand it rules out more than one argument.
Nevertheless, the idea behind this example is that I want to be able to test the construction of a myclass object based on the value of an external given parameter. If more conditions were to be added, I would like enough flexibility so only the function ValidmyClass needs to be updated, without necessarily adding more slots.
The validity function has to have exactly one argument, named object. When I need a one-argument function but really have more arguments or data to pass in, I often fall back on closures. Here the implementation of your ValidmyClass changes so that it returns the actual validity function. The arguments of the enclosing function are then the additional values you are interested in.
setClass("myclass", slots = c(x = "numeric"))
# factory: returns a one-argument validity function that remembers param
ValidmyClass <- function(param) {
  force(param)
  function(object) {
    if (object@x == param) TRUE
    else "problem"
  }
}
setValidity("myclass", ValidmyClass(1))
Also, the validity function is called automatically on initialization, but not when the slot x is changed after the object has been created.
setMethod("initialize", "myclass", function(.Object, ...) {
  .Object <- callNextMethod()
  .Object
})
new("myclass", x = 2)  # fails validation
new("myclass", x = 1)  # passes validation
For more information on closures, see adv-R. Although I think this answers your question, I do not see how this implementation is actually helpful. When you define your class, you also fix the additional parameters that the validity function knows about. If you have several classes for which you can abstract the validity function, then I would use the closure. If you have one class whose parameters change at runtime, I would consider adding a slot to the class. If you do not want to alter the class definition repeatedly, you can add a slot of class list through which you can then pass an arbitrary number of values to test against.
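A minimal sketch of the slot-based alternative mentioned above. The class name myclass2 and the error message are assumptions made for illustration. The last lines also show that a direct slot assignment is not validated until validObject is called explicitly.

```r
library(methods)

# Hypothetical class: the reference value lives in its own slot.
setClass("myclass2",
  slots = c(x = "numeric", param = "numeric"),
  validity = function(object) {
    if (length(object@x) == 1L && length(object@param) == 1L &&
        object@x == object@param) TRUE
    else "x must be a single value equal to param"
  }
)

ok <- new("myclass2", x = 1, param = 1)  # passes validation

# A mismatch is rejected when the object is created.
bad <- tryCatch(new("myclass2", x = 2, param = 1),
                error = function(e) "rejected")

# Direct slot assignment skips validation, so re-check explicitly.
ok@x <- 5
recheck <- validObject(ok, test = TRUE)  # returns the message(s) instead of failing
```

With this layout the reference value can differ per object, at the cost of one extra slot.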

Can we combine S3 flexibility with S4 representation checking?

I'm looking for a method to validate S3 objects in my package Momocs.
Earlier versions of the package were written using S4; I then shifted back to S3 for the sake of flexibility, because users were more familiar with S3, because I do not really need multiple inheritance, and so on. The main cost of this change was losing S4 representation and validity checking.
My problem is this: how can we prevent someone from inadvertently invalidating an S3 object, for instance when extending existing methods or manipulating the object's structure?
I have already written some validate function but, so far, I only validate before crucial steps, typically those turning an object from a class into another.
My question is:
Do I want to have my cake and eat it (S3 flexibility and S4 representation checking)? In that case, would I need to call my validate function in every method of my package?
Or is there a smarter way on top of S3, something like "any time we do something on an object of a particular class, call a validate function on it"?
The easiest thing would be to write a validation function for each class and pass objects through it before S3 method dispatch or within each class's method. Here's an example with a simple validation function called check_example_class for an object of class "example_class":
check_example_class <- function(x) {
stopifnot(length(x) == 2)
stopifnot("a" %in% names(x))
stopifnot("b" %in% names(x))
stopifnot(is.numeric(x$a))
stopifnot(is.character(x$b))
NULL
}
print.example_class <- function(x, ...) {
check_example_class(x)
cat("Example class object where b =", x$b, "\n")
invisible(x)
}
# an object of the class
good <- structure(list(a = 1, b = "foo"), class = "example_class")
# an object that pretends to be of the class
bad <- structure(1, class = "example_class")
print(good) # works
## Example class object where b = foo
print(bad) # fails
## Error: length(x) == 2 is not TRUE
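One possible variation on the example above, offered as a sketch rather than an established convention of this package: if the validator returns the object, it can be reused both in a constructor and at the top of each method. The names validate_example_class, example_class and the summary method are hypothetical.

```r
# Validator that returns its input, so it can be chained.
validate_example_class <- function(x) {
  stopifnot(
    is.list(x),
    length(x) == 2,
    all(c("a", "b") %in% names(x)),
    is.numeric(x$a),
    is.character(x$b)
  )
  x
}

# Constructor: every object built this way starts out valid.
example_class <- function(a, b) {
  validate_example_class(structure(list(a = a, b = b), class = "example_class"))
}

# Methods re-validate on entry, which catches objects altered by hand.
summary.example_class <- function(object, ...) {
  object <- validate_example_class(object)
  paste("a =", object$a, "and b =", object$b)
}

good <- example_class(1, "foo")
good_summary <- summary(good)

bad <- structure(1, class = "example_class")
bad_failed <- inherits(try(summary(bad), silent = TRUE), "try-error")
```

This still requires one validation call per method; S3 itself offers no hook that runs automatically before dispatch.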

Does every S4 method need a generic?

Suppose we have the following dummy class
Foo <- setClass(Class = "Foo", slots = c(foo = "numeric"),
                prototype = list(foo = numeric()))
I thought generics are used to overload different functions. So assume we want to implement an accessor:
setMethod(f = "getFoo", signature = "Foo",
  definition = function(Foo)
  {
    return(Foo@foo)
  }
)
Is this valid? Or do I have to define a generic first:
setGeneric(name="getFoo",
def=function(Foo)
{
standardGeneric("getFoo")
}
)
If there is just one particular "instance" of this function type, there is no reason to define a generic, correct?
In order to define an S4 method, there must be an existing S4 generic (either from base, imported from another package, or defined yourself). My understanding of this design is that it provides the flexibility to add further methods in the future, even if you can't conceive of another one at the moment.
That said, if you are just trying to be more concise you could just provide the default function directly to the generic function.
setClass(Class = "Foo", slots = c(foo = "numeric"),
         prototype = list(foo = numeric()))
setGeneric(name = "getFoo",
  def = function(Foo)
  {
    standardGeneric("getFoo")
  }, useAsDefault = function(Foo) { return(Foo@foo) }
)
# test the function
testFoo <- new("Foo", foo = 3)
getFoo(testFoo)
## [1] 3
So now you have your generic, including the only functionality you really wanted anyway. You also have the option to add on to the generic in the future, depending on how your application develops.
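As a hedged sketch of both points, assuming a fresh session: setMethod fails when no function of that name exists, and once a generic is in place further methods can be added later. The class Bar and the name getFooUndefined are hypothetical and exist only for this illustration.

```r
library(methods)

setClass("Foo", slots = c(foo = "numeric"), prototype = list(foo = numeric()))

# Without an existing function of that name, setMethod() signals an error.
no_generic_error <- tryCatch(
  {
    setMethod("getFooUndefined", "Foo", function(object) object@foo)
    FALSE
  },
  error = function(e) TRUE
)

# With a generic in place, the method registers normally.
setGeneric("getFoo", function(object) standardGeneric("getFoo"))
setMethod("getFoo", "Foo", function(object) object@foo)

# Later on, a second (hypothetical) class can reuse the same generic.
setClass("Bar", slots = c(foo = "character"))
setMethod("getFoo", "Bar", function(object) toupper(object@foo))

from_foo <- getFoo(new("Foo", foo = 3))
from_bar <- getFoo(new("Bar", foo = "x"))
```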

Using a method/function within a reference class method of the same name

When defining a new reference class in R, there are a number of boilerplate methods that are expected (by R conventions), such as length, show, etc. When these are defined, they aggressively mask similarly named methods/functions when called from within the class's methods. As you cannot necessarily know the namespace of the foreign function, it is not possible to use the package:: specifier.
Is there a way to tell a method to ignore its own methods unless called specifically using .self$?
Example:
tC <- setRefClass(
'testClass',
fields = list(data='list'),
methods = list(
length=function() {
length(data)
}
)
)
example <- tC(data=list(a=1, b=2, c=3))
example$length() # Will cause error as length is defined without arguments
Alternatively one could resort to defining S4 methods for the class instead (as reference classes are S4 classes under the hood), but this seems to be working against the reference class idea...
Edit:
To avoid focusing on instances where you know the class of the data in advance consider this example:
tC <- setRefClass(
'testClass',
fields = list(data='list'),
methods = list(
length=function() {
length(data)
},
combineLengths = function(otherObject) {
.self$length() + length(otherObject)
}
)
)
example <- tC(data=list(a=1, b=2, c=3))
example$combineLengths(rep(1, 3)) # Will cause error as length is defined without arguments
I am aware that it is possible to write your own dispatching to the correct method/function, but this seems like such a common situation that I thought it might already have been solved within the methods package (sort of the reverse of usingMethods()).
My question is thus, and I apologise if this wasn't clear before: is there a way of ignoring the reference class methods and fields within the method definitions and relying solely on .self for accessing them, so that methods/functions defined outside the class are not masked?
The example is not very clear. I don't know why you can't know the namespace of your method. In any case, here are a couple of ways to work around this problem:
You can use a different name for the reference class method, for example Length with a capital "L".
You can find the namespace of the generic function dynamically.
For example:
methods = list(
.show =function(data) {
ns = sub(".*:","",getAnywhere("show")$where[1])
func = get("show",envir = getNamespace(ns))
func(data)
},
show=function() {
.show(data)
}
)
You can use the newer reference class system, R6.
For example:
library(R6)
tC6 <- R6Class('testClass',
public = list(
data=NA,
initialize = function(data) {
if (!missing(data)) self$data <- data
},
show=function() show(self$data)
)
)
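Another possible workaround, sketched here under stated assumptions and not taken from the answer above: look the outside function up starting from the global environment, so that the object's own methods are never considered. The helper outer_fun is hypothetical. It would pick up a user-defined function of the same name in the global environment, and it only sees functions reachable from the search path.

```r
library(methods)

# Hypothetical helper: resolve a function from the global environment
# onwards, bypassing the reference object's environment.
outer_fun <- function(name) get(name, envir = globalenv(), mode = "function")

tC <- setRefClass(
  "testClass",
  fields = list(data = "list"),
  methods = list(
    length = function() {
      outer_fun("length")(data)
    },
    combineLengths = function(otherObject) {
      # Own method via .self, outside function via the helper.
      .self$length() + outer_fun("length")(otherObject)
    }
  )
)

example <- tC$new(data = list(a = 1, b = 2, c = 3))
n_own <- example$length()
n_both <- example$combineLengths(rep(1, 3))
```

The design choice is to make every call explicit about where it resolves: .self$ for the class's own methods and the helper for everything else.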
