I want to subclass an S4 class and add a special method to that subclass.
The method should work only for the subclass; it does not make sense for any other class in my application.
setClass("MySpecies", contains="Species", ##Species is an S4 class
representation(x="numeric"))
setMethod("initialize", "MySpecies", function(.Object, x, ...){
.Object@x <- x
args <- list(...)
for(i in seq_len(length(args))){
attr(.Object, names(args)[i]) <- args[[i]]
}
.Object
})
##CalcMatrix <- function(.Object, y){
## x <- .Object@x
## matrix(x, x*2, y*3)
##}
setGeneric("CalcMatrix", function(object, y){standardGeneric("CalcMatrix")})
setMethod("CalcMatrix", "MySpecies",function(object, y){
x <- object@x
matrix(x, x*2, y*3)
})
With setGeneric it works, but do I really have to define a generic function even though it will only be used with this class? The commented-out version works, but then nothing checks that the function is called with the right arguments. What is the correct way to do this?
Thanks in advance.
You want to use method dispatch, and every method must be associated with a generic, so yes, setGeneric is required.
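As a minimal sketch of what the generic buys you: with a single method registered for MySpecies, calling the generic on any other class fails at dispatch time, which is the argument check the plain function lacks. Species is not shown in the question, so the definition below is a hypothetical stand-in.

```r
library(methods)

# Hypothetical stand-in for Species; the real class is not shown in the question
setClass("Species", representation(y = "numeric"))
setClass("MySpecies", contains = "Species", representation(x = "numeric"))

# The generic only declares the interface; it has no default behaviour
setGeneric("CalcMatrix", function(object, y) standardGeneric("CalcMatrix"))

# The single method restricts CalcMatrix() to MySpecies (and its subclasses)
setMethod("CalcMatrix", "MySpecies", function(object, y) {
    x <- object@x
    matrix(x, x * 2, y * 3)
})

# Works for the subclass: a 4 x 3 matrix filled with 2
ok <- CalcMatrix(new("MySpecies", x = 2), 1)

# Fails for the parent class, because no method is registered for it
bad <- tryCatch(CalcMatrix(new("Species"), 1), error = function(e) e)
```

Here ok is a matrix and bad is a condition object describing the failed dispatch.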
And for a little unasked-for advice... It's a bit weird to use a formal class system (presumably because the well-defined classes help in writing more complicated programs) and then to subvert the structure by adding arbitrary attributes; these should really be additional, well-defined slots in your class.
Let's make your example reproducible by defining Species
setClass("Species", representation(y="numeric"))
setClass("MySpecies", contains="Species", ##Species is an S4 class
representation(x="numeric"))
An implicit requirement for S4 classes is that new("MySpecies") works; your initialize method fails this test (because x does not have a default value). In addition, it's common practice to expect that initializing MySpecies calls the initialize methods for the classes it contains. One could write
setMethod("initialize", "MySpecies", function(.Object, ..., x=numeric()) {
callNextMethod(.Object, x=x, ...)
})
Note callNextMethod, so that the base class gets initialized properly. Using ... and passing it to callNextMethod means that slots that might be defined in Species would also be initialized correctly. Also, x needs to be after ..., because initialize is defined to take unnamed arguments that represent contained classes -- new("MySpecies", new("Species")) is required to work, even if it is a way of constructing arguments that you do not use directly. The initialize method above doesn't actually do anything more than the default initialize method, so in reality (and this is often the case) it makes sense not to write an initialize method at all.
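A quick way to check that an initialize method honours these expectations is to try the three calling styles mentioned above. This is only a sketch, assuming the toy Species class with a single numeric slot y.

```r
library(methods)

setClass("Species", representation(y = "numeric"))
setClass("MySpecies", contains = "Species", representation(x = "numeric"))

setMethod("initialize", "MySpecies", function(.Object, ..., x = numeric()) {
    # Hand everything to the default method so inherited slots are filled too
    callNextMethod(.Object, x = x, ...)
})

# 1. No arguments at all: the implicit requirement for S4 classes
empty <- new("MySpecies")

# 2. Slots supplied by name, including the slot inherited from Species
named <- new("MySpecies", x = c(1, 2, 3), y = 4)

# 3. An unnamed instance of the contained class, plus the new slot
copied <- new("MySpecies", new("Species", y = 4), x = c(1, 2, 3))
```

If x were placed before ... in the method signature, the unnamed Species instance in the third call would be matched to x instead of being treated as a contained object.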
In more recent versions of R, setClass returns a default constructor (a generator function), so
MySpecies <- setClass("MySpecies", contains="Species", ##Species is an S4 class
representation(x="numeric"))
and then
> MySpecies(x=1:5, y=5:1)
An object of class "MySpecies"
Slot "x":
[1] 1 2 3 4 5
Slot "y":
[1] 5 4 3 2 1
I have an S3 class, student:
# a constructor function for the "student" class
student <- function(n,a,g) {
# we can add our own integrity checks
if(g>4 || g<0) stop("GPA must be between 0 and 4")
value <- list(name = n, age = a, GPA = g)
# class can be set using class() or attr() function
attr(value, "class") <- "student"
value
}
stud <- student("name", 10, 3.5)
Now I would like to create a method similar to stud.doubleGPA() which would double the GPA of the student. I know I can achieve this using
stud$GPA <- stud$GPA*2
stud$GPA # 7
However trying to define a function doesn't seem to work.
doubleGPA <- function(student) {
if(!class(student)=="student") stop("nope")
student$GPA <- student$GPA*2
}
doubleGPA(stud)
stud$GPA # 7 again (didn't work)
And replacing <- with <<- in the above function gives
Error in student$GPA <<- student$GPA * 2 :
object of type 'closure' is not subsettable
How can I define such a method, one that belongs to an S3 class and is therefore inherited by its children?
Cheers
You are thinking of a different kind of object oriented programming than the S3 style, something more like C++ or Java. You can do that in R, just not in the S3 system.
In the S3 system, methods "belong to" generic functions, not to classes. Like most functions in R, generic functions don't modify their arguments; they calculate new values and return those. So you might define a generic function doubleGPA(), and have it work on the "student" class using
doubleGPA <- function(x) UseMethod("doubleGPA")
doubleGPA.student <- function(x) {
x$GPA <- x$GPA*2
x
}
and then use it as
stud <- student("name", 10, 3.5)
stud <- doubleGPA(stud)
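Because dispatch walks the class vector, a child class gets the method for free. A small sketch, assuming a hypothetical child class "gradstudent" that is not part of the question:

```r
# Constructor and method as above
student <- function(n, a, g) {
    if (g > 4 || g < 0) stop("GPA must be between 0 and 4")
    structure(list(name = n, age = a, GPA = g), class = "student")
}
doubleGPA <- function(x) UseMethod("doubleGPA")
doubleGPA.student <- function(x) {
    x$GPA <- x$GPA * 2
    x
}

# Hypothetical child class: its class vector lists the parent after itself
gradstudent <- function(n, a, g, thesis) {
    obj <- student(n, a, g)
    obj$thesis <- thesis
    class(obj) <- c("gradstudent", class(obj))
    obj
}

grad <- gradstudent("name", 25, 1.5, "S3 dispatch")
grad <- doubleGPA(grad)   # no doubleGPA.gradstudent, so doubleGPA.student is used
grad$GPA                  # 3
```

The object keeps its "gradstudent" class, since the method only changes one element and returns the rest untouched.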
If you actually want something more like C++ or Java, there are a couple of choices: "reference classes" from the methods package (see ?methods::setRefClass) and "R6 classes" from the R6 package. There are also several prototype-based styles in the packages proto, ggplot2, and R.oo, and there are probably more that I've forgotten to mention.
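For comparison, here is a rough sketch of the reference-class route, where the method belongs to the class and modifies the object in place. The class names Student and GradStudent are made up for illustration.

```r
library(methods)

# Hypothetical reference-class version of the "student" example
Student <- setRefClass("Student",
    fields = list(name = "character", age = "numeric", GPA = "numeric"),
    methods = list(
        doubleGPA = function() {
            # `<<-` assigns to the field of this object, changing it in place
            GPA <<- GPA * 2
            invisible(.self)
        }
    )
)

# A child class inherits both the fields and the method
GradStudent <- setRefClass("GradStudent", contains = "Student")

stud <- Student$new(name = "name", age = 10, GPA = 3.5)
stud$doubleGPA()
stud$GPA   # 7

grad <- GradStudent$new(name = "other", age = 25, GPA = 1.5)
grad$doubleGPA()
grad$GPA   # 3
```

Note that no reassignment such as stud <- ... is needed here, which is exactly the difference from the S3 version.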
I am struggling to find an easy way to modify S4 objects having many slots. A toy example is:
setClass(
Class = "First",
slots = c(foo = "numeric")
)
setClass(
Class = "Second",
slots = c(bar = "numeric"),
contains = "First"
)
dog <- new(Class="First",
foo = 1)
cat <- new(Class="Second",
foo = dog@foo,
bar = 1)
str(cat)
This is trivial because class First contains only one slot (foo). Is there an easy way to combine/modify S4 objects which contain many slots?
Looks like you're trying to instantiate a sub-class with the values of a parent class instance. I don't think there is an easy way to do this, but it can be done. Here, we retrieve the parent class instance slot values, and use do.call to instantiate a child class object:
par.slots <- sapply(slotNames(dog), slot, object=dog, simplify=FALSE)
do.call("new", c("Second", bar=1, par.slots))
## An object of class "Second"
## Slot "bar":
## [1] 1
##
## Slot "foo":
## [1] 1
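An alternative worth trying, as a sketch: the default initialize method treats unnamed arguments to new() as instances of contained classes and copies their slots. This assumes no custom initialize method has been written for Second (or that any such method passes ... on to callNextMethod). The variable name kitten is just for illustration.

```r
library(methods)

setClass("First", slots = c(foo = "numeric"))
setClass("Second", slots = c(bar = "numeric"), contains = "First")

dog <- new("First", foo = 1)

# The unnamed argument supplies every slot inherited from First;
# only the slots added by Second have to be named
kitten <- new("Second", dog, bar = 1)
kitten@foo   # 1
kitten@bar   # 1
```

This scales to parent classes with many slots, since none of them has to be listed.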
Yes, most people who create S4 objects create a variety of methods to work on them in the ways you would be most inclined to manipulate data for your given purpose.
Sometimes that means creating a whole new class-specific method to accomplish a specific task, and at other times it means creating a method that instructs R to apply the class-specific implementation of an existing generic function (such as rbind or summary). You can read about it here:
Bioconductor S4 Tutorial
This should get you headed in the right direction creating your own functions or customizing existing generics to work with your objects.
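To make the two approaches concrete, here is a small sketch using the First class from the question. The generic scaleFoo is a made-up name; summary and length are existing functions that gain a method for the class.

```r
library(methods)

setClass("First", slots = c(foo = "numeric"))

# 1. A whole new generic with a class-specific method (hypothetical name)
setGeneric("scaleFoo", function(object, by) standardGeneric("scaleFoo"))
setMethod("scaleFoo", "First", function(object, by) {
    object@foo <- object@foo * by
    object   # return the modified copy
})

# 2. Class-specific implementations of existing functions
setMethod("summary", "First", function(object, ...) summary(object@foo, ...))
setMethod("length", "First", function(x) length(x@foo))

obj <- new("First", foo = c(1, 2, 3))
obj <- scaleFoo(obj, 2)
obj@foo        # 2 4 6
length(obj)    # 3
summary(obj)
```

As with most R functions, scaleFoo returns a new object, so the result has to be assigned back.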
The problem comes from experimenting with a package and finding that new(Class = 'ddmatrix', Data = X) and ddmatrix(Data = X) yield different results, where X is a matrix (one can think of class ddmatrix as a transformed version of class matrix).
Document
In the package, an S4 class ddmatrix is defined. A generic constructor function is created with setGeneric(name = 'ddmatrix'). Further, the package defines setMethod('ddmatrix', signature = 'matrix', ...) as below:
setMethod("ddmatrix", signature(data="matrix"),
function(data, nrow=1, ncol=1, byrow=FALSE, ...,
bldim=.pbd_env$BLDIM, ICTXT=.pbd_env$ICTXT)
{
dim(data) <- NULL
ret <- ddmatrix(data=data, nrow=nrow, ncol=ncol, byrow=byrow, bldim=bldim, ICTXT=ICTXT)
return( ret )
}
)
I am confused about how the function ddmatrix is used inside the setMethod('ddmatrix', signature = 'matrix') definition above. Is this ddmatrix call the default method for the generic ddmatrix?
Meanwhile, when I call new('ddmatrix', Data = X), which method is called to build a new ddmatrix object from a matrix object? The new function is:
function (Class, ...)
{
ClassDef <- getClass(Class, where = topenv(parent.frame()))
value <- .Call(C_new_object, ClassDef)
initialize(value, ...)
}
Question
To explain the discrepancy between new('ddmatrix') and ddmatrix(), I think one way is to find the default constructor. The package also defines setMethod('ddmatrix', signature = 'vector',...); is this the default one?
At some level this is up to the author. Many people view new() and @ or slot() (for slot access) as strictly for the package developer -- these expose the implementation details directly to the user -- and prefer to write constructors and accessors that place an interface on top of the implementation. This appears to be the case for the package that you are considering, where ddmatrix() is meant to be the user-oriented constructor.
The author appears to have implemented a facade pattern, where several different methods make relatively minor data transformations before calling another function / method to do the actual object construction. From what you show, it seems ddmatrix,matrix-method invokes ddmatrix,vector-method (because inside ddmatrix,matrix-method the function sets dim(data) <- NULL, turning the matrix into a vector, and then calls ddmatrix() which now dispatches to the vector method), and this constructs the object via new() at https://github.com/RBigData/pbdDMAT/blob/master/R/constructors.r#L191. A different package author could have adopted a different design, where several methods separately call new().
The documentation often also helps, e.g., ?ddmatrix does not discuss direct object construction via new().
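The facade idea can be sketched with a toy class. This is not the pbdDMAT code; the class Grid and its slots are invented for illustration. The matrix method makes a small transformation and re-dispatches, and only the vector method calls new().

```r
library(methods)

# Hypothetical class that stores a matrix as a plain vector plus dimensions
setClass("Grid", slots = c(data = "numeric", nrow = "integer", ncol = "integer"))

# User-facing generic constructor
setGeneric("Grid", function(data, ...) standardGeneric("Grid"))

# Vector method: the only place where the object is actually built
setMethod("Grid", "numeric", function(data, nrow = 1L, ncol = 1L, ...) {
    new("Grid", data = as.numeric(data),
        nrow = as.integer(nrow), ncol = as.integer(ncol))
})

# Matrix method: record the dimensions, strip them, then re-dispatch
setMethod("Grid", "matrix", function(data, ...) {
    nr <- nrow(data)
    nc <- ncol(data)
    dim(data) <- NULL              # now a plain numeric vector
    Grid(data, nrow = nr, ncol = nc)   # dispatches to the "numeric" method
})

g <- Grid(matrix(c(1, 2, 3, 4, 5, 6), nrow = 2))
g@nrow   # 2
g@ncol   # 3
```

The call inside the matrix method is therefore not a "default" method; it is an ordinary call to the generic that happens to dispatch on a different class.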
Here's a simpler example. I create a class "A", with a single slot containing a numeric vector
setClass("A", slots=c(x="numeric"))
Here I create a constructor, because I want the user to see the interface to the class, rather than its implementation:
A = function(x=numeric())
new("A", x=x)
So far, A() and new("A") return an object with the same structure, e.g.,
> new("A")
An object of class "A"
Slot "x":
numeric(0)
> A()
An object of class "A"
Slot "x":
numeric(0)
Maybe as the developer of the "A" class, I want an uninitialized object of class 'A' to have 'NA' as the value of the slot x, so I modify
A = function(x = NA_real_)
new("A", x=x)
Now a direct call to new() returns a different object from a call to A():
> new("A")
An object of class "A"
Slot "x":
numeric(0)
> A()
An object of class "A"
Slot "x":
[1] NA
Which one is 'correct'? Well, both are correct, but as the creator of the class I intend for the user to create an object of class "A" by calling the function A().
A typical reason for separating the interface (using A() to construct an object) from the implementation (using new() to construct an object) is because the implementation is not obvious to the user. This seems to be the case with the ddmatrix() function -- for reasons that only the package author needs to know about, it is convenient to store an R matrix as a vector with information about dimensions. I guess a simple equivalent might be
setClass("A", slots=c(data="numeric", nrow="integer", ncol="integer"))
A = function(m=matrix(0, 0, 0)) {
stopifnot(is(m, "matrix"))
new("A", data=as.vector(m), nrow=nrow(m), ncol=ncol(m))
}
for instance
> A(matrix(1:10, 5))
An object of class "A"
Slot "data":
[1] 1 2 3 4 5 6 7 8 9 10
Slot "nrow":
[1] 5
Slot "ncol":
[1] 2
Why does the author want to do this? It doesn't matter to us as users. Why can't we create the same object by calling m = matrix(1:10, 5); new("A", data=as.vector(m), nrow=nrow(m), ncol=ncol(m))? We could, but then when the author decided to change their implementation such that the offsets to the start of each row were to be stored, we'd have to understand what the author had done and update our code.
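The same reasoning applies to reading the object. A sketch, assuming a hypothetical accessor generic asMatrix that is not part of any package: users call the constructor and accessors, and only those functions know how the data are stored.

```r
library(methods)

setClass("A", slots = c(data = "numeric", nrow = "integer", ncol = "integer"))

# Constructor: the interface for creating objects
A <- function(m = matrix(0, 0, 0)) {
    stopifnot(is.matrix(m))
    new("A", data = as.vector(m), nrow = nrow(m), ncol = ncol(m))
}

# Accessors: the interface for reading objects
setMethod("dim", "A", function(x) c(x@nrow, x@ncol))
setGeneric("asMatrix", function(x) standardGeneric("asMatrix"))  # hypothetical name
setMethod("asMatrix", "A", function(x) {
    matrix(x@data, nrow = x@nrow, ncol = x@ncol)
})

m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
a <- A(m)
dim(a)        # 2 3
asMatrix(a)   # the original matrix, rebuilt from the stored vector
```

If the author later changes the slots, only A(), dim() and asMatrix() need updating; code that uses them keeps working.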
I'm trying to figure out how NextMethod() works. The most detailed explanation I have found of the S3 class system is in Statistical Models in S, edited by Chambers & Hastie (1993, Chapman & Hall); however, I find the part concerning NextMethod invocation a little obscure. Following are the relevant paragraphs I'm trying to make sense of (pp. 268-269).
Turning now to methods invoked as a result of a call to
NextMethod(), these behave as if they had been called from the
previous method with a special call. The arguments in the call to the
inherited method are the same in number, order, and actual argument
names as those in the call to the current method (and, therefore, in
the call to the generic). The expressions for the arguments, however,
are the names of the corresponding formal arguments of the current
method. Suppose, for example, that the expression print(ratings) has
invoked the method print.ordered(). When this method invokes
NextMethod(), this is equivalent to a call to print.factor() of
the form print.factor(x), where x is here the x in the frame of
print.ordered(). If several arguments match the formal argument
"...", those arguments are represented in the call to the inherited
method by special names "..1", "..2", etc. The evaluator recognizes
these names and treats them appropriately (see page 476 for an
example).
This rather subtle definition exists to ensure that the semantics of
function calls in S carry over as cleanly as possible to the use of
methods (compare Becker, Chambers and Wilks's The New S Language,
page 354). In particular:
Arguments are passed down from the current method to the inherited method with their current values at the time NextMethod() is called.
Lazy evaluation continues in effect; unevaluated arguments stay unevaluated.
Missing arguments remain missing in the inherited method.
Arguments passed through the "..." formal argument arrive with the correct argument name.
Objects in the frame that do not correspond to actual arguments in the call will not be passed to the inherited method.
The inheritance process is essentially transparent so far as the
arguments go.
Two points that I find confusing are:
What is "the current method" and what is "the previous method"?
What is the difference between "The arguments in the call to the inherited method", "The expressions for the arguments" and "the names of the corresponding formal arguments of the current method"?
Generally speaking, if anyone could restate the description given in the above paragraphs in a more lucid fashion, I'd appreciate it.
It is hard to go through this whole post, but I think this small example can help to demystify NextMethod dispatching.
I create an object with two class attributes (inheritance), 'first' and 'second'.
x <- 1
attr(x,'class') <- c('first','second')
Then I create a generic function Cate to print my object
Cate <- function(x,...)UseMethod('Cate')
I define a Cate method for each class.
Cate.first <- function(x,...){
print(match.call())
print(paste('first:',x))
print('---------------------')
NextMethod() ## This will call Cate.second
}
Cate.second <- function(x,y){
print(match.call())
print(paste('second:',x,y))
}
Now you can check the Cate call using this example:
Cate(x,1:3)
Cate.first(x = x, 1:3)
[1] "first: 1"
[1] "---------------------"
Cate.second(x = x, y = 1:3)
[1] "second: 1 1" "second: 1 2" "second: 1 3"
For Cate.second, the previous method is Cate.first.
Arguments x and y are passed down from the current method to the inherited method with their current values at the time NextMethod() is called.
Argument y, passed through the "..." formal argument, arrives with the correct argument name: Cate.second(x = x, y = 1:3)
Consider this example where generic function f is called and it invokes f.ordered and then, using NextMethod, f.ordered invokes f.factor:
f <- function(x) UseMethod("f") # generic
f.ordered <- function(x) { x <- x[-1]; NextMethod() }
f.factor <- function(x) x # inherited method
x <- ordered(c("a", "b", "c"))
class(x)
## [1] "ordered" "factor"
f(x)
## [1] b c
## Levels: a < b < c
Now consider the original text:
Turning now to methods invoked as a result of a call to NextMethod(),
these behave as if they had been called from the previous method with
a special call.
Here f calls f.ordered which calls f.factor so the method "invoked as a
result of a call to NextMethod" is f.factor and the previous method is
f.ordered.
The arguments in the call to the inherited method are the same in
number, order, and actual argument names as those in the call to the
current method (and, therefore, in the call to the generic). The
expressions for the arguments, however, are the names of the
corresponding formal arguments of the current method. Suppose, for
example, that the expression print(ratings) has invoked the method
print.ordered(). When this method invokes NextMethod(), this is
equivalent to a call to print.factor() of the form print.factor(x),
where x is here the x in the frame of print.ordered()
Now we switch perspectives: we are sitting in f.ordered, so f.ordered is the current method and f.factor is the inherited method.
At the point that f.ordered invokes NextMethod(), a special call to f.factor is constructed whose arguments are the same as those passed to f.ordered and to the generic f, except that they refer to the versions of the arguments in f.ordered (which makes a difference here, as f.ordered changes the argument before invoking f.factor).
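Three of the points from the quoted list can be checked with a small sketch. The generic h and the classes "child" and "parent" are invented for illustration.

```r
h <- function(x, y) UseMethod("h")   # generic

h.child <- function(x, y) {
    scratch <- "local only"      # not an argument, so it is not passed on
    x <- unclass(x) + 100        # change the formal before delegating
    NextMethod()                 # like h.parent(x), with x taken from this frame
}

h.parent <- function(x, y) {
    list(x = x,
         y_missing = missing(y),
         sees_scratch = exists("scratch", inherits = FALSE))
}

obj <- structure(1, class = c("child", "parent"))

res <- h(obj)
res$x              # 101: the value x had when NextMethod() was called
res$y_missing      # TRUE: y was missing in h.child and stays missing
res$sees_scratch   # FALSE: local variables of h.child are not passed down

h(obj, 2)$y_missing   # FALSE: a supplied y is passed along
```

Note that the next method is still h.parent even though x no longer carries a class attribute; the choice was fixed when the generic dispatched.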
In my package, I want to subclass a class TheBaseClass from a contributed package (so it is out of my reach). There is a function for creating objects of this class. Here is a minimal example for that code.
setClass("TheBaseClass", representation(a="numeric"))
initBase <- function() new("TheBaseClass", a=1) # in reality more complex
Now I simply want to use initBase as the constructor for my subclass, but I do not know how to set the new class:
setClass("MyInheritedClass", contains="TheBaseClass")
initInher <- function() {
res <- initBase()
class(res) <- "MyInheritedClass" # this does not work for S4
}
How can I alter the last line to make it work? Copying and pasting the initBase function is not an option, since it involves a .C call. I read about setIs, but this does not seem to be the right function here.
Any hint appreciated!
Perhaps this answer provides more extensive explanation. One pattern is to provide an instance of the base class as an unnamed argument to your class constructor
.MyInheritedClass <- setClass("MyInheritedClass", contains="TheBaseClass")
.MyInheritedClass(initBase())
(setClass returns a generator function, which is really no different from calling new but seems cleaner; I use . in front, because generators are maybe a little too crude for "end users", e.g., there is no hint about what the arguments are supposed to be, just ...). This assumes that you have not written an initialize method for your class, or that your initialize method has been constructed in a way that is consistent with the contract of initialize,ANY-method, with a slightly more complicated class
.A <- setClass("A", contains="TheBaseClass",
representation=representation(x="numeric"))
setMethod("initialize", "A",
function(.Object, ..., x)
{
x <- log(x) # your class-specific initialization...
callNextMethod(.Object, ..., x = x) # passed to parent constructor
})
This pattern requires that the initialize method of the base class has been designed correctly. In action:
> .A(initBase(), x=1:2)
An object of class "A"
Slot "x":
[1] 0.0000000 0.6931472
Slot "a":
[1] 1
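Putting the first pattern together into the constructor the question asks for gives roughly the following. This is a sketch using the toy initBase from the question; the real one is more complex.

```r
library(methods)

# Stand-ins from the question
setClass("TheBaseClass", representation(a = "numeric"))
initBase <- function() new("TheBaseClass", a = 1)

# Generator for the subclass; no initialize method is defined for it
.MyInheritedClass <- setClass("MyInheritedClass", contains = "TheBaseClass")

# Constructor: build the base object, then pass it unnamed to the generator,
# which copies its slots into a new MyInheritedClass instance
initInher <- function() {
    .MyInheritedClass(initBase())
}

obj <- initInher()
is(obj, "MyInheritedClass")   # TRUE
obj@a                         # 1
```

Nothing from initBase has to be copied or re-implemented, so the .C call stays where it is.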