If given an object x, is there a way to classify whether or not it is S3 or S4 (or "other")? I have looked at is.object() and isS4(), and can identify that something is an object (or not) and that it is an S4 object (or not). However, it doesn't seem to me that S3 objects are the complement of all objects that are not S4 objects.
Therefore, how can these assignments be done programmatically?
Here is an example of something that bugs me, taken from the help for is.object():
a = as.factor(1:3)
is.object(a) # TRUE
isS4(a) # FALSE
Does that mean that a is an S3 object?
If it is an object and is not S4, then it is S3:
is.object(foo) && !isS4(foo)
is.object checks for some magic OBJECT bit that gets set when the thing has a class attribute, so it's essentially a fast way of doing any(names(attributes(foo))=="class"), which is what defines an S3 object.
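The test above can be wrapped in a small helper. This is only a minimal sketch: the name object_system and the toy class Point are hypothetical, and anything that is neither S4 nor carries a class attribute is lumped into "other". Note that reference class instances also return TRUE from isS4(), so they would be labelled "S4" here.

```r
library(methods)  # attached by default in interactive sessions

# Hypothetical helper: label an object as "S4", "S3", or "other".
object_system <- function(x) {
  if (isS4(x)) {
    "S4"
  } else if (is.object(x)) {
    "S3"      # has a class attribute but is not S4
  } else {
    "other"   # base types that only have an implicit class
  }
}

setClass("Point", representation(x = "numeric"))  # toy S4 class for the demo

object_system(as.factor(1:3))        # "S3"
object_system(new("Point", x = 1))   # "S4"
object_system(1:3)                   # "other"
```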
Related
To apply a function to all slots in S4.
Of course, it can be done with for-loop over slotNames(). But I'm curious if it can be done in a vectorized way.
In general it isn't possible to operate on slots in a vectorised way, because the slots might have any class. If a class has structure
slotA = "factor"
slotB = "integer"
slotC = "numeric"
then even though you might be applying the same (generic) function to all of them (say, summary) the actual methods that get called will be different. The task just isn't vectorisable, any more than the set of commands "mop the floor, wash the car and vacuum the carpet" could be vectorised even though they might all share the generic function clean — you need a mop for one task, a sponge for another and a vacuum cleaner for the third. (Contrast that with the set of commands "vacuum the three carpets in the bedroom, hallway and lounge" which can be vectorised to an extent — you don't have to get the vacuum cleaner out of the box three times and put it away three times, you can do it just once)
If you can guarantee that all the slots will be of the same class, then it becomes easier to vectorise, but if that is the case, why does this object have the structure that it does? If it needs to be S4 then just define a simple class that contains a list, matrix or array and then use sapply or apply as needed.
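For completeness, the slotNames() loop mentioned in the question can be written compactly with lapply. This is a sketch, not true vectorisation: the class name ThreeSlots and the slot values are made up for illustration, and summary() still dispatches to a different method for each slot.

```r
library(methods)

# Toy class mirroring the slot layout above (class name is hypothetical).
setClass("ThreeSlots",
         representation(slotA = "factor", slotB = "integer", slotC = "numeric"))

obj <- new("ThreeSlots",
           slotA = factor(c("a", "b", "a")),
           slotB = 1:3,
           slotC = c(0.5, 1.5, 2.5))

# Loop over the slot names; summary() picks a different method per slot.
nms <- slotNames(obj)
slot_summaries <- lapply(setNames(nms, nms),
                         function(nm) summary(slot(obj, nm)))

names(slot_summaries)
```

The result is a named list with one element per slot, which is usually as far as this can be condensed when the slots have different classes.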
In packages like marray and limma, when complex objects are loaded, they contain "member variables" that are accessed using the @ symbol. What does this mean and how does it differ from the $ symbol?
See ?'@':
Description:
Extract the contents of a slot in a object with a formal (S4)
class structure.
Usage:
object@name
...
The S language has two object systems, known informally as S3 and S4.
S3 objects, classes and methods have been available in R
from the beginning, they are informal, yet very interactive.
S3 was first described in the White Book (Statistical Models in S).
S3 is not a real class system, it mostly is a set of naming
conventions.
S4 objects, classes and methods are much more formal and
rigorous, hence less interactive. S4 was first described
in the Green Book (Programming with Data). In R it is
available through the methods package, attached by default
since version 1.7.0.
See also this document: S4 Classes and Methods.
As the others have said, the @ symbol is used with S4 classes, but here is a note from Google's R Style Guide: "Use S3 objects and methods unless there is a strong reason to use S4 objects or methods."
You will want to read up on S4 classes, which use the @ symbol.
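A small side-by-side comparison may make the difference concrete. This is an illustrative sketch: the class names "record" and "Record" and their fields are invented for the example and do not come from marray or limma.

```r
library(methods)

# S3: an ordinary list with a class attribute; elements are reached with $.
s3_obj <- structure(list(name = "a", value = 1), class = "record")
s3_obj$name

# S4: a formally defined class; slots are reached with @ (or slot()).
setClass("Record", representation(name = "character", value = "numeric"))
s4_obj <- new("Record", name = "a", value = 1)
s4_obj@name
slot(s4_obj, "value")   # functional equivalent of s4_obj@value
```

Unlike list elements, slots are declared in advance with a class, so assigning a value of the wrong class to a slot is rejected.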
How does one correctly do the following:
I have a class SpectraSet with slots parentSpectrum, childSpectra, name (to keep it simple)
name is character()
parentSpectrum should contain one object of class ParentSpec (so it is of type ParentSpec)
childSpectra should contain n objects of class ChildSpec. However I can't make it of type ChildSpec because vectors can only contain atomic types. What is best practice in this case? I can make it a list() and type check in the validity check, but is there anything better?
Here are related answers I've provided in the past.
It's usually better to re-think the class design so ChildSpec is intrinsically a vector -- minimally, supports length() and subsetting [, [[. Your problem above then goes away, the design is consistent with R's vectorized orientation, and likely common operations are efficient.
An alternative to implementing your own type-checked list (which is really the other alternative) is to re-use the infrastructure from Bioconductor's S4Vectors package
library(S4Vectors)  # Bioconductor package that provides SimpleList
.X = setClass("X", representation(x="numeric"))
.XList = setClass("XList", contains="SimpleList",
    prototype=prototype(elementType="X"))
And in action
> xl = .XList(listData=list(.X(x=1), .X(x=2)))
> xl
XList of length 2
> xl[[2]]
An object of class "X"
Slot "x":
[1] 2
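For comparison, the hand-rolled type-checked list mentioned in the question could look roughly like the following. This is a sketch under assumptions: the mz slots on ChildSpec and ParentSpec are hypothetical placeholders, and only the element-class check is shown.

```r
library(methods)

# Placeholder definitions; the 'mz' slot is hypothetical.
setClass("ChildSpec",  representation(mz = "numeric"))
setClass("ParentSpec", representation(mz = "numeric"))

setClass("SpectraSet",
         representation(name = "character",
                        parentSpectrum = "ParentSpec",
                        childSpectra = "list"),
         validity = function(object) {
           # Every element of the list must be (or extend) ChildSpec.
           ok <- vapply(object@childSpectra, is, logical(1), "ChildSpec")
           if (all(ok)) TRUE
           else "all elements of 'childSpectra' must be ChildSpec objects"
         })

good <- new("SpectraSet", name = "set1",
            parentSpectrum = new("ParentSpec", mz = 100),
            childSpectra = list(new("ChildSpec", mz = 50),
                                new("ChildSpec", mz = 60)))

# A list holding anything else is rejected when the object is created.
bad <- try(new("SpectraSet", name = "set2",
               parentSpectrum = new("ParentSpec", mz = 100),
               childSpectra = list(1, "a")),
           silent = TRUE)
inherits(bad, "try-error")
```

The validity function only runs when the object is created or when validObject() is called explicitly, so later direct slot assignments can still bypass it.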
I'm implementing an S4 class that contains a data.table, and attempting to implement [ subsetting of the object (as described here) such that it also subsets the data.table. For example (defining just i subsetting):
library(data.table)
.SuperDataTable <- setClass("SuperDataTable", representation(dt="data.table"))
setMethod("[", c("SuperDataTable", "ANY", "missing", "ANY"),
    function(x, i, j, ..., drop=TRUE)
{
    initialize(x, dt=x@dt[i])
})
d = data.table(a=1:4, b=rep(c("x", "y"), each=2))
s = new("SuperDataTable", dt=d)
At this point, subsetting with a numeric vector (s[1:2]) works as desired (it subsets the data.table in the slot). However, I'd like to add the ability to subset using an expression. This works for the data.table itself:
s@dt[b == "x"]
# a b
# 1: 1 x
# 2: 2 x
But not for the S4 [ method:
s[b == "x"]
# Error: object 'b' not found
The problem appears to be that arguments in the signature of the S4 method are not evaluated using R's traditional lazy evaluation (see here):
All arguments in the signature of the generic function will be
evaluated when the function is called, rather than using the
traditional lazy evaluation rules of S. Therefore, it's important to
exclude from the signature any arguments that need to be dealt with
symbolically (such as the first argument to function substitute).
This explains why it doesn't work, but not how one can implement this kind of subsetting, since i and j are included in the signature of the generic. Is there any way to have the i argument not be evaluated immediately?
You may be out of luck on this one. From the R developer notes,
Arguments appearing in the signature of the generic will be evaluated as soon as the generic function
is called; therefore, any arguments that need to take advantage of lazy evaluation must not be in
the signature. These are typically arguments treated literally, often via the substitute() function.
For example, if one wanted to turn substitute() itself into a generic, the first argument, expr,
would not be in the signature since it must not be evaluated but rather treated as a literal.
Furthermore, due to method caching,
All the arguments in the full signature are evaluated as described above, not just the active
ones. Otherwise, in special circumstances the behavior of the function could change for one
method when another method was cached, definitely undesirable.
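The rule quoted above can be seen with two toy generics. This is an illustrative sketch: peek_lazy, peek_eager, and no_such_column are hypothetical names, and a user-defined generic is used because the signature of the primitive `[` cannot be changed.

```r
library(methods)

# 'expr' is excluded from the signature, so it stays an unevaluated promise
# and substitute() can capture it.
setGeneric("peek_lazy", function(x, expr) standardGeneric("peek_lazy"),
           signature = "x")
setMethod("peek_lazy", "numeric",
          function(x, expr) deparse(substitute(expr)))

# 'expr' is part of the signature, so method dispatch has to evaluate it.
setGeneric("peek_eager", function(x, expr) standardGeneric("peek_eager"))
setMethod("peek_eager", c("numeric", "logical"),
          function(x, expr) deparse(substitute(expr)))

# Captured as text; 'no_such_column' is never looked up.
lazy_result <- peek_lazy(1, no_such_column == "x")

# Dispatch evaluates the argument, which fails because the name is undefined.
eager_result <- tryCatch(peek_eager(1, no_such_column == "x"),
                         error = function(e) conditionMessage(e))
```

Here lazy_result holds the deparsed expression, while eager_result holds an error message, which mirrors what happens to i in the `[` method above.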
I would follow the example from the data.table package writers and use an S3 object (see line 304 of R/data.table.R in their source code). Your S3 object can still create and manipulate an S4 object underneath to maintain the semi-static typing feature.
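A rough sketch of the S3 route follows. To keep it self-contained it wraps a base data.frame rather than a data.table, which is a deliberate simplification; the class name SuperDF and the constructor new_super_df are hypothetical. The point is only that an S3 `[` method receives i unevaluated and can capture it with substitute().

```r
# Hypothetical S3 wrapper around a data.frame (stand-in for a data.table).
new_super_df <- function(df) structure(list(df = df), class = "SuperDF")

# S3 method for `[`: 'i' arrives as a promise, so it can be captured
# symbolically and evaluated with the wrapped columns in scope.
`[.SuperDF` <- function(x, i, ...) {
  expr <- substitute(i)
  rows <- eval(expr, x$df, parent.frame())
  new_super_df(x$df[rows, , drop = FALSE])
}

s <- new_super_df(data.frame(a = 1:4, b = rep(c("x", "y"), each = 2)))

by_expr  <- s[b == "x"]   # expression evaluated inside the wrapped data
by_index <- s[3:4]        # plain row indices still work
by_expr$df
```

With a real data.table in place of the data.frame, the method would presumably need to forward the captured expression to data.table's own `[` instead of evaluating it directly.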
We can't get extraordinarily clever:
‘[’ is a primitive function; methods can be defined, but the generic function is implicit, and cannot be changed.
Defining both an S3 and S4 method will dispatch the S3 method, which makes it seem like we should be able to route around the S4 call and dispatch it manually, but unfortunately the argument evaluation still occurs! You can get close by borrowing plyr::., which would give you syntax like:
s <- new('SuperDataTable', dt = as.data.table(iris))
s[.(Sepal.Length > 4), 2]
Not ideal, but closer than anything else.