In packages like marray and limma, when complex objects are loaded, they contain "members variables" that are accessed using the @ symbol. What does this mean and how does it differ from the $ symbol?
See ?'@':
Description:
Extract the contents of a slot in a object with a formal (S4)
class structure.
Usage:
object@name
...
The S language has two object systems, known informally as S3 and S4.
S3 objects, classes and methods have been available in R
from the beginning; they are informal, yet very interactive.
S3 was first described in the White Book (Statistical Models in S).
S3 is not a real class system; it is mostly a set of naming
conventions.
S4 objects, classes and methods are much more formal and
rigorous, hence less interactive. S4 was first described
in the Green Book (Programming with Data). In R it is
available through the methods package, attached by default
since version 1.7.0.
See also this document: S4 Classes and Methods.
As the others have said, the @ symbol is used with S4 classes, but here is a note from Google's R Style Guide: "Use S3 objects and methods unless there is a strong reason to use S4 objects or methods."
You will want to read up on S4 classes, which use the @ symbol.
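To make the difference concrete, here is a minimal sketch. The class name Person, the S3 class name person_s3, and the fields name and age are made up for illustration and do not come from marray or limma. An S4 object keeps its data in slots read with @, while a list-based (typically S3) object keeps its data in components read with $.

```r
# Hypothetical S4 class with two slots; slots are read with @
setClass("Person", representation(name = "character", age = "numeric"))
p <- new("Person", name = "Ada", age = 36)
p@name           # slot access with @
slot(p, "age")   # the same kind of access in functional form

# A plain list with a class attribute (a typical S3 object);
# components are read with $
q <- list(name = "Ada", age = 36)
class(q) <- "person_s3"
q$name

isS4(p)  # TRUE
isS4(q)  # FALSE
```

Note that some S4 classes also define a $ method for convenience, so seeing $ used on an object does not by itself prove that the object is a list.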
I recently thought about the ... argument of a function and noticed that R does not let you check the class of that object directly.
f <- function(...) {
  class(...)
}
f(1, 2, 3)
## Error in class(...) : 3 arguments passed to 'class' which requires 1
Now with the quote
“To understand computations in R, two slogans are helpful:
• Everything that exists is an object.
• Everything that happens is a function call.”
— John Chambers
in my head I'm wondering: What kind of object is ...?
What an interesting question!
Dot-dot-dot (...) is indeed an object (John Chambers is right!), and it is stored as a type of pairlist. I searched the documentation, and here is what I found:
R Language Definition document says:
The ‘...’ object type is stored as a type of pairlist. The components of ‘...’ can be accessed in the usual pairlist manner from C code, but is not easily accessed as an object in interpreted code. The object can be captured as a list.
Another chapter defines pairlists in detail:
Pairlist objects are similar to Lisp’s dotted-pair lists.
Pairlists are handled in the R language in exactly the same way as generic vectors (“lists”).
Help on Generic and Dotted Pairs says:
Almost all lists in R internally are Generic Vectors, whereas traditional dotted pair lists (as in LISP) remain available but rarely seen by users (except as formals of functions).
And a nice summary is here at Stack Overflow!
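As the Language Definition says, the object can be captured as a list, and that is the practical way to inspect it from interpreted code. A minimal sketch, redefining the f from the question so that it reports the class of each argument rather than of ... itself:

```r
f <- function(...) {
  args <- list(...)  # capture the dots as an ordinary list
  # report the first class of each captured argument
  vapply(args, function(a) class(a)[1], character(1))
}

f(1, "a", TRUE)
# [1] "numeric"   "character" "logical"
```

Calling list(...) evaluates the arguments. That is fine for checking classes, but it is worth keeping in mind if any argument is expensive to compute or has side effects.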
Is there a way to apply a function to all slots of an S4 object?
Of course, it can be done with a for loop over slotNames(), but I'm curious whether it can be done in a vectorized way.
In general it isn't possible to operate on slots in a vectorised way, because the slots might have any class. If a class has structure
slotA = "factor"
slotB = "integer"
slotC = "numeric"
then even though you might be applying the same (generic) function to all of them (say, summary) the actual methods that get called will be different. The task just isn't vectorisable, any more than the set of commands "mop the floor, wash the car and vacuum the carpet" could be vectorised even though they might all share the generic function clean — you need a mop for one task, a sponge for another and a vacuum cleaner for the third. (Contrast that with the set of commands "vacuum the three carpets in the bedroom, hallway and lounge" which can be vectorised to an extent — you don't have to get the vacuum cleaner out of the box three times and put it away three times, you can do it just once)
If you can guarantee that all the slots will be of the same class, then it becomes easier to vectorise, but if that is the case, why does this object have the structure that it does? If it needs to be S4 then just define a simple class that contains a list, matrix or array and then use sapply or apply as needed.
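If the goal is only to avoid writing the for loop by hand, an apply-style call over slotNames() is a reasonable compromise. The sketch below assumes a made-up class Demo with the three slots from the example above. It is still a loop under the hood, and summary dispatches to a different method for each slot.

```r
# Hypothetical class with the slot structure described above
setClass("Demo", representation(slotA = "factor",
                                slotB = "integer",
                                slotC = "numeric"))
obj <- new("Demo",
           slotA = factor(c("a", "b", "a")),
           slotB = 1:3,
           slotC = c(0.5, 1.5, 2.5))

# Apply summary() to every slot; simplify = FALSE keeps a named list,
# since the results have different shapes
slot_summaries <- sapply(slotNames(obj),
                         function(nm) summary(slot(obj, nm)),
                         simplify = FALSE)
names(slot_summaries)
# [1] "slotA" "slotB" "slotC"
```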
How does one correctly do the following:
I have a class SpectraSet with slots parentSpectrum, childSpectra, name (to keep it simple)
name is character()
parentSpectrum should contain one object of class ParentSpec (so it is of type ParentSpec)
childSpectra should contain n objects of class ChildSpec. However I can't make it of type ChildSpec because vectors can only contain atomic types. What is best practice in this case? I can make it a list() and type check in the validity check, but is there anything better?
Here are related answers I've provided in the past.
It's usually better to re-think the class design so ChildSpec is intrinsically a vector -- minimally, supports length() and subsetting [, [[. Your problem above then goes away, the design is consistent with R's vectorized orientation, and likely common operations are efficient.
An alternative to implementing your own type-checked list (which is really the other alternative) is to re-use the infrastructure from Bioconductor's S4Vectors package:
library(S4Vectors)  # provides the SimpleList class
.X = setClass("X", representation(x="numeric"))
.XList = setClass("XList", contains="SimpleList",
    prototype=prototype(elementType="X"))
And in action
> xl = .XList(listData=list(.X(x=1), .X(x=2)))
> xl
XList of length 2
> xl[[2]]
An object of class "X"
Slot "x":
[1] 2
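For comparison, the hand-rolled approach mentioned in the question (a list slot plus a validity check) can be sketched with the methods package alone. The slot name mz and the example values are assumptions for illustration. Only the class names and the slots name, parentSpectrum and childSpectra come from the question.

```r
# Hypothetical minimal definitions of the two spectrum classes
setClass("ParentSpec", representation(mz = "numeric"))
setClass("ChildSpec", representation(mz = "numeric"))

setClass("SpectraSet",
    representation(name = "character",
                   parentSpectrum = "ParentSpec",
                   childSpectra = "list"),
    validity = function(object) {
        # every element of the list must be a ChildSpec
        ok <- vapply(object@childSpectra,
                     function(x) is(x, "ChildSpec"), logical(1))
        if (all(ok)) TRUE
        else "all elements of 'childSpectra' must be ChildSpec objects"
    })

ss <- new("SpectraSet", name = "s1",
          parentSpectrum = new("ParentSpec", mz = 500),
          childSpectra = list(new("ChildSpec", mz = 100),
                              new("ChildSpec", mz = 200)))

# A list containing anything else is rejected when the object is created
bad <- tryCatch(
    new("SpectraSet", name = "s2",
        parentSpectrum = new("ParentSpec", mz = 500),
        childSpectra = list(1)),
    error = function(e) e)
inherits(bad, "error")  # TRUE
```

The validity function runs when new() is called with slot values and when validObject() is called explicitly. It does not run on direct assignments such as ss@childSpectra <- list(1), so the check is weaker than a truly typed container.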
If given an object x, is there a way to classify whether or not it is S3 or S4 (or "other")? I have looked at is.object() and isS4(), and can identify that something is an object (or not) and that it is an S4 object (or not). However, it doesn't seem to me that S3 objects are the complement of all objects that are not S4 objects.
Therefore, how can these assignments be done programmatically?
Here is an example of something that bugs me, taken from the help for is.object():
a = as.factor(1:3)
is.object(a) # TRUE
isS4(a) # FALSE
Does that mean that a is an S3 object?
If it is an object and is not S4, then it is S3:
is.object(foo) && !isS4(foo)
is.object checks for some magic OBJECT bit that gets set when the thing has a class attribute, so it's essentially a fast way of doing any(names(attributes(foo))=="class"), which is what defines an S3 object.
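Wrapped up as a small helper, the test might look like the sketch below. The function name object_system and the label "base" for things without a class attribute are my own choices. Reference class objects are built on S4, so they are reported as "S4" here.

```r
# Hypothetical helper: classify x as "S4", "S3" or "base"
object_system <- function(x) {
  if (isS4(x)) {
    "S4"
  } else if (is.object(x)) {
    "S3"    # has a class attribute but is not S4
  } else {
    "base"  # no class attribute, e.g. a plain vector or matrix
  }
}

a <- as.factor(1:3)
object_system(a)    # "S3"
object_system(1:3)  # "base"

setClass("Tmp", representation(x = "numeric"))  # throwaway S4 class
object_system(new("Tmp", x = 1))  # "S4"
```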