How do I see existing classes - r

I have used the setClass function to define several new classes, but these classes don't appear in my RStudio environment. How do I see all the classes that exist?
Here is an example:
setClass("geckoNss", representation(absolute = "character", item = "list"))
The class now exists somewhere, as we can do
> getClass("geckoNss")
Class "geckoNss" [in ".GlobalEnv"]
Slots:
Name: absolute item
Class: character list
and make objects of that class:
> new("geckoNss")
An object of class "geckoNss"
Slot "absolute":
character(0)
Slot "item":
list()
Yet, I still do not see the class anywhere. BondedDust's answer suggests that you can only see these classes if you assign them to an object.
So is there no way to even see the default classes R comes with?

http://stat.ethz.ch/R-manual/R-devel/library/methods/html/Classes.html
"When a class is defined, an object is stored that contains the information about that class. The object, known as the metadata defining the class, is not stored under the name of the class (to allow programmers to write generating functions of that name), but under a specially constructed name. To examine the class definition, call getClass. The information in the metadata object includes: "
From the setClass help page, it's stored in the environment where it is created (by default) or in the environment specified with the "where" argument:
"Create a class definition, specifying the representation (the slots) and/or the classes contained in this one (the superclasses), plus other optional details. As a side effect, the class definition is stored in the specified environment. A generator function is returned as the value of setClass(), suitable for creating objects from the class if the class is not virtual."
After running a setClass call at the console you get an object in the global environment by that name:
> track <- setClass("track",
+ slots = c(x="numeric", y="numeric"))
> ls()
[1] "A" "AE_by_factors" "B"
[4] "dat" "dd" "df"
[7] "final" "hl" "len"
[10] "lm0" "ml" "ml0"
[13] "peas2" "realdata" "temp"
[16] "tolerance" "track" "TravelMode"
[19] "vbin" "vint" "vnum"
> track
class generator function for class “track” from package ‘.GlobalEnv’
function (...)
new("track", ...)
> class(track)
#----------
[1] "classGeneratorFunction"
attr(,"package")
[1] "methods"
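For S4 classes specifically, the hidden metadata object described in the quoted help text can be listed directly. This is a minimal sketch, assuming the class was defined at top level so that it lands in the global environment; the `.__C__` prefix is the specially constructed name the methods package uses, and the variable names are only for illustration:

```r
library(methods)

# Same class as in the question, defined at top level.
setClass("geckoNss", representation(absolute = "character", item = "list"))

# The metadata object's name starts with a dot, so plain ls() hides it;
# all.names = TRUE reveals it.
meta_names <- ls(envir = globalenv(), all.names = TRUE, pattern = "^\\.__C__")
print(meta_names)

# getClasses() lists the S4 classes defined in a given environment.
s4_in_global <- getClasses(where = globalenv())
print(s4_in_global)

# Classes that ship with an attached package can be listed the same way.
s4_in_methods <- getClasses(where = as.environment("package:methods"))
print(head(s4_in_methods))
```

This only covers formally defined (S4 and reference) classes; S3 classes have no such registry.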
Your question originally asked about S4 classes, i.e. the ones created with setClass. It wasn't at all clear that you wanted to find S3 and what might be called default or implicit classes. They are managed in a different manner. If you want to see all the classes that exist for the print function, just type:
methods(print) # I get 397 different methods at the moment. Each one implies an S3 class.
# a variable number of values will appear depending on which packages are loaded
Also read the help page for ?methods. Those are each dispatched on the basis of the class attribute. For classes such as 'numeric', 'integer', 'character' or 'list' that are implicit but not stored in an object's class attribute, you simply need to know that they were built into the original S language. The S3 dispatch mechanism was actually bolted on to that core S mechanism back in the dawn of time. S3 was part of the language when it was described by "New S Language". I currently see that you can still get used copies at Amazon:
New S Language Paperback – June 30, 1988
by R. A. Becker (Author), J. M. Chambers (Author), Allan R Wilks (Author)
There are other functions that allow you to look at the functions accessible along the search path:
> ?objects
> length(objects())
[1] 85
> length(apropos(what="", mode="function"))
[1] 3431
So on my machine a bit more than 10% of the available functions are print methods.
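The distinction between an implicit class and one stored in the class attribute can be checked directly. A small sketch, with object names chosen only for illustration:

```r
x <- 1:3                 # plain integer vector, no class attribute
df <- data.frame(a = 1)  # S3 object, carries a class attribute

implicit_class <- class(x)          # "integer", derived from the type
stored_class_x <- attr(x, "class")  # NULL: nothing is stored on the object
stored_class_df <- attr(df, "class") # "data.frame": stored on the object

print(implicit_class)
print(stored_class_x)
print(stored_class_df)
```

So class() always reports something, but only for df is that something actually recorded on the object.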

Related

what is the object type of mean in R?

I am looking for the real object type of some functions in R, for example, I can not find out the object type of mean function.
> library(pryr)
> otype(mean)
[1] "base"
> ftype(mean)
[1] "s3" "generic"
Sometimes the mean function is S3 and sometimes it is base!
What does ftype tell us?
This function figures out whether the input function is a regular/primitive/internal function, a internal/S3/S4 generic, or a S3/S4/RC method. This is function is slightly simplified as it’s possible for a method from one class to be a generic for another class, but that seems like such a bad idea that hopefully no one has done it.
What does otype give us?
Figure out which object system an object belongs to:
• base: no class attribute
• S3: class attribute, but not S4
• S4: isS4, but not RC
• RC: inherits from "refClass"
For reference:
pryr package documentation
R language objects
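The four otype rules listed above can be approximated in base R. This is only a sketch of the stated rules, not pryr's actual source; object_system, Point and Counter are hypothetical names:

```r
library(methods)

# Classify an object by the rules quoted above.
object_system <- function(x) {
  if (!is.object(x)) {
    "base"                      # no class attribute
  } else if (!isS4(x)) {
    "S3"                        # class attribute, but not S4
  } else if (!is(x, "refClass")) {
    "S4"                        # S4, but not a reference class
  } else {
    "RC"                        # inherits from "refClass"
  }
}

Point <- setClass("Point", representation(x = "numeric"))
Counter <- setRefClass("Counter", fields = list(n = "numeric"))

print(object_system(mean))              # "base"
print(object_system(data.frame()))      # "S3"
print(object_system(Point(x = 1)))      # "S4"
print(object_system(Counter$new(n = 0))) # "RC"
```

This makes the apparent contradiction easier to read: the object mean itself has no class attribute (hence "base"), while ftype describes what the function does, namely S3 dispatch.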

R equivalent to the Python function "dir"?

Is there a function in R that can tell me the attributes of a given object (or class)?
Consider the "dir" function in python when passed the file class:
>>> dir(file)
['__class__', '__delattr__', '__doc__', '__enter__', '__exit__',
'__format__', '__getattribute__', '__hash__', '__init__', '__iter__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
'__sizeof__', '__str__', '__subclasshook__', 'close', 'closed',
'encoding', 'errors', 'fileno', 'flush', 'isatty', 'mode', 'name',
'newlines', 'next', 'read', 'readinto', 'readline', 'readlines',
'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines',
'xreadlines']
Maybe there is an equivalent of type as well(?)
>>> type(1)
<type 'int'>
R makes several different object oriented systems available to you, so if you don't know what species of object you're dealing with, you'll first need to determine whether it is one of S3, S4, or RC. Use isS4(x) and is(x, 'refClass') for this. If it's not S4 and not RC, it's S3. See Hadley's Advanced R chapter on object oriented programming for more information.
For S3 and S4 objects there are several functions you need to call to get information equivalent to Python's dir. Most of these functions take the name of the object's class as an argument, which you can determine with the class function.
For methods, use methods(class=class(x)) for S3 objects and showMethods(class=class(x)) for S4 objects. To reveal "attribute" names/values, use attributes(x) for S3 objects and getSlots(class(x)) for S4 objects. Note, getSlots will only show the slot names and types, not their values. To access the values, you'll have to use slot, but these values should also print when you simply print the object to the console.
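Put together, a rough dir-like inspection might look like this. It is a sketch only; the Track class and the variable names are made up for the example:

```r
library(methods)

# S3 object: attributes live on the object, methods are found by class name.
df <- data.frame(a = 1:2, b = c("x", "y"))
s3_attr_names <- names(attributes(df))
s3_methods <- methods(class = class(df))
print(s3_attr_names)
print(head(s3_methods))

# S4 object: slots are declared on the class, values are read with slot().
setClass("Track", representation(x = "numeric", y = "numeric"))
tr <- new("Track", x = c(1, 2), y = c(3, 4))
s4_slots <- getSlots(class(tr))   # slot names and their declared types
x_value <- slot(tr, "x")          # the value held in one slot
print(s4_slots)
print(x_value)

# Prints the S4 methods whose signature mentions the class, if any.
showMethods(classes = "Track")
```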

Why is there a difference between length(f) and length(g) in this example?

f <- function() 1
g <- function() 2
class(g) <- "function"
class(f) ## "function"
class(g) ## "function"
length.function <- function(x) "function"
length(f) ## 1
length(g) ## "function"
First, length is not a typical generic function, but rather an "Internal Generic Function". You can see this by looking at its definition:
> length
function (x) .Primitive("length")
Compare this to a typical generic function:
> print
function (x, ...)
UseMethod("print")
<bytecode: 0x116ca6f90>
<environment: namespace:base>
length calls straight into .Primitive which then can do dispatch if it does not handle the call itself; the typical approach is directly calling UseMethod which only handles dispatch. Also note that there is no length.default function because the code in the .Primitive call does that:
> methods("length")
[1] length.function length.pdf_doc* length.POSIXlt
I am not sure it is completely defined when an Internal Generic will look at user defined methods and when it will use only internal ones; I think the general idea is that for a user/package defined (effectively, non-core) class, provided methods will be used. But overriding for internal classes may or may not work.
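Whether a function is an internal generic can be checked without reading its source. A small sketch, relying on base R's .S3PrimitiveGenerics vector (described on the ?InternalMethods help page):

```r
# length is a primitive; print is an ordinary closure that calls UseMethod.
length_is_primitive <- is.primitive(length)
print_is_primitive <- is.primitive(print)

# Primitives that do internal S3 dispatch are listed in this character vector.
length_is_internal_generic <- "length" %in% .S3PrimitiveGenerics

# The body of a typical generic is just the UseMethod call.
print_body <- paste(deparse(body(print)), collapse = " ")

print(length_is_primitive)
print(print_is_primitive)
print(length_is_internal_generic)
print(print_body)
```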
Additionally (though not strictly relevant for this case), even for a typical generic method, the documentation is ambiguous as to what should happen when the class is derived implicitly rather than given as an attribute. First, what class() reports is an amalgamation of things. From the class help page:
Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. If the object does not have a class attribute, it has an implicit class, "matrix", "array" or the result of mode(x) (except that integer vectors have implicit class "integer").
So despite class returning the same thing for f and g, they are not the same.
> attributes(f)
$srcref
function() 1
> attributes(g)
$srcref
function() 2
$class
[1] "function"
Now, here is where it gets ambiguous. Method dispatch is talked about in (at least) 2 places: the class help page and the UseMethod help page. UseMethod says:
When a function calling UseMethod("fun") is applied to an object with class attribute c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used, if it exists, or an error results.
While class says:
When a generic function fun is applied to an object with class attribute c("first", "second"), the system searches for a function called fun.first and, if it finds it, applies it to the object. If no such function is found, a function called fun.second is tried. If no class name produces a suitable function, the function fun.default is used (if it exists). If there is no class attribute, the implicit class is tried, then the default method.
The real difference is in the last sentence that the class page has that UseMethod doesn't. UseMethod does not say what happens if there is no class attribute; class says that the implicit class is used to dispatch. Your code seems to indicate that what is documented in class is not correct, as length.function would have been called for f were it.
What really happens in method dispatch when there is no class attribute will probably require examining the source code as the documentation does not seem to help.
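One piece of documentation that does seem relevant is ?InternalMethods, which says internal dispatch only occurs on objects for which is.object returns true, i.e. those with a class attribute. The sketch below contrasts that with an ordinary UseMethod generic; describe is a hypothetical generic invented for the example:

```r
f <- function() 1
g <- function() 2
class(g) <- "function"

# Only g carries a class attribute, so only g counts as an "object".
f_is_object <- is.object(f)
g_is_object <- is.object(g)
print(f_is_object)  # FALSE
print(g_is_object)  # TRUE

# An ordinary generic dispatches on the implicit class as well.
describe <- function(x) UseMethod("describe")
describe.function <- function(x) "function method"
describe.default <- function(x) "default method"

describe_f <- describe(f)
describe_g <- describe(g)
print(describe_f)  # "function method"
print(describe_g)  # "function method"
```

Under that reading, length(f) never looks for length.function because f is not an object, while a UseMethod-based generic reaches the method for both f and g.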

Why is R capricious in its use of attributes on reference class objects?

I am having some trouble achieving consistent behavior accessing attributes attached to reference class objects. For example,
testClass <- setRefClass('testClass',
methods = list(print_attribute = function(name) print(attr(.self, name))))
testInstance <- testClass$new()
attr(testInstance, 'testAttribute') <- 1
testInstance$print_attribute('testAttribute')
And the R console cheerily prints NULL. However, if we try another approach,
testClass <- setRefClass('testClass',
methods = list(initialize = function() attr(.self, 'testAttribute') <<- 1,
print_attribute = function(name) print(attr(.self, name))))
testInstance <- testClass$new()
testInstance$print_attribute('testAttribute')
and now we have 1 as expected. Note that the <<- operator is required, presumably because assigning to .self has the same restrictions as assigning to reference class fields. Note that if we had tried to assign outside of the constructor, say
testClass <- setRefClass('testClass',
methods = list(set_attribute = function(name, value) attr(.self, name) <<- value,
print_attribute = function(name) print(attr(.self, name))))
testInstance <- testClass$new()
testInstance$set_attribute('testAttribute', 1)
we would be slapped with
Error in attr(.self, name) <<- value :
cannot change value of locked binding for '.self'
Indeed, the documentation ?setRefClass explains that
The entire object can be referred to in a method by the reserved name .self ... These fields are read-only (it makes no sense to
modify these references), with one exception. In principal, the
.self field can be modified in the $initialize method, because
the object is still being created at this stage.
I am happy with all of this, and agree with the authors' decisions. However, what I am concerned about is the following. Going back to the first example above, if we try asking for attr(testInstance, 'testAttribute'), we see from the global environment that it is 1!
Presumably, the .self that is used in the methods of the reference class object is stored in the same memory location as testInstance--it is the same object. Thus, by setting an attribute on testInstance successfully in the global environment, but not as a .self reference (as demonstrated in the first example), have we inadvertently triggered a copy of the entire object in the global environment? Or is the way attributes are stored "funny" in some way that the object can reside in the same memory, but its attributes are different depending on the calling environment?
I see no other explanation for why attr(.self, 'testAttribute') is NULL but attr(testInstance, 'testAttribute') is 1. The binding .self is locked once and for all, but that does not mean the object it references cannot change. If this is the desired behavior, it seems like a gotcha.
A final question is whether or not the preceding results imply attr<- should be avoided on reference class objects, at least if the resulting attributes are used from within the object's methods.
I think I may have figured it out. I began by digging into the implementation of reference classes for references to .self.
bodies <- Filter(function(x) !is.na(x),
structure(sapply(ls(getNamespace('methods'), all.names = TRUE), function(x) {
fn <- get(x, envir = getNamespace('methods'))
if (is.function(fn)) paste(deparse(body(fn)), collapse = "\n") else NA
}), .Names = ls(getNamespace('methods'), all.names = TRUE))
)
Now bodies holds a named character vector of all the functions in the methods package. We now look for .self:
goods <- bodies[grepl("\\.self", bodies)]
length(goods) # 4
names(goods) # [1] ".checkFieldsInMethod" ".initForEnvRefClass" ".makeDefaultBinding" ".shallowCopy"
So there are four functions in the methods package that contain the string .self. Inspecting them shows that .initForEnvRefClass is our culprit. We have the statement selfEnv$.self <- .Object. But what is selfEnv? Well, earlier in that same function, we have .Object@.xData <- selfEnv. Indeed, looking at the attributes on our testInstance from example one gives
$.xData
<environment: 0x10ae21470>
$class
[1] "testClass"
attr(,"package")
[1] ".GlobalEnv"
Peeking into attributes(attr(testInstance, '.xData')$.self) shows that we indeed can access .self directly using this approach. Notice that after executing the first two lines of example one (i.e. setting up testInstance), we have
identical(attributes(testInstance)$.xData$.self, testInstance)
# [1] TRUE
Yes! They are equal. Now, if we perform
attr(testInstance, 'testAttribute') <- 1
identical(attributes(testInstance)$.xData$.self, testInstance)
# [1] FALSE
so that adding an attribute to a reference class object has forced a creation of a copy, and .self is no longer identical to the object. However, if we check that
identical(attr(testInstance, '.xData'), attr(attr(testInstance, '.xData')$.self, '.xData'))
# [1] TRUE
we see that the environment attached to the reference class object remains the same. Thus, the copying was not very consequential in terms of memory footprint.
The end result of this foray is that the final answer is yes, you should avoid setting attributes on reference classes if you plan to use them within that object's methods. The reason for this is that the .self object in a reference class object's environment should be considered fixed once and for all after the object has been initialized--and this includes the creation of additional attributes.
Since the .self object is stored in an environment that is attached as an attribute to the reference class object, it does not seem possible to avoid this problem without using pointer yoga--and R does not have pointers.
Edit
It appears that if you are crazy, you can do
unlockBinding('.self', attr(testInstance, '.xData'))
attr(attr(testInstance, '.xData')$.self, 'testAttribute') <- 1
lockBinding('.self', attr(testInstance, '.xData'))
and the problems above magically go away.
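A less drastic alternative, not part of the original investigation, is to keep such metadata in a field rather than an attribute, since fields live in the shared environment and are therefore the same inside and outside methods. A minimal sketch, with hypothetical names TestClass, meta, set_meta and get_meta:

```r
library(methods)

TestClass <- setRefClass("TestClass",
  fields = list(meta = "list"),   # holds arbitrary name/value pairs
  methods = list(
    set_meta = function(name, value) {
      meta[[name]] <<- value      # <<- modifies the field in place
      invisible(.self)
    },
    get_meta = function(name) meta[[name]]
  )
)

inst <- TestClass$new()
inst$set_meta("testAttribute", 1)

from_method <- inst$get_meta("testAttribute")   # read inside a method
from_outside <- inst$meta[["testAttribute"]]    # read from the caller
print(from_method)
print(from_outside)
```

Both reads return the same value, and no binding has to be unlocked.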

In R, different behavior between `is.list(x)` and `is(x,'list')`

What is the explanation for the following behavior?
is.list(data.frame()) ## TRUE
is(data.frame(),'list') ## FALSE
is(data.frame()) ## "data.frame" "list" "oldClass" "vector"
extends('data.frame','list') ## TRUE
inherits(data.frame(),'list') ## FALSE
You are mixing S3 and S4 class conventions. is and extends are for S4 classes but these work with S3 ones because of the way these have been implemented. inherits was written for S3 classes and it is not intended to work with S4 objects with full compatibility.
inherits effectively compares the result of class(x) with the class you specify in the second argument. Hence
> class(data.frame())
[1] "data.frame"
doesn't contain "list" anywhere so fails.
Note also this from ?inherits:
The analogue of ‘inherits’ for formal classes is ‘is’. The two
functions behave consistently with one exception: S4 classes can
have conditional inheritance, with an explicit test. In this
case, ‘is’ will test the condition, but ‘inherits’ ignores all
conditional superclasses.
Another confusion is with the class of an object and the implementation of that object. Yes a data frame is a list as is.list() tells us, but in R's S3 class world, data.frame() is of class "data.frame" not "list".
As for is(data.frame(),'list'), well it isn't of that specific class "list" hence the FALSE. What is(data.frame()) does is documented in ?is
Summary of Functions:
‘is’: With two arguments, tests whether ‘object’ can be treated as
from ‘class2’.
With one argument, returns all the super-classes of this
object's class
Hence is(data.frame()) is showing the classes that the "data.frame" class extends (in the S4 sense, not the S3 sense). This further explains the extends('data.frame','list') behaviour as in the S4 world, the "data.frame" class does extend the "list" class.
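The split between how an object is stored and what its S3 class vector says can be seen side by side. A small sketch; the "child"/"parent" classes are invented for the example:

```r
df <- data.frame()

storage_type <- typeof(df)            # how the object is stored
stored_as_list <- is.list(df)         # tests the storage, not the class
s3_class <- class(df)                 # the S3 class vector
s3_inherits <- inherits(df, "list")   # looks only at the class vector

print(storage_type)    # "list"
print(stored_as_list)  # TRUE
print(s3_class)        # "data.frame"
print(s3_inherits)     # FALSE

# inherits() is a lookup in class(x); which = TRUE reports the positions.
x <- structure(list(), class = c("child", "parent"))
is_parent <- inherits(x, "parent")
positions <- inherits(x, c("other", "parent"), which = TRUE)
print(is_parent)   # TRUE
print(positions)   # 0 2
```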
