How to tell R how to plot objects of a certain class? - r

I'm currently dealing with some objects that are a list of attributes that represents a statistical model. For example, let's say I have a matrix, a numeric vector and an integer.
myobj = list(amatrix = matrix(1:9,3,3),avector = c(1:3),aninteger = 1)
class(myobj) = 'myclass'
Suppose that, for some reason, I can create a plot that represents an object of this class. How can I make plot(myobj) recognizes that the object has the class 'myclass', and print it in the desired way, for example image(myobj$amatrix)?
I think the question is essentially how to 'modify' R's plot function so it knows how to handle a newly defined object class? Can I use functions of other packages like ggplot when executing this modification?
In a more general sense, how does functions that handle different classes of objects know how to act for each class?
I have little to none experience with classes in R, so even some simple guides about classes should be helpful.

As mentionned by #emilliman you can define your own method:
myobj = list(amatrix = matrix(1:9,3,3),avector = c(1:3),aninteger = 1)
class(myobj) <- 'myclass'
plot.myclass <- function(x) image(x$amatrix)
methods(plot) # check the 4th element of 3rd line :) (list will differ depending on what packages are loaded)
# [1] plot.acf* plot.data.frame* plot.decomposed.ts* plot.default plot.dendrogram* plot.density* plot.ecdf
# [8] plot.factor* plot.formula* plot.function plot.hclust* plot.histogram* plot.HoltWinters* plot.isoreg*
# [15] plot.lm* plot.medpolish* plot.mlm* plot.myclass plot.ppr* plot.prcomp* plot.princomp*
# [22] plot.profile.nls* plot.R6* plot.raster* plot.spec* plot.stepfun plot.stl* plot.table*
# [29] plot.ts plot.tskernel* plot.TukeyHSD*
#and plot :
plot(myobj)

Related

R: Named lists and description lists

R has two classes not so commonly used: "Dlist" and "namedList".
As regard the first, it is mentioned with respect to Sys.getenv(), which returns a result of class "Dlist" if its argument is missing, for nice printing. There is in fact a print.Dlist method for the class. There is also an apparently related formatDL function to format description lists. However I do not know how to create an object of class "Dlist".
As regards "namedList", it is defined by the manual:
the alternative to "list" that preserves the names attribute
In this case I am unable to create an object of this type and I have not found any instance where it is used.
How do you create a "Dlist" or a "namedList"?
Can you provide an example where the "namedList" is more convenient of an ordinary named list? (intended as a list L, where names(L) is not NULL)
A Dlist is an informal class defined in R's Base package that exists for the sole purpose of pretty-printing a named character vector. It's not a list at all, but a type of character vector:
a <- Sys.getenv()
class(a)
# [1] "Dlist"
typeof(a)
# [1] "character"
You can make one just by writing Dlist to a named character vector's class attribute:
hello_world <- c(Hello = "World", Hi = "Earth", Greetings = "planet");
class(hello_world) <- "Dlist"
hello_world
# Hello World
# Hi Earth
# Greetings planet
You can select formatting options with formatDL:
cat(formatDL(hello_world, style = "list", width = 20), sep = "\n")
# Hello: World
# Hi: Earth
# Greetings: planet
The Dlist is only used in base R for printing environmental variables to the console. Unless you wanted to have a named character vector printed this way, you don't need a Dlist.
On the other hand, a namedList is a formal S4 object as defined in the methods package (also preloaded in R). It inherits its attributes from list and defines a single method - its own version of the generic S4 show.
You might use it as a base class from which to make new S4 classes that inherit the properties of a named list (i.e. a normal list with a names attribute), although it's not clear why a user advanced enough to create S4 classes wouldn't just do that themselves. It's defined here.
You can create a namedList with new:
n <- new("namedList", list(a=1, b=2))
n
# An object of class “namedList”
# $`a`
# [1] 1
#
# $b
# [1] 2
isS4(n)
# [1] TRUE

What's the real meaning about 'Everything that exists is an object' in R?

I saw:
“To understand computations in R, two slogans are helpful:
• Everything that exists is an object.
• Everything that happens is a function call."
— John Chambers
But I just found:
a <- 2
is.object(a)
# FALSE
Actually, if a variable is a pure base type, it's result is.object() would be FALSE. So it should not be an object.
So what's the real meaning about 'Everything that exists is an object' in R?
The function is.object seems only to look if the object has a "class" attribute. So it has not the same meaning as in the slogan.
For instance:
x <- 1
attributes(x) # it does not have a class attribute
NULL
is.object(x)
[1] FALSE
class(x) <- "my_class"
attributes(x) # now it has a class attribute
$class
[1] "my_class"
is.object(x)
[1] TRUE
Now, trying to answer your real question, about the slogan, this is how I would put it. Everything that exists in R is an object in the sense that it is a kind of data structure that can be manipulated. I think this is better understood with functions and expressions, which are not usually thought as data.
Taking a quote from Chambers (2008):
The central computation in R is a function call, defined by the
function object itself and the objects that are supplied as the
arguments. In the functional programming model, the result is defined
by another object, the value of the call. Hence the traditional motto
of the S language: everything is an object—the arguments, the value,
and in fact the function and the call itself: All of these are defined
as objects. Think of objects as collections of data of all kinds. The data contained and the way the data is organized depend on the class from which the object was generated.
Take this expression for example mean(rnorm(100), trim = 0.9). Until it is is evaluated, it is an object very much like any other. So you can change its elements just like you would do it with a list. For instance:
call <- substitute(mean(rnorm(100), trim = 0.9))
call[[2]] <- substitute(rt(100,2 ))
call
mean(rt(100, 2), trim = 0.9)
Or take a function, like rnorm:
rnorm
function (n, mean = 0, sd = 1)
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
You can change its default arguments just like a simple object, like a list, too:
formals(rnorm)[2] <- 100
rnorm
function (n, mean = 100, sd = 1)
.Call(C_rnorm, n, mean, sd)
<environment: namespace:stats>
Taking one more time from Chambers (2008):
The key concept is that expressions for evaluation are themselves
objects; in the traditional motto of the S language, everything is an
object. Evaluation consists of taking the object representing an
expression and returning the object that is the value of that
expression.
So going back to our call example, the call is an object which represents another object. When evaluated, it becomes that other object, which in this case is the numeric vector with one number: -0.008138572.
set.seed(1)
eval(call)
[1] -0.008138572
And that would take us to the second slogan, which you did not mention, but usually comes together with the first one: "Everything that happens is a function call".
Taking again from Chambers (2008), he actually qualifies this statement a little bit:
Nearly everything that happens in R results from a function call.
Therefore, basic programming centers on creating and refining
functions.
So what that means is that almost every transformation of data that happens in R is a function call. Even a simple thing, like a parenthesis, is a function in R.
So taking the parenthesis like an example, you can actually redefine it to do things like this:
`(` <- function(x) x + 1
(1)
[1] 2
Which is not a good idea but illustrates the point. So I guess this is how I would sum it up: Everything that exists in R is an object because they are data which can be manipulated. And (almost) everything that happens is a function call, which is an evaluation of this object which gives you another object.
I love that quote.
In another (as of now unpublished) write-up, the author continues with
R has a uniform internal structure for representing all objects. The evaluation process keys off that structure, in a simple form that is essentially
composed of function calls, with objects as arguments and an object as the
value. Understanding the central role of objects and functions in R makes
use of the software more effective for any challenging application, even those where extending R is not the goal.
but then spends several hundred pages expanding on it. It will be a great read once finished.
Objects For x to be an object means that it has a class thus class(x) returns a class for every object. Even functions have a class as do environments and other objects one might not expect:
class(sin)
## [1] "function"
class(.GlobalEnv)
## [1] "environment"
I would not pay too much attention to is.object. is.object(x) has a slightly different meaning than what we are using here -- it returns TRUE if x has a class name internally stored along with its value. If the class is stored then class(x) returns the stored value and if not then class(x) will compute it from the type. From a conceptual perspective it matters not how the class is stored internally (stored or computed) -- what matters is that in both cases x is still an object and still has a class.
Functions That all computation occurs through functions refers to the fact that even things that you might not expect to be functions are actually functions. For example when we write:
{ 1; 2 }
## [1] 2
if (pi > 0) 2 else 3
## [1] 2
1+2
## [1] 3
we are actually making invocations of the {, if and + functions:
`{`(1, 2)
## [1] 2
`if`(pi > 0, 2, 3)
## [1] 2
`+`(1, 2)
## [1] 3

How to create lists of ggplot

I have been trying to create functions that return a list of ggplot and am having various problems. Fundamentally however, I do not understand why this is TRUE
"data.frame" == class(c(qplot(1:10,rnorm(10)))[[1]])
when this is [TRUE,TRUE]
c('gg','ggplot') == class(qplot(1:10,rnorm(10)))
I havent seen any questions similar to this. I see various questions that are solved by things like
lapply(someList, function(x) {
#make ggplot, then use print(...) or whatever
})
So I am guessing there is something about passing ggplot objects out of functions or between environments or something. Thanks for any clues about the ggplot or R that I am missing.
library(ggplot2)
p = c(qplot(1:10,rnorm(10)))
p2 = qplot(1:10,rnorm(10))
p[[1]] (same as p[['data']])) is supposed to be a dataframe. It usually holds the data for the plot.
p is a list because you used the c function.
p2 is a ggplot because that's what qplot returns.
Take a look at the attributes of each object.
attributes(p)
# $names
# [1] "data" "layers" "scales" "mapping" "theme" "coordinates"
# [7] "facet" "plot_env" "labels"
attributes(p2)
# $names
# [1] "data" "layers" "scales" "mapping" "theme" "coordinates"
# [7] "facet" "plot_env" "labels"
#
# $class
# [1] "gg" "ggplot"
To store many ggplot objects, use list.
ggplot.objects = list(p2,p2,p2)
The c help file
shows that ggplot is not a possible output.
It also states that
c is sometimes used for its side effect of removing attributes
except names, for example to turn an array into a vector
If you wanted
c to return ggplot objects, then you could try defining your own c.ggplot
function. You'll have to read a good deal about S3 and S4 functions to understand what's
going on.
If you have several ggplot objects in the global environment, you can make a list of them with :
ggplot_list <- Filter(function(x) is(x, "ggplot"), mget(ls()))

How can I read the source code for an R function?

I have a data frame and I want to learn how the summary generates it's information. Specifically, how does summary generate a count for the number of elements in each level of a factor. I can use summary, but I want to learn how to work with factors better. When I try ?summary, I just get the general info. Is this impossible because it is in bytecode?
What we see when you type summary is
> summary
function (object, ...)
UseMethod("summary")
<bytecode: 0x0456f73c>
<environment: namespace:base>
This is telling us that summary is a generic function and has many methods attached to it. To see what those methods are actually called we can try
> methods(summary)
[1] summary.aov summary.aovlist summary.aspell*
[4] summary.connection summary.data.frame summary.Date
[7] summary.default summary.ecdf* summary.factor
[10] summary.glm summary.infl summary.lm
[13] summary.loess* summary.manova summary.matrix
[16] summary.mlm summary.nls* summary.packageStatus*
[19] summary.PDF_Dictionary* summary.PDF_Stream* summary.POSIXct
[22] summary.POSIXlt summary.ppr* summary.prcomp*
[25] summary.princomp* summary.srcfile summary.srcref
[28] summary.stepfun summary.stl* summary.table
[31] summary.tukeysmooth*
Non-visible functions are asterisked
Here we see all the methods associated with the summary function. What this means is that there is different code for when you call summary on an lm object than there is when you call summary on a data.frame. This is good because we wouldn't expect the summary to be conducted the same way for those two objects.
To see the code that is run when you call summary on a data.frame you can just type
summary.data.frame
as shown in the methods list. You'll be able to examine it and study it and do whatever you want with the printed code. You mentioned that you were interested in factors so you will probably want to examine the output of summary.factor. Now you might notice that some of the methods printed had an asterisk (*) next to them which implies that they're non-visible. This essentially means that you can't just type the name of the function to try to view the code.
> summary.prcomp
Error: object 'summary.prcomp' not found
However, if you're determined to see what the code actually is you can use the getAnywhere function to view it.
> getAnywhere(summary.prcomp)
A single object matching ‘summary.prcomp’ was found
It was found in the following places
registered S3 method for summary from namespace stats
namespace:stats
with value
function (object, ...)
{
vars <- object$sdev^2
vars <- vars/sum(vars)
importance <- rbind(`Standard deviation` = object$sdev, `Proportion of Variance` = round(vars,
5), `Cumulative Proportion` = round(cumsum(vars), 5))
colnames(importance) <- colnames(object$rotation)
object$importance <- importance
class(object) <- "summary.prcomp"
object
}
<bytecode: 0x03e15d54>
<environment: namespace:stats>
Hopefully this helps you explore the code in R much more easily in the future.
For even more details you can view Volume 6/4 of The R Journal (warning, pdf) and read Uwe Ligge's "R Help Desk" section which deals with viewing the source code of R functions.

Sort a list of nontrivial elements in R

In R, I have a list of nontrivial objects (they aren't simple objects like scalars that R can be expected to be able to define an order for). I want to sort the list. Most languages allow the programmer to provide a function or similar that compares a pair of list elements that is passed to a sort function. How can I sort my list?
To make this is as simple I can, say your objects are lists with two elements, a name and a value. The value is a numeric; that's what we want to sort by. You can imagine having more elements and needing to do something more complex to sort.
The sort help page tells us that sort uses xtfrm; xtfrm in turn tells us it will use == and > methods for the class of x[i].
First I'll define an object that I want to sort:
xx <- lapply(c(3,5,7,2,4), function(i) list(name=LETTERS[i], value=i))
class(xx) <- "myobj"
Now, since xtfrm works on the x[i]'s, I need to define a [ function that returns the desired elements but still with the right class
`[.myobj` <- function(x, i) {
class(x) <- "list"
structure(x[i], class="myobj")
}
Now we need == and > functions for the myobj class; this potentially could be smarter by vectorizing these properly; but for the sort function, we know that we're only going to be passing in myobj's of length 1, so I'll just use the first element to define the relations.
`>.myobj` <- function(e1, e2) {
e1[[1]]$value > e2[[1]]$value
}
`==.myobj` <- function(e1, e2) {
e1[[1]]$value == e2[[1]]$value
}
Now sort just works.
sort(xx)
It might be considered more proper to write a full Ops function for your object; however, to just sort, this seems to be all you need. See p.89-90 in Venables/Ripley for more details about doing this using the S3 style. Also, if you can easily write an xtfrm function for your objects, that would be simpler and most likely faster.
The order function will allow you to determine the sort order for character or numeric aruments and break ties with subsequent arguments. You need to be more specific about what you want. Produce an example of a "non-trivial object" and specify the order you desire in some R object. Lists are probably the most non-vectorial objects:
> slist <- list(cc=list(rr=1), bb=list(ee=2, yy=7), zz="ww")
> slist[order(names(slist))] # alpha order on names()
$bb
$bb$ee
[1] 2
$bb$yy
[1] 7
$cc
$cc$rr
[1] 1
$zz
[1] "ww"
slist[c("zz", "bb", "cc")] # an arbitrary ordering
$zz
[1] "ww"
$bb
$bb$ee
[1] 2
$bb$yy
[1] 7
$cc
$cc$rr
[1] 1
One option is to create a xtfrm method for your objects. Functions like order take multiple columns which works in some cases. There are also some specialized functions for specific cases like mixedsort in the gtools package.

Resources