Take the function names: that's a primitive function in R. For primitive functions, an implicit S4 generic is created, so it is possible to construct S4 methods for that function.
Take an S4 class defined as follows:
setClass("aClass",
  representation=list(
    values = "character",
    id = "numeric"
  ),
  prototype=list(
    values = character(0),
    id = numeric(0)),
  validity=function(object){
    length(object@values)==length(object@id)
  }
)
Now I want to create a function to extract the names, either sorted or unsorted. I wanted to do this using the function names to avoid having to make a new function getNames() or whatever, as that's less intuitive.
The following gives an idea of what needs to be done:
setMethod("names",signature="aClass",
  function(x,ordered=TRUE){
    if(ordered)
      x@values[x@id]
    else
      x@values
  })
This won't work, as names is a primitive function and ordered is not an argument for the implicit generic.
How can I make this work under the following conditions:
the names function should keep its original behaviour for all other objects, including objects from other packages.
the code should be acceptable for use in a package
the code should be acceptable by the high standards set by eg Bioconductor.
The generic is available as
> getGeneric("names")
standardGeneric for "names" defined from package "base"
function (x)
standardGeneric("names", .Primitive("names"))
<environment: 0x459c9c0>
Methods may be defined for arguments: x
Use showMethods("names") for currently available ones.
so from the signature you can see that the short answer is that you can't add arguments. You'd definitely not want to create your own function names. A hack would use a package-global variable getOption("pkg_names_ordered") but I wouldn't partake of that solution myself.
In some ways the contract set out by names does not say anything about order (for instance, names and numerical indices are often used to subset; are the numerical indices for the ordered names, or the unordered names?), so you're really proposing a new generic anyway.
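A minimal sketch of that "new generic" route, assuming the class from the question. The generic name orderedNames is hypothetical and chosen only for illustration: names() keeps the one-argument signature of the implicit generic, and the ordering lives in a separate generic.

```r
library(methods)

setClass("aClass",
  slots = c(values = "character", id = "numeric"),
  prototype = list(values = character(0), id = numeric(0)),
  validity = function(object) {
    if (length(object@values) == length(object@id)) TRUE
    else "'values' and 'id' must have the same length"
  }
)

# The method matches the implicit generic exactly: function(x), no extras.
setMethod("names", "aClass", function(x) x@values)

# Hypothetical new generic that carries the ordering behaviour.
setGeneric("orderedNames", function(x, ...) standardGeneric("orderedNames"))
setMethod("orderedNames", "aClass", function(x, ...) x@values[x@id])

obj <- new("aClass", values = c("b", "c", "a"), id = c(3, 1, 2))
plain   <- names(obj)         # values in stored order
ordered <- orderedNames(obj)  # values indexed by id
```

names() on any other object is untouched, since only a method for aClass was added.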
I have the following function:
myFunction = function(objects, params) {
  for (i in seq_along(objects)) {
    object = objects[[i]]
    # process the single element, not the whole list
    object = myOtherFunction(object, params)
    objects[[i]] = object
  }
  return(objects)
}
#' @rdname myFunction
#' @aliases myFunction
setMethod("myFunction", signature(objects = "list"), myFunction)
How can I properly set the setMethod() and setGeneric() methods to accept a list of objects of a given type, let's say a list of objects of type SingleCellExperiment ?
If you want to write different methods to handle lists of class foo and lists of class bar then S4 will need some help, since both objects are of class list and hence the same method will be called in both cases.
There are a few options:
firstly, do you need to use lists at all? Don't forget all the base types in R are vectors, so for simple classes like
setClass("cuboid",slots=list(
height="numeric",
width="numeric",
depth="numeric"
)) -> cuboid
if you want to represent a set of multiple cuboids you don't need to use a list at all, just feed vectors of values to cuboid. This doesn't work as well for more exotic classes, though.
alternatively, you can write a list method with some extra logic to determine which lower-order method to dispatch. You should also think about what to do if the list contains objects of multiple different classes.
in some situations you might be able to use either lapply or a function that takes arbitrary numbers of arguments via .... In the latter case you may be able to make use of dotsMethods (check the help page on that topic for more info).
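The first option above (one object holding vectors of values instead of a list of objects) might look like the following sketch. The volume generic is a hypothetical name used only for illustration.

```r
library(methods)

cuboid <- setClass("cuboid", slots = c(
  height = "numeric",
  width  = "numeric",
  depth  = "numeric"
))

# Hypothetical generic; the method is vectorised over the slots.
setGeneric("volume", function(x, ...) standardGeneric("volume"))
setMethod("volume", "cuboid", function(x, ...) x@height * x@width * x@depth)

# One object representing two cuboids.
boxes <- cuboid(height = c(1, 2), width = c(1, 1), depth = c(3, 4))
volumes <- volume(boxes)
```

Here a single cuboid object stands for two boxes, so ordinary dispatch on class cuboid is enough and no list method is needed.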
If you want to write a method that will only be called on lists of objects of class foo and there may exist another method that wants to operate on lists, then you can either:
write a method for class foo directly, and then use sapply or lapply rather than calling your function on the list
write a method for class list that checks whether the list has foos in it and, if it doesn't, calls callNextMethod().
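A rough sketch of the second choice, assuming a toy class foo and a hypothetical generic describe (neither comes from a real package). The list method handles lists made up entirely of foo objects and otherwise defers to the next method.

```r
library(methods)

setClass("foo", slots = c(size = "numeric"))

setGeneric("describe", function(x, ...) standardGeneric("describe"))

# Fallback that any other list (or object) ends up in.
setMethod("describe", "ANY", function(x, ...) "something else")

setMethod("describe", "foo", function(x, ...) paste("foo of size", x@size))

setMethod("describe", "list", function(x, ...) {
  all_foo <- length(x) > 0 &&
    all(vapply(x, function(el) is(el, "foo"), logical(1)))
  if (all_foo) {
    vapply(x, describe, character(1))
  } else {
    callNextMethod()
  }
})

foos  <- describe(list(new("foo", size = 1), new("foo", size = 2)))
other <- describe(list(1, "a"))
```

A mixed list (some foo, some not) falls through to the next method here; whether that is the right behaviour depends on the application.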
You can find all the objects in a package with
objs <- mget(ls("package:base"), inherits = TRUE)
You can select the functions from these with
funs <- Filter(is.function, objs)
You can get a complete list of the dependencies of the listed functions in a package by applying codetools::findGlobals(), miniCRAN::makeDepGraph, pkgnet::CreatePackageReport (or others) to the function list. All of these functions either graph the resulting dependencies or return an object easily plottable with, e.g., igraph or DependenciesGraph.
Is there a comparable set of commands to find all the classes created by a package and the inheritance structure of those classes? I know that for most packages the resulting web of class inheritance would be relatively simple, but I think that in a few cases, such as ggplot2 and the survey package, the resulting web of class inheritance could be quite helpful.
I have found a package, classGraph, that creates directed acyclic graphs for S4 class structures, but I am more interested in the much more common S3 structures.
This seems brute-force and sloppy, but I suppose if I had a list of all the class attributes used by objects in the base packages, and all the class attributes of objects in a package, then any of the latter which is not among the former would be new classes created by the package or inherited from another non-base package.
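The brute-force idea could be sketched roughly as below. The helper classes_in is hypothetical, and the sketch assumes the datasets package is attached (as it is in a default session). It only sees classes carried by objects the package exports, so classes that exist only on values returned by functions are missed.

```r
# Hypothetical helper: class attributes of all objects exported by an
# attached package.
classes_in <- function(pkg) {
  env  <- as.environment(paste0("package:", pkg))
  objs <- mget(ls(env), envir = env)
  unique(unlist(lapply(objs, class)))
}

base_classes <- classes_in("base")
# Classes seen on objects in 'datasets' but not on objects in 'base'.
new_classes  <- setdiff(classes_in("datasets"), base_classes)
```

Note that a class such as data.frame shows up as "new" for datasets even though its methods live in base, which illustrates how sloppy the approach is.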
This is slightly tricky since I am not aware of any formal definition of an S3 class. For R objects the S3 classes are governed by a very simple character vector of class names stored in the class attribute. Method dispatch is then done by matching element(s) of that attribute with a function name.
You could essentially do:
x <- 1:5
class(x) <- "MyMadeUpClass"
x
# [1] 1 2 3 4 5
# attr(,"class")
# [1] "MyMadeUpClass"
Does the above really define a class in the intuitive formal understanding of the term?
You can create a print method for objects of this class like (silly example incoming):
print.MyMadeUpClass <- function(x, ...) {
print(sprintf("Pretty vector: %s", paste(x, collapse = ",")))
}
x
# [1] "Pretty vector: 1,2,3,4,5"
The important distinction here is that methods in S3
"belong to" (generic) functions, not classes
are chosen based on classes of the arguments provided to the function call
The point I am trying to make is that S3 does not really have formally defined inheritance (which I assume is what you are looking for), in contrast to S4, which implements this via the contains concept, so I am not really sure what you would like to see as a result.
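To illustrate how informal this is, here is a small sketch with made-up class names ("child", "parent") and a made-up generic whoami. The "inheritance" is nothing more than the order of the class vector, and it can be changed at any time.

```r
whoami <- function(x, ...) UseMethod("whoami")
whoami.default <- function(x, ...) "default"
whoami.parent  <- function(x, ...) "parent"
whoami.child   <- function(x, ...) paste("child, then", NextMethod())

x <- structure(1:3, class = c("child", "parent"))
first <- whoami(x)              # dispatches on "child", NextMethod() goes to "parent"
is_parent <- inherits(x, "parent")

class(x) <- "parent"            # "inheritance" rewritten on the fly
second <- whoami(x)
```

Nothing declares that child extends parent; any object can claim any class vector.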
Very good read on the topic: Object-Oriented Programming, Functional Programming and R by John M. Chambers: https://arxiv.org/pdf/1409.3531.pdf
Edit (after question edit) - the sloop package:
From an S3 perspective I think it makes a lot of sense to examine the structure of generics and methods. I found the sloop package to be a very useful tool for this: https://github.com/r-lib/sloop.
There are a number of tests which, applied to an object of a given class, produce information about that object. Consider objects of class "function". The functions is.primitive() or is.closure(), or (from rlang) is_primitive_eager() or is_primitive_lazy(), provide information about a function object. However, using methods(class = "function") (with rlang loaded) does not return any of these functions:
[1] as.data.frame as.list coerce coerce<- fortify head latex plot print tail .
Using extends(class1 = "function", maybe = TRUE, fullInfo = TRUE) shows two superclasses, "OptionalFunction" and "PossibleMethod".
Using completeClassDefinition(Class = "function", doExtends=TRUE) provides 23 subclasses. However, it appears to me (though I am not sure of this) that all or almost all of the super- and sub-classes from these two functions are specifically of S4 classes, which I generally do not use. One of these subclasses is "genericFunction", so I tried to apply it to a base R function which I knew to be generic. Although is(object=plot, class2 = "genericFunction") returns TRUE, and plot() antedates S4 classes, there is no "is.generic" test in base R, but there is an "isGeneric" test in the methods package, which suggests to me that plot() has been rewritten as an S4 object.
At any rate, there are a lot of obvious potential properties of functions, like whether they are generic, for which there are no is.<whatever> tests that I can find, and I would like to know if there are other ways I can search for them, e.g., in packages.
A more generic way of asking this same question is whether there is any way of identifying functions that will accept objects of a specified class and not return an error or nonsense. If so, I could take a list of the functions in the recommended packages or in some specified package and test whether each returns a sensible response when handed a function. This is not exactly an answer (such a method would return TRUE for quote(), for example), but it would at least cut the problem down to size.
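One way such a search could be sketched in base R is shown below. The helper uses_UseMethod is hypothetical and only a heuristic: it flags closures whose body mentions UseMethod, which catches ordinary S3 generics but not internally dispatched primitives.

```r
library(methods)

# Heuristic, hypothetical helper: does the body of a closure call UseMethod?
uses_UseMethod <- function(f) {
  is.function(f) && !is.primitive(f) && "UseMethod" %in% all.names(body(f))
}

base_funs   <- Filter(is.function, mget(ls("package:base"), inherits = TRUE))
s3_generics <- names(Filter(uses_UseMethod, base_funs))

sum_is_primitive <- is.primitive(sum)   # primitives have no body to inspect
show_is_s4       <- isGeneric("show")   # S4 generics are tested by name
```

The same Filter() pattern works for any other predicate on functions, so it can be pointed at a different package or a different property.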
Once and for all I want to get the R terminology right. However, none of the books I was reading was of big help, and it seems to me the authors choose the names sometimes arbitrarily. So, my question is when exactly are the names "attribute", "parameter", and "argument" used?
From what I read and understood so far, a parameter is what a function can take as input. For example if I have a function that calculates the sum of two values, sum(value1, value2), 'value1' and 'value2' are the function's parameters.
If we are calling a function, we call the values passed to the function arguments. For the sum-function example, "23" and "48" would be the function arguments for:
sum(23,48).
So basically we call it parameter when we define a function, and we call it argument when we call the function (so the arguments are passed to the function's parameters)
But what about "attributes"? From what I understand, attributes are the equivalent of parameters in methods (and methods are functions of a class object)?
For example, if I would have something like:
heatmap(myData, Colv=NA, Rowv=NA)
... , would 'myData' be an argument or attribute? And what about Colv=NA and Rowv=NA? Isn't heatmap() a function and thus everything in the parentheses should be called arguments?
Suppose we have:
f <- function(x) x + 1
comment(f) <- "my function"
f(3)
Arguments. We distinguish between formal arguments and actual arguments. In the above x is the formal argument to f. The names of the formal arguments of f are given by:
> names(formals(f))
[1] "x"
The actual arguments to a function vary from one call to another and in the above example there is a single actual argument 3.
The function args can be used to display the entire function signature of a function including the formal arguments and the default arguments and if you are debugging a function you can enter match.call() to list the function signature with the actual arguments substituted.
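A short sketch of these tools on a toy function (the function f and its default value here are made up for illustration):

```r
f <- function(x, y = 2) {
  # Return the call with actual arguments matched to formal names.
  match.call()
}

formal_names <- names(formals(f))  # names of the formal arguments
default_y    <- formals(f)$y       # default value of y
args(f)                            # prints the signature with defaults
the_call     <- f(3)               # the matched call for this invocation
```

Here 3 is the actual argument, x and y are the formal arguments, and match.call() shows how the former was bound to the latter.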
Attributes. The attributes of an R object are given by attributes(f) like this:
> attributes(f)
$srcref
function(x) x + 1
$comment
[1] "my function"
There is one exception and that is that an object's class is also regarded as an attribute but is not given by the above but rather is given by class:
> class(f)
[1] "function"
Parameters. Sometimes function arguments are referred to as parameters, or sometimes one refers to those arguments which are fixed as parameters, but this tends to be related more to mathematics and statistics than R.
In statistical models the model is typically a function of the data and the model parameters often via the likelihood. For example, here:
> lm(demand ~ Time, BOD)
Call:
lm(formula = demand ~ Time, data = BOD)
Coefficients:
(Intercept) Time
8.521 1.721
the linear regression coefficients of Intercept and Time (viz. 8.521 and 1.721) are often referred to as model parameters.
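As a small sketch, the estimated model parameters can be pulled out of the fitted object with coef(), using the same built-in BOD data:

```r
fit <- lm(demand ~ Time, data = BOD)

# Named numeric vector of the estimated model parameters.
params <- coef(fit)
rounded <- round(params, 3)
```

The names of params are "(Intercept)" and "Time", matching the printed coefficient table.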
As Dwin has already pointed out the various values influencing graphics in R are also termed parameters and can be displayed via:
> par()
and the corresponding concepts in other R graphics systems are often also referred to as parameters.
I suppose colloquial use of the term "attribute" might refer to several features of data objects, but there is a very specific meaning in R. An attribute is a value returned by either of the functions attributes or attr. These are critical to the language in that classes and names are stored as attributes. There are two other assignment functions, attributes<- and attr<-, that allow additional attributes to be assigned in support of class-specific objectives.
?attributes
?attr
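A brief sketch of these functions in use (the "units" attribute is an arbitrary example name):

```r
x <- 1:3

# attr<- adds an arbitrary attribute; names<- stores names as an attribute too.
attr(x, "units") <- "cm"
names(x) <- c("a", "b", "c")

all_attrs <- attributes(x)        # list holding both attributes
units     <- attr(x, "units")

# A plain vector has an implicit class but no stored class attribute.
implicit_class <- class(1:3)
stored_class   <- attr(1:3, "class")
```

This also shows why class() and attributes() can disagree: the class reported for a plain vector is implicit and not stored as an attribute.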
There is a par function which sets graphical "parameters" that control the base graphics behavior. So that would be an R-specific use of parameter that might be slightly different from the use of "argument", which is generally applied to the formal arguments to functions.
?par
There is a function args which, applied to a function name or an anonymous function, will return its arguments (as a "closure" which gets printed on the console just as a user would type during a function definition) along with their default values. The function formals will return the same "argument" information in the form of a list.
?args
?formals
I realize I am implicitly arguing with Matthew whose R skills are excellent. Contrary to him, I think that attributes and arguments have more specific meanings in the context of R and that careful authors will make an effort to keep their meanings separate. I would not have a problem understanding someone who uses parameter as a synonym for argument if the context were clearly a discussion of applying a function, since that is the typical parlance in mathematics. I would agree with the conclusion of your last sentence. Those are 'arguments' and most emphatically not attributes. The attributes of an object returned by heatmap are:
> attributes(hv) #from first example in ?heatmap
#$names
# [1] "rowInd" "colInd" "Rowv" "Colv"
But only some of the arguments became attributes and then only after being assigned to the returned value during the function execution.
I am not sure how analogous R is to Python, but I think most of the terms should be consistent across different languages. From what I read and learned in the last couple of days, a parameter is basically what a function takes as its input when you define it:
my_function <- function (param1, param2){
...
}
and it is called argument if you are invoking a function with certain input values (that are passed to the function as parameters):
my_function(arg1, arg2)
Functions that are part of a class are called methods. And an attribute can be either a value or a method associated with a class object (a so-called instance).
So the question whether we call something argument or attribute depends on what we are calling: a function or a method. But I would say now argument is an appropriate term if we call the heatmap function, for example:
heatmap(my_data)
Attribute : Object's properties, e.g. Person has String fName, lName;
Parameter: appears in function/method definition e.g. public void setName(fName, lName)
Argument: value passed for a method/function's parameter when invoking/calling the method/function e.g. myPerson.setName("Michael", "Jackson")
What is the purpose of the dot before variables (e.g. ".variables") in the R plyr package?
for instance, from the R help file:
ddply(.data, .variables, .fun = NULL, ...,
.progress = "none", .drop = TRUE, .parallel = FALSE)
Any assistance would be greatly appreciated
There may be two things going on that are confusing you.
One is the . function in the 'plyr' package. The . function allows you to use a variable as a link rather than referring to the value(s) the variable contains. For instance, in some functions, we want to refer to the object x rather than the value(s) stored in x. In the 'base' package, there is no easy, concise way of doing this, so we use the 'plyr' package to say .(x). The 'plyr' functions themselves use this a lot like so:
ddply(data, .(row_1), summarize, total=sum(row_1))
If we didn't use the . function, 'ddply' would complain, because 'row_1' contains many values, when we really just want to refer to the object.
The other "." in action here is the way people use it as a character in the function arguments' names. I'm not sure what the origin is, but a lot of people seem to do it just to highlight which variables are function arguments and which variables are only part of the function's internal code. The "." is just another character, in this case.
From http://www.jstatsoft.org/v40/i01
Note that all arguments start with ".". This prevents name clashes with the arguments of the processing function, and helps to visually delineate arguments that control the repetition from arguments that control the individual steps. Some functions in base R use all uppercase argument names for this purpose, but I think this method is easier to type and read.
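The name-clash point can be illustrated with a base R sketch. Both apply_each and scale_by are hypothetical functions written for this example and are not part of plyr.

```r
# Wrapper whose own arguments are dot-prefixed, in the style quoted above.
apply_each <- function(.data, .fun, ...) {
  lapply(.data, .fun, ...)
}

# Processing function that happens to have an argument called `data`.
scale_by <- function(x, data) x * data

# `data = 10` cannot be mistaken for the wrapper's `.data`,
# so it travels through `...` to scale_by().
result <- apply_each(.data = list(1, 2), .fun = scale_by, data = 10)
```

Had the wrapper's first argument been called data, the call above would have matched data = 10 to the wrapper instead of passing it on.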