Confused about R terminology: Attributes, parameters, and arguments

Once and for all I want to get the R terminology right. However, none of the books I was reading was of big help, and it seems to me the authors choose the names sometimes arbitrarily. So, my question is when exactly are the names "attribute", "parameter", and "argument" used?
From what I read and understood so far, a parameter is what a function can take as input. For example if I have a function that calculates the sum of two values, sum(value1, value2), 'value1' and 'value2' are the function's parameters.
If we are calling a function, we call the values passed to the function arguments. For the sum-function example, "23" and "48" would be the function arguments for:
sum(23,48).
So basically we call it parameter when we define a function, and we call it argument when we call the function (so the arguments are passed to the function's parameters)
But what about "attributes"? From what I understand, attributes are the equivalent of parameters in methods (and methods are functions of a class object)?
For example, if I would have something like:
heatmap(myData, Colv=NA, Rowv=NA)
... , would 'myData' be an argument or attribute? And what about Colv=NA and Rowv=NA? Isn't heatmap() a function and thus everything in the parentheses should be called arguments?

Suppose we have:
f <- function(x) x + 1
comment(f) <- "my function"
f(3)
Arguments: We distinguish between formal arguments and actual arguments. In the above, x is the formal argument of f. The names of the formal arguments of f are given by:
> names(formals(f))
[1] "x"
The actual arguments to a function vary from one call to another and in the above example there is a single actual argument 3.
The function args can be used to display the entire signature of a function, including its formal arguments and their default values. If you are debugging a function, you can enter match.call() to show the call with the actual arguments substituted.
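To make the formal/actual distinction concrete, here is a minimal sketch. The function g and its default y = 2 are made up for illustration and are not part of the example above.

```r
# Hypothetical function with one required and one defaulted formal argument.
g <- function(x, y = 2) {
  # match.call() returns the call with actual arguments matched to formal names.
  match.call()
}

# Formal arguments are fixed when the function is defined.
formal_names <- names(formals(g))
print(formal_names)

# The default value of y is stored alongside the formal argument.
default_y <- formals(g)$y
print(default_y)

# Actual arguments vary from call to call; here the only one is 3.
captured_call <- g(3)
print(captured_call)
```

Calling g with a different value changes the captured call but never changes formals(g).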
Attributes: The attributes of an R object are given by attributes(f) like this:
> attributes(f)
$srcref
function(x) x + 1
$comment
[1] "my function"
There is one exception: an object's class is also regarded as an attribute, but an implicit class such as that of a function is not listed by the above and is instead given by class:
> class(f)
[1] "function"
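A short sketch of that exception, assuming the same f as above. The object obj and the class name "myclass" are hypothetical and only there to contrast an explicitly set class with an implicit one.

```r
f <- function(x) x + 1
comment(f) <- "my function"

# The comment is stored as an ordinary attribute.
comment_value <- attr(f, "comment")

# A function has an implicit class: class() reports it,
# but there is no "class" attribute stored on the object.
implicit_class <- class(f)
stored_class <- attr(f, "class")

# Once a class is set explicitly, it does show up in attributes().
obj <- structure(list(), class = "myclass")
attribute_names <- names(attributes(obj))

print(implicit_class)
print(is.null(stored_class))
print(attribute_names)
```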
Parameters: Sometimes function arguments are referred to as parameters, or sometimes one refers to those arguments which are fixed as parameters, but this tends to be related more to mathematics and statistics than to R.
In statistical models the model is typically a function of the data and the model parameters often via the likelihood. For example, here:
> lm(demand ~ Time, BOD)
Call:
lm(formula = demand ~ Time, data = BOD)
Coefficients:
(Intercept) Time
8.521 1.721
the linear regression coefficients of Intercept and Time (viz. 8.521 and 1.721) are often referred to as model parameters.
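As a rough illustration of the difference between the arguments of the lm call and the parameters of the fitted model, using the built-in BOD data from the example (the variable names are only for this sketch):

```r
# formula and data are arguments of the function call.
fit <- lm(demand ~ Time, data = BOD)

# The estimated coefficients are the model parameters.
model_parameters <- coef(fit)
print(round(model_parameters, 3))

# The arguments actually supplied are recorded in the stored call.
call_arguments <- names(as.list(fit$call))[-1]
print(call_arguments)
```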
As Dwin has already pointed out the various values influencing graphics in R are also termed parameters and can be displayed via:
> par()
and the corresponding concepts in other R graphics systems are often also referred to as parameters.

I suppose colloquial use of the term "attribute" might refer to several features of data objects, but there is a very specific meaning in R. An attribute is a value returned by either of the functions attributes or attr. These are critical to the language in that classes and names are stored as attributes. There are two other assignment functions, attributes<- and attr<-, that allow additional attributes to be assigned in support of class-specific objectives.
?attributes
?attr
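A small sketch of those four functions on a named vector. The "units" attribute is invented for illustration; R itself attaches no meaning to it.

```r
v <- c(a = 1, b = 2)

# Names are stored as the "names" attribute.
names_via_attr <- attr(v, "names")
print(names_via_attr)

# attr<- adds or replaces a single attribute.
attr(v, "units") <- "cm"
print(attributes(v))

# attributes<- replaces the whole set in one go.
attributes(v) <- list(names = c("x", "y"), units = "mm")
print(names(v))
print(attr(v, "units"))
```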
There is a par function which sets graphical "parameters" that control the base graphics behavior. So that would be an R-specific use of parameter that might be slightly different from the use of "argument", which is generally applied to the formal arguments to functions.
?par
There is a function args which, applied to a function name or an anonymous function, will return its arguments (as a "closure" which gets printed on the console just as a user would type during a function definition) along with their default values. The function formals will return the same "argument" information in the form of a list.
?args
?formals
I realize I am implicitly arguing with Matthew whose R skills are excellent. Contrary to him, I think that attributes and arguments have more specific meanings in the context of R and that careful authors will make an effort to keep their meanings separate. I would not have a problem understanding someone who uses parameter as a synonym for argument if the context were clearly a discussion of applying a function, since that is the typical parlance in mathematics. I would agree with the conclusion of your last sentence. Those are 'arguments' and most emphatically not attributes. The attributes of an object returned by heatmap are:
> attributes(hv) #from first example in ?heatmap
#$names
# [1] "rowInd" "colInd" "Rowv" "Colv"
But only some of the arguments became attributes and then only after being assigned to the returned value during the function execution.

I am not sure how analogous R is to Python, but I think most of the terms should be consistent across different languages. From what I read and learned in the last couple of days, a parameter is basically what a function takes as its input when you define it:
my_function <- function (param1, param2){
...
}
and it is called argument if you are invoking a function with certain input values (that are passed to the function as parameters):
my_function(arg1, arg2)
Functions that are part of a class are called methods. And an attribute can be either a value or a method associated with a class object (or so-called instance).
So the question whether we call something argument or attribute depends on what we are calling: a function or a method. But I would say now argument is an appropriate term if we call the heatmap function, for example:
heatmap(my_data)

Attribute : Object's properties, e.g. Person has String fName, lName;
Parameter: appears in function/method definition e.g. public void setName(fName, lName)
Argument: value passed for a method/function's parameter when invoking/calling the method/function e.g. myPerson.setName("Michael", "Jackson")

Related

To find valid argument for a function in R's help document (meaning of ...)

This question may seem basic, but it has bothered me for quite a while. The help document for many functions has ... as one of its arguments, but somehow I can never get my head around this ... thing.
For example, suppose I have created a model say model_xgboost and want to make a prediction based on a dataset say data_tbl using the predict() function, and I want to know the syntax. So I look at its help document which says:
?predict
**Usage**
predict (object, ...)
**Arguments**
object a model object for which prediction is desired.
... additional arguments affecting the predictions produced.
The syntax and its examples didn't really enlighten me, as I still have no idea what the valid syntax/arguments are for the function. An online course uses something like the below, which works:
data_tbl %>%
predict(model_xgboost, new_data = .)
However, looking across the help doc I cannot find the new_data argument. Instead it mentions a newdata argument in its Details section, which actually didn't work when I replaced new_data = . with newdata = .:
Error in `check_pred_type_dots()`:
! Did you mean to use `new_data` instead of `newdata`?
My questions are:
How do I know exactly what argument(s) / syntax can be used for a function like this?
Why new_data but not newdata in this example?
I might be missing something here, but is there any reference/resource about how to use/interpret a help document, in plain English? (A lot of documentation, including R help files, seems to just give a brief sentence like "additional arguments affecting the predictions produced" etc.)
@CarlWitthoft's answer is good; I want to add a little bit of nuance about this particular function. The reason the help page for ?predict is so vague is an unfortunate consequence of the fact that predict() is a generic function in R: that is, it's a function that can be applied to a variety of different object types, using slightly different (but appropriate) methods in each case. As such, the ?predict help page only lists object (which is required as the first argument in all methods) and ..., because different predict methods could take very different arguments/options.
If you call methods("predict") in a clean R session (before loading any additional packages) you'll see a list of 16 methods that base R knows about. After loading library("tidymodels"), the list expands to 69 methods. I don't know what class your object is (you can check with class(model_xgboost)), but assuming that it's of class model_fit, we look at ?predict.model_fit to see
predict(object, new_data, type = NULL, opts = list(), ...)
This tells us that we need to call the new data new_data (and, reading a bit farther down, that it needs to be "A rectangular data object, such as a data frame")
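To see why a generic's help page can only promise object and ..., here is a minimal sketch of an S3 generic whose methods take different extra arguments. The generic area and the classes "circle" and "rectangle" are hypothetical; only the final lookup of the lm method uses a real function.

```r
# Hypothetical generic: like predict(), it only declares the object and ...
area <- function(shape, ...) UseMethod("area")

# Each method is free to add its own arguments.
area.circle <- function(shape, digits = 2, ...) {
  round(pi * shape$r^2, digits)
}
area.rectangle <- function(shape, units = "cm", ...) {
  paste(shape$w * shape$h, units)
}

circle <- structure(list(r = 1), class = "circle")
rectangle <- structure(list(w = 2, h = 3), class = "rectangle")

# Dispatch picks the method from the class of the first argument.
circle_area <- area(circle, digits = 1)
rectangle_area <- area(rectangle, units = "m")
print(circle_area)
print(rectangle_area)

# The arguments a method accepts can be read off the method itself.
circle_args <- names(formals(area.circle))
predict_lm_args <- names(formals(getS3method("predict", "lm")))
print(circle_args)
print(predict_lm_args)
```

The same lookup with getS3method works for any method that methods("predict") lists, provided the package defining it is loaded.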
The help page for predict says
Most prediction methods which are similar to those for linear
models have an argument ‘newdata’ specifying the first place to
look for explanatory variables to be used for prediction
(emphasis added). I don't know why the parsnip authors (the predict.model_fit method comes from the parsnip package) decided to use new_data rather than newdata, presumably in line with the tidyverse style guide, which says
Use underscores (_) (so called snake case) to separate words within a name.
In my opinion this might have been a mistake, but you can see that the parsnip/tidymodels authors have realized that people are likely to make this mistake and added an informative error message, as shown in your example.
Among other things, the existence of ... in a function definition means you can enter any arguments (values, functions, etc) you want to. There are some cases where the main function does not even use the ... but passes them to functions called inside the main function. Simple example:
foo <- function(x,...){
y <- x^2
plot(x,y,...)
}
I know of functions which accept a function as an input argument, at which point the items to include via ... are specific to the selected input function name.
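A non-graphical sketch of the same forwarding pattern. The wrapper forward_mean is made up for illustration; it passes everything in ... on to mean.

```r
# Hypothetical wrapper: everything in ... is passed on to mean().
forward_mean <- function(x, ...) {
  extras <- list(...)
  message("forwarding: ", paste(names(extras), collapse = ", "))
  mean(x, ...)
}

# na.rm is not an argument of forward_mean, but mean() understands it.
result <- forward_mean(c(1, 2, NA), na.rm = TRUE)
print(result)

# A misspelled argument is swallowed by ... without an error,
# so the NA is not removed and the result is NA.
typo_result <- forward_mean(c(1, 2, NA), na.rn = TRUE)
print(typo_result)
```

The second call is the flip side of the flexibility: because ... accepts anything, a typo in an argument name may pass unnoticed.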

R: Is there any way to find R functions which are tests of objects of a specified class?

There are a number of tests which, applied to an object of a given class, produce information about that object. Consider objects of class "function". The functions is.primitive() or is.closure(), or (from rlang) is_primitive_eager() or is_primitive_lazy(), provide information about a function object. However, using methods(class = "function") (with rlang loaded) does not return any of these functions:
[1] as.data.frame as.list coerce coerce<- fortify head latex plot print tail .
Using extends(class1 = "function", maybe = TRUE, fullInfo = TRUE) shows two superclasses, "OptionalFunction" and "PossibleMethod".
Using completeClassDefinition(Class = "function", doExtends=TRUE) provides 23 subclasses. However, it appears to me (though I am not sure of this) that all or almost all of the super- and sub-classes from these two functions are specifically of S4 classes, which I generally do not use. One of these subclasses is "genericFunction", so I tried to apply it to a base R function which I knew to be generic. Although is(object=plot, class2 = "genericFunction") returns TRUE, and plot() antedates S4 classes, there is no "is.generic" test in base R, but there is an "isGeneric" test in the methods package, which suggests to me that plot() has been rewritten as an S4 object.
At any rate, there are a lot of obvious potential properties of functions, like whether they are generic, for which there are no is.<whatever> tests that I can find, and I would like to know if there are other ways I can search for them, e.g., in packages.
A more generic way of asking this same question is whether there is any way of identifying functions that will accept objects of a specified class and not return an error or nonsense. If so I could take a list of the functions in the recommended packages or in some specified package and test whether each returns a sensible response when handed a function. This is not exactly an answer --- such a method would return TRUE for quote(), for example -- but it would at least cut the problem down to size.
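The brute-force idea in the last paragraph could be sketched roughly as follows. This is only an assumption about how one might do it: it restricts the search to base functions whose names start with "is.", and the helper try_predicate is hypothetical.

```r
# Candidate predicates: everything in base whose name starts with "is."
candidate_names <- grep("^is\\.", ls(baseenv()), value = TRUE)

# Hypothetical helper: call a predicate on an object and keep the answer
# only if it is a single logical value; errors and other results become NA.
try_predicate <- function(name, object) {
  predicate <- get(name, envir = baseenv())
  tryCatch({
    out <- suppressWarnings(predicate(object))
    if (is.logical(out) && length(out) == 1L) as.vector(out) else NA
  }, error = function(e) NA)
}

# Hand every candidate the same function object (here: sum).
results <- vapply(candidate_names, try_predicate, logical(1), object = sum)

# Predicates that returned TRUE for a function object.
print(names(results)[!is.na(results) & results])
```

As the question anticipates, this only narrows the list: a predicate that returns TRUE or FALSE without error is not necessarily a meaningful test for functions.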

In R: Why is there no complete list of every argument a function can use?

I have been using R for about 3 years, and one of its main advantages (in my opinion) is the wide range of questions and assistance one can find on stackoverflow and similar websites.
One thing that is missing and kind of annoys me is an entire list of every single argument a function can use (plus possible values of those arguments).
For example: In R documentation all "main" arguments are listed and in many cases the documentation says "... further arguments passed to or from other methods". How can I know which arguments are meant by "..."?
When searching on stackoverflow for a way to get my desired result of an analysis, I sometimes stumble upon these additional arguments, which can be very helpful in many cases. It still takes a lot of time to find these arguments hidden in other users' answers. Sometimes I used a workaround which would have been unnecessary if I had known some additional function arguments.
Is anyone else experiencing the same thing?
(It's difficult to mention examples but I remember having that trouble when using the leaflet functions for the first time.)
Tim
The most direct answer is that we often don't know what arguments one might want to pass to .... In fact, the point of ... arguments is to not require us to know what arguments may be passed to them.
Consider, for example, the print generic in base R. It is defined as
print(x, ...)
So what are the arguments that can be passed to ...?
print.factor defines
print(x, quote = FALSE, max.levels = NULL,
width = getOption("width"), ...)
print.table defines
print(x, digits = getOption("digits"), quote = FALSE,
na.print = "", zero.print = "0", justify = "none", ...)
Notice that the print methods for factor and table objects don't share the same arguments. In fact, every print method may be defined with a different set of arguments. R then uses the class of the object to determine which set of arguments to apply to print.
When a developer creates a new print method, CRAN requires that all new methods contain at least the same arguments as the generic. So every print method has arguments x and ....
How do I know what arguments may be acceptable to ...?
First, read and follow the documentation. In glm, you find that the ... argument accepts arguments to "form the default control argument." This references the control argument, which then references the glm.control function. Opening ?glm.control shows the arguments epsilon, maxit and trace.
Another example, in ggplot2's geom_line, the documentation states that ... arguments are passed to the layer function. Use ?layer to see what arguments are available.
If the documentation simply specifies "to other methods," then you are probably looking at a method that is dispatched with different behaviors for different types of objects.
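The glm example can be followed in code as well. A minimal sketch, assuming the built-in BOD data set purely as convenient example data:

```r
# The arguments that glm's ... can forward are the formals of glm.control.
control_args <- names(formals(glm.control))
print(control_args)

# maxit is not a named argument of glm itself; it travels through ...
# and ends up in the control settings stored on the fitted object.
fit <- glm(demand ~ Time, data = BOD, maxit = 50)
used_maxit <- fit$control$maxit
print(used_maxit)
```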

Add extra arguments to implicit S4 generic for a primitive function

Take the function names: that's a primitive function in R. For primitive functions, an implicit S4 generic is created, so it is possible to construct S4 methods for that function.
Take an S4 class defined as follows :
setClass("aClass",
  representation=list(
    values = "character",
    id = "numeric"
  ),
  prototype=list(
    values = character(0),
    id = numeric(0)),
  validity=function(object){
    length(object@values)==length(object@id)
  }
)
Now I want to create a function to extract the names, either sorted or unsorted. I wanted to do this using the function names to avoid having to make a new function getNames() or whatever, as that's less intuitive.
The following gives an idea of what needs to be done:
setMethod("names",signature="aClass",
  function(x,ordered=TRUE){
    if(ordered)
      x@values[x@id]
    else
      x@values
  }
)
This won't work, as names is a primitive function and ordered is not an argument for the implicit generic.
How can I make this work under the following conditions:
the names function should keep its original behaviour for all other objects, including objects from other packages.
the code should be acceptable for use in a package
the code should be acceptable by the high standards set by eg Bioconductor.
The generic is available as
> getGeneric("names")
standardGeneric for "names" defined from package "base"
function (x)
standardGeneric("names", .Primitive("names"))
<environment: 0x459c9c0>
Methods may be defined for arguments: x
Use showMethods("names") for currently available ones.
so from the signature you can see that the short answer is that you can't add arguments. You'd definitely not want to create your own function names. A hack would use a package-global variable getOption("pkg_names_ordered") but I wouldn't partake of that solution myself.
In some ways the contract set out by names does not say anything about order (for instance, names and numerical indices are often used to subset; are the numerical indices for the ordered names, or the unordered names?), so you're really proposing a new generic anyway.
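One way to read the "new generic" suggestion is sketched below. The generic name sortedNames is hypothetical, and the class definition is a simplified version of the one in the question (no prototype, and the validity function returns a message on failure).

```r
library(methods)

# Simplified version of the class from the question.
setClass("aClass",
  slots = c(values = "character", id = "numeric"),
  validity = function(object) {
    if (length(object@values) == length(object@id)) TRUE
    else "values and id must have the same length"
  }
)

# names() keeps the signature of the implicit generic: only x.
setMethod("names", "aClass", function(x) x@values)

# Hypothetical new generic for the ordered variant.
setGeneric("sortedNames", function(x, ...) standardGeneric("sortedNames"))
setMethod("sortedNames", "aClass", function(x, ...) x@values[x@id])

obj <- new("aClass", values = c("b", "c", "a"), id = c(3, 1, 2))
plain_names <- names(obj)
ordered_names <- sortedNames(obj)
print(plain_names)
print(ordered_names)
```

This leaves names untouched for every other class, which was the first condition in the question.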

The Art of R Programming : Where else could I find the information?

I came across the editorial review of the book The Art of R Programming, and found this
The Art of R Programming takes you on a guided tour of software development with R, from basic types and data structures to advanced topics like closures, recursion, and anonymous functions
I immediately became fascinated by the idea of anonymous functions, something I had come across in Python in the form of lambda functions but could not make the connection in the R language.
I searched in the R manual and found this
Generally functions are assigned to symbols but they don't need to be. The value returned by the call to function is a function. If this is not given a name it is referred to as an anonymous function. Anonymous functions are most frequently used as arguments to other functions such as the apply family or outer.
These things for a not-very-long-time programmer like me are "quirky" in a very interesting sort of way.
Where can I find more of these for R (without having to buy a book) ?
Thank you for sharing your suggestions
Functions don't have names in R. Whether you happen to put a function into a variable or not is not a property of the function itself, so there do not exist two sorts of functions, anonymous and named. The best we can do is to agree to call a function which has never been assigned to a variable anonymous.
A function f can be regarded as a triple consisting of its formal arguments, its body and its environment accessible individually via formals(f), body(f) and environment(f). The name is not any part of that triple. See the function objects part of the language definition manual.
Note that if we want a function to call itself then we can use Recall to avoid knowing whether or not the function was assigned to a variable. The alternative is that the function body must know that the function has been assigned to a particular variable and what the name of that variable is. That is, if the function is assigned to variable f, say, then the body can refer to f in order to call itself. Recall is limited to self-calling functions. If we have two functions which mutually call each other then a counterpart to Recall does not exist -- each function must name the other which means that each function must have been assigned to a variable and each function body must know the variable name that the other function was assigned to.
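A brief sketch of both points: the three parts of a function, and Recall inside a function that was never assigned to a variable. The factorial is just a convenient self-calling example.

```r
f <- function(x) x + 1

# The three parts of a function; none of them is a name.
formal_names <- names(formals(f))
body_text <- deparse(body(f))
f_env <- environment(f)
print(formal_names)
print(body_text)
print(is.environment(f_env))

# An anonymous recursive function: Recall() refers to the function
# currently being evaluated, so no variable name is needed.
factorial_of_5 <- (function(n) if (n <= 1) 1 else n * Recall(n - 1))(5)
print(factorial_of_5)
```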
There's not a lot to say about anonymous functions in R. Unlike Python, where lambda functions require special syntax, in R an anonymous function is simply a function without a name.
For example:
function(x,y) { x+y }
whereas a normal, named, function would be
add <- function(x,y) { x+y }
Functions are first-class objects, so you can pass them (regardless of whether they're anonymous) as arguments to other functions. Examples of functions that take other functions as arguments include apply, lapply and sapply.
Get Patrick Burns' "The R Inferno" at his site
There are several good web sites with basic introductions to R usage.
I also like Zoonekynd's manual
Great answers about style so far. Here's an answer about a typical use of anonymous functions in R:
# Make some data up
my.list <- list()
for( i in seq(100) ) {
my.list[[i]] <- lm( runif(10) ~ runif(10) )
}
# Do something with the data
sapply( my.list, function(x) x$qr$rank )
We could have named the function, but for simple data extractions and so forth it's really handy not to have to.
