What is the impact of not calling the arguments while calling a function [closed] - r

Closed 11 years ago.
My question is about the difference between two ways to pass the arguments of a function
for instance
function1(obj1, obj2, obj3, obj4, obj5)
or
function1(arg1=obj1, arg2=obj2, arg3=obj3, arg4=obj4, arg5=obj5)
Is there a rule/convention/document for that?
I can see at least 2 situations where the first way is not great
If we want to add a new argument, we are forced to add it at the end of the list, which may not be the sensible place for it (I like to group arguments that go together).
Arguments with defaults have to be put at the end of the list; otherwise you have to supply them even when you want the default value.
Any ideas on that?

For me the issue is simple: reproducible results require reproducible and explicit function calls.
In my case, I use named arguments having learned that another person may insert a new parameter in their function if they so choose, which caused my code to break.
I also tend to store parameters in a list and use these when calling a function, e.g. someCrazyFunction(stuff = stuff, eps = Par$eps, tol = Par$tol, verbose = Par$verbose, strict = Par$strict, debug = Par$debug)
If I don't do this, I am not doing my part to ensure reproducible results. It is only a few keystrokes, and I don't have to worry if the author of the function or package moves arguments around, inserts new arguments, deletes some arguments (which I'll notice because R will report an unused argument), or otherwise makes seemingly harmless changes. If they make such a change, then how can someone else who looks at my code be sure of how to reproduce the same call as it was at the time I made it?
Lesson: Debugging is far more painful than the few keystrokes needed to ensure reproducibility.
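A minimal sketch of the point above. The functions scale_v1 and scale_v2 are hypothetical stand-ins for two releases of the same package function, where the author swapped the order of two arguments between releases.

```r
# Hypothetical "version 1" of a package function.
scale_v1 <- function(x, center, spread) (x - center) / spread

# Hypothetical "version 2": the author swapped the last two arguments.
scale_v2 <- function(x, spread, center) (x - center) / spread

# Positional calls silently change meaning between versions.
positional_v1 <- scale_v1(10, 4, 2)   # center = 4, spread = 2
positional_v2 <- scale_v2(10, 4, 2)   # spread = 4, center = 2

# Named calls are unaffected by the reordering.
named_v1 <- scale_v1(x = 10, center = 4, spread = 2)
named_v2 <- scale_v2(x = 10, center = 4, spread = 2)
```

The positional call still runs after the change, which is the dangerous case: no error is raised, only a different number comes back.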
(Minor update) This question and the selected answer from elsewhere on SO exemplify a particular aspect of this implicit contract between the package creator and the person with a dependency on a package. If I develop a dependency on a given function and the author simply shuffles the arguments, then my code should work perfectly regardless. They made no explicit contract to not move things around, and I can assume no implicit contract that it will behave that way. I only assume that they will not change the definitions of arguments.

From a function implementer's point of view, you must always add new parameters to the end and name them so they don't have a prefix in common with existing arguments.
This is because people are free to use positional matching and partial names. A fact of R life...
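To illustrate the prefix rule, here is a small sketch. fit_v1 and fit_v2 are hypothetical versions of one function; the second adds an argument that shares a prefix with an existing one.

```r
# Hypothetical original function: callers may abbreviate 'tolerance' as 'tol'.
fit_v1 <- function(x, tolerance = 1e-8) tolerance

# Hypothetical update: the new argument shares the prefix "tol".
fit_v2 <- function(x, tolerance = 1e-8, tolerate_na = FALSE) tolerance

# The abbreviated name works against the original definition.
ok <- fit_v1(1, tol = 0.1)

# Against the update, 'tol' is ambiguous and R raises an error.
broken <- tryCatch(fit_v2(1, tol = 0.1), error = function(e) "error")

# Spelling the name out in full keeps working.
full <- fit_v2(1, tolerance = 0.1)
```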

Function arguments in R can be matched via position or by name and you as the person punching things into the keyboard are afforded some flexibility in how you decide to use or abuse that. One of the immediate benefits to using the named arguments is that you can change the order of the arguments within the function as you see fit. i.e.
function1(arg1=obj1, arg2=obj2, arg3=obj3, arg4=obj4, arg5=obj5)
and
function1(arg5=obj5, arg4=obj4, arg3=obj3, arg2=obj2, arg1=obj1)
will evaluate in the same fashion, while
function1(obj1, obj2, obj3, obj4, obj5)
and
function1(obj5, obj4, obj3, obj2, obj1)
will not. Function arguments can also be partially matched and are matched using the following criteria:
exact match for named argument
partial match for named argument
positional match
This can obviously lead to some unintended consequences if you aren't careful with the partial matching. I believe that if an argument is matched by name, it is "removed" from the positional search, though I can't find a specific reference for that at the moment. As a note of common use, I tend to see people use positional matching for the first argument in a function, and then specify others that may be optional afterwards by name. Again, this is mostly personal convention and habit as far as I'm concerned.
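The three matching steps can be seen in a single call. This is a small sketch; show_args is a hypothetical function that simply reports which value ended up in which formal.

```r
# Hypothetical function that returns its arguments, labelled by formal name.
show_args <- function(alpha, beta, gamma) {
  c(alpha = alpha, beta = beta, gamma = gamma)
}

# 'gamma' is matched exactly, 'be' is partially matched to 'beta',
# and the remaining unnamed value fills the only formal left, 'alpha'.
res <- show_args(3, gamma = 1, be = 2)
```

Here the unnamed value 3 lands in alpha even though it was written first, because beta and gamma had already been claimed by name.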

In a language like R, a function can have thirty or so parameters. In that case, argument names are handy in combination with default parameter values.
Other than that, especially for a short list of parameters, naming the arguments adds little.

Related

How to make an R object immutable? [duplicate]

I'm working in R, and I'd like to define some variables that I (or one of my collaborators) cannot change. In C++ I'd do this:
const std::string path( "/projects/current" );
How do I do this in the R programming language?
Edit for clarity: I know that I can define strings like this in R:
path = "/projects/current"
What I really want is a language construct that guarantees that nobody can ever change the value associated with the variable named "path."
Edit to respond to comments:
It's technically true that const is a compile-time guarantee, but in my mind it would be just as valid for the R interpreter to stop execution with an error message. For example, look what happens when you try to assign values to a numeric constant:
> 7 = 3
Error in 7 = 3 : invalid (do_set) left-hand side to assignment
So what I really want is a language feature that allows you to assign values once and only once, and there should be some kind of error when you try to assign a new value to a variable declared as const. I don't care if the error occurs at run-time, especially if there's no compilation phase. This might not technically be const by the Wikipedia definition, but it's very close. It also looks like this is not possible in the R programming language.
See lockBinding:
a <- 1
lockBinding("a", globalenv())
a <- 2
Error: cannot change value of locked binding for 'a'
Since you are planning to distribute your code to others, you could (should?) consider creating a package with a NAMESPACE. There you can define variables that will have a constant value, at least as far as the functions in your package are concerned. Have a look at Tierney (2003), Name Space Management for R.
I'm pretty sure that this isn't possible in R. If you're worried about accidentally re-writing the value then the easiest thing to do would be to put all of your constants into a list structure then you know when you're using those values. Something like:
my.consts<-list(pi=3.14159,e=2.718,c=3e8)
Then when you need to access them you have an aide-mémoire to know what not to do, and it also pushes them out of your normal namespace.
Another place to ask would be R development mailing list. Hope this helps.
(Edited for new idea:) The bindenv functions provide an experimental interface for adjustments to environments and bindings within environments. They allow for locking environments as well as individual bindings, and for linking a variable to a function.
This seems like the sort of thing that could give a false sense of security (like a const pointer to a non-const variable) but it might help.
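A short sketch of locking a whole environment of constants, using the base functions lockEnvironment and unlockBinding. The environment name consts and the variable names are illustrative only.

```r
# Put the "constants" in their own environment and lock it.
consts <- new.env()
assign("path", "/projects/current", envir = consts)
lockEnvironment(consts, bindings = TRUE)

# Overwriting an existing binding now fails.
overwrite <- tryCatch(
  { assign("path", "/tmp", envir = consts); "changed" },
  error = function(e) "blocked"
)

# Adding a new binding to the locked environment fails as well.
add_new <- tryCatch(
  { assign("other", 1, envir = consts); "added" },
  error = function(e) "blocked"
)

# The lock is not absolute: anyone can unlock the binding and change it.
unlockBinding("path", consts)
assign("path", "/tmp", envir = consts)
after_unlock <- get("path", envir = consts)
```

The last three lines are why this protects against accidents rather than against a determined collaborator.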
(Edited for focus:) const is a compile-time guarantee, not a lock-down on bits in memory. Since R doesn't have a compile phase where it looks at all the code at once (it is built for interactive use), there's no way to check that future instructions won't violate any guarantee. If there's a right way to do this, the folks at the R-help list will know. My suggested workaround: fake your own compilation. Write a script to preprocess your R code that will manually substitute the corresponding literal for each appearance of your "constant" variables.
(Original:) What benefit are you hoping to get from having a variable that acts like a C "const"?
Since R has exclusively call-by-value semantics (unless you do some munging with environments), there isn't any reason to worry about clobbering your variables by calling functions on them. Adopting some sort of naming conventions or using some OOP structure is probably the right solution if you're worried about you and your collaborators accidentally using variables with the same names.
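For instance, a function that modifies its argument only changes its own local copy. A minimal sketch, with set_first_to_zero as a hypothetical helper:

```r
# Hypothetical helper that modifies its argument and returns the result.
set_first_to_zero <- function(v) {
  v[1] <- 0
  v
}

original <- c(5, 6, 7)
modified <- set_first_to_zero(original)
# 'original' still holds its old values; only 'modified' has the change.
```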
The feature you're looking for may exist, but I doubt it given the origin of R as an interactive environment where you'd want to be able to undo your actions.
R doesn't have a language constant feature. The list idea above is good; I personally use a naming convention like ALL_CAPS.
I took the answer below from this website
The simplest sort of R expression is just a constant value, typically a numeric value (a number) or a character value (a piece of text). For example, if we need to specify a number of seconds corresponding to 10 minutes, we specify a number.
> 600
[1] 600
If we need to specify the name of a file that we want to read data from, we specify the name as a character value. Character values must be surrounded by either double-quotes or single-quotes.
> "http://www.census.gov/ipc/www/popclockworld.html"
[1] "http://www.census.gov/ipc/www/popclockworld.html"

R: Getting more informative error messages in R

I am still not very good at using R’s standard debugging tools, and I often find that neither the error nor the traceback tell me enough to figure out what is going on. I would like to change R’s default behavior on error to provide more information.
Specifically, I would always like the call, including the formals, the expression assigned to each formal (the default expression if the default is the expression assigned), and the value of each of the argument expressions as evaluated in place, all returned in a format that makes it unambiguous which expression has been matched to which formal and which values go with which expression. Since the values might be large or of unexpected or evanescent type, I’d like them to be returned in a format, such as a str(), that makes intelligent choices about truncation and correctly identifies promises and other object types that tend to evaluate themselves into something else before you see them.
And finally, I’d like all these things, together with the return value of each call, for every function on the call stack from the error back to (and including) some piece of code that I wrote. It seems to me that the natural structure would be a single R object, a list of lists, one list per call (perhaps tidied, broom-like, into a tibble with some list columns) that I could single-step through in the obvious way.
I apologize if I have described some standard R debugging tool that I just haven't learned how to use properly yet. Is this even possible? If it is, could it be implemented via R's available error handlers, or would it need some package-scale coding project?
I would most prefer a solution that changes the default error response to this, but if that is impracticable, I'd accept a solution that requires that I rerun a code chunk with a wrapper or something similar.
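As a rough starting point for the wrapper variant, the call stack at the moment of an error can be captured with base R's withCallingHandlers and sys.calls. This is only a sketch under that assumption: capture_calls, inner and outer are hypothetical names, and it records the calls only, not the argument values or return values asked for above.

```r
# Hypothetical wrapper: evaluate 'expr' and, on error, keep the call stack.
capture_calls <- function(expr) {
  calls <- NULL
  result <- tryCatch(
    withCallingHandlers(
      expr,
      # The calling handler runs before the stack unwinds,
      # so sys.calls() still includes the failing frames.
      error = function(e) calls <<- as.list(sys.calls())
    ),
    error = function(e) e
  )
  list(result = result, calls = calls)
}

# Hypothetical functions used to trigger an error two frames deep.
inner <- function(x) stop("boom")
outer <- function(x) inner(x + 1)

out <- capture_calls(outer(1))

# Names of the functions on the recorded stack.
called <- vapply(out$calls, function(cl) deparse(cl[[1]])[1], character(1))
```

The recorded stack also contains the handler machinery itself, so a real tool would need to filter those frames out.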

In R: Why is there no complete list of every argument a function can use?

I've been using R for about 3 years, and one of its main advantages (in my opinion) is the wide range of questions and assistance one can find on stackoverflow and similar websites.
One thing that is missing and kind of annoys me is an entire list of every single argument a function can use (plus possible values of those arguments).
For example: In R documentation all "main" arguments are listed and in many cases the documentation says "... further arguments passed to or from other methods". How can I know which arguments are meant by "..."?
When searching on stackoverflow for a way to get my desired result of an analysis, I sometimes stumble upon these additional arguments, which can be very helpful in many cases. It still takes a lot of time to find these arguments hidden in other users' answers. Sometimes I used a workaround that would have been unnecessary if I had known about some additional function arguments.
Is anyone else experiencing the same thing?
(It's difficult to mention examples but I remember having that trouble when using the leaflet functions for the first time.)
Tim
The most direct answer is that we often don't know what arguments one might want to pass to .... In fact, the point of ... arguments is to not require us to know what arguments may be passed to them.
Consider, for example, the print generic in base R. It is defined as
print(x, ...)
So what are the arguments that can be passed to ...?
print.factor defines
print(x, quote = FALSE, max.levels = NULL,
width = getOption("width"), ...)
print.table defines
print(x, digits = getOption("digits"), quote = FALSE,
na.print = "", zero.print = "0", justify = "none", ...)
Notice that the print methods for factor and table objects don't share the same arguments. In fact, every print method may be defined with a different set of arguments. R then uses the class of the object to determine which set of arguments to apply to print.
When a developer creates a new print method, CRAN requires that all new methods contain at least the same arguments as the generic. So every print method has arguments x and ....
How do I know what arguments may be acceptable to ...?
First, read and follow the documentation. In glm, you find that the ... argument accepts arguments to "form the default control argument." This references the control argument, which then references the glm.control function. Opening ?glm.control shows the arguments epsilon, maxit and trace.
Another example, in ggplot2's geom_line, the documentation states that ... arguments are passed to the layer function. Use ?layer to see what arguments are available.
If the documentation simply specifies "to other methods," then you are probably looking at a method that is dispatched with different behaviors for different types of objects.
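Alongside the documentation, the argument lists can be inspected from the console with base R's formals. A small sketch using the print generic and glm.control mentioned above:

```r
# Arguments of the generic itself: only x and ...
generic_args <- names(formals(print))

# Arguments of the function that glm's ... are forwarded to.
control_args <- names(formals(stats::glm.control))

# methods("print") lists the available print methods; each one can be
# inspected the same way to see which extra arguments it accepts.
print_methods <- methods("print")
```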

Function argument matching: by name vs by position

What is the difference between these lines of code?
mean(some_argument)
mean(x = some_argument)
The output is the same, but has the explicit mention of x any advantages?
People typically don't add argument names for commonly used arguments, such as the x in mean, but almost always refer to the na.rm arguments when removing missing values.
While neglecting the argument name makes for compact code, here are four (related) reasons for including the names of arguments rather than relying on their position.
Re-order arguments as needed. When you refer to the arguments by name, you can arbitrarily re-order the arguments and still produce the desired result. Sometimes it is useful to re-order your arguments. For example, when running a loop over one of the arguments, you might prefer to put the looped argument in the front of the function.
It is typically safer / more future-proof. As an example, if some user-written function or package re-orders the arguments in an update, and you relied on the positions of the arguments, this would break your code. In the best scenario, you would get an error. In the worst scenario the function would run, but would return an incorrect result. Including the argument names greatly reduces the chances of running into either case.
For greater code clarity. If an argument is rarely used or you want to be explicit for future readers of your code (including you 2 months from now), adding the names can make for easier reading.
Ability to skip arguments. If you want to only change the third argument, then referring to it by name is probably preferable.
See also the R Language Definition: 4.3.2 Argument matching
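A brief sketch of the skipping and re-ordering points, using mean, whose default method takes x, trim and na.rm in that order:

```r
x <- c(1, NA, 3)

# By name: skip 'trim' and set only 'na.rm'.
by_name <- mean(x, na.rm = TRUE)

# By position: 'trim' must be supplied just to reach 'na.rm'.
by_position <- mean(x, 0, TRUE)

# Named arguments can also be given in any order.
reordered <- mean(na.rm = TRUE, x = x)
```

All three calls compute the same mean, but only the named ones say what the TRUE is for.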

Why do the R functions mean() and sum() behave differently with vectors vs. raw strings?

I was wondering if there was an underlying programming logic as to why some basic R functions behave differently towards raw data input into them vs. vectors.
For example, if I do this
mean(1,2,3)
I don't get the correct answer, and don't get an error
But if I do this
sum(1,2,3)
I do get the right answer, even though I'd assume proper syntax would be sum(c(1,2,3))
And if I do this
sd(1,2,3)
I get an error Error in sd(1, 2, 3) : unused argument (3)
I'm interested in what, if any, the underlying programming logic of these different behaviors is. (I'm sure if I rooted around in the source code I could figure out exactly why they behave differently, but I want to know if there is a reason why the code might have been written that way.)
Practically, I'm teaching a basic R class and want to explain to my students why things work that way; they get a bit tired of me saying "That's just how R works, live with it; and always put things in vectors to make life easy."
EDITS: I have bolded some sections to add emphasis. My question is largely about software design, not how these particular functions happen to operate or how to determine their exact operation. That is, not "what arguments do these functions accept" but "why do simple mathematical functions in R appear (to a biologist) to have been designed differently".
The second argument taken by mean is trim, which is not a listed argument for sum, so in mean(1, 2, 3) only the 1 is treated as data. The first argument for sum is ..., so, I believe, the function will try to compute the sum of all values entered as unnamed arguments.
mean and sum are generic functions, so they get deployed differently depending on an object's class.
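The three behaviours from the question can be reproduced side by side. A sketch, assuming the default method of mean, whose arguments are x, trim and na.rm:

```r
# mean(): only the first value is data; 2 is taken as 'trim', 3 as 'na.rm'.
mean_scalars <- mean(1, 2, 3)
mean_vector  <- mean(c(1, 2, 3))

# sum(): its first argument is ..., so every unnamed value is added up.
sum_scalars <- sum(1, 2, 3)

# sd(): it takes only x and na.rm, so a third value is an unused argument.
sd_scalars <- tryCatch(sd(1, 2, 3), error = function(e) "error")
```

So the difference comes down to each function's formal argument list rather than to any special handling of vectors.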
