What does passing an ellipsis (...) as an argument mean in R?

This question already has answers here: How to use R's ellipsis feature when writing your own function?
I have seen several related questions about the ellipsis, but I am still not sure what it means to pass "..." as an argument. I am completely new to R, but I am trying to understand what the following means:
forest <- randomForest(x = train.x, y = train.y, ...)

The typical use of the ... argument is when a function, say f, internally calls a function g and uses ... to pass arguments on to g without explicitly listing all of those arguments as its own formal arguments. One may want to do this, for example, when g has many optional arguments that the user of f may or may not need. Instead of adding all those optional arguments to f and increasing its complexity, one can simply use ....
What it means, as you asked, is that the function f does not inspect those arguments itself; it simply passes them on to g. The interesting thing is that ... may even contain arguments that g does not want; g will ignore them (provided g itself also accepts ...), leaving them available for, say, a function h if it needed to use ... too. But also see this SO post for a detailed discussion.
For example consider:
f <- function(x, y, ...) {
  # do something with x
  # do something with y
  g(...)  # g will use what it needs
  h(...)  # h will use what it needs
  # do more stuff and exit
}
Also, see here in the Introduction to R manual for an example using par.
Also, this post shows how to unpack the ... if one were writing a function that makes use of it.
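To make this concrete, here is a minimal runnable sketch of the same pattern (my_summary is a made-up name for illustration): it forwards ... to mean(), so callers can use mean()'s optional arguments, such as na.rm, without my_summary having to declare them, just as randomForest forwards its ... in the call you quoted.
my_summary <- function(x, ...) {
  # ... is handed on untouched; my_summary never inspects it
  c(mean = mean(x, ...), n = length(x))
}
my_summary(c(1, 2, NA, 100))                # mean is NA
my_summary(c(1, 2, NA, 100), na.rm = TRUE)  # na.rm is forwarded to mean()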

Related

implicit variables in R [closed]

In Scala we have the concept of an implicit variable or parameter, which can be handy, although sometimes confusing, in many cases. The question is:
Is there something like implicit variables in R?
If not, would it be possible to achieve the same behavior as Scala's implicit parameters when calling a function in R?
Moved from comments.
If I understand this correctly, an implicit parameter to a function is a function argument which, if not specified when calling the function, defaults to the default associated with that argument's type, and only one such default can exist for all instances of that type at any one time. However, arguments in R don't have declared types -- it's all dynamic. One does not write f <- function(int x) ... but just f <- function(x) ....
I suppose one could have a convention that integerDefault is the default value associated with the integer type:
f <- function(x = integerDefault) x
g <- function(y = integerDefault) y + 1L
integerDefault <- 0L
f()
## [1] 0
g()
## [1] 1
There is nothing to prevent you from passing a double to f and g, but:
if you don't pass anything, you get the default integer, which seems similar to Scala;
there can only be one such default at any point, since all of them go by the same name, which also seems similar to Scala;
and if no value has been assigned to integerDefault, the function fails, which is again similar to Scala.
Note that integerDefault will be looked up lexically -- not in the caller.
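A quick illustration of that last point:
f <- function(x = integerDefault) x
g <- function() {
  integerDefault <- 99L  # local to g; f's default never sees this
  f()
}
integerDefault <- 0L
g()
## [1] 0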
I'm not sure what the desired behavior is. From the first paragraph of the site you link, it seems to be simply a default parameter setting for parameters not provided to the function. This is used in R all the time:
> f <- function(x=10) print(x)
> f()
[1] 10
Is that what you mean?

fingerprint a function with its arguments in r

I'd like to save computation time by avoiding running the same function with the same arguments multiple times.
Given the following code:
f1 <- function(a,b) return(a+b)
f2 <- function(c,d,f) return(c*d*f)
x <- 3
y <- 4
f2(1,2,f1(x,y))
Let's assume that computing the 'f' function argument is expensive, and I'd like to cache the result somehow, so that I'd know whether it had ever been executed before.
Here is my main question:
I assume I can generate a key myself for f1(3,4), for example key <- paste('f1', x, y), do my own bookkeeping, and avoid running it again.
However, is it possible for f2 to generate such a key from f automatically and return it to me (for any function with any arguments)?
If not, or alternatively, before I pass f1(x,y), can I generate such a key in a generic manner that would work for any function with any arguments?
Thanks much.
Interesting question. I never thought about this.
A quick google search found this package: R.cache.
The function addMemoization takes a function as an argument and returns a function that should cache its results.
I haven't used this package myself, so I don't know how well it works, but it seems to fit what you are looking for.
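For illustration, here is a minimal hand-rolled sketch of the same idea; the memoize helper and its key scheme are made up for this answer, not taken from any package:
memoize <- function(fun, fname = deparse(substitute(fun))) {
  cache <- new.env(parent = emptyenv())  # private store for results
  function(...) {
    # key built from the function's name and its deparsed arguments
    key <- paste(fname, paste(deparse(list(...)), collapse = " "), sep = ":")
    if (!exists(key, envir = cache)) {
      assign(key, fun(...), envir = cache)  # first call: compute and store
    }
    get(key, envir = cache)  # later calls: read from the cache
  }
}
f1 <- function(a, b) { Sys.sleep(1); a + b }  # stand-in for an expensive call
f1_cached <- memoize(f1)
f1_cached(3, 4)  # slow the first time: computed and cached
f1_cached(3, 4)  # instant: looked up under the key "f1:list(3, 4)"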

If function(x) can work, why would we need function()?

I understand how "function(x)" works, but what is the role of "function()" here?
z <- function() {
  y <- 2
  function(x) {
    x + y
  }
}
function is a keyword which is part of the creation of a function (in the programming sense that Gilles describes in his answer). The other parts are the argument list (in parentheses) and the function body (in braces).
In your example, z is a function which takes no arguments. It returns a function which takes one argument (named x), since R returns the last evaluated expression as the return value by default. That returned function adds 2 to its argument x.
When z is called (with no arguments: z()), it assigns 2 to y (inside the function's variable scope, an additional concept that I'm not going to get into). Then it creates a function (without a name) which takes a single argument named x and which, when itself called, returns its argument x plus 2. That anonymous function is returned from the call to z and, presumably, stored so that it can be called later.
See https://github.com/hadley/devtools/wiki/Functions and https://github.com/hadley/devtools/wiki/Functionals for more discussion on passing around functions as objects.
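For example, using z from the question, to make the "stored so that it can be called later" part concrete:
add_two <- z()  # store the anonymous function; y = 2 is captured inside it
add_two(3)      # call it later
## [1] 5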
The word “function” means somewhat different things in mathematics and in programming. In mathematics, a function is a correspondence between each possible value of the parameters and a result. In programming, a function is a sequence of instructions to compute the result from the parameters.
In mathematics, a function with no argument is a constant. In programming, this is not the case, because functions can have side effects, such as printing something. So you will encounter many functions with no arguments in programs.
Here the function function(x) { x + y } depends on the variable y. There are no side effects, so this function is very much like the mathematical function defined by $f(x) = x + y$. However, this definition is only complete for a given value of y. The previous instruction sets y to 2, so
function() {
  y <- 2
  function(x) {
    x + y
  }
}
is equivalent to
function () {
  function(x) {
    x + 2
  }
}
in the sense that both definitions produce the same results when applied to the same value. They are, however, computed in slightly different ways.
That function is given the name z. When you call z (with no argument, so you write z()), this builds the function function (x) { x + 2 }, or something equivalent: z() is a function of one argument that adds 2 to its argument. So you can write something like z()(3) — the result is 5.
This is obviously a toy example. As you progress in your lectures, you'll see progressively more complex examples where such function building is mixed with other features to achieve useful things.
With some help I've picked out a few examples of functions without formal arguments to help you understand why they could be useful.
Functions which have side-effects
plot.new(), for instance, initializes a graphics device.
Want to update the console buffer? flush.console() has your back.
Functions which have a narrow purpose
This is probably the majority of the cases.
Want to know the date/time? Call date().
Want to know the version of R? Call getRversion().
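And of course you can write your own. A small illustrative sketch (log_timestamp is a made-up name): it takes no arguments and exists purely for its side effect.
log_timestamp <- function() {
  stamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
  message("run at: ", stamp)  # side effect: write to the console
  invisible(stamp)            # return the stamp without printing it
}
log_timestamp()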

Using outer() with a multivariable function

Suppose you have a function f <- function(x, y, z) { ... }. How would you go about passing a constant to one argument, but letting the other ones vary? In other words, I would like to do something like this:
output <- outer(x,y,f(x,y,z=2))
This code doesn't evaluate, but is there a way to do this?
outer(x, y, f, z=2)
The arguments after the function are additional arguments to it, see ... in ?outer. This syntax is very common in R, the whole apply family works the same for instance.
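For instance, a quick illustration with sapply, where digits = 1 is forwarded to round() on every call:
sapply(c(1.234, 5.678), round, digits = 1)
## [1] 1.2 5.7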
Update:
I can't tell exactly what you want to accomplish in your follow up question, but think a solution on this form is probably what you should use.
outer(sigma_int, theta_int, function(s, t)
  dmvnorm(y, rep(0, n), y_mat(n, lambda, t, s)))
This calculates a variance matrix for each combination of the values in sigma_int and theta_int, uses that matrix to define a density, and evaluates it at the point(s) defined in y. I haven't been able to test it though, since I don't know the types and dimensions of the variables involved.
outer (along with the apply family of functions and others) will pass along extra arguments to the functions which they call. However, if you are dealing with a case where this is not supported (optim being one example), then you can use the more general approach of currying. To curry a function is to create a new function which has (some of) the variables fixed and therefore has fewer parameters.
library("functional")
output <- outer(x, y, Curry(f, z = 2))
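If you'd rather avoid the extra package, an anonymous wrapper achieves the same fixing of z in base R:
output <- outer(x, y, function(x, y) f(x, y, z = 2))  # base-R equivalent of Curry(f, z = 2)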

How are functions curried?

I understand what the concept of currying is, and know how to use it. These are not my questions, rather I am curious as to how this is actually implemented at some lower level than, say, Haskell code.
For example, when (+) 2 4 is curried, is a pointer to the 2 maintained until the 4 is passed in? Does Gandalf bend space-time? What is this magic?
Short answer: yes a pointer is maintained to the 2 until the 4 is passed in.
Longer than necessary answer:
Conceptually, you're supposed to think about Haskell being defined in terms of the lambda calculus and term rewriting. Let's say you have the following definition:
f x y = x + y
This definition for f comes out in lambda calculus as something like the following, where I've explicitly put parentheses around the lambda bodies:
\x -> (\y -> (x + y))
If you're not familiar with the lambda calculus, this basically says "a function of an argument x that returns (a function of an argument y that returns (x + y))". In the lambda calculus, when we apply a function like this to some value, we can replace the application of the function by a copy of the body of the function with the value substituted for the function's parameter.
So then the expression f 1 2 is evaluated by the following sequence of rewrites:
(\x -> (\y -> (x + y))) 1 2
(\y -> (1 + y)) 2 # substituted 1 for x
(1 + 2) # substituted 2 for y
3
So you can see here that if we'd only supplied a single argument to f, we would have stopped at \y -> (1 + y). So we've got a whole term that is just a function for adding 1 to something, entirely separate from our original term, which may still be in use somewhere (for other references to f).
The key point is that if we implement functions like this, every function has only one argument but some return functions (and some return functions which return functions which return ...). Every time we apply a function we create a new term that "hard-codes" the first argument into the body of the function (including the bodies of any functions this one returns). This is how you get currying and closures.
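The same observable behavior can be written down in R (this thread's main language) as an illustration only; here a closure plays the role of the term with the first argument "hard-coded", something Haskell does for every function automatically:
f <- function(x) function(y) x + y  # one argument each, like the lambda form
add1 <- f(1)  # a closure holding a reference to x = 1, analogous to \y -> (1 + y)
add1(2)
## [1] 3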
Now, that's not how Haskell is directly implemented, obviously. Once upon a time, Haskell (or possibly one of its predecessors; I'm not exactly sure on the history) was implemented by graph reduction. This is a technique for doing something equivalent to the term reduction I described above, which automatically brings along lazy evaluation and a fair amount of data sharing.
In graph reduction, everything is references to nodes in a graph. I won't go into too much detail, but when the evaluation engine reduces the application of a function to a value, it copies the sub-graph corresponding to the body of the function, with the necessary substitution of the argument value for the function's parameter (but shares references to graph nodes where they are unaffected by the substitution). So essentially, yes, partially applying a function creates a new structure in memory that has a reference to the supplied argument (i.e. "a pointer to the 2"), and your program can pass around references to that structure (and even share it and apply it multiple times), until more arguments are supplied and it can actually be reduced. However, it's not like it's just remembering the function and accumulating arguments until it gets all of them; the evaluation engine actually does some of the work each time it's applied to a new argument. In fact, the graph reduction engine can't even tell the difference between an application that returns a function and still needs more arguments, and one that has just got its last argument.
I can't tell you much more about the current implementation of Haskell. I believe it's a distant mutant descendant of graph reduction, with loads of clever short-cuts and go-faster stripes. But I might be wrong about that; maybe they've found a completely different execution strategy that isn't anything at all like graph reduction anymore. But I'm 90% sure it'll still end up passing around data structures that hold on to references to the partial arguments, and it probably still does something equivalent to factoring in the arguments partially, as it seems pretty essential to how lazy evaluation works. I'm also fairly sure it'll do lots of optimisations and short cuts, so if you straightforwardly call a function of 5 arguments like f 1 2 3 4 5 it won't go through all the hassle of copying the body of f 5 times with successively more "hard-coding".
Try it out with GHC:
ghc -C Test.hs
This will generate C code in Test.hc
I wrote the following function:
f = (+) 16777217
And GHC generated this:
R1.p[1] = (W_)Hp-4;
*R1.p = (W_)&stg_IND_STATIC_info;
Sp[-2] = (W_)&stg_upd_frame_info;
Sp[-1] = (W_)Hp-4;
R1.w = (W_)&integerzmgmp_GHCziInteger_smallInteger_closure;
Sp[-3] = 0x1000001U;
Sp=Sp-3;
JMP_((W_)&stg_ap_n_fast);
The thing to remember is that in Haskell, partially applying is not an unusual case. There's technically no "last argument" to any function. As you can see here, Haskell is jumping to stg_ap_n_fast which will expect an argument to be available in Sp.
The stg here stands for "Spineless Tagless G-Machine". There is a really good paper on it by Simon Peyton Jones. If you're curious about how the Haskell runtime is implemented, go read that first.
