How to specify FUN used in by( ) or related apply( ) functions - r

In a by() function, I will use cor (correlation) to be the FUN there. However, I'd like to setup use="complete.obs" too.
I don't know how to pass this argument in the FUN = cor part.
For example,
by(data, INDICES=list(data$Age), FUN=cor)

probably
by(data, INDICES=list(data$Age), FUN=cor, use = "complete.obs")
will work.
the arguments to by are passed to FUN.

If you start looking around at various R help files for functions like by, you may start to notice a curious 'argument' popping up over and over again: .... You're going to see an ellipsis listed along with all the other arguments to a function.
This is actually an argument itself. It will collect any other arguments you pass and hand them off to subsequent functions called later. The documentation will usually tell you what function these arguments will be handed to.
In this case, in ?by we see this:
... further arguments to FUN.
This means that any other arguments you pass to by that don't match the ones listed will be handed off to the function you pass to FUN.
Another common instance can be found in plot, where the documentation only lists two specific arguments, x and y. Then there's the ... which gathers up anything else you pass to plot and hands it off to methods or to par to set graphical parameter settings.
So in #kohske's example, use = "complete.obs" will be automatically passed on the cor, since it doesn't match any of the other arguments for by.

#kohske and #joran give equivalent answers showing built in features of by (which are also present in apply and the entire plyr family) for passing additional arguments to the supplied function since this is a common application/problem. #Tomas also shows another way to specify an anonymous function which is just a function that calls the "real" function with certain parameters fixed. Fixing parameters to a function call (to effectively make a function with fewer arguments) is a common approach, especially in functional approaches to programming; in that context it is called currying or partial application.
library("functional")
by(data, INDICES=list(data$Age), FUN=Curry(cor, use = "complete.obs"))
This approach can be used when one function does not use ... to "pass along" arguments, and you want to indicate the only reason that an anonymous function is needed is to specify certain arguments.

In general, you have 2 possibilities:
1) specify the arguments in the calling function (tapply() or by() in this case). This also works even if the key argument to fun() is not the first one:
fun <- function(arg1, arg2, arg3) { ... } # just to see how fun() looks like
tapply(var1, var2, fun, arg1 = something, arg3 = something2)
# arg2 will be filled by tapply
2) you may write your wrapper function (sometimes this is needed):
tapply(var1, var2, function (x) { fun(something, x, something2) })

Related

Function taking Vectors and Scalars

I have a function that takes a vector
function foo(x::Vector{Int64})
x
end
How can I make it work also for scalars (i.e. turn them into one-element vectors)?
I know that I can do this:
foo(x::Int64) = foo([x])
but this stops being cool when there are more arguments, because you're writing multiple methods to achieve only one thing.
I think I something like foo(x::Union{Int64, Vector{Int64}}), but I don't know where or how it works or if it is the right thing to do.
Can anyone help?
You can make a helper function which either converts or does nothing. Then the main function can accept any combination:
_vec(x::Number) = [x]
_vec(x::AbstractVector) = x
function f(x, y, z) # could specify ::Union{Number, AbstractVector}
xv = _vec(x)
yv = _vec(y)
...
end
The ... could do the actual work, or could call f(xv, yv, zv) where another method f(x::AbstractVector, y::AbstractVector, z::AbstractVector) does the work --- whichever seems cleaner.
The main time this comes up is if the version of your function for vectors does the same thing for all of it's elements. In this case, what you want to do is define f(x::Int), and use broadcasting f.([1,2,3]) for the vector case.

What is the recommended pattern to design a function that can implement multiple algorithms for S4 class in R?

I have an object of class S4 "MyOb" and a generic function "MyFun". I would like to implement multiple different algorithms for MyFun to process MyOb and be able to select the algorithm I want by specifying the "type" in the generic function. Type would be an argument of MyFun and would be just a character (string): "Algo1", "Algo2"...
However, each algorithm would require different arguments. I have started has indicated in the code below but then I am not sure how to continue, should I have a switch in the setMethod function that redirect to other separate functions ?
setGeneric("MyFun", function(x, type, ...) standardGeneric("MyFun"))
setMethod("MyFun", c("MyOb", "character"), function(x, type, ...){
switch()??? #to Algo1, Algo2, ....
})
Algo1<-function(x, M, N){ #blabla }
Algo2<-function(x, F, G, H){ #blablabla }
Ideally, I like to end up with something like the function baseline in the baseline R package, with
MyFun.Algo1, MyFun.Algo2 being the different function and MyFun the generic one...
I have been looking for this type of pattern but could not find any tutorial...
Any hint, advice, recommendation would be appreciated!
Thank you very much!
Firstly, you probably want to only have x as the signature of your function (you don't want a different method based on the class of type, for example). So you should start with
setGeneric("MyFun", function(x, type, ...) standardGeneric("MyFun"), signature="x")
(you don't even have to have type amongst the arguments to the generic if you don't want to — it depends whether it would be used for other classes of input.)
If you need different other arguments for later algorithms, that is fine. The ... sorts this out for you. So if you have your two algorithms
Algo1<-function(x, M, N){ #blabla }
Algo2<-function(x, F, G, H){ #blablabla }
then these will get called correctly when you call
MyFun(x,type="Algo1",M=1,N=2) ## dispatched Algo1 with M=1 and N=2
MyFun(x,type="Algo2",F=3,G=4,H=-2.7) ## dispatches Algo2 with F, G and H
The recommended way to write the MyFun method is as follows (you were right with your intuition to use switch):
setMethod("MyFun",signature(x="MyObj"), function(x,type=c("Algo1","Algo2"),...){
type <- match.arg(type)
switch(type,
Algo1=Algo1(x,...),
Algo2=Algo2(x,...),
stop("unknown algorithm")
)
})
It would probably be wise to make sure that Algo1 and Algo2 do some argument checking to make sure they are receiving the arguments they expect. This is good programming practice in general, but perhaps more important here.
If you haven't come across match.arg before, it's the recommended way of ensuring an argument matches one of a defined set of values. It uses the default argument as the list of allowed values.

If function(x) can work, why would we need function()?

I understand how "function(x)" works, but what is the role of "function()" here?
z <- function() {
y <- 2
function(x) {
x + y
}
}
function is a keyword which is part of the creation of a function (in the programming sense that Gilles describes in his answer). The other parts are the argument list (in parentheses) and the function body (in braces).
In your example, z is a function which takes no arguments. It returns a function which takes 1 argument (named x) (since R returns the last evaluated statement as the return value by default). That function returns its argument x plus 2.
When z is called (with no arguments: z()) it assigns 2 to y (inside the functions variable scope, an additional concept that I'm not going to get into). Then it creates a function (without a name) which takes a single argument named x, which, when itself called, returns its argument x plus 2. That anonymous function is returned from the call to z and, presumably, stored so that it can be called later.
See https://github.com/hadley/devtools/wiki/Functions and https://github.com/hadley/devtools/wiki/Functionals for more discussion on passing around functions as objects.
The word “function” means somewhat different things in mathematics and in programming. In mathematics, a function is a correspondence between each possible value of the parameters and a result. In programming, a function is a sequence of instructions to compute the result from the parameters.
In mathematics, a function with no argument is a constant. In programming, this is not the case, because functions can have side effects, such as printing something. So you will encounter many functions with no arguments in programs.
Hre the function function(x) { x + y } depends on the variable y. There are no side effects, so this function is very much like the mathematical function defined by $f(x) = x + y$. However, this definition is only complete for a given value of y. The previous instruction sets y to 2, so
function() {
y <- 2
function(x) {
x + y
}
}
is equivalent to
function () {
function(x) {
x + 2
}
}
in the sense that both definitions produce the same results when applied to the same value. They are, however, computed in slightly different ways.
That function is given the name z. When you call z (with no argument, so you write z()), this builds the function function (x) { x + 2 }, or something equivalent: z() is a function of one argument that adds 2 to its argument. So you can write something like z()(3) — the result is 5.
This is obviously a toy example. As you progress in your lectures, you'll see progressively more complex examples where such function building is mixed with other features to achieve useful things.
With some help I've picked out a few examples of functions without formal arguments to help you understand why they could be useful.
Functions which have side-effects
plot.new() for instance, initializes a graphics device.
Want to update the console buffer? flush.console() has your back.
Functions which have a narrow purpose
This is probably the majority of the cases.
Want to know the date/time? Call date().
Want to know the version of R? Call getRversion().

Using outer() with a multivariable function

Suppose you have a function f<- function(x,y,z) { ... }. How would you go about passing a constant to one argument, but letting the other ones vary? In other words, I would like to do something like this:
output <- outer(x,y,f(x,y,z=2))
This code doesn't evaluate, but is there a way to do this?
outer(x, y, f, z=2)
The arguments after the function are additional arguments to it, see ... in ?outer. This syntax is very common in R, the whole apply family works the same for instance.
Update:
I can't tell exactly what you want to accomplish in your follow up question, but think a solution on this form is probably what you should use.
outer(sigma_int, theta_int, function(s,t)
dmvnorm(y, rep(0, n), y_mat(n, lambda, t, s)))
This calculates a variance matrix for each combination of the values in sigma_int and theta_int, uses that matrix to define a dennsity and evaluates it in the point(s) defined in y. I haven't been able to test it though since I don't know the types and dimensions of the variables involved.
outer (along with the apply family of functions and others) will pass along extra arguments to the functions which they call. However, if you are dealing with a case where this is not supported (optim being one example), then you can use the more general approach of currying. To curry a function is to create a new function which has (some of) the variables fixed and therefore has fewer parameters.
library("functional")
output <- outer(x,y,Curry(f,z=2))

What is this functional "pattern" called?

I was fooling around with some functional programming when I came across the need for this function, however I don't know what this sort of thing is called in standard nomenclature.
Anyone recognizes it?
function WhatAmIDoing(args...)
return function()
return args
end
end
Edit: generalized the function, it takes a variable amount of arguments ( or perhaps an implicit list) and returns a function that when invoked returns all the args, something like a curry or pickle, but it doesn't seem to be either.
WhatAmIDoing is a higher-order function because it is a function that returns another function.
The thing that it returns is a thunk — a closure created for delayed computation of the actual value. Usually thunks are created to lazily evaluate an expression (and possibly memoize it), but in other cases, a function is simply needed in place of a bare value, as in the case of "constantly 5", which in some languages returns a function that always returns 5.
The latter might apply in the example given, because assuming the language evaluates in applicative-order (i.e. evaluates arguments before calling a function), the function serves no other purpose than to turn the values into a function that returns them.
WhatAmIDoing is really an implementation of the "constantly" function I was describing. But in general, you don't have to return just args in the inner function. You could return "ackermann(args)", which could take a long time, as in...
function WhatAmIDoing2(args...)
return function()
return ackermann(args)
end
end
But WhatAmIDoing2 would return immediately because evaluation of the ackermann function would be suspended in a closure. (Yes, even in a call-by-value language.)
In functional programming a function that takes another function as an argument or returns another function is called a higher-order function.
I would say that XXXX returns a closure of the unnamed function bound on the values of x,y and z.
This wikipedia article may shed some light
Currying is about transforming a function to a chain of functions, each taking only one parameter and returning another such function. So, this example has no relation to currying.
Pickling is a term ususally used to denote some kind of serialization. Maybe for storing a object built from multiple values.
If the aspect interesting to you is that the returned function can access the arguments of the XXXX function, then I would go with Remo.D.
As others have said, it's a higher-order function. As you have "pattern" in your question, I thought I'd add that this feature of functional languages is often modelled using the strategy pattern in languages without higher-order functions.
Something very similar is called constantly in Clojure:
http://github.com/richhickey/clojure/blob/ab6fc90d56bfb3b969ed84058e1b3a4b30faa400/src/clj/clojure/core.clj#L1096
Only the function that constantly returns takes an arbitrary amount of arguments, making it more general (and flexible) than your pattern.
I don't know if this pattern has a name, but would use it in cases where normally functions are expected, but all I care for is that a certain value is returned:
(map (constantly 9) [1 2 3])
=> (9 9 9)
Just wondering, what do you use this for?
A delegate?
Basically you are returning a function?? or the output of a function?
Didn't understand, sorry...

Resources