Does R support function overloading?
I want to do something along the lines of:
g <- function(X, Y) {
  # do something and return something
}
g <- function(X) {
  # do something and return something
}
EDIT, following clarification of the question in comments above:
From a quick glance at this page, it looks like Erlang allows you to define functions that will dispatch completely different methods depending on the arity of their argument list (up to a ..., following which the arguments are optional/don't affect the dispatched method).
To do something like that in R, you'll probably want to use S4 classes and methods. In the S3 system, the method that is dispatched depends solely on the class of the first argument. In the S4 system, the method that's called can depend on the classes of an arbitrary number of arguments.
For one example of what's possible, try running the following. It requires you to have installed both the raster package and the sp package. Between them, they provide a large number of functions for plotting both raster and vector spatial data, and both of them use the S4 system to perform method dispatch. Each of the lines returned by the call to showMethods() corresponds to a separate function, which will be dispatched when plot() is passed x and y arguments that have the indicated classes (which can include being entirely "missing").
> library(raster)
> showMethods("plot")
Function: plot (package graphics)
x="ANY", y="ANY"
x="Extent", y="ANY"
x="Raster", y="Raster"
x="RasterLayer", y="missing"
x="RasterStackBrick", y="ANY"
x="Spatial", y="missing"
x="SpatialGrid", y="missing"
x="SpatialLines", y="missing"
x="SpatialPoints", y="missing"
x="SpatialPolygons", y="missing"
R sure does. Try, for example:
plot(x = 1:10)
plot(x = 1:10, y = 10:1)
And then go have a look at how the function accomplishes that, by typing plot.default.
In general, the best way to learn how to implement this kind of thing yourself will be to spend some time poking around in the code used to define functions whose behavior is already familiar to you.
Then, if you want to explore more sophisticated forms of method dispatch, you'll want to look into both the S3 and S4 class systems provided by R.
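A few standard tools help with that kind of poking around; for example (plot.data.frame here is just one illustration of a method that isn't exported):
plot.default                 # print a function's source by typing its name
methods(plot)                # list the methods defined for a generic
getAnywhere(plot.data.frame) # inspect a method even if it isn't exported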
This is usually best done through optional arguments. For example:
g <- function(X, Y=FALSE) {
  if (Y == FALSE) {
    # do something
  } else {
    # do something else
  }
}
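A runnable variant of that skeleton (using NULL as the default instead of FALSE is my own assumption; it avoids ambiguity when a caller legitimately passes FALSE):
g <- function(X, Y = NULL) {
  if (is.null(Y)) {
    X     # one-argument behaviour
  } else {
    X + Y # two-argument behaviour
  }
}
g(1)    # [1] 1
g(1, 2) # [1] 3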
Check out the missing() function in R. For the function to still run, you need to assign values to the missing arguments before running the rest of the function. For example, this code:
overload = function(x, y) {
  if (missing(y)) {
    y = FALSE
  }
  if (y == FALSE) {
    print("One variable provided")
  } else {
    print("Two variables provided")
  }
}
overload(1)
overload(1, 2)
Will return:
> overload(1)
[1] "One variable provided"
> overload(1, 2)
[1] "Two variables provided"
Lastly, the missing() function is only reliable if you haven't altered the argument in question inside the function.
Many great R packages exist. However, they often use slightly different names for the same behaviour. As I often use different packages, the different names get in my way. Thus, I would like to extend the original package by adding local functions. E.g.
in the package "rethinking" we use the function "extract.samples()" to obtain the samples from the posterior distribution.
in the package "rstanarm" we use the function "as.matrix()" instead.
It would be nice to add the function "extract.samples()" to my local repository and to define that it is called only if the input parameter is an "rstanarm" object. Thus, I really would like to extend the package: if I load "rethinking", "rethinking::extract.samples()" is used, and if I load "rstanarm", the function "rstanarm::extract.samples()" is used.
What I currently do is the following:
extract.samples = function(object, n=1000, ...){
  if ( class(object)[[1]] == 'stanreg' ){
    # rstanarm object:
    SIMULATIONS = as.matrix(object)
  } else if ( attr(class(object), "package") == 'rethinking' ){
    SIMULATIONS = rethinking::extract.samples(object, n=n, ...)
  }
  return(invisible( SIMULATIONS ))
}
Thus, I explicitly have to take care of all the possible objects and parameter settings. This becomes messy if a third package defines the function "extract.samples", or if the two packages use different parameters. I wonder if there is a more robust method.
The proper way to do this is to create your own package that exports a generic function and several methods for it. If you can edit rethinking, then just do that, otherwise create a third package. I'll assume you're doing that.
Here's what the code could look like:
extract.samples <- function(x, ...) {
  UseMethod("extract.samples")
}

extract.samples.stanreg <- function(x, ...) {
  as.matrix(x, ...)
}

extract.samples.default <- function(x, ...) {
  rethinking::extract.samples(x, ...)
}
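With those three definitions in place, dispatch happens automatically on the class of the first argument. A quick sketch of the intended usage (assuming rstanarm is installed; the model itself is just for illustration):
fit <- rstanarm::stan_glm(mpg ~ wt, data = mtcars, refresh = 0)
draws <- extract.samples(fit) # class 'stanreg', so extract.samples.stanreg() runs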
I was trying to overload + operator and ran into incompatibility issues. Even though + is an S3 generic function, it seems to look at both arguments (similar to a multiple dispatch) instead of just the left argument like other S3 generic functions (see Groups "Ops" at http://www.inside-r.org/r-doc/base/summary). So, if two different + functions are defined for the arguments, R issues a warning and falls back to + for two numeric values.
Here is an example:
myType1 <- function(obj) {
  structure(obj, class = "myType1")
} # function myType1

`+.myType1` <- function(obj1, obj2) {
  return(obj1)
} # function +.myType1

myType2 <- function(obj) {
  structure(obj, class = "myType2")
} # function myType2

`+.myType2` <- function(obj1, obj2) {
  return(obj2)
} # function +.myType2
myType1("A") + 1 # this works, use defined types seem to have precedence
myType1("A") + myType2(1) # this doesn't
Is there a way to get around this problem? I know that S4 methods support multiple dispatch. Will using S4 help me avoid this problem, even if an S3 + method is defined for one of the arguments?
Thank you so much in advance.
Sincerely,
Junghoon Lee
Is there an R function that lists all the functions in an R script file along with their arguments?
i.e. an output of the form:
func1(var1, var2)
func2(var4, var10)
.
.
.
func10(varA, varB)
Using [sys.]source has the very undesirable side-effect of executing the code inside the file. At worst this has security problems, but even “benign” code may simply have unintended side-effects when executed. At best it just takes unnecessary time (and potentially a lot of it).
It’s actually unnecessary to execute the code, though: it is enough to parse it, and then do some syntactical analysis.
The actual code is trivial:
file_parsed = parse(filename)
functions = Filter(is_function, file_parsed)
function_names = unlist(Map(function_name, functions))
And there you go, function_names contains a vector of function names. Extending this to also list the function arguments is left as an exercise to the reader. Hint: there are two approaches. One is to eval the function definition (now that we know it’s a function definition, this is safe); the other is to “cheat” and just get the list of arguments to the function call.
The implementation of the functions used above is also not particularly hard. There’s probably even something already in R core packages (‘utils’ has a lot of stuff) but since I’m not very familiar with this, I’ve just written them myself:
is_function = function (expr) {
  if (! is_assign(expr)) return(FALSE)
  value = expr[[3L]]
  is.call(value) && as.character(value[[1L]]) == 'function'
}

function_name = function (expr) {
  as.character(expr[[2L]])
}

is_assign = function (expr) {
  is.call(expr) && as.character(expr[[1L]]) %in% c('=', '<-', 'assign')
}
This correctly recognises function declarations of the forms
f = function (…) …
f <- function (…) …
assign('f', function (…) …)
It won’t work for more complex code, since assignments can be arbitrarily complex and in general are only resolvable by actually executing the code. However, the three forms above probably account for ≫ 99% of all named function definitions in practice.
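Following the hint above, a minimal sketch of the “cheat” approach to extracting the arguments (the helper name function_args is mine, and it only covers the three assignment forms just listed):
function_args = function (expr) {
  fun_def = expr[[3L]] # the `function(...)` call on the right-hand side
  names(fun_def[[2L]]) # names of the pairlist of formal arguments
}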
UPDATE: Please refer to the answer by @Konrad Rudolph instead.
You can create a new environment, source your file in that environment, and then list the functions in it using lsf.str(), e.g.:
test.env <- new.env()
sys.source("myfile.R", envir = test.env)
lsf.str(envir=test.env)
rm(test.env)
or if you want to wrap it as a function:
listFunctions <- function(filename) {
  temp.env <- new.env()
  sys.source(filename, envir = temp.env)
  functions <- lsf.str(envir = temp.env)
  rm(temp.env)
  return(functions)
}
In my R development I need to wrap function primitives in proto objects so that a number of arguments can be automatically passed to the functions when the $perform() method of the object is invoked. The function invocation internally happens via do.call(). All is well, except when the function attempts to access variables from the closure within which it is defined. In that case, the function cannot resolve the names.
Here is the smallest example I have found that reproduces the behavior:
library(proto)

make_command <- function(operation) {
  proto(
    func = operation,
    perform = function(., ...) {
      func <- with(., func) # unbinds proto method
      do.call(func, list(), envir = environment(operation))
    }
  )
}
test_case <- function() {
  result <- 100
  make_command(function() result)$perform()
}

# Will generate error:
# Error in function () : object 'result' not found
test_case()
I have a reproducible testthat test that also outputs a lot of diagnostic output. The diagnostic output has me stumped. By looking up the parent environment chain, my diagnostic code, which lives inside the function, finds and prints the very same variable the function fails to find. See this gist.
How can the environment for do.call be set up correctly?
This was the final answer after an offline discussion with the poster:
make_command <- function(operation) {
  proto(perform = function(.) operation())
}
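With that version, the original test case runs without error: operation() is now called as an ordinary closure, so it finds result in the frame of test_case() through lexical scoping.
test_case <- function() {
  result <- 100
  make_command(function() result)$perform()
}
test_case()
# [1] 100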
I think the issue here is clearer and easier to explore if you:

1. Replace the anonymous function within make_command() with a named one.
2. Make that function open a browser() (instead of trying to get result). That way you can look around to see where you are and what's going on.
Try this, which should clarify the cause of your problem:
test_case <- function() {
  result <- 100
  myFun <- function() browser()
  make_command(myFun)$perform()
}
test_case()
## Then from within the browser:
#
parent.env(environment())
# <environment: 0x0d8de854>
# attr(,"class")
# [1] "proto" "environment"
get("result", parent.env(environment()))
# Error in get("result", parent.env(environment())) :
# object 'result' not found
#
parent.frame()
# <environment: 0x0d8ddfc0>
get("result", parent.frame()) ## (This works, so points towards a solution.)
# [1] 100
Here's the problem. Although you think you're evaluating myFun(), whose environment is the evaluation frame of test_case(), your call to do.call(func, ...) is really evaluating func(), whose environment is the proto environment within which it was defined. After looking for and not finding result in its own frame, the call to func() follows the rules of lexical scoping, and next looks in the proto environment. Neither it nor its parent environment contains an object named result, resulting in the error message you received.
If this doesn't immediately make sense, you can keep poking around within the browser. Here are a few further calls you might find helpful:
environment(get("myFun", parent.frame()))
ls(environment(get("myFun", parent.frame())))
environment(get("func", parent.env(environment())))
ls(environment(get("func", parent.env(environment()))))
I was just wondering if there is a way to force a function to only accept certain data types, without having to check for them within the function. Or is this not possible because R's type-checking is done at runtime (as opposed to programming languages such as Java, where type-checking is done during compilation)?
For example, in Java, you have to specify a data type:
class t2 {
  public int addone(int n) {
    return n + 1;
  }
}
In R, a similar function might be
addone <- function(n)
{
  return(n+1)
}
but if a vector is supplied, a vector will (obviously) be returned. If you only want a single integer to be accepted, then is the only way to do to have a condition within the function, along the lines of
addone <- function(n)
{
  if(is.vector(n) && length(n)==1)
  {
    return(n+1)
  } else
  {
    return("You must enter a single integer")
  }
}
Thanks,
Chris
This is entirely possible using S3 classes. Your example is somewhat contrived in the context of R, since I can't think of a practical reason why one would want to create a class of a single value. Nonetheless, this is possible. As an added bonus, I demonstrate how the function addone can be used to add the value of one to numeric vectors (trivial) and character vectors (so "A" turns to "B", etc.):
Start by creating a generic S3 method for addone, utilising the S3 dispatch mechanism UseMethod:
addone <- function(x){
  UseMethod("addone", x)
}
Next, create the contrived class single, defined as the first element of whatever is passed to it:
as.single <- function(x){
  ret <- unlist(x)[1]
  class(ret) <- "single"
  ret
}
Now create methods to handle the various classes. The default method will be called unless a specific class is defined:
addone.default <- function(x) x + 1
addone.character <- function(x) rawToChar(as.raw(as.numeric(charToRaw(x)) + 1))
addone.single <- function(x) x + 1
Finally, test it with some sample data:
addone(1:5)
[1] 2 3 4 5 6
addone(as.single(1:5))
[1] 2
attr(,"class")
[1] "single"
addone("abc")
[1] "bcd"
Some additional information:
Hadley's devtools wiki is a valuable source of information on all things, including the S3 object system.
The S3 mechanism doesn't provide strict typing. It can quite easily be abused. For stricter object orientation, have a look at S4 classes, reference-based classes, or the proto package for prototype-based programming.
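For comparison, here is a minimal sketch of what stricter typing looks like with S4 (the generic name addoneStrict is my own, made up for this illustration): if no method matches the argument's class, dispatch fails with an error instead of silently proceeding.
setGeneric("addoneStrict", function(n) standardGeneric("addoneStrict"))
setMethod("addoneStrict", signature(n = "numeric"), function(n) n + 1)

addoneStrict(1)   # [1] 2
addoneStrict("a") # Error: unable to find an inherited method ...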
You could write a wrapper like the following:
check.types = function(classes, func) {
  n = as.name
  params = formals(func)
  param.names = lapply(names(params), n)
  handler = function() { }
  formals(handler) = params
  checks = lapply(seq_along(param.names), function(I) {
    as.call(list(n('assert.class'), param.names[[I]], classes[[I]]))
  })
  body(handler) = as.call(c(
    list(n('{')),
    checks,
    list(as.call(list(n('<-'), n('.func'), func))),
    list(as.call(c(list(n('.func')), lapply(param.names, as.name))))
  ))
  handler
}

assert.class = function(x, cls) {
  stopifnot(cls %in% class(x))
}
And use it like
f = check.types(c('numeric', 'numeric'), function(x, y) {
  x + y
})

> f(1, 2)
[1] 3
> f("1", "2")
Error: cls %in% class(x) is not TRUE
Made somewhat inconvenient by R not having decorators. This is kind of hacky, and it suffers from some serious problems:

1. You lose lazy evaluation, because you must evaluate an argument to determine its type.
2. You still can't check the types until call time; real static type checking lets you check the types even of a call that never actually happens.

Since R uses lazy evaluation, (2) might make type checking not very useful, because the call might not actually occur until very late, or never. The answer to (2) would be to add static type information. You could probably do this by transforming expressions, but I don't think you want to go there.
I've found stopifnot() to be highly useful for these situations as well.
x <- function(n) {
  stopifnot(is.vector(n) && length(n)==1)
  print(n)
}
The reason it is so useful is because it provides a pretty clear error message to the user if the condition is false.
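For example, calling the function above with an invalid argument reports the exact condition that failed:
x(1)   # [1] 1
x(1:2) # Error: is.vector(n) && length(n) == 1 is not TRUE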