How to write a generic function with two inputs?

I am a newbie in programming, and I have run into an issue with R's generic functions: how do I write one when there are multiple inputs?
As a simple example, take the following data and function:
z <- c(2,3,4,5,8)
calc.simp <- function(a,x){a*x+8}
# Test the function:
calc.simp(x=z,a=3)
[1] 14 17 20 23 32
Now I change the class of z:
class(z) <- 'simp'
How should I write the generic function 'calc' as there are two inputs?
My attempts and errors are below:
calc <- function(x) UseMethod('calc',x)
calc(x=z)
Error in calc.simp(x = z) : argument "a" is missing, with no default
And
calc <- function(x,y) UseMethod('calc',x,y)
Error in UseMethod("calc", x, y) : unused argument (y)
My confusion might be a fundamental one as I am just a beginner. Please help! Thank you very much!

I'd suggest you model your generic function on the template used by innumerable base R functions, such as mean:
> mean
function (x, ...)
UseMethod("mean")
In your case, that would translate to the following generic which (if I understand your question correctly) works just fine:
calc <- function(x, ...) UseMethod('calc')
calc.simp <- function(a, x) {
  x <- unclass(x)  # drop the "simp" class so we return a plain numeric vector
  a * x + 8
}
## Try it out
z <- c(2,3,4,5,8)
class(z) <- "simp"
calc.simp(x = z, 10)
## [1] 28 38 48 58 88
calc(x = z, 10)
## [1] 28 38 48 58 88
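If you also want calc() to work on plain, unclassed vectors, you can add a default method; UseMethod() falls back to it when no class-specific method matches. This is a minimal sketch (calc.default is my own illustration, not part of the original answer):
## Hypothetical default method, used when x has no class-specific method
calc.default <- function(x, a, ...) {
  a * x + 8
}
calc(c(2,3,4,5,8), a = 3)
## [1] 14 17 20 23 32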

Related

Is there a way to use do.call without explicitly providing arguments

Part of a custom function I am trying to create allows the user to provide a function as a parameter. For example
#Custom function
result <- function(.func){
  do.call(.func, list(x,y))
}
#Data
x <- 1:2
y <- 0:1
#Call function
result(.func = function(x,y){ sum(x, y) })
However, the code above assumes that the user is providing a function with arguments x and y. Is there a way to use do.call (or something similar) so that the user can provide a function with different arguments? I think that the correct solution might be along the lines of:
#Custom function
result <- function(.func){
  do.call(.func, formals(.func))
}
#Data
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
#Call function
result(.func = function(m,n){ sum(m, n) })
result(.func = function(x,y,z){ sum(x,y,z) })
But this is not it.
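Why the formals() attempt above fails: formals() only returns the argument names and their default values, not the values of m and n in the calling environment, so do.call() never sees the caller's data. A quick check:
## formals() knows the argument names and defaults, not the caller's values
names(formals(function(m, n) sum(m, n)))
## [1] "m" "n"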
1) Use formals/names/mget to get the values in a list. An optional argument, envir, lets the user specify the environment in which the variables are located, so the function knows where to look. If not specified, it defaults to the parent frame, i.e. the caller.
result1 <- function(.func, envir = parent.frame()) {
  do.call(.func, mget(names(formals(.func)), envir))
}
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
result1(.func = function(m,n) sum(m, n) )
## [1] 9
result1(.func = function(x,y,z) sum(x,y,z) )
## [1] 14
result1(function(Time, demand) Time + demand, list2env(BOD))
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
1a) Another possibility is to evaluate the body. This also works if envir is specified as a data frame whose columns are to be looked up.
result1a <- function(.func, envir = parent.frame()) {
  eval(body(.func), envir)
}
result1a(.func = function(m,n) sum(m, n) )
## [1] 9
result1a(.func = function(x,y,z) sum(x,y,z) )
## [1] 14
result1a(function(Time, demand) Time + demand, BOD)
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
2) Another design which is even simpler is to provide a one-sided formula interface. Formulas have environments so we can use that to look up the variables.
result2 <- function(fo, envir = environment(fo)) eval(fo[[2]], envir)
result2(~ sum(m, n))
## [1] 9
result2(~ sum(x,y,z))
## [1] 14
result2(~ Time + demand, BOD)
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
3) Even simpler yet is to just pass the result of the computation as an argument.
result3 <- function(x) x
result3(sum(m, n))
## [1] 9
result3(sum(x,y,z))
## [1] 14
result3(with(BOD, Time + demand))
## [1] 9.3 12.3 22.0 20.0 20.6 26.8
This works.
#Custom function
result <- function(.func){
  do.call(.func, lapply(formalArgs(.func), as.name))
}
#Data
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
#Call function
result(.func = function(m,n){ sum(m, n) })
result(.func = function(x,y,z){ sum(x,y,z) })
This seems like a bit of a pointless function, since the examples in your question imply that what you are trying to do is evaluate the body of the passed function using variables in the calling environment. You can certainly do this easily enough:
result <- function(.func){
  eval(body(.func), envir = parent.frame())
}
This gives the expected results from your examples:
x <- 1:2
y <- 0:1
result(.func = function(x,y){ sum(x, y) })
#> [1] 4
and
m <- 1:3
n <- 0:2
x <- 1:2
y <- 0:1
z <- c(4,6)
result(.func = function(m,n){ sum(m, n) })
#> [1] 9
result(.func = function(x,y,z){ sum(x,y,z) })
#> [1] 14
But note that, when the user types:
result(.func = function(x,y){ ...user code... })
They get the same result they would already get if they didn't use your function and simply typed
...user code....
You could argue that it would be helpful with a pre-existing function like mean.default:
x <- 1:10
na.rm <- TRUE
trim <- 0
result(mean.default)
#> [1] 5.5
But this means users have to name their variables as the parameters being passed to the function, and this is just a less convenient way of calling the function.
It might be useful if you could demonstrate a use case where what you are proposing doesn't make the user's code longer or more complex.
You could also use ..., but like the other responses, I don't quite see the value, or perhaps I don't fully understand the use-case.
result <- function(.func, ...){
  do.call(.func, list(...))
}
Create a function:
f1 <- function(a, b) sum(a, b)
Pass f1 and values to result():
result(f1, m, n)
Output:
[1] 9
Here is how I would do it based on your clarifying comments.
Basically, since you say your function will take a data.frame as input, the function you are asking for essentially just reverses the order of the arguments to do.call(), which takes a function first and then a list of arguments. A data.frame is just a special form of list in which all elements (columns) are vectors of equal length (the number of rows).
result <- function(.data, .func) {
  # .data is a data.frame, which is a list of argument vectors of equal length
  do.call(.func, .data)
}
result(data.frame(a=1, b=1:5), function(a, b) a * b)
result(data.frame(c=1:10, d=1:10), function(c, d) c * d)
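Since the data frame is handed straight to do.call(), any data frame whose column names match the function's arguments will work; for instance, with the built-in BOD data set used in the earlier answers:
## Column names Time and demand are matched to the arguments of the same name
result(BOD, function(Time, demand) Time + demand)
## [1]  9.3 12.3 22.0 20.0 20.6 26.8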

Understanding the source code of the "ave" function

Here is the source code for the "ave" function in R:
function (x, ..., FUN = mean)
{
    if (missing(...))
        x[] <- FUN(x)
    else {
        g <- interaction(...)
        split(x, g) <- lapply(split(x, g), FUN)
    }
    x
}
I am having trouble understanding how the assignment split(x, g) <- lapply(split(x, g), FUN) works. Consider the following example:
# Overview: function inputs and outputs
> x = 10*1:6
> g = c('a', 'b', 'a', 'b', 'a', 'b')
> ave(x, g)
[1] 30 40 30 40 30 40
# Individual components of "split" assignment
> split(x, g)
$a
[1] 10 30 50
$b
[1] 20 40 60
> lapply(split(x, g), mean)
$a
[1] 30
$b
[1] 40
# Examine "x" before and after assignment
> x
[1] 10 20 30 40 50 60
> split(x, g) <- lapply(split(x, g), mean)
> x
[1] 30 40 30 40 30 40
Questions:
• Why does the assignment, "split(x,g) <- lapply(split(x,g), mean)", directly modify x? Does "<-" always modify the first argument of a function, or is there some other rule for this?
• How does this assignment even work? Both the "split" and "lapply" statements have lost the original ordering of x. They are also length 2. How do you end up with a vector of length(x) that matches the original ordering of x?
This is a tricky one. <- usually does not work in this way. What is actually happening is that you are not calling split(), you are calling a replacement function called split<-(). The documentation of split says
[...] The replacement forms replace values corresponding to such a division. unsplit reverses the effect of split.
See also this answer
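To make the replacement-function mechanism concrete, here is a small sketch (second<- is purely illustrative): any function whose name ends in <- can be used on the left-hand side of an assignment, and R effectively rewrites second(x) <- 99 as x <- `second<-`(x, value = 99).
## A toy replacement function: "second(x) <- value" replaces the 2nd element
`second<-` <- function(x, value) {
  x[2] <- value
  x
}
x <- c(10, 20, 30)
second(x) <- 99   # effectively: x <- `second<-`(x, value = 99)
x
## [1] 10 99 30
split<-() works the same way, except that its value is a list of replacements, one per group, which is how ave() writes the group means back into the original positions of x.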

Creating a function in R with a variable number of arguments

When creating a function in R, we usually specify the number of arguments, like:
function(x,y){
}
That means it takes only two arguments. But when the number of arguments is not specified (in one case I have to use two arguments, but in another case I have to use three or more), how can we handle this? I am pretty new to programming, so an example would be greatly appreciated.
d <- function(...){
  x <- list(...)  # this will be a list storing all the arguments
  sum(...)        # example of an inbuilt function
}
d(1,2,3,4,5)
[1] 15
You can use ... to accept an arbitrary number of additional arguments. For example:
myfun <- function(x, ...) {
  for(i in list(...)) {
    print(x * i)
  }
}
> myfun(4, 3, 1)
[1] 12
[1] 4
> myfun(4, 9, 1, 0, 12)
[1] 36
[1] 4
[1] 0
[1] 48
> myfun(4)
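Another common use of ... (and often the reason it appears in a function signature) is simply to forward the extra arguments on to another function. A minimal sketch, where mysummary is just an illustrative name:
## Forward any extra arguments straight on to mean()
mysummary <- function(x, ...) {
  mean(x, ...)
}
> mysummary(c(1, 2, NA, 4))                # NA propagates
[1] NA
> mysummary(c(1, 2, NA, 4), na.rm = TRUE)  # na.rm is passed through to mean()
[1] 2.333333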

Function parameter as argument in an R function

I am attempting to write a general function to calculate coverage probabilities for interval estimation of Binomial proportions in R. I intend to do this for a variety of confidence interval methods e.g. Wald, Clopper-Pearson, HPD intervals for varying priors.
Ideally, I would like there to be one function, that can take, as an argument, the method that should be used to calculate the interval. My question then: how can I include a function as an argument in another function?
As an example, for the Exact Clopper-Pearson interval I have the following function:
# Coverage for Exact interval
ExactCoverage <- function(n) {
  p <- seq(0,1,.001)
  x <- 0:n
  # value of dist
  dist <- sapply(p, dbinom, size=n, x=x)
  # interval
  int <- Exact(x,n)
  # indicator function
  ind <- sapply(p, function(x) cbind(int[,1] <= x & int[,2] >= x))
  list(coverage = apply(ind*dist, 2, sum), p = p)
}
Where Exact(x,n) is just a function to calculate the appropriate interval. I would like to have
Coverage <- function(n, FUN, ...)
...
# interval
int <- FUN(...)
so that I have one function to calculate the coverage probabilities rather than a separate coverage function for each method of interval calculation. Is there a standard way to do this? I have not been able to find an explanation.
Thanks,
James
In R, a function can be passed as an argument to another function. The syntax is the same as for non-function objects.
Here is an example function.
myfun <- function(x, FUN) {
  FUN(x)
}
This function applies the function FUN to the object x.
A few examples with a vector containing the numbers 1 to 10:
vec <- 1:10
> myfun(vec, mean)
[1] 5.5
> myfun(vec, sum)
[1] 55
> myfun(vec, diff)
[1] 1 1 1 1 1 1 1 1 1
This is not limited to built-in functions, but works with any function:
> myfun(vec, function(obj) sum(obj) / length(obj))
[1] 5.5
mymean <- function(obj){
  sum(obj) / length(obj)
}
> myfun(vec, mymean)
[1] 5.5
You can also store a function name as a character variable and call it with do.call():
> test = c(1:5)
> do.call(mean, list(test))
[1] 3
>
> func = 'mean'
> do.call(func, list(test))
[1] 3
Hadley's text gives a great (and simple) example:
randomise <- function(f) f(runif(1e3))
randomise(mean)
#> [1] 0.5059199
randomise(mean)
#> [1] 0.5029048
randomise(sum)
#> [1] 504.245
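Putting this back into the original coverage question: the wrapper can keep everything from ExactCoverage() except the hard-coded call to Exact(), which becomes the supplied FUN. The sketch below assumes, as in the original function, that the interval function takes x and n and returns a two-column matrix of lower and upper limits:
Coverage <- function(n, FUN, ...) {
  p <- seq(0, 1, .001)
  x <- 0:n
  # value of dist
  dist <- sapply(p, dbinom, size = n, x = x)
  # interval, computed by whichever method was supplied
  int <- FUN(x, n, ...)
  # indicator function
  ind <- sapply(p, function(pp) cbind(int[, 1] <= pp & int[, 2] >= pp))
  list(coverage = apply(ind * dist, 2, sum), p = p)
}
# e.g. Coverage(n, Exact) in place of ExactCoverage(n), and likewise for a
# Wald or HPD interval function with the same signature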

Using the ... argument inline

I have the following function
sjbDo <- function(operation, x, statelist, Spos, isFuture = FALSE) {
  # run the operation on x
  xvec <- operation(x);
  # and so on
}
and I could call it like this:
A <- sjbDo( function(x) {x}, statelist$A, statelist, 1)
However, I want to modify sjbDo so that the inline function can take additional arguments. Something like:
kTheta <- sjbDo( function(x, b) {x^b}, statelist$K, statelist, 1, FALSE, b=theta.k)
I tried
sjbDo <- function(operation, x, statelist, Spos, isFuture = FALSE, ...) {
  # run the operation on x
  xvec <- operation(x, ...);
  # and so on
}
But this doesn't seem to work. How can I get this to work?
A more canonical solution would look like:
operation <- function(x, ...) {
  dots <- list(...)
  x^dots[[1]]
}
but if you know enough to say that the argument you want is the first argument passed via ..., then you should make it a named argument, because your code (and mine) won't work when called like this, for example:
> operation(1:10, foo = "bar", b = 2)
Error in x^dots[[1]] : non-numeric argument to binary operator
If you grab ... as I have above, then you can pull out the argument you want if it is named:
operation <- function(x, ...) {
  dots <- list(...)
  want <- which(names(dots) == "b")
  stopifnot(length(want) > 0)
  b <- dots[[want]]
  x^b
}
Which works like this:
> operation(1:10, foo = "bar", b = 2)
[1] 1 4 9 16 25 36 49 64 81 100
but still fails if b is not a named argument:
> operation(1:10, foo = "bar", 2)
Error: length(want) > 0 is not TRUE
So what you have come up with might work in the one use case you present, but it isn't a general strategy for doing what you want to do. What should operation do if there are no extra arguments passed in? Your code assumes there are extra arguments, so they are no longer optional - which is what you indicated they were meant to be. If b should take some default value when none is supplied, then the whole thing becomes easier:
operation <- function(x, b = 1) {
  x^b
}
sjbDo <- function(FUN, x, ...) {
  ## function matching
  FUN <- match.fun(FUN)
  # run the operation on x
  xvec <- FUN(x, ...)
  xvec
}
Which gives:
> sjbDo(operation, 1:10)
[1] 1 2 3 4 5 6 7 8 9 10
> sjbDo(operation, 1:10, b = 2)
[1] 1 4 9 16 25 36 49 64 81 100
> sjbDo("operation", 1:10, b = 2)
[1] 1 4 9 16 25 36 49 64 81 100
The latter works because of the use of match.fun.
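match.fun() accepts either a function or a character string naming one, and resolves the string to the function object; for example:
## match.fun() looks up a function by name
> identical(match.fun("sum"), sum)
[1] TRUE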
The point of the above is that I don't think you want operation() to have a ... argument, because I don't see how such code could possibly work. What I think you want is a way to write the outer function sjbDo() so that it has a few named arguments and passes any other arguments on to the function you want to call within sjbDo(), which I call FUN here and which you called operation.
In other words, what I think you want is a wrapper (sjbDo()) that can call a given function (supplied as argument FUN) with argument x, plus any other arguments that FUN requires, without having to think of all the possible arguments FUN will require?
Oops, I figured it out:
operation <- function(x,...) {x^...[[1]]}
Thanks anyway.
