Create closure using the base R function constructor - r

Using base R, is it possible to construct a closure from its 3 components directly? All I could manage so far was the slightly verbose
val <- 3L
fun_a <- function(x = 1L) val + x
fun_b <- function(x = 2L) val * x
fun_c <- function(){}
formals(fun_c) <- formals(fun_a)
body(fun_c) <- body(fun_b)
environment(fun_c) <- list2env(list(val = 5L))
fun_c()
#> [1] 5
Additionally, I cannot seem to figure out how to call function(). Some of the things I have tried:
`function`(formals(fun_a), body(fun_b))
#> Error: invalid formal argument list for "function"
`function`(as.pairlist(formals(fun_a)), body(fun_b))
#> Error: invalid formal argument list for "function"
do.call(`function`, c(formals(fun_a), body(fun_b)))
#> Error in do.call("function", c(formals(fun_a), body(fun_b))) :
#> invalid formal argument list for "function"
I'm aware of rlang::new_function() but here I'm looking for base R solutions.

I don't think you can do it in R code. The function implementing function is meant to be called from the parser, where
function(x = 3) { x }
is translated into something like
`function`( pairlist(as.name(x) = 3), quote({ x }))
but the line above is not legal in R: there's no way to say that the tag on an element of a list should be a name instead of a character value. You could write your function in C, or stick with your verbose solution of creating a template and replacing parts of it one at a time.

Related

What is the difference between object and .Object in OOP in R?

I'm studying S4 classes and methods and I got confused to know when to use .Object and object (using as an argument to functions on classes). I don't understand if is there any difference between them.
For example, Would be correct:
setGeneric("getTimes",function(object){standardGeneric ("getTimes")})
setMethod("getTimes","Trajectories",
function(object){
return(object#times)
}
)
or:
setGeneric("getTimes",function(.Object){standardGeneric ("getTimes")})
setMethod("getTimes","Trajectories",
function(.Object){
return(.Object#times)
}
)
First, you should avoid the curly braces around {standardGeneric("getTimes")}.
The short answer for your question: there is no difference between the 2 code in your example. You were defining getTimes as a brand new generic function of your own. You can specify its arguments name whatever you like (object, x, xobject, .Object). Then, when you write the methods for the generic function, your methods' arguments name must match with the generic function's arguments name. For example:
setGeneric("getTimes", function(object) standardGeneric("getTimes"))
setMethod("getTimes", "Trajectories", function(object) object#times)
If not follow, there will be error (technically, a warning because R automatically/"silently" correct it. However, in my opinion, R should stop and throw an error in this case):
setGeneric("getTimes", function(object) standardGeneric("getTimes"))
setMethod("getTimes", "Trajectories", function(x) x#times)
# mismatch between `x` argument name in method and `object` argument name in generic
In the case you want to define methods for existing generic, you should use function method.skeleton.
Example 1:
setGeneric("getTimes", function(xobject) standardGeneric("getTimes")) # generic function is defined
getTimes # type function name without parentheses to get a summary of the generic
method.skeleton("getTimes", "Trajectories", stdout())
# copy this method skeleton to your script/source file and modify to your need
Example 2, show is a predefined generic with object as argument (see ?show) or you can type show without parentheses to check. Therefore, setMethod("show", "Trajectories", function(.Object) .Object) will be error. You can proceed using this approach, however, I think method.skeleton is a pretty useful alternative:
> method.skeleton("show", "Trajectories", stdout())
setMethod("show",
signature(object = "Trajectories"),
function (object)
{
stop("need a definition for the method here")
}
)
Example 3, initialize is a generic function and its argument .Object may be defined (type initialize without parentheses to check). From my understanding, the reason .Object is chosen as argument name in this case to invoke the feeling of a prototype object (you can read more at ?initialize). Similarly to Example 2, use the method.skeleton helper function:
> method.skeleton("initialize", "Trajectories", stdout())
setMethod("initialize",
signature(.Object = "Trajectories"),
function (.Object, ...)
{
stop("need a definition for the method here")
}
)
Note: there is a special case for replacement/assignment function (<-), that is its last argument must be named value. Read more. For example:
setClass("Trajectories", slots = c(times = "numeric"))
setGeneric("getTimes", function(x) standardGeneric("getTimes"))
setMethod("getTimes","Trajectories", function(x) x#times)
setGeneric("getTimes<-", function(xobject, value) standardGeneric("getTimes<-"))
setMethod("getTimes<-", c("Trajectories", "ANY"), function(xobject, value) {
xobject#times <- value
xobject
})
# test drive
m <- new("Trajectories", times = 32)
getTimes(m)
getTimes(m) <- 42
getTimes(m)
R will not output any error or warning if you use other name (new_value in below) when defining the generic and accompanying methods. However, when you use it, R will error:
setGeneric("getTimes<-", function(xobject, new_value) standardGeneric("getTimes<-"))
setMethod("getTimes<-", c("Trajectories", "ANY"), function(xobject, new_value) {
xobject#times <- new_value
xobject
})
# test drive
m <- new("Trajectories", times = 32)
getTimes(m)
getTimes(m) <- 42 # error because the right side of <- is always considered as `value` argument

Replacement function as R6 class member function

I have been playing around with R6 ab bit and tried to implement a replacement function (similar in spirit to base::`diag<-`()). I wasn't hugely surprised to learn that the following does not work
library(R6)
r6_class <- R6Class("r6_class",
public = list(
initialize = function(x) private$data <- x,
elem = function(i) private$data[i],
`elem<-` = function(i, val) private$data[i] <- val
),
private = list(
data = NULL
)
)
test <- r6_class$new(1:5)
test$elem(2)
#> [1] 2
test$elem(2) <- 3
#> Error in test$elem(2) <- 3 :
#> target of assignment expands to non-language object
What does this correspond to in prefix notation? All of the following work as expected, so I guess it's none of these
test$`elem<-`(2, 3)
`$`(test, "elem<-")(2, 3)
I'm less interested in possible workarounds, but more in understanding why the above is invalid.
You are allowed to have nested complex assignments, e.g.
names(x)[3] <- "c"
but
test$elem(2) <- 3
is not of that form. It would be legal syntax as
elem(test,2) <- 3
which would expand to
*tmp* <- test
test <- `elem<-`(*tmp*, 2, 3)
but in the original form it would have to expand to
*tmp* <- 2
2 <- `test$elem<-`(*tmp*, 3)
(I've used test$elem<- in backticks to suggest it's the assignment version of the function returned by test$elem. That's not really right, there is no such thing.) The main problem is that the object being modified is 2, so you get the error message you saw: you're not allowed to modify 2.
If you want to do this in R6, I think you could do it something like this. Define a global function
`elem<-` <- function(x, arg, value) x$`elem<-`(arg, value)
and change the definition of your class elem<- method to
`elem<-` = function(i, val) { private$data[i] <- val; self }
Not all that convenient to need two definitions for every assignment method, but it appears to work.

Non-standard evaluation in a user-defined function with lapply or with in R

I wrote a wrapper around ftable because I need to compute flat tables with frequency and percentage for many variables. As ftable method for class "formula" uses non-standard evaluation, the wrapper relies on do.call and match.call to allow the use of the subset argument of ftable (more details in my previous question).
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]))
# etc
}
However, I cannot use this wrapper with lapply nor with:
# example 1: error with "lapply"
lapply(X = warpbreaks[c("breaks",
"wool",
"tension")],
FUN = mytable,
row.vars = 1)
Error in (function (x, ...) : object 'X' not found
# example 2: error with "with"
with(data = warpbreaks[warpbreaks$tension == "L", ],
expr = mytable(wool))
Error in (function (x, ...) : object 'wool' not found
These errors seem to be due to match.call not being evaluated in the right environment.
As this question is closely linked to my previous one, here is a sum up of my problems:
The wrapper with do.call and match.call cannot be used with lapply or with.
The wrapper without do.call and match.call cannot use the subset argument of ftable.
And a sum up of my questions:
How can I write a wrapper which allows both to use the subset argument of ftable and to be used with lapply and with? I have ideas to avoid the use of lapply and with, but I am looking to understand and correct these errors to improve my knowledge of R.
Is the error with lapply related to the following note from ?lapply?
For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g., bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[i]], ...),
with i replaced by the current (integer or double) index. This is not
normally a problem, but it can be if FUN uses sys.call or match.call
or if it is a primitive function that makes use of the call. This
means that it is often safer to call primitive functions with a
wrapper, so that e.g. lapply(ll, function(x) is.numeric(x)) is
required to ensure that method dispatch for is.numeric occurs
correctly.
The problem with using match.call with lapply is that match.call returns the literal call that passed into it, without any interpretation. To see what's going on, let's make a simpler function which shows exactly how your function is interpreting the arguments passed into it:
match_call_fun <- function(...) {
call = as.list(match.call()[-1])
print(call)
}
When we call it directly, match.call correctly gets the arguments and puts them in a list that we can use with do.call:
match_call_fun(iris['Species'], 9)
[[1]]
iris["Species"]
[[2]]
[1] 9
But watch what happens when we use lapply (I've only included the output of the internal print statement):
lapply('Species', function(x) match_call_fun(iris[x], 9))
[[1]]
iris[x]
[[2]]
[1] 9
Since match.call gets the literal arguments passed to it, it receives iris[x], not the properly interpreted iris['Species'] that we want. When we pass those arguments into ftable with do.call, it looks for an object x in the current environment, and then returns an error when it can't find it. We need to interpret
As you've seen, adding envir = parent.frame() fixes the problem. This is because, adding that argument tells do.call to evaluate iris[x] in the parent frame, which is the anonymous function in lapply where x has it's proper meaning. To see this in action, let's make another simple function that uses do.call to print ls from 3 different environmental levels:
z <- function(...) {
print(do.call(ls, list()))
print(do.call(ls, list(), envir = parent.frame()))
print(do.call(ls, list(), envir = parent.frame(2)))
}
When we call z() from the global environment, we see the empty environment inside the function, then the Global Environment:
z()
character(0) # Interior function environment
[1] "match_call_fun" "y" "z" # GlobalEnv
[1] "match_call_fun" "y" "z" # GlobalEnv
But when we call from within lapply, we see that one level of parent.frame up is the anonymous function in lapply:
lapply(1, z)
character(0) # Interior function environment
[1] "FUN" "i" "X" # lapply
[1] "match_call_fun" "y" "z" # GlobalEnv
So, by adding envir = parent.frame(), do.call knows to evaluate iris[x] in the lapply environment where it knows that x is actually 'Species', and it evaluates correctly.
mytable_envir <- function(...) {
tab <- do.call(what = ftable,
args = as.list(match.call()[-1]),
envir = parent.frame())
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
round(x = margin,
digits = 1)
}
# This works!
lapply(X = c("breaks","wool","tension"),
FUN = function(x) mytable_envir(warpbreaks[x],row.vars = 1))
As for why adding envir = parent.frame() makes a difference since that appears to be the default option. I'm not 100% sure, but my guess is that when the default argument is used, parent.frame is evaluated inside the do.call function, returning the environment in which do.call is run. What we're doing, however, is calling parent.frame outside do.call, which means it returns one level higher than the default version.
Here's a test function that takes parent.frame() as a default value:
fun <- function(y=parent.frame()) {
print(y)
print(parent.frame())
print(parent.frame(2))
print(parent.frame(3))
}
Now look at what happens when we call it from within lapply both with and without passing in parent.frame() as an argument:
lapply(1, function(y) fun())
<environment: 0x12c5bc1b0> # y argument
<environment: 0x12c5bc1b0> # parent.frame called inside
<environment: 0x12c5bc760> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
lapply(1, function(y) fun(y = parent.frame()))
<environment: 0x104931358> # y argument
<environment: 0x104930da8> # parent.frame called inside
<environment: 0x104931358> # 1 level up = lapply
<environment: R_GlobalEnv> # 2 levels up = globalEnv
In the first example, the value of y is the same as what you get when you call parent.frame() inside the function. In the second example, the value of y is the same as the environment one level up (inside lapply). So, while they look the same, they're actually doing different things: in the first example, parent.frame is being evaluated inside the function when it sees that there is no y= argument, in the second, parent.frame is evaluated in the lapply anonymous function first, before calling fun, and then is passed into it.
As you only want to pass all the arguments passed to ftable u do not need the do.call().
mytable <- function(...) {
tab <- ftable(...)
prop <- prop.table(x = tab,
margin = 2) * 100
bind <- cbind(as.matrix(x = tab),
as.matrix(x = prop))
margin <- addmargins(A = bind,
margin = 1)
return(round(x = margin,
digits = 1))
}
The following lapply creates a table for every Variable separatly i don't know if that is what you want.
lapply(X = c("breaks",
"wool",
"tension"),
FUN = function(x) mytable(warpbreaks[x],
row.vars = 1))
If you want all 3 variables in 1 table
warpbreaks$newVar <- LETTERS[3:4]
lapply(X = cbind("c(\"breaks\", \"wool\", \"tension\")",
"c(\"newVar\", \"tension\",\"wool\")"),
FUN = function(X)
eval(parse(text=paste("mytable(warpbreaks[,",X,"],
row.vars = 1)")))
)
Thanks to this issue, the wrapper became:
# function 1
mytable <- function(...) {
do.call(what = ftable,
args = as.list(x = match.call()[-1]),
envir = parent.frame())
# etc
}
Or:
# function 2
mytable <- function(...) {
mc <- match.call()
mc[[1]] <- quote(expr = ftable)
eval.parent(expr = mc)
# etc
}
I can now use the subset argument of ftable, and use the wrapper in lapply:
lapply(X = warpbreaks[c("wool",
"tension")],
FUN = function(x) mytable(formula = x ~ breaks,
data = warpbreaks,
subset = breaks < 15))
However I do not understand why I have to supply envir = parent.frame() to do.call as it is a default argument.
More importantly, these methods do not resolve another issue: I can not use the subset argument of ftable with mapply.

R: how to find what S3 method will be called on an object?

I know about methods(), which returns all methods for a given class. Suppose I have x and I want to know what method will be called when I call foo(x). Is there a oneliner or package that will do this?
The shortest I can think of is:
sapply(class(x), function(y) try(getS3method('foo', y), silent = TRUE))
and then to check the class of the results... but is there not a builtin for this?
Update
The full one liner would be:
fm <- function (x, method) {
cls <- c(class(x), 'default')
results <- lapply(cls, function(y) try(getS3method(method, y), silent = TRUE))
Find(function (x) class(x) != 'try-error', results)
}
This will work with most things but be aware that it might fail with some complex objects. For example, according to ?S3Methods, calling foo on matrix(1:4, 2, 2) would try foo.matrix, then foo.numeric, then foo.default; whereas this code will just look for foo.matrix and foo.default.
findMethod defined below is not a one-liner but its body has only 4 lines of code (and if we required that the generic be passed as a character string it could be reduced to 3 lines of code). It will return a character string representing the name of the method that would be dispatched by the input generic given that generic and its arguments. (Replace the last line of the body of findMethod with get(X(...)) if you want to return the method itself instead.) Internally it creates a generic X and an X method corresponding to each method of the input generic such that each X method returns the name of the method of the input generic that would be run. The X generic and its methods are all created within the findMethod function so they disappear when findMethod exits. To get the result we just run X with the input argument(s) as the final line of the findMethod function body.
findMethod <- function(generic, ...) {
ch <- deparse(substitute(generic))
f <- X <- function(x, ...) UseMethod("X")
for(m in methods(ch)) assign(sub(ch, "X", m, fixed = TRUE), "body<-"(f, value = m))
X(...)
}
Now test it. (Note that the one-liner in the question fails with an error in several of these tests but findMethod gives the expected result.)
findMethod(as.ts, iris)
## [1] "as.ts.default"
findMethod(print, iris)
## [1] "print.data.frame"
findMethod(print, Sys.time())
## [1] "print.POSIXct"
findMethod(print, 22)
## [1] "print.default"
# in this example it looks at 2nd component of class vector as no print.ordered exists
class(ordered(3))
## [1] "ordered" "factor"
findMethod(print, ordered(3))
## [1] "print.factor"
findMethod(`[`, BOD, 1:2, "Time")
## [1] "[.data.frame"
I use this:
s3_method <- function(generic, class, env = parent.frame()) {
fn <- get(generic, envir = env)
ns <- asNamespace(topenv(fn))
tbl <- ns$.__S3MethodsTable__.
for (c in class) {
name <- paste0(generic, ".", c)
if (exists(name, envir = tbl, inherits = FALSE)) {
return(get(name, envir = tbl))
}
if (exists(name, envir = globalenv(), inherits = FALSE)) {
return(get(name, envir = globalenv()))
}
}
NULL
}
For simplicity this doesn't return methods defined by assignment in the calling environment. The global environment is checked for convenience during development. These are the same rules used in r-lib packages.

What are Replacement Functions in R?

I searched for a reference to learn about replacement functions in R, but I haven't found any yet. I'm trying to understand the concept of the replacement functions in R. I have the code below but I don't understand it:
"cutoff<-" <- function(x, value){
x[x > value] <- Inf
x
}
and then we call cutoff with:
cutoff(x) <- 65
Could anyone explain what a replacement function is in R?
When you call
cutoff(x) <- 65
you are in effect calling
x <- "cutoff<-"(x = x, value = 65)
The name of the function has to be quoted as it is a syntactically valid but non-standard name and the parser would interpret <- as the operator not as part of the function name if it weren't quoted.
"cutoff<-"() is just like any other function (albeit with a weird name); it makes a change to its input argument on the basis of value (in this case it is setting any value in x greater than 65 to Inf (infinite)).
The magic is really being done when you call the function like this
cutoff(x) <- 65
because R is parsing that and pulling out the various bits to make the real call shown above.
More generically we have
FUN(obj) <- value
R finds function "FUN<-"() and sets up the call by passing obj and value into "FUN<-"() and arranges for the result of "FUN<-"() to be assigned back to obj, hence it calls:
obj <- "FUN<-"(obj, value)
A useful reference for this information is the R Language Definition Section 3.4.4: Subset assignment ; the discussion is a bit oblique, but seems to be the most official reference there is (replacement functions are mentioned in passing in the R FAQ (differences between R and S-PLUS), and in the R language reference (various technical issues), but I haven't found any further discussion in official documentation).
Gavin provides an excellent discussion of the interpretation of the replacement function. I wanted to provide a reference since you also asked for that: R Language Definition Section 3.4.4: Subset assignment.
As a complement to the accepted answer I would like to note that replacement functions can be defined also for non standard functions, namely operators (see ?Syntax) and control flow constructs. (see ?Control).
Note also that it is perfectly acceptable to design a generic and associated methods for replacement functions.
operators
When defining a new class it is common to define S3 methods for $<-, [[<- and [<-, some examples are data.table:::`$<-.data.table`, data.table:::`[<-.data.table`, or tibble:::`$.tbl_df`.
However for any other operator we can write a replacement function, some examples :
`!<-` <- function(x, value) !value
x <- NULL # x needs to exist before replacement functions are used!
!x <- TRUE
x
#> [1] FALSE
`==<-` <- function(e1, e2, value) replace(e1, e1 == e2, value)
x <- 1:3
x == 2 <- 200
x
#> [1] 1 200 3
`(<-` <- function(x, value) sapply(x, value, USE.NAMES = FALSE)
x <- c("foo", "bar")
(x) <- toupper
x
#> [1] "FOO" "BAR"
`%chrtr%<-` <- function(e1, e2, value) {
chartr(e2, value, e1)
}
x <- "woot"
x %chrtr% "o" <- "a"
x
#> [1] "waat"
we can even define <-<-, but the parser will prevent its usage if we call x <- y <- z, so we need to use the left to right assignment symbol
`<-<-` <- function(e1, e2, value){
paste(e2, e1, value)
}
x <- "b"
"a" -> x <- "c"
x
#> [1] "a b c"
Fun fact, <<- can have a double role
x <- 1:3
x < 2 <- NA # this fails but `<<-` was called!
#> Error in x < 2 <- NA: incorrect number of arguments to "<<-"
# ok let's define it then!
`<<-` <- function(x, y, value){
if (missing(value)) {
eval.parent(substitute(.Primitive("<<-")(x, y)))
} else {
replace(x, x < y, value)
}
}
x < 2 <- NA
x
#> [1] NA 2 3
x <<- "still works"
x
#> [1] "still works"
control flow constructs
These are in practice seldom encountered (in fact I'm responsible for the only practical use I know, in defining for<- for my package pbfor), but R is flexible enough, or crazy enough, to allow us to define them. However to actually use them, due to the way control flow constructs are parsed, we need to use the left to right assignment ->.
`repeat<-` <- function(x, value) replicate(value, x)
x <- "foo"
3 -> repeat x
x
#> [1] "foo" "foo" "foo"
function<-
function<- can be defined in principle but to the extent of my knowledge we can't do anything with it.
`function<-` <- function(x,value){NULL}
3 -> function(arg) {}
#> Error in function(arg) {: target of assignment expands to non-language object
Remember, in R everything operation is a function call (therefore also the assignment operations) and everything that exists is an object.
Replacement functions act as if they modify their arguments in place such as in
colnames(d) <- c("Input", "Output")
They have the identifier <- at the end of their name and return a modified copy of the argument object (non-primitive replacement functions) or the same object (primitive replacement functions)
At the R prompt, the following will not work:
> `second` <- function(x, value) {
+ x[2] <- value
+ x
+ }
> x <- 1:10
> x
[1] 1 2 3 4 5 6 7 8 9 10
> second(x) <- 9
Error in second(x) <- 9: couldn't find function "second<-"
As you can see, R is searching the environment not for second but for second<-.
So lets do the same thing but using such a function identifier instead:
> `second<-` <- function(x, value) {
+ x[2] <- value
+ x
+ }
Now, the assignment at the second position of the vector works:
> second(x) <- 9
> x
[1] 1 9 3 4 5 6 7 8 9 10
I also wrote a simple script to list all replacement functions in R base package, find it here.

Resources