Writing a decorator for R functions - r

A colleague recently was looking at call graphs and wanted to see what called what. We sorted that with foodweb from mvbutils, but I was wondering about how best to create a decorator (in python speak) in R. So I did this:
instrument=function(z){
force(z)
n=deparse(substitute(z)) # get the name
f=function(...){
cat("calling ", n,"\n")
x=z(...)
cat("done\n")
return(x)
}
return(f)
}
This lets me do:
> foo=function(x,y){x+y}
> foo(1,2)
[1] 3
and now I can make the function log itself by wrapping it:
> foo=instrument(foo)
> foo(1,2)
calling foo
done
[1] 3
has this been done before, in a package say, and have I missed any gotchas that will break my way of doing this?

The trace function in R does that. See ?trace.

my github package tag attempts to tackle this issue.
Your example could be solved as follows :
# remotes::install_github("moodymudskipper/tag")
library(tag)
deco <- tag(args = list(.first = NULL, .last = NULL), pattern = {
t_args <- T_ARGS() # fetch arguments fed to tag
eval.parent(t_args[[".first"]]) # run .first arg
on.exit(eval.parent(t_args[[".last"]])) # run .last arg on exit
CALL() # run main call
})
foo <- function(x, y) {Sys.sleep(1); x + y} # sleep 1 sec to highlight expected behavior
deco(quote(message("calling foo")), quote(message("done")))$foo(1, 2)
#> calling foo
#> done
#> [1] 3
foo2 <- deco(quote(message("calling foo")), quote(message("done")))$foo
foo2(1, 2)
#> calling foo
#> done
#> [1] 3
deco2 <- deco(quote(message("calling foo")), quote(message("done")))
deco2$foo(1, 2)
#> calling foo
#> done
#> [1] 3
Created on 2020-01-30 by the reprex package (v0.3.0)
tags are function operator factories (or adverb factories), here deco is a tag, and deco(quote(message("calling foo")), quote(message("done"))) is an adverb, with a method for $. It means you could run deco(quote(message("calling foo")), quote(message("done")))(foo)(1,2), but the dollar notation makes it friendlier.
The tag definition features defaults arguments (default value is mandatory, dots aren't supported), and a pattern which is a bit like your new body, using special functions T_ARGS(), F_ARGS(), F_ARGS(), F_FORMALS() and CALL() to access the tag or function's arguments or formals and the call itself (see ?tag::CALL).
Some more magic is implemented so the tag's argument can be given to the tagged function itself, so the following can be done too :
deco$foo(1, 2, quote(message("calling foo")), quote(message("done")))
#> calling foo
#> done
#> [1] 3
foo2 <- deco$foo
foo2(1, 2, quote(message("calling foo")), quote(message("done")))
#> calling foo
#> done
#> [1] 3
In these cases you can enjoy the autocomplete in RStudio :
More info : https://github.com/moodymudskipper/tag
The package tags contains a collection of such "decorators"

Related

Why does match.call not work when the original call is wrapped in a function?

I want to pass down function arguments to recursively call a function within itself (usually with a break condition, of course).
I learnt that match.call should work to capture all arguments, and it works - until I wrap the original call in another function.
inner <- function(my_arg) {
message(my_arg)
do.call("inner", as.list(match.call()[-1]))
}
# this yields an error ... (unexpected)
outer <- function() {
mydata <- data.frame(1)
inner(mydata)
}
outer()
# ... while this yields an infinite loop (expected)
mydata <- data.frame(1)
inner(mydata)
This outputs:
1
Error in is.data.frame(my_arg) : object 'mydata' not found
Why is that? Is this intended? How can I fix this?
This happens because of scoping. Hopefully this modification of your two functions will give a clear picture of what's going on (with no infinite loops!), and how to fix it.
inner <- function(my_arg)
{
mc <- match.call()
cat("Call to inner:\n")
print(mc)
cat("\nSymbol to be evaluated within \"inner\":\n")
print(as.list(mc)$my_arg)
cat("\nSymbol evaluated in scope of \"inner\":\n")
tryCatch(print(eval(as.list(mc)$my_arg)),
error = function(e) cat("**Error** - symbol not found\n"))
cat("\nSymbol evaluated in parent frame of \"inner\":\n")
tryCatch(print(eval(as.list(mc)$my_arg, envir = parent.frame())),
error = function(e) cat("**Error** - symbol not found\n"))
}
outer <- function()
{
my_data <- "outer test string"
inner(my_data)
}
Which we can test as follows:
inner("inner test string")
#> Call to inner:
#> inner(my_arg = "inner test string")
#>
#> Symbol to be evaluated within "inner":
#> [1] "inner test string"
#>
#> Symbol evaluated in scope of "inner":
#> [1] "inner test string"
#>
#> Symbol evaluated in parent frame of "inner":
#> [1] "inner test string"
outer()
#> Call to inner:
#> inner(my_arg = my_data)
#>
#> Symbol to be evaluated within "inner":
#> my_data
#>
#> Symbol evaluated in scope of "inner":
#> **Error** - symbol not found
#>
#> Symbol evaluated in parent frame of "inner":
#> [1] "outer test string"
It's really difficult to explain the error because it results from the interaction of do.call, match.call and recursion. The problem results from when the promises of the nested calls inner(my_arg = mydata) are forced. When message is called, R searches the function scope and, in case the object is not found, the enclosing environments. This appears to fail when a promise in the nested calls hasn't been forced (due to your do.call("inner", as.list(match.call()[-1])) construct).
> traceback()
5: message(my_arg) at #2
4: inner(my_arg = mydata)
3: do.call("inner", as.list(match.call()[-1])) at #4
2: inner(mydata) at #4
1: outer()
I suggest you study the language definition, e.g. Section 4.3.3.
Also, why do you need match.call here? Just use inner(my_arg) instead of that do.call with match.call construct. That immediately forces the promise and everything works fine.

Getting the name of object passed to `print` when calling object directly (not expressing the `print` function)

I'm trying to define the print method for my new object and use the name of the object passed to print using deparse(substitute(y)). This works perfectly using the print function explicitly:
obj <- structure(list(x = 1),
class = "new_obj")
print.new_obj <- function(y){
cat("New object name:\n")
print(deparse(substitute(y)))
}
print(obj)
# New object name:
# [1] "obj"
But when the object is called by name on its own the resulting print function doesn't detect the name:
obj
# New object name:
# [1] "x"
Is there a standard way to change the behaviour of the implicit call to print when passing an object name on its own?
EDIT: have changed the function argument to y to represent object being passed, to demonstrate that "x" is returned no matter what in the second call.
It is easier to explain what's going on than it is to fix it. If we start by looking at the generic print we can see it simply dispatches the class-appropriate print method via UseMethod("print"):
print
#> function (x, ...)
#> UseMethod("print")
So when you call print(obj), you are calling the generic function print(obj) first, which then calls print.new_obj(obj). We can confirm this by adding print(sys.calls()) to your print method:
print.new_obj <- function(y){
print(sys.calls())
cat("New object name:\n")
cat(deparse(substitute(y)))
}
print(obj)
#> [[1]]
#> print(obj)
#>
#> [[2]]
#> print.new_obj(obj)
#>
#> New object name:
#> obj
So far, so good, and I suspect you already knew all this.
What happens now, when you just type obj into the console?
obj
#> [[1]]
#> (function (x, ...)
#> UseMethod("print"))(x)
#>
#> [[2]]
#> print.new_obj(x)
#>
#> New object name:
#> x
Now we can see where the x comes from. It is taken from a behind-the-scenes call to the generic print which is actually called as an unnamed function. Hence the name of the variable is not actually included in the call stack. There are other questions on SO where it says this makes the problem insoluble. This isn't true; it just means you will need to look outside of the call stack for your object:
print.new_obj <- function(y){
obj_name <- deparse(substitute(x, parent.frame()))
if (obj_name != "x")
{
obj_name <- names(which(sapply(ls(envir = parent.frame(2)), function(v)
identical(y, get(v, envir = parent.frame(2))))))[1]
cat("New object name:\n", obj_name)
}
else cat("New object name:\n", deparse(substitute(y)))
}
print(obj)
#> New object name:
#> obj
obj
#> New object name:
#> obj
Of course, you wouldn't want to use this in production code, for all sorts of reasons. It is not particularly useful or logical for a data structure to know what name it has been assigned in a particular environment, and would not be an idiomatic way to write a package for other users.
Still, nice to know it is possible.

Manipulating enclosing environment of a function

I'm trying to get a better understanding of closures, in particular details on a function's scope and how to work with its enclosing environment(s)
Based on the Description section of the help page on rlang::fn_env(), I had the understanding, that a function always has access to all variables in its scope and that its enclosing environment belongs to that scope.
But then, why isn't it possible to manipulate the contents of the closure environment "after the fact", i.e. after the function has been created?
By means of R's lexical scoping, shouldn't bar() be able to find x when I put into its enclosing environment?
foo <- function(fun) {
env_closure <- rlang::fn_env(fun)
env_closure$x <- 5
fun()
}
bar <- function(x) x
foo(bar)
#> Error in fun(): argument "x" is missing, with no default
Ah, I think I got it down now.
It has to do with the structure of a function's formal arguments:
If an argument is defined without a default value, R will complain when you call the function without specifiying that even though it might technically be able to look it up in its scope.
One way to kick off lexical scoping even though you don't want to define a default value would be to set the defaults "on the fly" at run time via rlang::fn_fmls().
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fun()
}
# No argument at all -> lexical scoping takes over
baz <- function() x
foo(baz)
#> [1] 5
# Set defaults to desired values on the fly at run time of `foo()`
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fmls <- rlang::fn_fmls(fun)
fmls$x <- substitute(get("x", envir = env_enclosing, inherits = FALSE))
rlang::fn_fmls(fun) <- fmls
fun()
}
bar <- function(x) x
foo(bar)
#> [1] 5
I can't really follow your example as I am unfamiliar with the rlang library but I think a good example of a closure in R would be:
bucket <- function() {
n <- 1
foo <- function(x) {
assign("n", n+1, envir = parent.env(environment()))
n
}
foo
}
bar <- bucket()
Because bar() is define in the function environment of bucket then its parent environment is bucket and therefore you can carry some data there. Each time you run it you modify the bucket environment:
bar()
[1] 2
bar()
[1] 3
bar()
[1] 4

Warnings instead of errors from assert_that()?

I'm using R's assertthat package and am wanting to (temporarily) output a warning instead of an error on assertion failure. What's the easiest way to do that with the assertthat package?
I realize that wanting warnings instead of errors kind of goes against what assertions are supposed to be used for. In the long term, we indeed want to be outputting errors on assertion failure. In the short term, we still want the code to function even with bad input, since the output with bad inputs is still "good enough" for now.
A simple example: suppose I have a function that takes x as input and outputs x+5. I want to output a warning if x!=3. Since we will be using assert_that ultimately, it would be nice if we can use assertthat package for the warning.
In the long term, we'll use this:
> x <- 3
> fn <- function(x) {assert_that(x==3); return(x+5)}
> fn(3)
[1] 8
> fn(4)
Error: x not equal to 3
In the short term, here's the best I have so far:
> fn <- function(x) {if(!see_if(x==3)) warning(validate_that(x==3)); return(x+5)}
> fn(3)
[1] 8
> fn(4)
[1] 9
Warning message:
In fn(4) : x not equal to 3
I'm looking for a more concise solution, if possible (best case would be passing an "output_warning" parameter to assert_that, but I don't think that exists).
I created a user defined function which accepts a string corresponding to an expression against which you would like to run validate_that() (ultimately assert_that()). The function prints a warning if the assertion fails and remains silent otherwise. See below for usage. You could easily extend this custom function to accept more than one expression if necessary. Note that I also use sys.calls() to obtain the name of the function which called this helper function. This is an important piece of information so you can correlate your warnings with the code that actually generated them.
assert_that_soft <- function(exp) {
if (!exp) {
print (paste("Error in function:",
parse(sys.calls()[[sys.nframe()-1]])) ) # name of caller
}
}
Usage:
> fn <- function(x) { assert_that_soft(x==3); return(x+5) }
> fn(3)
[1] 8
> fn(8)
[1] "Error in function: fn(8)"
[1] 13
Another option is to wrap assert_that in tryCatch.
fn <- function(x) tryCatch(assert_that(x == 3), error = function(e) warning(e), finally = return(x+5))
fn(3)
# [1] 8
fn(8)
# [1] 13
# Warning message:
# x not equal to 3
I think the easiest way to overwrite the function would be to copy most of the assert_that function as is, and call the new function by the same name so you don't need to change all the code when you go into error mode.
assert_that <- function(..., env=parent.frame()) {
res <- see_if(..., env=env)
if (res)
return(TRUE)
warning(attr(res, "msg"))
TRUE
}
fn <- function(x) { assert_that(x==3); return(x+5) }
fn(3)
# [1] 8
fn(8)
# [1] 13
# Warning message:
# In assert_that(x == 3) : x not equal to 3
I am proposing an extension of the assertthat package to allow for simple warnings, see
https://github.com/hadley/assertthat/issues/69
any feedback is welcome!

Using "[[ ]]" notation for reference class methods

While experimenting with the new reference classes in R I noticed some odd behaviour if you use the "[[ ]]" notation for methods (X[["doSomething"]] instead of X$doSomething). This notation works for fields, but I initially thought it wouldn't work for methods until I found that if you execute "class(X$doSomething)" you can then use "[[ ]]" afterwards. The simple example below illustrates the point.
setRefClass("Number",
fields = list(
value = "numeric"
),
methods = list(
addOne = function() {
value <<- value + 1
}
)
)
X <- new("Number", value = 1)
X[['value']] # 1
X[["addOne"]]() # Error: attempt to apply non-function
class(X[["addOne"]]) # NULL
class(X$addOne) # "refMethodDef"
# Now the following works!
X[["addOne"]]() # sets X$value = 2
class(X[["addOne"]]) # "refMethodDef"
The reason I encountered this is because I want to group my objects together in a list and create an "applyMethod" function which applies a specified method on each of the objects within. Therefore, I need to specify the method as a string. Does anyone have any ideas how I can achieve this?
Here's a class
.A <-
setRefClass("A",
fields=list(x="numeric"),
methods=list(foo=function() x))
If I had an instance a and wanted to construct a call to the 'foo' method using '$' I could
eval(substitute(a$FUN(), list(FUN="foo")))
So I'll create a class Alist that is meant to have a list of elements of class A (this could be enforced programmatically), and that has a .delegate method that'll apply an arbitrary method to all elements of the list. I'll then add a method that delegates foo.
.delegate <- function(FUN, ...)
{
lapply(elts, function(elt, ...) {
eval(substitute(elt$FUN(...), list(FUN=FUN, ...)))
})
}
.Alist <-
setRefClass("Alist",
fields=list(elts="list"),
methods=list(
initialize = function(...) callSuper(elts=list(...)),
.delegate = .delegate,
foo=function() .delegate("foo")))
And then use it
> aList <- .Alist$new(.A$new(x=1), .A$new(x=2))
> aList$foo()
[[1]]
[1] 1
[[2]]
[1] 2
basically R5 ref class does not cache the method until it is necessary. This is probably a kind of delayed evaluation.
And the caching takes place when you access the method via $.
So, AFAIK, there is no way to access the method via [[string]]
But you can find a workaround using .dollarForEnvRefClass like this:
> X <- new("Number", value = 1)
> ls(X#.xData)
[1] "value" # no methods named "addOne" before caching
> X[["addOne"]]
NULL
> methods:::.dollarForEnvRefClass(X, "addOne") # cache it
Class method definition for method addOne()
function ()
{
value <<- value + 1
}
<environment: 0x116a4aa00>
> ls(X#.xData)
[1] "addOne" "value" # you can find it
> X$value # value is 1
[1] 1
> X[["addOne"]]() # call the method
> X$value # the method works
[1] 2
if you are interested in more detail, see the implementation:
http://svn.r-project.org/R/trunk/src/library/methods/R/refClass.R
Maybe there is more straightforward way.
Report as bug on r-devel so John Chambers can fix it.

Resources