I'm creating an S3 method for a generic defined in another package. An earlier method for the generic produces some console output that's not returned as part of the function return value, it's only printed to the console. I'd like to capture that output for use in my own method.
I tried using capture.output() on NextMethod(), but that just results in a bizarre error:
foo <- function(x, ...) UseMethod("foo")
foo.bar <- function(x, ...) cat(x, "\n")
foo.baz <- function(x, ...) capture.output(NextMethod())
foo(structure(1, class = "bar"))
#> 1
foo(structure(1, class = c("baz", "bar")))
#> Error: 'function' is not a function, but of type 8
Is this expected behaviour, a known limitation, or a bug? I couldn't find anything matching this error with a quick search.
How can I capture the output of the next S3 method in another S3 method?
This is... "expected behavior." I say that because I believe it's technically true, but there's probably no way for a user to expect it necessarily. If you don't care why it happens, but just want to see how to work around it, skip down to the heading "The Fix", because the following explanation of the error is a little involved.
What does 'function' is not a function, but of type 8 mean?
type 8 refers to a type 8 SEXP. From Section one of the R Internals Manual:
What R users think of as variables or objects are symbols which are
bound to a value. The value can be thought of as either a SEXP (a
pointer), or the structure it points to, a SEXPREC...
Currently SEXPTYPEs 0:10 and 13:25 are in use....
no SEXPTYPE Description
...
3 CLOSXP closures
...
8 BUILTINSXP builtin functions
NextMethod() expects a CLOSXP, not a BUILTINSXP. We can see this if we look at the source code (around line 717) of do_nextmethod(), the C function underlying NextMethod()
SEXP attribute_hidden do_nextmethod(SEXP call, SEXP op, SEXP args, SEXP env)
{
// Some code omitted
if (TYPEOF(s) != CLOSXP){ /* R_LookupMethod looked for a function */
if (s == R_UnboundValue)
error(_("no calling generic was found: was a method called directly?"));
else
errorcall(R_NilValue,
_("'function' is not a function, but of type %d"),
TYPEOF(s));
}
So why did that happen here? This is where it gets tricky. I believe it's because by passing NextMethod() through capture.output(), it gets called using eval(), which is a built-in (see builtins()).
So how can we deal with this? Read on...
The Fix
We can simulate capture output with clever use of sink(), cat(), and tempfile():
foo.baz <- function(x, ...) {
# Create a temporary file to store the output
tmp <- tempfile("tmp.txt")
# start sink()
sink(tmp)
# call NextMethod() just for the purpose of capturing output
NextMethod()
# stop sink'ing
sink()
# store the output in an R object
y <- readLines(tmp)
# here we'll cat() the output to make sure it worked
cat("The output was:", y, "\n")
# destroy the temporary file
unlink(tmp)
# and call NextMethod for its actual execution
NextMethod()
}
foo(structure(1, class = c("baz", "bar")))
# 1
I'm not sure if what you saw is documented or not: the documentation ?NextMethod makes clear that it isn't a regular function, but I didn't follow all the details to see if your usage would be allowed.
One way to do what you want would be
foo.baz <- function(x, ...) {class(x) <- class(x)[-1]; capture.output(foo(x, ...))}
This assumes that the method was called directly from a call to the generic; it won't work if there's a third level, and foo.baz was itself invoked by NextMethod().
Related
I'd like to wrap around the checkmate library's qassert to check multiple variable's specification at a time. Importantly, I'd like assertion errors to still report the variable name that's out of spec.
So I create checkargs to loop through input arguments. But to get the variable passed on to qassert, I use the same code for each loop -- that ambigious code string gets used for for the error message instead of the problematic variable name.
qassert() (via vname()) is getting what to display in the assertion error like deparse(eval.parent(substitute(substitute(x))). Is there any way to box up get(var) such that that R will see e.g. 'x' on deparse instead?
At least one work around is eval(parse()). But something like checkargs(x="n', system('echo malicious',intern=T),'") has me hoping for an alternative.
checkargs <- function(...) {
args<-list(...)
for(var in names(args))
checkmate::qassert(get(var,envir=parent.frame()),args[[var]])
# scary string interpolation alternative
#eval(parse(text=paste0("qassert(",var,",'",args[[var]], "')")),parent.frame())
}
test_checkargs <- function(x, y) {checkargs(x='b',y='n'); print(y)}
# checkargs is working!
test_checkargs(T, 1)
# [1] 1
# but the error message isn't helpful.
test_checkargs(1, 1)
# Error in checkargs(x = "b", y = "n") :
# Assertion on 'get(var, envir = parent.frame())' failed. Must be of class 'logical', not 'double'.
#
# want:
# Assertion on 'x' failed. ...
substitute() with as.name seems to do the trick. This still uses eval but without string interpolation.
eval(substitute(
qassert(x,spec),
list(x=as.name(var),
spec=args[[var]])),
envir=parent.frame())
Below I define an R function that includes <- as part of its name. I realize, without understanding, that this function uses the value of the right hand side of <-.
Whilst R says that object myfun is not found whilst myfun<- is a function, I cannot understand how I am still able to wrap myx in myfun. In doing so, am I actually calling the myfun<- function? Any insights and guidance from the stackoverflow experts would be appreciated!
'myfun<-' <- function (x,value) x <- paste0(value, " says hello")
myx <- "my_x"
myy <- "my_y"
myfun(myx) <- myy
myx
[1] "my_y says hello"
myfun
Error: object 'myfun' not found
`myfun<-`
function (x,value) x <- paste0(value, " says hello")
As #akrun says, a myfun and amyfun<- would be two different functions, the first one for accessing a value from the symbol table, and the other for assigning a value to a symbol. Looking at the "R Language Definition" in the Function Calls" section we see that:
A special type of function calls can appear on the left hand side of the assignment operator as in
class(x) <- "foo"
What this construction really does is to call the function class<- with the original object and the right hand side. This function performs the modification of the object and returns the result which is then stored back into the original variable.
Your definition of the function also has a aspect that exposes a further misconception. The assignment operation inside the body of that function is entirely unnecessary. The value of the <- function is in fact the RHS's value, so unless you had return()-ed the x, the assignment to x would not be needed. Let's leave the assignment out of the function body and try to make the assignment without first creating an entry in the symbol table for myx:
'myfun<-' <- function (x,value) paste0(value, " says hello")
myy <- "my_y"
myfun(myx) <- myy
#Error in myfun(myx) <- myy : object 'myx' not found
The usual way of creating an entry in the symbol table for an item is to use one of the empty class constructors:
myx <- character(0)
myfun(myx) <- myy
myx
#[1] "my_y says hello" # Success
Note that the assignment operator in the function name directs that the symbol myx be used as the "target", so there is no need for it to have to be a value of "my_x" or anything resembling it's character value. You could have used numeric(0) and you would have only been confusing yourself when you went back later to read your code, but the interpreter would have simply coerced the class of the target symbol to character. R is very weakly typed.
Normally I wonder where mysterious errors come from but now my question is where a mysterious lack of error comes from.
Let
numbers <- c(1, 2, 3)
frame <- as.data.frame(numbers)
If I type
subset(numbers, )
(so I want to take some subset but forget to specify the subset-argument of the subset function) then R reminds me (as it should):
Error in subset.default(numbers, ) :
argument "subset" is missing, with no default
However when I type
subset(frame,)
(so the same thing with a data.frame instead of a vector), it doesn't give an error but instead just returns the (full) dataframe.
What is going on here? Why don't I get my well deserved error message?
tl;dr: The subset function calls different functions (has different methods) depending on the type of object it is fed. In the example above, subset(numbers, ) uses subset.default while subset(frame, ) uses subset.data.frame.
R has a couple of object-oriented systems built-in. The simplest and most common is called S3. This OO programming style implements what Wickham calls a "generic-function OO." Under this style of OO, an object called a generic function looks at the class of an object and then applies the proper method to the object. If no direct method exists, then there is always a default method available.
To get a better idea of how S3 works and the other OO systems work, you might check out the relevant portion of the Advanced R site. The procedure of finding the proper method for an object is referred to as method dispatch. You can read more about this in the help file ?UseMethod.
As noted in the Details section of ?subset, the subset function "is a generic function." This means that subset examines the class of the object in the first argument and then uses method dispatch to apply the appropriate method to the object.
The methods of a generic function are encoded as
< generic function name >.< class name >
and can be found using methods(<generic function name>). For subset, we get
methods(subset)
[1] subset.data.frame subset.default subset.matrix
see '?methods' for accessing help and source code
which indicates that if the object has a data.frame class, then subset calls the subset.data.frame the method (function). It is defined as below:
subset.data.frame
function (x, subset, select, drop = FALSE, ...)
{
r <- if (missing(subset))
rep_len(TRUE, nrow(x))
else {
e <- substitute(subset)
r <- eval(e, x, parent.frame())
if (!is.logical(r))
stop("'subset' must be logical")
r & !is.na(r)
}
vars <- if (missing(select))
TRUE
else {
nl <- as.list(seq_along(x))
names(nl) <- names(x)
eval(substitute(select), nl, parent.frame())
}
x[r, vars, drop = drop]
}
Note that if the subset argument is missing, the first lines
r <- if (missing(subset))
rep_len(TRUE, nrow(x))
produce a vector of TRUES of the same length as the data.frame, and the last line
x[r, vars, drop = drop]
feeds this vector into the row argument which means that if you did not include a subset argument, then the subset function will return all of the rows of the data.frame.
As we can see from the output of the methods call, subset does not have methods for atomic vectors. This means, as your error
Error in subset.default(numbers, )
that when you apply subset to a vector, R calls the subset.default method which is defined as
subset.default
function (x, subset, ...)
{
if (!is.logical(subset))
stop("'subset' must be logical")
x[subset & !is.na(subset)]
}
The subset.default function throws an error with stop when the subset argument is missing.
I'm trying to figure out how NextMethod() works. The most detailed explanation I have found of the S3 class system is in Chambers & Hastie (edts.)'s Statistical Models in S (1993, Chapman & Hall), however I find the part concerning NextMethod invocation a little obscure. Following are the relevant paragraphs I'm trying to make sense of (pp. 268-269).
Turning now to methods invoked as a result of a call to
NextMethod(), these behave as if they had been called from the
previous method with a special call. The arguments in the call to the
inherited method are the same in number, order, and actual argument
names as those in the call to the current method (and, therefore, in
the call to the generic). The expressions for the arguments, however,
are the names of the corresponding formal arguments of the current
method. Suppose, for example, that the expression print(ratings) has
invoked the method print.ordered(). When this method invokes
NextMethod(), this is equivalent to a call to print.factor() of
the form print.factor(x), where x is here the x in the frame of
print.ordered(). If several arguments match the formal argument
"...", those arguments are represented in the call to the inherited
method y special names "..1", "..2", etc. The evaluator recognizes
these names and treats them appropriately (see page 476 for an
example).
This rather subtle definition exists to ensure that the semantics of
function calls in S carry over as cleanly as possible to the use of
methods (compare Becker, Chambers and Wilks's The New S Language,
page 354). In particular:
Arguments are passed down from the current method to the inherited method with their current values at the time NextMethod() is called.
Lazy evaluation continues in effect; unevaluated arguments stay unevaluated.
Missing arguments remain missing in the inherited method.
Arguments passed through the "..." formal argument arrive with the correct argument name.
Objects in the frame that do not correspond to actual arguments in the call will not be passed to the inherited method."
The inheritance process is essentially transparent so far as the
arguments go.
Two points that I find confusing are:
What is "the current method" and what is "the previous method"?
What is the difference between "The arguments in the call to the inherited method", "The expressions for the arguments" and "the names of the corresponding formal arguments of the current method"?
Generally speaking, if anyone could please restate the description given in the above paragraphs in a lucider fashion, I'd appreciate it.
Hard to go through all this post, but I think that this small example can help to demystify the NextMethod dispatching.
I create an object with 2 classes attributes (inheritance) 'first' and 'second'.
x <- 1
attr(x,'class') <- c('first','second')
Then I create a generic method Cat to print my object
Cate <- function(x,...)UseMethod('Cate')
I define Cate method for each class.
Cate.first <- function(x,...){
print(match.call())
print(paste('first:',x))
print('---------------------')
NextMethod() ## This will call Cate.second
}
Cate.second <- function(x,y){
print(match.call())
print(paste('second:',x,y))
}
Now you can can check Cate call using this example:
Cate(x,1:3)
Cate.first(x = x, 1:3)
[1] "first: 1"
[1] "---------------------"
Cate.second(x = x, y = 1:3)
[1] "second: 1 1" "second: 1 2" "second: 1 3"
For Cate.second the previous method is Cate.first
Arguments x and y are passed down from the current method to the inherited
method with their current values at the time NextMethod() is called.
Argument y passed through the "..." formal argument arrive with the correct argument name Cate.second(x = x, y = 1:3)
Consider this example where generic function f is called and it invokes f.ordered and then, using NextMethod, f.ordered invokes f.factor:
f <- function(x) UseMethod("f") # generic
f.ordered <- function(x) { x <- x[-1]; NextMethod() }
f.factor <- function(x) x # inherited method
x <- ordered(c("a", "b", "c"))
class(x)
## [1] "ordered" "factor"
f(x)
## [1] b c
## Levels: a < b < c
Now consider the original text:
Turning now to methods invoked as a result of a call to NextMethod(),
these behave as if they had been called from the previous method with
a special call.
Here f calls f.ordered which calls f.factor so the method "invoked as a
result of a call to NextMethod" is f.factor and the previous method is
f.ordered.
The arguments in the call to the inherited method are the same in
number, order, and actual argument names as those in the call to the
current method (and, therefore, in the call to the generic). The
expressions for the arguments, however, are the names of the
corresponding formal arguments of the current method. Suppose, for
example, that the expression print(ratings) has invoked the method
print.ordered(). When this method invokes NextMethod(), this is
equivalent to a call to print.factor() of the form print.factor(x),
where x is here the x in the frame of print.ordered()
Now we switch perspectives and we are sitting in f.ordered so now f.ordered
is the current method and f.factor is the inherited method.
At the point that f.ordered invokes NextMethod() a special call is constructed
to call f.factor whose arguments are the same as those passed to f.ordered and
to the generic f
except that they refer to the versions of the arguments in f.ordered (which
makes a difference here as f.ordered changes the argument before invoking
f.factor.
Background
I’m in the process of creating a shortcut for lambdas, since the repeated use of function (…) … clutters my code considerably. As a remedy, I’m trying out alternative syntaxes inspired by other languages such as Haskell, as far as this is possible in R. Simplified, my code looks like this:
f <- function (...) {
args <- match.call(expand.dots = FALSE)$...
last <- length(args)
params <- c(args[-last], names(args)[[last]])
function (...)
eval(args[[length(args)]],
envir = setNames(list(...), params),
enclos = parent.frame())
}
This allows the following code:
f(x = x * 2)(5) # => 10
f(x, y = x + y)(1, 2) # => 3
etc.
Of course the real purpose is to use this with higher-order functions1:
Map(f(x = x * 2), 1 : 10)
The problem
Unfortunately, I sometimes have to nest higher-order functions and then it stops working:
f(x = Map(f(y = x + y), 1:2))(10)
yields “Error in eval(expr, envir, enclos): object x not found”. The conceptually equivalent code using function instead of f works. Furthermore, other nesting scenarios also work:
f(x = f(y = x + y)(2))(3) # => 5
I’m suspecting that the culprit is the parent environment of the nested f inside the map: it’s the top-level environment rather than the outer f’s. But I have no idea how to fix this, and it also leaves me puzzled that the second scenario above works. Related questions (such as this one) suggest workarounds which are not applicable in my case.
Clearly I have a gap in my understanding of environments in R. Is what I want possible at all?
1 Of course this example could simply be written as (1 : 10) * 2. The real application is with more complex objects / operations.
The answer is to attach parent.frame() to the output function's environment:
f <- function (...) {
args <- match.call(expand.dots = FALSE)$...
last <- length(args)
params <- c(args[-last], names(args)[[last]])
e <- parent.frame()
function (...)
eval(args[[length(args)]],
envir = setNames(list(...), params),
enclos = e)
}
Hopefully someone can explain well why this works and not yours. Feel free to edit.
Great question.
Why your code fails
Your code fails because eval()'s supplied enclos= argument does not point far enough up the call stack to reach the environment in which you are wanting it to next search for unresolved symbols.
Here is a partial diagram of the call stack from the bottom of which your call to parent.frame() occurs. (To make sense of this, it's important to keep in mind that the function call from which parent.frame() is here being called is not f(), but a call the anonymous function returned by f() (let's call it fval)).
## Note: E.F. = "Evaluation Frame"
## fval = anonymous function returned as value of nested call to f()
f( <------------------------- ## E.F. you want, ptd to by parent.frame(n=3)
Map(
mapply( <-------------------- ## E.F. pointed to by parent.frame(n=1)
fval( |
parent.frame(n=1 |
In this particular case, redefining the function returned by f() to call parent.frame(n=3) rather than parent.frame(n=1) produces working code, but that's not a good general solution. For instance, if you wanted to call f(x = mapply(f(y = x + y), 1:2))(10), the call stack would then be one step shorter, and you'd instead need parent.frame(n=2).
Why flodel's code works
flodel's code provides a more robust solution by calling parent.frame() during evaluation of the inner call to f in the nested chain f(Map(f(), ...)) (rather than during the subsequent evaluation of the anonymous function fval returned by f()).
To understand why his parent.frame(n=1) points to the appropriate environment, it's important to recall that in R, supplied arguments are evaluated in the the evaluation frame of the calling function. In the OP's example of nested code, the inner f() is evaluated during the processing of Map()'s supplied arguments, so it's evaluation environment is that of the function calling Map(). Here, the function calling Map() is the outer call to f(), and its evaluation frame is exactly where you want eval() to next be looking for symbols:
f( <--------------------- ## Evaluation frame of the nested call to f()
Map(f( |
parent.frame(n=1 |