I want to implement an inset method for my class myClass for the internal generic [<- (~ help(Extract)).
This method should run a bunch of tests, before passing on the actual insetting off to [<- via NextMethod().
I understand that:
any method has to include at least the arguments of the generic (mine does, I think)
the NextMethod() call does not usually need any arguments (though supplying them manually doesn't seem to help either).
Here's my reprex:
x <- c(1,2)
class(x) <- c("myClass", "numeric")
`[<-.myClass` <- function(x, i, j, value, foo = TRUE, ...) {
if (foo) {
stop("'foo' must be false!")
}
NextMethod()
}
x[1] <- 3 # this errors out with *expected* error message, so dispatch works
x[1, foo = FALSE] <- 3 # this fails with "incorrect number of subscripts
What seems to be happening is that NextMethod() also passes on foo to the internal generic [<-, which mistakes foo for another index, and, consequently errors out (because, in this case, x has no second dimension to index on).
I also tried supplying the arguments explicitly no NextMethod(), but this also fails (see reprex below the break).
How can I avoid choking up NextMethod() with additional arguments to my method?
(Bonus: Does anyone know good resources for building methods for internal generics? #Hadleys adv-r is a bit short on the matter).
Reprex with explicit arguments:
x <- c(1,2)
class(x) <- c("myClass", "numeric")
`[<-.myClass` <- function(x, i = NULL, j = NULL, value, foo = TRUE, ...) {
if (foo) {
stop("'foo' must be false!")
}
NextMethod(generic = "`[<-`", object = x, i = i, j = j, value = value, ...)
}
x[1] <- 3 # this errors out with expected error message, so dispatch works
x[1, foo = FALSE] <- 3 # this fails with "incorrect number of subscripts
I don't see an easy way around this except to strip the class (which makes a copy of x)
`[<-.myClass` <- function(x, i, value, ..., foo = TRUE) {
if (foo) {
cat("hi!")
x
} else {
class_x <- class(x)
x <- unclass(x)
x[i] <- value
class(x) <- class_x
x
}
}
x <- structure(1:2, class = "myClass")
x[1] <- 3
#> hi!
x[1, foo = FALSE] <- 3
x
#> [1] 3 2
#> attr(,"class")
#> [1] "myClass"
This is not a general approach - it's only needed for [, [<-, etc because they don't use the regular rules for argument matching:
Note that these operations do not match their index arguments in the standard way: argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is equivalent to m[2, 1] and not to m[1, 2].
(from the "Argument matching" section in ?`[`)
That means your x[1, foo = FALSE] is equivalent to x[1, FALSE] and then you get an error message because x is not a matrix.
Approaches that don't work:
Supplying additional arguments to NextMethod(): this can only increase the number of arguments, not decrease it
Unbinding foo with rm(foo): this leads to an error about undefined foo.
Replacing foo with a missing symbol: this leads to an error that foo is not supplied with no default argument.
Here's how I understand it, but I don't know so much about that subject so I hope I don't say too many wrong things.
From ?NextMethod
NextMethod invokes the next method (determined by the class vector,
either of the object supplied to the generic, or of the first argument
to the function containing NextMethod if a method was invoked
directly).
Your class vector is :
x <- c(1,2)
class(x) <- "myClass" # note: you might want class(x) <- c("myClass", class(x))
class(x) # [1] "myClass"
So you have no "next method" here, and [<-.default, doesn't exist.
What would happen if we define it ?
`[<-.default` <- function(x, i, j, value, ...) {print("default"); value}
x[1, foo = FALSE] <- 3
# [1] "default"
x
# [1] 3
If there was a default method with a ... argument it would work fine as the foo argument would go there, but it's not the case so I believe NextMethod just cannot be called as is.
You could do the following to hack around the fact that whatever is called doesn't like to be fed a foo argument:
`[<-.myClass` <- function(x, i, j, value, foo = FALSE, ...) {
if (foo) {
stop("'foo' must be false!")
}
`[<-.myClass` <- function(x, i, j, value, ...) NextMethod()
args <- as.list(match.call())[-1]
args <- args[names(args) %in% c("","x","i","j","value")]
do.call("[<-",args)
}
x[1, foo = FALSE] <- 3
x
# [1] 3 2
# attr(,"class")
# [1] "myClass"
Another example, with a more complex class :
library(data.table)
x <- as.data.table(iris[1:2,1:2])
class(x) <- c("myClass",class(x))
x[1, 2, foo = FALSE] <- 9999
# Sepal.Length Sepal.Width
# 1: 5.1 9999
# 2: 4.9 3
class(x)
# [1] "myClass" "data.table" "data.frame"
This would fail if the next method had other arguments than x, i, j and value, in that case better to be explicit about our additional arguments and run args <- args[! names(args) %in% c("foo","bar")]. Then it might work (as long as arguments are given explicitly as match.call doesn't catch default arguments). I couldn't test this though as I don't know such method for [<-.
Related
Given a regular R function f, I'd like to be able to create a new function f_debug that acts just like f, but lets me keep track of all the assignments to function-local variables that happened inside it.
For example:
f <- function(x, y) {
z <- x + y
df <- data.frame(z=z)
df
}
# This function doesn't work as intended - would like it to (in the case of `f` above)
# write out a list containing `z` and `df` to an RDS file
capturing <- function(func) {
e <- new.env()
altered <- function(...) {
parent <- parent.frame()
e <- something...(func, environment(), parent, etc., etc.)
result <- func(...)
saveRDS(as.list(e), 'foo.rds')
result
}
environment(func) <- e
altered
}
f_debug <- capturing(f)
I'm not sure whether my knowledge gap to do this is large or small, anyone have a solution?
Solution 1: Steal the function's code
Here's a solution which doesn't return a new function which captures intermediate calculations, but rather calls the given function's code internally. There's some limitations, such as it probably only works with named arguments. Instead of storing the intermediate calculations as an RDS, it attaches them as an attribute.
capturing <- function(fun, ...) {
fun <- match.fun(fun)
code <- body(fun)
parent <- environment(fun)
env <- new.env(parent = parent)
for (val in names(list(...))) {
env[[val]] <- list(...)[[val]]
}
result <- eval(code, envir = env, enclos = parent.frame())
attr(result, "intermediate") <- env
result
}
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
intermediates <- function(x) {
attr(x, "intermediate", exact = TRUE)
}
value <- capturing(my_add, x = 1, y = 7)
ls(envir = intermediates(value))
#> [1] "u" "w" "x" "y" "z"
intermediates(value)$x
#> [1] 1
# Created on 2022-02-08 by the reprex package (v2.0.1)
Solution 2: Modify the function's code
One weakness of this solution is that if the chosen function features a call to on.exit(add=FALSE), some additional work needs to be done to modify the function so the internal environment is captured. However, it does work when the function accepts ... arguments.
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
insert_capture <- function(code) {
# `<<-` assigns into the global environment if no variable of the given name is found
# while traveling up to the global environment. If you need this assignment to go elsewhere,
# I'd recommend passing in `assign()`. Of course, you could also modify the `on.exit()`
# to use saveRDS.
parse(text=append(deparse(code),
"on.exit(._last_capture <<- environment(), add = TRUE)",
after = 1L))
}
capturing2 <- function(fun) {
fun <- match.fun(fun)
code <- insert_capture(body(fun))
body(fun) <- code
fun
}
my_add2 <- capturing2(my_add)
my_add2(1, 7)
#> [1] 8
ls(envir = ._last_capture)
#> [1] "u" "w" "x" "y" "z"
._last_capture$u
#> [1] -6
Created on 2022-02-08 by the reprex package (v2.0.1)
What you are describing is already implemented in base R with utils::dump.frames, in an even more sophisticated way. It saves the frame (environment) associated with each call in the call stack to an object of class "dump.frames", which you can explore retroactively with utils::debugger as if you had actually run your code under a debugger.
capturing <- function(func, ...) {
cc <- as.call(c(quote(utils::dump.frames), list(...)))
cc <- call("on.exit", cc, add = TRUE)
body(func) <- call("{", cc, body(func))
func
}
capturing injects the call on.exit(utils::dump.frames(...), add = TRUE) into the body of func and returns the modified function.
Here, ... is a list of arguments to dump.frames:
dumpto, a character string giving the name to be used for the "dump.frames" object
to.file, a logical flag indicating whether the "dump.frames" object should be assigned in the global environment or save-ed to paste0(dumpto, ".rda") in the current working directory
include.GlobalEnv, a logical flag indicating whether the global environment should be saved as well
A quick example, which you should try yourself:
tmp <- tempfile()
dir.create(tmp)
cwd <- setwd(tmp)
f <- function(x, y) {
z <- x + y
z + 1
}
g <- capturing(f, dumpto = "zzz", to.file = TRUE)
h <- function(a, b) {
d <- g(a, b)
d + 1
}
h12 <- h(1, 2)
load("zzz.rda")
zzz
## $`h(1, 2)`
## <environment: 0x14c16cb58>
##
## $`#2: g(a, b)`
## <environment: 0x14c16ca40>
##
## attr(,"error.message")
## [1] ""
## attr(,"class")
## [1] "dump.frames"
ls(zzz[[1L]])
## [1] "a" "b"
ls(zzz[[2L]])
## [1] "z" "x" "y"
utils::debugger(zzz)
## Message: Available environments had calls:
## 1: h(1, 2)
## 2: #2: g(a, b)
##
## Enter an environment number, or 0 to exit
## Selection: 2
## Browsing in the environment with call:
## #2: g(a, b)
## Called from: debugger.look(ind)
## Browse[1]> ls()
## [1] "x" "y" "z"
## Browse[1]> x == 1 && y == 2 && z == x + y
## [1] TRUE
## Browse[1]> Q
setwd(cwd)
unlink(tmp, recursive = TRUE)
See ?browser if you are unfamiliar with R's environment browser.
My capturing function has the limitation that on.exit calls in the body of func must also use add = TRUE. If you have written func yourself, then it is not much of a limitation at all, and passing add = TRUE is a good habit anyway.
Ultimately, there is no completely safe way to inject code into functions, but, in an interactive setting, I would say that this level of "unsafety" is fine.
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change a bit the syntax and define the function fetch_name for this purpose, which is able to get the name of the rhs from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b") will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect but would work in non interactive code, as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed, the <- is outside of the scope of your function and could only be located looking at the stack of function calls, but this fail. So a possible solution could be to redefine the '<-' else than as a primitive or, better, to define something that does the same job and additional things too.
I don't know if the ideas behind following code can fit your needs, but you can define a "verbose assignation" :
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situation where the '<-' is not really an assignation :
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
>
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
I have a generic function foo that I want to call three different ways depending on the arguments given to it.
foo <- function(...) UseMethod("foo")
#default
foo.default <- function(x, y, ...) {
#does some magic
print("this is the default method")
}
#formula
foo.formula <- function(formula, data = list(), ...) {
print("this is the formula method")
}
#data.frame
foo.data.frame <- function(data, x, y, ...) {
print("this is the data.frame method")
}
In the following I'm going to show how I am expecting the method dispatch to work but the outputs are presented under each call...
mydata <- data.frame(x=c(1,2,3,4),y=c(5,6,7,8))
#ways to call default function
foo(x = mydata$x, y = mydata$y)
#[1] "this is the default method"
#ways to call formula
foo(formula = mydata$x~mydata$y)
#[1] "this is the formula method"
foo(formula = x~y, data = mydata)
#[1] "this is the formula method"
foo(data = mydata, formula = x~y) #ERROR
#[1] "this is the data.frame method"
#ways to call data.frame method
foo(data = mydata, x = x, y = y)
#[1] "this is the data.frame method"
foo(x = x, y = y, data = mydata) #ERROR
#Error in foo(x = x, y = y, data = mydata) : object 'x' not found
from what I can tell, the method used depends on the class of the first argument. Essentially, I would like for the method dispatch to depend on the arguments passed to the generic function foo and not the first argument.
I would like the dispatch to have the following priority:
If the formula argument is present the formula method is used (data argument should be optional here)
Then, if no formula argument is found, if data argument is present use data.frame method (which requires x and y arguments)
else foo expects the x and y arguments or it will fail.
Note
I would like to avoid defining the generic function foo as follows
foo <- function(formula, data,...) UseMethod("foo")
while this would fix all my issues (I believe all except the last case), this will cause a devtools::check() warning because the some of S3 functions will not have the same arguments as the generic function and will no longer be consistent (specifically foo.default and foo.data.frame). And I wouldn't like to include the missing arguments because those methods do not have use for those arguments.
As Thomas has pointed out, this is not the standard behavior for S3 classes. If you really want to stick to S3, however, you could write your functions so as to "mimick" UseMethod, even though it won't be pretty and is probably not what you want to do. Nevertheless, here an idea that is based on capturing all arguments first, and then checking for the presence of your "preferred" argument type:
Get some objects first:
a <- 1; class(a) <- "Americano"
b <- 2; class(b) <- "Espresso"
Let the function in question capture all arguments with dots, and then check for the presence of an argument type in order of your preference:
drink <- function(...){
dots <- list(...)
if(any(sapply(dots, function(cup) class(cup)=="Americano"))){
drink.Americano(...)
} else { # you can add more checks here to get a hierarchy
# try to find appropriate method first if one exists,
# using the first element of the arguments as usual
tryCatch(get(paste0("drink.", class(dots[[1]])))(),
# if no appropriate method is found, try the default method:
error = function(e) drink.default(...))
}
}
drink.Americano <- function(...) print("Hmm, gimme more!")
drink.Espresso <- function(...) print("Tripple, please!")
drink.default <- function(...) print("Any caffeine in there?")
drink(a) # "Americano", dispatch hard-coded.
# [1] "Hmm, gimme more!"
drink(b) # "Espresso", not hard-coded, but correct dispatch anyway
# [1] "Tripple, please!"
drink("sthelse") # Dispatches to default method
# [1] "Any caffeine in there?"
drink(a,b,"c")
# [1] "Hmm, gimme more!"
drink(b,"c", a)
# [1] "Hmm, gimme more!"
I know about methods(), which returns all methods for a given class. Suppose I have x and I want to know what method will be called when I call foo(x). Is there a oneliner or package that will do this?
The shortest I can think of is:
sapply(class(x), function(y) try(getS3method('foo', y), silent = TRUE))
and then to check the class of the results... but is there not a builtin for this?
Update
The full one liner would be:
fm <- function (x, method) {
cls <- c(class(x), 'default')
results <- lapply(cls, function(y) try(getS3method(method, y), silent = TRUE))
Find(function (x) class(x) != 'try-error', results)
}
This will work with most things but be aware that it might fail with some complex objects. For example, according to ?S3Methods, calling foo on matrix(1:4, 2, 2) would try foo.matrix, then foo.numeric, then foo.default; whereas this code will just look for foo.matrix and foo.default.
findMethod defined below is not a one-liner but its body has only 4 lines of code (and if we required that the generic be passed as a character string it could be reduced to 3 lines of code). It will return a character string representing the name of the method that would be dispatched by the input generic given that generic and its arguments. (Replace the last line of the body of findMethod with get(X(...)) if you want to return the method itself instead.) Internally it creates a generic X and an X method corresponding to each method of the input generic such that each X method returns the name of the method of the input generic that would be run. The X generic and its methods are all created within the findMethod function so they disappear when findMethod exits. To get the result we just run X with the input argument(s) as the final line of the findMethod function body.
findMethod <- function(generic, ...) {
ch <- deparse(substitute(generic))
f <- X <- function(x, ...) UseMethod("X")
for(m in methods(ch)) assign(sub(ch, "X", m, fixed = TRUE), "body<-"(f, value = m))
X(...)
}
Now test it. (Note that the one-liner in the question fails with an error in several of these tests but findMethod gives the expected result.)
findMethod(as.ts, iris)
## [1] "as.ts.default"
findMethod(print, iris)
## [1] "print.data.frame"
findMethod(print, Sys.time())
## [1] "print.POSIXct"
findMethod(print, 22)
## [1] "print.default"
# in this example it looks at 2nd component of class vector as no print.ordered exists
class(ordered(3))
## [1] "ordered" "factor"
findMethod(print, ordered(3))
## [1] "print.factor"
findMethod(`[`, BOD, 1:2, "Time")
## [1] "[.data.frame"
I use this:
s3_method <- function(generic, class, env = parent.frame()) {
fn <- get(generic, envir = env)
ns <- asNamespace(topenv(fn))
tbl <- ns$.__S3MethodsTable__.
for (c in class) {
name <- paste0(generic, ".", c)
if (exists(name, envir = tbl, inherits = FALSE)) {
return(get(name, envir = tbl))
}
if (exists(name, envir = globalenv(), inherits = FALSE)) {
return(get(name, envir = globalenv()))
}
}
NULL
}
For simplicity this doesn't return methods defined by assignment in the calling environment. The global environment is checked for convenience during development. These are the same rules used in r-lib packages.
I try to write a function where I can throw in a abitrary number of objects and get a list of the datatypes of that objects. This is a personal task to learn S3 Generics.
What I have done so far is:
myTypes <- function(x, ...) {
dots <- list(...)
return (as.list(apply(dots, 1, myType)))
}
myType <- function(x){
UseMethod("myType")
}
myType.numeric <- function(x){
if(is.numeric(x)) "Type: numberic"
}
myType.data.frame <- function(x){
if(is.data.frame(x)) "Type: dataframe"
}
The error occurs e.g. when I call
x <- 1
y <- 3
myTypes(x,y)
I always get the error: "Error in apply(dots, 1, myType) : dim(X) must have a positive length" and I am not sure what is wrong. Could anyone help me here? Since I am totally new to R I maybe doing something basically wrong.
The first argument of apply must be a matrix-like object (i.e., a matrix, array or data.frame). Otherwise you get this error:
apply(1, 1, mean)
#Error in apply(1, 1, mean) : dim(X) must have a positive length
You are passing a list to apply, which can't work because you tell apply to apply the function along the first dimension and a list doesn't have dimensions.
You probably want to use lapply and not apply:
myTypes <- function( ...) {
dots <- list(...)
lapply(dots, myType)
}
x <- 1
y <- 3
myTypes(x,y)
#[[1]]
#[1] "Type: numberic"
#
#[[2]]
#[1] "Type: numberic"
Of course, it seems more useful, to simply return the class:
myTypes <- function(...) {
dots <- list(...)
lapply(dots, class)
}
myTypes(x,y)
#[[1]]
#[1] "numeric"
#
#[[2]]
#[1] "numeric"
Btw., if you use S3 method dispatch, you don't have to test the class inside the method because a method is only dispatched if the object has the corresponding class.