I am pretty new to R. I have coded in Python, and OOP here is quite different from Python's. I am trying to understand it: in S3 you can create methods/functions that are not directly attached to a single class, and likewise objects can belong to multiple classes (which is quite flexible, I guess). However, what I do not understand is what happens when I create a generic and a default method such as:
> my_mean <- function (x, ...) {
UseMethod("my_mean", x)}
> my_mean
function (x, ...) {
UseMethod("my_mean", x)}
> my_mean.default <- function(obj){cat("this is a generic function")}
> my_mean.default
function(obj){cat("this is a generic function")}
But then when I print, for example, summary.default and summary:
summary.default
function (object, ..., digits, quantile.type = 7)
{
if (is.factor(object))
return(summary.factor(object, ...))
else if (is.matrix(object)) {
if (missing(digits))
return(summary.matrix(object, quantile.type = quantile.type,
...))
else return(summary.matrix(object, digits = digits, quantile.type = quantile.type,
...))
}
value <- if (is.logical(object))
c(Mode = "logical", {
tb <- table(object, exclude = NULL, useNA = "ifany")
if (!is.null(n <- dimnames(tb)[[1L]]) && any(iN <- is.na(n))) dimnames(tb)[[1L]][iN] <- "NA's"
tb
})
else if (is.numeric(object)) {
nas <- is.na(object)
object <- object[!nas]
qq <- stats::quantile(object, names = FALSE, type = quantile.type)
qq <- c(qq[1L:3L], mean(object), qq[4L:5L])
if (!missing(digits))
qq <- signif(qq, digits)
names(qq) <- c("Min.", "1st Qu.", "Median",
"Mean", "3rd Qu.", "Max.")
if (any(nas))
c(qq, `NA's` = sum(nas))
else qq
}
else if (is.recursive(object) && !is.language(object) &&
(n <- length(object))) {
sumry <- array("", c(n, 3L), list(names(object),
c("Length", "Class", "Mode")))
ll <- numeric(n)
for (i in 1L:n) {
ii <- object[[i]]
ll[i] <- length(ii)
cls <- oldClass(ii)
sumry[i, 2L] <- if (length(cls))
cls[1L]
else "-none-"
sumry[i, 3L] <- mode(ii)
}
sumry[, 1L] <- format(as.integer(ll))
sumry
}
else c(Length = length(object), Class = class(object), Mode = mode(object))
class(value) <- c("summaryDefault", "table")
value
}
<bytecode: 0x000001926eaaf8f8>
<environment: namespace:base>
> summary
function (object, ...)
UseMethod("summary")
<bytecode: 0x000001926e9ec2c0>
<environment: namespace:base>
I cannot see why printing summary at the console does not show the actual code, only a UseMethod() call that refers to it. Is there any explanation? Furthermore, is it generic in some way similar to init?
S3 classes work nothing like the OOP you may be familiar with from other languages. They are a loosely connected set of mechanisms that only work when you stick to certain rules.
x <- 1:11
mean(x)
#> [1] 6
This implicitly calls the function mean.default because x is a simple atomic vector.
Now we create a method for our own class evil:
mean.evil <- function(x, ...) {
return(666) # always returns 666, which is why it is evil
}
And we convert the vector x to a class evil:
class(x) <- "evil" # you can actually do it just like that
Now, calling mean determines that x is of class evil and calls the corresponding function.
mean(x) # calls mean.evil
#> [1] 666
mean.default(x) # forces R to use the default method, which is still possible
#> [1] 6
The reason is that mean uses UseMethod(), which checks the class and looks for a function whose name matches the pattern mean.[myclass]. And that is all that happens.
mean
#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x0000000015812e18>
#> <environment: namespace:base>
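The same lookup can be reproduced with a user-defined generic. This is only a minimal sketch: the names describe, describe.default and describe.evil are made up for illustration. It also suggests why printing a generic such as mean or summary only shows a UseMethod() call: the generic is just the dispatcher, and the real work lives in the methods.

```r
# Hypothetical generic: its only job is to dispatch on the class of x
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no class-specific method is found
describe.default <- function(x, ...) "default method"

# Method for the made-up class "evil"
describe.evil <- function(x, ...) "evil method"

x <- 1:3
res_plain <- describe(x)        # no integer/numeric method, so describe.default runs

class(x) <- c("good", "evil")   # an object can carry several classes
res_evil <- describe(x)         # no describe.good, so describe.evil runs

print(c(res_plain, res_evil))
```

The class vector is searched from left to right, so a describe.good method, if it existed, would take precedence over describe.evil.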
In other languages everything is held together by the syntax. S3 mechanisms, on the other hand, can be used to "approximate" OOP, but they can easily be misused. They are simple, effective, and appropriate for many use cases in R. If you are interested in more advanced OOP in R, I recommend R6 classes.
Created on 2020-06-30 by the reprex package (v0.3.0)
Given a regular R function f, I'd like to be able to create a new function f_debug that acts just like f, but lets me keep track of all the assignments to function-local variables that happened inside it.
For example:
f <- function(x, y) {
z <- x + y
df <- data.frame(z=z)
df
}
# This function doesn't work as intended - would like it to (in the case of `f` above)
# write out a list containing `z` and `df` to an RDS file
capturing <- function(func) {
e <- new.env()
altered <- function(...) {
parent <- parent.frame()
e <- something...(func, environment(), parent, etc., etc.)
result <- func(...)
saveRDS(as.list(e), 'foo.rds')
result
}
environment(func) <- e
altered
}
f_debug <- capturing(f)
I'm not sure whether the knowledge gap I need to close for this is large or small. Does anyone have a solution?
Solution 1: Steal the function's code
Here's a solution which doesn't return a new function that captures intermediate calculations, but rather evaluates the given function's code directly. There are some limitations, such as it probably only works with named arguments. Instead of storing the intermediate calculations as an RDS, it attaches them as an attribute.
capturing <- function(fun, ...) {
fun <- match.fun(fun)
code <- body(fun)
parent <- environment(fun)
env <- new.env(parent = parent)
for (val in names(list(...))) {
env[[val]] <- list(...)[[val]]
}
result <- eval(code, envir = env, enclos = parent.frame())
attr(result, "intermediate") <- env
result
}
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
intermediates <- function(x) {
attr(x, "intermediate", exact = TRUE)
}
value <- capturing(my_add, x = 1, y = 7)
ls(envir = intermediates(value))
#> [1] "u" "w" "x" "y" "z"
intermediates(value)$x
#> [1] 1
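If the goal is an RDS file rather than an attribute, the same idea can return the captured variables as a plain list and save that. The sketch below is a variant under the same assumptions (named arguments only); capture_locals is a hypothetical helper, not part of the answer above.

```r
# Hypothetical helper: run fun's body in a fresh environment and return
# both the result and every variable that ended up in that environment
capture_locals <- function(fun, ...) {
  env <- list2env(list(...), parent = environment(fun))
  result <- eval(body(fun), envir = env)
  list(result = result, locals = as.list(env))
}

f <- function(x, y) {
  z <- x + y
  df <- data.frame(z = z)
  df
}

out <- capture_locals(f, x = 1, y = 2)

# Round-trip the captured variables through an RDS file in a temporary location
path <- tempfile(fileext = ".rds")
saveRDS(out$locals, path)
saved <- readRDS(path)
print(sort(names(saved)))
unlink(path)
```

A list is saved instead of the environment itself so that the file contains only the values.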
# Created on 2022-02-08 by the reprex package (v2.0.1)
Solution 2: Modify the function's code
One weakness of this solution is that if the chosen function features a call to on.exit(add=FALSE), some additional work needs to be done to modify the function so the internal environment is captured. However, it does work when the function accepts ... arguments.
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
insert_capture <- function(code) {
# `<<-` assigns into the global environment if no variable of the given name is found
# while traveling up to the global environment. If you need this assignment to go elsewhere,
# I'd recommend passing in `assign()`. Of course, you could also modify the `on.exit()`
# to use saveRDS.
parse(text=append(deparse(code),
"on.exit(._last_capture <<- environment(), add = TRUE)",
after = 1L))
}
capturing2 <- function(fun) {
fun <- match.fun(fun)
code <- insert_capture(body(fun))
body(fun) <- code
fun
}
my_add2 <- capturing2(my_add)
my_add2(1, 7)
#> [1] 8
ls(envir = ._last_capture)
#> [1] "u" "w" "x" "y" "z"
._last_capture$u
#> [1] -6
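The comment in insert_capture relies on how `<<-` searches enclosing environments. The following sketch, with made-up names (make_counter, set_flag, .demo_flag), shows both cases: an existing variable in an enclosing function is updated in place, and a name that is found nowhere ends up in the global environment.

```r
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1   # count exists in the enclosing function, so it is updated there
    count
  }
}
counter <- make_counter()
first <- counter()
second <- counter()

set_flag <- function() {
  .demo_flag <<- TRUE     # no enclosing .demo_flag, so it is created in the global environment
  invisible(NULL)
}
set_flag()
found <- exists(".demo_flag", envir = globalenv(), inherits = FALSE)
print(c(first, second))
print(found)
```

This is why ._last_capture shows up globally after calling my_add2(), unless an enclosing environment already defines a variable of that name.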
Created on 2022-02-08 by the reprex package (v2.0.1)
What you are describing is already implemented in base R with utils::dump.frames, in an even more sophisticated way. It saves the frame (environment) associated with each call in the call stack to an object of class "dump.frames", which you can explore retroactively with utils::debugger as if you had actually run your code under a debugger.
capturing <- function(func, ...) {
cc <- as.call(c(quote(utils::dump.frames), list(...)))
cc <- call("on.exit", cc, add = TRUE)
body(func) <- call("{", cc, body(func))
func
}
capturing injects the call on.exit(utils::dump.frames(...), add = TRUE) into the body of func and returns the modified function.
Here, ... is a list of arguments to dump.frames:
dumpto, a character string giving the name to be used for the "dump.frames" object
to.file, a logical flag indicating whether the "dump.frames" object should be assigned in the global environment or save-ed to paste0(dumpto, ".rda") in the current working directory
include.GlobalEnv, a logical flag indicating whether the global environment should be saved as well
A quick example, which you should try yourself:
tmp <- tempfile()
dir.create(tmp)
cwd <- setwd(tmp)
f <- function(x, y) {
z <- x + y
z + 1
}
g <- capturing(f, dumpto = "zzz", to.file = TRUE)
h <- function(a, b) {
d <- g(a, b)
d + 1
}
h12 <- h(1, 2)
load("zzz.rda")
zzz
## $`h(1, 2)`
## <environment: 0x14c16cb58>
##
## $`#2: g(a, b)`
## <environment: 0x14c16ca40>
##
## attr(,"error.message")
## [1] ""
## attr(,"class")
## [1] "dump.frames"
ls(zzz[[1L]])
## [1] "a" "b"
ls(zzz[[2L]])
## [1] "z" "x" "y"
utils::debugger(zzz)
## Message: Available environments had calls:
## 1: h(1, 2)
## 2: #2: g(a, b)
##
## Enter an environment number, or 0 to exit
## Selection: 2
## Browsing in the environment with call:
## #2: g(a, b)
## Called from: debugger.look(ind)
## Browse[1]> ls()
## [1] "x" "y" "z"
## Browse[1]> x == 1 && y == 2 && z == x + y
## [1] TRUE
## Browse[1]> Q
setwd(cwd)
unlink(tmp, recursive = TRUE)
See ?browser if you are unfamiliar with R's environment browser.
My capturing function has the limitation that on.exit calls in the body of func must also use add = TRUE. If you have written func yourself, then it is not much of a limitation at all, and passing add = TRUE is a good habit anyway.
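The add = TRUE limitation can be seen in isolation with a small sketch. The functions replaces and appends and the events vector are made up for illustration; the first on.exit() call in each stands in for the injected one.

```r
events <- character()

# The function's own on.exit() uses the default add = FALSE,
# so it replaces the handler registered before it
replaces <- function() {
  on.exit(events <<- c(events, "injected"), add = TRUE)
  on.exit(events <<- c(events, "own"))
  invisible(NULL)
}

# With add = TRUE both handlers run, in registration order
appends <- function() {
  on.exit(events <<- c(events, "injected"), add = TRUE)
  on.exit(events <<- c(events, "own"), add = TRUE)
  invisible(NULL)
}

replaces()
after_replace <- events

events <- character()
appends()
after_append <- events

print(after_replace)
print(after_append)
```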
Ultimately, there is no completely safe way to inject code into functions, but, in an interactive setting, I would say that this level of "unsafety" is fine.
Given a call to a function bar::foo(), I would like to be able to programmatically switch the package bar so that the same syntax calls hello::foo().
An example:
Let's say I have three packages, parentPkg, childPkg1 and childPkg2.
In parentPkg I have a call to function childPkg1::foo()
foo() is also a function in childPkg2
I would like to be able, in parentPkg, to use the :: operator to call foo() but to programmatically switch the package name.
Something like:
dummy_pkg_name = ifelse(scenario=="child1", "childPkg1", "childPkg2")
dummy_pkg_name::foo()
Is it possible? How do I achieve it?
Some context
parentPkg is a package that interacts with a web application, takes some request and data, and returns results from different statistical models depending on the scenario.
Each scenario is quite complex and not everything can be generalised in parentPkg. For this reason, childPkg1 and childPkg2 (actually there are also 3 and 4) are sort of sub-packages that deal with the data cleaning and various alternatives for each scenario but return the same class of value.
The idea is that parentPkg would switch the package to the pertinent child depending on the scenario and call all of the necessary functions without having to write the same sequence for each child but just with a slightly different :: call.
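Since :: can be seen as a function, it looks like
`::`(dummy_pkg_name, foo)()
is what you want. However, :: quotes both of its arguments, so this form looks for a package literally called dummy_pkg_name. When the package name is stored in a variable, use the function that :: calls internally,
getExportedValue(dummy_pkg_name, "foo")()
or, alternatively,
getFromNamespace("foo", ns = dummy_pkg_name)()
With literal names, the function form of :: and getFromNamespace return the same object. For instance,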
`::`(stats, t.test)
# function (x, ...)
# UseMethod("t.test")
# <bytecode: 0x102fd4b00>
# <environment: namespace:stats>
getFromNamespace("t.test", ns = "stats")
# function (x, ...)
# UseMethod("t.test")
# <bytecode: 0x102fd4b00>
# <environment: namespace:stats>
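When the package name is held in a variable, it helps to remember that `::` quotes its arguments, whereas getExportedValue() evaluates them. The sketch below uses stats and median as stand-ins for childPkg1 and foo, so that it runs without the hypothetical packages from the question.

```r
# Stand-in for a name that would be chosen from the scenario
pkg_name <- "stats"

# `::` quotes its arguments, so this looks for a package called "pkg_name"
literal <- tryCatch(
  `::`(pkg_name, median),
  error = function(e) conditionMessage(e)
)

# getExportedValue() evaluates its arguments, so a variable works
med <- getExportedValue(pkg_name, "median")
print(med(c(1, 5, 9)))
```

getFromNamespace() behaves the same way with a variable, but it also returns objects that the package does not export.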
To adhere to KISS, simply assign the functions to new names in the global environment. Be sure to leave out () since you are not calling the functions.
parent_foo <- parentPkg::foo
child1_foo <- childPkg1::foo
child2_foo <- childPkg2::foo
child3_foo <- childPkg3::foo
Then, conditionally apply them as needed:
if (scenario=="child1") {
obj <- child1_foo(...)
} else if (scenario=="child2") {
obj <- child2_foo(...)
}
...
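With more than a couple of scenarios, the if/else chain can be replaced by a named list that maps each scenario to its function. This is a sketch only: stats::median and base::mean stand in for childPkg1::foo and childPkg2::foo, and handlers is a made-up name.

```r
# Stand-in functions; in the question these would be the foo of each child package
handlers <- list(
  child1 = stats::median,
  child2 = base::mean
)

scenario <- "child2"
foo <- handlers[[scenario]]
if (is.null(foo)) stop("unknown scenario: ", scenario)

obj <- foo(c(1, 2, 6))
print(obj)
```

Adding a scenario then means adding one entry to the list rather than another branch.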
You could also create a call() that could then be evaluated.
call("::", quote(bar), quote(foo()))
# bar::foo()
Put into use:
c <- call("::", quote(stats), quote(t.test))
eval(c)
# function (x, ...)
# UseMethod("t.test")
# <bytecode: 0x4340988>
# <environment: namespace:stats>
Wrapped up in a function using setdiff as our default function:
f <- function(pkg, fn = setdiff) {
pkg <- substitute(pkg)
fn <- substitute(fn)
eval(call("::", pkg, fn))
}
f(base)
# function (x, y)
# {
# x <- as.vector(x)
# y <- as.vector(y)
# unique(if (length(x) || length(y))
# x[match(x, y, 0L) == 0L]
# else x)
# }
# <bytecode: 0x30f1ea8>
# <environment: namespace:base>
f(dplyr)
# function (x, y, ...)
# UseMethod("setdiff")
# <environment: namespace:dplyr>
I want to implement an inset method for my class myClass for the internal generic [<- (~ help(Extract)).
This method should run a bunch of tests before passing the actual insetting on to [<- via NextMethod().
I understand that:
any method has to include at least the arguments of the generic (mine does, I think)
the NextMethod() call does not usually need any arguments (though supplying them manually doesn't seem to help either).
Here's my reprex:
x <- c(1,2)
class(x) <- c("myClass", "numeric")
`[<-.myClass` <- function(x, i, j, value, foo = TRUE, ...) {
if (foo) {
stop("'foo' must be false!")
}
NextMethod()
}
x[1] <- 3 # this errors out with *expected* error message, so dispatch works
x[1, foo = FALSE] <- 3 # this fails with "incorrect number of subscripts"
What seems to be happening is that NextMethod() also passes on foo to the internal generic [<-, which mistakes foo for another index, and, consequently errors out (because, in this case, x has no second dimension to index on).
I also tried supplying the arguments explicitly to NextMethod(), but this also fails (see reprex below the break).
How can I avoid choking up NextMethod() with additional arguments to my method?
(Bonus: Does anyone know good resources for building methods for internal generics? Hadley's adv-r is a bit short on the matter.)
Reprex with explicit arguments:
x <- c(1,2)
class(x) <- c("myClass", "numeric")
`[<-.myClass` <- function(x, i = NULL, j = NULL, value, foo = TRUE, ...) {
if (foo) {
stop("'foo' must be false!")
}
NextMethod(generic = "`[<-`", object = x, i = i, j = j, value = value, ...)
}
x[1] <- 3 # this errors out with expected error message, so dispatch works
x[1, foo = FALSE] <- 3 # this fails with "incorrect number of subscripts"
I don't see an easy way around this except to strip the class (which makes a copy of x)
`[<-.myClass` <- function(x, i, value, ..., foo = TRUE) {
if (foo) {
cat("hi!")
x
} else {
class_x <- class(x)
x <- unclass(x)
x[i] <- value
class(x) <- class_x
x
}
}
x <- structure(1:2, class = "myClass")
x[1] <- 3
#> hi!
x[1, foo = FALSE] <- 3
x
#> [1] 3 2
#> attr(,"class")
#> [1] "myClass"
This is not a general approach - it's only needed for [, [<-, etc because they don't use the regular rules for argument matching:
Note that these operations do not match their index arguments in the standard way: argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is equivalent to m[2, 1] and not to m[1, 2].
(from the "Argument matching" section in ?`[`)
That means your x[1, foo = FALSE] is equivalent to x[1, FALSE] and then you get an error message because x is not a matrix.
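The positional matching quoted from the help page is easy to check on a small matrix, and the same rule explains the error on a plain vector. A minimal sketch:

```r
m <- matrix(1:6, nrow = 2)

by_name <- m[j = 2, i = 1]   # names are ignored: row 2, column 1
by_pos  <- m[2, 1]
swapped <- m[1, 2]

# On a vector, the named argument is treated as a second index
x <- c(1, 2)
err <- tryCatch(x[1, foo = FALSE], error = function(e) conditionMessage(e))

print(c(by_name, by_pos, swapped))
print(err)
```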
Approaches that don't work:
Supplying additional arguments to NextMethod(): this can only increase the number of arguments, not decrease it
Unbinding foo with rm(foo): this leads to an error about undefined foo.
Replacing foo with a missing symbol: this leads to an error that foo is not supplied with no default argument.
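For contrast, when the method takes no extra arguments, NextMethod() does fall through to the internal default without trouble, so checks that depend only on x, i and value can stay in the method. A sketch, with a made-up class name checked:

```r
`[<-.checked` <- function(x, i, value) {
  if (!is.numeric(value)) stop("'value' must be numeric")
  NextMethod()   # hands over to the internal default, class attribute is kept
}

x <- structure(c(1, 2), class = "checked")
x[1] <- 3
print(unclass(x))

# The check still guards every assignment
err <- tryCatch({ x[2] <- "a"; NULL }, error = function(e) conditionMessage(e))
print(err)
```

This does not solve the problem of passing a flag such as foo through the call; it only shows that the difficulty comes from the extra argument, not from NextMethod() itself.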
Here's how I understand it, but I don't know so much about that subject so I hope I don't say too many wrong things.
From ?NextMethod
NextMethod invokes the next method (determined by the class vector,
either of the object supplied to the generic, or of the first argument
to the function containing NextMethod if a method was invoked
directly).
Your class vector is :
x <- c(1,2)
class(x) <- "myClass" # note: you might want class(x) <- c("myClass", class(x))
class(x) # [1] "myClass"
So you have no "next method" here, and [<-.default doesn't exist.
What would happen if we define it ?
`[<-.default` <- function(x, i, j, value, ...) {print("default"); value}
x[1, foo = FALSE] <- 3
# [1] "default"
x
# [1] 3
If there was a default method with a ... argument it would work fine as the foo argument would go there, but it's not the case so I believe NextMethod just cannot be called as is.
You could do the following to hack around the fact that whatever is called doesn't like to be fed a foo argument:
`[<-.myClass` <- function(x, i, j, value, foo = FALSE, ...) {
if (foo) {
stop("'foo' must be false!")
}
`[<-.myClass` <- function(x, i, j, value, ...) NextMethod()
args <- as.list(match.call())[-1]
args <- args[names(args) %in% c("","x","i","j","value")]
do.call("[<-",args)
}
x[1, foo = FALSE] <- 3
x
# [1] 3 2
# attr(,"class")
# [1] "myClass"
Another example, with a more complex class :
library(data.table)
x <- as.data.table(iris[1:2,1:2])
class(x) <- c("myClass",class(x))
x[1, 2, foo = FALSE] <- 9999
# Sepal.Length Sepal.Width
# 1: 5.1 9999
# 2: 4.9 3
class(x)
# [1] "myClass" "data.table" "data.frame"
This would fail if the next method had other arguments than x, i, j and value, in that case better to be explicit about our additional arguments and run args <- args[! names(args) %in% c("foo","bar")]. Then it might work (as long as arguments are given explicitly as match.call doesn't catch default arguments). I couldn't test this though as I don't know such method for [<-.
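The remark that match.call() doesn't catch default arguments can be checked directly. In this sketch show_args is a made-up function that only reports which arguments appear in the matched call:

```r
show_args <- function(x, i, foo = FALSE, ...) {
  names(as.list(match.call())[-1])
}

implicit <- show_args(10, 1)               # foo takes its default
explicit <- show_args(10, 1, foo = FALSE)  # foo is supplied by the caller

print(implicit)
print(explicit)
```

Only arguments written out by the caller show up, which is why filtering on names(args) works for supplied arguments but never sees defaults.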
I want to avoid using parse() in a function definition that contains a polynomial().
My polynomial is this:
library(polynom)
polynomial(c(1, 2))
# 1 + 2*x
I want to create a function which uses this polynomial expression as in:
my.function <- function(x) magic(polynomial(c(1, 2)))
where for magic(), I have tried various combinations of expression(), formula(), eval(), as.character(), etc... but nothing seems to work.
My only working solution is using eval(parse()):
eval(parse(text = paste0('poly_function <- function(x) ', polynomial(c(1, 2)))))
poly_function(x = 10)
# 21
Is there a better way to do what I want? Can I avoid the eval(parse())?
Like you, I thought that the polynomial function was returning an R expression, but we were both wrong. Reading the help index for the polynom package would have helped us both:
pol <- polynomial(c(1, 2))
str(pol)
#Class 'polynomial' num [1:2] 1 2
help(package = "polynom")
So user20650 is correct and:
> poly_function <- as.function(pol)
> poly_function(10)
[1] 21
So this is how the authors (Venables, Hornik, Maechler) do it:
> getAnywhere(as.function.polynomial)
A single object matching ‘as.function.polynomial’ was found
It was found in the following places
registered S3 method for as.function from namespace polynom
namespace:polynom
with value
function (x, ...)
{
a <- rev(coef(x))
w <- as.name("w")
v <- as.name("x")
ex <- call("{", call("<-", w, 0))
for (i in seq_along(a)) {
ex[[i + 2]] <- call("<-", w, call("+", a[1], call("*",
v, w)))
a <- a[-1]
}
ex[[length(ex) + 1]] <- w
f <- function(x) NULL
body(f) <- ex
f
}
<environment: namespace:polynom>
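The function built above evaluates the polynomial with Horner's scheme: starting from w = 0, it repeatedly computes w <- a + x * w over the coefficients in reverse order. The same computation can be sketched as a plain closure without constructing calls; horner is a made-up name and this is not the package's implementation.

```r
# coefs are in increasing order of power, as in polynomial(c(1, 2)) for 1 + 2*x
horner <- function(coefs) {
  function(x) {
    w <- 0
    for (a in rev(coefs)) w <- a + x * w
    w
  }
}

p <- horner(c(1, 2))
print(p(10))
```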
Since you mention in your comments that getAnywhere was new to you, it may also help to review the steps that lead up to using it. If you type a function name at the console prompt, you get its code, in this case:
> as.function
function (x, ...)
UseMethod("as.function")
<bytecode: 0x7f978bff5fc8>
<environment: namespace:base>
Which is rather unhelpful until you follow it up with:
> methods(as.function)
[1] as.function.default as.function.polynomial*
see '?methods' for accessing help and source code
The asterisk at the end of the polynomial version tells you that the method is not "exported", i.e. it is not available at the console just by typing its name. So you need to pry it out of a loaded namespace with getAnywhere.
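When the generic and class are already known, utils::getS3method() is a more targeted alternative to getAnywhere. The sketch below uses print and lm from base R as stand-ins, on the assumption that print.lm is registered by stats but not exported; no_such_class is a made-up class name.

```r
# Fetch a registered S3 method by generic and class
m <- getS3method("print", "lm")
print(is.function(m))
print(environmentName(environment(m)))

# optional = TRUE returns NULL instead of raising an error when nothing is registered
missing_method <- getS3method("print", "no_such_class", optional = TRUE)
print(is.null(missing_method))
```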
It seems like you could easily write your own function too
poly_function = function(x, p){
sum(sapply(1:length(p), function(i) p[i]*x^(i-1)))
}
# As 42- mentioned in comment to this answer,
# it appears that p can be either a vector or a polynomial
pol = polynomial(c(1, 2))
poly_function(x = 10, p = pol)
#[1] 21
#OR
poly_function(x = 10, p = c(1,2))
#[1] 21
I know about methods(), which returns all methods for a given class. Suppose I have x and I want to know what method will be called when I call foo(x). Is there a oneliner or package that will do this?
The shortest I can think of is:
sapply(class(x), function(y) try(getS3method('foo', y), silent = TRUE))
and then to check the class of the results... but is there not a builtin for this?
Update
The full one liner would be:
fm <- function (x, method) {
cls <- c(class(x), 'default')
results <- lapply(cls, function(y) try(getS3method(method, y), silent = TRUE))
Find(function (x) class(x) != 'try-error', results)
}
This will work with most things but be aware that it might fail with some complex objects. For example, according to ?S3Methods, calling foo on matrix(1:4, 2, 2) would try foo.matrix, then foo.numeric, then foo.default; whereas this code will just look for foo.matrix and foo.default.
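The matrix caveat can be reproduced with a throwaway generic. In this sketch the generic is called foo as in the question; it has a numeric and a default method but no matrix method.

```r
foo <- function(x) UseMethod("foo")
foo.numeric <- function(x) "foo.numeric"
foo.default <- function(x) "foo.default"

m <- matrix(1:4, 2, 2)
cls <- class(m)      # does not contain "numeric"
dispatched <- foo(m) # dispatch still reaches foo.numeric via the implicit class

print(cls)
print(dispatched)
```

A lookup based only on class(x) plus "default" would therefore report foo.default here, while the actual call runs foo.numeric.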
findMethod defined below is not a one-liner but its body has only 4 lines of code (and if we required that the generic be passed as a character string it could be reduced to 3 lines of code). It will return a character string representing the name of the method that would be dispatched by the input generic given that generic and its arguments. (Replace the last line of the body of findMethod with get(X(...)) if you want to return the method itself instead.) Internally it creates a generic X and an X method corresponding to each method of the input generic such that each X method returns the name of the method of the input generic that would be run. The X generic and its methods are all created within the findMethod function so they disappear when findMethod exits. To get the result we just run X with the input argument(s) as the final line of the findMethod function body.
findMethod <- function(generic, ...) {
ch <- deparse(substitute(generic))
f <- X <- function(x, ...) UseMethod("X")
for(m in methods(ch)) assign(sub(ch, "X", m, fixed = TRUE), "body<-"(f, value = m))
X(...)
}
Now test it. (Note that the one-liner in the question fails with an error in several of these tests but findMethod gives the expected result.)
findMethod(as.ts, iris)
## [1] "as.ts.default"
findMethod(print, iris)
## [1] "print.data.frame"
findMethod(print, Sys.time())
## [1] "print.POSIXct"
findMethod(print, 22)
## [1] "print.default"
# in this example it looks at 2nd component of class vector as no print.ordered exists
class(ordered(3))
## [1] "ordered" "factor"
findMethod(print, ordered(3))
## [1] "print.factor"
findMethod(`[`, BOD, 1:2, "Time")
## [1] "[.data.frame"
I use this:
s3_method <- function(generic, class, env = parent.frame()) {
fn <- get(generic, envir = env)
ns <- asNamespace(topenv(fn))
tbl <- ns$.__S3MethodsTable__.
for (c in class) {
name <- paste0(generic, ".", c)
if (exists(name, envir = tbl, inherits = FALSE)) {
return(get(name, envir = tbl))
}
if (exists(name, envir = globalenv(), inherits = FALSE)) {
return(get(name, envir = globalenv()))
}
}
NULL
}
For simplicity this doesn't return methods defined by assignment in the calling environment. The global environment is checked for convenience during development. These are the same rules used in r-lib packages.