Generic "classes" inside R - r

I have written a stack "class" with the following functions: add, push, pop, size, isEmpty, clear (and some more).
I'd like to use this "class" as a generic in R, so I may create multiple instances of stacks within my script. How do I go about doing this?
(I have class in quotes because my stack functions are written in a different script (not necessarily the definition of a class per se)
Thanks in advance
list <- ""
cursor = 0
#Initializes stack to empty
stack <- function(){
list <- c()
cursor = -1
assign("list",list,.GlobalEnv)
assign("cursor",cursor,.GlobalEnv)
}
#Where item is a item to be added to generic list
push <- function(item){
if(size(list) == 0){
add(item, -1)
}else{
add(item, 0)
}
assign("list",list,.GlobalEnv)
}

This is a simpler version of the stack implementation #GSee references that avoids using any of the formal object-orientation systems available in R. Simplification proceeds from the fact that all functions in R are closures and functions created during a function call are bound to the environment created for that call.
new_stack <- function() {
stack <- vector()
push <- function(x) stack <<- c(stack, x)
pop <- function() {
tmp<-tail(stack, 1)
stack<<-stack[-length(stack)]
return(tmp)
}
structure(list(pop=pop, push=push), class='stack')
}
x <- new_stack()
x$push(1:3)
x$pop()
# [1] 3
x$pop()
# [1] 2
Here's an S4 implementation, for comparison.
setClass('Stack',
representation(list='list', cursor='numeric'), # type defs
prototype(list=list(), cursor=NA_real_)) # default values
setGeneric('push', function(obj, ...) standardGeneric('push'))
setMethod('push', signature(obj='Stack'),
function(obj, x) {
obj#list <- c(x, obj#list)
obj
})
setGeneric('pop', function(obj, ...) standardGeneric('pop'))
setMethod('pop', signature(obj='Stack'),
function(obj) {
obj#cursor <- obj#list[[1]]
obj#list <- obj#list[-1]
obj
}
)
x <- new('Stack')
# cursor is empty to start
x#cursor
#[1] NA
# add items
x <- push(x, 1)
x <- push(x, 2)
# pop them (move next item to cursor, remove from list)
x <- pop(x)
x#cursor
# [1] 2
x <- pop(x)
x#cursor
# [1] 1

Since you are specifically talking about a stack "class" with push and pop methods, here's an implementation by Jeff Ryan taken from Introducing Closures which you can read for an explanation of what's going on here.
new_stack <- function() {
stack <- new.env()
stack$.Data <- vector()
stack$push <- function(x) .Data <<- c(.Data,x)
stack$pop <- function() {
tmp <- .Data[length(.Data)]
.Data <<- .Data[-length(.Data)]
return(tmp)
}
environment(stack$push) <- as.environment(stack)
environment(stack$pop) <- as.environment(stack)
class(stack) <- "stack"
stack
}
> x <- new_stack()
> x$push(1:3)
> x$pop()
[1] 3
> x$pop()
[1] 2
Then, if you create S3 generics...
push <- function(x, value, ...) UseMethod("push")
pop <- function(x, ...) UseMethod("pop")
push.stack <- function(x, value, ...) x$push(value)
pop.stack <- function(x) x$pop()
> push(x, 5)
> pop(x)
[1] 5
> pop(x)
[1] 1

Related

R: instrument function to capture all assignments

Given a regular R function f, I'd like to be able to create a new function f_debug that acts just like f, but lets me keep track of all the assignments to function-local variables that happened inside it.
For example:
f <- function(x, y) {
z <- x + y
df <- data.frame(z=z)
df
}
# This function doesn't work as intended - would like it to (in the case of `f` above)
# write out a list containing `z` and `df` to an RDS file
capturing <- function(func) {
e <- new.env()
altered <- function(...) {
parent <- parent.frame()
e <- something...(func, environment(), parent, etc., etc.)
result <- func(...)
saveRDS(as.list(e), 'foo.rds')
result
}
environment(func) <- e
altered
}
f_debug <- capturing(f)
I'm not sure whether my knowledge gap to do this is large or small, anyone have a solution?
Solution 1: Steal the function's code
Here's a solution which doesn't return a new function which captures intermediate calculations, but rather calls the given function's code internally. There's some limitations, such as it probably only works with named arguments. Instead of storing the intermediate calculations as an RDS, it attaches them as an attribute.
capturing <- function(fun, ...) {
fun <- match.fun(fun)
code <- body(fun)
parent <- environment(fun)
env <- new.env(parent = parent)
for (val in names(list(...))) {
env[[val]] <- list(...)[[val]]
}
result <- eval(code, envir = env, enclos = parent.frame())
attr(result, "intermediate") <- env
result
}
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
intermediates <- function(x) {
attr(x, "intermediate", exact = TRUE)
}
value <- capturing(my_add, x = 1, y = 7)
ls(envir = intermediates(value))
#> [1] "u" "w" "x" "y" "z"
intermediates(value)$x
#> [1] 1
# Created on 2022-02-08 by the reprex package (v2.0.1)
Solution 2: Modify the function's code
One weakness of this solution is that if the chosen function features a call to on.exit(add=FALSE), some additional work needs to be done to modify the function so the internal environment is captured. However, it does work when the function accepts ... arguments.
my_add <- function(x, y) {
z <- x+y
u <- x-y
w <- x*y
x + y
}
insert_capture <- function(code) {
# `<<-` assigns into the global environment if no variable of the given name is found
# while traveling up to the global environment. If you need this assignment to go elsewhere,
# I'd recommend passing in `assign()`. Of course, you could also modify the `on.exit()`
# to use saveRDS.
parse(text=append(deparse(code),
"on.exit(._last_capture <<- environment(), add = TRUE)",
after = 1L))
}
capturing2 <- function(fun) {
fun <- match.fun(fun)
code <- insert_capture(body(fun))
body(fun) <- code
fun
}
my_add2 <- capturing2(my_add)
my_add2(1, 7)
#> [1] 8
ls(envir = ._last_capture)
#> [1] "u" "w" "x" "y" "z"
._last_capture$u
#> [1] -6
Created on 2022-02-08 by the reprex package (v2.0.1)
What you are describing is already implemented in base R with utils::dump.frames, in an even more sophisticated way. It saves the frame (environment) associated with each call in the call stack to an object of class "dump.frames", which you can explore retroactively with utils::debugger as if you had actually run your code under a debugger.
capturing <- function(func, ...) {
cc <- as.call(c(quote(utils::dump.frames), list(...)))
cc <- call("on.exit", cc, add = TRUE)
body(func) <- call("{", cc, body(func))
func
}
capturing injects the call on.exit(utils::dump.frames(...), add = TRUE) into the body of func and returns the modified function.
Here, ... is a list of arguments to dump.frames:
dumpto, a character string giving the name to be used for the "dump.frames" object
to.file, a logical flag indicating whether the "dump.frames" object should be assigned in the global environment or save-ed to paste0(dumpto, ".rda") in the current working directory
include.GlobalEnv, a logical flag indicating whether the global environment should be saved as well
A quick example, which you should try yourself:
tmp <- tempfile()
dir.create(tmp)
cwd <- setwd(tmp)
f <- function(x, y) {
z <- x + y
z + 1
}
g <- capturing(f, dumpto = "zzz", to.file = TRUE)
h <- function(a, b) {
d <- g(a, b)
d + 1
}
h12 <- h(1, 2)
load("zzz.rda")
zzz
## $`h(1, 2)`
## <environment: 0x14c16cb58>
##
## $`#2: g(a, b)`
## <environment: 0x14c16ca40>
##
## attr(,"error.message")
## [1] ""
## attr(,"class")
## [1] "dump.frames"
ls(zzz[[1L]])
## [1] "a" "b"
ls(zzz[[2L]])
## [1] "z" "x" "y"
utils::debugger(zzz)
## Message: Available environments had calls:
## 1: h(1, 2)
## 2: #2: g(a, b)
##
## Enter an environment number, or 0 to exit
## Selection: 2
## Browsing in the environment with call:
## #2: g(a, b)
## Called from: debugger.look(ind)
## Browse[1]> ls()
## [1] "x" "y" "z"
## Browse[1]> x == 1 && y == 2 && z == x + y
## [1] TRUE
## Browse[1]> Q
setwd(cwd)
unlink(tmp, recursive = TRUE)
See ?browser if you are unfamiliar with R's environment browser.
My capturing function has the limitation that on.exit calls in the body of func must also use add = TRUE. If you have written func yourself, then it is not much of a limitation at all, and passing add = TRUE is a good habit anyway.
Ultimately, there is no completely safe way to inject code into functions, but, in an interactive setting, I would say that this level of "unsafety" is fine.

How can I change the behavior of the $ operator in environments?

I want to override the behavior of the dollar operator, so that if I have
x <- new.env()
x$foo <- 3
will e.g. call something. I tried to look for possible functions such as $, but my knowledge of the internals is not good enough.
I tried this:
`$` <- function(a, b) {
res <- .Primitive("$")(a, b);
print(res);
if(is.null(res)) { print("null!") };
return(res)
}
It kind of seem to work, but:
> x$foobar
NULL
[1] "null!"
NULL
> x$foobar <- 3
> x$foobar
NULL
[1] "null!"
NULL
>
So it seems to stay null despite the override.
Normal behavior of R's environments:
myenv <- new.env(parent = emptyenv())
myenv$foo <- 3
class(myenv)
# [1] "environment"
myenv$foo
# [1] 3
myenv$foobar
# NULL
Let's define a super-class (I'll name it environment2, feel free to be creative here) and override $ for that class:
class(myenv) <- c("environment2", "environment")
`$.environment2` <- function(x, name) {
stopifnot(name %in% names(x))
NextMethod()
}
myenv$foo
# [1] 3
myenv$foobar
# Error in `$.environment2`(myenv, foobar) : name %in% names(x) is not TRUE
You can easily clean up that error if you'd like, either using an if statement with stop, or (in R-4 or newer) naming the conditions in stopifnot.
`$.environment2` <- function(x, name) {
if (!name %in% names(x)) stop("something meaningful", call. = FALSE)
NextMethod()
}
`$.environment2` <- function(x, name) {
stopifnot(
"something meaningful" = name %in% names(x)
)
NextMethod()
}
### both render
myenv$foobar
# Error in `$.environment2`(myenv, foobar) : something meaningful
They are relatively equivalent, but with if/stop, you can reduce the error context:
`$.environment2` <- function(x, name) {
if (!name %in% names(x)) stop("something meaningful", call. = FALSE)
NextMethod()
}
myenv$foobar
# Error: something meaningful

Elegant way to define a function inside another function

I want to construct
f <- function(...) {
g <- function(x) x ^ 2
list(...)
}
so that I can invoke using f(g(4)) and have list(...) result in list(16).
In general I will define several temporary functions inside f that the user can invoke when calling f(...).
I have experimented with assign and newenvironment but have just gotten more confused. Help with an elegant solution is appreciated.
The reason for wanting this is that I want a function in the Hmisc package, drawPlot to be able to let the users specify generically named functions as input for building up a series of graphical elements, and I don't want to reserve these generic-type names. E.g.:
d <- drawPlot(Curve(), Points()) # interactively make a curve and
# a set of points
I'm guessing you in fact need something more elaborate than this, but the following does what you've asked for in your supplied example:
f <- function(...) {
g <- function(x) x ^ 2
list(eval(substitute(...)))
}
f(g(4))
# [[1]]
# [1] 16
Or, if users may supply one or more function calls, something like this:
f <- function(...) {
g <- function(x) x ^ 2
h <- function(x) 100*x
cc <- as.list(substitute(list(...))[-1])
res <- list()
for(i in seq_along(cc)) {
res[[i]] <- eval(cc[[i]])
}
res
}
f(g(4), h(5))
# [[1]]
# [1] 16
#
# [[2]]
# [1] 500
Very similar to this answer but I think maybe more extensible and closer to your original idea:
match.fun_wrapper <- function(...) {
# `match.fun` searches in the parent environment of the environment that
# calls `match.fun`, so this wrapper is a hack to be able to search in
# the current environment rather than the parent of the current environemnt
match.fun(...)
}
f <- function(fun, ...) {
g <- function(x) x ^ 2
fun <- match.fun_wrapper(substitute(fun))
fun(...)
}
If you wanted to do away with match.fun, you could also do away with the wrapper hack:
f <- function(fun, ...) {
g <- function(x) x ^ 2
fun(...)
}
It looks to me like what you're trying to do is something like this:
f <- function(fun, ...) {
g <- function(x) x ^ 2
h <- function(x) x ^ 3
i <- function(x) x ^ 4
switch(fun,
'g' = g(...),
'h' = h(...),
'i' = i(...))
}
> f('g', 3)
[1] 9
> f('h', 3)
[1] 27
> f('i', 3)
[1] 81
It's not obvious why you would want to, unless you're just trying to encapsulate functions with similar names inside different namespaces and using this as a hacky workaround for the fact R doesn't offer fully-featured classes. If that's the case, you can also just use actual namespaces, i.e. put your functions inside a package so they're called by package::g(arg) instead of f('g', arg).

Proper way to have two functions access a single function's environment?

Based on the answer provided in1088639, I set up a pair of functions which both access the same sub-function's environment. This example works, but I wanted to see if I'd missed some cleaner way to "connect" both top-level functions to the internal environment.
( Back story: I wanted to write a pair of complementary functions which shared a variable, e.g. "count" in this example, and meet CRAN package requirements which do not allow functions to modify the global environment. )
static.f <- function() {
count <- 0
f <- function(x) {
count <<- count + 1
return( list(mean=mean(x), count=count) )
}
return( f )
}
# make sure not to delete this command, even tho' it's not
# creating a function.
f1 <- static.f()
statfoo <- function(x){
tmp<-f1(x)
tmp<- list(tmp,plus=2)
return(tmp)
}
statbar <- function(x){
tmp<-f1(x)
tmp<- list(tmp,minus=3)
return(tmp)
}
Sample output:
> statfoo(5)
[[1]]
[[1]]$mean
[1] 5
[[1]]$count
[1] 1
$plus
[1] 2
Rgames> statfoo(5)
[[1]]
[[1]]$mean
[1] 5
[[1]]$count
[1] 2
$plus
[1] 2
> statbar(4)
[[1]]
[[1]]$mean
[1] 4
[[1]]$count
[1] 3
$minus
[1] 3
> statfoo(5)
[[1]]
[[1]]$mean
[1] 5
[[1]]$count
[1] 4
$plus
[1] 2
A cleaner method would be to use an object oriented approach. There is already an answer using reference classes.
A typical object oriented approach with classes would create a class and then create a singleton object, i.e. a single object of that class. Of course it is a bit wasteful to create a class only to create one object from it so here we provide a proto example. (Creating a function to enclose count and the function doing the real work has a similar problem -- you create an enclosing function only to run it once.) The proto model allows one to create an object directly bypassing the need to create a class only to use it once. Here foobar is the proto object with property count and methods stats, statfoo and statbar. Note that we factored out stats to avoid duplicating its code in each of statfoo and statbar. (continued further down)
library(proto)
foobar <- proto(count = 0,
stats = function(., x) {
.$count <- .$count + 1
list(mean = mean(x), count = .$count)
},
statfoo = function(., x) c(.$stats(x), plus = 2),
statbar = function(., x) c(.$stats(x), plus = 3)
)
foobar$statfoo(1:3)
foobar$statbar(2:4)
giving:
> foobar$statfoo(1:3)
$mean
[1] 2
$count
[1] 1
$plus
[1] 2
> foobar$statbar(2:4)
$mean
[1] 3
$count
[1] 2
$plus
[1] 3
A second design would be to have statfoo and statbar as independent functions and only keep count and stats in foobar (continued further down)
library(proto)
foobar <- proto(count = 0,
stats = function(., x) {
.$count <- .$count + 1
list(mean = mean(x), count = .$count)
}
)
statfoo <- function(x) c(foobar$stats(x), plus = 2)
statbar <- function(x) c(foobar$stats(x), plus = 3)
statfoo(1:3)
statbar(2:4)
giving similar output to the prior example.
Third approach Of course the second variation could easily be implemented by using local and a function getting us close to where you started. This does not use any packages but does not create a function only to throw it away:
foobar <- local({
count <- 0
function(x) {
count <<- count + 1
list(mean = mean(x), count = count)
}
})
statfoo <- function(x) c(foobar(x), plus = 2)
statbar <- function(x) c(foobar(x), plus = 3)
statfoo(1:3)
statbar(2:4)
Another simple option is tocreate an environment and assign it to both functions. Here I use simpler functions for illustrative purposes, but this can be easily extended:
f1 <- function() {count <<- count + 1; return(paste("hello", count))}
f2 <- function() {count <<- count + 1; return(paste("goodbye", count))}
environment(f1) <- environment(f2) <- list2env(list(count=0))
Then:
> f1()
[1] "hello 1"
> f2()
[1] "goodbye 2"
> f1()
[1] "hello 3"
Both functions have the same environment.
You can use reference class like this:
foobar <- setRefClass(
'foobar',
fields = list(count='numeric'),
methods = list(
initialize=function() {
.self$initFields(count = 0L)
},
statfoo = function(x) {
count <<- count + 1L
list(list(mean=mean(x), count=count), plus=2)
},
statbar = function(x){
count <<- count + 1L
list(list(mean=mean(x), count=count), minus=3)
}
)
)()
foobar$statfoo(5)
foobar$statbar(3)
It makes it relatively clear that neither statfoo nor statbar is a pure function.
You could get rid of the factory functions, and more explicitly use environments. A solution like this would work as well
.env<-(function() {
count <- 0
f <- function(x) {
count <<- count + 1
return( list(mean=mean(x), count=count))
}
return(environment())
})()
statfoo <- function(x){
list(.env$f(x),plus=2)
}
statbar <- function(x){
list(.env$f(x),minus=3)
}
The .env variable is created by immediately executing an anonymous function to get its environment. We then extract the function from the environment itself to modify its values.

A better way to push and pop to/from lists in R?

If foo <- list(), I find myself writing foo[[length(foo)+1]] <- bar a lot when really I just want to write push(foo, bar).
Similarly (though much less frequently) bar <- foo[[length(foo)]] would be much nicer as bar <- pop(foo).
Is there a better way to do this in base R?
Failing that, has anyone written a package that makes these basic list operations less syntactically torturous?
It's the repetition of the variable name that kills me. Consider:
anInformative.selfDocumenting.listName[[length(anInformative.selfDocumenting.listName)+1]] <- bar
edit:
foo <- append(foo, bar) doesn't work for me
foo <- list()
for (i in 1:10) {
x <- data.frame(a=i, b=rnorm(1,0,1))
foo[[length(foo)+1]] <- x
}
str(foo)
Gives a list of 10 objects as expected.
foo <- list()
for (i in 1:10) {
x <- data.frame(a=i, b=rnorm(1,0,1))
foo <- append(foo, x)
}
str(foo)
Gives a list of 20 objects.
foo[[length(foo)+1]] <- bar
could be rewritten as
foo <- append(foo, bar)
and
bar <- foo[[length(foo)]]
could be rewritten as
bar <- tail(foo,1)
Note that with functional languages such as R functions generally aren't supposed to changes values of parameters passed to them; typically you return a new object which you can assign elsewhere. The function should have no "side effects." Functions like push/pop in other languages usually modify one of the passed parameters.
A few years late to the party, but here's the approach that requires the least amount of typing. Note that pop removes the last element from the object.
push <- function(obj, input) {
variable <- deparse(substitute(obj))
out <- get(variable, envir = parent.frame(n=2))
out[[length(out)+1]] <- input
assign(variable, out, envir = parent.frame(n=2))
}
pop <- function(obj) {
variable <- deparse(substitute(obj))
obj <- get(variable, envir = parent.frame(n=2))
if (length(obj) > 0) {
out <- obj[[length(obj)]]
assign(variable, obj[-length(obj)], envir = parent.frame(n=2))
}else{
out <- NULL
assign(variable, out, envir = parent.frame(n=2))
}
return(out)
}
Example:
> foo <- list()
> push(foo, "a")
> push(foo, "b")
> foo
[[1]]
[1] "a"
[[2]]
[1] "b"
> pop(foo)
[1] "b"
> foo
[[1]]
[1] "a"
pushR = function(foo, bar){
foo[[length(foo)+1]] <- bar
foo
}
popR=function(foo){
foo[[length(foo)]]
}
foo <- list()
for (i in 1:10) {
x <- data.frame(a=i, b=rnorm(1,0,1))
foo <- pushR(foo, x)
}
popR(foo)
foo <- list()
for (i in 1:10) {
x <- data.frame(a=i, b=rnorm(1,0,1))
foo[[length(foo)+1]] <- x
}
Can be written as:
foo <- list()
for (i in 1:10) {
x <- data.frame(a=i, b=rnorm(1,0,1))
foo[[i]] <- x
}
if you are worried about repeating the variable names. I agree through that lapply works more efficiently, both in performance and length of your overall code.
This is simple and clean with closures
makeStack <- function() {
valueList <- list()
list(
push = function(value) {
valueList <<- append(valueList, value)
},
pop = function() {
value <- utils::tail(valueList, 1)[[1]]
valueList <<- utils::head(valueList, -1)
value
}
)
}
> stack <- makeStack()
> stack$push(3)
> stack$push(5)
> stack$pop()
[1] 5
> stack$pop()
[1] 3

Resources