R object's name carrying through multiple functions - r

In line with my reading of Hadley's advice on building S3 objects I am using a helper function, a constructor function, and a validator function. A simple reproducible example:
test_object <- function(x, y, z) {
new_test_object(x, y, z)
}
new_test_object <- function(x, y, z) {
structure(list(x = x,
y = y,
z = z,
x_name = deparse(substitute(x))),
class = "test_object")
}
validate_test_object <- function(test_object) {
# Anything goes
test_object
}
I would like the resulting object to include a value with the original name that the item passed in had ($x_name in the above example). The deparse(substitute(...)) trick works if I call the constructor directly:
alpha = "a"
test_constructor <- new_test_object(x = alpha, y = "b", z = "c")
test_constructor$x_name
# [1] "alpha"
But not if I use the helper function:
test_helper <- test_object(x = alpha, y = "b", z = "c")
test_helper$x_name
# [1] "x"
I would like test_helper$x_name to also return [1] "alpha".
Short of doing the deparse(substitute(...)) step at the helper stage, is there any way of the constructor function (new_test_object()) accessing the 'original' name of the object x if it has come via the helper? Or to ensure that its name passes through with it as the helper function passes it to the constructor?

What's really the purpose here? If you are just using one function as a wrapper to another, then there are better ways of preserving arguments. For example
test_object <- function(x, y, z) {
call <- match.call()
call[[1]] <- quote(new_test_object)
eval(call)
}
But in general relying on deparse() to get information from names of variables isn't a very reliable method. It would be better to have such pieces of information be proper parameters that you can set if you like. This makes your functions much more flexible.
test_object <- function(x, y, z, xname=deparse(substitute(x))) {
new_test_object(x, y, z, xname=xname)
}
new_test_object <- function(x, y, z, xname=deparse(substitute(x))) {
structure(list(x = x,
y = y,
z = z,
x_name = xname),
class = "test_object")
}

Here is a not beautifull fix: you add ... argument to pass the name when you are calling it from another function
test_object <- function(x, y, z) {
x_name = deparse(substitute(x))
new_test_object(x, y, z, x_name = x_name)
}
new_test_object <- function(x, y, z, ...) {
args <- list(...)
if(is.null(args[["x_name"]])){
structure(list(x = x,
y = y,
z = z,
x_name = deparse(substitute(x))),
class = "test_object")
}
else{
structure(list(x = x,
y = y,
z = z,
x_name = args[["x_name"]]),
class = "test_object")
}
}
And here is the result:
test_helper <- test_object(x = alpha, y = "b", z = "c")
test_helper$x_name
# [1] "alpha"

Related

r how to keep print method for custom class

i have defined a method for printing a vector with the class test:
print.test <- function(x, ...) {
x <- formatC(
as.numeric(x),
format = "f",
big.mark = ".",
decimal.mark = ",",
digits = 1
)
x[x == "NA"] <- "-"
x[x == "NaN"] <- "-"
print.default(x)
}
which works fine for the following
a <- c(1000.11, 2000.22, 3000.33)
class(a) <- c("test", class(a))
print(a)
[1] "1.000,11" "2.000,22" "3.000,33"
this also works:
round(a)
[1] "1.000,0" "2.000,0" "3.000,0"
this does not:
median(a)
[1] 2000.22
class(median(a))
[1] "numeric"
now my question is: do i need to write a custom method for this class to use median e.g. and if so what would it look like or is there another way (as i simply would like this class to print the data in a certain format)?
The problem is that median.default returns an object of class numeric therefore autoprinting of the returned object does not call your custom print method.
The following will do so.
median.test <- function(x, na.rm = FALSE, ...){
y <- NextMethod(x, na.rm = na.rm, ...)
class(y) <- c("test", class(y))
y
}
median(a)
#[1] "2.000,2"
As for the handling of NA values, I will first define another method for a base R function. It is not strictly needed but save some code lines if objects of class test are used frequently.
c.test <- function(x, ...){
y <- NextMethod(x, ...)
class(y) <- c("test", class(y))
y
}
b <- c(a, NA)
class(b)
#[1] "test" "numeric"
median(b)
#[1] "-"
median(b, na.rm = TRUE)
#[1] "2.000,2"
EDIT.
The following defines a generic function wMedian, a default method and a method for objects of class "currency", as requested by the OP in a comment.
Note that there must be a method print.currency, which I don't redefine since it's exactly the same as print.test above. As for the other methods, I have made them simpler with the help of a new function, as.currency.
median.currency <- function(x, na.rm = FALSE, ...){
y <- NextMethod(x, na.rm = na.rm, ...)
as.currency(y)
}
c.currency <- function(x, ...){
y <- NextMethod(x, ...)
as.currency(y)
}
as.currency <- function(x){
class(x) <- c("currency", class(x))
x
}
wMedian <- function(x, ...) UseMethod("wMedian")
wMedian.default <- function(x, ...){
matrixStats::weightedMedian(x, ...)
}
wMedian.currency <- function(x, w = NULL, idxs = NULL, na.rm = FALSE, interpolate = is.null(ties), ties = NULL, ...) {
y <- NextMethod(x, w = w, idxs = idxs, na.rm = na.rm, interpolate = interpolate, ties = ties, ... )
as.currency(y)
}
set.seed(1)
x <- rnorm(10)
wMedian(x, w = (1:10)/10)
#[1] 0.4084684
wMedian(as.currency(x), w = (1:10)/10)
#[1] "0,4"

Create a one-row dataframe where every column is the same as an existing variable

I often write code like this:
answer.df = data.frame(x = numeric(0), y = numeric(0), z = numeric(0))
for (i in 1:100) {
x = do_stuff(i)
y = do_more_stuff(i)
z = yet_more_stuff(i)
# Is there a better way of doing this:
temp.df = data.frame(x = x, y = y, z = z)
answer.df = rbind(answer.df, temp.df)
}
My question is, in the line temp.df = data.frame(x = x, y = y, z = z), is there a neater way of doing this? Imagine it with ten or more variables to understand my problem.
Try this:
do.call("rbind", lapply(1:100, function(i) list(x = xfun(i), y = yfun(i))))
Also try rbindlist from data.table which may have some performance advantages:
library(data.table)
rbindlist(lapply(1:100, function(i) list(x = xfun(i), y = yfun(i))))

Create R Function with flexibility to reference different datasets

I am trying to create a simple function in R that can reference multiple datasets and multiple variable names. Using the following code, I get an error, which I believe is due to referencing:
set.seed(123)
dat1 <- data.frame(x = sample(10), y = sample(10), z = sample(10))
dat2 <- data.frame(x = sample(10), y = sample(10), z = sample(10))
table(dat1$x, dat1$y)
table(dat2$x, dat2$y)
fun <- function(dat, sig, range){print(table(dat$sig, dat$range))}
fun(dat = dat1, sig = x, range = y)
fun(dat = dat2, sig = x, range = y)
Any idea how to adjust this code so that it can return the table appropriately?
The [[ ]] operator on data frame is similar to $ but allows you to introduce an object and look for it's value. Then outside of the function you assign "x" value to sig. if you don't put quotes there R will look for x object
fun <- function(dat, sig, range){print(table(dat[[sig]], dat[[range]]))}
fun(dat = dat1, sig = "x", range = "y")
fun(dat = dat2, sig = "x", range = "y")

Intercepting & using the value of an optional variable captured in the dots (...)

I need to intercept the value of an optional xlim in a function so that I can change the units of it before plotting. The following function confirms that xlim was passed, but I can't access the value.
foo <- function(x, y, ...) {
if ("xlim" %in% names(list(...))) {
print(xlim) # not found/can't use value!
}
# modify xlim and pass to plotting functions
return()
}
But foo(x = 1:5, y = 1:5, xlim = c(2,4)) gives:
Error in print(xlim) : object 'xlim' not found
What trick do I need use the value? Seems like it should just work, but I see from looking around on SO that the dots can be vexing. I've played a bit with exists, deparse etc but I don't really 'get' the proper use of those functions.
EDIT: so here is the final snippet which was the leanest way to access the value:
dots <- list(...)
if (any(names(dots) == "xlim")) {
xlim <- dots$xlim
print(xlim)
}
This is because xlim is actually a list element, and is not (yet) an actual object in the function's environment. You could do
foo <- function(x, y, ...) {
m <- match.call(expand.dots = FALSE)$...
if(any(names(m) == "xlim")) m[["xlim"]]
else stop("no xlim value")
}
foo(x = 1:5, y = 1:5, xlim = c(2,4))
# c(2, 4)
foo(x = 1:5, y = 1:5, ylim = c(2,4))
# Error in foo(x = 1:5, y = 1:5, ylim = c(2, 4)) : no xlim value
You can see what match.call is doing if we examine the function as
f <- function(x, y, ...) {
match.call(expand.dots = FALSE)$...
}
It is a list of all the entered dot arguments with their respective expressions, so there are many different ways to get the values, the above is just one way.
f(x = 1:5, y = 1:5, xlim = c(2,4))
# $xlim
# c(2, 4)
Alternatively, you could do
g <- function(x, y, ...) {
dots <- list(...)
any(names(dots) == "xlim")
}
g(x = 1:5, y = 1:5, xlim = c(2,4))
# [1] TRUE
Also keep in mind that match.call keeps the argument as an unevaluated call, while list(...) evaluates the argument. This might be important for you passing the argument to other functions.

Generalizing `...` (three dots) argument dispatch: S4 methods for argument set including `...`

Actual question
Is it possible to define methods for a set of signature arguments that includes ... (as opposed to exclusively for ...)? It's not possible "out-of-the-box", but would it theoretically be possible at all (involving some tweaks) or is this something that simply cannot be done due to the way the S4 mechanism is designed?
I'm looking for something along the lines of
setGeneric(
name = "foo",
signature = c("x", "..."),
def = function(x, ...) standardGeneric("foo")
)
setMethod(
f = "foo",
signature = signature(x = "character", "..." = "ThreedotsRelevantForMe"),
definition = function(x, ...) bar(x = x)
)
Martin Morgan thankfully pointed me to dotsMethods and it says this:
Currently, “...” cannot be mixed with other formal arguments: either the signature of the generic function is “...” only, or it does not contain “...”. (This restriction may be lifted in a future version.)
Background
Consider the following attempt to generalize the dispatching mechanism based on ... from a simple case (only one more function is supposed to use arguments passed via ...; e.g. the use of ... in plot() for passing arguments to par()) to scenarios involving the following aspects (taken from here):
when you would like to pass along arguments to more than one, hence r, recipients,
when those recipients can be located on c different layers of the calling stack
and when they might even use the same argument names but associate different meanings to these arguments in their very own scope/closure/frame/environment
Also note that, while it may indeed be good practice to do so, top-level functions/interfaces should not necessarily need to be concerned with definining (lots of) explicit arguments of subsequently called functions/interfaces in order to pass arguments correctly. IMO, this choice should be left to the developer as sometimes one or the other alternative makes more sense.
It would be cool if I could substitute the dispatch that is currently handled via withThreedots() (which AFAICT would need to involve an actual split-up of ...) with the S4 dispatcher somehow, thus ideally simply being able to call foo(x = x, ...) instead of withThreedots("foo", x = x, ...) in foobar():
Definitions
withThreedots <- function(fun, ...) {
threedots <- list(...)
idx <- which(names(threedots) %in% sprintf("args_%s", fun))
eval(substitute(
do.call(FUN, c(THREE_THIS, THREE_REST)),
list(
FUN = as.name(fun),
THREE_THIS = if (length(idx)) threedots[[idx]],
THREE_REST = if (length(idx)) threedots[-idx] else threedots
)
))
}
foobar <- function(x, ...) {
withThreedots("foo", x = x, ...)
}
foo <- function(x = x, y = "some text", ...) {
message("foo/y")
print(y)
withThreedots("bar", x = x, ...)
}
bar <- function(x = x, y = 1, ...) {
message("bar/y")
print(y)
withThreedots("downTheLine", x = x, ...)
}
downTheLine <- function(x = x, y = list(), ...) {
message("downTheLine/y")
print(y)
}
Apply
foobar(x = 10)
foobar(x = 10, args_foo = list(y = "hello world!"))
foobar(x = 10, args_bar = list(y = 10))
foobar(x = 10, args_downTheLine = list(y = list(a = TRUE)))
foobar(x = 10,
args_foo = list(y = "hello world!"),
args_bar = list(y = 10),
args_downTheLine = list(y = list(a = TRUE))
)
# foo/y
# [1] "hello world!"
# bar/y
# [1] 10
# downTheLine/y
# $a
# [1] TRUE
Conceptional approach (MOSTLY PSEUDO CODE)
I guess I'm looking for something along the lines of this:
Definitions
setGeneric(
name = "foobar",
signature = c("x"),
def = function(x, ...) standardGeneric("foobar")
)
setMethod(
f = "foobar",
signature = signature(x = "ANY"),
definition = function(x, ...) pkg.foo::foo(x = x, ...)
)
Assumption: foo() is defined in package/namespace pkg.foo
setGeneric(
name = "foo",
signature = c("x", "y", "..."),
def = function(x, y = "some text", ...) standardGeneric("foo")
)
setMethod(
f = "foo",
signature = signature(x = "ANY", y = "character", "..." = "Threedots.pkg.foo.foo"),
definition = function(x, y, ...) {
message("foo/y")
print(y)
pkg.bar::bar(x = x, ...)
}
)
Assumption: bar() is defined in package/namespace pkg.bar:
setGeneric(
name = "bar",
signature = c("x", "y", "..."),
def = function(x, y = 1, ...) standardGeneric("bar")
)
setMethod(
f = "bar",
signature = signature(x = "ANY", y = "numeric", "..." = "Threedots.pkg.bar.bar"),
definition = function(x, y, ...) {
message("bar/y")
print(y)
pkg.a::downTheLine(x = x, ...)
)
setGeneric(
name = "downTheLine",
signature = c("x", "y", "..."),
def = function(x, y = list(), ...) standardGeneric("downTheLine")
)
Assumption: downTheLine() is defined in package/namespace pkg.a:
setMethod(
f = "downTheLine",
signature = signature(x = "ANY", y = "list", "..." = "Threedots.pkg.a.downTheLine"),
definition = function(x, y, ...) {
message("downTheLine/y")
print(y)
return(TRUE)
)
Illustration what the dispatcher would need to do
The crucial part is that it would have to be able to distinguish between those elements in ... that are relevant for the respective current fun being called (based on a full S4 dispatch on regular and threedots signature arguments) and those elements that should be passed along to functions that fun might be calling (i.e., an updated state of ...; similar to what's happening inside withThreedots() above):
s4Dispatcher <- function(fun, ...) {
threedots <- splitThreedots(list(...))
## --> automatically split `...`:
## 1) into those arguments that are part of the signature list of `fun`
## 2) remaining part: everything that is not part of
## the signature list and that should thus be passed further along as an
## updated version of the original `...`
args_this <- threedots$this
## --> actual argument set relevant for the actual call to `fun`
threedots <- threedots$threedots
## --> updated `...` to be passed along to other functions
mthd <- selectMethod(fun, signature = inferSignature(args_this))
## --> `inferSignature()` would need to be able to infer the correct
## signature vector to be passed to `selectMethod()` from `args_this`
## Actual call //
do.call(mthd, c(args_this, threedots))
}
Here's an illustration of how a generator for a "typed three dots argument container" could look like.
Note that in order for such a mechanism to work across packages, it would probably make sense to also offer a possibility to state the namespace of a certain function (arg ns and field .ns):
require("R6")
Threedots <- function(..., fun, ns = NULL) {
name <- if (!is.null(ns)) sprintf("Threedots.%s.%s", ns, fun) else
sprintf("Threedots.%s", fun)
eval(substitute({
INSTANCE <- R6Class(CLASS,
portable = TRUE,
public = list(
.args = "list", ## Argument list
.fun = "character", ## Function name
.ns = "character", ## Namespace of function
initialize = function(..., fun, ns = NULL) {
self$.fun <- fun
self$.ns <- ns
self$.args <- structure(list(), names = character())
value <- list(...)
if (length(value)) {
self$.args <- value
}
}
)
)
INSTANCE$new(..., fun = fun, ns = ns)
},
list(CLASS = name, INSTANCE = as.name(name))
))
}
Example
x <- Threedots(y = "hello world!", fun = "foo", ns = "pkg.foo")
x
# <Threedots.pkg.foo.foo>
# Public:
# .args: list
# .fun: foo
# .ns: pkg.foo
# initialize: function
class(x)
# [1] "Threedots.pkg.foo.foo" "R6"
x$.args
# $y
# [1] "hello world!"
The actual calls would then look like this:
foobar(x = 10)
foobar(x = 10, Threedots(y = "hello world!", fun = "foo", ns = "pkg.foo"))
foobar(x = 10, Threedots(y = 10, fun = "bar", ns = "pkg.bar"))
foobar(x = 10, Threedots(y = list(a = TRUE), fun = "downTheLine", ns = "pkg.a")))
foobar(x = 10,
Threedots(y = "hello world!", fun = "foo", ns = "pkg.foo"),
Threedots(y = 10, fun = "bar", ns = "pkg.bar),
Threedots(y = list(a = 10), fun = "downTheLine", ns = "pkg.a")
)
See ?setGeneric and search for '...', and then ?dotsMethods. It's possible to define a generic that dispatches on ... (only, not mixed with other arguments for dispatch).
.A = setClass("A", contains="numeric")
.B = setClass("B", contains="A")
setGeneric("foo", function(...) standardGeneric("foo"))
setMethod("foo", "A", function(...) "foo,A-method")
setGeneric("bar", function(..., verbose=TRUE) standardGeneric("bar"),
signature="...")
setMethod("bar", "A", function(..., verbose=TRUE) if (verbose) "bar,A-method")
leading to
> foo(.A(), .B())
[1] "foo,A-method"
> bar(.A(), .B())
[1] "bar,A-method"
> bar(.A(), .B(), verbose=FALSE)
>
I don't know if this fits the rest of your scenario.

Resources