how can I pass argument (names) to a function factory? - r

I need to build a lot of functions with lots of different arguments, though they otherwise share a lot of code and structure.
To avoid duplication, I thought I'd be clever and build myself a function factory (aka closure).
I can't figure out how to pass the function arguments inside the function factory.
My use case is a bunch of S3 constructor functions, all of which share the same validation mechanism.
So I'll use that as an example to explain my problem.
Say, I have a ClassA and ClassB, each of which require their own arguments in the respective constructor functions:
ClassA <- function(A_arg1, A_arg2) {
# some class-SPECIFIC construction magic happens, say
out <- list(A_arg1, A_arg2)
# some GENERAL construction magic happens
class(out) <- "ClassA"
return(out)
}
ClassB <- function(B_arg1, B_arg2) {
# some class-SPECIFIC construction magic happens, say
out <- B_arg1 + B_arg2
# some GENERAL construction magic happens
class(out) <- "ClassB"
return(out)
}
Obviously, I'd love to avoid the duplication in the general part of the constructor functions, so a function factory that could be used like so would be nice:
ClassA <- produce_class_constructor(classname = "ClassA", fun = function(A_arg1, A_arg2) {return(list(A_arg1, A_arg2))})
This should, ideally, yield the exact same function as the above manually constructed ClassA function, with the general part factored out.
Here's my attempt at building that function factory:
produce_class_constructor <- function(classname, fun) {
class_specific_arguments <- formals(fun = fun) # this works just fine on the console
construct_class <- function(class_specific_arguments) {
# here runs the class-specific stuff
out <- fun(class_specific_arguments)
# here runs the general stuff
class(out) <- classname
}
}
This however, does not work, because the resulting constructor function only has a class_specific_arguments-argument, not the, well, actual A_arg1, and A_arg2.
Is there way to do this?
Am I doing this wrong?
(It's really important to me that the resulting class constructor functions have properly named arguments, so a ... approach won't work).

Here's my attempt:
produce_class_constructor <- function(classname, fun) {
out_fun <- function() {
out_obj <- do.call(fun, as.list(environment()))
class(out_obj) <- classname
out_obj
}
formals(out_fun) <- formals(fun)
out_fun
}
ClassA <- produce_class_constructor(classname = "ClassA",
fun = function(A_arg1, A_arg2) {list(A_arg1, A_arg2)})
ClassA(1, 2)
#[[1]]
#[1] 1
#
#[[2]]
#[1] 2
#
#attr(,"class")
#[1] "ClassA"
ClassB <- produce_class_constructor(classname = "ClassB",
fun = function(B_arg1, B_arg2) {B_arg1 + B_arg2})
ClassB(B_arg2 = 2, 1)
#[1] 3
#attr(,"class")
#[1] "ClassB"
Idea with as.list(environment()) taken from this question. Note that you should be extra careful along that path, as ?formals says, "this is
advanced, dangerous coding".

Related

mockery::mock and mockery::stub do not work properly with quasiquotation?

I've written an import function that gets a single file from an aws s3-bucket.
That function itself is a wrapper around aws.s3::s3read_using() which takes a reading function as its first argument.
Why do I wrap around aws.s3::s3read_using() ? Because I need to do some special error-handling and want the wrapping function to do some Recall() up to a limit... but that's a different story.
Now that i've successfully build and tested my wrapping function i want to do another wrapping arround that:
I want to iterate n times over my wrapper to bind the downloaded files together. I now have the difficulty to hand the 'reading_function' to the FUN argument of aws.s3::s3read_using().
I could do that by simply using ... - BUT!
I want to make clear to the USER of my wrapping wrapper, that he needs to specify that argument.
So I've decided to use rlangs rlang::enexpr() to capture the argument and to hand it over to my first wrapper via !! - which in return captures that argument again with rlang::enexpr() and hands it over - finally - to aws.s3::s3read_using() via rlang::expr(aws.s3::s3read_using(FUN = !!reading_fn, object = s3_object))
That works perfectly fine and smooth. My Problem is with testing that function construct using testthat and mockery
Here is some broadly simplyfied code:
my_workhorse_function <- function(fn_to_work_with, value_to_work_on) {
fn <- rlang::enexpr(fn_to_work_with)
# Some other magic happens here - error handling, condition-checking, etc...
out <- eval(rlang::expr((!!fn)(value_to_work_on)))
}
my_iterating_function <- function(fn_to_iter_with, iterate_over) {
fn <- rlang::enexpr(fn_to_iter_with)
out <- list()
for(i in seq_along(iterate_over)) {
out[[i]] <- my_workhorse_function(!!fn, iterate_over[i])
}
return(out)
}
# Works just fine
my_iterating_function(sqrt, c(9:16))
Now, to the test:
# Throws an ERROR: 'Error in `!fn`: invalid argument type'
test_that("my_iterating_function iterates length(iterate_over) times over my_workhorse_function", {
mock_1 <- mockery::mock(1, cycle = TRUE)
stub(my_iterating_function, "my_workhorse_function", mock_1)
expect_equal(my_iterating_function(sqrt, c(9:16)), list(1,1,1,1,1,1,1,1))
expect_called(mock_1, 8)
})
I've used a workarround, but that just doesn't feel right, even though, it works:
# Test passed
test_that("my_iterating_function iterates length(iterate_over) times over my_workhorse_function", {
mock_1 <- mockery::mock(1, cycle = TRUE)
stub(my_iterating_function, "my_workhorse_function",
function(fn_to_work_with, value_to_work_on) {
fn <- rlang::enexpr(fn_to_work_with)
out <- mock_1(fn, value_to_work_on)
out})
expect_equal(my_iterating_function(sqrt, c(9:16)), list(1,1,1,1,1,1,1,1))
expect_called(mock_1, 8)
})
I'm using version of R: 4.1.1
I'm using versions of testthat(3.1.1), mockery(0.4.2), rlang(0.4.12)
I think you're complicating things here, although maybe I'm not fully understanding your end goal. You can directly pass functions through arguments without any issue. Your example code above can be easily simplified to (keeping the loop just to match your test_that() call):
library(testthat)
library(mockery)
my_workhorse_function <- function(fn_to_work_with, value_to_work_on) {
fn_to_work_with(value_to_work_on)
}
my_iterating_function <- function(fn_to_iter_with, iterate_over) {
out <- list()
for(i in seq_along(iterate_over)) {
out[[i]] <- my_workhorse_function(fn_to_iter_with, iterate_over[i])
}
return(out)
}
# Works just fine
my_iterating_function(sqrt, c(9:16))
#> [[1]]
#> [1] 3
#>
#> ...
test_that("my_iterating_function iterates length(iterate_over) times over my_workhorse_function", {
mock_1 <- mockery::mock(1, cycle = TRUE)
stub(my_iterating_function, "my_workhorse_function", mock_1)
expect_equal(my_iterating_function(sqrt, c(9:16)), list(1,1,1,1,1,1,1,1))
expect_called(mock_1, 8)
})
#> Test passed 🥇
You can just pass FUN directly through all of your nested functions. The functions you're wrapping with enexpr() were never going to be evaluated in the first place until you explicitly call them. You usually use enexpr when users are supplying expressions, not just functions.

Using R, variadically checking if `exists` [duplicate]

I stacked with trying to pass variable through few functions, and on the final function I want to get the name of the original variable. But it seems like substitute function in R looked only in "local" environment, or just for one level up. Well, let me explain it by code:
fun1 <- function (some_variable) {deparse(substitute(some_variable)}
fun2 <- function (var_pass) { fun1 (var_pass) }
my_var <- c(1,2) # I want to get 'my_var' in the end
fun2 (my_var) # > "var_pass"
Well, it seems like we printing the name of variable that only pass to the fun1. Documentation of the substitute tells us, that we can use env argument, to specify where we can look. But by passing .Global or .BaseNamespaceEnv as an argument to substitute I got even more strange results - "some_variable"
I believe that answer is in this function with using env argument, so, could you please explain me how it works and how can I get what I need. Thanks in advance!
I suggest you consider passing optional name value to these functions. I say this because it seems like you really want to use the name as a label for something in the end result; so it's not really the variable itself that matters so much as its name. You could do
fun1 <- function (some_variable, name=deparse(substitute(some_variable))) {
name
}
fun2 <- function (var_pass, name=deparse(substitute(var_pass))) {
fun1 (var_pass, name)
}
my_var <- c(1,2)
fun2(my_var)
# [1] "my_var"
fun1(my_var)
# [1] "my_var"
This way if you end up having some odd variable name and what to give a better name to a result, you at least have the option. And by default it should do what you want without having to require the name parameter.
One hack, probably not the best way:
fun2 <- function (var_pass) { fun1 (deparse(substitute(var_pass))) }
fun1 <- function (some_variable) {(some_variable))}
fun2(my_var)
# "my_var"
And you could run get on that. But as Paul H, suggests, there are better ways to track variables.
Another approach I'd like to suggest is to use rlang::enexpr.
The main advantage is that we don't need to carry the original variable name in a parameter. The downside is that we have to deal with expressions which are slightly trickier to use.
> fun1 <- function (some_variable) {
message("Entering fun1")
rlang::enexpr(some_variable)
}
> fun2 <- function (var_pass) {
message("Entering fun2")
eval(parse(text=paste0("fun1(", rlang::enexpr(var_pass), ")")))
}
> my_var <- c(1, 2)
> fun1(my_var)
#Entering fun1
my_var
> fun2(my_var)
#Entering fun2
#Entering fun1
my_var
The trick here is that we have to evaluate the argument name in fun2 and build the call to fun1 as a character. If we were to simply call fun1 with enexpr(var_pass), we would loose the notion of fun2's variable name, because enexpr(var_pass) would never be evaluated in fun2:
> bad_fun2 <- function (var_pass) {
message("Entering bad fun2")
fun1(rlang::enexpr(var_pass))
}
> bad_fun2(my_var)
#Entering bad fun2
#Entering fun1
rlang::enexpr(var_pass)
On top of that, note that neither fun1 nor fun2 return variable names as character vectors. The returned object is of class name (and can of course be coerced to character).
The bright side is that you can use eval directly on it.
> ret <- fun2(my_var)
#Entering fun2
#Entering fun1
> as.character(ret)
[1] "my_var"
> class(ret)
[1] "name"
> eval(ret)
[1] 1 2

S3 dispatching of `rbind` and `cbind`

I am trying to write an rbind method for a particular class. Here's a simple example where it doesn't work (at least for me):
rbind.character <- function(...) {
do.call("paste", list(...))
}
After entering this function, I seemingly can confirm that it is a valid method that R knows about:
> methods("rbind")
[1] rbind.character rbind.data.frame rbind.rootogram* rbind.zoo*
see '?methods' for accessing help and source code
However, it is not recognized if I try to use it:
> rbind("abc", "xyz")
[,1]
[1,] "abc"
[2,] "xyz"
> #### compared with ####
> rbind.character("abc", "xyz")
[1] "abc xyz"
The help page says that dispatch is performed internally as follows:
For each argument we get the list of possible class memberships from
the class attribute.
We inspect each class in turn to see if there is an applicable
method.
If we find an applicable method we make sure that it is identical to
any method determined for prior arguments. If it is identical, we
proceed, otherwise we immediately drop through to the default code.
With rbind("abc", "xyz"), I believe all these criteria are satisfied. What gives, and how can I fix it?
attributes("abc")
#NULL
A character vector doesn't have a class attribute. I don't think a method can be dispatched by rbind for the implicit classes.
A workaround would be to define your own class:
b <- "abc"
class(b) <- "mycharacter"
rbind.mycharacter <- function(...) {
do.call("paste", list(...))
}
rbind(b, b)
# [1] "abc abc"
The reason why it does not work with character was nicely explained by Roland in his comment.
rbind is not a standard S3 function, so you cannot "intercept" it for character.
Luckily, you can override the default implementation. Try:
rbind.character <- function(...) {
print("hello from rbind.character")
}
rbind <- function(...) {
args <- list(...)
if (all(vapply(args, is.character, logical(1)))) {
rbind.character(...)
} else {
base::rbind(...)
}
}
Basically, we check if the arguments are all characters. If so, we call our character function. If not, we call the default implementation.

How to see if an object has a particular method?

I would like to know if a given object has a particular method.
For example, suppose I want to know whether my mystery object has a specific print method. From reading ?methods, I try something like this:
has.print <- function (mysteryObject) {
'print' %in% attr(methods(class=class(mysteryObject)), 'info')$generic
}
m <- lm(Sepal.Length ~ Species, iris)
class(m) # 'lm'
has.print(m)
This is fine if mysteryObject has just one class. If it has multiple, there are problems in methods. I can get around this by using class(mysteryObject)[1], so that (for example)
library(data.table)
class(test) # data.table, data.frame
test <- data.table(iris)
has.print(test) # TRUE since there's a print.data.table
However, if I have something with multiple classes but the first does not have a print method, this returns false. Example:
mlm <- lm(cbind(Petal.Length, Petal.Width) ~ Species, iris)
class(mlm) # 'mlm', 'lm'. Note there is no print.mlm but there's a print.lm
has.print(mlm) # FALSE
This returns FALSE as there is no print.mlm. However, there is a print.lm, that is used instead, so I would like this to return TRUE.
Speaking as someone who knows very little about S3, S4, etc, is there a "proper" way to see if an object has a 'print' method on any of its classes? Ideally this works for both S3 and S4 objects, though I do not know what this means.
I can vectorise my methods(class=...) over class(mysteryObject), but I bet there's a more appropriate way to do it...
Apply methods on every class, unlist and search for "print":
has.print <- function(object) {
"print" %in%
unlist(
lapply(
class(object),
function(x) attr(methods(class = x), "info")$generic)
)
}
It is possible to start from the other side (searching for a class in all
generic print functions):
has.print <- function(object) {
any( sprintf("print.%s", class(object)) %in%
rownames(attr(methods(generic.function = "print"), "info")))
}
To find the method:
which.print <- function(object) {
print_methods <- rownames(attr(methods(generic.function = "print"), "info"))
print_methods[print_methods %in% sprintf("print.%s", class(object))]
}
# > which.print(mlm)
# [1] "print.lm"
S4
S4 classes are "printed" with show method. If no specialized method exists the showDefault is called. Function showMethods will show if there is any specialized show:
For example:
library(Matrix)
showMethods(f = "show", class = "denseMatrix")
#> Function: show (package methods)
#> object="denseMatrix"
showDefault is also calling print for non S4 members.

Get the attribute of a packaged function from within itself

Suppose we have this functions in a R package.
prova <- function() {
print(attr(prova, 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
'myattr<-' <- function(x, value) {
attr(x, 'myattr') <- value
x
}
myattr <- function(x) attr(x, 'myattr')
So, I install the package and then I test it. This is the result:
prova()
# NULL
# NULL
myattr(prova) <- 'ciao' # setting 'ciao' for 'myattr' attribute
prova()
# NULL
# NULL # Why NULL here ?
myattr(prova)
# [1] "ciao"
attr(prova, 'myattr')
# [1] "ciao"
The question is: how to get the attribute of the function from within itself?
Inside the function itself I cannot get its attribute, as demonstrated by the example.
I suppose that the solution will be of the serie "computing on the language" (match.call()[[1L]], substitute, environments and friends). Am I wrong?
I think that the important point here is that this function is in a package (so, it has its environment and namespace) and I need its attribute inside itself, in the package, not outside.
you can use get with the envir argument.
prova <- function() {
print(attr(get("prova", envir=envir.prova), 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
eg:
envir.prova <- environment()
prova()
# NULL
# NULL
myattr(prova) <- 'ciao'
prova()
# [1] "ciao"
# [1] "ciao"
Where envir.prova is a variable whose value you set to the environment in which prova is defined.
Alternatively you can use get(.. envir=parent.frame()), but that is less reliable as then you have to track the calls too, and ensure against another object with the same name between the target environment and the calling environment.
Update regarding question in the comments:
regarding using parent.frame() versus using an explicit environment name: parent.frame, as the name suggests, goes "up one level." Often, that is exactly where you want to go, so that works fine. And yet, even when your goal is get an object in an environment further up, R searches up the call stack until it finds the object with the matching name. So very often, parent.frame() is just fine.
HOWEVER if there are multiple calls between where you are invoking parent.frame() and where the object is located AND in one of the intermediary environments there exists another object with the same name, then R will stop at that intermediary environment and return its object, which is not the object you were looking for.
Therefore, parent.frame() has an argument n (which defaults to 1), so that you can tell R to begin it's search at n levels back.
This is the "keeping track" that I refer to, where the developer has to be mindful of the number of calls in between. The straightforward way to go about this is to have an n argument in every function that is calling the function in question, and have that value default to 1. Then for the envir argument, you use: get/assign/eval/etc (.. , envir=parent.frame(n=n) )
Then if you call Func2 from Func1, (both Func1 and Func2 have an n argument), and Func2 is calling prova, you use:
Func1 <- function(x, y, ..., n=1) {
... some stuff ...
Func2( <some, parameters, etc,> n=n+1)
}
Func2 <- function(a, b, c, ..., n=1) {
.... some stuff....
eval(quote(prova()), envir=parent.frame(n=n) )
}
As you can see, it is not complicated but it is * tedious* and sometimes what seems like a bug creeps in, which is simply forgetting to carry the n over.
Therefore, I prefer to use a fixed variable with the environment name.
The solution that I found is:
myattr <- function(x) attr(x, 'myattr')
'myattr<-' <- function(x, value) {
# check that x is a function (e.g. the prova function)
# checks on value (e.g. also value is a function with a given precise signature)
attr(x, 'myattr') <- value
x
}
prova <- function(..., env = parent.frame()) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], env)
# print(eval(as.call(c(myattr, this)), env)) # alternative
print(myattr(this))
# print(attr(this, 'myattr')
invisible(TRUE)
}
I want to thank #RicardoSaporta for the help and the clarification about keeping tracks of the calls.
This solution doesn't work when e.g. myattr(prova) <- function() TRUE is nested in func1 while prova is called in func2 (that it's called by func1). Unless you do not properly update its parameter env ...
For completeness, following the suggestion of #RicardoSaporta, I slightly modified the prova function:
prova <- function(..., pos = 1L) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], parent.frame(n = pos)
print(myattr(this))
# ...
}
This way, it works also when nested, if the the correct pos parameter is passed in.
With this modification it is easier to go to fish out the environment in which you set the attribute on the function prova.
myfun1 <- function() {
myattr(prova) <- function() print(FALSE)
myfun2(n = 2)
}
myfun2 <- function(n) {
prova(pos = n)
}
myfun1()
# function() print(FALSE)
# <environment: 0x22e8208>

Resources