Given a call to a function bar::foo(), I would like to be able to programmatically switch the package bar so that the same syntax calls hello::foo().
An example:
Let's say I have three packages, parentPkg, childPkg1 and childPkg2.
In parentPkg I have a call to function childPkg1::foo()
foo() is also a function in childPkg2
I would like to be able, in parentPkg to use the :: operator to call foo() but to programatically switch the package name.
Something like:
dummy_pkg_name = ifelse(scenario=="child1", "childPkg1", "childPkg2")
dummy_pkg_name::foo()
Is it possible? How do I achieve it?
Some context
parentPkg is a function that interacts with a web application, takes some request and data and returns results from different statistical models depending on the scenarios.
Each scenario is quite complex and not everything can be generalised in parentPkg. For this reason, childPkg1 and childPkg2 (actually there are also 3 and 4) are sort of sub-packages that deals with the data cleaning and various alternatives for each scenario but return the same class of value.
The idea is that parentPkg would switch the package to the pertinent child depending on the scenario and call all of the necessary functions without having to write the same sequence for each child but just with a slightly different :: call.
Since :: can be seen as a function, it looks like
`::`(dummy_pkg_name, foo)()
is what you want. Alternatively,
getFromNamespace("foo", ns = dummy_pkg_name)()
For instance,
`::`(stats, t.test)
# function (x, ...)
# UseMethod("t.test")
# <bytecode: 0x102fd4b00>
# <environment: namespace:stats>
getFromNamespace("t.test", ns = "stats")
# function (x, ...)
# UseMethod("t.test")
# <bytecode: 0x102fd4b00>
# <environment: namespace:stats>
To adhere to KISS, simply re-assign to new named functions in global environment. Be sure to leave out () since you are not requesting to run the function.
parent_foo <- parentPkg::foo
child1_foo <- childPkg1::foo
child2_foo <- childPkg2::foo
child3_foo <- childPkg3::foo
Then, conditionally apply them as needed:
if (scenario=="child1") {
obj <- child1_foo(...)
}
else if (scenario=="child2") {
obj <- child2_foo(...)
}
...
You could also create a call() that could then be evaluated.
call("::", quote(bar), quote(foo()))
# bar::foo()
Put into use:
c <- call("::", quote(stats), quote(t.test))
eval(c)
# function (x, ...)
# UseMethod("t.test")
# <bytecode: 0x4340988>
# <environment: namespace:stats>
Wrapped up in a function using setdiff as our default function:
f <- function(pkg, fn = setdiff) {
pkg <- substitute(pkg)
fn <- substitute(fn)
eval(call("::", pkg, fn))
}
f(base)
# function (x, y)
# {
# x <- as.vector(x)
# y <- as.vector(y)
# unique(if (length(x) || length(y))
# x[match(x, y, 0L) == 0L]
# else x)
# }
# <bytecode: 0x30f1ea8>
# <environment: namespace:base>
f(dplyr)
# function (x, y, ...)
# UseMethod("setdiff")
# <environment: namespace:dplyr>
Related
I am trying to understand the behaviour of user-defined functions like the below (based on the first answer to this question), which returns the arguments supplied to it as a named list:
function(a, b, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
Essentially, functions like the above produce unexpected behaviour when one of the argument names is also the name of a primitive function whose only parameter is ...
Below are some reproducible examples.
Example 1 - function behaves as expected, missing argument does not cause error
#define function as above
fun1 <- function(a, b, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
#run function
fun1(a = 1)
#returns the below. note that $b has the missing argument and this does not cause an error
#$a
#[1] 1
#$b
Example 2 - function returns error if 'c' is one of the explicit parameters and missing
#define function as above but with new explicit argument, called 'c'
#note that c() is a primitive function whose only parameter is ...
fun2 <- function(a, b, c, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
#run function
fun2(a = 1)
#returns error:
#Error in c(as.list(environment()), list(...)) :
# argument "c" is missing, with no default
Example 3 - replace 'c' with 'switch', a primitive function with parameters other than ...
#define function same way as fun2, but change 'c' parameter to 'switch'
#note that switch() is a primitive function that has parameters other than ...
fun3 <- function(a, b, switch, ...) {
argg <- c(as.list(environment()), list(...))
print(argg)
}
#run function
fun3(a = 1)
#returns the below. note that $b and $switch have the missing argument and this does not cause an error
#$a
#[1] 1
#$b
#$switch
I have tried numerous variations of the above that seem pointless to print here given that the basic pattern should be clear and thus easily reproducible without specific passages of code; suffice to say that as far as I have been able to tell, it appears that the function returns an error if one of its arguments a.) has the same name as a primitive function whose only parameter is ... and b.) is also missing. No other changes that I tested (such as removing the ... from the user-defined function's parameters; altering the order in which the arguments are specified when calling the function or when defining the function; changing the names and quantity of other arguments specified when calling the function or defining the function, etc.) had an impact on whether the behaviour was as expected.
Another point to note is that I don't see an error if I define a function with the same parameters as fun2, and with the c argument still missing, if I am not trying to access the function's arguments inside it. For example:
#define function with same parameters but different content to fun2
fun4 <- function(a, b, c, ...) {
return(a+b)
}
#run function
fun4(a = 1, b = 2)
#returns
#[1] 3
Please could somebody explain why I see this pattern of behaviour and the reason for the key role apparently played by primitive functions that only have ... as a parameter.
Please do not submit answers or comments suggesting 'workarounds' or querying the practical significance of the issue at hand. I am not asking my question in order to address a specific practical problem and there is no reason I can think of why I would ever be forced to use the name of a primitive function as a parameter; rather, I want to understand why the errors occur when they do in order to gain a clearer understanding of how functions in general, and the processes used to access their parameters in particular, work in R.
It's not the ... that's causing the problem. When you call c(), R looks for the function definition in the environment. Outside of a function it will normally find this as base::c. But within your function it first looks for the definition in the argument c in the function call, which it then can't find. This way of calling shows that it can work by telling R specifically where to find the definition of c:
fun4 <- function(a, b, c, ...) {
argg <- base::c(as.list(environment()), list(...))
print(argg)
}
#run function
fun4(a = 1)
#> $a
#> [1] 1
#>
#> $b
#>
#>
#> $c
Environments - from Advanced R
To demonstrate where things are being called you can use this tip from Advanced R by Hadley Wickham to see where R is finding each object. In the function where c isn't an argument, it finds it in base, otherwise it "finds" it in the function environment (where a and b are also defined):
library(rlang)
where <- function(name, env = caller_env()) {
if (identical(env, empty_env())) {
stop("Can't find ", name, call. = FALSE)
} else if (env_has(env, name)) {
env
} else {
where(name, env_parent(env))
}
}
fun5 <- function(a, b, ...) {
print(where("a"))
print(where("b"))
print(where("c"))
}
#run function
fun5(a = 1)
#> <environment: 0x000000001de35890>
#> <environment: 0x000000001de35890>
#> <environment: base>
fun6 <- function(a, b, c, ...) {
print(where("a"))
print(where("b"))
print(where("c"))
}
#run function
fun6(a = 1)
#> <environment: 0x000000001e1381f0>
#> <environment: 0x000000001e1381f0>
#> <environment: 0x000000001e1381f0>
Created on 2021-12-15 by the reprex package (v2.0.1)
I am pretty new to R, I have coded with Python and here OOP is quite different to python. I am trying to understand it, so in S3 you can create methods/functions that are not directly attached to a single class, just the same as the objects as they can be in multiple classes (which is quite flexible I guess). However what I do not understand is when I am creating a class such as:
> my_mean <- function (x, ...) {
UseMethod("my_mean", x)}
> my_mean
function (x, ...) {
UseMethod("my_mean", x)}
> my_mean.default <- function(obj){cat("this is a generic function")}
> my_mean.default
function(obj){cat("this is a generic function")}
But then when I have to run for example summary:
summary.default
function (object, ..., digits, quantile.type = 7)
{
if (is.factor(object))
return(summary.factor(object, ...))
else if (is.matrix(object)) {
if (missing(digits))
return(summary.matrix(object, quantile.type = quantile.type,
...))
else return(summary.matrix(object, digits = digits, quantile.type = quantile.type,
...))
}
value <- if (is.logical(object))
c(Mode = "logical", {
tb <- table(object, exclude = NULL, useNA = "ifany")
if (!is.null(n <- dimnames(tb)[[1L]]) && any(iN <- is.na(n))) dimnames(tb)[[1L]][iN] <- "NA's"
tb
})
else if (is.numeric(object)) {
nas <- is.na(object)
object <- object[!nas]
qq <- stats::quantile(object, names = FALSE, type = quantile.type)
qq <- c(qq[1L:3L], mean(object), qq[4L:5L])
if (!missing(digits))
qq <- signif(qq, digits)
names(qq) <- c("Min.", "1st Qu.", "Median",
"Mean", "3rd Qu.", "Max.")
if (any(nas))
c(qq, `NA's` = sum(nas))
else qq
}
else if (is.recursive(object) && !is.language(object) &&
(n <- length(object))) {
sumry <- array("", c(n, 3L), list(names(object),
c("Length", "Class", "Mode")))
ll <- numeric(n)
for (i in 1L:n) {
ii <- object[[i]]
ll[i] <- length(ii)
cls <- oldClass(ii)
sumry[i, 2L] <- if (length(cls))
cls[1L]
else "-none-"
sumry[i, 3L] <- mode(ii)
}
sumry[, 1L] <- format(as.integer(ll))
sumry
}
else c(Length = length(object), Class = class(object), Mode = mode(object))
class(value) <- c("summaryDefault", "table")
value
}
<bytecode: 0x000001926eaaf8f8>
<environment: namespace:base>
> summary
function (object, ...)
UseMethod("summary")
<bytecode: 0x000001926e9ec2c0>
<environment: namespace:base>
I cannot see the difference in why when you call summary in the console it does not give you the function, it gives you a reference to that object. There's any explanation? Furthermore, is it generic in some way similar to init?
S3 classes work nothing like any OOP you may be familiar with from other languages. They are a losely connected set of mechanisms that only work when you stick to certain rules.
x <- 1:11
mean(x)
#> [1] 6
This implcitely calls the function mean.default because x is a simple atomic vector.
Now we create a method for our own class evil
mean.evil <- function( x ) {
return(666) # always retuns 666 that is why it is evil
}
And we convert the vector x to a class evil:
class(x) <- "evil" # you can actually do it just like that
Now, calling mean determines that xis of class evil and calls the according function.
mean(x) # calls mean.evil
#> [1] 666
mean.default(x) # coerces R to use the default method which is still possible
#> [1] 6
The reason is that mean uses UseMethod() which checks the class and tries to find a function that has name with the pattern mean.[myclass]. And that is all that happens.
mean
#> function (x, ...)
#> UseMethod("mean")
#> <bytecode: 0x0000000015812e18>
#> <environment: namespace:base>
In other languages everything is held together by the syntax. S3 mechanisms on the other hand can be used to "approximate" OOP but they can be easily misused. They are simply and effective and appropriate for many use cases in R. If you are interested in more advanced OOP in R I recommend R6 classes.
Created on 2020-06-30 by the reprex package (v0.3.0)
I want to be able to find the environment from which the ... (dots) arguments of a call originate.
Scenario
For example, consider a function
foo <- function(x, ...) {
# do something
}
We want a function env_dots(), which we invoke from within foo(), that finds the originating environment of the ... in a call to foo(), even when the call to foo() is deeply nested. That is, if we define
foo <- function(x, ...) {
# find the originating environment of '...'
env <- env_dots()
# do something
}
and nest a call to foo, like so,
baz <- function(...) {
a <- "You found the dots"
bar(1, 2)
}
bar <- function(...)
foo(...)
then calling baz() should return the environment in which the ... in the (nested) call to foo(...) originates: this is the environment where the call bar(1, 2) is made, since the 2 (but not the 1) gets passed to the dots of foo. In particular, we should get
baz()$a
#> [1] "You found the dots"
Naive implementation of env_dots()
Update — env_dots(), as defined here, will not work in general, because the final ... may be populated by arguments that are called at multiple levels of the call stack.
Here's one possibility for env_dots():
# mc: match.call() of function from which env_dots() is called
env_dots <- function(mc) {
# Return NULL if initial call invokes no dots
if (!rlang::has_name(mc, "...")) return(NULL)
# Otherwise, climb the call stack until the dots origin is found
stack <- rlang::call_stack()[-1]
l <- length(stack)
i <- 1
while (i <= l && has_dots(stack[[i]]$expr)) i <- i + 1
# return NULL if no dots invoked
if (i <= l) stack[[i + 1]]$env else NULL
}
# Does a call have dots?
has_dots <- function(x) {
if (is.null(x))
return(FALSE)
args <- rlang::lang_tail(x)
any(vapply(args, identical, logical(1), y = quote(...)))
}
This seems to work: with
foo <- function(x, ...)
env_dots(match.call(expand.dots = FALSE))
we get
baz()$a
#> [1] "You found the dots"
bar(1, 2) # 2 gets passed down to the dots of foo()
#> <environment: R_GlobalEnv>
bar(1) # foo() captures no dots
#> NULL
Questions
The above implementation of env_dots() is not very efficient.
Is there are more skillful way to implement env_dots() in rlang and/or base R?
How can I move the match.call() invocation to within env_dots()?
match.call(sys.function(-1), call = sys.call(-1), expand.dots = FALSE) will indeed work.
Remark — One can't infer the origin environment of the dots from rlang::quos(...), because some quosures won't be endowed with the calling environment (e.g., when an expression is a literal object).
I'm sorry to dig up an old question, but I'm not sure the desired behavior is well-defined. ... is not a single expression; it's a list of expressions. In case of rlang quosures, each of those expressions has their own environment. So what should the environment of the list be?
Furthermore, the ... list itself can be modified. Consider the following example, where g takes its ..., prepends it with an (unevaluated) expression x+3 and passes it onto f.
f <- function(...) {rlang::enquos( ... )}
g <- function(...) {
a <- rlang::quo( x + 3 )
l <- rlang::list2( a, ... )
f(!!!l)
}
b <- rlang::quo( 5 * y )
g( b, 10 )
# [[1]]
# <quosure>
# expr: ^x + 3
# env: 0x7ffd1eca16f0
# [[2]]
# <quosure>
# expr: ^5 * y
# env: global
# [[3]]
# <quosure>
# expr: ^10
# env: empty
Notice that each of the three quosures that make it over to f has their own environment. (As you noted in your question, literals like 10 have an empty environment. This is because the value is the same independent of which environment it's evaluated in.)
Given this scenario, what should the hypothetical env_dots() return when called inside f()?
I am trying to update a call for a new function I developed with a new class. The developing is pretty similar to linmod found in Leish's article "Creating R packages".
Inside the function, the call is stored with match.call().
When I try to update the call, as follows:
library(MASS)
fit <- linmod(Hwt~Bwt*Sex, data=cats)
update(fit, subset = -1)
I got the following error message:
Error in eval(expr, envir, enclos) :
could not find function "linmod.formula"
The problem seems to be that match.call() saves the full S3 method name (linmod.formula), instead of just the generic function name (linmod), which would work perfectly.
Anyone could help me how to solve this problem?
The easiest way I know for fixing this is exporting the method. For this, you need to add #export linmod.formula. Of course, it is generally not recommended to export methods.
Another option is creating a method for update. The following is a copy of update.default with one additional line:
#' #export
update.linmod <- function (object, formula., ..., evaluate = TRUE)
{
if (is.null(call <- getCall(object)))
stop("need an object with call component")
extras <- match.call(expand.dots = FALSE)$...
#call generic instead of method:
call[[1]] <- quote(linmod)
if (!missing(formula.))
call$formula <- update.formula(formula(object), formula.)
if (length(extras)) {
existing <- !is.na(match(names(extras), names(call)))
for (a in names(extras)[existing]) call[[a]] <- extras[[a]]
if (any(!existing)) {
call <- c(as.list(call), extras[!existing])
call <- as.call(call)
}
}
if (evaluate)
eval(call, parent.frame())
else call
}
I dislike both options and would avoid having methods for the linmod function. Your default method seems useless to me. Note how, e.g., lm is not an S3 generic.
PS: update doesn't have a subset parameter.
Since this hasn't been mentioned here yet, and it is the approach explicitly recommended in ?update: write a method for getCall. From ?update:
“Extracting the call” in update() and similar functions uses getCall() which itself is a (S3) generic function with a default method that simply gets x$call. Because of this, update() will often work (via its default method) on new model classes, either automatically, or by providing a simple getCall() method for that class.
So, in your package, if you have:
#' #export
f <- function(x) {
UseMethod("f")
}
#' #export
f.bar <- function(x) {
structure(list(x = x, call = match.call()), class = "fbar")
}
#' #export
#' #importFrom stats getCall
getCall.fbar <- function(x) {
x$call[[1L]] <- quote(f) # replacing `f.bar`
x$call
}
Then, in your script, you could do:
x1 <- structure(1, class = "bar")
x2 <- structure(2, class = "bar")
fx1 <- f(x = x1)
fx2 <- update(fx1, x = x2)
fx1
# $x
# [1] 1
# attr(,"class")
# [1] "bar"
#
# $call
# f.bar(x = x1)
#
# attr(,"class")
# [1] "fbar"
fx2
# $x
# [1] 2
# attr(,"class")
# [1] "bar"
#
# $call
# f.bar(x = x2)
#
# attr(,"class")
# [1] "fbar"
Suppose we have this functions in a R package.
prova <- function() {
print(attr(prova, 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
'myattr<-' <- function(x, value) {
attr(x, 'myattr') <- value
x
}
myattr <- function(x) attr(x, 'myattr')
So, I install the package and then I test it. This is the result:
prova()
# NULL
# NULL
myattr(prova) <- 'ciao' # setting 'ciao' for 'myattr' attribute
prova()
# NULL
# NULL # Why NULL here ?
myattr(prova)
# [1] "ciao"
attr(prova, 'myattr')
# [1] "ciao"
The question is: how to get the attribute of the function from within itself?
Inside the function itself I cannot get its attribute, as demonstrated by the example.
I suppose that the solution will be of the serie "computing on the language" (match.call()[[1L]], substitute, environments and friends). Am I wrong?
I think that the important point here is that this function is in a package (so, it has its environment and namespace) and I need its attribute inside itself, in the package, not outside.
you can use get with the envir argument.
prova <- function() {
print(attr(get("prova", envir=envir.prova), 'myattr'))
print(myattr(prova))
invisible(TRUE)
}
eg:
envir.prova <- environment()
prova()
# NULL
# NULL
myattr(prova) <- 'ciao'
prova()
# [1] "ciao"
# [1] "ciao"
Where envir.prova is a variable whose value you set to the environment in which prova is defined.
Alternatively you can use get(.. envir=parent.frame()), but that is less reliable as then you have to track the calls too, and ensure against another object with the same name between the target environment and the calling environment.
Update regarding question in the comments:
regarding using parent.frame() versus using an explicit environment name: parent.frame, as the name suggests, goes "up one level." Often, that is exactly where you want to go, so that works fine. And yet, even when your goal is get an object in an environment further up, R searches up the call stack until it finds the object with the matching name. So very often, parent.frame() is just fine.
HOWEVER if there are multiple calls between where you are invoking parent.frame() and where the object is located AND in one of the intermediary environments there exists another object with the same name, then R will stop at that intermediary environment and return its object, which is not the object you were looking for.
Therefore, parent.frame() has an argument n (which defaults to 1), so that you can tell R to begin it's search at n levels back.
This is the "keeping track" that I refer to, where the developer has to be mindful of the number of calls in between. The straightforward way to go about this is to have an n argument in every function that is calling the function in question, and have that value default to 1. Then for the envir argument, you use: get/assign/eval/etc (.. , envir=parent.frame(n=n) )
Then if you call Func2 from Func1, (both Func1 and Func2 have an n argument), and Func2 is calling prova, you use:
Func1 <- function(x, y, ..., n=1) {
... some stuff ...
Func2( <some, parameters, etc,> n=n+1)
}
Func2 <- function(a, b, c, ..., n=1) {
.... some stuff....
eval(quote(prova()), envir=parent.frame(n=n) )
}
As you can see, it is not complicated but it is * tedious* and sometimes what seems like a bug creeps in, which is simply forgetting to carry the n over.
Therefore, I prefer to use a fixed variable with the environment name.
The solution that I found is:
myattr <- function(x) attr(x, 'myattr')
'myattr<-' <- function(x, value) {
# check that x is a function (e.g. the prova function)
# checks on value (e.g. also value is a function with a given precise signature)
attr(x, 'myattr') <- value
x
}
prova <- function(..., env = parent.frame()) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], env)
# print(eval(as.call(c(myattr, this)), env)) # alternative
print(myattr(this))
# print(attr(this, 'myattr')
invisible(TRUE)
}
I want to thank #RicardoSaporta for the help and the clarification about keeping tracks of the calls.
This solution doesn't work when e.g. myattr(prova) <- function() TRUE is nested in func1 while prova is called in func2 (that it's called by func1). Unless you do not properly update its parameter env ...
For completeness, following the suggestion of #RicardoSaporta, I slightly modified the prova function:
prova <- function(..., pos = 1L) {
# get the current function object (in its environment)
this <- eval(match.call()[[1L]], parent.frame(n = pos)
print(myattr(this))
# ...
}
This way, it works also when nested, if the the correct pos parameter is passed in.
With this modification it is easier to go to fish out the environment in which you set the attribute on the function prova.
myfun1 <- function() {
myattr(prova) <- function() print(FALSE)
myfun2(n = 2)
}
myfun2 <- function(n) {
prova(pos = n)
}
myfun1()
# function() print(FALSE)
# <environment: 0x22e8208>