"demask" function by finding its definition from previously loaded package - r

In my package, I define %+% operator as a shortcut for strings concatenation. As it may be defined by previously loaded packages, I want to execute my custom code only when both arguments are suitable (e.g. character), otherwise try to call the code from previously loaded packages. Here is my solution for that:
# helper function to find environment of the package
getEnvByName <- function(inpEnv=.GlobalEnv, lookFor){
e <- inpEnv;
while (environmentName(e) != 'R_EmptyEnv' & environmentName(e)!=lookFor) e <- parent.env(e);
if (environmentName(e) != lookFor) return(NULL);
return(e);
}
"%+%" <- function(arg1, arg2){
if (is.character(arg1) & is.character(arg2)) {
paste0(arg1, arg2);
} else {
e <- parent.env(getEnvByName(.GlobalEnv,'package:mypackagename'));
if (exists('%+%', envir = e)) get('%+%',envir = e)(arg1,arg2);
}
}
My questions are:
1) is it a good way to treat such situations?
2) why it is not the common practice to do similar things in other packages? For example, in the ggplot2 package, %+% operator is defined as following:
"%+%" <- function (e1, e2)
{
e2name <- deparse(substitute(e2))
if (is.theme(e1)) add_theme(e1, e2, e2name)
else if (is.ggplot(e1)) add_ggplot(e1, e2, e2name)
}
as you see, their code breaks previously defined %+% for any arguments while they could just override it only for theme or ggplot arguments and keep all other cases. I could suggest the authors to implement this kind of check but I assume there's some reason they don't do it...
UPD. just a little modification of my code: instead of defining everything in one function, I split it with UseMethod() - I'm wondering if it makes any difference:
`%+%` <- function(...) UseMethod("%+%")
`%+%.character` <- paste0
`%+%.default` <- function (arg1, arg2){
e <- parent.env(getEnvByName(.GlobalEnv,'package:mypackagename'));
get('%+%',envir = e)(arg1,arg2);
}

First of all I don't think it is a good practice to reimplement functions that already exist in widely used package (I refer to previously mentioned %s+% from stringi).
As for about you question I think the best way is this:
'%+%' <- function(arg1, arg2){
if (is.character(arg1) & is.character(arg2)) {
paste0(arg1, arg2)
} else {
old.func <- get('%+%',
envir = parent.env(.GlobalEnv),
inherits = TRUE)
old.func(arg1, arg2)
}
}
With option inherits = TRUE (which is default by the way) get performs the same search in environments as is implemented in your answer;
The method with UseMethod will work differently because in that case %+% will check only the first argument for the type "character", not both arguments;
As for ggplot2s %+% I think it was intended to return NULL with not suitable arguments' type. It might possibly be a flaw in the code.

Related

Modifying calls in function arguments

How can a function inspect and modify the arguments of a call that it received as argument?
Application: A user feeds a call to function a as an argument to function b, but they forget to specify one of the required arguments of a. How can function b detect the problem and fix it?
In this minimal example, function a requires two arguments:
a <- function(arg1, arg2) {
return(arg1 + arg2)
}
Function b accepts a call and an argument. The commented lines indicate what I need to do:
b <- function(CALL, arg3) {
# 1. check if `arg2` is missing from CALL
# 2. if `arg2` is missing, plug `arg3` in its place
# 3. return evaluated call
CALL
}
Expected behavior:
b(CALL = a(arg1 = 1, arg2 = 2), arg3 = 3)
> 3
b(CALL = a(arg1 = 1), arg3 = 3)
> 4
The second call currently fails because the user forgot to specify the required arg2 argument. How can function b fix this mistake automatically?
Can I exploit lazy evaluation to modify the call to a before it is evaluated? I looked into rlang::modify_call but couldn't figure it out.
Here's a method that would work
b <- function(CALL, arg3) {
scall <- substitute(CALL)
stopifnot(is.call(scall)) #check that it's a call
lcall <- as.list(scall)
if (!"arg2" %in% names(lcall)) {
lcall <- c(lcall, list(arg2 = arg3))
}
eval.parent(as.call(lcall))
}
We use substitute() to grab the unevaluated version the CALL parameter. We convert it to a list so we can modify it. Then we append to the list another list with the parameter name/value we want. Finally, we turn the list back into a call and then evaluate that call in the environment of the caller rather than in the function body itself.
If you wanted to use rlang::modify_call and other rlang functions you could use
b <- function(CALL, arg3) {
scall <- rlang::enquo(CALL)
stopifnot(rlang::quo_is_call(scall))
if (!"arg2" %in% names(rlang::quo_get_expr(scall))) {
scall <- rlang::call_modify(scall, arg2=arg3)
}
rlang::eval_tidy(scall, env = rlang::caller_env())
}
I don't see why fancy language manipulation is needed. The problem is what to do when a, which requires 2 arguments, is supplied only 1. Wrapping it with b, which has a default value for the 2nd argument, solves this.
b <- function(arg1, arg2=42)
{
a(arg1, arg2)
}
b(1)
# [1] 43
b(1, 2)
# [1] 3

Why does this happen when a user-defined R function does not return a value?

In the function shown below, there is no return. However, after executing it, I can confirm that the value entered d normally.
There is no return. Any suggestions in this regard will be appreciated.
Code
#installed plotly, dplyr
accumulate_by <- function(dat, var) {
var <- lazyeval::f_eval(var, dat)
lvls <- plotly:::getLevels(var)
dats <- lapply(seq_along(lvls), function(x) {
cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
})
dplyr::bind_rows(dats)
}
d <- txhousing %>%
filter(year > 2005, city %in% c("Abilene", "Bay Area")) %>%
accumulate_by(~date)
In the function, the last assignment is creating 'dats' which is returned with bind_rows(dats) We don't need an explicit return statement. Suppose, if there are two objects to be returned, we can place it in a list
In some languages like python, for memory efficiency, generators are used which will yield instead of creating the whole output in memory i.e. Consider two functions in python
def get_square(n):
result = []
for x in range(n):
result.append(x**2)
return result
When we run it
get_square(4)
#[0, 1, 4, 9]
The same function can be written as a generator. Instead of returning anything,
def get_square(n):
for x in range(n):
yield(x**2)
Running the function
get_square(4)
#<generator object get_square at 0x0000015240C2F9E8>
By casting with list, we get the same output
list(get_square(4))
#[0, 1, 4, 9]
There is always a return :) You just don't have to be explicit about it.
All R expressions return something. Including control structures and user-defined functions. (Control-structures are just functions, by the way, so you can just remember that everything is a value or a function call, and everything evaluates to a value).
For functions, the return value is the last expression evaluated in the execution of the function. So, for
f <- function(x) 2 + x
when you call f(3) you will invoke the function + with two parameters, 2 and x. These evaluate to 2 and 3, respectively, so `+`(2, 3) evaluates to 5, and that is the result of f(3).
When you call the return function -- and remember, this is a function -- you just leave the control-flow of a function early. So,
f <- function(x) {
if (x < 0) return(0)
x + 2
}
works as follows: When you call f, it will call the if function to figure out what to do in the first statement. The if function will evaluate x < 0 (which means calling the function < with parameters x and 0). If x < 0 is true, if will evaluate return(0). If it is false, it will evaluate its else part (which, because if has a special syntax when it comes to functions, isn't shown, but is NULL). If x < 0 is not true, f will evaluate x + 2 and return that. If x < 0 is true, however, the if function will evaluate return(0). This is a call to the function return, with parameter 0, and that call will terminate the execution of f and make the result 0.
Be careful with return. It is a function so
f <- function(x) {
if (x < 0) return;
x + 2
}
is perfectly valid R code, but it will not return when x < 0. The if call will just evaluate to the function return but not call it.
The return function is also a little special in that it can return from the parent call of control structures. Strictly speaking, return isn't evaluated in the frame of f in the examples above, but from inside the if calls. It just handles this special so it can return from f.
With non-standard evaluation this isn't always the case.
With this function
f <- function(df) {
with(df, if (any(x < 0)) return("foo") else return("bar"))
"baz"
}
you might think that
f(data.frame(x = rnorm(10)))
should return either "foo" or "bar". After all, we return in either case in the if statement. However, the if statement is evaluated inside with and it doesn't work that way. The function will return baz.
For non-local returns like that, you need to use callCC, and then it gets more technical (as if this wasn't technical enough).
If you can, try to avoid return completely and rely on functions returning the last expression they evaluate.
Update
Just to follow up on the comment below about loops. When you call a loop, you will most likely call one of the built-in primitive functions. And, yes, they return NULL. But you can write your own, and they will follow the rule that they return the last expression they evaluate. You can, for example, implement for in terms of while like this:
`for` <- function(itr_var, seq, body) {
itr_var <- as.character(substitute(itr_var))
body <- substitute(body)
e <- parent.frame()
j <- 1
while (j < length(seq)) {
assign(x = itr_var, value = seq[[j]], envir = e)
eval(body, envir = e)
j <- j + 1
}
"foo"
}
This function, will definitely return "foo", so this
for(i in 1:5) { print(i) }
evalutes to "foo". If you want it to return NULL, you have to be explicit about it (or just let the return value be the result of the while loop -- if that is the primitive while it returns NULL).
The point I want to make is that functions return the last expression they evaluate has to do with how the functions are defined, not how you call them. The loops use non-standard evaluation, so the last expression in the loop body you provide them might be the last value they evaluate and might not. For the primitive loops, it is not.
Except for their special syntax, there is nothing magical about loops. They follow the rules all functions follow. With non-standard evaluation it can get a bit tricky to work out from a function call what the last expression they will evaluate might be, because the function body looks like it is what the function evaluates. It is, to a degree, if the function is sensible, but the loop body is not the function body. It is a parameter. If it wasn't for the special syntax, and you had to provide loop bodies as normal parameters, there might be less confusion.

Make `==` a generic funciton in R

I would like to make == a generic function.
When I run: setGeneric("=="), the definition does not appear to change:
> `==`
function (e1, e2) .Primitive("==")
> setGeneric("==")
[1] "=="
> `==`
function (e1, e2) .Primitive("==")
And when I call setgeneric("`==`"), I get the following error:
> setGeneric("`==`")
Error in setGeneric("`==`") :
must supply a function skeleton for ‘`==`’, explicitly or via an existing function
I can define the == function with:
`==` <- function(x,y) 42;
And then I can use setGeneric on it. But then I'd have to put the body of the original == there, which seems clunky.
So is there any way to make == be generic in S4?
Thanks to MrFlick's response:
It turns out that == is already generic (implement in C). So you don't need to call setGeneric.
Rather, you can just use setMethod.
setMethod("==",
c(e1="Class1",e2="Class2"),
funciton(e1,e2) { .... })

Capture a function from parameter in NESTED function (closure function)

Consider a code snippet as follow:
f = function(y) function() y()
f(version)()
Error in f(version)() : could not find function "y"
P.s. It seems that the closure mechanism is quite different from C# Lambda. (?)
Q: How can I capture a function in the closure?
--EDIT--
Scenario: Actually, I would like to write a function factory, and I don't want to add parameter to the nested function.
Like this:
theme_factory = function(theme_fun)
{
function(device)
{
if (!is.onMac()) # Not Mac
{
(device == "RStudioGD") %?% theme_fun(): theme_fun(base_family="Heiti")
}
else
{
theme_fun(base_family="STHeiti")
}
}
}
And I defined two customized theme function for ggplot
theme_bw_rmd = theme_factory(theme_bw)
theme_grey_rmd = theme_factory(theme_grey)
Then I use them like:
function(device)
ggplot(data) + geom_point() something like that + theme_bw_rmd(device)
Thanks.
So the problem is with passing parameter? What about something like this:
alwaysaddone <- function(f) function(...) f(...)+1
biggersum <- alwaysaddone(sum)
sum(1:3)
# 6
biggersum(1:3)
# 7
You can use ... to "pass-through" any parameters you like.
Use eval(func, envir = list(... captured parameters)) or substitute(func, envir) to eval the captured function in a specific environment.

In R can I find the environment associated with a lazy argument?

Sorry this is a little complicated.
I want to capture an argument expression, but also know which environment it should be evaluated in. Something like this:
make.promise = function(x = print(b), b = 7) {
expr = substitute(x)
env = parent.frame()
function() {
eval(expr, env)
}
}
p1 = (
function() {
a = 2
make.promise(print(a))
}
)()
p2 = make.promise()
The problem is, if no argument is supplied for x, its environment becomes the local environment of make.promise(), and I don't know how to detect that. Is there a function other than substitute I could use that also captures the environment?
The simplest implementation of make.promise would be:
make.promise <- function(x) {
function() x
}
But I don't think that's what you're looking for. I'm not aware of any way to find the environment associated - you might try email the r-devel mailing list.

Resources