Imagine you have a simple function that specifies which statistical tests to run for each variable. Its syntax, simplified for the purposes of this question, is as follows:
test <- function(...) {
x <- list(...)
return(x)
}
which takes argument pairs such as Gender = 'Tukey' and is intended to pass its result to other functions down the line. The output of test() is as follows:
test(Gender = 'Tukey')
# $Gender
# [1] "Tukey"
What is desired is the ability to replace the literal Gender by a dynamically assigned variable varname (e.g., for looping purposes). Currently what happens is:
varname <- 'Gender'
test(varname = 'Tukey')
# $varname
# [1] "Tukey"
but what is desired is this:
varname <- 'Gender'
test(varname = 'Tukey')
# $Gender
# [1] "Tukey"
I tried tinkering with functions such as eval() and parse(), but to no avail. In practice, I resolved the issue by simply renaming the resulting list, but it is an ugly solution and I am sure there is an elegant R way to achieve it. Thanks in advance for the educational value of your answer.
NB: This question occurred to me while trying to program a custom function which uses mcp() from the multcomp package in its internals. The said mcp() function is the real-world counterpart of test().
EDIT1: Perhaps it needs to be clarified that (for educational purposes) changing test() is not an option. The question is about how to pass the tricky argument to test(). If you take a look at NB, it becomes clear why: the real world counterpart of test(), namely mcp(), comes with a package. And while it is possible to create a modified copy of it, I am really curious whether there exists a simple solution in somehow 'converting' the dynamically assigned variable to a literal in the context of dot-arguments.
This works:
test <- function(...) {
x = list(...)
names(x) <- sapply(names(x),
function(p) eval(as.symbol(p)))
return(x)
}
apple = "orange"
test(apple = 5)
We can use
test <- function(...) {
x <- list(...)
if(exists(names(x))) names(x) <- get(names(x))
x
}
test(Gender = 'Tukey')
#$Gender
#[1] "Tukey"
test(varname = 'Tukey')
#$Gender
#[1] "Tukey"
What about this:
varname <- "Gender"
args <- list()
args[[varname]] <- "Tukey"
do.call(test, args)
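The do.call() idea above can be written a little more compactly with setNames(), which also makes it easy to loop over several variable names. This is only a sketch: test() is the stand-in function from the question, and the second variable name "Age" is a made-up example.

```r
# Stand-in for the real function, copied from the question.
test <- function(...) {
  x <- list(...)
  return(x)
}

varname <- "Gender"

# Build a one-element named list whose name comes from the variable,
# then let do.call() splice it into the call as a named argument.
args <- setNames(list("Tukey"), varname)
res <- do.call(test, args)
str(res)

# The same pattern inside a loop; "Age" is a hypothetical second variable.
vars <- c("Gender", "Age")
res_all <- lapply(vars, function(v) do.call(test, setNames(list("Tukey"), v)))
```

Because the name is attached to the argument list before the call is built, test() itself does not need to change.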
I am stuck trying to pass a variable through a few functions, and in the final function I want to get the name of the original variable. But it seems that the substitute function in R looks only in the "local" environment, or just one level up. Well, let me explain it with code:
fun1 <- function (some_variable) {deparse(substitute(some_variable))}
fun2 <- function (var_pass) { fun1 (var_pass) }
my_var <- c(1,2) # I want to get 'my_var' in the end
fun2 (my_var) # > "var_pass"
Well, it seems we are printing the name of the variable that was passed to fun1 only. The documentation of substitute tells us that we can use the env argument to specify where to look. But by passing .GlobalEnv or .BaseNamespaceEnv as an argument to substitute I got even stranger results: "some_variable"
I believe the answer lies in using the env argument of this function, so could you please explain how it works and how I can get what I need? Thanks in advance!
I suggest you consider passing an optional name value to these functions. I say this because it seems like you really want to use the name as a label for something in the end result; so it's not really the variable itself that matters so much as its name. You could do
fun1 <- function (some_variable, name=deparse(substitute(some_variable))) {
name
}
fun2 <- function (var_pass, name=deparse(substitute(var_pass))) {
fun1 (var_pass, name)
}
my_var <- c(1,2)
fun2(my_var)
# [1] "my_var"
fun1(my_var)
# [1] "my_var"
This way, if you end up having some odd variable name and want to give a better name to a result, you at least have the option. And by default it should do what you want without requiring the name parameter.
One hack, probably not the best way:
fun2 <- function (var_pass) { fun1 (deparse(substitute(var_pass))) }
fun1 <- function (some_variable) { some_variable }
fun2(my_var)
# "my_var"
And you could run get on that. But as Paul H suggests, there are better ways to track variables.
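For what it's worth, here is a base-R sketch of why substitute() seems to stop one level up, and how to reach exactly one level further. It assumes the same fun1/fun2 layout as the question; fun3 is a hypothetical extra wrapper added only to show the limit.

```r
# substitute() only sees the expression of the promise in the frame it
# looks at. Nesting it lets us repeat the lookup in the caller's frame.
fun1 <- function(some_variable) {
  # Inner step: turns `substitute(some_variable)` into `substitute(var_pass)`.
  # eval.parent() then evaluates that call in the caller's frame (fun2),
  # where var_pass is a promise whose expression is `my_var`.
  deparse(eval.parent(substitute(substitute(some_variable))))
}
fun2 <- function(var_pass) fun1(var_pass)

my_var <- c(1, 2)
via_fun2 <- fun2(my_var)   # "my_var"

# Limit of the trick: it climbs exactly one level, no more.
fun3 <- function(z) fun2(z)   # hypothetical extra wrapper
via_fun3 <- fun3(my_var)      # "z", not "my_var"
```

This is why passing the name along explicitly tends to be more robust: the nested substitute() has to be tailored to a fixed call depth.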
Another approach I'd like to suggest is to use rlang::enexpr.
The main advantage is that we don't need to carry the original variable name in a parameter. The downside is that we have to deal with expressions which are slightly trickier to use.
> fun1 <- function (some_variable) {
message("Entering fun1")
rlang::enexpr(some_variable)
}
> fun2 <- function (var_pass) {
message("Entering fun2")
eval(parse(text=paste0("fun1(", rlang::enexpr(var_pass), ")")))
}
> my_var <- c(1, 2)
> fun1(my_var)
#Entering fun1
my_var
> fun2(my_var)
#Entering fun2
#Entering fun1
my_var
The trick here is that we have to evaluate the argument name in fun2 and build the call to fun1 as a character string. If we were to simply call fun1 with enexpr(var_pass), we would lose the notion of fun2's variable name, because enexpr(var_pass) would never be evaluated in fun2:
> bad_fun2 <- function (var_pass) {
message("Entering bad fun2")
fun1(rlang::enexpr(var_pass))
}
> bad_fun2(my_var)
#Entering bad fun2
#Entering fun1
rlang::enexpr(var_pass)
On top of that, note that neither fun1 nor fun2 return variable names as character vectors. The returned object is of class name (and can of course be coerced to character).
The bright side is that you can use eval directly on it.
> ret <- fun2(my_var)
#Entering fun2
#Entering fun1
> as.character(ret)
[1] "my_var"
> class(ret)
[1] "name"
> eval(ret)
[1] 1 2
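As a side note, the same result can probably be had without rlang and without pasting code together as text. The sketch below is an assumption on my part, not part of the answer above: it uses base R's substitute() to capture the expression and bquote() to build the call to fun1.

```r
# Base-R variant: capture the caller's expression, then splice it into
# a call to fun1 with bquote() instead of parse(text = ...).
fun1 <- function(some_variable) substitute(some_variable)

fun2 <- function(var_pass) {
  arg_expr <- substitute(var_pass)   # the symbol `my_var`, unevaluated
  eval(bquote(fun1(.(arg_expr))))    # builds and runs fun1(my_var)
}

my_var <- c(1, 2)
ret <- fun2(my_var)

as.character(ret)  # "my_var"
class(ret)         # "name"
eval(ret)          # evaluates the symbol where my_var is visible
```

Building the call from language objects avoids the quoting problems that can appear when an expression is round-tripped through a character string.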
I'm trying to evaluate an expression containing an address of an object at a parent.frame scope, and am getting weird results:
test2 <- function(d) {
address.current <- address(d) # "0x5595b73aedf8"
address.at.caller <- eval(parse(text="address(df)")) # "0x5595b73aedf8"
address.at.caller2 <- do.call(address, args=list("df"), envir=parent.frame()) # problem: "0x5595b6d89de8"
}
test1 <- function(df) {
test2(df)
}
df <- data.frame(a=1:2)
test1(df)
Moreover, if you stop at a breakpoint inside test2 and re-evaluate the expression for address.at.caller2 you'd get non-repeating results:
Browse[2]> do.call(address, args=list("df"), envir=parent.frame())
[1] "0x5595b8c37d78"
Browse[2]> do.call(address, args=list("df"), envir=parent.frame())
[1] "0x5595b8cc74a8"
Browse[2]> do.call(address, args=list("df"), envir=parent.frame())
[1] "0x5595b8cd1348"
This seems to indicate that the result is an address of some temporary object. (Evaluate repeatedly address(2) for a different example).
Is something wrong with the expression do.call(address, args=list("df"), envir=parent.frame())?
Is there a different explanation for this behaviour?
It's not really clear what you are trying to do by using do.call. When you use it the way you did, you give it a value (a string) and ask for its address. do.call treats the elements of args as values, so args = list("df") hands address() the character string "df", not the variable df. That string is a temporary object created for the call, so you get its address, and it is different every time. You should pass the variable you want to evaluate into the function, or alternatively have it sit in the global scope.
Interesting question.
You don't have to pass input variable really, or operate on the global scope. You can use a more robust alternative to do.call, the eval(as.call(.)).
test2 <- function(d) {
address.current <- address(d)
print(address.current)
address.at.caller <- eval(parse(text="address(df)"))
print(address.at.caller)
address.at.caller2 <- do.call(address, args=list("df"), envir=parent.frame())
print(address.at.caller2)
address.at.caller3 = eval.parent(as.call(list(quote(address), as.name("df"))))
print(address.at.caller3)
}
test1 <- function(df) {
test2(df)
}
df <- data.frame(a=1:2)
test1(df)
[1] "0x560d46e33cc0"
[1] "0x560d46e33cc0"
[1] "0x560d46e4a5f8"
[1] "0x560d46e33cc0"
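To see the string-versus-symbol difference without needing an address() function, here is a small sketch. The helper describe() is hypothetical and simply reports the class of whatever it receives.

```r
# Hypothetical helper: reports the class of the value it is given.
describe <- function(x) class(x)

df <- data.frame(a = 1:2)

# A character string in args is passed as a value: describe("df").
res_string <- do.call(describe, list("df"))

# A symbol in args is evaluated as a variable: describe(df).
res_symbol <- do.call(describe, list(as.name("df")))

res_string  # "character"
res_symbol  # "data.frame"
```

So with do.call, wrapping the name in as.name() (and pointing envir at the right frame) should refer to the actual object rather than to a throwaway string.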
For example, suppose I would like to be able to define a function that returned the name of the assignment variable concatenated with the first argument:
a <- add_str("b")
a
# "ab"
The function in the example above would look something like this:
add_str <- function(x) {
arg0 <- as.list(match.call())[[1]]
return(paste0(arg0, x))
}
but where the arg0 line of the function is replaced by a line that will get the name of the variable being assigned ("a") rather than the name of the function.
I've tried messing around with match.call and sys.call, but I can't get it to work. The idea here is that the assignment operator is being called on the variable and the function result, so that should be the parent call of the function call.
I think that it's not strictly possible, as other solutions explained, and the reasonable alternative is probably Yosi's answer.
However we can have fun with some ideas, starting simple and getting crazier gradually.
1 - define an infix operator that looks similar
`%<-add_str%` <- function(e1, e2) {
e2_ <- e2
e1_ <- as.character(substitute(e1))
eval.parent(substitute(e1 <- paste0(e1_,e2_)))
}
a %<-add_str% "b"
a
# "ab"
2 - Redefine := so that it makes available the name of the lhs to the rhs through a ..lhs() function
I think it's my favourite option :
`:=` <- function(lhs,rhs){
lhs_name <- as.character(substitute(lhs))
assign(lhs_name,eval(substitute(rhs)), envir = parent.frame())
lhs
}
..lhs <- function(){
eval.parent(quote(lhs_name),2)
}
add_str <- function(x){
res <- paste0(..lhs(),x)
res
}
a := add_str("b")
a
# [1] "ab"
There might be a way to redefine <- based on this, but I couldn't figure it out due to recursion issues.
3 - Use memory address dark magic to hunt lhs (if it exists)
This comes straight from: Get name of x when defining `(<-` operator
We'll need to change the syntax a bit and define the function fetch_name for this purpose, which is able to get the name of the variable being assigned to from a *<- function, where as.character(substitute(lhs)) would return "*tmp*".
fetch_name <- function(x,env = parent.frame(2)) {
all_addresses <- sapply(ls(env), pryr:::address2, env)
all_addresses <- all_addresses[names(all_addresses) != "*tmp*"]
all_addresses_short <- gsub("(^|<)[0x]*(.*?)(>|$)","\\2",all_addresses)
x_address <- tracemem(x)
untracemem(x)
x_address_short <- tolower(gsub("(^|<)[0x]*(.*?)(>|$)","\\2",x_address))
ind <- match(x_address_short, all_addresses_short)
x_name <- names(all_addresses)[ind]
x_name
}
`add_str<-` <- function(x,value){
x_name <- fetch_name(x)
paste0(x_name,value)
}
a <- NA
add_str(a) <- "b"
a
4- a variant of the latter, using .Last.value :
add_str <- function(value){
x_name <- fetch_name(.Last.value)
assign(x_name,paste0(x_name,value),envir = parent.frame())
paste0(x_name,value)
}
a <- NA;add_str("b")
a
# [1] "ab"
Operations don't need to be on the same line, but they need to follow each other.
5 - Again a variant, using a print method hack
Extremely dirty and convoluted, to please the tortured spirits and troll the others.
This is the only one that really gives the expected output, but it works only in interactive mode.
The trick is that instead of doing all the work in the first operation I also use the second (printing). So in the first step I return an object whose value is "b", but I also assigned a class "weird" to it and a printing method, the printing method then modifies the object's value, resets its class, and destroys itself.
add_str <- function(x){
class(x) <- "weird"
assign("print.weird", function(x) {
env <- parent.frame(2)
x_name <- fetch_name(x, env)
assign(x_name,paste0(x_name,unclass(x)),envir = env)
rm(print.weird,envir = env)
print(paste0(x_name,x))
},envir = parent.frame())
x
}
a <- add_str("b")
a
# [1] "ab"
(a <- add_str("b")) will have the same effect as both lines above. print(a <- add_str("b")) would also have the same effect, and would work in non-interactive code as well.
This is generally not possible because the operator <- is actually parsed to a call of the <- function:
rapply(as.list(quote(a <- add_str("b"))),
function(x) if (!is.symbol(x)) as.list(x) else x,
how = "list")
#[[1]]
#`<-`
#
#[[2]]
#a
#
#[[3]]
#[[3]][[1]]
#add_str
#
#[[3]][[2]]
#[1] "b"
Now, you can access earlier calls on the call stack by passing negative numbers to sys.call, e.g.,
foo <- function() {
inner <- sys.call()
outer <- sys.call(-1)
list(inner, outer)
}
print(foo())
#[[1]]
#foo()
#[[2]]
#print(foo())
However, help("sys.call") says this (emphasis mine):
Strictly, sys.parent and parent.frame refer to the context of the
parent interpreted function. So internal functions (which may or may
not set contexts and so may or may not appear on the call stack) may
not be counted, and S3 methods can also do surprising things.
<- is such an "internal function":
`<-`
#.Primitive("<-")
`<-`(x, foo())
x
#[[1]]
#foo()
#
#[[2]]
#NULL
As Roland pointed out, <- is outside the scope of your function and could only be located by looking at the stack of function calls, but this fails. So a possible solution could be to redefine '<-' as something other than a primitive or, better, to define something that does the same job plus some additional things.
I don't know if the ideas behind the following code fit your needs, but you can define a "verbose assignment":
`:=` <- function (var, value)
{
call = as.list(match.call())
message(sprintf("Assigning %s to %s.\n",deparse(call$value),deparse(call$var)))
eval(substitute(var <<- value))
return(invisible(value))
}
x := 1:10
# Assigning 1:10 to x.
x
# [1] 1 2 3 4 5 6 7 8 9 10
And it works in some other situations where '<-' is not really a plain assignment:
y <- data.frame(c=1:3)
colnames(y) := "b"
# Assigning "b" to colnames(y).
y
# b
#1 1
#2 2
#3 3
z <- 1:4
dim(z) := c(2,2)
#Assigning c(2, 2) to dim(z).
z
# [,1] [,2]
#[1,] 1 3
#[2,] 2 4
I don't think the function has access to the variable it is being assigned to. It is outside of the function scope and you do not pass any pointer to it or specify it in any way. If you were to specify it as a parameter, you could do something like this:
add_str <- function(x, y) {
arg0 <-deparse(substitute(x))
return(paste0(arg0, y))
}
a <- 5
add_str(a, 'b')
#"ab"
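If passing the target explicitly is acceptable, a variation is to pass its name as a string and let the function do the assignment itself. This is only a sketch; add_str_to and its envir argument are hypothetical names, not something from the question.

```r
# Hypothetical helper: takes the target name as a string, builds the
# value from that name, and assigns it in the caller's environment.
add_str_to <- function(name, x, envir = parent.frame()) {
  value <- paste0(name, x)
  assign(name, value, envir = envir)
  invisible(value)
}

# Use a dedicated environment here so the effect is easy to inspect.
e <- new.env()
add_str_to("a", "b", envir = e)
get("a", envir = e)  # "ab"
```

Called at top level without envir, it would create or overwrite a in the calling environment, which is the closest explicit equivalent of a <- add_str("b").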
I am looking for advice on the best way to code passing a list of functions as argument.
What I want to do:
I would like to pass a list of functions as an argument, apply them to a specific input, and name the outputs after those functions.
A "reproducible" example
input = 1:5
Functions to pass are mean, min
Expected call:
foo(input, something_i_ask_help_for)
Expected output:
list(mean = 3, min = 1)
If it's not perfectly clear, please see my two solutions to have an illustration.
Solution 1: Passing functions as arguments
foo <- function(input, funs){
# Initialize output
output = list()
# Compute output
for (fun_name in names(funs)){
# For each function I calculate it and store it in output
output[fun_name] = funs[[fun_name]](input)
}
return(output)
}
foo(1:5, list(mean=mean, min=min))
What I don't like with this method is that we can't call it by doing: foo(1:5, list(mean, min)).
Solution 2: passing functions names as argument and using get
foo2 <- function(input, funs){
# Initialize output
output = list()
# Compute output
for (fun in funs){
# For each function I calculate it and store it in output
output[fun] = get(fun)(input)
}
return(output)
}
foo2(1:5, c("mean", "min"))
What I don't like with this method is that we are not really passing the function object as an argument.
My question:
Both ways work, but I am not quite sure which one to choose.
Could you help me by:
Telling me which one is the best?
Explaining the advantages and drawbacks of each method?
Telling me whether there is another (better) method?
If you need any more information, don't hesitate to ask.
Thanks!
Simplifying solutions in question
The first of the solutions in the question requires that the list be named and the second requires that the functions have names which are passed as character strings. Those two user interfaces could be implemented using the following simplifications. Note that we add an envir argument to foo2 to ensure function name lookup occurs as expected. Of those the first seems cleaner but if the functions were to be used interactively and less typing were desired then the second does do away with having to specify the names.
foo1 <- function(input, funs) Map(function(f) f(input), funs)
foo1(1:5, list(min = min, max = max)) # test
foo2 <- function(input, nms, envir = parent.frame()) {
Map(function(nm) do.call(nm, list(input), envir = envir), setNames(nms, nms))
}
foo2(1:5, list("min", "max")) # test
Alternately we could build foo2 on foo1:
foo2a <- function(input, funs, envir = parent.frame()) {
foo1(input, mget(unlist(funs), envir = envir, inherits = TRUE))
}
foo2a(1:5, list("min", "max")) # test
or base the user interface on passing a formula containing the function names since formulas already incorporate the notion of environment:
foo2b <- function(input, fo) foo2(input, all.vars(fo), envir = environment(fo))
foo2b(1:5, ~ min + max) # test
Optional names while passing function itself
However, the question indicates that it is preferred that
the functions themselves be passed
names are optional
To incorporate those features the following allows the list to have names or not or a mixture. If a list element does not have a name then the expression defining the function (usually its name) is used.
We can derive the names from the list's names or when a name is missing we can use the function name itself or if the function is anonymous and so given as its definition then the name can be the expression defining the function.
The key is to use match.call and pick it apart. We ensure that funs is a list in case it is specified as a character vector. match.fun will interpret functions and character strings naming functions and look them up in the parent frame so we use a for loop instead of Map or lapply in order that we not generate a new function scope.
foo3 <- function(input, funs) {
cl <- match.call()[[3]][-1]
nms <- names(cl)
if (is.null(nms)) nms <- as.character(cl)
else nms[nms == ""] <- as.character(cl)[nms == ""]
funs <- as.list(funs)
for(i in seq_along(funs)) funs[[i]] <- match.fun(funs[[i]])(input)
setNames(funs, nms)
}
foo3(1:5, list(mean = mean, min = "min", sd, function(x) x^2))
giving:
$mean
[1] 3
$min
[1] 1
$sd
[1] 1.581139
$`function(x) x^2`
[1] 1 4 9 16 25
One thing that you are missing is replacing the for loops with lapply. Also for functional programming it is often good practice to separate functions to do one thing. I personally like the version from solution 1 where you pass the functions in directly because it avoids another call in R and therefore is more efficient. In solution 2, it is best to use match.fun instead of get. match.fun is stricter than get in searching for functions.
x <- 1:5
foo <- function(input, funs) {
lapply(funs, function(fun) fun(input))
}
foo(x, c(mean=mean, min=min))
The above code simplifies your solution 1. To add to this function, you could add some error handling such as is.numeric for x and is.function for funs.
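A rough sketch of the error handling mentioned above might look like this. The name foo_checked and the exact checks are my own assumptions about what "some error handling" could mean here.

```r
# Hypothetical checked variant of foo(): validates its inputs first,
# then applies every function to the input.
foo_checked <- function(input, funs) {
  stopifnot(
    is.numeric(input),
    is.list(funs),
    all(vapply(funs, is.function, logical(1)))
  )
  # Require names so that the output can be labelled unambiguously.
  if (is.null(names(funs)) || any(names(funs) == "")) {
    stop("'funs' must be a fully named list of functions")
  }
  lapply(funs, function(fun) fun(input))
}

res <- foo_checked(1:5, list(mean = mean, min = min))
str(res)

# An unnamed list is rejected instead of silently producing unnamed output.
bad <- try(foo_checked(1:5, list(mean, min)), silent = TRUE)
```

Whether unnamed lists should be an error or get automatic names is a design choice; the stricter version just makes the contract explicit.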
In improving an rbind method, I'd like to extract the names of the objects passed to it so that I might generate unique IDs from those.
I've tried all.names(match.call()) but that just gives me:
[1] "rbind" "deparse.level" "..1" "..2"
Generic example:
rbind.test <- function(...) {
dots <- list(...)
all.names(match.call())
}
t1 <- t2 <- ""
class(t1) <- class(t2) <- "test"
> rbind(t1,t2)
[1] "rbind" "deparse.level" "..1" "..2"
Whereas I'd like to be able to retrieve c("t1","t2").
I'm aware that in general one cannot retrieve the names of objects passed to functions, but it seems like with ... it might be possible, as substitute(...) returns t1 in the above example.
I picked this one up from Bill Dunlap on the R Help List Serve:
rbind.test <- function(...) {
sapply(substitute(...()), as.character)
}
I think this gives you what you want.
Using the guidance here How to use R's ellipsis feature when writing your own function?
eg substitute(list(...))
and combining with as.character
rbind.test <- function(...) {
.x <- as.list(substitute(list(...)))[-1]
as.character(.x)
}
you can also use
rbind.test <- function(...){as.character(match.call(expand.dots = FALSE)$...)}
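One caveat with as.character() is that an argument that is a call rather than a bare name (say f(x)) gets split into several strings. A sketch that deparses each argument instead is below; dot_names is a hypothetical plain function used for illustration, not an rbind method.

```r
# Hypothetical helper: returns one string per argument in `...`,
# whether the argument is a bare name or a more complex expression.
dot_names <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  # Collapse in case deparse() splits a long expression over several lines.
  vapply(exprs, function(e) paste(deparse(e), collapse = " "), character(1))
}

t1 <- t2 <- ""
nms_simple <- dot_names(t1, t2)       # "t1" "t2"
nms_call   <- dot_names(t1, toupper(t2))  # "t1" "toupper(t2)"
```

For generating IDs from the names, each argument then maps to exactly one label, which can be passed through make.unique() if duplicates are possible.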