Goal
I am trying to create a function in R to replicate the functionality of a homonymous MATLAB function which returns the number of arguments that were passed to a function.
Example
Consider the function below:
addme <- function(a, b) {
if (nargin() == 2) {
c <- a + b
} else if (nargin() == 1) {
c <- a + a
} else {
c <- 0
}
return(c)
}
Once the user runs addme(), I want nargin() to look at how many parameters were passed (2 for a and b, only 1 for a, or none) and calculate c accordingly.
What I have tried
After spending a lot of time messing around with environments, this is the closest I ever got to a working solution:
nargin <- function() {
length(as.list(match.call(envir = parent.env(environment()))))
}
The problem with this function is that it always returns 0, and I think the reason is that it's looking at its own environment instead of its parent's (in spite of my attempt at throwing in a parent.env there).
I know I can use missing() and args() inside addme() to achieve the same functionality, but I'll be needing this quite a few other times throughout my project, so wrapping it in a function is definitely something I should try to do.
Question
How can I get nargin() to return the number of arguments that were passed to its parent function?
You could use
nargin <- function() {
if(sys.nframe()<2) stop("must be called from inside a function")
length(as.list(sys.call(-1)))-1
}
Basically, you use sys.call(-1) to go up the call stack to the calling function and get its call, then count the number of elements and subtract one for the function name itself.
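As a quick check, here is a minimal sketch that plugs this nargin() into the addme() function from the question. It assumes nargin() is called directly in the body of the function whose arguments are being counted, not from inside another call's arguments.

```r
# nargin() as suggested above: count the arguments written in the caller's call
nargin <- function() {
  if (sys.nframe() < 2) stop("must be called from inside a function")
  length(as.list(sys.call(-1))) - 1
}

# addme() from the question, returning the result of the matching branch
addme <- function(a, b) {
  if (nargin() == 2) {
    a + b
  } else if (nargin() == 1) {
    a + a
  } else {
    0
  }
}

addme(1, 2)  # both arguments supplied
addme(1)     # only a supplied
addme()      # nothing supplied
```

Note that this counts the arguments as they appear in the call, so an argument that falls back to a default value is not counted.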
I would like to write a function where one of the arguments is a function written by the user.
Specifically, I have something like:
My_function <- function(n,g){
x<-dnorm(n,0,1)
y<-g(x)
return(y)
}
For example, g(x)=x^2 ... but is chosen by the user. Of course, I could directly put g(dnorm(n,0,1)) as argument but I would like the user to write it in terms of x, i.e. g<-x^2 in the example.
How could I do this, since the x object is only defined within the function (and not in the arguments)?
I can't define the g function beforehand (otherwise, I reckon it's easy). It has to be defined within "My_function" so that the user defines everything he needs in one line.
Why not just declare g as a function with an argument?
g=function(x) x^2
My_function=function(n,g){
x<-dnorm(n,0,1)
y<-g(x)
return(y)
}
My_function(1,g)
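If everything has to fit in one line, two variations may help. The first passes an anonymous function inline. The second is only a sketch: My_function_expr is a hypothetical name, and it assumes the user writes a bare expression in terms of x, which is captured with substitute() and evaluated with x bound to the value computed inside the function.

```r
My_function <- function(n, g) {
  x <- dnorm(n, 0, 1)
  g(x)
}

# Variation 1: define g inline as an anonymous function
My_function(1, function(x) x^2)

# Variation 2 (hypothetical helper): accept a bare expression written in terms of x
My_function_expr <- function(n, expr) {
  x <- dnorm(n, 0, 1)
  # substitute() captures the unevaluated expression; eval() supplies x,
  # and falls back to the caller's frame for any other symbols
  eval(substitute(expr), list(x = x), parent.frame())
}

My_function_expr(1, x^2)
```

Both calls compute dnorm(1, 0, 1)^2. The second form is more fragile, because any variable named x in the caller is shadowed by the internal one.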
In R there's a common function calling pattern that looks like this:
child = function(a, b, c) {
a * b - c
}
parent = function(a, b, c) {
result = child(a=a, b=b, c=c)
}
This repetition of names is helpful because it prevents potentially insidious errors if the ordering of the child's arguments were to change, or if an additional variable were to be added into the list:
childReordered = function(c, b, a) { # same args in different order
a * b - c
}
parent = function(a, b, c) {
result = childReordered(a, b, c) # probably an error
}
But this becomes cumbersome when the names of the arguments get long:
child = function(longVariableNameA, longVariableNameB, longVariableNameC) {
longVariableNameA * longVariableNameB - longVariableNameC
}
parent = function(longVariableNameA, longVariableNameB, longVariableNameC) {
child(longVariableNameA=longVariableNameA, longVariableNameB=longVariableNameB, longVariableNameC=longVariableNameC)
}
I'd like to have a way to get the safety of the named parameters without needing to actually type the names again. And I'd like to be able to do this when I can modify only the parent, and not the child. I'd envision something like this:
parent = function(a, b, c) {
result = child(params(a, b, c))
}
Where params() would be a new function that converts the unnamed arguments to named parameters based on the names of the variables. For example:
child(params(c,b,a)) == child(a=a, b=b, c=c)
There are a couple of functions in 'pryr' that come close to this, but I haven't figured out how to combine them to do quite what I want. named_dots(c,b,a) returns list(c=c, b=b, a=a), and standardise_call() has a similar operation, but I haven't figured out how to convert the results into something that can be passed to an unmodified child().
I'd like to be able to use a mixture of implicit and explicitly named parameters:
child(params(b=3, c=a, a)) == child(b=3, c=a, a=a)
It would also be nice to be able to mix in some unnamed constants (not variables), and have them treated as named arguments when passed to the child.
child(params(7, c, b=x)) == child(a=7, c=c, b=x) # desired
named_dots(7, c, b=x) == list("7"=7, c=c, b=x) # undesired
But in a non-R way, I'd prefer to raise errors rather than trying to muddle through with what are likely programmer mistakes:
child(params(c, 7, b=x)) # ERROR: unnamed parameters must be first
Are there tools that already exist to do this? Simple ways to piece together existing functions to do what I want? Better ways to accomplish the same goal of getting safety in the presence of changing parameter lists without unwieldy repetition? Improvements to my suggested syntax to make it even safer?
Pre-bounty clarification: Both the parent() and child() functions should be considered unchangeable. I'm not interested in wrapping either with a different interface. Rather, I'm looking here for a way to write the proposed params() function in a general manner that can rewrite the list of arguments on the fly so that both parent() and child() can be used directly with a safe but non-verbose syntax.
Post-bounty clarification: While inverting the parent-child relationship and using do.call() is a useful technique, it's not the one I'm looking for here. Instead, I'm looking for a way to accept a '...' argument, modify it to have named parameters, and then return it in a form that the enclosing function will accept. It's possible that as others suggest this is truly impossible. Personally, I currently think it is possible with a C level extension, and my hope is that this extension already exists. Perhaps the vadr package does what I want? https://github.com/crowding/vadr#dot-dot-dot-lists-and-missing-values
Partial credit: I feel silly just letting the bounty expire. If there are no full solutions, I'll award it to anyone who gives a proof of concept of at least one of the necessary steps. For example, modifying a '...' argument within a function and then passing it to another function without using do.call(). Or returning an unmodified '...' argument in a way that the parent can use it. Or anything that best points the way toward a direct solution, or even some useful links: http://r.789695.n4.nabble.com/internal-manipulation-of-td4682090.html But I'm reluctant to award it to an answer that starts with the (otherwise entirely reasonable) premise that "you don't want to do that", or "that's impossible so here's an alternative".
Bounty awarded: There are several really useful and practical answers, but I chose to award the bounty to #crowding. While he (probably correctly) asserts that what I want is impossible, I think his answer comes closest to the 'idealistic' approach I'm aiming for. I also think that his vadr package might be a good starting point for a solution, whether it matches my (potentially unrealistic) design goals or not. The 'accepted answer' is still up for grabs in case someone figures out a way to do the impossible. Thanks for the other answers and suggestions, and hopefully they will help someone put together the pieces for a more robust R syntax.
I think attempting to overwrite the built-in argument matching functionality of R is somewhat dangerous, so here is a solution that uses do.call.
It was unclear how much of parent is changeable.
# This ensures that only named arguments to formals get passed through
parent = function(a, b, c) {
do.call("child", mget(names(formals(child))))
}
A second option, based on the "magic" of write.csv
# this second option replaces the call to parent with child and passes the
# named arguments that have been matched within the call to parent
#
parent2 <- function(a,b,c){
Call <- match.call()
Call[[1]] <- quote(child)
eval.parent(Call) # evaluate in the caller's frame, where the argument expressions are defined
}
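To illustrate the second option, here is a small sketch in which child's formals are reordered and parent2 is called from another function with differently named local variables. The function caller is a hypothetical name added for the demonstration, and this variant evaluates the rewritten call with eval.parent() so that the argument expressions are looked up where they were written.

```r
child <- function(c, b, a) a * b - c   # child's formals in a different order

parent2 <- function(a, b, c) {
  Call <- match.call()          # e.g. parent2(a = x, b = y, c = z)
  Call[[1]] <- quote(child)     # becomes child(a = x, b = y, c = z)
  eval.parent(Call)             # x, y, z live in the caller's frame
}

# Hypothetical caller with its own local variable names
caller <- function() {
  x <- 5; y <- 6; z <- 7
  parent2(x, y, z)
}

caller()  # a = 5, b = 6, c = 7, so 5 * 6 - 7
```

Because match.call() attaches the names of parent2's formals to the arguments, the reordering in child has no effect on the result.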
You can't change the parameters to a function from inside the function call. The next best way would be to write a simple wrapper around the call. Perhaps something like this can help
with_params <- function(f, ...) {
dots <- substitute(...())
dots <- setNames(dots, sapply(dots, deparse))
do.call(f, as.list(dots), envir=parent.frame())
}
And we can test with something like
parent1 <- function(a, b, c) {
child(a, b, c)
}
parent2 <- function(a, b, c) {
with_params(child, a, b, c)
}
child <- function(a, b, c) {
a * b - c
}
parent1(5,6,7)
# [1] 23
parent2(5,6,7)
# [1] 23
child <- function(c, a, b) {
a * b - c
}
parent1(5,6,7)
# [1] 37
parent2(5,6,7)
# [1] 23
Note that parent2 is robust to a change in the parameter order of the child, while parent1 is not.
It's not easy to get the exact syntax you're proposing. R is lazily evaluated, so syntax appearing in an argument to a function is only looked at after the function call is started. By the time the interpreter encounters params() it will have already started the call to child() and bound all of child's arguments and executed some of child's code. Can't rewrite child's arguments after the horse has left the barn, so to speak.
(Most non-lazy languages wouldn't let you do this either, for various reasons)
So the syntax will need to be something that has the references to 'child' and 'params' in its arguments. vadr has a %()% operator that can work. It applies a dotlist given on the right to a function given on the left. So you would write:
child %()% params(a, b, c)
where params catches its arguments in a dotlist and manipulates them:
params <- function(...) {
d <- dots(...)
ex <- expressions(d)
nm <- names(d) %||% rep("", length(d))
for (i in 1:length(d)) {
if (is.name(ex[[i]]) && nm[[i]] == "") {
nm[[i]] = as.character(ex[[i]])
}
}
names(d) <- nm
d
}
This was meant to be a comment but it doesn't fit the limits. I commend you for your programming ambition and purity, but I think the goal is unattainable. Let's assume params exists and apply it to the function list. By the definition of params, identical(list(a = a, b = b, c = c), list(params(a, b, c))). From this it follows that identical(a, params(a, b, c)), by taking the first element of the first and second argument of identical. From which it follows that params does not depend on its second and later arguments, a contradiction. Q.E.D.
But I think your idea is a lovely example of DRY in R and I am perfectly happy with do.call(f, params(a,b,c)), which has an additional do.call, but no repetition. With your permission I would like to incorporate it in my package bettR, which collects various ideas to improve the R language.
A related idea which I was toying with is creating a function that allows another function to get missing args from the calling frame. That is, instead of calling f(a = a, b = b), one could call f() and inside f there would be something like args = eval(formals(f), parent.frame()) but encapsulated into a macro-like construct args = get.args.by.name or some such. This is distinct from your idea in that it requires f to be programmed deliberately to have this feature.
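The do.call(f, params(a, b, c)) form mentioned above can be sketched in base R. This is only one possible params(), written for illustration and not the bettR or vadr implementation: it names each unnamed argument after the variable that was passed, and leaves unnamed constants positional.

```r
# Sketch of params(): returns a list of values, naming each unnamed
# argument after the symbol used in the call
params <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # unevaluated argument expressions
  vals <- list(...)                            # evaluated argument values
  nms <- names(vals)
  if (is.null(nms)) nms <- rep("", length(vals))
  for (i in seq_along(vals)) {
    if (nms[i] == "" && is.name(exprs[[i]])) {
      nms[i] <- as.character(exprs[[i]])
    }
  }
  names(vals) <- nms
  vals
}

child <- function(c, b, a) a * b - c           # reordered formals
parent <- function(a, b, c) do.call(child, params(a, b, c))

parent(5, 6, 7)  # a = 5, b = 6, c = 7, so 5 * 6 - 7
```

This does not give the child(params(...)) syntax asked for in the question; it only removes the repetition of names at the cost of the extra do.call.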
Here's an answer that at first would appear to work. Diagnosing why it does not may lead to enlightenment as to why it cannot (hint: see first paragraph of #crowding's answer).
params<-function(...) {
dots<-list(...)
names(dots)<-eval(substitute(alist(...)))
child.env<-sys.frame(-1)
child.fun<-sys.function(sys.parent()+1)
args<-names(formals(child.fun))
for(arg in args) {
assign(arg,dots[[arg]],envir=child.env)
}
dots[[args[[1]]]]
}
child1<-function(a,b,c) a*b-c
parent1<-function(a,b,c) child1(params(a,b,c))
parent1(1,2,3)
#> -1
child2<-function(a,c,b) a*b-c #swap b and c in formals
parent2<-function(a,b,c) child2(params(a,b,c)) #mirrors parent1
parent2(1,2,3)
#> -1
Both produce 1*2-3 == -1 even though the order of b=2 and c=3 has been swapped in the formal argument list of child1 versus child2.
This is basically a clarification to Mnel's answer. If it happens to answer your question, please do not accept it or award the bounty; it should go to Mnel. First, we define call_as_parent, which you use to call a function inside another function as the outside function:
call_as_parent <- function(fun) {
par.call <- sys.call(sys.parent())
new.call <- match.call(
eval(par.call[[1L]], parent.frame(2L)),
call=par.call
)
new.call[[1L]] <- fun
eval(new.call, parent.frame())
}
Then we define parent and child:
child <- function(c, b, a) a - b - c
parent <- function(a, b, c) call_as_parent(child)
And finally, some examples
child(5, 10, 20)
# [1] 5
parent(5, 10, 20)
# [1] -25
Notice how clearly in the second example the 20 is getting matched to c, as it should.
I'm writing some functions in R and I'm having some issues. In short, inside the function I'm writing, I call another function that I've developed. The second function shares some arguments with the first. How do I tell the second function to take the same values for those arguments as the first function received?
first.fx=function(arg1,arg2,arg3,...){
.
.
.
second.fx=function(arg2,arg3,arg4,...){
}
}
second.fx shares arg2 and arg3 with the first function. How can I pass these two values on to second.fx?
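Simply assign the values (which come from the call to first.fx) as default parameters in the definition of second.fx. The parameters need names that differ from the variables they default to, because a default such as arg2=arg2 refers to itself and fails with a recursive default argument reference error:
second.fx <- function(a2=arg2,a3=arg3,arg4,...){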
You don't need to declare the arguments explicitly in the definition of second.fx. By the magic of lexical scoping, these variables will be found in second.fx's enclosing environment, which is that of first.fx.
first.fx <- function(arg1, arg2, arg3, ...)
{
second.fx <- function(arg4)
{
# values of arg2/3 will be found from first.fx's environment
}
}
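A minimal runnable sketch of the lexical scoping approach; the function bodies are made up purely for illustration.

```r
first.fx <- function(arg1, arg2, arg3) {
  second.fx <- function(arg4) {
    # arg2 and arg3 are not arguments of second.fx; R finds them in the
    # enclosing environment, i.e. the current call to first.fx
    arg2 + arg3 + arg4
  }
  arg1 * second.fx(10)
}

first.fx(2, 3, 4)  # 2 * (3 + 4 + 10)
```

This only works when second.fx is defined inside first.fx. A function defined elsewhere would look up arg2 and arg3 in its own defining environment instead.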
I'm trying to figure out how to get R's callCC function for short-circuiting evaluation of a function to work with functions like lapply and Reduce.
Motivation
This would let Reduce and lapply do less than O(n) work in favorable cases, by allowing you to exit a computation early.
For example, if I'm searching for a value in a list I could map a 'finder' function across the list, and the second it is found lapply stops running and that value is returned (much like breaking a loop, or using a return statement to break out early).
The problem is I am having trouble writing the functions that lapply and Reduce should take using a style that callCC requires.
Example
Say I'm trying to write a function to find the value '100' in a list: something equivalent to
imperativeVersion <- function (xs) {
for (val in xs) if (val == 100) return (val)
}
The function to pass to lapply would look like:
find100 <- function (val) { if (val == 100) SHORT_CIRCUIT(val) }
functionalVersion <- function (xs) lapply(xs, find100)
This (obviously) crashes, since the short circuiting function hasn't been defined yet.
callCC( function (SHORT_CIRCUIT) lapply(1:1000, find100) )
The problem is that this also crashes, because the short circuiting function wasn't around when find100 was defined. I would like for something similar to this to work.
The following works because SHORT_CIRCUIT is defined at the time that the function passed to lapply is created.
callCC(
function (SHORT_CIRCUIT) {
lapply(1:1000, function (val) {
if (val == 100) SHORT_CIRCUIT(val)
})
}
)
How can I make SHORT_CIRCUIT be defined in the function passed to lapply without defining it inline like above?
I'm aware this example can be achieved using loops, reduce or any other number of ways. I am looking for a solution to the problem of using callCC with lapply and Reduce in specific.
If I was vague or any clarification is needed please leave a comment below. I hope someone can help with this :)
Edit One:
The approach should be 'production-quality'; no deparsing functions or similar black magic.
I found a solution to this problem:
find100 <- function (val) {
if (val == 100) SHORT_CIRCUIT(val)
}
short_map <- function (fn, coll) {
callCC(function (SHORT_CIRCUIT) {
clone_env <- new.env(parent = environment(fn))
clone_env$SHORT_CIRCUIT <- SHORT_CIRCUIT
environment(fn) <- clone_env
lapply(coll, fn)
})
}
short_map(find100, c(1,2,100,3))
The trick to making higher-order functions work with callCC is to assign the short-circuiting function into the input function's environment before carrying on with the rest of the program. I made a clone of the environment to avoid unintended side-effects.
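The same trick appears to carry over to Reduce. The following is a sketch under that assumption; short_reduce and prod_until_zero are hypothetical names introduced for the example, and fn is assumed to be an ordinary closure (not a primitive).

```r
short_reduce <- function(fn, coll, ...) {
  callCC(function(SHORT_CIRCUIT) {
    # Give fn a child environment that contains SHORT_CIRCUIT, leaving
    # the original environment of fn untouched
    clone_env <- new.env(parent = environment(fn))
    clone_env$SHORT_CIRCUIT <- SHORT_CIRCUIT
    environment(fn) <- clone_env
    Reduce(fn, coll, ...)
  })
}

# A product that stops as soon as it meets a zero
prod_until_zero <- function(acc, x) {
  if (x == 0) SHORT_CIRCUIT(0)
  acc * x
}

short_reduce(prod_until_zero, c(2, 3, 0, 5))  # exits early at the zero
short_reduce(prod_until_zero, c(2, 3, 4))     # runs to completion
```

The first call returns as soon as the zero is reached, so the remaining elements are never visited.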
You can achieve this using metaprogramming in R.
#alexis_laz's approach was in fact already metaprogramming.
However, he used strings which are a dirty hack and error prone. So you did well to reject it.
The correct way to carry out #alexis_laz's approach is to manipulate the code itself rather than strings. In base R this is done using substitute(). There are, however, better packages, e.g. rlang by Hadley Wickham. But I'll give you a base R solution (fewer dependencies).
lapply_ <- function(lst, FUN) {
eval.parent(
substitute(
callCC(function(return_) {
lapply(lst_, FUN_)
}),
list(lst_ = lst, FUN_=substitute(FUN))))
}
Your SHORT_CIRCUIT function is actually a more general, control flow return function (or a break function which takes an argument to return it). Thus, I call it return_.
We want to have a lapply_ function, in which we can in the FUN= part use a return_ to break out of the usual lapply().
As you showed, this is the aim:
callCC(
function (return_) {
lapply(1:1000, function (x) if (x == 100) return_(x))
}
)
The only problem is that we want to be able to generalize this expression.
We want
callCC(
function(return_) lapply(lst, FUN_)
)
Here, the function definition we give for FUN_ should be able to use return_.
However, the function definition can see return_ only if we insert the function definition code into this expression.
This is exactly what #alexis_laz tried, using a string and eval.
You did the same by manipulating the function's environment.
We can safely achieve the insertion of literal code using substitute(expr, replacer_list) where expr is the code to be manipulated and replacer_list is the lookup table for the replacement of code.
By substitute(FUN) we take the literal code given for FUN= for lapply_ without evaluating it. This expression returns literal quoted code (better than the string in #alexis_laz's approach).
The big substitute command says: "Take the expression callCC(function(return_) lapply(lst_, FUN_)) and replace lst_ in this expression by the list given for lst and FUN_ by the literal quoted expression given for FUN."
This replaced expression is then evaluated in the parent environment (eval.parent()) meaning: the resulting expression replaces the lapply_() call and is executed exactly where it was placed.
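To make the substitution step concrete, here is a small sketch that builds the expression by hand, prints it, and then evaluates it. The list 1:3 and the quoted function are example inputs chosen only for illustration.

```r
# Build the call by replacing the placeholders lst_ and FUN_ with real code
expr <- substitute(
  callCC(function(return_) lapply(lst_, FUN_)),
  list(lst_ = 1:3,
       FUN_ = quote(function(x) if (x == 2) return_(x)))
)

print(expr)   # shows the generated call with the placeholders filled in

# The inserted function is created inside the callCC callback,
# so it can see return_ and use it to exit early
eval(expr)
```

Because the function definition is inserted as code and only turned into a function when the generated call runs, it is created in a scope where return_ exists.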
Such use of eval.parent() (or eval( ... , envir=parent.frame())) is foolproof (otherwise, tidyverse packages wouldn't be production level ...).
So in this way, you can generalize callCC() calls.
lapply_(1:1000, FUN=function(x) if (x==100) return_(x))
## [1] 100
I don't know if it can be of use, but:
find100 <- "function (val) { if (val == 100) SHORT_CIRCUIT(val) }"
callCC( function (SHORT_CIRCUIT) lapply(1:1000, eval(parse(text = find100))) )
#[1] 100