In R, is it possible to define a function using `function`? [duplicate] - r

I am trying to call the function `function` to define a function in R code.
As we all know™️, `function`is a .Primitive that’s used internally by R to define functions when the user uses the conventional syntax, i.e.
mean1 = function (x, ...) base::mean(x, ...)
But there’s nothing preventing me from calling that primitive directly. Or so I thought. I can call other primitives directly (and even redefine them; for instance, in a moment of madness I overrode R’s builtin `for`). So this is in principle possible.
Yet I cannot get it to work for `function`. Here’s what I tried:
# Works
mean2 = as.function(c(formals(mean), quote(mean(x, ...))))
# Works
mean3 = eval(call('function', formals(mean), quote(mean(x, ...))))
# Error: invalid formal argument list for "function"
mean4 = `function`(formals(mean), quote(mean(x, ...)))
The fact that mean3 in particular works indicates to me that mean4 should work. But it doesn’t. Why?
I checked the definition of the `function` primitive in the R source. do_function is defined in eval.c. And I see that it calls CheckFormals, which ensures that each argument is a symbol, and this fails. But why does it check this, and what does that mean?
And most importantly: Is there a way of calling the `function` primitive directly?
Just to clarify: There are trivial workarounds (this question lists two, and there’s at least a third). But I’d like to understand how this (does not) works.

This is because function is a special primitive:
typeof(`function`)
#> [1] "special"
The arguments are not evaluated, so you have actually passed quote(formals(mean)) instead of the value of formals(mean). I don't think there's a way of calling function directly without evaluation tricks, except with an empty formals list which is just NULL.

For completeness’ sake, Lionel’s answer hints at a way of calling `function` after all. Unfortunately it’s rather restricted, since we cannot pass any argument definition except for NULL:
mean5 = `function`(NULL, mean(x, ...))
formals(mean5) = formals(mean)
(Note the lack of quoting around the body!)
This is of course utterly unpractical (and formals<- internally calls as.function anyway.)

After digging a little bit through the source code, here are a few observations:
The actual function creation is done by mkCLOSXP(). This is what gets called by function() {}, by as.function.default() and by .Primitive("function") (a.k.a. `function`)
as.function.default() gets routed to do_asfunction(), which also calls CheckFormals(). However, it directly constructs these formals a few lines above that.
As you pointed out, the other place where CheckFormals() gets called is inside do_function(). However, I don't think do_function() gets called by anything other than .Primitive("function"), so this is the only situation where CheckFormals() is called on the user's input.
CheckFormals() does actually correctly validate a pairlist object.
You can check the last point yourself by running parts of the CheckFormals() function using inline::cfunction
inline::cfunction( c(x="ANY"),
'Rprintf("is list?: %d\\nTag1 OK?: %d\\nTag2 OK?: %d\\nTag3 NULL?: %d\\n",
isList(x), TYPEOF(TAG(x)) == SYMSXP, TYPEOF(TAG(CDR(x))) == SYMSXP,
CDR(CDR(x)) == R_NilValue); return R_NilValue;' )( formals(mean) )
# is list?: 1
# Tag1 OK?: 1
# Tag2 OK?: 1
# Tag3 NULL?: 1
So, somewhere between you passing formals(means) to .Primitive("function") and it getting forwarded to CheckFormals() by do_function(), the argument loses its validity. (I don't know the R source well enough to tell you how that happens.) However, since do_function() is only called by .Primitive("function"), you don't encounter this situation with any other examples.

Related

How an R function can be used before being defined

I came across this spinet of code where the function rval_top_ingredients() was used to render a D3wordcloud before it was defined. I think that would throw an error in case of Python as the script is executed from top to bottom. Why did it work in R then? Thankyou.
output$wc_ingredients <- d3wordcloud::renderD3wordcloud({
ingredients_df <- rval_top_ingredients()
d3wordcloud(ingredients_df$ingredient, ingredients_df$nb_recipes, tooltip = TRUE)
})
rval_top_ingredients <- reactive({
recipes_enriched %>%
filter(cuisine == input$cuisine) %>%
arrange(desc(tf_idf)) %>%
head(input$nb_ingredients) %>%
mutate(ingredient = forcats::fct_reorder(ingredient, tf_idf))
})
R doesn’t differ from Python here: you can’t use a function before it’s defined. But, despite appearances to the contrary, this also isn’t happening here.
d3wordcloud::renderD3wordcloud is a special function call which doesn’t evaluate its arguments immediately. In fact, the argument is stored internally as an unevaluated expression and is only evaluated later after a certain trigger. By that time, rval_top_ingredients has been defined.
This is a pervasive pattern in Shiny, but you can harness this behaviour yourself. Consider the following:
f = function (expr) {}
f(g())
g = function () { stop('oh no!') }
This code works, since f never uses its argument, and since R uses lazy evaluation for function arguments: unlike most other languages, a function argument only gets evaluated once it is used. Arguments that are never used are never evaluated.
So, despite the fact that f(g()) appears to use g before it’s defined, the actual call to f never evaluates its arguments so there’s no issue. The only constraint is that the argument needs to be syntactically valid.
Here’s a slightly more meaningful example which does something useful (it creates a function that creates a log message before evaluating an expression:
make_verbose = function (expr) {
function () {
message(sprintf('Evaluating %s', deparse(substitute(expr))))
expr
}
}
verbose_g = make_verbose(g())
g = function () {
message('g was called!')
}
verbose_g()
Python doesn’t quite support this, since Python doesn’t have lazy and non-standard evaluation. But a similar situation still exists in Python:
def f():
g()
def g():
print('g()')
f()
Here, g() is seemingly used before it was defined; but this is only true if we’re reading the code textually from top top bottom without paying attention to scope. In reality, g() is only ever called after it was defined. The same is true in the R code you’ve posted.

rlang: Error: Can't convert a function to a string

I created a function to convert a function name to string. Version 1 func_to_string1 works well, but version 2 func_to_string2 doesn't work.
func_to_string1 <- function(fun){
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string2 <- function(fun){
is.function(fun)
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string1 works:
> func_to_string1(sum)
[1] "sum"
func_to_string2 doesn't work.
> func_to_string2(sum)
Error: Can't convert a primitive function to a string
Call `rlang::last_error()` to see a backtrace
My guess is that by calling the fun before converting it to a string, it gets evaluated inside function and hence throw the error message. But why does this happen since I didn't do any assignments?
My questions are why does it happen and is there a better way to convert function name to string?
Any help is appreciated, thanks!
This isn't a complete answer, but I don't think it fits in a comment.
R has a mechanism called pass-by-promise,
whereby a function's formal arguments are lazy objects (promises) that only get evaluated when they are used.
Even if you didn't perform any assignment,
the call to is.function uses the argument,
so the promise is "replaced" by the result of evaluating it.
Nevertheless, in my opinion, this seems like an inconsistency in rlang*,
especially given cory's answer,
which implies that R can still find the promise object even after a given parameter has been used;
the mechanism to do so might not be part of R's public API though.
*EDIT: see coments.
Regardless, you could treat enexpr/enquo/ensym like base::missing,
in the sense that you should only use them with parameters you haven't used at all in the function's body.
Maybe use this instead?
func_to_string2 <- function(fun){
is.function(fun)
deparse(substitute(fun))
#print(rlang::as_string(rlang::enexpr(fun)))
}
> func_to_string2(sum)
[1] "sum"
This question brings up an interesting point on lazy evaluations.
R arguments are lazily evaluated, meaning the arguments are not evaluated until its required.
This is best understood in the Advanced R book which has the following example,
f <- function(x) {
10
}
f(stop("This is an error!"))
the result is 10, which is surprising because x is never called and hence never evaluated. We can force x to be evaluated by using force()
f <- function(x) {
force(x)
10
}
f(stop("This is an error!"))
This behaves as expected. In fact we dont even need force() (Although it is good to be explicit).
f <- function(x) {
x
10
}
f(stop("This is an error!"))
This what is happening with your call here. The function sum which is a symbol initially is being evaluated with no arguments when is.function() is being called. In fact, even this will fail.
func_to_string2 <- function(fun){
fun
print(rlang::as_string(rlang::ensym(fun)))
}
Overall, I think its best to use enexpr() at the very beginning of the function.
Source:
http://adv-r.had.co.nz/Functions.html

How do I call the `function` function?

I am trying to call the function `function` to define a function in R code.
As we all know™️, `function`is a .Primitive that’s used internally by R to define functions when the user uses the conventional syntax, i.e.
mean1 = function (x, ...) base::mean(x, ...)
But there’s nothing preventing me from calling that primitive directly. Or so I thought. I can call other primitives directly (and even redefine them; for instance, in a moment of madness I overrode R’s builtin `for`). So this is in principle possible.
Yet I cannot get it to work for `function`. Here’s what I tried:
# Works
mean2 = as.function(c(formals(mean), quote(mean(x, ...))))
# Works
mean3 = eval(call('function', formals(mean), quote(mean(x, ...))))
# Error: invalid formal argument list for "function"
mean4 = `function`(formals(mean), quote(mean(x, ...)))
The fact that mean3 in particular works indicates to me that mean4 should work. But it doesn’t. Why?
I checked the definition of the `function` primitive in the R source. do_function is defined in eval.c. And I see that it calls CheckFormals, which ensures that each argument is a symbol, and this fails. But why does it check this, and what does that mean?
And most importantly: Is there a way of calling the `function` primitive directly?
Just to clarify: There are trivial workarounds (this question lists two, and there’s at least a third). But I’d like to understand how this (does not) works.
This is because function is a special primitive:
typeof(`function`)
#> [1] "special"
The arguments are not evaluated, so you have actually passed quote(formals(mean)) instead of the value of formals(mean). I don't think there's a way of calling function directly without evaluation tricks, except with an empty formals list which is just NULL.
For completeness’ sake, Lionel’s answer hints at a way of calling `function` after all. Unfortunately it’s rather restricted, since we cannot pass any argument definition except for NULL:
mean5 = `function`(NULL, mean(x, ...))
formals(mean5) = formals(mean)
(Note the lack of quoting around the body!)
This is of course utterly unpractical (and formals<- internally calls as.function anyway.)
After digging a little bit through the source code, here are a few observations:
The actual function creation is done by mkCLOSXP(). This is what gets called by function() {}, by as.function.default() and by .Primitive("function") (a.k.a. `function`)
as.function.default() gets routed to do_asfunction(), which also calls CheckFormals(). However, it directly constructs these formals a few lines above that.
As you pointed out, the other place where CheckFormals() gets called is inside do_function(). However, I don't think do_function() gets called by anything other than .Primitive("function"), so this is the only situation where CheckFormals() is called on the user's input.
CheckFormals() does actually correctly validate a pairlist object.
You can check the last point yourself by running parts of the CheckFormals() function using inline::cfunction
inline::cfunction( c(x="ANY"),
'Rprintf("is list?: %d\\nTag1 OK?: %d\\nTag2 OK?: %d\\nTag3 NULL?: %d\\n",
isList(x), TYPEOF(TAG(x)) == SYMSXP, TYPEOF(TAG(CDR(x))) == SYMSXP,
CDR(CDR(x)) == R_NilValue); return R_NilValue;' )( formals(mean) )
# is list?: 1
# Tag1 OK?: 1
# Tag2 OK?: 1
# Tag3 NULL?: 1
So, somewhere between you passing formals(means) to .Primitive("function") and it getting forwarded to CheckFormals() by do_function(), the argument loses its validity. (I don't know the R source well enough to tell you how that happens.) However, since do_function() is only called by .Primitive("function"), you don't encounter this situation with any other examples.

julia introspection - get name of variable passed to function

In Julia, is there any way to get the name of a passed to a function?
x = 10
function myfunc(a)
# do something here
end
assert(myfunc(x) == "x")
Do I need to use macros or is there a native method that provides introspection?
You can grab the variable name with a macro:
julia> macro mymacro(arg)
string(arg)
end
julia> #mymacro(x)
"x"
julia> #assert(#mymacro(x) == "x")
but as others have said, I'm not sure why you'd need that.
Macros operate on the AST (code tree) during compile time, and the x is passed into the macro as the Symbol :x. You can turn a Symbol into a string and vice versa. Macros replace code with code, so the #mymacro(x) is simply pulled out and replaced with string(:x).
Ok, contradicting myself: technically this is possible in a very hacky way, under one (fairly limiting) condition: the function name must have only one method signature. The idea is very similar the answers to such questions for Python. Before the demo, I must emphasize that these are internal compiler details and are subject to change. Briefly:
julia> function foo(x)
bt = backtrace()
fobj = eval(current_module(), symbol(Profile.lookup(bt[3]).func))
Base.arg_decl_parts(fobj.env.defs)[2][1][1]
end
foo (generic function with 1 method)
julia> foo(1)
"x"
Let me re-emphasize that this is a bad idea, and should not be used for anything! (well, except for backtrace display). This is basically "stupid compiler tricks", but I'm showing it because it can be kind of educational to play with these objects, and the explanation does lead to a more useful answer to the clarifying comment by #ejang.
Explanation:
bt = backtrace() generates a ... backtrace ... from the current position. bt is an array of pointers, where each pointer is the address of a frame in the current call stack.
Profile.lookup(bt[3]) returns a LineInfo object with the function name (and several other details about each frame). Note that bt[1] and bt[2] are in the backtrace-generation function itself, so we need to go further up the stack to get the caller.
Profile.lookup(...).func returns the function name (the symbol :foo)
eval(current_module(), Profile.lookup(...)) returns the function object associated with the name :foo in the current_module(). If we modify the definition of function foo to return fobj, then note the equivalence to the foo object in the REPL:
julia> function foo(x)
bt = backtrace()
fobj = eval(current_module(), symbol(Profile.lookup(bt[3]).func))
end
foo (generic function with 1 method)
julia> foo(1) == foo
true
fobj.env.defs returns the first Method entry from the MethodTable for foo/fobj
Base.decl_arg_parts is a helper function (defined in methodshow.jl) that extracts argument information from a given Method.
the rest of the indexing drills down to the name of the argument.
Regarding the restriction that the function have only one method signature, the reason is that multiple signatures will all be listed (see defs.next) in the MethodTable. As far as I know there is no currently exposed interface to get the specific method associated with a given frame address. (as an exercise for the advanced reader: one way to do this would be to modify the address lookup functionality in jl_getFunctionInfo to also return the mangled function name, which could then be re-associated with the specific method invocation; however, I don't think we currently store a reverse mapping from mangled name -> Method).
Note also that (1) backtraces are slow (2) there is no notion of "function-local" eval in Julia, so even if one has the variable name, I believe it would be impossible to actually access the variable (and the compiler may completely elide local variables, unused or otherwise, put them in a register, etc.)
As for the IDE-style introspection use mentioned in the comments: foo.env.defs as shown above is one place to start for "object introspection". From the debugging side, Gallium.jl can inspect DWARF local variable info in a given frame. Finally, JuliaParser.jl is a pure-Julia implementation of the Julia parser that is actively used in several IDEs to introspect code blocks at a high level.
Another method is to use the function's vinfo. Here is an example:
function test(argx::Int64)
vinfo = code_lowered(test,(Int64,))
string(vinfo[1].args[1][1])
end
test (generic function with 1 method)
julia> test(10)
"argx"
The above depends on knowing the signature of the function, but this is a non-issue if it is coded within the function itself (otherwise some macro magic could be needed).

Can an R function access its own name?

Can you write a function that prints out its own name?
(without hard-coding it in, obviously)
You sure can.
fun <- function(x, y, z) deparse(match.call()[[1]])
fun(1,2,3)
# [1] "fun"
You can, but just in case it's because you want to call the function recursively see ?Recall which is robust to name changes and avoids the need to otherwise process to get the name.
Recall package:base R Documentation
Recursive Calling
Description:
‘Recall’ is used as a placeholder for the name of the function in
which it is called. It allows the definition of recursive
functions which still work after being renamed, see example below.
As you've seen in the other great answers here, the answer seems to be "yes"...
However, the correct answer is actually "yes, but not always". What you can get is actually the name (or expression!) that was used to call the function.
First, using sys.call is probably the most direct way of finding the name, but then you need to coerce it into a string. deparse is more robust for that.
myfunc <- function(x, y=42) deparse(sys.call()[[1]])
myfunc (3) # "myfunc"
...but you can call a function in many ways:
lapply(1:2, myfunc) # "FUN"
Map(myfunc, 1:2) # (the whole function definition!)
x<-myfunc; x(3) # "x"
get("myfunc")(3) # "get(\"myfunc\")"
The basic issue is that a function doesn't have a name - it's just that you typically assign the function to a variable name. Not that you have to - you can have anonymous functions - or assign many variable names to the same function (the x case above).

Resources