How an R function can be used before being defined

I came across this snippet of code where the function rval_top_ingredients() was used to render a D3 word cloud before it was defined. I think that would throw an error in Python, since the script is executed from top to bottom. Why did it work in R then? Thank you.
output$wc_ingredients <- d3wordcloud::renderD3wordcloud({
  ingredients_df <- rval_top_ingredients()
  d3wordcloud(ingredients_df$ingredient, ingredients_df$nb_recipes, tooltip = TRUE)
})
rval_top_ingredients <- reactive({
  recipes_enriched %>%
    filter(cuisine == input$cuisine) %>%
    arrange(desc(tf_idf)) %>%
    head(input$nb_ingredients) %>%
    mutate(ingredient = forcats::fct_reorder(ingredient, tf_idf))
})

R doesn’t differ from Python here: you can’t use a function before it’s defined. But, despite appearances to the contrary, this also isn’t happening here.
d3wordcloud::renderD3wordcloud is a special function call which doesn’t evaluate its arguments immediately. In fact, the argument is stored internally as an unevaluated expression and is only evaluated later after a certain trigger. By that time, rval_top_ingredients has been defined.
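For intuition, here is a rough sketch (not Shiny's or d3wordcloud's actual implementation; render_later and the other names below are purely illustrative) of how a function can store its argument as an unevaluated expression and only evaluate it on a later trigger:
render_later <- function (expr) {
  code <- substitute(expr)      # capture the argument as an unevaluated expression
  env <- parent.frame()
  function () eval(code, env)   # the "trigger": only evaluated when this gets called
}
w <- render_later(rval_top_ingredients())   # fine, even though rval_top_ingredients doesn't exist yet
rval_top_ingredients <- function () "defined later"
w()
# [1] "defined later"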
This is a pervasive pattern in Shiny, but you can harness this behaviour yourself. Consider the following:
f = function (expr) {}
f(g())
g = function () { stop('oh no!') }
This code works, since f never uses its argument, and since R uses lazy evaluation for function arguments: unlike most other languages, a function argument only gets evaluated once it is used. Arguments that are never used are never evaluated.
So, despite the fact that f(g()) appears to use g before it’s defined, the actual call to f never evaluates its arguments so there’s no issue. The only constraint is that the argument needs to be syntactically valid.
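Conversely, as soon as a function does use its argument, the promise is forced and the definition of g (here, the version that stops) matters. A quick check with a made-up f2 that does use its argument:
f2 = function (expr) expr   # unlike f, this one uses its argument
f2(g())
# Error in g() : oh no!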
Here’s a slightly more meaningful example which does something useful: it creates a function that prints a log message before evaluating an expression.
make_verbose = function (expr) {
  function () {
    message(sprintf('Evaluating %s', deparse(substitute(expr))))
    expr
  }
}
verbose_g = make_verbose(g())
g = function () {
  message('g was called!')
}
verbose_g()
Python doesn’t quite support this, since Python has neither lazy nor non-standard evaluation. But a similar situation still exists in Python:
def f():
    g()
def g():
    print('g()')
f()
Here, g() is seemingly used before it was defined; but this is only true if we’re reading the code textually from top to bottom without paying attention to scope. In reality, g() is only ever called after it was defined. The same is true in the R code you’ve posted.


rlang: Error: Can't convert a function to a string

I created a function to convert a function name to a string. Version 1, func_to_string1, works well, but version 2, func_to_string2, doesn't.
func_to_string1 <- function(fun){
  print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string2 <- function(fun){
  is.function(fun)
  print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string1 works:
> func_to_string1(sum)
[1] "sum"
func_to_string2 doesn't work.
> func_to_string2(sum)
Error: Can't convert a primitive function to a string
Call `rlang::last_error()` to see a backtrace
My guess is that by using fun before converting it to a string, it gets evaluated inside the function and hence throws the error message. But why does this happen, since I didn't do any assignments?
My questions are why does it happen and is there a better way to convert function name to string?
Any help is appreciated, thanks!
This isn't a complete answer, but I don't think it fits in a comment.
R has a mechanism called pass-by-promise, whereby a function's formal arguments are lazy objects (promises) that only get evaluated when they are used. Even if you didn't perform any assignment, the call to is.function uses the argument, so the promise is "replaced" by the result of evaluating it.
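A minimal way to watch a promise being forced, using base R's delayedAssign() (the variable name x and the message are only for illustration):
delayedAssign("x", { message("forcing the promise"); 42 })
x   # first use: prints "forcing the promise", then returns 42
x   # 42 again; the promise has been replaced by its value, so no message this time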
Nevertheless, in my opinion, this seems like an inconsistency in rlang*, especially given cory's answer, which implies that R can still find the promise object even after a given parameter has been used; the mechanism to do so might not be part of R's public API, though.
*EDIT: see comments.
Regardless, you could treat enexpr/enquo/ensym like base::missing, in the sense that you should only use them with parameters you haven't used at all in the function's body.
Maybe use this instead?
func_to_string2 <- function(fun){
  is.function(fun)
  deparse(substitute(fun))
  #print(rlang::as_string(rlang::enexpr(fun)))
}
> func_to_string2(sum)
[1] "sum"
This question brings up an interesting point about lazy evaluation.
R arguments are lazily evaluated, meaning an argument is not evaluated until it is required. This is best illustrated by the following example from the Advanced R book:
f <- function(x) {
  10
}
f(stop("This is an error!"))
The result is 10, which is surprising because x is never used and hence never evaluated. We can force x to be evaluated by using force():
f <- function(x) {
  force(x)
  10
}
f(stop("This is an error!"))
This behaves as expected. In fact, we don't even need force() (although it is good to be explicit):
f <- function(x) {
  x
  10
}
f(stop("This is an error!"))
This is what is happening with your call here. The argument fun, which is initially the unevaluated symbol sum, gets evaluated as soon as is.function() is called. In fact, even this will fail:
func_to_string2 <- function(fun){
  fun
  print(rlang::as_string(rlang::ensym(fun)))
}
Overall, I think it's best to use enexpr() at the very beginning of the function.
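For example, here is a sketch of that capture-first pattern (func_to_string3 is just an illustrative name): grab the symbol before touching the argument's value.
func_to_string3 <- function(fun){
  fun_name <- rlang::as_string(rlang::ensym(fun))  # capture first, before any use
  stopifnot(is.function(fun))                      # now it is safe to force fun
  fun_name
}
> func_to_string3(sum)
[1] "sum"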
Source:
http://adv-r.had.co.nz/Functions.html

How do I call the `function` function?

I am trying to call the function `function` to define a function in R code.
As we all know™️, `function` is a .Primitive that’s used internally by R to define functions when the user uses the conventional syntax, i.e.
mean1 = function (x, ...) base::mean(x, ...)
But there’s nothing preventing me from calling that primitive directly. Or so I thought. I can call other primitives directly (and even redefine them; for instance, in a moment of madness I overrode R’s builtin `for`). So this is in principle possible.
Yet I cannot get it to work for `function`. Here’s what I tried:
# Works
mean2 = as.function(c(formals(mean), quote(mean(x, ...))))
# Works
mean3 = eval(call('function', formals(mean), quote(mean(x, ...))))
# Error: invalid formal argument list for "function"
mean4 = `function`(formals(mean), quote(mean(x, ...)))
The fact that mean3 in particular works indicates to me that mean4 should work. But it doesn’t. Why?
I checked the definition of the `function` primitive in the R source. do_function is defined in eval.c. And I see that it calls CheckFormals, which ensures that each argument is a symbol, and this fails. But why does it check this, and what does that mean?
And most importantly: Is there a way of calling the `function` primitive directly?
Just to clarify: There are trivial workarounds (this question lists two, and there’s at least a third). But I’d like to understand how this (does not) work.
This is because function is a special primitive:
typeof(`function`)
#> [1] "special"
The arguments are not evaluated, so you have actually passed quote(formals(mean)) instead of the value of formals(mean). I don't think there's a way of calling function directly without evaluation tricks, except with an empty formals list which is just NULL.
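To illustrate (an added snippet, not part of the original answer): what the special primitive receives is the unevaluated call, which has a different type from the pairlist that CheckFormals() expects:
typeof(quote(formals(mean)))   # what `function` is handed: an unevaluated call
#> [1] "language"
typeof(formals(mean))          # what CheckFormals() wants
#> [1] "pairlist"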
For completeness’ sake, Lionel’s answer hints at a way of calling `function` after all. Unfortunately it’s rather restricted, since we cannot pass any argument definition except for NULL:
mean5 = `function`(NULL, mean(x, ...))
formals(mean5) = formals(mean)
(Note the lack of quoting around the body!)
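As an added sanity check (assuming the two lines above ran without error), the result behaves like a normal mean wrapper:
mean5(1:10)
#> [1] 5.5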
This is of course utterly impractical (and formals<- internally calls as.function anyway).
After digging a little bit through the source code, here are a few observations:
The actual function creation is done by mkCLOSXP(). This is what gets called by function() {}, by as.function.default(), and by .Primitive("function") (a.k.a. `function`).
as.function.default() gets routed to do_asfunction(), which also calls CheckFormals(). However, it directly constructs these formals a few lines above that.
As you pointed out, the other place where CheckFormals() gets called is inside do_function(). However, I don't think do_function() gets called by anything other than .Primitive("function"), so this is the only situation where CheckFormals() is called on the user's input.
CheckFormals() does actually correctly validate a pairlist object.
You can check the last point yourself by running parts of the CheckFormals() function using inline::cfunction
inline::cfunction( c(x="ANY"),
'Rprintf("is list?: %d\\nTag1 OK?: %d\\nTag2 OK?: %d\\nTag3 NULL?: %d\\n",
isList(x), TYPEOF(TAG(x)) == SYMSXP, TYPEOF(TAG(CDR(x))) == SYMSXP,
CDR(CDR(x)) == R_NilValue); return R_NilValue;' )( formals(mean) )
# is list?: 1
# Tag1 OK?: 1
# Tag2 OK?: 1
# Tag3 NULL?: 1
So, somewhere between you passing formals(mean) to .Primitive("function") and it getting forwarded to CheckFormals() by do_function(), the argument loses its validity. (I don't know the R source well enough to tell you how that happens.) However, since do_function() is only called by .Primitive("function"), you don't encounter this situation with any other examples.


How to define a function inside a function depending on variable values

I'm writing a function that I would find easier to write and read if it could define another function differently depending on input or runtime values of variables (and then use that function). The following illustrates the idea (even if defining a function inside a function is of no advantage in this simple example):
julia> function f(option::Bool)
           if option
               g() = println("option true")
               g()
           else
               g() = println("option false")
               g()
           end
       end;
WARNING: Method definition g() in module Main at REPL[1]:3 overwritten at REPL[1]:6.
julia> f(true)
option false
julia> f(false)
ERROR: UndefVarError: g not defined
in f(::Bool) at .\REPL[1]:7
Using the full function ... end syntax for g does not help either.
The question is: am I doing something wrong to get that warning and that unintended behavior, or Julia does not allow this for a reason? And if it can be done, how?
N.B. For my present need, I can just define two different functions, g1 and g2, and it seems to work; but what if there were many cases of g for just one task concept? I thought that a function, being a first-class object, could be manipulated freely: assigned to a variable, defined one way or another depending on conditions, overwritten, etc.
P.S. I know I can compose a String and then parse-eval it, but that's an ugly solution.
You want to use anonymous functions. This is a known issue (this other issue also shows your problem).
function f(option::Bool)
    if option
        g = () -> println("option true")
    else
        g = () -> println("option false")
    end
    g
end
In v0.5 there's no performance difference between anonymous and generic functions, so there's no reason not to use anonymous functions. Note that there's also a syntax for extended anonymous functions:
f = function (x)
    x
end
and you can add dispatches via call overloading:
(T::typeof(f))(x,y) = x+y
so there's no reason not to use an anonymous function here.

Is stricter error reporting available in R?

In PHP we can do error_reporting(E_ALL) or error_reporting(E_ALL|E_STRICT) to get warnings about suspicious code. In g++ you can supply -Wall (and other flags) to get more checking of your code. Is there something similar in R?
As a specific example, I was refactoring a block of code into some functions. In one of those functions I had this line:
if(nm %in% fields$non_numeric)...
Much later I realized that I had overlooked adding fields to the parameter list, but R did not complain about an undefined variable.
(Posting as an answer rather than a comment)
How about ?codetools::checkUsage (codetools is a built-in package) ... ?
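For instance, here is a hedged sketch of what that looks like on a function like the one in the question (check_fields and its body are made up; the warning only appears if fields is not visible from the function's environment, which is why it can stay silent while the old global is still lying around):
library(codetools)
check_fields <- function(nm) {
  nm %in% fields$non_numeric   # `fields` was never added to the argument list
}
checkUsage(check_fields, name = "check_fields")
# if no global `fields` exists, this reports roughly:
# check_fields: no visible binding for global variable 'fields'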
This is not really an answer, I just can't resist showing how you could declare globals explicitly. #Ben Bolker should post his comment as the Answer.
To avoid seeing globals, you can take a function "up" one environment -- it'll be able to see all the standard functions and such (mean, etc.), but not anything you put in the global environment:
explicit.globals = function(f) {
  name = deparse(substitute(f))
  env = parent.frame()
  enclos = parent.env(.GlobalEnv)
  environment(f) = enclos
  env[[name]] = f
}
Then getting a global is just retrieving it from .GlobalEnv:
global = function(n) {
  name = deparse(substitute(n))
  env = parent.frame()
  env[[name]] = get(name, .GlobalEnv)
}
assign('global', global, env=baseenv())
And it would be used like
a = 2
b = 3
f = function() {
  global(a)
  a
  b
}
explicit.globals(f)
And called like
> f()
Error in f() : object 'b' not found
I personally wouldn't go for this but if you're used to PHP it might make sense.
Summing up, there is really no correct answer: as Owen and gsk3 point out, R functions will use globals if a variable is not in the local scope. This may be desirable in some situations, so how could the "error" be pointed out?
checkUsage() does nothing that R's built-in error-checking does not (in this case). checkUsageEnv(.GlobalEnv) is a useful way to check a file of helper functions (and might be great as a pre-hook for svn or git; or as part of an automated build process).
I feel the best solution when refactoring is this: at the very start, move all global code into a function (e.g. call it main()), so that the only remaining global code is the call to that function. Do this first, then start extracting functions, etc.
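A minimal sketch of that pattern (the file name and the fields data frame are illustrative only):
main <- function() {
  fields <- read.csv("fields.csv")   # formerly a global variable
  for (nm in names(fields)) {
    if (nm %in% fields$non_numeric) {
      # ... handle non-numeric columns ...
    }
  }
}
main()   # the only remaining top-level statement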
