Verbatim command arguments: deparse(substitute(foo)) in a wrapper - r

Here's a little puzzler for those fluent in the nitty-gritty of how the R evaluator handles function calls. Suppose I want to write a function that takes an R statement, exactly as I'd type it at the command line, and echoes both the statement and its evaluated result. Example:
> p.eval(sum(1:3))
sum(1:3) --> 6
That's easy; here's the definition of p.eval():
p.eval <- function(v,prefix="--> ") {
cmd <- deparse(substitute(v)); cat(cmd,prefix,v,"\n")
}
But suppose I now want to write a wrapper around p.eval, to be invoked the same way; perhaps as a somewhat demented binary operator with a dummy second argument:
"%PE%" <- function(x,...) p.eval(x)
I'd like to invoke it like so: sum(1:3) %PE% 0 should be equivalent to the old p.eval(sum(1:3)). This doesn't work, of course, because the deparse(substitute()) of p.eval() now gives x.
Question to the enlightened: is there a way to make this work as I desire?.. For this particular usage, I'm quite fine with defining %PE% by copying/pasting the one-liner definition of p.eval, so this question is mostly academic in nature. Maybe I'll learn something about the nitty-gritty of the R evaluator :)
P.S.: Why might one find the above functions useful?.. Suppose I develop some analysis code and invoke it non-interactively through org-babel (which is most definitely worth playing with if you are an Org-mode and/or an Emacs user). By default, org-babel slurps up the output as things are evaluated in the interpreter. Thus, if I want to get anything but raw numbers, I have to explicitly construct strings to be printed through cat or paste, but who wants to do that when they are flying through the analysis?.. The hack above allows you to simply append %PE%0 after a line that you want printed, and this echoes the command to the org output.

Try this:
> "%PE%" <- function(x, ...) do.call(p.eval, list(substitute(x)))
> sum(1:3) %PE% 0
sum(1:3) --> 6
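The do.call() version works because substitute(x) hands p.eval the caller's expression instead of the symbol x. A variation on the same idea, offered only as a sketch, is to build the whole call with substitute() and evaluate it in the caller's frame. Here p.eval is the question's function, changed (as an assumption for this example) to return its value invisibly so the result can be reused:

```r
# p.eval as in the question, except that it returns v invisibly (an addition for this sketch)
p.eval <- function(v, prefix = "--> ") {
  cmd <- deparse(substitute(v))
  cat(cmd, prefix, v, "\n")
  invisible(v)
}

# substitute() turns p.eval(x) into p.eval(<the caller's expression>);
# eval.parent() then evaluates that call in the frame %PE% was called from
"%PE%" <- function(x, ...) eval.parent(substitute(p.eval(x)))

sum(1:3) %PE% 0
```

The second operand is never evaluated, so any syntactically valid expression can stand in for it.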

You could also just have p.eval return v, and then:
p.eval <- function(v,prefix="--> ") {
cmd <- deparse(substitute(v)); cat(cmd,prefix,v,"\n")
return(v)
}
"%PE%" <- function(x, y=NULL) x
sum(1:3) %PE% Inf
#[1] 6
sum(1:3) %PE% # won't accept single argument
r # give it anything
#[1] 6

Related

rlang: Error: Can't convert a function to a string

I created a function to convert a function name to string. Version 1 func_to_string1 works well, but version 2 func_to_string2 doesn't work.
func_to_string1 <- function(fun){
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string2 <- function(fun){
is.function(fun)
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string1 works:
> func_to_string1(sum)
[1] "sum"
func_to_string2 doesn't work.
> func_to_string2(sum)
Error: Can't convert a primitive function to a string
Call `rlang::last_error()` to see a backtrace
My guess is that by using fun before converting it to a string, it gets evaluated inside the function and hence throws the error. But why does this happen, since I didn't do any assignments?
My questions are: why does this happen, and is there a better way to convert a function name to a string?
Any help is appreciated, thanks!
This isn't a complete answer, but I don't think it fits in a comment.
R has a mechanism called pass-by-promise,
whereby a function's formal arguments are lazy objects (promises) that only get evaluated when they are used.
Even if you didn't perform any assignment,
the call to is.function uses the argument,
so the promise is "replaced" by the result of evaluating it.
Nevertheless, in my opinion, this seems like an inconsistency in rlang*,
especially given cory's answer,
which implies that R can still find the promise object even after a given parameter has been used;
the mechanism to do so might not be part of R's public API though.
*EDIT: see comments.
Regardless, you could treat enexpr/enquo/ensym like base::missing,
in the sense that you should only use them with parameters you haven't used at all in the function's body.
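The promise behaviour described above can be observed in base R alone. The sketch below uses hypothetical helpers (counter, bump, g) that are not part of the question; it shows that the argument is evaluated on first use, that the value is then cached, and that base R's substitute() can still recover the original expression afterwards:

```r
counter <- 0
bump <- function() {          # side effect lets us see when evaluation happens
  counter <<- counter + 1
  "value"
}

g <- function(arg) {
  before <- counter           # promise not forced yet
  arg                         # first use forces the promise
  arg                         # second use reuses the cached value
  list(before = before,
       after  = counter,
       expr   = deparse(substitute(arg)))  # expression is still recoverable
}

str(g(bump()))
```

On a fresh session, before is 0 and after is 1: the argument expression ran exactly once even though arg was used twice.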
Maybe use this instead?
func_to_string2 <- function(fun){
is.function(fun)
deparse(substitute(fun))
#print(rlang::as_string(rlang::enexpr(fun)))
}
> func_to_string2(sum)
[1] "sum"
This question brings up an interesting point on lazy evaluations.
R arguments are lazily evaluated, meaning an argument is not evaluated until it is required.
This is best understood in the Advanced R book which has the following example,
f <- function(x) {
10
}
f(stop("This is an error!"))
The result is 10, which may be surprising: the error is never raised because x is never used and hence never evaluated. We can force x to be evaluated by using force():
f <- function(x) {
force(x)
10
}
f(stop("This is an error!"))
This now raises the error, as expected. In fact, we don't even need force() (although it is good to be explicit):
f <- function(x) {
x
10
}
f(stop("This is an error!"))
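This is what is happening with your call here. The argument fun starts out as an unevaluated promise whose expression is the symbol sum. Calling is.function() forces that promise (sum itself is not called), after which rlang sees the function object rather than the symbol. In fact, even this will fail: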
func_to_string2 <- function(fun){
fun
print(rlang::as_string(rlang::ensym(fun)))
}
Overall, I think it's best to use enexpr() at the very beginning of the function.
Source:
http://adv-r.had.co.nz/Functions.html
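The "capture first, use later" ordering can be sketched in base R as follows. func_to_string3 is a hypothetical name for this example; the answers above suggest the same ordering applies when rlang::enexpr() is used in place of substitute():

```r
func_to_string3 <- function(fun) {
  name <- deparse(substitute(fun))  # capture the expression before anything else
  stopifnot(is.function(fun))       # only now force the promise
  name
}

func_to_string3(sum)        # "sum"
func_to_string3(base::sum)  # "base::sum"
```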

Confused by ...()?

In another question, sapply(substitute(...()), as.character) was used inside a function to obtain the names passed to the function. The as.character part sounds fine, but what on earth does ...() do?
It's not valid code outside of substitute:
> test <- function(...) ...()
> test(T,F)
Error in test(T, F) : could not find function "..."
Some more test cases:
> test <- function(...) substitute(...())
> test(T,F)
[[1]]
T
[[2]]
F
> test <- function(...) substitute(...)
> test(T,F)
T
Here's a sketch of why ...() works the way it does. I'll fill in with more details and references later, but this touches on the key points.
Before performing substitution on any of its components, substitute() first parses an R statement.
...() parses to a call object, whereas ... parses to a name object.
... is a special object, intended only to be used in function calls. As a consequence, the C code that implements substitution takes special measures to handle ... when it is found in a call object. Similar precautions are not taken when ... occurs as a symbol. (The relevant code is in the functions do_substitute, substitute, and substituteList (especially the latter two) in R_SRCDIR/src/main/coerce.c.)
So, the role of the () in ...() is to cause the statement to be parsed as a call (aka language) object, so that substitution will return the fully expanded value of the dots. It may seem surprising that ... gets substituted for even when it's on the outside of the (), but: (a) calls are stored internally as list-like objects and (b) the relevant C code seems to make no distinction between the first element of that list and the subsequent ones.
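The name-versus-call distinction can be checked directly with quote(). The sketch below also shows a more conventional way to get the same list of unevaluated arguments, by wrapping the dots in a list() call and dropping the function name; dots_names is a hypothetical helper for this example:

```r
class(quote(...))    # "name": a bare symbol
class(quote(...()))  # "call": a call whose function position holds ...

# Wrap the dots in a call, substitute, then drop the leading `list`
dots_names <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  vapply(exprs, deparse, character(1))
}

dots_names(a, b + 1)  # "a" "b + 1"
```

Neither a nor b needs to exist, because the arguments are never evaluated.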
Just a side note: for examining behavior of substitute or the classes of various objects, I find it useful to set up a little sandbox, like this:
f <- function(...) browser()
f(a = 4, 77, B = "char")
## Then play around within the browser
class(quote(...)) ## quote() parses without substituting
class(quote(...()))
substitute({...})
substitute(...(..., X, ...))
substitute(2 <- (makes * list(no - sense))(...))

Why the "=" R operator should not be used in functions?

The manual states:
The operator ‘<-’ can be used anywhere,
whereas the operator ‘=’ is only allowed at the top level (e.g.,
in the complete expression typed at the command prompt) or as one
of the subexpressions in a braced list of expressions.
The question here mentions the difference when used in a function call. But in a function definition, it seems to work normally:
a = function ()
{
b = 2
x <- 3
y <<- 4
}
a()
# (b and x are undefined here)
So why does the manual say that the operator ‘=’ is only allowed at the top level?
There is nothing about it in the language definition (there is no = operator listed, what a shame!)
The text you quote says at the top level OR in a braced list of subexpressions. You are using it in a braced list of subexpressions. Which is allowed.
You have to go to great lengths to find an expression which is neither toplevel nor within braces. Here is one. You sometimes want to wrap an assignment inside a try block: try( x <- f() ) is fine, but try( x = f() ) is not, because inside a call the = is read as naming an argument x rather than as an assignment. You need to either change the assignment operator or add braces.
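The three try() variants can be compared side by side. This is a minimal sketch in which f is a hypothetical stand-in function; the failing call is wrapped in tryCatch() only so the script keeps running:

```r
f <- function() 42

try(x <- f())      # <- inside a call: assigns x in the calling environment
try({ y = f() })   # = inside braces: also an assignment

# Here = names an argument of try() instead of assigning, and try() has no
# argument called z, so the call itself is an error
err <- tryCatch(try(z = f()), error = function(e) e)

x                        # 42
y                        # 42
inherits(err, "error")   # TRUE
```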
Expressions not at the top level include usage in control structures like if. For example, the following programming error is illegal.
> if(x = 0) 1 else x
Error: syntax error
As mentioned here: https://stackoverflow.com/a/4831793/210673
Also see http://developer.r-project.org/equalAssign.html
Other than examples such as system.time, which others have shown, where <- and = give different results, the main difference is more philosophical. Larry Wall, the creator of Perl, said something along the lines of "similar things should look similar, different things should look different". I have found it interesting in different languages to see which things are considered "similar" and which are considered "different". Now for R assignment let's compare 2 commands:
myfun( a <- 1:10 )
myfun( a = 1:10 )
Some would argue that in both cases we are assigning 1:10 to a so what we are doing is similar.
The other argument is that in the first call we are assigning to a variable a that is in the same environment from which myfun is being called and in the second call we are assigning to a variable a that is in the environment created when the function is called and is local to the function and those two a variables are different.
So which to use depends on whether you consider the assignments "similar" or "different".
Personally, I prefer <-, but I don't think it is worth fighting a holy war over.
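The practical difference between the two calls can be made visible with a small sketch. myfun here is a hypothetical one-argument function, and demo() exists only to give the calls a clean local environment to run in:

```r
myfun <- function(a) length(a)

demo <- function() {
  myfun(a = 1:10)                         # names the argument; nothing assigned here
  after_equals <- exists("a", inherits = FALSE)
  myfun(a <- 1:10)                        # assigns a in this environment, then passes it
  after_arrow <- exists("a", inherits = FALSE)
  c(after_equals = after_equals, after_arrow = after_arrow)
}

demo()  # after_equals is FALSE, after_arrow is TRUE
```

Note that the <- form only creates a because myfun actually uses its argument; with lazy evaluation, an unused argument would never run the assignment.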

Can an R function access its own name?

Can you write a function that prints out its own name?
(without hard-coding it in, obviously)
You sure can.
fun <- function(x, y, z) deparse(match.call()[[1]])
fun(1,2,3)
# [1] "fun"
You can, but in case you want the name only so that you can call the function recursively, see ?Recall, which is robust to name changes and avoids the need to process the call to get the name.
Recall package:base R Documentation
Recursive Calling
Description:
‘Recall’ is used as a placeholder for the name of the function in
which it is called. It allows the definition of recursive
functions which still work after being renamed, see example below.
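A short sketch of what the documentation describes, using a hypothetical factorial function: the recursion keeps working after the original name is removed, because Recall() does not depend on the name the function is bound to.

```r
fact <- function(n) if (n <= 1) 1 else n * Recall(n - 1)

renamed <- fact
rm(fact)       # the original name is gone

renamed(5)     # 120
```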
As you've seen in the other great answers here, the answer seems to be "yes"...
However, the correct answer is actually "yes, but not always". What you can get is actually the name (or expression!) that was used to call the function.
First, using sys.call is probably the most direct way of finding the name, but then you need to coerce it into a string. deparse is more robust for that.
myfunc <- function(x, y=42) deparse(sys.call()[[1]])
myfunc (3) # "myfunc"
...but you can call a function in many ways:
lapply(1:2, myfunc) # "FUN"
Map(myfunc, 1:2) # (the whole function definition!)
x<-myfunc; x(3) # "x"
get("myfunc")(3) # "get(\"myfunc\")"
The basic issue is that a function doesn't have a name - it's just that you typically assign the function to a variable name. Not that you have to - you can have anonymous functions - or assign many variable names to the same function (the x case above).
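The point that only the calling expression is available can be checked with a compact sketch that reuses myfunc from above; alias is a hypothetical second name for the same function:

```r
myfunc <- function(x, y = 42) deparse(sys.call()[[1]])

direct <- myfunc(3)                       # "myfunc"

alias <- myfunc
via_alias <- alias(3)                     # "alias": same function, different name

via_lapply <- unlist(lapply(1:2, myfunc)) # "FUN" "FUN": lapply's own argument name

list(direct = direct, via_alias = via_alias, via_lapply = via_lapply)
```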

Using readline() within a for loop

I have a function, as follows:
f.factor <- function(x) {
print(length(unique(x)))
z <- 1
for (i in 1:length(unique(x))) {
z[i] <- readline(":")
}
x <- factor(x, labels=c(z))
return(x)
}
Essentially, it allows me to copy/paste/type or just simply write into my script the factors for a particular variable without having to type c("..","...") a million times.
I've run into a problem when I try to use this function in a loop, perhaps the loop structure will not allow lines to be read within the loop?
for(i in 1:ncol(df.)) {
df[,paste("q4.",i,sep="")] <- f.factor(df[,paste("q4.",i,sep="")])
Never Heard of
Heard of but Not at all Familiar
Somewhat Familiar
Familiar
Very Familiar
Extremely Familiar
}
In the end, I'm looking for a way to specify the factor label without having to rewrite it over and over.
That was only working before because when you pasted all the code in at the top level it was executed immediately and the readline() call used the following N lines. In a function, or any control structure, it will try to parse it as R code which will fail.
A multiline string can stand in for a passable heredoc:
lvls = strsplit('
Never Heard of
Heard of but Not at all Familiar
Somewhat Familiar
Familiar
Very Familiar
Extremely Familiar
', '\n')[[1]][-1]
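Once the labels are in a vector, they can be applied to every column without retyping. The sketch below makes assumptions that are not in the question: the data frame df, its columns q4.1 and q4.2, and the coding of the answers as integers 1 to 6 in the order of the labels.

```r
lvls <- strsplit('
Never Heard of
Heard of but Not at all Familiar
Somewhat Familiar
Familiar
Very Familiar
Extremely Familiar
', '\n')[[1]][-1]   # [-1] drops the empty string before the first newline

# Hypothetical data: two survey columns coded 1..6
df <- data.frame(q4.1 = c(1, 3, 6), q4.2 = c(2, 2, 5))

# Apply the same labels to every q4.* column in one step
cols <- paste0("q4.", 1:2)
df[cols] <- lapply(df[cols], factor, levels = seq_along(lvls), labels = lvls)

str(df)
```

Passing levels explicitly keeps the label order fixed even when a column does not contain every code.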
Instead of the for loop you can just use scan without a file name (with what='' and possibly sep='\n').
> tmp <- scan(what='', sep='\n')
1: hello there
2: some more
3:
Read 2 items
> tmp
[1] "hello there" "some more"
>
