Lazy evaluation example with NULL in Advanced R - r

In Hadley Wickham's Advanced R, I'm having some trouble understanding one of the lazy evaluation examples.
Copied from the book, chapter on functions:
Laziness is useful in if statements: the second statement below will be evaluated only if the first is true. If it wasn’t, the statement would return an error because NULL > 0 is a logical vector of length 0 and not a valid input to if.
x <- NULL
if (!is.null(x) && x > 0) {
}
As I understand the example, if R was not using lazy evaluation, the !is.null() and > 0 functions would be evaluated simultaneously, throwing an error since NULL is not a permissible argument to the ">" function. Is this correct? Is it thus generally advisable to include !is.null() in R statements when we expect that a variable could be NULL?

This is just the way the && operator works. It's called short-circuiting and is separate from lazy evaluation.
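To make the short-circuit concrete, here is a minimal base R sketch. The names cmp, guarded and unguarded are only illustrative and do not come from the book.

```r
x <- NULL

# Comparing NULL with a number gives a zero-length logical vector.
cmp <- NULL > 0
length(cmp)

# && stops at the first FALSE, so `x > 0` is never evaluated here.
guarded <- !is.null(x) && x > 0

# Without the guard, `if` receives a zero-length condition and signals an error.
unguarded <- tryCatch(
  if (x > 0) "positive" else "not positive",
  error = function(e) "error"
)
```

So guarding with !is.null() is reasonable whenever a variable may legitimately be NULL, as long as the guard comes first and is combined with &&.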
Lazy evaluation refers to the way function arguments are evaluated. In particular arguments are only evaluated when (if) they are actually used in the function. For example, consider the following function
f <- function(a, b) NULL
that does nothing and returns NULL. The arguments a and b are never evaluated because they are unused. They don't appear in the body of f, so you can call f with any expressions you want (as long as it's syntactically correct) as arguments because the expressions won't be evaluated. E.g.
> f(1, 2)
NULL
> f(43$foo, unboundvariableblablabla)
NULL
Without lazy evaluation the arguments are evaluated first and then passed to the function, so the call above would fail because if you try to evaluate 43$foo you'll get an error
> 43$foo
Error in 43$foo : $ operator is invalid for atomic vectors
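Another way to see this, as a rough sketch: pass an argument with a visible side effect and check whether the side effect happened. The flag evaluated and the function g are made up for this illustration.

```r
evaluated <- FALSE

# `b` is never used, so the promise for it is never forced.
f <- function(a, b) NULL
f(1, evaluated <- TRUE)
unused_result <- evaluated   # still FALSE

# Here `b` is used, so its promise is forced and the assignment runs.
g <- function(a, b) {
  b
  NULL
}
g(1, evaluated <- TRUE)
used_result <- evaluated     # now TRUE
```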

Related

What is the Most Rly Way for Lazy Conditional Evaluation

The case. I have a part of code like this: if (exists("mybooleanvar") & mybooleanvar) {statement1} else {statement2}. I expect that, if the conditions are evaluated lazily (short-circuited), statement1 will run if mybooleanvar exists and is TRUE, and statement2 will run if mybooleanvar does not exist or equals FALSE.
But in practice I get a runtime error showing that the value of mybooleanvar is accessed and compared to TRUE even when exists("mybooleanvar") is FALSE. So the complete boolean expression is evaluated.
Of course the issue can be solved with nested if statements, with the outer one evaluating exists() and the inner one the boolean. But I wonder what the most Rly way is to avoid evaluating the later operands of a conditional expression once the result is already determined by the earlier ones.
For example, statement1 & statement2 will be FALSE if statement1 == FALSE. statement1 | statement2 is TRUE if statement1 == TRUE, and statement2 need not be checked (or at least this check can be switched off by something like the compiler directive {$B-} in Delphi).
Here I would use && instead of &. They differ in two ways (cf. ?"&&"):
The shorter form performs elementwise comparisons ...
and:
The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.
Example:
foo <- function()
if (exists("x") && (x)) cat("is TRUE\n") else cat("not existing or FALSE\n")
x <- TRUE
foo()
x <- FALSE
foo()
rm(x)
foo()
More can be found in this post.
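If it helps, the difference in evaluation can be observed directly. This is only a sketch: rhs() is a hypothetical helper that counts how often the right-hand operand is evaluated, and no_such_variable_here is assumed not to exist.

```r
rhs_calls <- 0
rhs <- function() {
  rhs_calls <<- rhs_calls + 1   # record that the right-hand side ran
  TRUE
}

single <- FALSE & rhs()     # `&` evaluates both operands
calls_after_single <- rhs_calls

double <- FALSE && rhs()    # `&&` stops after the first FALSE
calls_after_double <- rhs_calls

# With a variable that does not exist, only && is safe.
safe <- exists("no_such_variable_here") && no_such_variable_here
unsafe <- tryCatch(
  exists("no_such_variable_here") & no_such_variable_here,
  error = function(e) "error"
)
```

The counter goes up after the & call but not after the && call, and only the & version touches the missing variable.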

rlang: Error: Can't convert a function to a string

I created a function to convert a function name to string. Version 1 func_to_string1 works well, but version 2 func_to_string2 doesn't work.
func_to_string1 <- function(fun){
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string2 <- function(fun){
is.function(fun)
print(rlang::as_string(rlang::enexpr(fun)))
}
func_to_string1 works:
> func_to_string1(sum)
[1] "sum"
func_to_string2 doesn't work.
> func_to_string2(sum)
Error: Can't convert a primitive function to a string
Call `rlang::last_error()` to see a backtrace
My guess is that by using fun before converting it to a string, it gets evaluated inside the function and hence throws the error message. But why does this happen, since I didn't do any assignments?
My questions are: why does it happen, and is there a better way to convert a function name to a string?
Any help is appreciated, thanks!
This isn't a complete answer, but I don't think it fits in a comment.
R has a mechanism called pass-by-promise,
whereby a function's formal arguments are lazy objects (promises) that only get evaluated when they are used.
Even if you didn't perform any assignment,
the call to is.function uses the argument,
so the promise is "replaced" by the result of evaluating it.
Nevertheless, in my opinion, this seems like an inconsistency in rlang*,
especially given cory's answer,
which implies that R can still find the promise object even after a given parameter has been used;
the mechanism to do so might not be part of R's public API though.
*EDIT: see comments.
Regardless, you could treat enexpr/enquo/ensym like base::missing,
in the sense that you should only use them with parameters you haven't used at all in the function's body.
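A small base R sketch of the promise being "replaced" by its value: the argument is referenced twice, yet the expression behind it runs only once. The counter evaluations and the function twice are hypothetical names for this illustration.

```r
evaluations <- 0

# `x` is referenced twice, but the promise behind it is forced only once;
# the second reference reuses the cached value.
twice <- function(x) {
  c(x, x)
}

out <- twice({
  evaluations <- evaluations + 1
  "value"
})
```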
Maybe use this instead?
func_to_string2 <- function(fun){
is.function(fun)
deparse(substitute(fun))
#print(rlang::as_string(rlang::enexpr(fun)))
}
> func_to_string2(sum)
[1] "sum"
This question brings up an interesting point on lazy evaluations.
R arguments are lazily evaluated, meaning an argument is not evaluated until it is required.
This is best understood in the Advanced R book which has the following example,
f <- function(x) {
10
}
f(stop("This is an error!"))
The result is 10 and no error is raised, which may be surprising: x is never used in the body and hence never evaluated. We can force x to be evaluated by using force():
f <- function(x) {
force(x)
10
}
f(stop("This is an error!"))
This throws the error, as expected. In fact we don't even need force() (although it is good to be explicit).
f <- function(x) {
x
10
}
f(stop("This is an error!"))
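This is what is happening with your call here. The argument fun, which initially is an unevaluated promise holding the symbol sum, is evaluated when is.function() is called, so by the time rlang looks at it, it appears to see the function object rather than the symbol. In fact, even this will fail.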
func_to_string2 <- function(fun){
fun
print(rlang::as_string(rlang::ensym(fun)))
}
Overall, I think it's best to use enexpr() at the very beginning of the function.
Source:
http://adv-r.had.co.nz/Functions.html
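As a sketch of the "capture first, use later" ordering: the version below uses base substitute() in place of rlang::enexpr() purely to avoid the package dependency, and func_to_string3 is a hypothetical name. The same ordering would apply to the rlang version.

```r
func_to_string3 <- function(fun) {
  # Capture the expression first, before anything uses `fun`.
  fun_name <- deparse(substitute(fun))
  # Only now force the promise to validate the argument.
  stopifnot(is.function(fun))
  fun_name
}

name_sum  <- func_to_string3(sum)
name_mean <- func_to_string3(base::mean)
```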

Determining whether a function has standard evaluation

Is there any way to programmatically tell if a given function in R has standard evaluation, and if not, which component of function evaluation (parsing, matching, scoping, promise formation, promise fulfillment, return, etc.) is non-standard? I understand that closures are likely to be standard, and primitives are likely to be non-standard, but there are exceptions both ways. I’m asking about determining whether the function semantics are standard with respect to each of these things, not whether the function mechanics are standard.
I assume these things ought to be derivable from a close and careful reading of the help page, and failing that the code, and failing that any referenced source code. But it would save me a great deal of grief if I had a mechanical way of quickly identifying non-standard features in the evaluation of a given function.
If there is not a way to programmatically identify all the ways in which a function is nonstandard, are there ways to test for any aspect of standardness?
A quick way to check for non-standard evaluation (NSE) is to see whether certain keywords are used, e.g. substitute, eval, deparse etc.
Please see the code below. It looks into the body of the function and counts how many times NSE-related keywords are used.
is_nse <- function(x) {
nse_criteria <- c("substitute", "deparse", "eval", "parent.frame", "quote")
code <- as.character(body(x))
print(x)
cat("-------------------------------------------------------------------\n")
nse_count <- sapply(nse_criteria, function(x) sum(grepl(x, code)))
if(sum(nse_count) > 0)
warning("Possible non-standard evaluation")
nse_count
}
is_nse(as.Date.default)
Output:
function (x, ...)
{
if (inherits(x, "Date"))
x
else if (is.null(x))
.Date(numeric())
else if (is.logical(x) && all(is.na(x)))
.Date(as.numeric(x))
else stop(gettextf("do not know how to convert '%s' to class %s",
deparse1(substitute(x)), dQuote("Date")), domain = NA)
}
<bytecode: 0x0000021be90e8f18>
<environment: namespace:base>
-------------------------------------------------------------------
  substitute      deparse         eval parent.frame        quote
           1            1            0            0            0
Warning message:
In is_nse(as.Date.default) : Possible non-standard evaluation
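Because grepl() matches substrings, a name such as evaluate_me would also be counted as eval. A possible variation, sketched below under the assumption that exact name matches are wanted, walks the body with all.names() instead. count_nse_calls, uses_nse and no_nse are hypothetical names, and this is still only a heuristic.

```r
count_nse_calls <- function(fn,
                            keywords = c("substitute", "deparse", "eval",
                                         "parent.frame", "quote")) {
  stopifnot(is.function(fn))
  # Primitives have no R-level body to inspect.
  if (is.primitive(fn)) {
    return(structure(integer(length(keywords)), names = keywords))
  }
  # all.names() walks the expression and returns every symbol it contains.
  symbols <- all.names(body(fn))
  vapply(keywords, function(k) sum(symbols == k), integer(1))
}

uses_nse <- function(x) deparse(substitute(x))
no_nse   <- function(x) evaluate_me(x)   # contains "eval" only as a substring

counts_nse  <- count_nse_calls(uses_nse)
counts_none <- count_nse_calls(no_nse)
```

Exact matching cuts the other way too: a call to deparse1 is no longer counted as deparse, so the keyword list would need extending for real use.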

R errors indirect calling of argument

This is one example where the '[' error occurs:
> libs=.packages(TRUE)
> library(help=libs[1])
Error in find.package(pkgName, lib.loc, verbose = verbose) :
there is no package called ‘[’
R behaves differently when I pass the argument directly, library(help="base"), versus indirectly, x="base"; library(help=x). Why does R think I am asking about a package named x, and what mechanism is at work? I think the explanation is somewhere here: http://adv-r.had.co.nz/
Looking at the source of library, you will find the following code
if (!character.only)
help <- as.character(substitute(help))
The help page for substitute says that
substitute returns the parse tree for the (unevaluated) expression expr, substituting any variables bound in env
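where env means the current evaluation environment. Inside library, help is a formal argument (a promise), so substitute returns the unevaluated expression libs[1] rather than its value; libs itself is bound in .GlobalEnv and not in the environment of the function library. Converting the call libs[1] to character gives "[" as its first element, which is why the error mentions a package called '['.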
A simple example of what you are doing
x="a"
test_fun=function(x) {
as.character(substitute(x))
}
test_fun(x)
#"x"
However, if x was defined within the function body
#delete previous definition of x, if necessary
#rm(list="x")
test_fun=function(x) {
x="a"
as.character(substitute(x))
}
test_fun(x)
#"a"
One interesting aspect can be read further down the help of substitute
Substitution takes place by examining each component of the parse tree as follows: If it is not a bound symbol in env, it is unchanged. If it is a promise object, i.e., a formal argument to a function or explicitly created using delayedAssign(), the expression slot of the promise replaces the symbol. If it is an ordinary variable, its value is substituted, unless env is .GlobalEnv in which case the symbol is left unchanged.
The last part "unless env is .GlobalEnv in which case the symbol is left unchanged" is interesting because
x="a"
as.character(substitute(x))
"x"
This is because env is .GlobalEnv in this case, so substitution does not take place. I am sure there is a good reason for this; still, it is somewhat surprising to me.
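Coming back to the original library(help = libs[1]) call, here is a rough sketch of where the '[' comes from. show_help_arg is a hypothetical stand-in that copies only the two quoted lines from library; it is not the real function.

```r
# Hypothetical stand-in for the relevant two lines of library().
show_help_arg <- function(help, character.only = FALSE) {
  if (!character.only)
    help <- as.character(substitute(help))
  help
}

libs <- c("base", "stats")
direct   <- show_help_arg(base)       # a bare name
indexed  <- show_help_arg(libs[1])    # a call to `[`
as_value <- show_help_arg(libs[1], character.only = TRUE)
```

Here indexed is the character vector "[", "libs", "1", while as_value is "base". library() itself documents a character.only argument for this situation, so library(help = libs[1], character.only = TRUE) is probably what the question needs.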

Why the "=" R operator should not be used in functions?

The manual states:
The operator ‘<-’ can be used anywhere,
whereas the operator ‘=’ is only allowed at the top level (e.g.,
in the complete expression typed at the command prompt) or as one
of the subexpressions in a braced list of expressions.
The question here mentions the difference when used in a function call. But in a function definition, it seems to work normally:
a = function ()
{
b = 2
x <- 3
y <<- 4
}
a()
# (b and x are undefined here)
So why does the manual say that the operator ‘=’ is only allowed at the top level?
There is nothing about it in the language definition (there is no = operator listed, what a shame!)
The text you quote says at the top level OR in a braced list of subexpressions. You are using it in a braced list of subexpressions. Which is allowed.
You have to go to great lengths to find an expression which is neither toplevel nor within braces. Here is one. You sometimes want to wrap an assignment inside a try block: try( x <- f() ) is fine, but try( x = f() ) is not -- you need to either change the assignment operator or add braces.
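A small sketch of the three variants, assuming a trivial f() and throwaway variable names a1, a2 and a3 that are not part of the original answer:

```r
f <- function() 42

# `<-` inside a call is an assignment in the calling environment.
try(a1 <- f())

# Braces turn `=` back into an assignment.
try({ a2 = f() })

# A bare `=` inside the call is parsed as a named argument `a3`,
# which try() does not have, so the call fails and nothing is assigned.
outcome <- tryCatch(
  try(a3 = f()),
  error = function(e) "error"
)
```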
Expressions not at the top level include usage in control structures like if. For example, the following (a common programming error) is illegal:
> if(x = 0) 1 else x
Error: syntax error
As mentioned here: https://stackoverflow.com/a/4831793/210673
Also see http://developer.r-project.org/equalAssign.html
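The parser's view can be checked without running anything. In this sketch, parses() is a hypothetical helper that reports whether a string is syntactically valid R.

```r
parses <- function(code) {
  # TRUE if the string is syntactically valid R, FALSE otherwise.
  !inherits(try(parse(text = code), silent = TRUE), "try-error")
}

equals_in_condition <- parses("if (x = 0) 1 else x")    # rejected by the parser
arrow_in_condition  <- parses("if (x <- 0) 1 else x")   # accepted
comparison          <- parses("if (x == 0) 1 else x")   # accepted
```

So the restriction on = turns the classic "assignment instead of comparison" slip into a syntax error, whereas the same slip written with <- would parse and run.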
Other than examples such as system.time, which others have shown, where <- and = give different results, the main difference is more philosophical. Larry Wall, the creator of Perl, said something along the lines of "similar things should look similar, different things should look different". I have found it interesting to see which things different languages consider "similar" and which "different". Now for R assignment let's compare 2 commands:
myfun( a <- 1:10 )
myfun( a = 1:10 )
Some would argue that in both cases we are assigning 1:10 to a so what we are doing is similar.
The other argument is that in the first call we are assigning to a variable a in the environment from which myfun is called, while in the second call we are assigning to a variable a in the environment created when the function is called, which is local to the function. Those two a variables are different.
So which to use depends on whether you consider the assignments "similar" or "different".
Personally, I prefer <-, but I don't think it is worth fighting a holy war over.
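To see the two cases side by side, here is a sketch that assumes myfun simply sums its argument; that definition is made up for the illustration.

```r
myfun <- function(a) sum(a)

if (exists("a", inherits = FALSE)) rm(a)   # start from a clean slate

# Named argument: `a` exists only inside the call to myfun.
named_result <- myfun(a = 1:10)
a_exists_after_named <- exists("a", inherits = FALSE)

# Assignment as argument: `a` is also created in the calling environment
# (once myfun actually uses its argument, because of lazy evaluation).
arrow_result <- myfun(a <- 1:10)
a_exists_after_arrow <- exists("a", inherits = FALSE)
```

Both calls return the same value, but only the second leaves a variable a behind in the caller.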
