How does return() in a with() block work?
Here is a test function
test_func <- function(df, y) {
with(df,
if(x > 10){
message('Inside step 1')
return(1)
}
)
message("After step 1")
if(y > 10){
message('In side step 2')
return(2)
}
message("After step 2")
}
The function keeps going after return(1).
df <- data.frame(x = 11)
y <- 11
test_func(df, y) ## result is 2
Output
Inside step 1
After step 1
In side step 2
[1] 2
return(1) does not return from test_func(); it only exits the with() block.
df <- data.frame(x = 11)
y <- 5
test_func(df, y) ## no result
Output
Inside step 1
After step 1
After step 2
I think the main point here is that return() is designed to exit the current scope to the parent scope with a particular value. In the case of running
return("hello")
# Error: no function to return from, jumping to top level
You get an error because we are calling this from the global environment and there is no parent scope you are jumping back to. Note that thanks to R's lazy evaluation, parameters passed to a function are usually evaluated in the environment where they were passed from. So in this example
f <- function(x) x
f(return("hello"))
# Error in f(return("hello")) :
# no function to return from, jumping to top level
So because we are actually passing in the return() call to the function from the global environment, that's where the return will be evaluated and will return the same error. But note that this is not what with does, instead it captures the commands you pass and re-runs them in a new environment. It's closer to something like this
f <- function(x) eval(substitute(x))
f(return("hello"))
# [1] "hello"
This eval() creates a new level of scope we can escape from, and because we aren't evaluating the parameter passed to the function directly, we are just running those commands in a different environment, the return is no longer evaluated in the global environment so there's no error. So the function we created is basically the same as calling
with(NULL, return("hello"))
# [1] "hello"
which is different from something like
print(return("hello"))
# no function to return from, jumping to top level
where the parameter is directly evaluated. So the different meanings of return() really are side effects of non-standard evaluation in this case.
When return() is used inside a with(), you are not returning from the function that you called with() from, but you are returning from the scope that with() created for you to run your command.
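To make that scoping point concrete, here is a minimal sketch in which return() is evaluated through eval(), which is what with() does internally. The function name g is made up for illustration. The return() only ends the eval() call, and the enclosing function carries on:

```r
# Hypothetical helper, not from the question: return() inside eval()
# leaves only the eval() call, not g() itself.
g <- function() {
  inner <- eval(quote(return("from eval")))
  paste("g() kept running after", inner)
}

result <- g()
print(result)
```

The value handed to return() becomes the value of the eval() call, which is why the code after a with() block still runs.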
The fix for this particular problem is already covered in the comments left by @IceCreamToucan: move the return() outside of the with().
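As a sketch of that fix, assuming df$x holds a single value as in the question, with() can be used only to compute the condition, so that return() is evaluated in the function's own frame. The name test_func_fixed is hypothetical:

```r
# Variant of the question's function: with() only computes the condition,
# so return(1) now exits test_func_fixed() itself.
test_func_fixed <- function(df, y) {
  if (with(df, x > 10)) {
    message("Inside step 1")
    return(1)
  }
  message("After step 1")
  if (y > 10) {
    message("Inside step 2")
    return(2)
  }
  message("After step 2")
}

r1 <- test_func_fixed(data.frame(x = 11), 11)  # stops at step 1
r2 <- test_func_fixed(data.frame(x = 5), 11)   # falls through to step 2
```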
I've just read about delayedAssign(), but the way you have to do it is by passing the name of the delayed variable as the first parameter. Is there a way to do it via direct assignment?
e.g.:
x <- delayed_variable("Hello World")
rather than
delayedAssign("x","Hello World")
I want to create a variable that will throw an error if accessed (use-case is obviously more complex), so for example:
f <- function(x){
y <- delayed_variable(stop("don't use y"))
x
}
f(10)
> 10
f <- function(x){
y <- delayed_variable(stop("don't use y"))
y
}
f(10)
> Error in f(10) : don't use y
No, you can't do it that way. Your example would be fine with the current setup, though:
f <- function(x){
delayedAssign("y", stop("don't use y"))
y
}
f(10)
which gives exactly the error you want. The reason for this limitation is that delayed_variable(stop("don't use y")) would create a value which would trigger the error when evaluated, and assigning it to y would evaluate it.
Another version of the same thing would be
f <- function(x, y = stop("don't use y")) {
...
}
Internally it's very similar to the delayedAssign version.
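A small sketch of how lazy the delayedAssign() version is: the expression is not run when the binding is created, and it runs only once no matter how often the variable is read. The function count_evals and the counter n_evals are made up for this illustration:

```r
# Hypothetical demo: count how many times the delayed expression is evaluated.
count_evals <- function() {
  n_evals <- 0
  delayedAssign("y", {
    n_evals <- n_evals + 1   # runs in this function's frame when y is first read
    "Hello World"
  })
  before <- n_evals          # nothing has been evaluated yet
  first <- y                 # forces the promise
  second <- y                # reuses the cached value
  list(before = before, after = n_evals, value = second)
}

res <- count_evals()
str(res)
```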
I reached a solution using makeActiveBinding() which works provided it is being called from within a function (so it doesn't work if called directly and will throw an error if it is). The main purpose of my use-case is a smaller part of this, but I generalised the code a bit for others to use.
Importantly for my use-case, this function can allow other functions to use delayed assignment within functions and can also pass R CMD Check with no Notes.
Here is the function and it gives the desired outputs from my question.
delayed_variable <- function(call){
#Get the current call
prev.call <- sys.call()
attribs <- attributes(prev.call)
# If srcref isn't there, then we're not coming from a function
if(is.null(attribs) || !"srcref" %in% names(attribs)){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Extract the call including the assignment operator
this_call <- parse(text=as.character(attribs$srcref))[[1]]
# Check if this is an assignment `<-` or `=`
if(!(identical(this_call[[1]],quote(`<-`)) ||
identical(this_call[[1]],quote(`=`)))){
stop("delayed_variable() can only be used as an assignment within a function.")
}
# Get the variable being assigned to as a symbol and a string
var_sym <- this_call[[2]]
var_str <- deparse(var_sym)
#Get the parent frame that we will be assigning into
p_frame <- parent.frame()
var_env <- new.env(parent = p_frame)
#Create a random string to be an identifier
var_rand <- paste0(sample(c(letters,LETTERS),50,replace=TRUE),collapse="")
#Put the variables into the environment
var_env[["p_frame"]] <- p_frame
var_env[["var_str"]] <- var_str
var_env[["var_rand"]] <- var_rand
# Create the function that will be bound to the variable.
# Since this is an Active Binding (AB), we have three situations
# i) It is run without input, and thus the AB is
# being called on its own (missing(input)),
# and thus it should evaluate and return the output of `call`
# ii) It is being run as the lhs of an assignment
# as part of the initial assignment phase, in which case
# we do nothing (i.e. input is the output of this function)
# iii) It is being run as the lhs of a regular assignment,
# in which case, we want to overwrite the AB
fun <- function(input){
if(missing(input)){
# No assignment: variable is being called on its own
# So, we activate the delayed assignment call:
res <- eval(call,p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
res
} else if(!inherits(input,"assign_delay") &&
input != var_rand){
# Attempting to assign to the variable
# and it is not the initial definition
# So we overwrite the active binding
res <- eval(substitute(input),p_frame)
rm(list=var_str,envir=p_frame)
assign(var_str,res,p_frame)
invisible(res)
}
# Else: We are assigning and the assignee is the output
# of this function, in which case, we do nothing!
}
#Fix the call in the above eval to be the exact call
# rather than a variable (useful for debugging)
# This is in the line res <- eval(call,p_frame)
body(fun)[[c(2,3,2,3,2)]] <- substitute(call)
#Put the function inside the environment with
# all of the variables above
environment(fun) <- var_env
# Check if the variable already exists in the calling
# environment and if so, remove it
if(exists(var_str,envir=p_frame)){
rm(list=var_str,envir=p_frame)
}
# Create the AB
makeActiveBinding(var_sym,fun,p_frame)
# Return a specific object to check for
structure(var_rand,call="assign_delay")
}
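For readers who only need the "error on access" behaviour and can live with naming the variable as a string, a much smaller sketch using makeActiveBinding() directly might look like this. It does not provide the y <- delayed_variable(...) syntax; f_unused and f_used are hypothetical names:

```r
# Minimal sketch: an active binding whose function raises an error on read.
f_unused <- function(x) {
  makeActiveBinding("y", function() stop("don't use y"), environment())
  x   # y is never read, so no error is raised
}

f_used <- function(x) {
  makeActiveBinding("y", function() stop("don't use y"), environment())
  y   # reading y calls the binding function, which raises the error
}

ok <- f_unused(10)
msg <- tryCatch(f_used(10), error = function(e) conditionMessage(e))
```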
I was helping a friend of mine with some of his code. I didn't know how to explain the strange behavior, but I could tell him that his functions weren't explicitly returning anything. Here is a minimum reproducible example:
derp <- function(arg){
arg <- arg+3
}
data <- derp(500)
data
#[1] 503
derp(500)
#nothing outputs
class(derp(500))
#[1] "numeric"
Is there a name for this that I can google? Why is this happening? Why isn't arg being destroyed after the call to derp() finishes?
You need to understand the difference between a function returning a value, and printing that value. By default, a function returns the value of the last expression evaluated, which in this case is the assignment
arg <- arg + 3
(Note that in R, an assignment is an expression that returns a value, in this case the value assigned.) This is why data <- derp(500) results in data containing 503.
However, an assignment returns its value invisibly, so when the assignment is the last expression in the function, the result is not auto-printed at the console. If you want to see the value, end the function with the value itself:
derp <- function(arg)
{
arg <- arg + 3
arg
}
or just
derp <- function(arg)
arg + 3
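One way to check the difference between returning and printing is withVisible(), which reports both the value and whether it would be auto-printed. This is a sketch; derp_visible is a name introduced here for the comparison:

```r
# The question's function: the last expression is an assignment.
derp <- function(arg) {
  arg <- arg + 3
}

# Comparison function (hypothetical name): the last expression is the value.
derp_visible <- function(arg) arg + 3

vis_assign <- withVisible(derp(500))
vis_plain  <- withVisible(derp_visible(500))

str(vis_assign)  # same value, but flagged as not visible
str(vis_plain)   # same value, flagged as visible
```

Wrapping the call in parentheses, as in (derp(500)), also makes the value print.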
I often use return(NULL) at the beginning of a function, inside some consistency checks, for example to check whether an output file would be overwritten:
some_function <- function(output_filename, overwrite = FALSE) {
if (file.exists(output_filename)) {
if (!overwrite) {
return(NULL)
}
}
}
In this case, calling some_function("file.path", overwrite = FALSE) when the file already exists returns NULL visibly, so NULL is printed at the console. You can suppress the printing by changing the return to:
...
return(invisible(NULL))
...
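Put together, a runnable sketch of that pattern could look as follows. The writeLines() call is only a stand-in for whatever the real function would write, and the temporary file is an assumption made so the example cleans up after itself:

```r
some_function <- function(output_filename, overwrite = FALSE) {
  if (file.exists(output_filename) && !overwrite) {
    return(invisible(NULL))   # quiet early exit: nothing is auto-printed
  }
  writeLines("placeholder output", output_filename)  # stand-in for real work
  invisible(output_filename)
}

tmp <- tempfile(fileext = ".txt")
some_function(tmp)                          # first call creates the file
second <- withVisible(some_function(tmp))   # second call takes the early exit
unlink(tmp)
```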
The arg variable is being destroyed. A function in R returns the value of the last expression evaluated in the function unless return() is called explicitly.
In your case a copy of arg is the return value of your function. Example:
alwaysReturnSomething = function()
{
x = runif(1)
if(x<0.5) 20 else 10
}
> for(x in 1:10) cat(alwaysReturnSomething())
20202020102010101020
or:
alwaysReturnSomething <- function(){}
> z=alwaysReturnSomething()
> z
NULL
This is curious behavior.
derp() always returns its value; the value is simply not printed when you do not assign the result. This is because the assignment operator (<-) returns its value invisibly, as if it were wrapped in invisible(). See "Make a function return silently" for how that works.
You can see the same behavior with derp2:
derp2 <- function(arg) {
invisible(arg + 3)
}
derp2(3)
# nothing
b <- derp2(3)
b
# 6
If you want a function to change the value of a variable in the global environment, you can use the <<- operator, but you still have to write the exact name of that variable inside the function:
derp <- function(arg){
arg <- arg+3
b<<-3
}
Try it and call b
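Note that <<- does not go straight to the global environment: it searches the enclosing environments for an existing binding and only creates one in the global environment if it finds none. A small sketch, where make_counter is a hypothetical name:

```r
# <<- updates `count` in the enclosing environment created by make_counter().
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1
    count
  }
}

counter <- make_counter()
counter()
counter()
value <- counter()
print(value)
```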
I'm trying to get a better understanding of closures, in particular details on a function's scope and how to work with its enclosing environment(s)
Based on the Description section of the help page on rlang::fn_env(), I had the understanding, that a function always has access to all variables in its scope and that its enclosing environment belongs to that scope.
But then, why isn't it possible to manipulate the contents of the closure environment "after the fact", i.e. after the function has been created?
By means of R's lexical scoping, shouldn't bar() be able to find x when I put it into its enclosing environment?
foo <- function(fun) {
env_closure <- rlang::fn_env(fun)
env_closure$x <- 5
fun()
}
bar <- function(x) x
foo(bar)
#> Error in fun(): argument "x" is missing, with no default
Ah, I think I got it down now.
It has to do with the structure of a function's formal arguments:
If an argument is defined without a default value, R will complain when you call the function without specifying it, even though it might technically be able to look it up in its scope.
One way to kick off lexical scoping even though you don't want to define a default value would be to set the defaults "on the fly" at run time via rlang::fn_fmls().
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fun()
}
# No argument at all -> lexical scoping takes over
baz <- function() x
foo(baz)
#> [1] 5
# Set defaults to desired values on the fly at run time of `foo()`
foo <- function(fun) {
env_enclosing <- rlang::fn_env(fun)
env_enclosing$x <- 5
fmls <- rlang::fn_fmls(fun)
fmls$x <- substitute(get("x", envir = env_enclosing, inherits = FALSE))
rlang::fn_fmls(fun) <- fmls
fun()
}
bar <- function(x) x
foo(bar)
#> [1] 5
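The same distinction can be sketched in base R without rlang. In this variant foo() gives the function a fresh child environment instead of modifying the original enclosure, which is a design choice of this sketch and not part of the answer above:

```r
# Base R sketch: inject x through a new environment that sits between
# the function and its original enclosure.
foo <- function(fun) {
  env <- new.env(parent = environment(fun))
  env$x <- 5
  environment(fun) <- env   # only the local copy of fun is changed
  fun()
}

baz <- function() x    # free variable: resolved by lexical scoping
bar <- function(x) x   # formal argument: shadows any x in the enclosure

res_baz <- foo(baz)
err_bar <- tryCatch(foo(bar), error = function(e) e)

print(res_baz)
print(conditionMessage(err_bar))
```

Because x is a formal argument of bar, the lookup stops at the missing argument and never reaches the enclosing environment.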
I can't really follow your example as I am unfamiliar with the rlang library but I think a good example of a closure in R would be:
bucket <- function() {
n <- 1
foo <- function(x) {
assign("n", n+1, envir = parent.env(environment()))
n
}
foo
}
bar <- bucket()
Because the function returned by bucket() (here called bar) is defined inside bucket(), its enclosing environment is the environment of that bucket() call, so it can carry data there. Each time you run it, you modify that environment:
bar()
[1] 2
bar()
[1] 3
bar()
[1] 4
You can make a function with the following code in R, omitting the brackets after the return command, but the return statement does not behave as expected and seems to do nothing:
> func <- function(x) { return; print(x) }
> func(1)
[1] 1
Including the brackets behaves as expected:
> func <- function(x) { return(); print(x)}
> func(1)
NULL
Why? Does a return statement without an argument serve a purpose, and, if not, why doesn't it cause an exception?
I can perhaps offer some insight. In addition to it being legal in R to have a function by itself with no parameters, it is also legal to have a variable on a line with no assignments, function calls, etc. Consider the following code snippet:
x <- c(1,2,3)
x
print(x)
print
Here is the output from that:
[1] 1 2 3
[1] 1 2 3
function (x, ...)
UseMethod("print")
<bytecode: 0xbb87b8>
<environment: namespace:base>
In other words, from the console the default behavior for a variable or function by itself is to print information about that variable or function. So there clearly is defined behavior in this case, and it seems to be that the function does not get called. This makes less sense perhaps when this is happening inside another function, though it definitely seems that R has behavior defined for this.
function(x) { return(); print(x) } calls return() as a function. function(x) { return; print(x) } references return as an ordinary object. Here is the difference.
return # Just show the function body.
## .Primitive("return")
return() # Actually call the function.
## Error: no function to return from, jumping to top level
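The two cases can be put side by side inside functions. This is a sketch with made-up function names:

```r
no_call <- function(x) {
  return            # only looks up the object named `return`; nothing is called
  "still running"
}

with_call <- function(x) {
  return()          # calls return with no value, so the function returns NULL
  "never reached"
}

a <- no_call(1)
b <- with_call(1)
print(a)
print(b)
print(is.primitive(return))
```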
The function testfun1, defined below, does what I want it to do. (For the reasoning of all this, see the background info below the code example.) The question I wanted to ask you is why what I tried in testfun2 doesn't work. To me, both appear to be doing the exact same thing. As shown by the print in testfun2, the evaluation of the helper function inside testfun2 takes place in the correct environment, but the variables from the main function environment get magically passed to the helper function in testfun1, but not in testfun2. Does anyone of you know why?
helpfun <- function(){
x <- x^2 + y^2
}
testfun1 <- function(x,y){
xy <- x*y
environment(helpfun) <- sys.frame(sys.nframe())
x <- eval(as.call(c(as.symbol("helpfun"))))
return(list(x=x,xy=xy))
}
testfun1(x = 2,y = 1:3)
## works as intended
eval.here <- function(fun){
environment(fun) <- parent.frame()
print(environment(fun))
eval(as.call(c(as.symbol(fun))))
}
testfun2 <- function(x,y){
print(sys.frame(sys.nframe()))
xy <- x*y
x <- eval.here("helpfun")
return(list(x=x,xy=xy))
}
testfun2(x = 2,y = 1:3)
## helpfun can't find variable 'x' despite having the same environment as in testfun1...
Background info: I have a large R code in which I want to call helperfunctions inside my main function. They alter variables of the main function environment. The purpose of all this is mainly to unclutter my code. (Main function code is currently over 2000 lines, with many calls to various helperfunctions which themselves are 40-150 lines long...)
Note that the number of arguments to my helper functions is very high, so the traditional explicit passing of function arguments ("helpfun(arg1 = arg1, arg2 = arg2, ... , arg50 = arg50)") would be cumbersome and wouldn't yield the uncluttering of the code that I am aiming for. Therefore, I need to pass the variables from the parent frame to the helper functions implicitly.
Use this instead:
eval.here <- function(fun){
fun <- get(fun)
environment(fun) <- parent.frame()
print(environment(fun))
fun()
}
Result:
> testfun2(x = 2,y = 1:3)
<environment: 0x0000000013da47a8>
<environment: 0x0000000013da47a8>
$x
[1] 5 8 13
$xy
[1] 2 4 6
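A likely reason the original eval.here() failed is that fun was still a character string when environment(fun) <- parent.frame() ran, so the helpfun that eval() later looked up kept its own environment. The following self-contained sketch of the corrected approach resolves the name to a function first; eval_here and testfun are names chosen for this example:

```r
helpfun <- function() {
  x^2 + y^2   # x and y are free variables, looked up in the function's environment
}

eval_here <- function(fun) {
  # Resolve a name to the actual function object before touching its environment
  if (is.character(fun)) fun <- get(fun, mode = "function")
  environment(fun) <- parent.frame()  # the caller's frame becomes the enclosure
  fun()
}

testfun <- function(x, y) {
  xy <- x * y
  x <- eval_here("helpfun")
  list(x = x, xy = xy)
}

res <- testfun(x = 2, y = 1:3)
print(res)
```

Only the local copy of the function is modified, so the global helpfun keeps its original environment.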