R: Exit from the calling function

In R, is there a way to exit from the calling function and return a value? Something like return(), but from the parent function?
parent <- function(){
child()
# stuff afterward should not be executed
}
child <- function(){
returnFromParent("a message returned by parent()")
}
It seems stop() is doing something like that. What I want to do is to write a small replacement for stop() that returns the message that stop() writes to stderr.
Update after G5W's suggestion: I have a large number of checks, each resulting in a stop() if the test fails, but subsequent conditions cannot be evaluated if earlier checks fail, so the function must exit after a failing one. To do this 'properly', I would have to build up a huge if else construct, which I wanted to avoid.
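For illustration, a minimal sketch of the kind of check chain meant here (the function and the individual checks are made up); each later check is only safe to run once the earlier ones have passed:
validate_input <- function(x) {
if (!is.data.frame(x)) stop("x must be a data frame")
if (nrow(x) == 0) stop("x has no rows") # only safe once we know x is a data frame
if (!"id" %in% names(x)) stop("x must contain an 'id' column")
# ... many more checks ...
invisible(TRUE)
}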

Got it. I guess I was looking for something like this:
parent <- function(){
parent_killing_child()
print("do not run this")
}
parent_killing_child <- function(){
do.call(return, list("my message"), envir = sys.frame(-1))
}
parent()
Thanks for all the advice.

Disclaimer: This sounds like an XY problem. Printing the stop message to stdout has little to no value; if interactive it should not be a problem, and if in a script just use the usual redirection 2>&1 to write stderr messages to stdout, or maybe use sink as in the answer to this question.
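For reference, a rough sketch of those two options (assuming a script run non-interactively, e.g. with Rscript; the script name is made up, adjust to your setup):
# From the shell: the usual redirection sends stderr to stdout
# Rscript my_script.R 2>&1
# From within R: divert the message stream (messages, warnings, errors) to stdout
sink(stdout(), type = "message")
# ... code that may call stop() ...
sink(type = "message") # undo the diversion when done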
Now, if I understood properly what you're after, I'd do something like the following to avoid too much code refactoring.
First define a function to handle errors:
my_stop <- function() {
e <- geterrmessage()
print(e)
}
Now configure the system to send errors to your function (error handler) and suppress error messages:
options(error = my_stop)
options(show.error.messages=FALSE)
Now let's test it:
f1 <- function() {
f2()
print("This should not be seen")
}
f2 <- function() {
stop("This is a child error message")
}
Output:
> f1()
[1] "Error in f2() : This is a child error message\n"

For the parent function, make a list of tests. Then loop over the tests, and return your message at the first failed test. Subsequent tests will not be executed after the first failure.
Sample code:
test1 <- function(){criteria <- T; return(ifelse(criteria,T,F))}
test2 <- function(){criteria <- F; return(ifelse(criteria,T,F))}
test3 <- function(){criteria <- T; return(ifelse(criteria,T,F))}
parent <- function() {
tests <- c('test1', 'test2', 'test3')
for (i in 1:length(tests)) {
passed <- do.call(tests[i],args = list())
#print(passed)
if (!passed){
return(paste("Testing failed on test ", i, ".", sep=''))
}
}
return('Congrats! All tests passed!')
}
parent()

Update
Kudos to @chris for their clever application of do.call() in their successful solution.
In the five years since then, the tidyverse team has released the rlang package, which provides the apt function rlang::return_from() in tandem with rlang::return_to().
While base::return() can only return from the current local frame,
these two functions will return from any frame on the current
evaluation stack, between the global and the currently active context.
They provide a way of performing arbitrary non-local jumps out of the
function currently under evaluation.
Solution
Thus, you can simply do
child <- function() {
rlang::return_from(
# Return from the parent context (1 frame back).
frame = rlang::caller_env(n = 1),
# Return the message text.
value = "some text returned by parent()"
)
}
where the parent is identified via rlang::caller_env().
Results
When called from a parent() function
parent <- function() {
child()
# stuff afterward should not be executed
return("text that should NOT be returned by parent()")
}
the child() function will force parental behavior like this:
parent()
#> [1] "some text returned by parent()"
Bonus
See my solution here for throwing an error from a parent (or from any arbitrary "ancestor").

Related

R tryCatch() - referencing return of expr() in finally? [duplicate]

I am trying to write a function to handle execution of batch jobs,
logging errors and stats of the job results.
Is there a way to reference the return value of the expr block from the finally block?
my_do <- function(FUN, ...){
result <- tryCatch({
FUN(...)
},
error = function(e) {
message("error.")
},
finally = {
# how can I reference the returning value of FUN(...) in the finally block?
# so that, for example, I can write code like this:
message(paste("Result dimensions:", dim(expr_result)))
}
)
return(result)
}
If the tryCatch return value is being saved into a variable, such as
x <- tryCatch({ 1; }, finally = { message("value is ", x); })
# Error in message("value is ", x) : object 'x' not found
then the answer is no, since the x object does not exist when tryCatch executes finally=.
However, the code block operates within the parent environment, so you can do this instead:
tryCatch({ x <- 1; }, finally = { message("value is ", x); })
# value is 1
x
# [1] 1
This relies on the return value being set without error. If there's an error somewhere in the execution, then ... obviously there will be no value to retrieve.
I suggest that this is not the best way to use finally.
The following are best practices for using finally (http://adv-r.had.co.nz/Exceptions-Debugging.html):
It specifies a block of code (not a function) to run regardless of whether the initial expression succeeds or fails. This can be useful for clean up (e.g., deleting files, closing connections). This is functionally equivalent to using on.exit() but it can wrap smaller chunks of code than an entire function.
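As a small illustration of that equivalence (a sketch; the function and file handling are made up), the on.exit() form guards a whole function body, whereas finally can wrap just the risky chunk:
read_first_line <- function(path) {
con <- file(path, "r")
on.exit(close(con), add = TRUE) # runs on normal return and on error
readLines(con, n = 1)
}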

Why is the expression after return's parenthesis checked for lexical correctness, but not evaluated?

Consider the following code:
a = function() {
return (23)
}
b = function() {
return (23) * 23
}
c = function() {
return (23) * someUndefinedVariable
}
All of the above run successfully (if called) and return 23.
I assumed that R ignores everything that goes after the closing parenthesis of return, but it does not really, because this code fails during code loading:
d = function() {
return (23) something
}
My assumption is that in the latter example some lexer or parser fails. But in the former, the expression is parsed as (return(23)) * someUndefinedVariable (because return is treated like a function), but evaluation stops at return and therefore R never tries to find someUndefinedVariable.
Does that sounds ok? Is that the reason? Is such behavior intended? Can I enable some warnings so that interpreter tells me about such 'unreachable code'?
The failure of this code:
d = function() {
return (23) something
}
... has nothing to do with the prior code and everything to do with the inability to parse: return (23) something. Unlike the earlier misguided attempt to redefine c which had a valid/parseable function body, the d-body is incapable of being put into a functional form. The parser doesn't really stop at return(23) but rather after it tokenizes something and "realizes" that it is not a semicolon or an infix function name. So the R interpreter now has two expressions and no valid connector/separator between them.
The referenced objects inside R function bodies at the time of definition do not get evaluated or even checked for existence in the parameter list or outside the function. (R is not a compiler.)
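A one-line illustration of that point (the variable name is invented): the function defines without complaint, and the lookup only fails when it is called.
f <- function() undefinedVariable + 1 # parses and defines fine
f()
# Error in f() : object 'undefinedVariable' not found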
R parses the statement before it is evaluated:
parse(text = "funky <- function(x) {
return(x) * dog
}")
returns:
expression(funky <- function(x) {
return(x) * dog
})
However,
parse(text = "funky <- function(x) {
return(x) dog
}")
returns:
Error in parse(text = "funky <- function(x) {\n return(x) dog\n}") :
<text>:2:19: unexpected symbol
1: funky <- function(x) {
2: return(x) dog
^
In the above example, even though the variable dog doesn't exist (and comes after return), R is still able to parse it because it is syntactically correct code.
return is not just "treated like a function", it is a function. And anytime it's called, the code path will exit from whatever function you're in at that moment.
So that means that by the time R would have gotten to multiplying the result of return by 23, it's all over, that evaluation stops, and there are no errors or warnings to report (just like there are no warnings or errors when you return inside some if condition).
Whereas your last function simply cannot be parsed (which more or less means that the expression is put into a function tree), and so that (function) object can't be created.
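To make the same point concretely (a small made-up example): once return() fires, the rest of the expression is simply never evaluated, not even a stop().
f <- function() {
return(23) * stop("never evaluated")
}
f()
# [1] 23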

Capture Arbitrary Conditions with `withCallingHandlers`

The Problem
I'm trying to write a function that will evaluate code and store the results, including any possible conditions signaled in the code. I've got this working perfectly fine, except for the situation when my function (let's call it evalcapt) is run within an error handling expression.
The problem is that withCallingHandlers will keep looking for matching condition handlers, and if someone has defined such a handler outside of my function, my function loses control of execution. Here is a simplified example of the problem:
evalcapt <- function(expr) {
conds <- list()
withCallingHandlers(
val <- eval(expr),
condition=function(e) {
message("Caught condition of class ", deparse(class(e)))
conds <<- c(conds, list(e))
} )
list(val=val, conditions=conds)
}
myCondition <- simpleCondition("this is a custom condition")
class(myCondition) <- c("custom", class(myCondition))
expr <- expression(signalCondition(myCondition), 25)
tryCatch(evalcapt(expr))
Works as expected
Caught condition of class c("custom", "simpleCondition", "condition")
$val
[1] 25
$conditions
$conditions[[1]]
<custom: this is a custom condition>
but:
tryCatch(
evalcapt(expr),
custom=function(e) stop("Hijacked `evalcapt`!")
)
Doesn't work:
Caught condition of class c("custom", "simpleCondition", "condition")
Error in value[[3L]](cond) : Hijacked `evalcapt`!
A Solution I don't Know How To Implement
What I really need is a way of defining a restart right after the condition is signaled in the code which frankly is the way withCallingHandlers appears to work normally (when my handler is the last available handler), but I don't see the restart established when I browse in my handling function and use computeRestarts.
Things That Seem Like Solutions That Won't Work
Use tryCatch
tryCatch does not have the same problem as withCallingHandlers because it does not continue looking for handlers after it finds the first one. The big problem with it is that it also does not continue to evaluate the code after the condition. If you look at the example that worked above, but sub in tryCatch for withCallingHandlers, the value (25) does not get returned because execution is brought back to the tryCatch frame after the condition is handled.
So basically, I'm looking for a hybrid between tryCatch and withCallingHandlers, one that returns control to the condition signaler, but also stops looking for more handlers after the first one is found.
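For concreteness, a small sketch (not from the original post, but reusing myCondition from above) of the tryCatch behavior just described: the exiting handler unwinds the stack, so the 25 after the signal is never reached.
tryCatch({
signalCondition(myCondition)
25 # never evaluated: the exiting handler below runs first
}, custom = function(e) "handled, but the 25 is lost")
# [1] "handled, but the 25 is lost"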
Break Up The Expression Into Sub-expression, then Use tryCatch
Okay, but how do you break up the following (and more complex functions with conditions signaled all over the place):
fun <- function(myCondition) {
signalCondition(myCondition)
25
}
expr <- expression(fun())
Misc
I looked for the source code associated with the .Internal(.signalCondition()) call to see if I can figure out if there is a behind the scenes restart being set, but I'm out of my depth there. It seems like:
void R_ReturnOrRestart(SEXP val, SEXP env, Rboolean restart)
{
int mask;
RCNTXT *c;
mask = CTXT_BROWSER | CTXT_FUNCTION;
for (c = R_GlobalContext; c; c = c->nextcontext) {
if (c->callflag & mask && c->cloenv == env)
findcontext(mask, env, val);
else if (restart && IS_RESTART_BIT_SET(c->callflag))
findcontext(CTXT_RESTART, c->cloenv, R_RestartToken);
else if (c->callflag == CTXT_TOPLEVEL)
error(_("No function to return from, jumping to top level"));
}
}
from src/main/errors.c is doing some of that restart invocation, and this is called by do_signalCondition, but I don't have a clue how I would go about messing with this.
I think what you're looking for is to use withRestarts when your special condition is signaled, like from warning:
withRestarts({
.Internal(.signalCondition(cond, message, call))
.Internal(.dfltWarn(message, call))
}, muffleWarning = function() NULL)
so
evalcapt <- function(expr) {
conds <- list()
withCallingHandlers(
val <- eval(expr),
custom=function(e) {
message("Caught condition of class ", deparse(class(e)))
conds <<- c(conds, list(e))
invokeRestart("muffleCustom")
} )
list(val=val, conditions=conds)
}
expr <- expression(withRestarts({
signalCondition(myCondition)
}, muffleCustom=function() NULL), 25)
leads to
> tryCatch(evalcapt(expr))
Caught condition of class c("custom", "simpleCondition", "condition")
$val
[1] 25
$conditions
$conditions[[1]]
<custom: this is a custom condition>
> tryCatch(
+ evalcapt(expr),
+ custom=function(e) stop("Hijacked `evalcapt`!")
+ )
Caught condition of class c("custom", "simpleCondition", "condition")
$val
[1] 25
$conditions
$conditions[[1]]
<custom: this is a custom condition>
As far as I can tell there isn't and can't be a simple solution to this problem (I'm happy to be proven wrong). The source of the problem can be seen if we look at how tryCatch and withCallingHandlers register the handlers:
.Internal(.addCondHands(name, list(handler), parentenv, environment(), FALSE)) # tryCatch
.Internal(.addCondHands(classes, handlers, parentenv, NULL, TRUE)) # withCallingHandlers
The key point is the last argument, FALSE in tryCatch, TRUE in withCallingHandlers. This argument leads to the gp bit getting set by do_addCondHands > mkHandlerEntry in src/main/errors.c.
That same bit is then consulted by do_signalCondition (still in src/main/errors.c) when a condition is signaled:
// simplified code excerpt from do_signalCondition
PROTECT(oldstack = R_HandlerStack);
while ((list = findConditionHandler(cond)) != R_NilValue) {
SEXP entry = CAR(list);
R_HandlerStack = CDR(list);
if (IS_CALLING_ENTRY(entry)) { // <<------------- Consult GP bit
... // Evaluate handler
} else gotoExitingHandler(cond, ecall, entry); // Evaluate handler and exit
}
R_HandlerStack = oldstack;
return R_NilValue;
Basically, if the GP bit is set, then we evaluate the handler and keep iterating through the handler stack. If it isn't set, then we run gotoExitingHandler, which runs the handler but then returns control to the handling control structure rather than resuming the code where the condition was signaled.
Since the GP bit can only tell you to do one of two things, there is no straightforward way to modify the behavior of this call (i.e. you either iterate through all the handlers if using withCallingHandlers, or you stop at the first matching one registered by tryCatch).
I toyed with the idea of tracing signalCondition to add a restart there, but that seems too hackish.
With a bit of C you can evaluate an expression within a ToplevelExec() to isolate it from all handlers registered on the stack.
We will expose it at R level in the next rlang version.
I may be a bit late, but I've been digging into the condition-system as well, and I think I've found some other solutions.
But first: some reasons why this is necessarily a hard problem, not something that can easily be solved generally.
The question is which function is signalling a condition, and whether this function can continue execution if it throws a condition. Errors are implemented as "just a condition" as well, but most functions don't expect to be continued after they've thrown a stop().
And some functions may pass on a condition, expecting not to be bothered by it again.
Normally, this means that control can only be returned after a stop if a function has explicitly said it can accept that: with a restart provided.
There may also be other serious conditions that can be signalled, and if a function expects such a condition to always be caught, and you force it to return execution, things break badly.
What should happen if you had written it as follows and execution were to resume?
myfun <- function(deleteFiles=NULL) {
if (!all(haveRights(deleteFiles))) stop("Access denied")
file.remove(deleteFiles)
}
withCallingHandlers(val <- eval(myfun(myProtectedFiles)),
error=function(e) message("I'm just going to ignore everything..."))
If no other handlers are called (which alert the user that stop has been called), the files would be removed, even though this function has a (small) safeguard against that.
In the case of an error this is clear, but there could also be similar cases for other conditions, so I think that's the main reason R doesn't really support cutting off the propagation of a condition unless it means halting.
Nonetheless, I think I've found 2 ways of hacking your problem.
The first is simply executing expr step by step, which is quite close to Martin Morgan's solution, but moves the withRestarts into your function:
evalcapt <- function(expr) {
conds <- list()
for (i in seq_along(expr)) {
withCallingHandlers(
val <- withRestarts(
eval(expr[[i]]),
muffleCustom = function()
NULL
),
custom = function(e) {
message("Caught condition of class ", deparse(class(e)))
conds <<- c(conds, list(e))
invokeRestart(findRestart("muffleCustom"))
})
}
list(val = val, conditions = conds)
}
The main disadvantage is that this doesn't dig into functions; expr is executed instruction by instruction at the level it is called.
So if you call evalcapt(myfun()), the for-loop sees this as one instruction. And this one instruction throws a condition --> so does not return --> so you can't see any output that would have been there had you not been catching anything.
OTOH, evalcapt(expression(signalCondition(myCondition), 25)) does work as requested, as this is an expression with 2 elements, each of which is called.
If you want to go hardcore, I think you could try evaluating myfun() step-by-step, but there is always the question how deep you want to go. If myfun() calls myotherfun(), which calls myotherotherfun(), do you want to return control to the point where myfun failed, or myotherfun, or myotherotherfun?
Basically, it's just a guess about what level you want to halt execution, and where you want to resume.
So a second solution: hijack any call to signalCondition. This means you'll probably end up at a quite deep level, although not the very deepest (no primitives, or code that calls .signalCondition).
I think this works best if you're really sure that your custom condition is only thrown by code that is written by you: it means that execution resumes directly after signalCondition.
Which gives me this function:
evalcapt <- function(expr) {
if(exists('conds', parent.frame(), inherits=FALSE)) {
conds_backup <- get('conds', parent.frame(), inherits=FALSE)
on.exit(assign('conds', conds_backup, parent.frame(), inherits=FALSE), add=TRUE)
} else {
on.exit(rm('conds', pos=parent.frame(), inherits=FALSE), add=TRUE)
}
assign('conds', list(), parent.frame(), inherits=FALSE)
origsignalCondition <- signalCondition
if(exists('signalCondition', parent.frame(), inherits=FALSE)) {
signal_backup <- get('signalCondition', parent.frame(), inherits=FALSE)
on.exit(assign('signalCondition', signal_backup, parent.frame(), inherits=FALSE), add=TRUE)
} else {
on.exit(rm('signalCondition', pos=parent.frame(), inherits=FALSE), add=TRUE)
}
assign('signalCondition', function(e) {
if(is(e, 'custom')) {
message("Caught condition of class ", deparse(class(e)))
conds <<- c(conds, list(e))
} else {
origsignalCondition(e)
}
}, parent.frame())
val <- eval(expr, parent.frame())
list(val=val, conditions=conds)
}
It looks way messier, but that's mostly because there are more issues with which environment to use. The differences are that here I use the calling environment as context, and to hijack signalCondition() it needs to be defined there too. And afterwards we need to clean up.
But the main use is overwriting signalCondition: if we see a custom error we log it, and return control. If it's another condition, we pass on control.
Here there may be some smaller disadvantages:
You may end up in a deeper function, where the bug is the way myfun calls myotherfun, but you end up in myotherfun (or deeper).
It only catches occurrences where signalCondition is called. If you call e.g. warning(myCondition), nothing is caught.
If a function in another package/another environment calls signalCondition, then it uses its own search path, meaning our signalCondition might be bypassed and base::signalCondition used instead.
When debugging, it's a lot uglier. Variables are assigned in environments where you don't expect them (and then disappear when you exit a function), the scope of different functions may be unclear, parent.frame() might give other results than you'd expect, etc.
And as said before: all functions must be able to handle re-entrance after throwing a condition.

break/exit script

I have a program that does some data analysis and is a few hundred lines long.
Very early on in the program, I want to do some quality control and if there is not enough data, I want the program to terminate and return to the R console. Otherwise, I want the rest of the code to execute.
I've tried break, browser, and quit and none of them stop the execution of the rest of the program (and quit stops the execution as well as completely quitting R, which is not something I want to happen). My last resort is creating an if-else statement as below:
if(n < 500){}
else{*insert rest of program here*}
but that seems like bad coding practice. Am I missing something?
You could use the stopifnot() function if you want the program to produce an error:
foo <- function(x) {
stopifnot(x > 500)
# rest of program
}
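Called with a value that fails the check, this produces something like the following (exact wording can vary by R version):
foo(100)
# Error: x > 500 is not TRUE
foo(1000) # passes the check, so the rest of the program runs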
Perhaps you just want to stop executing a long script at some point, i.e. as if you wanted to hard-code an exit() in C or Python.
print("this is the last message")
stop()
print("you should not see this")
Edited, thanks to @Droplet, who found a way to make this work without .Internal(). Here is a way to implement an exit() command in R.
exit <- function() { invokeRestart("abort") }
print("this is the last message")
exit()
print("you should not see this")
Only lightly tested, but when I run this, I see "this is the last message" and then the script aborts without any error message.
Below is the uglier version from my original answer.
exit <- function() {
.Internal(.invokeRestart(list(NULL, NULL), NULL))
}
Reverse your if-else construction:
if(n >= 500) {
# do stuff
}
# no need for else
Edit: It seems the OP is running a long script; in that case one only needs to wrap the part of the script after the quality control with
if (n >= 500) {
.... long running code here
}
If breaking out of a function, you'll probably just want return(), either explicitly or implicitly.
For example, an explicit double return
foo <- function(x) {
if(x < 10) {
return(NA)
} else {
xx <- seq_len(x)
xx <- cumsum(xx)
}
xx ## return(xx) is implied here
}
> foo(5)
[1] NA
> foo(10)
[1] 1 3 6 10 15 21 28 36 45 55
By return() being implied, I mean that the last line is as if you'd done return(xx), but it is slightly more efficient to leave off the call to return().
Some consider using multiple returns bad style; in long functions, keeping track of where the function exits can become difficult or error prone. Hence an alternative is to have a single return point, but change the return object using the if () else () clause. Such a modification to foo() would be
foo <- function(x) {
## out is NA or cumsum(xx) depending on x
out <- if(x < 10) {
NA
} else {
xx <- seq_len(x)
cumsum(xx)
}
out ## return(out) is implied here
}
> foo(5)
[1] NA
> foo(10)
[1] 1 3 6 10 15 21 28 36 45 55
This is an old question but there is no clean solution yet. This probably is not answering this specific question, but those looking for answers on 'how to gracefully exit from an R script' will probably land here. It seems that the R developers forgot to implement an exit() function. Anyway, the trick I've found is:
continue <- TRUE
tryCatch({
# You do something here that needs to exit gracefully without error.
...
# We now say bye-bye
stop("exit")
}, error = function(e) {
if (e$message != "exit") {
# Your error message goes here. E.g.
stop(e)
}
continue <<- FALSE
})
if (continue) {
# Your code continues here
...
}
cat("done.\n")
Basically, you use a flag to indicate the continuation or not of a specified block of code. Then you use the stop() function to pass a customized message to the error handler of a tryCatch() function. If the error handler receives your message to exit gracefully, then it just ignores the error and set the continuation flag to FALSE.
Here:
if(n < 500)
{
# quit()
# or
# stop("this is some message")
}
else
{
*insert rest of program here*
}
Both quit() and stop(message) will quit your script. If you are sourcing your script from the R command prompt, then quit() will exit from R as well.
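For a standalone script, a hedged sketch of that early-exit check (the variable n and the threshold come from the question; run e.g. via Rscript):
if (n < 500) {
message("Not enough data (n = ", n, "); exiting.")
quit(save = "no", status = 1) # or stop("not enough data") if you want an error instead
}
# rest of the analysis runs only when n >= 500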
I had a similar issue: I wanted to exit the current function without executing the rest of its code.
I finally solved it with a for() loop that runs only once. Inside the for loop you can set several different conditions for leaving the current loop (and hence the function).
for (i in T) {
print('hello')
if (leave.condition) next
print('good bye')
}
You can use the pskill function in the R "tools" package to interrupt the current process and return to the console. Concretely, I have the following function defined in a startup file that I source at the beginning of each script. You can also copy it directly at the start of your code, however. Then insert halt() at any point in your code to stop script execution on the fly. This function works well on GNU/Linux and judging from the R documentation, it should also work on Windows (but I didn't check).
# halt: interrupts the current R process; a short idle time prevents R from
# outputting further results before the SIGINT (= Ctrl-C) signal is received
halt <- function(hint = "Process stopped.\n") {
writeLines(hint)
require(tools, quietly = TRUE)
processId <- Sys.getpid()
pskill(processId, SIGINT)
idleTime <- 1.00
Sys.sleep(idleTime)
}

Environment chaining in R

In my R development I need to wrap function primitives in proto objects so that a number of arguments can be automatically passed to the functions when the $perform() method of the object is invoked. The function invocation internally happens via do.call(). All is well, except when the function attempts to access variables from the closure within which it is defined. In that case, the function cannot resolve the names.
Here is the smallest example I have found that reproduces the behavior:
library(proto)
make_command <- function(operation) {
proto(
func = operation,
perform = function(., ...) {
func <- with(., func) # unbinds proto method
do.call(func, list(), envir=environment(operation))
}
)
}
test_case <- function() {
result <- 100
make_command(function() result)$perform()
}
# Will generate error:
# Error in function () : object 'result' not found
test_case()
I have a reproducible testthat test that also outputs a lot of diagnostic output. The diagnostic output has me stumped. By looking up the parent environment chain, my diagnostic code, which lives inside the function, finds and prints the very same variable the function fails to find. See this gist.
How can the environment for do.call be set up correctly?
This was the final answer after an offline discussion with the poster:
make_command <- function(operation) {
proto(perform = function(.) operation())
}
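With that simplified version, the original test case from the question now resolves result from the closure where the anonymous function was defined (a quick check, not part of the original exchange):
test_case <- function() {
result <- 100
make_command(function() result)$perform()
}
test_case()
# [1] 100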
I think the issue here is clearer and easier to explore if you:
1. Replace the anonymous function within make_command() with a named one.
2. Make that function open a browser() (instead of trying to get result). That way you can look around to see where you are and what's going on.
Try this, which should clarify the cause of your problem:
test_case <- function() {
result <- 100
myFun <- function() browser()
make_command(myFun)$perform()
}
test_case()
## Then from within the browser:
#
parent.env(environment())
# <environment: 0x0d8de854>
# attr(,"class")
# [1] "proto" "environment"
get("result", parent.env(environment()))
# Error in get("result", parent.env(environment())) :
# object 'result' not found
#
parent.frame()
# <environment: 0x0d8ddfc0>
get("result", parent.frame()) ## (This works, so points towards a solution.)
# [1] 100
Here's the problem. Although you think you're evaluating myFun(), whose environment is the evaluation frame of test_case(), your call to do.call(func, ...) is really evaluating func(), whose environment is the proto environment within which it was defined. After looking for and not finding result in its own frame, the call to func() follows the rules of lexical scoping, and next looks in the proto environment. Neither it nor its parent environment contains an object named result, resulting in the error message you received.
If this doesn't immediately make sense, you can keep poking around within the browser. Here are a few further calls you might find helpful:
environment(get("myFun", parent.frame()))
ls(environment(get("myFun", parent.frame())))
environment(get("func", parent.env(environment())))
ls(environment(get("func", parent.env(environment()))))
