Advanced error handling: systematically try a range of handlers - r

Another follow up to this and this.
Actual question
Question 1
Upon running into some condition (say a simpleError), how can I invoke a respective restart handler that systematically tests a range of actual handler functions until one is found that does not result in another condition? If the last available handler has been tried, the default abortion restart handler should be invoked (invokeRestart("abort")). The implementation should allow for a flexible specification of the actual "handler suite" to use.
Question 2
I don't understand how a) the a test function that is specified alongside a restart handler works and b) where it would make sense to use it. Any suggestions? A short example would be great!
The help page of withRestarts says:
The most flexible form of a restart specification is as a list that can include several fields, including handler, description, and test. The test field should contain a function of one argument, a condition, that returns TRUE if the restart applies to the condition and FALSE if it does not; the default function returns TRUE for all conditions.
For those interested in more details
Below you'll find my first approach with respect to question 1, but I'm sure there's something much more cleaner/more straight-forward out there ;-)
foo <- function(x, y) x + y
fooHandled <- function(
x,
y,
warning=function(cond, ...) {
invokeRestart("warninghandler", cond=cond, ...)},
error=function(
cond,
handlers=list(
expr=expression(x+"b"),
expr=expression(x+"c"),
expr=expression(x+100)
),
...) {
invokeRestart("errorhandler", cond=cond, handlers=handlers, ...)
}
) {
expr <- expression(foo(x=x, y=y))
withRestarts(
withCallingHandlers(
expr=eval(expr),
warning=warning,
error=error
),
warninghandler=function(cond, ...) warning(cond),
errorhandler=function(cond, handlers, ...) {
idx <- 1
do.continue <- TRUE
while (do.continue) {
message(paste("handler:", idx))
expr <- handlers[[idx]]
out <- withRestarts(
tryCatch(
expr=eval(expr),
error=function(cond, ...) {
print(cond)
message("trying next handler ...")
return(cond)
}
)
)
idx <- idx + 1
do.continue <- inherits(out, "simpleError")
}
return(out)
}
)
}
> fooHandled(x=1, y=1)
[1] 2
> fooHandled(x=1, y="a")
handler: 1
<simpleError in x + "b": non-numeric argument to binary operator>
trying next handler ...
handler: 2
<simpleError in x + "c": non-numeric argument to binary operator>
trying next handler ...
handler: 3
[1] 101
EDIT
I'd also be interested in hearing how to substitute the tryCatch part with a withCallingHandlers part. Seems like withCallingHandlers() doesn't really return anything that could be picked up to determine the value of do.continue

Related

How to skip the error file and continue to read the next one when batch reading files in R [duplicate]

I've read a few other SO questions about tryCatch and cuzzins, as well as the documentation:
Exception handling in R
catching an error and then branching logic
How can I check whether a function call results in a warning?
Problems with Plots in Loop
but I still don't understand.
I'm running a loop and want to skip to next if any of a few kinds of errors occur:
for (i in 1:39487) {
# EXCEPTION HANDLING
this.could.go.wrong <- tryCatch(
attemptsomething(),
error=function(e) next
)
so.could.this <- tryCatch(
doesthisfail(),
error=function(e) next
)
catch.all.errors <- function() { this.could.go.wrong; so.could.this; }
catch.all.errors;
#REAL WORK
useful(i); fun(i); good(i);
} #end for
(by the way, there is no documentation for next that I can find)
When I run this, R honks:
Error in value[[3L]](cond) : no loop for break/next, jumping to top level
What basic point am I missing here? The tryCatch's are clearly within the for loop, so why doesn't R know that?
The key to using tryCatch is realising that it returns an object. If there was an error inside the tryCatch then this object will inherit from class error. You can test for class inheritance with the function inherit.
x <- tryCatch(stop("Error"), error = function(e) e)
class(x)
"simpleError" "error" "condition"
Edit:
What is the meaning of the argument error = function(e) e? This baffled me, and I don't think it's well explained in the documentation. What happens is that this argument catches any error messages that originate in the expression that you are tryCatching. If an error is caught, it gets returned as the value of tryCatch. In the help documentation this is described as a calling handler. The argument e inside error=function(e) is the error message originating in your code.
I come from the old school of procedural programming where using next was a bad thing. So I would rewrite your code something like this. (Note that I removed the next statement inside the tryCatch.):
for (i in 1:39487) {
#ERROR HANDLING
possibleError <- tryCatch(
thing(),
error=function(e) e
)
if(!inherits(possibleError, "error")){
#REAL WORK
useful(i); fun(i); good(i);
}
} #end for
The function next is documented inside ?for`.
If you want to use that instead of having your main working routine inside an if, your code should look something like this:
for (i in 1:39487) {
#ERROR HANDLING
possibleError <- tryCatch(
thing(),
error=function(e) e
)
if(inherits(possibleError, "error")) next
#REAL WORK
useful(i); fun(i); good(i);
} #end for
I found other answers very confusing. Here is an extremely simple implementation for anyone who wants to simply skip to the next loop iteration in the event of an error
for (i in 1:10) {
skip_to_next <- FALSE
# Note that print(b) fails since b doesn't exist
tryCatch(print(b), error = function(e) { skip_to_next <<- TRUE})
if(skip_to_next) { next }
}
for (i in -3:3) {
#ERROR HANDLING
possibleError <- tryCatch({
print(paste("Start Loop ", i ,sep=""))
if(i==0){
stop()
}
}
,
error=function(e) {
e
print(paste("Oops! --> Error in Loop ",i,sep = ""))
}
)
if(inherits(possibleError, "error")) next
print(paste(" End Loop ",i,sep = ""))
}
The only really detailed explanation I have seen can be found here: http://mazamascience.com/WorkingWithData/?p=912
Here is a code clip from that blog post showing how tryCatch works
#!/usr/bin/env Rscript
# tryCatch.r -- experiments with tryCatch
# Get any arguments
arguments <- commandArgs(trailingOnly=TRUE)
a <- arguments[1]
# Define a division function that can issue warnings and errors
myDivide <- function(d, a) {
if (a == 'warning') {
return_value <- 'myDivide warning result'
warning("myDivide warning message")
} else if (a == 'error') {
return_value <- 'myDivide error result'
stop("myDivide error message")
} else {
return_value = d / as.numeric(a)
}
return(return_value)
}
# Evalute the desired series of expressions inside of tryCatch
result <- tryCatch({
b <- 2
c <- b^2
d <- c+2
if (a == 'suppress-warnings') {
e <- suppressWarnings(myDivide(d,a))
} else {
e <- myDivide(d,a) # 6/a
}
f <- e + 100
}, warning = function(war) {
# warning handler picks up where error was generated
print(paste("MY_WARNING: ",war))
b <- "changing 'b' inside the warning handler has no effect"
e <- myDivide(d,0.1) # =60
f <- e + 100
return(f)
}, error = function(err) {
# warning handler picks up where error was generated
print(paste("MY_ERROR: ",err))
b <- "changing 'b' inside the error handler has no effect"
e <- myDivide(d,0.01) # =600
f <- e + 100
return(f)
}, finally = {
print(paste("a =",a))
print(paste("b =",b))
print(paste("c =",c))
print(paste("d =",d))
# NOTE: Finally is evaluated in the context of of the inital
# NOTE: tryCatch block and 'e' will not exist if a warning
# NOTE: or error occurred.
#print(paste("e =",e))
}) # END tryCatch
print(paste("result =",result))
One thing I was missing, which breaking out of for loop when running a function inside a for loop in R makes clear, is this:
next doesn't work inside a function.
You need to send some signal or flag (e.g., Voldemort = TRUE) from inside your function (in my case tryCatch) to the outside.
(this is like modifying a global, public variable inside a local, private function)
Then outside the function, you check to see if the flag was waved (does Voldemort == TRUE). If so you call break or next outside the function.

Modifying calls in function arguments

How can a function inspect and modify the arguments of a call that it received as argument?
Application: A user feeds a call to function a as an argument to function b, but they forget to specify one of the required arguments of a. How can function b detect the problem and fix it?
In this minimal example, function a requires two arguments:
a <- function(arg1, arg2) {
return(arg1 + arg2)
}
Function b accepts a call and an argument. The commented lines indicate what I need to do:
b <- function(CALL, arg3) {
# 1. check if `arg2` is missing from CALL
# 2. if `arg2` is missing, plug `arg3` in its place
# 3. return evaluated call
CALL
}
Expected behavior:
b(CALL = a(arg1 = 1, arg2 = 2), arg3 = 3)
> 3
b(CALL = a(arg1 = 1), arg3 = 3)
> 4
The second call currently fails because the user forgot to specify the required arg2 argument. How can function b fix this mistake automatically?
Can I exploit lazy evaluation to modify the call to a before it is evaluated? I looked into rlang::modify_call but couldn't figure it out.
Here's a method that would work
b <- function(CALL, arg3) {
scall <- substitute(CALL)
stopifnot(is.call(scall)) #check that it's a call
lcall <- as.list(scall)
if (!"arg2" %in% names(lcall)) {
lcall <- c(lcall, list(arg2 = arg3))
}
eval.parent(as.call(lcall))
}
We use substitute() to grab the unevaluated version the CALL parameter. We convert it to a list so we can modify it. Then we append to the list another list with the parameter name/value we want. Finally, we turn the list back into a call and then evaluate that call in the environment of the caller rather than in the function body itself.
If you wanted to use rlang::modify_call and other rlang functions you could use
b <- function(CALL, arg3) {
scall <- rlang::enquo(CALL)
stopifnot(rlang::quo_is_call(scall))
if (!"arg2" %in% names(rlang::quo_get_expr(scall))) {
scall <- rlang::call_modify(scall, arg2=arg3)
}
rlang::eval_tidy(scall, env = rlang::caller_env())
}
I don't see why fancy language manipulation is needed. The problem is what to do when a, which requires 2 arguments, is supplied only 1. Wrapping it with b, which has a default value for the 2nd argument, solves this.
b <- function(arg1, arg2=42)
{
a(arg1, arg2)
}
b(1)
# [1] 43
b(1, 2)
# [1] 3

When does initialize check for object validity?

From Chambers' (excellent) Extending R (2016):
A validity method will be called automatically from the default method for initialize(). The recommended form of an initialize method ends with a callNextMethod() call, to ensure that subclass slots can be specified in a call to the generator for the class. If this convention is followed, initialization will end with a call to the default method, and the validity method will be called after all initialization has occurred.
I thought I understood, but the behavior I am getting does not seem to follow this convention.
setClass("A", slots = c(s1 = "numeric"))
setValidity("A", function(object) {
if (length(object#s1) > 5) {
return("s1 longer than 5")
}
TRUE
})
setMethod("initialize", "A", function(.Object, s1, ...) {
if (!missing(s1)) .Object#s1 <- s1 + 4
callNextMethod(.Object, ...)
})
A <- new("A", rep(1.0, 6))
A
# An object of class "A"
# Slot "s1":
# [1] 5 5 5 5 5 5
validObject(A)
# Error in validObject(A) : invalid class “A” object: s1 longer than 5
I expected the validity checking to be done by adding callNextMethod() to the end of the initialize method. Adding an explicit validObject(.Object) before callNextMethod() works, but I am clearly not understanding something here.
Obviously, I can also do all the same checks in the validity method, but ideally all of the validity checking would occur within setValidity so future edits live in one place.
Changing the initialize function slightly gives the desired result -- is there a reason to use one approach over the other? Chambers seems to prefer using .Object#<- whereas I have seen the following method elsewhere (Gentlemman & Hadley).
setMethod("initialize", "A", function(.Object, s1, ...) {
if (!missing(s1)) s1 + 4
else s1 <- numeric()
callNextMethod(.Object, s1 = s1, ...)
})
Perhaps the best guide comes from initialize itself — if you inspect the code for the default method
getMethod("initialize",signature(.Object="ANY"))
then you see that it does indeed contain an explicit call to validObject at the end:
...
validObject(.Object)
}
.Object
}
so if you define your own initialize method, the most similar thing you could do would be to call it at the end of your method, right before you call callNextMethod.
In your case, when you call callNextMethod, that is only checking that the slot you have created is a valid numeric object (which it is), rather than checking the validity of the larger object (which requires the s1 slot to be no longer than 5 elements)

Calling print(ls.str()) in function affect behavior of rep

Begin a new R session with an empty environment. Write a series of functions with a parameter that is to be used as the value of the times parameter in a call to rep().
f <- function(n) {
rep("hello", times = n)
}
f(x)
One expect this to fail, and indeed one gets:
# Error in f(x) : object 'x' not found
Modify the function a bit:
f2 <- function(n) {
ls.str()
rep("hello", times = n)
}
f2(x)
As expected, it still fails:
# Error in f2(x) : object 'x' not found
Modify a bit more (to see the environment in the console):
f3 <- function(n) {
print(ls.str())
rep("hello", times = n)
}
f3(x)
I still expect failure, but instead get:
## n : <missing>
## [1] "hello"
It is as if the call to print() makes rep work as though times were set to 1.
This is not an answer, but too long to post as a comment. A minimal reproducible example is:
f3 <- function(n) {
try(get("n", environment(), inherits=FALSE))
rep("hello", times = n)
}
f3(x)
## Error in get("n", environment(), inherits = FALSE) : object 'x' not found
## [1] "hello"
The following is speculative and based on loosely examining the source for do_rep. get starts the promise evaluation, but upon not finding the "missing" symbol appears to leave the promise partially unevaluated. rep, being a primitive, then attempts to operate on n without realizing that it is a partially evaluated promise and basically that leads implicitly to the assumption that 'n == 1'.
Also, this shows that the promise is in a weird state (have to use browser/debug to see it):
f3a <- function(n) {
try(get("n", environment(), inherits=FALSE))
browser()
rep("hello", times = n)
}
f3a(x)
## Error in get("n", environment(), inherits = FALSE) : object 'x' not found
## Called from: f3a(x)
# Browse[1]> (n)
## Error: object 'x' not found
## In addition: Warning message:
## restarting interrupted promise evaluation
## Browse[1]> c
## [1] "hello"
I received earlier today a report that the bug has been fixed in R-devel and R-patched.
The issue was that the test for missingness in the R sources did not consider the case of an interrupted promise evaluation. A fix has been committed by Luke Tierney and can be seen on GitHub.
f4 <- function(n) {
print('test')
print(ls.str())
print('end test')
rep("hello", times = n)
}
f4(x)
## [1] "test"
## n : <missing>
## [1] "end test"
## [1] "hello"
There's something within print.ls_str, from Frank's test on chat the follwing code exhibit the same problem:
f6 <- function(n) {
z = tryCatch(get("n", new.env(), mode = "any"), error = function(e) e)
rep("A", n)
}
Digging a little inside R source I found the following code
# define GET_VALUE(rval) \
/* We need to evaluate if it is a promise */ \
if (TYPEOF(rval) == PROMSXP) { \
PROTECT(rval); \
rval = eval(rval, genv); \
UNPROTECT(1); \
} \
\
if (!ISNULL(rval) && NAMED(rval) == 0) \
SET_NAMED(rval, 1)
GET_VALUE(rval);
break;
case 2: // get0(.)
if (rval == R_UnboundValue)
return CAD4R(args);// i.e. value_if_not_exists
GET_VALUE(rval);
break;
}
return rval;
}
#undef GET_VALUE
I'm quite surprised this compile properly, as far as I remember (my C is quite far behind) #define doesn't allow spaces between the # and define.
After digging for that, I'm wrong, from gcc doc:
Whitespace is also allowed before and after the `#'.
So there's probably something around this part of code, but that's above my head to pinpoint what exactly.

R: Let users use the console while inside a function, record their inputs and react

Is it possible to write a function in R which will hold its execution, giving the users control over the console (while in interactive mode of course), meanwhile recording their inputs, and continuing execution either:
after a certain input has been made
or after a certain output has been made
or a certain duration of time has passed
Example: ask the user a question (without using readline() for the answer)
question <- function() {
message("How much is 2 + 2?")
#let users take control of the console
#continue to next statement only if they input "2+2", or "4" or a minute has passed
#meanwhile record their last input similar to ".Last.Value", e.g.:
startTime <- Sys.time()
timeout <- FALSE
lastInput <- lastInput()
while (eval(parse(text = lastInput)) != 4 & !timeout) {
if (difftime(Sys.time(), startTime, units = "mins") > 1) {
timeout <- TRUE
}
lastInput <- lastInput()
}
if (timeout) {
stop("Sorry, timeout.")
} else {
message("Correct! Let's continue with this function:")
}
}
Where lastInput() is a function which "listens" to user input when it changes.
Obviously the above structure is tentative and won't give me what I want, some way to "listen" or "observe" and only react when the user inputs something to the console.
The final user experience should be:
> question()
How much is 2+2?
> #I'm the user, I can do whatever
> head(mtcars)
> plot(1:10)
> 3
> 2 + 2
[1] 4
Correct! Let's continue with this function:
Am I too optimistic or is there some R magic for this?
Thanks to #parth I have looked at swirl's source code and got acquainted with the addTaskCallback function. From the help file:
addTaskCallback registers an R function that is to be called each time a top-level task is completed.
And so we can make R check the users input ("top-level task") with a specific function, responding accordingly.
But since the swirl code is very "heavy", I think I need to supply a minimal example:
swirllike <- function(...){
removeTaskCallback("swirllike")
e <- new.env(globalenv())
e$prompt <- TRUE
e$startTime <- Sys.time()
cb <- function(expr, val, ok, vis, data=e){
e$expr <- expr
e$val <- val
e$ok <- ok
e$vis <- vis
# The result of f() will determine whether the callback
# remains active
return(f(e, ...))
}
addTaskCallback(cb, name = "swirllike")
message("How much is 2+2?")
}
OK, so the swirllike function evokes the 2+2 question, but it also declares a new environment e with some objects the user needs not know. It then adds the swirllike task callback to the task callback list (or rather vector). This "task callback" holds the cb function which calls the f function - the f function will run with every input.
If you run this, make sure you see the swirllike task callback with:
> getTaskCallbackNames()
[1] "swirllike"
Now the f function is similar to my sketch in the question:
f <- function(e, ...){
if (e$prompt) {
if (difftime(Sys.time(), e$startTime, units = "mins") > 1) {
timeout <- TRUE
stop("Sorry, timeout.")
}
if(!is.null(.Last.value) && .Last.value == 4) {
message("Correct! Let's continue with this function:")
e$prompt <- FALSE
while (!e$prompt) {
#continue asking questions or something, but for this example:
break
}
}
}
return(TRUE)
}
And don't forget to remove the swirllike task callback with:
removeTaskCallback("swirllike")

Resources