R foreach stop iteration at i

I am using R package foreach.
When there is a bug inside a foreach block, it is hard to reproduce and hard to debug.
Take the following script as example.
I want to stop at i=4 to check what's wrong. However, it stops at i=10.
Any solution?
library(foreach)
library(iterators)  # provides icount(); attached explicitly in case foreach does not attach it
foreach(i = icount(10)) %do% {
  if (i == 4) {
    e <- simpleError("test error")
    stop(e)
  }
}

One option to handle this is with a browser() inside a tryCatch as in:
foreach(i = icount(10)) %do% {
  tryCatch(
    if (i == 4) {
      e <- simpleError("test error")
      stop(e)
    },
    error = function(e) browser()
  )
}
This opens a browser at the moment the error is caught. Because the handler is defined inside the loop body, loop variables such as i are still visible, so you can inspect objects and debug your code.
Your console will then show a Browse prompt, where you can ask for the value of i:
Browse[1]> i
[1] 4
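If the goal is only to find out which iteration failed, another option is foreach's .errorhandling argument. The following is a minimal sketch, assuming the sequential %do% backend: with .errorhandling = "pass" the error object is stored in the result list instead of being thrown after the loop, so the failing index can be looked up afterwards. The names res and failed are made up for this example.

```r
library(foreach)

res <- foreach(i = 1:10, .errorhandling = "pass") %do% {
  if (i == 4) stop("test error")
  i
}

# Positions of the iterations whose result is an error object
failed <- which(vapply(res, inherits, logical(1), what = "error"))
```

Once failed is known, the loop body can be rerun for just that value of i under a debugger.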

How to tryCatch the same function call multiple times (N times) in R

We have a basic tryCatch that writes a dataframe to Google Sheets, and tries again if the first write fails for any reason:
result = tryCatch({
  print('TRYING')
  googlesheets4::sheet_write(data = our_df, ss = our_spreadsheet, sheet = 'our_sheetname')
}, error = function(e) {
  print('ERROR, TRYING AGAIN')
  googlesheets4::sheet_write(data = our_df, ss = our_spreadsheet, sheet = 'our_sheetname')
})
Is it possible to generalize this code to retry the googlesheets4::sheet_write() call up to N times? Is something built into base R for this, or is there a good R library that handles unlimited retries of a function?
You can put it in a for loop like this.
First, I am going to define a function that often fails (as I don't have access to your Google sheet).
russian_roulette <- function(n = 6) {
  revolver <- sample(1:n, 1)
  if (revolver == 1) {
    return("You lived")
  } else {
    stop("Better luck next time...")
  }
}
Then you can try it as many times as you consider reasonable. You can replace my call to russian_roulette() with your call to googlesheets4::sheet_write().
NUM_TRIES <- 10
for (i in 1:NUM_TRIES) {
  message(i)
  result <- try({
    russian_roulette()
  })
  # inherits() is safer than comparing class(), which can have length > 1
  if (!inherits(result, "try-error")) {
    print("Success!")
    break
  }
}
Output:
1
Error in russian_roulette() : Better luck next time...
2
Error in russian_roulette() : Better luck next time...
3
Error in russian_roulette() : Better luck next time...
4
Error in russian_roulette() : Better luck next time...
5
Error in russian_roulette() : Better luck next time...
6
[1] "Success!"
result
# [1] "You lived"
I don't know why you expect writing to a file to fail - depending on the reason you may want to add a Sys.sleep() call in there for a certain number of seconds after every failure.
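The loop above can also be wrapped in a reusable function. This is only a sketch: retry(), max_tries, wait_seconds and flaky() are hypothetical names invented here, not part of base R or googlesheets4. It assumes that the wrapped function never returns an error object as a legitimate result.

```r
# Hypothetical helper: call f() up to max_tries times, pausing between failures.
retry <- function(f, max_tries = 5, wait_seconds = 0) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < max_tries) Sys.sleep(wait_seconds)
  }
  stop("All ", max_tries, " attempts failed; last error: ", conditionMessage(result))
}

# Demo: a function that fails twice and then succeeds
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("not yet")
  "done"
}
retry_result <- retry(flaky, max_tries = 5)
```

For the original problem you would pass something like function() googlesheets4::sheet_write(...) as f, with a non-zero wait_seconds if the failures are caused by rate limits.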

How to skip the error file and continue to read the next one when batch reading files in R [duplicate]

I've read a few other SO questions about tryCatch and cuzzins, as well as the documentation:
Exception handling in R
catching an error and then branching logic
How can I check whether a function call results in a warning?
Problems with Plots in Loop
but I still don't understand.
I'm running a loop and want to skip to next if any of a few kinds of errors occur:
for (i in 1:39487) {
# EXCEPTION HANDLING
this.could.go.wrong <- tryCatch(
attemptsomething(),
error=function(e) next
)
so.could.this <- tryCatch(
doesthisfail(),
error=function(e) next
)
catch.all.errors <- function() { this.could.go.wrong; so.could.this; }
catch.all.errors;
#REAL WORK
useful(i); fun(i); good(i);
} #end for
(by the way, there is no documentation for next that I can find)
When I run this, R honks:
Error in value[[3L]](cond) : no loop for break/next, jumping to top level
What basic point am I missing here? The tryCatch's are clearly within the for loop, so why doesn't R know that?
The key to using tryCatch is realising that it returns an object. If there was an error inside the tryCatch then this object will inherit from class error. You can test for class inheritance with the function inherits.
x <- tryCatch(stop("Error"), error = function(e) e)
class(x)
# [1] "simpleError" "error" "condition"
Edit:
What is the meaning of the argument error = function(e) e? This baffled me, and I don't think it's well explained in the documentation. What happens is that this argument catches any error that originates in the expression that you are tryCatching. If an error is caught, the handler is called and its return value becomes the value of tryCatch; here the handler simply returns the error itself. Handlers set up by tryCatch are exiting handlers (calling handlers are the ones established by withCallingHandlers). The argument e inside error=function(e) is the error condition object raised by your code; conditionMessage(e) extracts its message.
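A small sketch of what the handler receives and returns, using only base R. The variable names are made up for illustration.

```r
# The handler returns the condition object itself
caught <- tryCatch(stop("something broke"), error = function(e) e)
is_error <- inherits(caught, "error")
msg <- conditionMessage(caught)

# The handler can return any value; it becomes the value of tryCatch
fallback <- tryCatch(stop("x"), error = function(e) -1)

# Without an error the handler is never called
no_error <- tryCatch(1 + 1, error = function(e) -1)
```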
I come from the old school of procedural programming where using next was a bad thing. So I would rewrite your code something like this. (Note that I removed the next statement inside the tryCatch.):
for (i in 1:39487) {
  #ERROR HANDLING
  possibleError <- tryCatch(
    thing(),
    error=function(e) e
  )
  if(!inherits(possibleError, "error")){
    #REAL WORK
    useful(i); fun(i); good(i);
  }
} #end for
next is documented on the same help page as for; see ?"for" (or ?Control).
If you want to use that instead of having your main working routine inside an if, your code should look something like this:
for (i in 1:39487) {
  #ERROR HANDLING
  possibleError <- tryCatch(
    thing(),
    error=function(e) e
  )
  if(inherits(possibleError, "error")) next
  #REAL WORK
  useful(i); fun(i); good(i);
} #end for
I found other answers very confusing. Here is an extremely simple implementation for anyone who wants to simply skip to the next loop iteration in the event of an error.
for (i in 1:10) {
  skip_to_next <- FALSE
  # Note that print(b) fails since b doesn't exist
  tryCatch(print(b), error = function(e) { skip_to_next <<- TRUE})
  if(skip_to_next) { next }
}
for (i in -3:3) {
  #ERROR HANDLING
  possibleError <- tryCatch({
    print(paste("Start Loop ", i ,sep=""))
    if(i==0){
      stop()
    }
  },
  error=function(e) {
    print(paste("Oops! --> Error in Loop ",i,sep = ""))
    e # return the condition last, otherwise inherits() below never sees an error
  })
  if(inherits(possibleError, "error")) next
  print(paste(" End Loop ",i,sep = ""))
}
The only really detailed explanation I have seen can be found here: http://mazamascience.com/WorkingWithData/?p=912
Here is a code clip from that blog post showing how tryCatch works
#!/usr/bin/env Rscript
# tryCatch.r -- experiments with tryCatch
# Get any arguments
arguments <- commandArgs(trailingOnly=TRUE)
a <- arguments[1]
# Define a division function that can issue warnings and errors
myDivide <- function(d, a) {
if (a == 'warning') {
return_value <- 'myDivide warning result'
warning("myDivide warning message")
} else if (a == 'error') {
return_value <- 'myDivide error result'
stop("myDivide error message")
} else {
return_value = d / as.numeric(a)
}
return(return_value)
}
# Evaluate the desired series of expressions inside of tryCatch
result <- tryCatch({
b <- 2
c <- b^2
d <- c+2
if (a == 'suppress-warnings') {
e <- suppressWarnings(myDivide(d,a))
} else {
e <- myDivide(d,a) # 6/a
}
f <- e + 100
}, warning = function(war) {
# warning handler picks up where error was generated
print(paste("MY_WARNING: ",war))
b <- "changing 'b' inside the warning handler has no effect"
e <- myDivide(d,0.1) # =60
f <- e + 100
return(f)
}, error = function(err) {
# error handler picks up where error was generated
print(paste("MY_ERROR: ",err))
b <- "changing 'b' inside the error handler has no effect"
e <- myDivide(d,0.01) # =600
f <- e + 100
return(f)
}, finally = {
print(paste("a =",a))
print(paste("b =",b))
print(paste("c =",c))
print(paste("d =",d))
# NOTE: finally is evaluated in the context of the initial
# NOTE: tryCatch block and 'e' will not exist if a warning
# NOTE: or error occurred.
#print(paste("e =",e))
}) # END tryCatch
print(paste("result =",result))
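A condensed illustration of the same mechanics may be easier to experiment with. This sketch is not from the blog post; safe_log() is a hypothetical name. It relies on base R's log(), which warns for negative input and raises an error for character input.

```r
# Hypothetical wrapper showing which handler supplies the return value
safe_log <- function(x) {
  tryCatch(
    log(x),
    warning = function(w) NA_real_,       # e.g. log(-1) warns "NaNs produced"
    error   = function(e) NULL,           # e.g. log("a") is an error
    finally = message("finished: ", deparse(x))  # runs in every case
  )
}

ok_value      <- safe_log(1)
warning_value <- safe_log(-1)
error_value   <- safe_log("a")
```

In each call the finally expression runs once, whichever branch produced the value.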
One thing I was missing, which the question "breaking out of for loop when running a function inside a for loop in R" makes clear, is this:
next doesn't work inside a function.
You need to send some signal or flag (e.g., Voldemort = TRUE) from inside your function (in my case tryCatch) to the outside.
(this is like modifying a global, public variable inside a local, private function)
Then outside the function, you check to see if the flag was waved (does Voldemort == TRUE). If so you call break or next outside the function.

unrelated nested foreach with an outer %dopar% and an inner %do%

I am running tasks locally in parallel using %dopar% from the foreach package using the doSNOW package to create the cluster (running this on a windows machine at the moment). I have done this many times before and it works fine until I place an unrelated foreach loop using a %do% (i.e. non-parallel) inside of it. Then R gives me the error (with traceback) :
Error in { : task 1 failed - "could not find function "%do%""
3 stop(simpleError(msg, call = expr))
2 e$fun(obj, substitute(ex), parent.frame(), e$data)
1 foreach(rc = 1:5) %dopar% {
aRandomCounter = -1
if (1 > 0) {
for (batchi in 1:20) { ...
Here is some code that replicates the problem on my machine:
require(foreach)
require(doSNOW)
cl<-makeCluster(5)
registerDoSNOW(cl)
for(stepi in 1:10) # normal outer for
{
foreach(rc=1:5) %dopar% # the time consuming stuff in parallel (not looking to actually retrieve any data)
{
aRandomCounter = -1
if(1 > 0)
{
for(batchi in 1:20)
{
anObjectIwantToCreate <- foreach( qrc = 1:100, .combine=c ) %do%
{
return(runif(1)) # I know this is not efficient, it is a placeholder to reproduce the issue
}
aRandomCounter = aRandomCounter + sum(anObjectIwantToCreate > 0.5)
}
}
return(aRandomCounter)
}
}
stopCluster(cl)
Replacing the inner foreach with a simple for or (l/s)apply is a solution. But is there a way to make this work with the inner foreach and why the error in the first place ?
Of course, I got it to work as soon as I posted it (sorry.. I will leave it in case someone else has the same issue). It is a scoping issue - I knew you had to load any external packages within the %dopar%, but what I did not realize is that that includes the foreach package itself. Here is the solution:
require(foreach)
require(doSNOW)
cl<-makeCluster(5)
registerDoSNOW(cl)
for(stepi in 1:10) # normal outer for
{
foreach(rc=1:5) %dopar% # the time consuming stuff in parallel (not looking to actually retrieve any data)
{
require(foreach) ### the solution
aRandomCounter = -1
if(1 > 0)
{
for(batchi in 1:20)
{
anObjectIwantToCreate <- foreach( qrc = 1:100, .combine=c ) %do%
{
return(runif(1))
}
aRandomCounter = aRandomCounter + sum(anObjectIwantToCreate > 0.5)
}
}
return(aRandomCounter)
}
}
stopCluster(cl)
I know this is an outdated question, but here is a hint for those who cannot get nested foreach to work.
If you parallelize the outer loop by putting a %do% inside a %dopar%, you need to include .packages = c("doSNOW") in the arguments of the outer loop (%dopar%); otherwise you will run into a "doSNOW not found" error.
Generally, people just parallelize the inner loop (%dopar% in %:%), which can be slow for a huge amount of data (waiting for combinations of inner loops).
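As a variation on the accepted fix, the package can be attached on the workers through the outer loop's .packages argument instead of calling require() in the body. This is a sketch under assumptions: it has not been run on the original poster's setup, it uses two local workers, and it replaces runif(1) with a deterministic placeholder so the result is predictable.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2)   # two local workers; adjust to your machine
registerDoSNOW(cl)

# .packages = "foreach" attaches foreach on each worker so the inner %do% is found
counts <- foreach(rc = 1:4, .combine = c, .packages = "foreach") %dopar% {
  draws <- foreach(qrc = 1:100, .combine = c) %do% {
    qrc %% 2   # deterministic placeholder instead of runif(1)
  }
  sum(draws > 0.5)
}

stopCluster(cl)
```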

How do you make tryCatch actually catch the error

I have to call a function that throws an error if the arguments didn't satisfy many conditions.
The conditions are so complicated that I cannot try to satisfy them 100% of the time (I would have to re-type all the conditions the function checks internally).
Instead, I should just retry calling with different arguments (as many times as necessary to fill my table).
In other languages I can write a catch block around the call.
However, in R tryCatch seems to work differently: you can give code with finally=, but after executing the finally-code the outer function terminates anyway.
Here is a minimal example:
sometimesError <- function() {
if(runif(1)<0.1) stop("err")
return(1)
}
fct <- function() {
theSum <- 0
while(theSum < 20) {
tryCatch( theSum <- theSum + sometimesError() )
}
return(theSum)
}
fct() # this should always evaluate to 20, never throw error
(I have read "Is there a way to source() and continue after an error?" and some other posts, but I don't think they apply here. They make the source'd code continue statement by statement regardless of errors, as if it were executing at the top level. I, on the other hand, am happy with the called function terminating; it is the calling code that should continue.)
You can pass a function to the error argument of tryCatch to specify what should happen when there is an error. In this case, you could just return 0 when there is an error.
fct <- function() {
theSum <- 0
while(theSum < 20) {
theSum <- theSum + tryCatch(sometimesError(), error=function(e) 0)
}
return(theSum)
}
As @rawr mentioned in the comments, you could also replace tryCatch with try in this case.
fct <- function() {
theSum <- 0
while(theSum < 20) {
try(theSum <- theSum + sometimesError(), silent=TRUE)
}
return(theSum)
}
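The two variants differ in how the failed call is neutralised, which the following sketch makes visible. sometimes_error() here takes a hypothetical fail argument so that the outcome is deterministic rather than random.

```r
sometimes_error <- function(fail) {
  if (fail) stop("err")
  1
}

total <- 0
# The handler supplies 0 as the value of the failed call, so total is unchanged
total <- total + tryCatch(sometimes_error(TRUE), error = function(e) 0)
after_trycatch <- total

# With try(), the error aborts the whole expression, so the assignment is skipped
try(total <- total + sometimes_error(TRUE), silent = TRUE)
after_try <- total

# A successful call adds 1 in either style
total <- total + tryCatch(sometimes_error(FALSE), error = function(e) 0)
```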

Is there a way to run an expression on.exit() but only if completes normally, not on error?

I'm aware of the function on.exit in R, which is great. It runs the expression when the calling function exits, either normally or as the result of an error.
What I'd like is for the expression only to be run if the calling function returns normally, but not in the case of an error. I have multiple points where the function could return normally, and multiple points where it could fail. Is there a way to do this?
myfunction = function() {
...
on.exit( if (just exited normally without error) <something> )
...
if (...) then return( point 1 )
...
if (...) then return( point 2 )
...
if (...) then return( point 3 )
...
return ( point 4 )
}
The whole point of on.exit() is exactly to be run regardless of the exit status. Hence it disregards any error signal. This is afaik equivalent to the finally statement of the tryCatch function.
If you want to run code only on normal exit, simply put it at the end of your code. Yes, you'll have to restructure it a bit using else statements and by creating only 1 exit point, but that's considered good coding practice by some.
Using your example, that would be:
myfunction = function() {
...
if (...) then out <- point 1
...
else if (...) then out <- point 2
...
else if (...) then out <- point 3
...
else out <- point 4
WhateverNeedsToRunBeforeReturning
return(out)
}
Or see the answer of Charles for a nice implementation of this idea using local().
If you insist on using on.exit(), you can gamble on the working of the traceback mechanism to do something like this :
test <- function(x){
x + 12
}
myFun <- function(y){
on.exit({
err <- if( exists(".Traceback")){
nt <- length(.Traceback)
.Traceback[[nt]] == sys.calls()[[1]]
} else {FALSE}
if(!err) print("test")
})
test(y)
}
.Traceback contains the last call stack resulting in an error. You have to check whether the top call in that stack is equal to the current call, and in that case your call very likely threw the last error. So based on that condition you can try to hack yourself a solution I'd never use myself.
Just wrap the args of all your return function calls with the code that you want done. So your example becomes:
foo = function(thing){do something; return(thing)}
myfunction = function() {
...
if (...) then return( foo(point 1) )
...
if (...) then return( foo(point 2) )
...
if (...) then return( foo(point 3) )
...
return ( foo(point 4) )
}
Or just make each then clause into two statements. Using on.exit to lever some code into a number of places is going to cause spooky action-at-a-distance problems and make the baby Dijkstra cry (read Dijkstra's "Go To Statement Considered Harmful" paper).
Bit more readable version of my comment on @Joris' answer:
f = function() {
ret = local({
myvar = 42
if (runif(1) < 0.5)
return(2)
stop('oh noes')
}, environment())
# code to run on success...
print(sprintf('myvar is %d', myvar))
ret
}
I guess there is not a clean way yet. I usually create an OK variable at the beginning as FALSE and turn it to TRUE at the end. I prefer on.exit over isolating all my code into a tryCatch.
myfun = function() {
  OK = FALSE # the flag "OK" will be FALSE until the function ends OK
  conn = my.db.connection.function()
  dbBegin(conn)
  on.exit({
    if(OK) dbCommit(conn) else dbRollback(conn)
    dbDisconnect(conn)
  })
  # ... Your code. You can edit database as a transaction.
  # if anything fails in R or in the database a rollback will occur
  OK = TRUE # only if the code came to the end everything went ok, so we set the flag OK as TRUE
  return(NULL)
}
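The flag pattern can be tried without a database. In this sketch the events vector stands in for the connection, and run_step() and its fail argument are hypothetical names used only for illustration.

```r
events <- character(0)   # stands in for the database in this sketch

run_step <- function(fail) {
  ok <- FALSE
  # Evaluated when the function exits, normally or through an error
  on.exit(events <<- c(events, if (ok) "commit" else "rollback"))
  if (fail) stop("step failed")
  ok <- TRUE             # reached only when nothing above failed
  invisible(NULL)
}

run_step(fail = FALSE)
try(run_step(fail = TRUE), silent = TRUE)
```

Note that an early return() placed before the flag is set would be treated like a failure, so with several exit points the flag has to be set before each of them.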
