Executing alternate cmds using R - r

Is it possible in R to do the following:
execute cmd1
if cmd1 generates error, proceed to:
execute cmd2
Thanks

try and/or tryCatch may be of use to you. Consider the super-simple toy example below:
# Our function just returns it's input as long as input is not negative otherwise an error is generated
f <- function(n) {
if( n < 0 )
stop("need positive integer")
return(n)
}
# Our alternative function to run if we get an error from the first function
f2 <- function(n) return( cat( paste( "You have a negative number which is" , n ) ) )
# Now we try to run it with `try`:
if( class( try( f(-1) , silent = TRUE ) ) == "try-error" )
f2(-1)
#You have a negative number which is -1
# And using the sophisticated `tryCatch()`
tryCatch( f(-1) , finally = f2(-1) )
#Error in f(-1) : need positive integer
#You have a negative number which is -1
The return value of try() is the value of the expression if it evaluates without error, otherwise an object of the class "try-error". In the first example we just check to see if an error was generated using comparing the class of the return value of try and execute f2() if an error was generated.
Note there are quite a few ways to handle this and I certainly wouldn't advocate either of these as being the best, but they should be a useful starting point for you to learn more about error handling.

Try this, err is the error message in the try block.
tryCatch(stop(),
error=function(err){
print(1)
print(err)
})

Depending on your use case, even simple short-circuiting of boolean operators might be enough. i.e., if your functions can return TRUE to indicate no-error and FALSE to indicate an error, then you can use the fact that || will only evaluate the RHS operand if the LHS operand evaluates to FALSE.
> doAwesomeStuff <- function() {cat("OMG, I'm awesome! <falls flat on face>\n"); FALSE}
> okDoNormalStuff <- function() {cat("OMG, I'm OK! :-)\n"); TRUE}
> doAwesomeStuff() || okDoNormalStuff()
OMG, I'm awesome! <falls flat on face>
OMG, I'm OK! :-)
[1] TRUE
But if doAwesomeStuff() works,
> doAwesomeStuff <- function() {cat("OMG, I'm awesome!\n"); TRUE}
> doAwesomeStuff() || okDoNormalStuff()
OMG, I'm awesome!
[1] TRUE

Related

How to skip the error file and continue to read the next one when batch reading files in R [duplicate]

I've read a few other SO questions about tryCatch and cuzzins, as well as the documentation:
Exception handling in R
catching an error and then branching logic
How can I check whether a function call results in a warning?
Problems with Plots in Loop
but I still don't understand.
I'm running a loop and want to skip to next if any of a few kinds of errors occur:
for (i in 1:39487) {
# EXCEPTION HANDLING
this.could.go.wrong <- tryCatch(
attemptsomething(),
error=function(e) next
)
so.could.this <- tryCatch(
doesthisfail(),
error=function(e) next
)
catch.all.errors <- function() { this.could.go.wrong; so.could.this; }
catch.all.errors;
#REAL WORK
useful(i); fun(i); good(i);
} #end for
(by the way, there is no documentation for next that I can find)
When I run this, R honks:
Error in value[[3L]](cond) : no loop for break/next, jumping to top level
What basic point am I missing here? The tryCatch's are clearly within the for loop, so why doesn't R know that?
The key to using tryCatch is realising that it returns an object. If there was an error inside the tryCatch then this object will inherit from class error. You can test for class inheritance with the function inherit.
x <- tryCatch(stop("Error"), error = function(e) e)
class(x)
"simpleError" "error" "condition"
Edit:
What is the meaning of the argument error = function(e) e? This baffled me, and I don't think it's well explained in the documentation. What happens is that this argument catches any error messages that originate in the expression that you are tryCatching. If an error is caught, it gets returned as the value of tryCatch. In the help documentation this is described as a calling handler. The argument e inside error=function(e) is the error message originating in your code.
I come from the old school of procedural programming where using next was a bad thing. So I would rewrite your code something like this. (Note that I removed the next statement inside the tryCatch.):
for (i in 1:39487) {
#ERROR HANDLING
possibleError <- tryCatch(
thing(),
error=function(e) e
)
if(!inherits(possibleError, "error")){
#REAL WORK
useful(i); fun(i); good(i);
}
} #end for
The function next is documented inside ?for`.
If you want to use that instead of having your main working routine inside an if, your code should look something like this:
for (i in 1:39487) {
#ERROR HANDLING
possibleError <- tryCatch(
thing(),
error=function(e) e
)
if(inherits(possibleError, "error")) next
#REAL WORK
useful(i); fun(i); good(i);
} #end for
I found other answers very confusing. Here is an extremely simple implementation for anyone who wants to simply skip to the next loop iteration in the event of an error
for (i in 1:10) {
skip_to_next <- FALSE
# Note that print(b) fails since b doesn't exist
tryCatch(print(b), error = function(e) { skip_to_next <<- TRUE})
if(skip_to_next) { next }
}
for (i in -3:3) {
#ERROR HANDLING
possibleError <- tryCatch({
print(paste("Start Loop ", i ,sep=""))
if(i==0){
stop()
}
}
,
error=function(e) {
e
print(paste("Oops! --> Error in Loop ",i,sep = ""))
}
)
if(inherits(possibleError, "error")) next
print(paste(" End Loop ",i,sep = ""))
}
The only really detailed explanation I have seen can be found here: http://mazamascience.com/WorkingWithData/?p=912
Here is a code clip from that blog post showing how tryCatch works
#!/usr/bin/env Rscript
# tryCatch.r -- experiments with tryCatch
# Get any arguments
arguments <- commandArgs(trailingOnly=TRUE)
a <- arguments[1]
# Define a division function that can issue warnings and errors
myDivide <- function(d, a) {
if (a == 'warning') {
return_value <- 'myDivide warning result'
warning("myDivide warning message")
} else if (a == 'error') {
return_value <- 'myDivide error result'
stop("myDivide error message")
} else {
return_value = d / as.numeric(a)
}
return(return_value)
}
# Evalute the desired series of expressions inside of tryCatch
result <- tryCatch({
b <- 2
c <- b^2
d <- c+2
if (a == 'suppress-warnings') {
e <- suppressWarnings(myDivide(d,a))
} else {
e <- myDivide(d,a) # 6/a
}
f <- e + 100
}, warning = function(war) {
# warning handler picks up where error was generated
print(paste("MY_WARNING: ",war))
b <- "changing 'b' inside the warning handler has no effect"
e <- myDivide(d,0.1) # =60
f <- e + 100
return(f)
}, error = function(err) {
# warning handler picks up where error was generated
print(paste("MY_ERROR: ",err))
b <- "changing 'b' inside the error handler has no effect"
e <- myDivide(d,0.01) # =600
f <- e + 100
return(f)
}, finally = {
print(paste("a =",a))
print(paste("b =",b))
print(paste("c =",c))
print(paste("d =",d))
# NOTE: Finally is evaluated in the context of of the inital
# NOTE: tryCatch block and 'e' will not exist if a warning
# NOTE: or error occurred.
#print(paste("e =",e))
}) # END tryCatch
print(paste("result =",result))
One thing I was missing, which breaking out of for loop when running a function inside a for loop in R makes clear, is this:
next doesn't work inside a function.
You need to send some signal or flag (e.g., Voldemort = TRUE) from inside your function (in my case tryCatch) to the outside.
(this is like modifying a global, public variable inside a local, private function)
Then outside the function, you check to see if the flag was waved (does Voldemort == TRUE). If so you call break or next outside the function.

Calling print(ls.str()) in function affect behavior of rep

Begin a new R session with an empty environment. Write a series of functions with a parameter that is to be used as the value of the times parameter in a call to rep().
f <- function(n) {
rep("hello", times = n)
}
f(x)
One expect this to fail, and indeed one gets:
# Error in f(x) : object 'x' not found
Modify the function a bit:
f2 <- function(n) {
ls.str()
rep("hello", times = n)
}
f2(x)
As expected, it still fails:
# Error in f2(x) : object 'x' not found
Modify a bit more (to see the environment in the console):
f3 <- function(n) {
print(ls.str())
rep("hello", times = n)
}
f3(x)
I still expect failure, but instead get:
## n : <missing>
## [1] "hello"
It is as if the call to print() makes rep work as though times were set to 1.
This is not an answer, but too long to post as a comment. A minimal reproducible example is:
f3 <- function(n) {
try(get("n", environment(), inherits=FALSE))
rep("hello", times = n)
}
f3(x)
## Error in get("n", environment(), inherits = FALSE) : object 'x' not found
## [1] "hello"
The following is speculative and based on loosely examining the source for do_rep. get starts the promise evaluation, but upon not finding the "missing" symbol appears to leave the promise partially unevaluated. rep, being a primitive, then attempts to operate on n without realizing that it is a partially evaluated promise and basically that leads implicitly to the assumption that 'n == 1'.
Also, this shows that the promise is in a weird state (have to use browser/debug to see it):
f3a <- function(n) {
try(get("n", environment(), inherits=FALSE))
browser()
rep("hello", times = n)
}
f3a(x)
## Error in get("n", environment(), inherits = FALSE) : object 'x' not found
## Called from: f3a(x)
# Browse[1]> (n)
## Error: object 'x' not found
## In addition: Warning message:
## restarting interrupted promise evaluation
## Browse[1]> c
## [1] "hello"
I received earlier today a report that the bug has been fixed in R-devel and R-patched.
The issue was that the test for missingness in the R sources did not consider the case of an interrupted promise evaluation. A fix has been committed by Luke Tierney and can be seen on GitHub.
f4 <- function(n) {
print('test')
print(ls.str())
print('end test')
rep("hello", times = n)
}
f4(x)
## [1] "test"
## n : <missing>
## [1] "end test"
## [1] "hello"
There's something within print.ls_str, from Frank's test on chat the follwing code exhibit the same problem:
f6 <- function(n) {
z = tryCatch(get("n", new.env(), mode = "any"), error = function(e) e)
rep("A", n)
}
Digging a little inside R source I found the following code
# define GET_VALUE(rval) \
/* We need to evaluate if it is a promise */ \
if (TYPEOF(rval) == PROMSXP) { \
PROTECT(rval); \
rval = eval(rval, genv); \
UNPROTECT(1); \
} \
\
if (!ISNULL(rval) && NAMED(rval) == 0) \
SET_NAMED(rval, 1)
GET_VALUE(rval);
break;
case 2: // get0(.)
if (rval == R_UnboundValue)
return CAD4R(args);// i.e. value_if_not_exists
GET_VALUE(rval);
break;
}
return rval;
}
#undef GET_VALUE
I'm quite surprised this compile properly, as far as I remember (my C is quite far behind) #define doesn't allow spaces between the # and define.
After digging for that, I'm wrong, from gcc doc:
Whitespace is also allowed before and after the `#'.
So there's probably something around this part of code, but that's above my head to pinpoint what exactly.

Check each argument exists as an input in a function

I am trying to make a function which gets few inputs. I would like to know how to check the availability of my arguments . Here is my function:
MyFunction<-function(data,window,dim,option) {
}
First, I want to see if there is any argument , if no, print an error
is it correct to use
if ~nargin
error('no input data')
}
Then, I want to make sure that the second argument is also inserted
is it right to ask like this
if nargin < 2
error('no window size specified')
}
Then, I want to check if the third argument is empty , set it as 1
if nargin < 3 || isempty(dim)
dim<-1
}
you can use hasArg()
testfunction <- function(x,y){
if(!hasArg(x)){
stop("missing x")
}
if(!hasArg(y)){
y = 3
}
return(x+y)
}
>testfunction(y=2)
Error in testfunction(y = 2) : missing x
> testfunction(x=1,y=2)
[1] 3
> testfunction(x=1)
[1] 4
As #Ben Bolker said , missing is used to test whether a value was specified as an argument to a function. You can have different conditions in your R functions such as warning or stop.
In your case, I would do the following
MyFunction<-function(data,window,dim,option) {
if (missing(data))
stop("the first argument called data is missing")
if (missing(window))
stop("the second argument called window is missing")
if (missing(dim))
dim <- 1
if (missing(option))
stop("the second argument called option is missing")
}

How to check if arguments have been correctly passed to Rscript on Windows

I'm trying to write an R script that takes in 3 arguments when run with Rscript: input file name, whether it has a header or not (values are 'header' or 'no_header', and a positive integer (the number of replacements; its for a bootstrap application). So, when I run it this way:
Rscript bootstrapWithReplacement.R survival.csv header 50
it should, before running, check if:
1) The script indeed took in 3 parameters;
2) whether the first parameter is a file;
3) whether the second parameter has a 'header' or 'no_header' value, and
4) if the number passed is a positive integer.
Here is my code so far:
pcArgs <- commandArgs()
snOffset <- grep('--args', pcArgs)
inputFile <- pcArgs[snOffset+1]
headerSpec <- pcArgs[snOffset+2] ## header/no_header
numberOfResamples <- pcArgs[snOffset+3] ## positive integer
check.integer <- function(N){
!length(grep("[^[:digit:]]", as.character(N)))
}
if (!file_test("-f",inputFile)) {stop("inputFile not defined. Proper use: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if (!exists("headerSpec")) {stop("headerSpec not defined. Proper use: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if (!exists("numberOfResamples")) {stop("numberOfResamples not defined. Proper use: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if ((headerSpec != 'header') == TRUE & (headerSpec != 'no_header') == TRUE) {stop("headerSpec not properly defined. Correct values: 'header' OR 'no_header'.")}
if (check.integer(numberOfResamples) != TRUE | (numberOfResamples>0) != TRUE) {stop("numberOfResamples not properly defined. Must be an integer larger than 0.")}
if (headerSpec == 'header') {
inputData<-read.csv(inputFile)
for (i in 1:numberOfResamples) {write.csv(inputData[sample(nrow(inputData),replace=TRUE),], paste("./bootstrap_",i,"_",inputFile,sep=""), row.names=FALSE)}
}
if (headerSpec == 'no_header') {
inputData<-read.table(inputFile,header=FALSE)
for (i in 1:numberOfResamples) {write.table(inputData[sample(nrow(inputData),replace=TRUE),], paste("./bootstrap_",i,"_",inputFile,sep=""),
sep=",", row.names=FALSE, col.names=FALSE)}
}
My problem is, the check for the existence of a file works, but for the header or integer don't.
Also, how can I, in the beginning, check if all three arguments have been passed?
Thanks!
As Vincent said, you should use the trailingOnly argument to commandArgs to simplify things.
As Konrad said, never, ever, ever compare directly to TRUE and FALSE.
Also, use assertive for doing assertions.
library(assertive)
library(methods)
cmd_args <- commandArgs(TRUE)
if(length(cmd_args) < 3)
{
stop("Not enough arguments. Please supply 3 arguments.")
}
inputFile <- cmd_args[1]
if (!file_test("-f", inputFile))
{
stop("inputFile not defined, or not correctly named."
}
headerSpec <- match.arg(cmd_args[2], c("header", "no_header"))
numberOfResamples <- as.numeric(cmd_args[3])
assert_all_numbers_are_whole_numbers(numberOfResamples)
assert_all_are_positive(numberOfResamples)
message("Success!")
I managed to solve all the checks, here's how:
if ((length(pcArgs) == 8) == FALSE) {stop("Not enough arguments. Please supply 3 arguments. Proper use example: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if (!file_test("-f",inputFile)) {stop("inputFile not defined, or not correctly named. Proper use example: Rscript bootstrapWithReplacementFile.R survival.csv header 50.")}
if ((headerSpec != 'header') == TRUE & (headerSpec != 'no_header') == TRUE) {stop("headerSpec not properly defined. Correct values: 'header' OR 'no_header'.")}
if (check.integer(numberOfResamples) != TRUE | (numberOfResamples>0) != TRUE) {stop("numberOfResamples not properly defined. Must be an integer larger than 0.")}
Thanks everyone!

Is there a way to run an expression on.exit() but only if completes normally, not on error?

I'm aware of the function on.exit in R, which is great. It runs the expression when the calling function exits, either normally or as the result of an error.
What I'd like is for the expression only to be run if the calling function returns normally, but not in the case of an error. I have multiple points where the function could return normally, and multiple points where it could fail. Is there a way to do this?
myfunction = function() {
...
on.exit( if (just exited normally without error) <something> )
...
if (...) then return( point 1 )
...
if (...) then return( point 2 )
...
if (...) then return( point 3 )
...
return ( point 4 )
}
The whole point of on.exit() is exactly to be run regardless of the exit status. Hence it disregards any error signal. This is afaik equivalent to the finally statement of the tryCatch function.
If you want to run code only on normal exit, simply put it at the end of your code. Yes, you'll have to restructure it a bit using else statements and by creating only 1 exit point, but that's considered good coding practice by some.
Using your example, that would be:
myfunction = function() {
...
if (...) then out <- point 1
...
else if (...) then out <- point 2
...
else if (...) then out <- point 3
...
else out <- point 4
WhateverNeedsToRunBeforeReturning
return(out)
}
Or see the answer of Charles for a nice implementation of this idea using local().
If you insist on using on.exit(), you can gamble on the working of the traceback mechanism to do something like this :
test <- function(x){
x + 12
}
myFun <- function(y){
on.exit({
err <- if( exists(".Traceback")){
nt <- length(.Traceback)
.Traceback[[nt]] == sys.calls()[[1]]
} else {FALSE}
if(!err) print("test")
})
test(y)
}
.Traceback contains the last call stack resulting in an error. You have to check whether the top call in that stack is equal to the current call, and in that case your call very likely threw the last error. So based on that condition you can try to hack yourself a solution I'd never use myself.
Just wrap the args of all your return function calls with the code that you want done. So your example becomes:
foo = function(thing){do something; return(thing)}
myfunction = function() {
...
if (...) then return( foo(point 1) )
...
if (...) then return( foo(point 2) )
...
if (...) then return( foo(point 3) )
...
return ( foo(point 4) )
}
Or just make each then clause into two statements. Using on.exit to lever some code into a number of places is going to cause spooky action-at-a-distance problems and make the baby Dijkstra cry (read Dijkstra's "GOTO considered harmful" paper).
Bit more readable version of my comment on #Joris' answer:
f = function() {
ret = local({
myvar = 42
if (runif(1) < 0.5)
return(2)
stop('oh noes')
}, environment())
# code to run on success...
print(sprintf('myvar is %d', myvar))
ret
}
I guess there is not a clean way yet. I usually create an OK variable at the beginning as FALSE and turn it to TRUE at the end. I prefer on.exit over isolating all my code into a tryCatch.
myfun = function() {
OK=F # the flag "OK" will be FALSE until the function ends OK
conn = my.db.connection.function()
dbBegin(conn)
on.exit({
if(OK) dbCommit(conn) else dbRollback(conn)
dbDisconnect(conn)
})
# ... Your code. You can edit database as a transaction.
# if anything fails in R or in the database a rollback will occur
OK=T # only if the code came to the end everything went ok, so we set the flag OK as TRUE
return(NULL)
}

Resources