How do I avoid halting the execution of a standalone r script that encounters an error? - r

I am running an optimization program I wrote in a multi-language framework. Because I rely on different languages to accomplish the task, everything must be standalone so it can be launched through a batch file. Everything has been going fine for 2-3 months, but I finally ran out of luck when one of the crucial parts of this process, executed through a standalone R script, encountered something new and gave me an error message. This error message makes everything screech to a halt despite my best efforts:
selMEM<-forward.sel(muskfreq, musk.MEM, adjR2thresh=adjR2)
Procedure stopped (adjR2thresh criteria) adjR2cum = 0.000000 with 0 variables (superior to -0.005810)
Error in forward.sel(muskfreq, musk.MEM, adjR2thresh = adjR2) :
No variables selected. Please change your parameters.
I know why I am getting this message: it is warning me that no variables are above the threshold I have programmed to retain during a forward selection. Although this didn't happen in hundreds of runs, it's not that big a deal, I just need to tell R what to do next. This is where I am lost. After an exhaustive search through several posts (such as here), it seams that try() and tryCatch() are the way to go. So I have tried the following:
selMEM<-try(forward.sel(muskfreq, musk.MEM, adjR2thresh=adjR2))
if(inherits(selMEM, "try-error")) {
max<-0
cumR2<-0
adjR2<-0
pvalue<-NA
} else {
max<-dim(selMEM)[1]
cumR2<-selMEM$R2Cum[max]
adjR2<-selMEM$AdjR2Cum[max]
pvalue<-selMEM$pval[max]
}
The code after the problematic line works perfectly if I execute it line by line in R, but when I execute it as a standalone script from the command prompt, I still get the same error message and my whole process screeches to a halt before it executes what follows.
Any suggestions on how to make this work?

Note this in the try help:
try is implemented using tryCatch; for programming, instead of
try(expr, silent = TRUE), something like tryCatch(expr, error =
function(e) e) (or other simple error handler functions) may be more
efficient and flexible.
Look to tryCatch, possibly:
selMEM <- tryCatch({
forward.sel(muskfreq, musk.MEM, adjR2thresh=adjR2)
}, error = function(e) {
message(e)
return(NULL)
})
if(is.null(selMEM)) {
max<-0
cumR2<-0
adjR2<-0
pvalue<-NA
} else {
max<-dim(selMEM)[1]
cumR2<-selMEM$R2Cum[max]
adjR2<-selMEM$AdjR2Cum[max]
pvalue<-selMEM$pval[max]
}

Have you tried setting the silent parameter to true in the Try function?
max<-0
cumR2<-0
adjR2<-0
pvalue<-NA
try({
selMEM <- forward.sel(muskfreq, musk.MEM, adjR2thresh=adjR2)
max<-dim(selMEM)[1]
cumR2<-selMEM$R2Cum[max]
adjR2<-selMEM$AdjR2Cum[max]
pvalue<-selMEM$pval[max]
}, silent=T)

Related

how to automatically restart the execution of a R script if its execution is interrupted

I have a script that constantly performs a set of calculations in an endless loop. But often there are various errors that I cannot predict and the script stops working. I would like to automatically restart the script every time it stops working, and I don't care why the error occurred, I just want to restart the script. I will cite a deliberately erroneous code that I would like to repeat after an error has been issued.
while (TRUE) {
1+m
}
The try() and trycatch() funtions are designed to deal with code that might cause errors. In the case of your example code, changing it to:
while (TRUE) {
try(1+m)
}
will keep trying to do the line that produces an error. If your code inside the loop is multiple lines, you can make it into a block by wrapping it in braces, e.g. :
while (TRUE) {
try({
a <- 1+m
print(a)
})
}

How do I close unused connections after read_html in R

I am quite new to R and am trying to access some information on the internet, but am having problems with connections that don't seem to be closing. I would really appreciate it if someone here could give me some advice...
Originally I wanted to use the WebChem package, which theoretically delivers everything I want, but when some of the output data is missing from the webpage, WebChem doesn't return any data from that page. To get around this, I have taken most of the code from the package but altered it slightly to fit my needs. This worked fine, for about the first 150 usages, but now, although I have changed nothing, when I use the command read_html, I get the warning message " closing unused connection 4 (http:....." Although this is only a warning message, read_html doesn't return anything after this warning is generated.
I have written a simplified code, given below. This has the same problem
Closing R completely (or even rebooting my PC) doesn't seem to make a difference - the warning message now appears the second time I use the code. I can run the querys one at a time, outside of the loop with no problems, but as soon as I try to use the loop, the error occurs again on the 2nd iteration.
I have tried to vectorise the code, and again it returned the same error message.
I tried showConnections(all=TRUE), but only got connections 0-2 for stdin, stdout, stderr.
I have tried searching for ways to close the html connection, but I can't define the url as a con, and close(qurl) and close(ttt) also don't work. (Return errors of no applicable method for 'close' applied to an object of class "character and no applicable method for 'close' applied to an object of class "c('xml_document', 'xml_node')", repectively)
Does anybody know a way to close these connections so that they don't break my routine? Any suggestions would be very welcome. Thanks!
PS: I am using R version 3.3.0 with RStudio Version 0.99.902.
CasNrs <- c("630-08-0","463-49-0","194-59-2","86-74-8","148-79-8")
tit = character()
for (i in 1:length(CasNrs)){
CurrCasNr <- as.character(CasNrs[i])
baseurl <- 'http://chem.sis.nlm.nih.gov/chemidplus/rn/'
qurl <- paste0(baseurl, CurrCasNr, '?DT_START_ROW=0&DT_ROWS_PER_PAGE=50')
ttt <- try(read_html(qurl), silent = TRUE)
tit[i] <- xml_text(xml_find_all(ttt, "//head/title"))
}
After researching the topic I came up with the following solution:
url <- "https://website_example.com"
url = url(url, "rb")
html <- read_html(url)
close(url)
# + Whatever you wanna do with the html since it's already saved!
I haven't found a good answer for this problem. The best work-around that I came up with is to include the function below, with Secs = 3 or 4. I still don't know why the problem occurs or how to stop it without building in a large delay.
CatchupPause <- function(Secs){
Sys.sleep(Secs) #pause to let connection work
closeAllConnections()
gc()
}
I found this post as I was running into the same problems when I tried to scrape multiple datasets in the same script. The script would get progressively slower and I feel it was due to the connections. Here is a simple loop that closes out all of the connections.
for (i in seq_along(df$URLs)){function(i)
closeAllConnections(i)
}

Sink does not release file

I know that the sink() function can be used to divert R output into a file, e.g.
sink('sink-closing.txt')
cat('Hello world!')
sink()
Is there a simple command to close all outstanding sinks?
Below, I elaborate on my question.
Suppose that my R-script opens a sink() in an R-script, but there is an error in the R-script which occurs before the script closes the sink(). I may run the R-script multiple times, trying to fix the error. Finally, I want to close all the sinks and print to the console. How do I do so?
Finally, in the interest of concreteness, I provide a MWE to illustrate the problem I face.
First, I write an R-script sink-closing.R which has an error in it.
sink('sink-closing.txt')
foo <- function() {
cat(sprintf('Hello world! My name is %s\n',
a.variable.that.does.not.exist))
}
foo()
sink()
Next, I source the R-script multiple times, say 3 times by mistake as I try to find and fix the bug.
> source('~/Dropbox/cookbook/r-cookbook/sink-closing.R')
Error in sprintf("Hello world! My name is %s\n", a.variable.that.does.not.exist) :
object 'a.variable.that.does.not.exist' not found
Now, suppose that I am debugging the R-script and want to print to the console. I can call sink() multiple times to close the earlier sinks. If I call it 3 times, then I can finally print to the console as before. But how do I know how many sinks I need to close?
closeAllConnections() # .........................
I'm getting upvotes for this as time goes along but Simon.S.A and others are better.
You can use sink.number() to tell you how many diversions are already set and then call sink that many times. Putting it into a function you could have this
sink.reset <- function(){
for(i in seq_len(sink.number())){
sink(NULL)
}
}
Based on #mnel's comment:
sinkall <- function() {
i <- sink.number()
while (i > 0) {
sink()
i <- i - 1
}
}
Should close all open sinks.
You may also encounter this problem when dealing with devices and plots, where the number of open devices isn't reported anywhere. For a more general case you could use this:
stopWhenError <- function(FUN) {
tryCatch({
while(TRUE) {
FUN()
}
}, warning = function(w) {
print("All finished!")
}, error = function(e) {
print("All finished!")
})
}
stopWhenError(sink) # for sink.
stopWhenError(dev.off) # close all open plotting devices.
EDIT:
sink throws a warning not an error so I've modified the code so that it won't run forever, whoops!
The most common time I experience this is when an error occurs preventing a sink from closing. For example, the following will leave an open sink after execution.
sink("output.txt")
my_function_that_will_error()
sink()
This can be avoided using on.exit(sink()). This will close the sink "when the current function exits (either naturally or as the result of an error)" (documentation here).
But you do have to change the order:
sink("output.txt")
on.exit(sink())
my_function_that_might_error()
So we create the sink, tell R to close it when it exits, and then execute the code that might error. This will close the sink regardless of whether the code errors or not.

tryCatch does not catch an error if called though RScript

I'm facing a strange issue in R.
Consider the following code (a really simplified version of the real code but still having the problem) :
library(timeSeries)
tryCatch(
{
specificWeekDay <- 2
currTs <- timeSeries(c(1,2),c('2012-01-01','2012-01-02'),
format='%Y-%m-%d',units='A')
# just 2 dates out of range
start <- time(currTs)[2]+100*24*3600
end <- time(currTs)[2]+110*24*3600
# this line returns an empty timeSeries
currTs <- window(currTs,start=start,end=end)
message("Up to now, everything is OK")
# this is the line with the uncatchable error
currTs[!(as.POSIXlt(time(currTs))$wday %in% specificWeekDay),] <- NA
message("I'm after the bugged line !")
},error=function(e){message(e)})
message("End")
When I run that code in RGui, I correctly get the following output:
Up to now, everything is OK
error in evaluating the argument 'i' in
selecting a method for function '[<-': Error in
as.POSIXlt.numeric(time(currTs)) : 'origin' must be supplied
End
Instead, when I run it through RScript (in windows) using the following line:
RScript.exe --vanilla "myscript.R"
I get this output:
Up to now, everything is OK
Execution interrupted
It seems like RScript crashes...
Any idea about the reason?
Is this a timeSeries package bug, or I'm doing something wrong ?
If the latter, what's the right way to be sure to catch all the errors ?
Thanks in advance.
EDIT :
Here's a smaller example reproducing the issue that doesn't use timeSeries package. To test it, just run it as described above:
library(methods)
# define a generic function
setGeneric("foo",
function(x, ...){standardGeneric("foo")})
# set a method for the generic function
setMethod("foo", signature("character"),
function(x) {x})
tryCatch(
{
foo("abc")
foo(notExisting)
},error=function(e)print(e))
It seems something related to generic method dispatching; when an argument of a method causes an error, the dispatcher cannot find the signature of the method and conseguently raises an exception that tryCatch function seems unable to handle when run through RScript.
Strangely, it doesn't happen for example with print(notExisting); in that case the exception is correctly handled.
Any idea about the reason and how to catch this kind of errors ?
Note:
I'm using R-2.14.2 on Windows 7
The issue is in the way the internal C code implementing S4 method dispatch tries to catch and handle some errors and how the non-interactive case is treated in this approach. A work-around should be in place in R-devel and R-patched soon.
Work-around now committed to R-devel and R-patched.
Information about tryCatch() [that the OP already knew and used but I didn't notice]
I think you are missing that your tryCatch() is not doing anything special with the error, hence you are raising an error in the normal fashion. In interactive use the error is thrown and handled in the usual fashion, but an error inside a script run in a non-interactive session (a la Rscript) will abort the running script.
tryCatch() is a complex function that allows the potential to trap and handle all sorts of events in R, not just errors. However by default it is set up to mimic the standard R error handling procedure; basically allow the error to be thrown and reported by R. If you want R to do anything other than the basic behaviour then you need to add a specific handler for the error:
> e <- simpleError("test error")
> tryCatch(foo, error = function(e) e,
+ finally = writeLines("There was a problem!"))
There was a problem!
<simpleError in doTryCatch(return(expr), name, parentenv, handler): object 'foo'
not found>
I suggest you read ?tryCatch in more detail to understand better what it does.
An alternative is to use try(). To modify your script I would just do:
# this is the line with the uncatchable error
tried <- try(currTs[!(as.POSIXlt(time(currTs))$wday %in% specificWeekDay),] <- NA,
silent = TRUE)
if(inherits(tried, "try-error")) {
writeLines("There was an error!")
} else {
writeLines("Everything worked fine!")
}
The key bit is to save the object returned from try() so you can test the class, and to have try() operate silently. Consider the difference:
> bar <- try(foo)
Error in try(foo) : object 'foo' not found
> bar <- try(foo, silent = TRUE)
> class(bar)
[1] "try-error"
Note that in the first call above, the error is caught and reported as a message. In the second, it is not reported. In both cases an object of class "try-error" is returned.
Internally, try() is written as a single call to tryCatch() which sets up a custom function for the error handler which reports the error as a message and sets up the returned object. You might wish to study the R code for try() as another example of using tryCatch().

To capture the warning thrown by the R script using R.NET

When I use the below script in R Console it gave me the output as string "Warning"
jj = ts(scan("jj.dat"), start=1960, frequency=4)
tryCatch(arima(jj,
order = c(1, 0,1)),
warning=function(w) cat("Warning"))
I tried to use the same code in R.NET and expected to get the string "Warning", but I'm getting Parser Exception showing "Code error". Below is the code snippet which I tried in R.NET.
try
{
string script = "tryCatch(arima(jj,
order = c(4, 0,6)),
warning=function(w) cat(\"Warning\"))";
string str=engine.EagerEvaluate("script").AsCharacter().First();//*
}catch (Exception ex)
{
}
Kindly throw to me some idea, on how can we tackle this issue. Or is there any other way to capture the R Script warnings and error messages in R.NET.
From my experience in these kind of R integration into other languages (rpy, coupling python and R) I would keep the amount of R source code inside .NET at a minimum. The way I would go would be to write a function inside a .R file which does what you want.
hello = function() { print("Hello World") }
Saving this function inside spam.r allows you to use source in order to load this new function into the R session running inside .NET. Then you can a very simple R script:
source("spam.r")
hello()
This is ofcourse a quite trivial example, but hello could contain much more complicated code. In this way you prevent any errors because of writing the R code in .NET (in rpy there where some problems with that, e.g. data.frame was not allowed). Hope this helps!

Resources