R - Set execution time limit in loop

I have a script that uses the rNOMADS package to download forecast data. Currently, it uses a for loop to call the forecast download function for each three-hour forecast interval in order. The issue is that the download function occasionally "freezes" at random, which forces me to terminate R and start the process over. When it freezes, the code hangs at the download function for minutes instead of the typical <1 sec it takes to execute, and when I try to halt execution I get a message saying "R is not responding to your request to interrupt processing so to stop the current operation you may need to terminate R entirely."
Is there a way to set a time limit for a specific block of code in each for loop iteration, and then skip that block and throw an error if the time limit is reached? Something like tryCatch, which I could use to raise a flag to redo that iteration?
Something like:
for (i in 1:N) {
  ...
  setTimeLimit(XXX seconds) {
    downloadFunction()
  } timeLimitReached {
    doOverFlag <- 1
  }
}
Thanks in advance!

The function evalWithTimeout of package R.utils does this.
evalWithTimeout(Sys.sleep(10), timeout = 1)
(times are in seconds).
Note: I have not used this function much; I liked your question, so I did some googling around and found this.

I really like R.utils for some situations, but it clobbers the traceback for the internal error message, if there was one (let's say you're running in parallel and want to wrap it in a timeout).
Base R has setTimeLimit(), which you can wrap around your expression with {}. It signals a plain error message, so it's very useful and does not remove other error-handling possibilities (like withCallingHandlers(), which is extremely useful for parsing/storing error messages and the call stack):
test_fun <- function() {
  repeat {
    runif(100)
  }
}
res <- {
  setTimeLimit(elapsed = 5, transient = TRUE)  # seconds; transient so the limit clears after this call
  test_fun()
}
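To tie this back to the question, here is a minimal sketch that catches the timeout and raises the redo flag (downloadFunction(), N, and doOverFlag come from the question's pseudocode; the with_time_limit() wrapper and the 5-second limit are my own assumptions):
with_time_limit <- function(expr, seconds) {
  # clear the limit on exit so it cannot leak into later code
  on.exit(setTimeLimit(cpu = Inf, elapsed = Inf, transient = FALSE))
  setTimeLimit(elapsed = seconds, transient = TRUE)
  expr  # lazily evaluated here, under the limit
}
for (i in 1:N) {
  res <- tryCatch(with_time_limit(downloadFunction(), 5),
                  error = function(e) NULL)  # a timeout surfaces as an error
  if (is.null(res)) doOverFlag <- 1  # flag this iteration for a redo
}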

This function has since been renamed; it now works as follows:
library(R.utils)
withTimeout(Sys.sleep(10), timeout = 1)  # stop execution after one second
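When the limit is hit, withTimeout() signals a TimeoutException condition, which tryCatch() can handle separately from ordinary errors. A small sketch (the "skip" behavior is just one option):
library(R.utils)
res <- tryCatch({
  withTimeout(Sys.sleep(10), timeout = 1)
}, TimeoutException = function(ex) {
  message("Timeout. Skipping.")
  NULL  # return a sentinel so the caller can retry or move on
})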

Good advice, following on from the answers above, and regarding the fact that the loop stops: to avoid breaking out of the entire loop, make sure you add something like if (inherits(variable_name, "try-error")) next after the wrapped call.
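A sketch of where that check sits in the loop from the original question (downloadFunction() is the question's placeholder):
for (i in 1:N) {
  res <- try(downloadFunction(), silent = TRUE)
  if (inherits(res, "try-error")) next  # skip to the next iteration
  # ... process res ...
}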

Related

R: Packaging a function with trace

We have some internal R packages with a very large number of functions. As part of an effort to eliminate unused code, I looked into covr and codetools::checkUsage; both are insufficient, so we opted to hook all functions with trace() so that calls would be recorded somewhere. A toy example with no technical details:
> f <- function() { print("Doing very very important work") }
> trace(f, tracer=substitute(print("recording call")))
[1] "f"
> f()
Tracing f() on entry
[1] "recording call"
[1] "Doing very very important work"
The tracer operation does not significantly delay the work, but tracing all package functions (~35K) takes ~3 minutes, and I'm looking for ways to shorten that.
Is there some way to package the functions with the trace, so it won't have to be added in a separate post-load stage? Is there another direction I didn't think of?
You can put the trace() calls into the source for your package. Just make sure the trace() call happens after the function definition, either by putting it later in the same source file, or by putting it in a separate file that collates after all your function definitions.
For example, if your package has a file R/fun.R containing this source,
fun <- function(x) {
  print('this is fun!')
}
then simply add another line to R/fun.R so it looks like this instead:
fun <- function(x) {
  print('this is fun!')
}
trace(fun, tracer = substitute(print("recording call")))
This works because of the way R installs and traces things: trace() modifies functions to insert the tracing, and installing executes all of the source files in the R directory and saves the results. So putting a trace() call in your source will modify the function before it is saved, and it will stay modified for any user of that package.
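If adding a line next to every one of ~35K definitions is impractical, the same idea works from a single late-collating file. A sketch (the file name and function names are placeholders):
# R/zzz-trace.R -- collates after the files defining the functions
for (fn in c("fun", "another_fun")) {
  trace(fn, tracer = substitute(print("recording call")),
        print = FALSE)  # print = FALSE suppresses the "Tracing ..." banner line
}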

how to automatically restart the execution of a R script if its execution is interrupted

I have a script that constantly performs a set of calculations in an endless loop. But often various errors occur that I cannot predict, and the script stops working. I would like to automatically restart the script every time it stops, and I don't care why the error occurred; I just want to restart the script. Here is a deliberately erroneous snippet that I would like to rerun after the error is thrown:
while (TRUE) {
  1 + m
}
The try() and tryCatch() functions are designed to deal with code that might cause errors. In the case of your example code, changing it to:
while (TRUE) {
  try(1 + m)
}
will keep retrying the line that produces the error. If the code inside your loop spans multiple lines, you can make it into a block by wrapping it in braces, e.g.:
while (TRUE) {
  try({
    a <- 1 + m
    print(a)
  })
}
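If you would rather restart the whole script than guard individual lines, a second "supervisor" script can keep re-sourcing it. A minimal sketch (my_script.R is a placeholder name):
repeat {
  tryCatch(
    source("my_script.R"),  # run the script from the top
    error = function(e) message("Script stopped: ", conditionMessage(e),
                                " -- restarting")
  )
  Sys.sleep(1)  # brief pause so a persistent failure cannot spin the CPU
}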

How to manually stop R while running code wrapped in try()?

I retrieve data from a server using {httr} in a for loop; sometimes requests time out, which causes an error. Since I want R to continue after errors, I wrapped the function call in try(). However, this makes it impossible for me to stop R manually using the stop button in RStudio: {httr} throws an error and the for loop continues to the next iteration.
What can I do to ensure R stops when I hit 'stop' in RStudio?
example code:
require(httr)
authorlist <- c("alice", "bob", "charlie", "david", "erin", "frank")
datalist <- list()
for (author in authorlist) {
  result <- NULL
  count <- 0
  while (is.null(result) && count < 5) {
    try(
      result <- GET(paste0("https://api.pushshift.io/reddit/search/comment?author=", author),
                    timeout(10))
    )
    count <- count + 1  # without this, the while loop can never give up
  }
  datalist[[length(datalist) + 1]] <- result
}
I expect R to stop processing; but it merely skips to the next loop iteration (of either for-loop).
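One approach worth sketching: tryCatch() can handle the "interrupt" condition separately from errors, so the stop button can break the loop while request errors are still swallowed. Whether the interrupt surfaces as an interrupt condition or as a curl error depends on where it lands, so treat this as a sketch rather than a guarantee:
interrupted <- FALSE
for (author in authorlist) {
  result <- NULL
  count <- 0
  while (is.null(result) && count < 5 && !interrupted) {
    result <- tryCatch(
      GET(paste0("https://api.pushshift.io/reddit/search/comment?author=", author),
          timeout(10)),
      error     = function(e) NULL,                           # swallow and retry
      interrupt = function(i) { interrupted <<- TRUE; NULL }  # stop button pressed
    )
    count <- count + 1
  }
  if (interrupted) break
  datalist[[length(datalist) + 1]] <- result
}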

How do I avoid halting the execution of a standalone r script that encounters an error?

I am running an optimization program I wrote in a multi-language framework. Because I rely on different languages to accomplish the task, everything must be standalone so it can be launched through a batch file. Everything has been going fine for 2-3 months, but I finally ran out of luck when one of the crucial parts of this process, executed through a standalone R script, encountered something new and gave me an error message. This error message makes everything screech to a halt despite my best efforts:
selMEM<-forward.sel(muskfreq, musk.MEM, adjR2thresh=adjR2)
Procedure stopped (adjR2thresh criteria) adjR2cum = 0.000000 with 0 variables (superior to -0.005810)
Error in forward.sel(muskfreq, musk.MEM, adjR2thresh = adjR2) :
No variables selected. Please change your parameters.
I know why I am getting this message: it is warning me that no variables are above the threshold I have programmed to retain during a forward selection. Although this didn't happen in hundreds of runs, it's not that big a deal; I just need to tell R what to do next. This is where I am lost. After an exhaustive search through several posts (such as here), it seems that try() and tryCatch() are the way to go. So I have tried the following:
selMEM <- try(forward.sel(muskfreq, musk.MEM, adjR2thresh = adjR2))
if (inherits(selMEM, "try-error")) {
  max <- 0
  cumR2 <- 0
  adjR2 <- 0
  pvalue <- NA
} else {
  max <- dim(selMEM)[1]
  cumR2 <- selMEM$R2Cum[max]
  adjR2 <- selMEM$AdjR2Cum[max]
  pvalue <- selMEM$pval[max]
}
The code after the problematic line works perfectly if I execute it line by line in R, but when I execute it as a standalone script from the command prompt, I still get the same error message and my whole process screeches to a halt before it executes what follows.
Any suggestions on how to make this work?
Note this in the try help:
try is implemented using tryCatch; for programming, instead of try(expr, silent = TRUE), something like tryCatch(expr, error = function(e) e) (or other simple error handler functions) may be more efficient and flexible.
Look to tryCatch, possibly:
selMEM <- tryCatch({
  forward.sel(muskfreq, musk.MEM, adjR2thresh = adjR2)
}, error = function(e) {
  message(e)
  return(NULL)
})
if (is.null(selMEM)) {
  max <- 0
  cumR2 <- 0
  adjR2 <- 0
  pvalue <- NA
} else {
  max <- dim(selMEM)[1]
  cumR2 <- selMEM$R2Cum[max]
  adjR2 <- selMEM$AdjR2Cum[max]
  pvalue <- selMEM$pval[max]
}
Have you tried setting the silent parameter to TRUE in the try() function?
max <- 0
cumR2 <- 0
adjR2 <- 0
pvalue <- NA
try({
  selMEM <- forward.sel(muskfreq, musk.MEM, adjR2thresh = adjR2)
  max <- dim(selMEM)[1]
  cumR2 <- selMEM$R2Cum[max]
  adjR2 <- selMEM$AdjR2Cum[max]
  pvalue <- selMEM$pval[max]
}, silent = TRUE)

Sink does not release file

I know that the sink() function can be used to divert R output into a file, e.g.
sink('sink-closing.txt')
cat('Hello world!')
sink()
Is there a simple command to close all outstanding sinks?
Below, I elaborate on my question.
Suppose that my R-script opens a sink(), but there is an error in the script which occurs before it closes the sink(). I may run the R-script multiple times while trying to fix the error. Finally, I want to close all the sinks and print to the console. How do I do so?
Finally, in the interest of concreteness, I provide a MWE to illustrate the problem I face.
First, I write an R-script sink-closing.R which has an error in it.
sink('sink-closing.txt')
foo <- function() {
  cat(sprintf('Hello world! My name is %s\n',
              a.variable.that.does.not.exist))
}
foo()
sink()
Next, I source the R-script multiple times, say 3 times by mistake as I try to find and fix the bug.
> source('~/Dropbox/cookbook/r-cookbook/sink-closing.R')
Error in sprintf("Hello world! My name is %s\n", a.variable.that.does.not.exist) :
object 'a.variable.that.does.not.exist' not found
Now, suppose that I am debugging the R-script and want to print to the console. I can call sink() multiple times to close the earlier sinks. If I call it 3 times, then I can finally print to the console as before. But how do I know how many sinks I need to close?
closeAllConnections() closes all open user connections, restoring all sink diversions as it does so.
I'm getting upvotes for this as time goes along, but Simon.S.A.'s and the other answers here are better.
You can use sink.number() to tell you how many diversions are already set, and then call sink() that many times. Putting it into a function, you could have this:
sink.reset <- function() {
  for (i in seq_len(sink.number())) {
    sink(NULL)
  }
}
Based on #mnel's comment:
sinkall <- function() {
  i <- sink.number()
  while (i > 0) {
    sink()
    i <- i - 1
  }
}
Should close all open sinks.
You may also encounter this problem when dealing with devices and plots, where the number of open devices isn't reported anywhere. For a more general case you could use this:
stopWhenError <- function(FUN) {
  tryCatch({
    while (TRUE) {
      FUN()
    }
  }, warning = function(w) {
    print("All finished!")
  }, error = function(e) {
    print("All finished!")
  })
}
stopWhenError(sink) # for sink.
stopWhenError(dev.off) # close all open plotting devices.
EDIT:
sink() throws a warning, not an error, so I've modified the code so that it won't run forever. Whoops!
The most common time I experience this is when an error occurs preventing a sink from closing. For example, the following will leave an open sink after execution.
sink("output.txt")
my_function_that_will_error()
sink()
This can be avoided using on.exit(sink()). This will close the sink "when the current function exits (either naturally or as the result of an error)" (documentation here).
But you do have to change the order:
sink("output.txt")
on.exit(sink())
my_function_that_might_error()
So we create the sink, tell R to close it when it exits, and then execute the code that might error. This will close the sink regardless of whether the code errors or not.
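Since on.exit() is tied to a function call, the pattern is meant to live inside a function body. A minimal sketch (both function names are placeholders):
write_report <- function() {
  sink("output.txt")
  on.exit(sink())  # runs when write_report() exits, normally or via an error
  my_function_that_might_error()
}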
