R: Dealing with functions that sometimes crash the R session?

I have an R function (let's call it MyFunction) that sometimes crashes the R session; most of the time it does not.
I have to apply this function to a large number of objects in a sequential manner:
for (i in 1:nrow(objects)) {
  result[i] <- MyFunction(objects[i])
}
I'm coming from a C# background, where functions rarely crash the "session" and programmers normally surround such calls with try/catch blocks. However, in R I've seen some functions that simply crash the session, and tryCatch is of no help since the function does not throw an exception but causes a full-blown session crash ;-)
Just wondering what's the best way of "catching" the crash.
I'm considering writing a Python script that calls the R function from Python (via one of the R-Python connectors) and catching the R crash in Python. Would that work?
Any advice?
Cheers!

Use the mcparallel function from the parallel package to run the function in a forked process. That way, if it crashes R, only the subprocess crashes, and an error is returned to the main process. If you want to apply this function to a large number of objects and collect the results in a list, use mclapply.
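A minimal sketch of how that might look, assuming a Unix-alike system (forking is not available on Windows); MyFunction and objects are the asker's names:
library(parallel)

# One call in a forked child: if MyFunction crashes, only the child dies.
job <- mcparallel(MyFunction(objects[1]))
res <- mccollect(job)   # one list element per job; a crashed child typically comes back as NULL

# Many objects: with mc.preschedule = FALSE each call gets its own fork,
# so a crash only loses that one result.
results <- mclapply(seq_len(nrow(objects)),
                    function(i) MyFunction(objects[i]),
                    mc.cores = 2, mc.preschedule = FALSE)

# Entries from crashed workers are typically NULL; ordinary R errors come back as "try-error" objects.
crashed <- vapply(results, is.null, logical(1))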

Hello, such behaviour is very rare in my experience. You may not know that there is a debugger that can help you step through your function.
install.packages('debug')              # install the debug package
library(debug)
mtrace(myFunctionToBeDebuged)          # start debugging/tracing the function
mtrace(myFunctionToBeDebuged, FALSE)   # stop tracing the function
NOTE: when you are in the debugger, should you want to quit it, call qqq()

Related

Very simple question on Console vs Script in R

I have just started to learn to code on R, so I apologize for the very simple question. I understand it is best to type your code in as a Script so you can edit and save it. However, when I try to make an object in the script section, it does not work. If I make an object in the console, R saves the object and it appears in my environment. I am typing in a very simple code to try a quick exercise on rolling dice:
die <- 1:6
But it only works in the console and not when typed as a script. Any help/explanation appreciated!
Essentially, you interact with the R environment differently when running an .R script via Rscript.exe, on the console with R.exe, Rterm, etc., and in GUI IDEs like RGui or RStudio. (This applies to any programming language with an interactive interpreter, not just R.)
The script does save the die object in the R environment, but only during the run or lifetime of that script (i.e., from the first to the last line of code). Your line is simply an assignment of an object; you do nothing with it. Apply some function, output results, or take other actions in that script to see it.
On the console, the R environment persists interactively until you quit it with q(), so assigned objects remain for the lifetime of your console session. After assigning, you can then apply functions, output results, or take other actions in line-by-line calls.
Ultimately, a script gathers all the line-by-line code in advance of the run for automated execution, without relying on the user to supply lines. Imagine running 1,000 lines of code with nested if/else or for/while loops and apply functions on the console! Therefore, have all your R coding needs handled in scripts.
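For example, a minimal script (the file name dice.R is just an illustration) that both assigns the object and does something with it will produce visible output when run with Rscript:
# dice.R -- run from a shell with:  Rscript dice.R
die <- 1:6                              # assignment alone prints nothing
roll <- sample(die, 2, replace = TRUE)  # do something with the object
print(roll)                             # explicit print()/cat() is needed in a script
cat("sum of the roll:", sum(roll), "\n")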
It is always better to have a script: as you say, you can save, edit, and correct it without having to rewrite the code just to change a variable or number.
I recommend using RStudio; it is very practical, will help you program more efficiently, and lets you see, among other things, the different objects you have created.

Why does R produce an error when calling quit() inside a function and can this be avoided?

So I was messing around with some R code and found something that seemed interesting to me. Specifically, calling the quit() function directly executes without a problem; however, when I place quit() inside another function, R says the following.
poshQuit <- function() {
  quit()
}
poshQuit()
R session aborted. R encountered a fatal error. The session was terminated.
Now this isn't the end of the world, as I was just idly messing about in code, but I am interested in why this might be the case and whether there are any ways this error can be avoided.
Thanks

Platform-dependent data output

In order to aid myself in displaying debugging information, I decided to create the following tiny function, which dynamically switches between displaying data in RStudio's internal data browser and simple character-based output, depending on the capabilities of the platform where my modules are being sourced:
View <- function(...) {
  if (.Platform$GUI == "RStudio")
    View(...)
  else
    print(...)
}
This function is located, along with other utility functions, in the module <PROJ_HOME>/utils/debug.R. All modules that need these functions include it via source("../utils/debug.R").
Running my project's code on my Amazon EC2 instance's Linux console is fine. However, running it on the same virtual machine via RStudio Server results in the following error message:
Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
Error during wrapup: evaluation nested too deeply: infinite recursion / options(expressions=)?
It seems to me that R gets confused as to which View() function needs to be called. Initially, I assumed that RStudio overloads utils::View(), and I tried to call it explicitly, but that failed. Then I thought that RStudio somehow defines its own View() implementation in the global environment, so that it just needs to be called as View() without a package/library reference. However, as you can see, that doesn't work either. Another potential cause of the error might be that I overestimated the "smartness" of R in terms of my use of ... arguments.
So, what is wrong and how do I fix it?
RStudio hooks the View function. One approach might be to see if anyone has overridden the View function from utils, and to call the override if it exists. What about this?
View <- if (identical(utils::View, View)) print else View
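A variation on the same idea, as a rough sketch: resolve whichever View is active at the time utils/debug.R is sourced and dispatch to it, so the wrapper never calls itself (the helper name .view_impl is made up here):
# Resolve the target once at source time: RStudio's hooked View if present,
# otherwise fall back to plain character output.
.view_impl <- if (identical(utils::View, View)) print else View
View <- function(...) .view_impl(...)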

Printing from mclapply in RStudio

I am using mclapply from within RStudio and would like to have output to the console from each process, but this seems to be suppressed somehow (as mentioned, for example, here: Is mclapply guaranteed to return its results in order?).
How could I get RStudio to print something like
x <- mclapply(1:20, function(i) cat(i, "\n"))
to the console?
I've tried print(), cat(), and write(), but none of them seem to work. I also tried setting mc.silent = FALSE explicitly, without any effect.
Parallel processing with GUIs is problematic. I write a lot of parallel code, and it's constantly crashing my colleague's computer because he insists on using RStudio instead of console R.
From what I read, RStudio "does not propagate the output of forked processes to the RStudio console. If you are doing this, it is best to start R via a shell."
This makes sense as a workaround for the RStudio people, because parallel processing typically breaks GUIs when people try to output to the GUI from a bunch of different processes. It works in the console (albeit often not in order), but parallel-processing gurus will pinch their noses when they hear about any I/O from a forked thread.
If you must have output from forked threads, save it in a string and return it, then collect and output from the main process, as sketched below. Or just use a console for your parallel runs. What I tell my colleague is to do all his debugging and development in RStudio using lapply(), then switch to a console for the real run.
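A minimal sketch of that collect-then-print pattern, reusing the asker's toy example (the message text is arbitrary):
library(parallel)

# Workers return their messages as strings instead of printing them.
out <- mclapply(1:20, function(i) sprintf("finished task %d", i), mc.cores = 2)

# Back in the main process, print everything (in order) to the console.
cat(unlist(out), sep = "\n")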
Here's a workaround which uses shell echo to print to the R console in RStudio:
#' Prints a message using shell echo; useful for printing messages from
#' inside mclapply when running in RStudio
message_parallel <- function(...) {
  system(sprintf('echo "\n%s\n"', paste0(..., collapse = "")))
}
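Used inside the worker function, it might look like this:
x <- mclapply(1:20, function(i) {
  message_parallel("processing item ", i)   # echoed to the RStudio console via the shell
  i^2
}, mc.cores = 2)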
Just expanding a little on the solution used by the asker, i.e. writing to a file to check progress:
write.file <- '/temp_output/R_progress'
n.total <- 1000000   # total number of iterations, used for the progress message
time1 <- proc.time()[3]
outstuff <- unlist(mclapply(1:n.total, function(i) {
  if (i %% 1000 == 0) {
    # overwrite the progress file with the current position and percentage
    file.create(write.file)
    fileConn <- file(write.file)
    writeLines(paste0(i, '/', n.total, ' ', (i / n.total * 100)), fileConn)
    close(fileConn)
  }
  # do your stuff here
}, mc.cores = 6))
print(proc.time()[3] - time1)
And then you can monitor from a console with
tail -c +0 -f '/temp_output/R_progress'

getURL gets stuck, need a wait function

I am trying to use R to surf the web, but I have a strange problem. Let's say that I have a list named URLlist containing some URLs. Here is my code:
for (k in 1:length(URLlist)) {
  temp <- getURL(URLlist[k])
}
I don't know why, but at some random URL, R blocks. It has nothing to do with the URL itself, as it can work on one execution of the for loop but not on another for the same URL. I think the loop is going too fast and the download of data doesn't keep up, so I was thinking of making the code wait for one second before each new call of the getURL function, but I didn't find such a wait function.
Any ideas, please? Thank you!
?Sys.sleep()
Description:
Suspend execution of R expressions for a given number of seconds
Usage:
Sys.sleep(time)
Arguments:
time: The time interval to suspend execution for, in seconds.
Whether or not this will solve your problem is another issue.
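If you want to try it, a rough sketch of your loop with a one-second pause might look like this (assuming getURL comes from RCurl, as in the question; the tryCatch is an extra safeguard so that one bad request does not stop the run):
library(RCurl)

results <- vector("character", length(URLlist))
for (k in seq_along(URLlist)) {
  results[k] <- tryCatch(getURL(URLlist[k]),
                         error = function(e) NA_character_)
  Sys.sleep(1)   # wait one second before the next request
}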
I would suggest looking at the XML package and using htmlParse() to surf the web with R, since there are rarely instances where you want HTML returned as plain text.
