How to do RAII (or similar resource management) in R

I'm using a proprietary library that has an "openConnection" function that I use as such:
conn <- openConnection("user", "pass")
# do some stuff with 'conn' that may return early or throw exceptions
closeConnection(conn)
What's the R idiom for making sure that the connection gets closed no matter how the current function exits? In C++ it would be RAII; in Java it would probably be a "finally" block. What is it in R?

Typically, just a call to on.exit is used, but you need to do it inside a function.
f <- function() {
  conn <- openConnection("user", "pass")
  on.exit(close(conn))
  # use conn...
  readLines(conn)
} # on.exit runs here...
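Note that by default a second call to on.exit replaces the first; to stack several cleanups, pass add = TRUE. A minimal sketch (acquireLock and releaseLock are hypothetical placeholders for a second resource):
f <- function() {
  conn <- openConnection("user", "pass")
  on.exit(close(conn))
  lock <- acquireLock()                   # hypothetical second resource
  on.exit(releaseLock(lock), add = TRUE)  # keep the earlier handler too
  # use conn and lock...
}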
A common case is when you get passed a connection or file name, and you should only create (and close) the connection if you're given a file name:
myRead <- function(file) {
  conn <- file
  if (!inherits(file, "connection")) {
    conn <- file(file, "r")
    on.exit(close(conn))
  } # else just use the connection...
  readLines(conn)
} # on.exit runs here...
# Try it out:
cat("hello\nworld\n", file="foo.txt")
myRead("foo.txt") # file
myRead(stdin()) # connection
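If you want the Java "finally" analogy to be literal, base R's tryCatch also accepts a finally argument that runs whether or not the body throws. A minimal sketch using the question's own functions:
conn <- openConnection("user", "pass")
tryCatch({
  # do some stuff with 'conn' that may return early or throw exceptions
  readLines(conn)
}, finally = closeConnection(conn))
on.exit is usually preferred because it keeps the cleanup next to the acquisition instead of indenting the whole body.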


Logging console history with errors in R or Rstudio

For educational purposes we are logging all commands that students type in the RStudio console during labs. In addition we would like to record whether each call was successful or raised an error, to identify students who are struggling to get the syntax right.
The best I can come up with is something like this:
options(error = function() {
  timestamp("USER ERROR", quiet = TRUE)
})
This adds an ## ERROR comment to the history log when an exception occurs. That way we could analyze history files to see which commands were followed by an ## ERROR comment.
However, R's internal history system is not well suited for logging because it is in-memory, limited in size, and needs to be saved manually with savehistory(). Also I would prefer to store the log one line per call, i.e. escape line breaks for multi-line commands.
Is there perhaps a hook in the R or RStudio console for logging the actual executed commands? That would allow me to insert each evaluated expression (and any error) into a database along with a username and timestamp.
A possible solution would be to use addTaskCallback or the taskCallbackManager with a function that writes each top-level command to your database. The callback will only fire on the successful completion of a command, so you would still need to call a logging function on an error.
# error handler
logErr <- function() {
  # turn the logging callback off while we process errors separately
  tcbm$suspend(TRUE)
  # turn it back on when we're done
  on.exit(tcbm$suspend(FALSE))
  sc <- sys.calls()
  sclen <- length(sc)  # the last call is this function call
  if (sclen > 1L) {
    cat("myError:\n", do.call(paste, c(lapply(sc[-sclen], deparse), sep = "\n")), "\n")
  } else {
    # syntax error, so no call stack;
    # show the last line entered
    # (this won't be helpful if it's a parse error in a function)
    file1 <- tempfile("Rrawhist")
    savehistory(file1)
    rawhist <- readLines(file1)
    unlink(file1)
    cat("myError:\n", rawhist[length(rawhist)], "\n")
  }
}
options(error = logErr)
# top-level callback handler
log <- function(expr, value, ok, visible) {
  cat(deparse(expr), "\n")
  TRUE
}
tcbm <- taskCallbackManager()
tcbm$add(log, name = "log")
This isn't a complete solution, but I hope it gives you enough to get started. Here's an example of what the output looks like.
> f <- function() stop("error")
f <- function() stop("error")
> hi
Error: object 'hi' not found
myError:
hi
> f()
Error in f() : error
myError:
f()
stop("error")
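To persist the log rather than print it, the callback body can append one escaped, timestamped line per call to a file; a sketch (the commands.log name and tab-separated layout are just illustrative):
log <- function(expr, value, ok, visible) {
  line <- paste(deparse(expr), collapse = "\\n")  # escape line breaks
  entry <- sprintf("%s\t%s\t%s", Sys.info()[["user"]], format(Sys.time()), line)
  cat(entry, "\n", file = "commands.log", append = TRUE)
  TRUE
}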

Invoke interrupt from R code

I have a generic function to catch all exceptions, included in my package as logR::tryCatch2, defined as:
tryCatch2 <- function(expr) {
  V = E = W = M = I = NULL
  e.handler = function(e) {
    E <<- e
    NULL
  }
  w.handler = function(w) {
    W <<- c(W, list(w))
    invokeRestart("muffleWarning")
  }
  m.handler = function(m) {
    attributes(m$call) <- NULL
    M <<- c(M, list(m))
  }
  i.handler = function(i) {
    I <<- i
    NULL
  }
  V = suppressMessages(withCallingHandlers(
    tryCatch(expr, error = e.handler, interrupt = i.handler),
    warning = w.handler,
    message = m.handler
  ))
  list(value = V, error = E, warning = W, message = M, interrupt = I)
}
As you can see from the last line, it returns a list which is more or less self-describing.
It defers the actual reaction to the exceptions until after the tryCatch2 call, using simple !is.null checks:
f = function(){ warning("warn1"); warning("warn2"); stop("err") }
r = tryCatch2(f())
if(!is.null(r$error)) cat("Error detected\n")
# Error detected
if(!is.null(r$warning)) cat("Warning detected, count", length(r$warning), "\n")
# Warning detected, count 2
It works as expected, and I can react with my own code. But in some cases I would like not to stop the interrupt process, which is caught too. At the moment it seems I would need to add an additional parameter to tryCatch2 controlling whether interrupts should be caught or not. So the question asks about some invokeInterrupt function which I could use in the following way:
g = function(){ Sys.sleep(60); f() }
r = tryCatch2(g())
# interrupt by pressing ctrl+c / stop while function is running!
if(!is.null(r$interrupt)) cat("HERE I would like to invoke interrupt\n")
# HERE I would like to invoke interrupt
I think that if R is able to catch an interrupt, it should also be able to invoke one.
How can I achieve this invokeInterrupt functionality?
I can propose a partial solution, which relies on the tools package.
invokeInterrupt <- function() {
  require(tools)
  processId <- Sys.getpid()
  pskill(processId, SIGINT)
}
However, be aware that throwing the interrupt signal (SIGINT) with pskill doesn't appear to be very robust. I ran a few tests by sending the exception and catching it with your function, like so:
will_interrupt <- function() {
  Sys.sleep(3)
  invokeInterrupt()
  Sys.sleep(3)
}
r = tryCatch2(will_interrupt())
On Linux, this worked well when executed from the R command line. On Windows, the R command line and R GUI closed when executing this code. Worse still: on both Linux and Windows, this code crashed RStudio instantly...
So, if your code is to be executed from the R command line on Linux, this solution should be OK. Otherwise you might be out of luck...
Late answer, but I have found that rlang::interrupt can throw "user interrupts":
interrupt() allows R code to simulate a user interrupt of the kind that is signalled with Ctrl-C.
It is currently not possible to create custom interrupt condition objects.
Source: ?rlang::interrupt
Internally it calls the R API function Rf_onintr which is an alias for the function onintr.
Basically an interrupt is "just" a special condition with the classes interrupt and condition (see the R source code).
If you just want to simulate an interrupt to test tryCatching (without needing to interrupt a running R statement), it suffices to signal a condition with these classes via signalCondition:
interrupt_condition <- function() {
  structure(list(), class = c("interrupt", "condition"))
}
tryCatch(signalCondition(interrupt_condition()),
         interrupt = function(x) print("interrupt detected"))
# [1] "interrupt detected"
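As a quick check (a sketch assuming the rlang package is installed), an interrupt signalled this way is caught by tryCatch2's interrupt slot:
r <- tryCatch2({ rlang::interrupt(); "never reached" })
if (!is.null(r$interrupt)) cat("interrupt caught\n")
# interrupt caught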

Clean way to wrap-up and handle RMySQL connections?

I'm fairly new to R, so forgive me if this is an amateur question. I still don't get parts of how the R language works, and I haven't used closures enough to really build intuition on how to approach this problem.
I want to wrap up opening and closing a database connection in my R project in a clean way. I have a variety of scripts set aside that all use a common DB connection configuration file (I don't put it in my repo, it's a local file only), all of which need to connect to the same MySQL database.
The end goal is to do something like :
query <- db_open()
out <- query("select * from example limit 10")
db_close()
This is what I wrote so far (all my scripts load these functions from another .R file):
db_open <- function() {
  db_close()
  db_conn <<- dbConnect(MySQL(), user = db_user, password = db_pass, host = db_host)
  query <- function(...) { dbGetQuery(db_conn, ...) }
  return(query)
}
db_close <- function() {
  result <- tryCatch({
    dbDisconnect(db_conn)
  }, warning = function(w) {
    # ignore
  }, error = function(e) {
    return(FALSE)
  })
  return(result)
}
I'm probably thinking of this in an OOP way when I shouldn't be, but sticking db_conn in the global environment feels unnecessary or even wrong.
Is this a reasonable way to accomplish what I want? Is there a better way that I'm missing here?
Any advice is appreciated.
You basically had it; you just need to pull query out into its own function. As for db_conn, there really is no reason not to keep it in the global environment.
db_open <- function() {
  db_close()
  db_conn <<- dbConnect(MySQL(), user = 'root', password = 'Use14Characters!', dbname = 'msdb_complex', host = 'localhost')
}
db_close <- function() {
  result <- tryCatch({
    dbDisconnect(db_conn)
  }, warning = function(w) {
    # ignore
  }, error = function(e) {
    return(FALSE)
  })
  return(result)
}
query <- function(x, num = -1) {
  q <- dbSendQuery(db_conn, x)
  s <- fetch(q, num)
  dbClearResult(q)  # release the result set before returning
  s
}
Then you should be able to do something like:
db_open()
results <- query("SELECT * FROM msenrollmentlog", 10)
db_close()
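As a hedged alternative sketch tying back to the on.exit idiom above (db_user, db_pass, and db_host come from your configuration file), you can avoid the global connection entirely by passing the work in as a function:
with_db <- function(f) {
  conn <- dbConnect(MySQL(), user = db_user, password = db_pass, host = db_host)
  on.exit(dbDisconnect(conn))  # closed no matter how f() exits
  f(function(...) dbGetQuery(conn, ...))
}
out <- with_db(function(query) {
  query("select * from example limit 10")
})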

Check that connection is valid

I'm using RPostgreSQL and sqldf inside my function like this:
MyFunction <- function(Connection) {
  options(sqldf.RPostgreSQL.user = Connection[1],
          sqldf.RPostgreSQL.password = Connection[2],
          sqldf.RPostgreSQL.dbname = Connection[3],
          sqldf.RPostgreSQL.host = Connection[4],
          sqldf.RPostgreSQL.port = Connection[5])
  # ... some sqldf() stuff
}
How do I test that the connection is valid?
You can check that an existing connection is valid using isPostgresqlIdCurrent.
conn <- dbConnect("RPgSQL", your_database_details)
isPostgresqlIdCurrent(conn)
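As an aside, newer DBI-based drivers (RPostgres, for example, rather than RPostgreSQL) implement the generic DBI::dbIsValid, which gives a driver-agnostic check; a sketch assuming such a driver:
DBI::dbIsValid(conn)
# TRUE while the connection is usable, FALSE after dbDisconnect(conn)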
For testing new connections, I don't think that there is a way to know if a connection is valid without trying it. (How would R know that the database exists and is available until it tries to connect?)
For most analysis purposes, just stopping on an error and fixing the login details is the best approach. So just call dbConnect and don't worry about extra check functions.
If you are creating some kind of application where you need to handle errors gracefully, a simple tryCatch wrapper should do the trick.
conn <- tryCatch(dbConnect(wherever), error = function(e) do_something)
My current design uses tryCatch:
Connection <- c('usr', 'secret', 'db', 'host', '5432')
CheckDatabase <- function(Connection) {
  require(sqldf)
  require(RPostgreSQL)
  options(sqldf.RPostgreSQL.user = Connection[1],
          sqldf.RPostgreSQL.password = Connection[2],
          sqldf.RPostgreSQL.dbname = Connection[3],
          sqldf.RPostgreSQL.host = Connection[4],
          sqldf.RPostgreSQL.port = Connection[5])
  out <- tryCatch(
    {
      sqldf("select TRUE;")
    },
    error = function(cond) FALSE
  )
  return(out)
}
if (!CheckDatabase(Connection)) {
  stop("Not a valid PostgreSQL connection.")
} else {
  message("PostgreSQL connection is valid.")
}
One approach is to simply try executing the code, and catching any errors with a nice informative error message. Have a look at the documentation of tryCatch to see the details regarding how this works.
The following blog post provides an introduction to the exception-based style of programming.

Garbage collection com object in R

I want to be able to open an Excel session from R, write to it, and then close the Excel session from R. While I can do this all from within the same function, I am trying to generalize the cleanup code for Excel. However, somehow when I call gc() from a function, passing in the Excel object, it does not garbage collect. Below is the code:
opentest <- function() {
  excel <- comCreateObject("Excel.Application")
  comSetProperty(excel, "Visible", T)
  comSetProperty(excel, "DisplayAlerts", FALSE)
  comSetProperty(excel, "SheetsInNewWorkbook", 1)
  wb <- comGetProperty(excel, "Workbooks")
  wb <- comInvoke(wb, "Add")
  excel
}
cleanupexcel <- function(excelobj) {
  comInvoke(excelobj, "Quit")
  rm(excelobj, envir = globalenv())
  eapply(env = globalenv(), gc)
}
With the following calls to the function:
excelobj<- opentest()
cleanupexcel(excelobj)
When I call the two functions above, I can still see the Excel session running in my task manager. However, if I call gc() after returning from cleanupexcel(), it kills the Excel session successfully.
Any ideas on how I can garbage-collect successfully from a generic function, or is there some other issue here?
Here's a small change to your code that should work (I'm on Linux now, so I can't test it).
The main fix is to wrap the Excel instance in an environment and return that instead.
The cleanup function can then access the instance and remove it (ensuring no reference to it remains) before calling gc():
opentest <- function() {
  excel <- comCreateObject("Excel.Application")
  comSetProperty(excel, "Visible", T)
  comSetProperty(excel, "DisplayAlerts", FALSE)
  comSetProperty(excel, "SheetsInNewWorkbook", 1)
  wb <- comGetProperty(excel, "Workbooks")
  wb <- comInvoke(wb, "Add")
  # wrap excel in an environment
  env <- new.env(parent = emptyenv())
  env$instance <- excel
  env
}
cleanupexcel <- function(excel) {
  comInvoke(excel$instance, "Quit")
  rm("instance", envir = excel)
  gc()
}
myexcel <- opentest()
cleanupexcel(myexcel)
Note that your old code requires the variable to be named "excelobj", since you remove it by name from within the cleanupexcel function. That's not great.
OK, there are very subtle issues at play, so here's a reproducible example without Excel:
opentest <- function() {
  excel <- new.env()
  reg.finalizer(excel, function(x) { cat("FINALIZING EXCEL!\n") }, FALSE)
  # wrap excel in an environment
  env <- new.env(parent = emptyenv())
  env$instance <- excel
  env
}
cleanupexcel <- function(excel) {
  print(excel$instance)  # cat() cannot handle environments
  rm("instance", envir = excel)
  gc()
}
myexcel <- opentest()
cleanupexcel(myexcel)
# Prints "FINALIZING EXCEL!"
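Building on that pattern, you could also register a finalizer on the wrapper environment itself, so that Quit is invoked even if the caller forgets cleanupexcel(). A hedged sketch (comCreateObject and comInvoke are the same COM bindings used above):
opentest <- function() {
  excel <- comCreateObject("Excel.Application")
  env <- new.env(parent = emptyenv())
  env$instance <- excel
  # runs when the wrapper is garbage collected or when R exits
  reg.finalizer(env, function(e) {
    if (!is.null(e$instance)) comInvoke(e$instance, "Quit")
  }, onexit = TRUE)
  env
}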
