I am writing a script that has to run on its own (a cron job), and it needs to connect to a MySQL database and perform some operations. I am using the RMySQL package to do so.
Since it has to run automatically, I'd like the script to recursively test whether the connection was successful by checking whether there are actually any tables in the connection. So far I have tried the following, with no luck:
library(RMySQL)
con <- dbConnect(MySQL(), dbname="test")
FCSQL <- function(connection) {
  if (length(dbListTables(connection)) == 0) {
    con <- dbConnect(MySQL(), dbname = "test")
    connection <- FCSQL(connection)
    FCSQL(connection)
  } else {
    return("ON")
  }
}
FCSQL(con)
#the error for the previous example:
#Error in .local(conn, statement, ...) :
#connection with pending rows, close resultSet before continuing
A slight modification with the same error:
FCSQL <- function(connection) {
  if (length(dbListTables(connection)) == 0) {
    con <- dbConnect(MySQL(), dbname = "test")
    FCSQL(con)
  } else {
    return("ON")
  }
}
FCSQL(con)
#Error in .local(conn, statement, ...) :
#connection with pending rows, close resultSet before continuing
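For what it's worth, a non-recursive way to attempt the same check is a bounded retry loop that treats any failure while connecting or listing tables as "not ready yet". This is only a sketch; the number of attempts, the wait time, and the dbname are placeholders, not taken from the original script.
library(DBI)
library(RMySQL)
connect_with_retry <- function(attempts = 5, wait = 10) {
  for (i in seq_len(attempts)) {
    # an error while connecting is swallowed and counted as a failed attempt
    con <- tryCatch(dbConnect(MySQL(), dbname = "test"), error = function(e) NULL)
    if (!is.null(con)) {
      ok <- tryCatch(length(dbListTables(con)) > 0, error = function(e) FALSE)
      if (ok) return(con)   # usable connection with at least one table
      dbDisconnect(con)     # close the unusable connection before retrying
    }
    Sys.sleep(wait)         # wait before the next attempt
  }
  stop("Could not obtain a working MySQL connection.")
}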
Sometimes I run into an issue where a database query generates an error. One example is:
nanodbc/nanodbc.cpp:3069: 07009: [Microsoft][ODBC SQL Server Driver]Invalid Descriptor Index
I know why the error occurs, but I can't seem to catch it so that I can try something else when this happens.
result <- tryCatch(
  data <- tbl(conn, query),
  error = function(e) {
    print("Error encountered: ", e)
    print("Attempting to run by sorting the columns")
    new_query <- create_query_with_column_names(query)
    print("Attempting to fetch the data with the new query")
    data <- tbl(conn, new_query)
    end_time <- Sys.time()
    show_query_runtime(total_time = end_time - start_time, caller = "fetch data without lazy loading.")
  }
)
Instead, the code runs without error, but when I print result, I get the error again.
> result
Error in result_fetch(res@ptr, n) :
nanodbc/nanodbc.cpp:3069: 07009: [Microsoft][ODBC SQL Server Driver]Invalid Descriptor Index
Warning message:
In dbClearResult(res) : Result already cleared
The above code won't catch the error. Why? How can I fix this?
Take a look at this answer for detailed guidance on tryCatch in R.
The problem is most likely how you are returning values.
If the try expression executes correctly, tryCatch returns its value. If it throws an error, the error handler runs, and tryCatch returns the value of the last statement in that handler.
Right now, the last statement in your error handler is show_query_runtime(...), but what it looks like you want returned is tbl(conn, new_query).
Try the following; note the use of return() to specify the value that should be returned:
result <- tryCatch(
  # try section
  data <- tbl(conn, query),
  # error section
  error = function(e) {
    message("Error encountered: ", conditionMessage(e))
    message("Attempting to run by sorting the columns")
    new_query <- create_query_with_column_names(query)
    message("Attempting to fetch the data with the new query")
    data <- tbl(conn, new_query)
    end_time <- Sys.time()
    show_query_runtime(total_time = end_time - start_time, caller = "fetch data without lazy loading.")
    return(data)
  }
)
In case it is part of the confusion: assigning data <- tbl(conn, new_query) inside a function does not make that assignment in the calling environment, so data is not available once the tryCatch finishes. This is why we need to return the result out of the error handler.
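A tiny illustration of that point, unrelated to databases: the last value in the handler becomes the value of tryCatch(), while assignments inside the handler stay local to it.
x <- tryCatch(
  stop("boom"),
  error = function(e) {
    y <- 42  # local to the handler; gone once tryCatch() returns
    y        # this value is what tryCatch() returns
  }
)
x
# [1] 42
exists("y")
# [1] FALSE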
In my project I have a lot of functions that work with the database and look like this:
some_function <- function(smth) {
  con <- dbConnect(SQLite(), db)
  # -- something --
  dbDisconnect(con)
  return(smth)
}
Is there any way to reduce the code and write something like a Python decorator that handles connecting to and disconnecting from the db?
Like this:
#connect-disconnect
some_function <- function(smth) {
  # -- something --
}
Or maybe there is another way to do it?
A function operator
with_con <- function(f, db) {
  function(...) {
    con <- dbConnect(SQLite(), db)
    on.exit(dbDisconnect(con))  # registered before calling f, so the connection is closed even if f errors
    f(..., con = con)           # the result of f is the return value of the wrapper
  }
}
some_func <- function(query, con) {
  # do something with the query and the connection
  dbGetQuery(con, query)
}
connected_func <- with_con(some_func, db)
connected_func('SELECT * FROM table')
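An expression-style variant of the same idea (a sketch only; with_db is a made-up helper, not from any package) opens a connection, hands it to a function, and guarantees the disconnect via on.exit even if that function throws an error:
library(DBI)
library(RSQLite)
with_db <- function(db, f) {
  con <- dbConnect(SQLite(), db)
  on.exit(dbDisconnect(con))  # runs on normal return and on error
  f(con)
}
# usage:
# with_db("my.sqlite", function(con) dbGetQuery(con, "SELECT * FROM my_table"))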
I'm fairly new to R, so forgive me if this is an amateur question. I still don't get parts of how the R language works, and I haven't used closures enough to build intuition on how to approach this problem.
I want to wrap up opening and closing a database connection in my R project in a clean way. I have a variety of scripts that all use a common DB connection configuration file (I don't put it in my repo; it's a local file only) and all need to connect to the same MySQL database.
The end goal is to do something like:
query <- db_open()
out <- query("select * from example limit 10")
db_close()
This is what I wrote so far (all my scripts load these functions from another .R file):
db_open <- function() {
  db_close()
  db_conn <<- dbConnect(MySQL(), user = db_user, password = db_pass, host = db_host)
  query <- function(...) { dbGetQuery(db_conn, ...) }
  return(query)
}

db_close <- function() {
  result <- tryCatch({
    dbDisconnect(db_conn)
  }, warning = function(w) {
    # ignore
  }, error = function(e) {
    return(FALSE)
  })
  return(result)
}
I'm probably thinking of this in an OOP way when I shouldn't be, but sticking db_conn in the global environment feels unnecessary or even wrong.
Is this a reasonable way to accomplish what I want? Is there a better way that I'm missing here?
Any advice is appreciated.
You basically had it; you just need to move the query function out into its own function. As for keeping db_conn, there is really no reason not to have it in the global environment.
db_open <- function() {
  db_close()
  db_conn <<- dbConnect(MySQL(), user = 'root', password = 'Use14Characters!',
                        dbname = 'msdb_complex', host = 'localhost')
}

db_close <- function() {
  result <- tryCatch({
    dbDisconnect(db_conn)
  }, warning = function(w) {
    # ignore
  }, error = function(e) {
    return(FALSE)
  })
  return(result)
}

query <- function(x, num = -1) {
  q <- dbSendQuery(db_conn, x)
  s <- fetch(q, num)
  dbClearResult(q)  # release the result set so the connection has no pending rows
  s
}
Then you should be able to do something like:
db_open()
results <- query("SELECT * FROM msenrollmentlog", 10)
db_close()
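If you would still rather keep db_conn out of the global environment, one alternative (a sketch, not part of the answer above; make_db is a made-up name) is to hold the connection in a closure and return query/close functions that share it:
library(DBI)
library(RMySQL)
make_db <- function(...) {
  db_conn <- dbConnect(MySQL(), ...)  # lives in this function's environment, not the global one
  list(
    query = function(sql) dbGetQuery(db_conn, sql),
    close = function() dbDisconnect(db_conn)
  )
}
# usage:
# db <- make_db(user = db_user, password = db_pass, host = db_host)
# out <- db$query("select * from example limit 10")
# db$close()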
I'm using RPostgreSQL and sqldf inside my function like this:
MyFunction <- function(Connection) {
  options(sqldf.RPostgreSQL.user = Connection[1],
          sqldf.RPostgreSQL.password = Connection[2],
          sqldf.RPostgreSQL.dbname = Connection[3],
          sqldf.RPostgreSQL.host = Connection[4],
          sqldf.RPostgreSQL.port = Connection[5])
  # ... some sqldf() stuff
}
How do I test that the connection is valid?
You can check that an existing connection is valid using isPostgresqlIdCurrent.
conn <- dbConnect(PostgreSQL(), your_database_details)
isPostgresqlIdCurrent(conn)
For testing new connections, I don't think that there is a way to know if a connection is valid without trying it. (How would R know that the database exists and is available until it tries to connect?)
For most analysis purposes, just stopping on an error and fixing the login details is the best approach. So just call dbConnect and don't worry about extra check functions.
If you are creating some kind of application where you need to handle errors gracefully, a simple tryCatch wrapper should do the trick.
conn <- tryCatch(dbConnect(wherever), error = function(e) do_something)
My current design uses tryCatch:
Connection <- c('usr','secret','db','host','5432')
CheckDatabase <- function(Connection) {
  require(sqldf)
  require(RPostgreSQL)
  options(sqldf.RPostgreSQL.user = Connection[1],
          sqldf.RPostgreSQL.password = Connection[2],
          sqldf.RPostgreSQL.dbname = Connection[3],
          sqldf.RPostgreSQL.host = Connection[4],
          sqldf.RPostgreSQL.port = Connection[5])
  out <- tryCatch(
    {
      sqldf("select TRUE;")
    },
    error = function(cond) FALSE
  )
  return(out)
}

if (!CheckDatabase(Connection)) {
  stop("Not a valid PostgreSQL connection.")
} else {
  message("PostgreSQL connection is valid.")
}
One approach is to simply try executing the code and catch any errors with a nice, informative error message. Have a look at the documentation of tryCatch to see the details of how this works.
The following blog post provides an introduction to the exception-based style of programming.
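For example, a minimal sketch of that approach (the connection details are placeholders):
library(DBI)
library(RPostgreSQL)
con <- tryCatch(
  dbConnect(PostgreSQL(), dbname = "db", host = "host",
            user = "usr", password = "secret"),
  error = function(e) {
    stop("Could not connect to the database: ", conditionMessage(e))
  }
)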
I'm using a proprietary library that has an "openConnection" function that I use as such:
conn <- openConnection("user", "pass")
# do some stuff with 'conn' that may return early or throw exceptions
closeConnection(conn)
What's the R idiom for making sure that the connection gets closed no matter how the current function exits? In C++ it would be RAII; in Java it would probably be a "finally" block. What is it in R?
Typically, just a call to on.exit is used, but you need to do it inside a function.
f <- function() {
  conn <- openConnection("user", "pass")
  on.exit(closeConnection(conn))
  # use conn...
  readLines(conn)
} # on.exit is run here...
A common case is when you get passed a connection or file name, and you should only create (and close) the connection if you're given a file name:
myRead <- function(file) {
  conn <- file
  if (!inherits(file, "connection")) {
    conn <- file(file, "r")
    on.exit(close(conn))
  } # else just use the connection...
  readLines(conn)
} # on.exit runs here...
# Try it out:
cat("hello\nworld\n", file="foo.txt")
myRead("foo.txt") # file
myRead(stdin()) # connection
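One detail worth knowing when you combine this with other cleanup: a second plain on.exit() call replaces the first, so register additional expressions with add = TRUE. A small sketch using the question's openConnection/closeConnection:
f <- function() {
  conn <- openConnection("user", "pass")
  on.exit(closeConnection(conn))
  tmp <- tempfile()
  on.exit(unlink(tmp), add = TRUE)  # without add = TRUE this would replace the handler above
  # ... work with conn and tmp ...
}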