Check that connection is valid - r

I'm using RPostgreSQL and sqldf inside my function like this:
MyFunction <- function(Connection) {
options(sqldf.RPostgreSQL.user = Connection[1],
sqldf.RPostgreSQL.password = Connection[2],
sqldf.RPostgreSQL.dbname = Connection[3],
sqldf.RPostgreSQL.host = Connection[4],
sqldf.RPostgreSQL.port = Connection[5])
# ... some sqldf() stuff
}
How do I test that the connection is valid?

You can check that an existing connection is valid using isPostgresqlIdCurrent from the RPostgreSQL package.
conn <- dbConnect(PostgreSQL(), your_database_details)
isPostgresqlIdCurrent(conn)
For testing new connections, I don't think that there is a way to know if a connection is valid without trying it. (How would R know that the database exists and is available until it tries to connect?)
For most analysis purposes, just stopping on an error and fixing the login details is the best approach. So just call dbConnect and don't worry about extra check functions.
If you are creating some kind of application where you need to handle errors gracefully, a simple tryCatch wrapper should do the trick.
conn <- tryCatch(dbConnect(wherever), error = function(e) do_something)
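Spelled out a little more, a minimal sketch with RPostgreSQL might look like this (the connection details are placeholders):
library(RPostgreSQL)
conn <- tryCatch(
  dbConnect(PostgreSQL(), user = "usr", password = "secret",
            dbname = "db", host = "host", port = 5432),
  error = function(e) {
    message("Could not connect: ", conditionMessage(e))
    NULL  # signal failure to the caller
  }
)
if (is.null(conn)) stop("Fix the login details and try again.")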

My current design uses tryCatch:
Connection <- c('usr','secret','db','host','5432')
CheckDatabase <- function(Connection) {
require(sqldf)
require(RPostgreSQL)
options(sqldf.RPostgreSQL.user = Connection[1],
sqldf.RPostgreSQL.password = Connection[2],
sqldf.RPostgreSQL.dbname = Connection[3],
sqldf.RPostgreSQL.host = Connection[4],
sqldf.RPostgreSQL.port = Connection[5])
out <- tryCatch(
{
sqldf("select TRUE;")
},
error = function(cond) {
FALSE
}
)
return(out)
}
if (!CheckDatabase(Connection)) {
stop("Not valid PostgreSQL connection.")
} else {
message("PostgreSQL connection is valid.")
}

One approach is simply to try executing the code and catch any errors with a nice, informative error message. Have a look at the documentation of tryCatch to see the details of how this works.
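For example, a minimal sketch of that structure, reusing the sqldf check and connection options set up above (the handler messages are just illustrative):
ok <- tryCatch({
  sqldf("select TRUE;")  # the call that may fail
  TRUE
}, warning = function(w) {
  message("Database check raised a warning: ", conditionMessage(w))
  FALSE
}, error = function(e) {
  message("Could not reach the database: ", conditionMessage(e))
  FALSE
})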
The following blog post provides an introduction to the exception-based style of programming.

Related

Force stop query in R

I am using the odbc package in R to get data from SQL Server.
Recently I hit an issue: for some unknown reason, my query may take hours to return results from the SQL Server. It was fine before. The returned data is only 10,000 rows, and my data team colleagues haven't figured out the reason yet. My old code was:
getSqlData = function(server, sqlstr){
con = odbc::dbConnect(odbc(),
Driver = "SQL Server",
Server = server,
Trusted_Connection = "True")
result = odbc::dbGetQuery(con, sqlstr)
dbDisconnect(con)
return(result)
}
At first, I was trying to find a timeout parameter for dbGetQuery(). Unfortunately, there is no such parameter for this function. So I decided to monitor the runtime myself.
library(R.utils)  # provides withTimeout()
getSqlData = function(server, sqlstr){
con = odbc::dbConnect(odbc(),
Driver = "SQL Server",
Server = server,
Trusted_Connection = "True")
result = tryCatch(
{
a = withTimeout(odbc::dbGetQuery(con, sqlstr), timeout = 600, onTimeout = "error")
return(a)
},
error=function(cond) {
msg <- "The query timed out:"
msg <- paste(msg,sqlstr,sep = " ")
error(logger,msg)
return(NULL)
},finally={
dbDisconnect(con)
}
)
return(result)
}
I force the function to stop if dbGetQuery() doesn't finish within 10 minutes. However, I get a warning message:
In connection_release(conn@ptr) : There is a result object still in use.
The connection will be automatically released when it is closed
My understanding is this means the query is still running and the connection is not closed.
Is there a way to force the connection to be closed and force the query to stop?
The other thing I noticed is that even if I set timeout = 1, the error is not raised after 1 second; it runs for around 1 minute and then raises the error. Does anyone know why it behaves like this?
Thank you.
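For what it's worth, one DBI-level pattern that at least avoids the "result object still in use" warning is to send the query with dbSendQuery() and clear the result explicitly before disconnecting. A sketch is below; it assumes withTimeout() from the R.utils package as in the code above, uses message() in place of the logger, and does not guarantee that the server-side query is actually cancelled, since a blocking ODBC call may not be interruptible:
library(DBI)
library(odbc)
library(R.utils)

getSqlData <- function(server, sqlstr, timeout = 600) {
  con <- dbConnect(odbc(), Driver = "SQL Server",
                   Server = server, Trusted_Connection = "True")
  res <- NULL
  on.exit({
    if (!is.null(res)) try(dbClearResult(res), silent = TRUE)  # release the pending result
    dbDisconnect(con)                                          # then close the connection
  })
  tryCatch({
    withTimeout({
      res <- dbSendQuery(con, sqlstr)
      dbFetch(res)
    }, timeout = timeout, onTimeout = "error")
  }, error = function(cond) {
    message("The query failed or timed out: ", sqlstr)
    NULL
  })
}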

How to redo tryCatch after error in for loop

I am trying to implement tryCatch in a for loop.
The loop is built to download data from a remote server. Sometimes the server stops responding (when the query is big).
I have implemented tryCatch so that the loop keeps going.
I have also added a Sys.sleep() pause if an error occurs, in order to wait a few minutes before sending the next query to the remote server (that part works).
The problem is that I can't figure out how to make the loop redo the query that failed and triggered the tryCatch error (and the Sys.sleep()).
for(i in 1:1000){
tmp <- tryCatch({download_data(list$tool[i])},
error = function(e) {Sys.sleep(800)})
}
Could you give me some hints?
You can do something like this:
for(i in 1:1000){
download_finished <- FALSE
while(!download_finished) {
tmp <- tryCatch({
download_data(list$tool[i])
download_finished <- TRUE
},
error = function(e) {Sys.sleep(800)})
}
}
If you are certain that waiting 800 seconds always fixes the issue, this simpler change should do it.
for(i in 1:1000) {
tmp <- tryCatch({
download_data(list$tool[i])
},
error = function(e) {
Sys.sleep(800)
download_data(list$tool[i])
})
}
A more sophisticated approach is to record which requests failed and then rerun those until all requests succeed.
One way to do this is to use the possibly() function from the purrr package. It would look something like this:
library(purrr)
todo <- rep(TRUE, length(list$tool))
res <- list()
while (any(todo)) {
res[todo] <- map(list$tool[todo],
possibly(download_data, otherwise = NA))
todo <- map_lgl(res, ~ all(is.na(.x)))  # TRUE only for the NA placeholders left by possibly()
}
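If the server can stay unreachable for a long time, a while loop like the one above may spin forever. A bounded-retry variant is sketched below; download_data() and list$tool are from the question, and the retry count and wait time are arbitrary:
download_with_retry <- function(x, max_tries = 5, wait = 800) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(download_data(x), error = function(e) NULL)
    if (!is.null(result)) return(result)  # success, stop retrying
    Sys.sleep(wait)                       # wait before the next attempt
  }
  NULL                                    # all attempts failed
}

results <- lapply(list$tool, download_with_retry)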

Handling 404 and other bad responses with looping twitteR queries

I would like it if my loop wouldn't break if I get an error. I want it to just move on to the next iteration. This example is a minimal example of the error I'm getting and the loop breaking. In my real application I'm iterating through some of the followers I've generated from another script.
library(twitteR)
#set oauth...
for(i in 1:10) {
x <- getUser("nalegezx")
}
Error in twInterfaceObj$doAPICall(paste("users", "show", sep = "/"), params = params, :
client error: (404) Not Found
I understand that this loop would simply rewrite the same response to x. I'm just interested in not breaking the loop.
I'm not an expert in the R Twitter API, but I can suggest that you consider placing your call to getUser() inside a try block like this:
for (i in 1:10) {
x <- try(getUser("sdlfkja"))
}
This should stop your code from crashing in the middle of the loop. If you want to also have separate logic when a warning or error occurs in the loop, you can use tryCatch:
for (i in 1:10) {
x <- tryCatch(getUser("sdlfkja"),
warning = function(w) {
print("warning");
# handle warning here
},
error = function(e) {
print("error");
# handle error here
})
}
I accepted Tim's answer because it resolved my problem, but for the specific case of fetching many user profiles from Twitter I used lookupUsers, which does the job without eating into my request limit.
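For reference, lookupUsers() accepts a whole vector of screen names, so many profiles can be fetched in a single batched call instead of one getUser() per follower. A sketch (the screen names are placeholders):
followers <- c("user_a", "user_b", "nalegezx")  # placeholder screen names
users <- lookupUsers(followers)                 # one batched request for all of them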

Clean way to wrap-up and handle RMySQL connections?

I'm fairly new to R, so forgive me if this is an amateur question. I still don't get parts of how the R language works, and I haven't used closures enough to build intuition on how to approach this problem.
I want to wrap up opening and closing a database connection in my R project in a clean way. I have a variety of scripts set aside that all use a common DB connection configuration file (I don't put it in my repo, it's a local file only), all of which need to connect to the same MySQL database.
The end goal is to do something like:
query <- db_open()
out <- query("select * from example limit 10")
db_close()
This is what I have written so far (all my scripts load these functions from another .R file):
db_open <- function() {
db_close()
db_conn <<- dbConnect(MySQL(), user = db_user, password = db_pass, host = db_host)
query <- function(...) { dbGetQuery(db_conn, ...) }
return(query)
}
db_close <- function() {
result <- tryCatch({
dbDisconnect(db_conn)
}, warning = function(w) {
# ignore
}, error = function(e) {
return(FALSE)
})
return(result)
}
I'm probably thinking of this in an OOP way when I shouldn't be, but sticking db_conn in the global environment feels unnecessary or even wrong.
Is this a reasonable way to accomplish what I want? Is there a better way that I'm missing here?
Any advice is appreciated.
You basically had it, you just need to move the query function into its own function. Regarding keeping db_conn, there really is no reason not to have it in the global environment.
db_open <- function() {
db_close()
db_conn <<- dbConnect(MySQL(), user='root', password='Use14Characters!', dbname='msdb_complex', host='localhost')
}
db_close <- function() {
result <- tryCatch({
dbDisconnect(db_conn)
}, warning = function(w) {
# ignore
}, error = function(e) {
return(FALSE)
})
return(result)
}
query <- function(x, num = -1)
{
q <- dbSendQuery(db_conn, x)
s <- fetch(q, num)
dbClearResult(q)
return(s)
}
Then you should be able to do something like:
query <- db_open()
results <- query("SELECT * FROM msenrollmentlog", 10)
db_close()
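If keeping db_conn out of the global environment still feels cleaner, a closure can hold the connection in its own environment instead. A sketch along those lines, using the same DBI/RMySQL calls as above (the connection details are placeholders):
library(RMySQL)

db_open <- function(user, password, dbname, host = "localhost") {
  conn <- dbConnect(MySQL(), user = user, password = password,
                    dbname = dbname, host = host)
  list(
    query = function(x, num = -1) {
      q <- dbSendQuery(conn, x)
      on.exit(dbClearResult(q))  # release the result set when done
      fetch(q, num)
    },
    close = function() dbDisconnect(conn)  # conn lives only inside this closure
  )
}

db <- db_open("usr", "secret", "example_db")
out <- db$query("select * from example limit 10")
db$close()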

R: how to do RAII (or similar resource management)

I'm using a proprietary library that has an "openConnection" function that I use as such:
conn <- openConnection("user", "pass")
# do some stuff with 'conn' that may return early or throw exceptions
closeConnection(conn)
What's the R idiom for making sure that the connection gets closed no matter how the current method gets exited? In C++ it would be RAII; in Java it would probably be a "finally" block. What is it in R?
Typically, just a call to on.exit is used, but you need to do it inside a function.
f <- function() {
conn <- openConnection("user", "pass")
on.exit(close(conn))
# use conn...
readLines(conn)
} # on.exit is run here...
A common case is when you get passed a connection or file name, and you should only create (and close) the connection if you're given a file name:
myRead <- function(file) {
conn <- file
if (!inherits(file, "connection")) {
conn <- file(file, "r")
on.exit(close(conn))
} # else just use the connection...
readLines(conn)
} # on.exit runs here...
# Try it out:
cat("hello\nworld\n", file="foo.txt")
myRead("foo.txt") # file
myRead(stdin()) # connection
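Applied to the connection API from the question, the same idiom might look like this (a sketch; openConnection() and closeConnection() are the proprietary functions described above, and do_stuff() is a stand-in for your own code):
withConnection <- function(user, pass, do_stuff) {
  conn <- openConnection(user, pass)
  on.exit(closeConnection(conn))  # runs even if do_stuff() throws or returns early
  do_stuff(conn)
}

result <- withConnection("user", "pass", function(conn) {
  # do some stuff with 'conn' that may return early or throw exceptions
})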
