Failed R instances + RODBC

Quick question: I am running multiple R instances in parallel in batch mode, each using an RODBC connection, and randomly one (or more) of my instances fails. If I go back and run the instances one by one, all of them succeed. There is no error in the log, and I am trying to deduce where exactly the issue is coming from. My main hypotheses are that I am hitting a memory (heap) limit and the instance is failing, or (more probably) that there is some kind of timeout happening with the RODBC connection. Any suggestions?
Thanks,
Jim

It is not clear why no error shows; perhaps you could try options(error = recover).
I used to get the following error when using multiple database connections:
Error in mysqlExecStatement(conn, statement, ...) :
RS-DBI driver: (connection with pending rows, close resultSet before continuing)
I avoid this error by issuing the following line to close any open connections before issuing a new query:
lapply(dbListConnections(MySQL()), dbDisconnect)
I took this code from the R help list.
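In a batch script, the same idea can be wrapped in a small helper so stale connections are cleared before each query and any failure is written to the log (the original question mentions that no error appears). This is only a sketch: the connection details are placeholders, it assumes the RMySQL backend used above, and RODBC users would call odbcCloseAll() instead.
library(DBI)
library(RMySQL)
run_query_safely <- function(sql) {
  ## close any connections left open by earlier queries
  lapply(dbListConnections(MySQL()), dbDisconnect)
  con <- dbConnect(MySQL(), dbname = "mydb")   # placeholder connection details
  on.exit(dbDisconnect(con), add = TRUE)
  tryCatch(dbGetQuery(con, sql),
           error = function(e) {
             message("query failed: ", conditionMessage(e))  # surfaces the failure in the batch log
             NULL
           })
}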
Update: one of my collaborators has created a suite of functions to facilitate database interactions, including db.con, db.open, db.close, and db.query, which can be used like this:
## load functions
source("https://raw.github.com/PecanProject/pecan/master/db/R/utils.R")
## example
params <- list(dbname = "mydb", username = "myname", password = "!##?$")
con <- db.open(params)
mydata <- db.query("select * from mytable;", con)
db.close(con)

Related

RStudio Connect - Intermittent C stack usage errors when downloading from Snowflake database

Having an intermittent issue when downloading a fairly large dataset from a Snowflake database view. The dataset is about 60 million rows. Here is the code used to connect to Snowflake and download the data:
library(DBI)          # imports presumably loaded earlier in the script
library(odbc)
library(dplyr)
library(data.table)
sf_db <- dbConnect(odbc::odbc(),
Driver="SnowflakeDSIIDriver",
AUTHENTICATOR = "SNOWFLAKE_JWT",
Server = db_param_snowflake$server,
Database = db_param_snowflake$db,
PORT=db_param_snowflake$port,
Trusted_Connection = "True",
uid = db_param_snowflake$uid,
db = db_param_snowflake$db,
warehouse = db_param_snowflake$warehouse,
PRIV_KEY_FILE = db_param_snowflake$priv_key_file,
PRIV_KEY_FILE_PWD = db_param_snowflake$priv_key_file_pwd,
timeout = 20)
snowflake_query <- 'SELECT "address" FROM ABC_DB.DEV.VW_ADDR_SUBSET'
my_table <- tbl(sf_db, sql(snowflake_query)) %>%
collect() %>%
data.table()
About 50% of the time, this part of the script runs fine. When it fails, the RStudio Connect logs contain messages like this:
2021/05/05 18:17:51.522937848 Error: C stack usage 940309959492 is too close to the limit
2021/05/05 18:17:51.522975132 In addition: Warning messages:
2021/05/05 18:17:51.523077000 Lost warning messages
2021/05/05 18:17:51.523100338
2021/05/05 18:17:51.523227401 *** caught segfault ***
2021/05/05 18:17:51.523230793 address (nil), cause 'memory not mapped'
2021/05/05 18:17:51.523251671 Warning: stack imbalance in 'lazyLoadDBfetch', 113 then 114
To try to get this working consistently, I have tried a process that downloads rows in batches, and that also intermittently fails, usually after downloading many millions of records. I have also tried connecting with Pool and downloading; that also only works sometimes. I also tried dbGetQuery, with the same inconsistent results.
I have Googled this extensively and found threads related to C stack errors and recursion, but those problems seemed to be consistent (unlike this one, which works sometimes), and I'm not sure what I could do about a recursive process running as part of this download anyway.
We are running this on a Connect server with 125 GB of memory, and at the time this script runs there are no other scripts running, and (at least according to the Admin screen that shows CPU and memory usage) this script doesn't use more than 8-10 GB before it (sometimes) fails. As far as when it succeeds and when it fails, I haven't noticed any pattern. I could run it now and have it fail, then immediately run it again and have it work. When it succeeds, it takes about 7-8 minutes. When it fails, it generally fails after anywhere from 3-8 minutes. All packages are the newest versions, and this has always worked inconsistently, so I cannot think of anything to roll back.
Any ideas for troubleshooting, or alternate approach ideas, are welcome. Thank you.
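For reference, the batch-download approach mentioned above could look roughly like the sketch below, using DBI's dbSendQuery()/dbFetch() to pull the result in fixed-size chunks; sf_db and the query are the ones from the question, and the chunk size of one million rows is an arbitrary choice.
library(DBI)
library(data.table)
res <- dbSendQuery(sf_db, 'SELECT "address" FROM ABC_DB.DEV.VW_ADDR_SUBSET')
chunks <- list()
while (!dbHasCompleted(res)) {
  chunks[[length(chunks) + 1L]] <- dbFetch(res, n = 1e6)   # fetch one million rows per round trip
}
dbClearResult(res)
my_table <- rbindlist(chunks)   # combine the chunks into one data.table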

Every time I source my R script it leaks a db connection

I cannot paste the entire script here, but I will explain the situation. If you have ever had leaked DB connections, you will know what I am talking about.
I have an R script file that has many functions (around 50) that use db connections using the DBI & RMySQL R packages. I have consolidated all DB access through 4 or 5 functions. I use on.exit(dbDisconnect(db)) in every single function where a dbConnect is used.
I discovered that simply loading this script using source("dbscripts.R") causes one DB connection to leak. I see this when I run the command
dbListConnections(MySQL())
[[1]]
<MySQLConnection:0,607>
[[2]]
<MySQLConnection:0,608>
[[3]]
<MySQLConnection:0,609>
[[4]]
<MySQLConnection:0,610>
I see one more DB connection added to the list every time. This quickly reaches 16 and my script stops working.
The problem is, I am unable to find out which line of code is causing the leak.
I have checked each dbConnect line in the code. All of them are within functions and no dbConnect happens outside in the main code.
So, why is the connection leak occurring?
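One way to narrow this down, using only calls that already appear in this thread: count the open connections before and after sourcing, then bisect dbscripts.R (comment out top-level calls) until the count stops growing. A sketch:
library(RMySQL)
before <- length(dbListConnections(MySQL()))
source("dbscripts.R")
after <- length(dbListConnections(MySQL()))
after - before   # > 0 means sourcing the file by itself opens a connection
## if stale connections have piled up, close them all:
lapply(dbListConnections(MySQL()), dbDisconnect)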

R; how to solve an "expired PostgreSQLConnection" error?

I have a file with R code that builds up several data frames and then tries to store them into a Postgres database. This usually fails; the code snippet that fails is below.
require ("RPostgreSQL")
drv <- dbDriver("PostgreSQL")
con <- dbConnect (drv, dbname = db,
host = "localhost", port = 5432,
user = "postgres", password = pw)
table_name <- "gemeenten"
print (c ("adding ", table_name))
if (dbExistsTable (con, table_name)) dbRemoveTable (con, table_name) ### Error!
result <- dbWriteTable (con, table_name, gemeenten)
The error I get is:
Error in postgresqlQuickSQL(conn, statement, ...) :
expired PostgreSQLConnection
and the error occurs at the dbExistsTable test. When I call dbListConnections(PostgreSQL()), the number of connections increases by one each time; a call to dbDisconnect(con) does not decrement this number.
I got this error before when I tried to create the driver from a .Rprofile file, and I could resolve it by removing the drv variable and assigning it again. I have succeeded twice in creating this table, but I am not able to reconstruct why that worked. Does anyone know what I am doing wrong?
One of the things I noticed is that I started to get this error when I started sourcing my trials. When I started sourcing, I made a lot of mistakes, and I noticed in the Postgres status screen that those connections remained open. I carefully tried to dbDisconnect all connections after usage. Using a tryCatch block is very useful in this respect: use the finally branch to close the connection, unload the driver, and remove their variables. It is not enough to close connections on the Postgres side; R still thinks they're open and will refuse any connection attempt once 16 connections are open. dbListConnections(PostgreSQL()) returns a list; dbDisconnect all elements of that list.
This did not work at first. I tried to remove the "RPostgreSQL" package, but that did not work either; I had to manually delete it from the library folder. As I am a newbie in R as well as Postgres, I suspect I did something wrong during the install. Anyway: remove and reinstall the package, then restart the Postgres server. After that it worked.
Somewhat paranoid I agree, but after having lost a night of sleep I didn't want to take any chances :-) If someone can pinpoint more precisely the cause of the problem I'll happily choose his answer as the correct one.
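For what it's worth, a minimal sketch of the tryCatch/finally cleanup described above (RPostgreSQL; db, pw and the gemeenten data frame are the objects from the question):
library(RPostgreSQL)
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = db, host = "localhost",
                 port = 5432, user = "postgres", password = pw)
tryCatch({
  if (dbExistsTable(con, "gemeenten")) dbRemoveTable(con, "gemeenten")
  dbWriteTable(con, "gemeenten", gemeenten)
}, finally = {
  dbDisconnect(con)     # always close the connection
  dbUnloadDriver(drv)   # and unload the driver
  rm(con, drv)
})
## if earlier runs have left connections behind, close them all:
lapply(dbListConnections(PostgreSQL()), dbDisconnect)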

RPostgreSQL connections are expired as soon as they are initiated with doParallel clusterEvalQ

I'm trying to setup a parallel task where each worker will need to make database queries. I'm trying to setup each worker with a connection as seen in this question but each time I try it returns <Expired PostgreSQLConnection:(2781,0)> for however many workers I registered.
Here's my code:
library(doParallel)   # also attaches parallel (makeCluster, clusterEvalQ) and foreach
cl <- makeCluster(detectCores())
registerDoParallel(cl)
clusterEvalQ(cl, {
library(RPostgreSQL)
drv<-dbDriver("PostgreSQL")
con<-dbConnect(drv, user="user", password="password", dbname="ISO",host="localhost")
})
If I try to run my foreach despite the error, it fails with task 1 failed - "expired PostgreSQLConnection"
When I go into the postgres server status it shows all the active sessions that were created.
I don't have any problems interacting with postgres from my main R instance.
If I run
clusterEvalQ(cl, {
library(RPostgreSQL)
drv<-dbDriver("PostgreSQL")
con<-dbConnect(drv, user="user", password="password", dbname="ISO",host="localhost")
dbGetQuery(con, "select inet_client_port()")
})
then it will return all the client ports. It doesn't give me the expired notice but if I try to run my foreach command it will fail with the same error.
Edit:
I've tried this on Ubuntu and 2 Windows computers; they all give the same error.
Another Edit:
Now 3 Windows computers.
I was able to reproduce your problem locally. I am not entirely sure, but I think the problem is related to the way clusterEvalQ works internally. For example, you say that dbGetQuery(con, "select inet_client_port()")
gave you the client port output. If the query were actually evaluated/executed on the cluster nodes, then you would be unable to see this output (the same way that you are unable to directly read any other output or print statements that are executed on the external cluster nodes).
Hence, it is my understanding that the evaluation is somehow first performed in the local environment and the relevant functions and variables are subsequently copied/exported to the individual cluster nodes. This would work for any other type of function/variable, but obviously not for db connections. If the connections/port mappings are linked to the master R instance, then the connections will not work from the slave instances. You would also get the exact same error if you tried to use the clusterExport function to export connections that are created on the master instance.
As an alternative, what you can do is create separate connections inside the individual foreach tasks. I have verified with a local database that the following works:
library(doParallel)
nrCores = detectCores()
cl <- makeCluster(nrCores)
registerDoParallel(cl)
clusterEvalQ(cl,library(RPostgreSQL))
clusterEvalQ(cl,library(DBI))
result <- foreach(i=1:nrCores) %dopar%
{
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, user="user", password="password", dbname="ISO",host="localhost")
queryResult <- dbGetQuery(con, "fetch something...")
dbDisconnect(con)
return(queryResult)
}
stopCluster(cl)
However, now you have to take into account that you will create and disconnect a new connection in every foreach iteration. You might incur some performance overhead because of this. You can obviously circumvent this by splitting up your queries/data intelligently so that a lot of work gets done during the same iteration. Ideally, you should split the work into exactly as many chunks as you have cores available.
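A sketch of that "one chunk of work per core" idea, reusing nrCores from the code above; the table, column and id range here are hypothetical, and the work is partitioned by an id column:
ids <- 1:100000
chunks <- split(ids, cut(seq_along(ids), nrCores, labels = FALSE))
result <- foreach(chunk = chunks, .combine = rbind,
                  .packages = c("DBI", "RPostgreSQL")) %dopar%
{
  drv <- dbDriver("PostgreSQL")
  con <- dbConnect(drv, user="user", password="password", dbname="ISO", host="localhost")
  queryResult <- dbGetQuery(con, sprintf(
    "SELECT * FROM mytable WHERE id BETWEEN %d AND %d", min(chunk), max(chunk)))
  dbDisconnect(con)
  queryResult
}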

Unknown error (worker initialization failed: 21) in foreach() with doParallel cluster (R)

First-time poster here. Before posting, I read FAQs and posting guides as recommended so I hope I am posting my question in the correct format.
I am running foreach() tasks using the doParallel cluster backend in R 64 bit console v. 3.1.2. on Windows 8. Relevant packages are foreach v. 1.4.2 and doParallel v. 1.0.8.
Some sample code to give you an idea of what I am doing:
out <- foreach (j = 1:nsim.times, .combine=rbind, .packages=c("vegan")) %dopar% {
b<-oecosimu(list.mat[[j]], compute.function, "quasiswap", nsimul=nsim.swap) ## where list.mat is a list of matrices and compute.function is a custom function
..... # some intermediate code
return(c(A,B)) ## where A and B are some emergent properties derived from object b from above
}
In one of my tasks, I encountered an error I have never seen before. I tried to search for the error online but couldn't find any clues.
The error was:
Error in e$fun(obj, substitute(ex), parent.frame(), e$data) :
worker initialization failed: 21
In the one time I got this error, I ran the code after stopping a previous task (using the Stop button in R Console) but without closing the cluster via 'stopCluster()'.
I ran the same code again after stopping the cluster via 'stopCluster()' and registering a new cluster 'makeCluster()' and 'registerDoParallel()' and the task ran fine.
Has anyone encountered this error, or does anyone have any clues/tips as to how I could figure out the issue? Could the error be related to not stopping the previous doParallel cluster?
Any help or advice is much appreciated!
Cheers and thanks!
I agree that the problem was caused by stopping the master and continuing to use the cluster object, which was left in a corrupt state. There was probably unread data in the socket connections to the cluster workers, causing the master and workers to be out of sync. You may even have trouble calling stopCluster, since that also writes to the socket connections.
If you do stop the master, I would recommend calling stopCluster and then creating another cluster object, but keep in mind that the previous workers may not always exit properly. It would be best to verify that the worker processes are dead, and manually kill them if they are not.
I had the same issue; in fact, you need to add the following before your foreach loop:
out <- matrix()
It will initialize your output table and avoid this error. It worked for me.
After many, many trials, I think I have a potential solution, based on the answer by @Steve Weston.
For some reason, before the stopCluster call, you also need to call registerDoSEQ().
Like this:
clus <- makeCluster(detectCores())   # makeCluster() needs a spec, e.g. the number of cores
... do something ...
registerDoSEQ()
stopCluster(clus)
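Putting the last two answers together, one defensive pattern is to wrap the cluster lifecycle in a function so the backend is reset and the workers are stopped even if the master is interrupted. A sketch with a placeholder loop body:
library(doParallel)
run_parallel <- function(ntasks) {
  cl <- makeCluster(detectCores())
  registerDoParallel(cl)
  on.exit({
    registerDoSEQ()    # reset foreach to the sequential backend first
    stopCluster(cl)    # then stop the workers
  }, add = TRUE)
  foreach(j = 1:ntasks, .combine = rbind) %dopar% {
    c(task = j, result = sqrt(j))   # placeholder work
  }
}
out <- run_parallel(8)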
