Redshift JDBC connection crashes on second opening in R

I am using the RJDBC package to connect to AWS Redshift from an EC2 Ubuntu instance.
I can successfully connect using the JDBC() call, retrieve/insert rows and then close the connection.
However, when I open a second connection in the same R session, R crashes with a segmentation fault. This happens in both RStudio and console R. I'm using conda to manage the R installation.
I have tried connecting using the native Redshift JAR provided by Amazon and also another JAR from Progress Software. I get the same result with both drivers: the first connection is fine, subsequent connections crash.
I've installed the latest Java 8 JVM. I had seen some other threads suggesting Java 6 as a workaround, but unfortunately that is no longer available from the Oracle site.
My gut feeling is that Java has a weird interaction with R, but I'm at a loss as to how to proceed.

OK, I solved this myself and thought I'd record it here in case it's useful to others.
The problem was really with rJava not re-initialising the JVM correctly.
I added the following line before opening a database connection:
rJava::.jinit(force.init = TRUE)
Now I can open and close connections without issue using RJDBC.
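For reference, a minimal sketch of the full pattern; the driver class name and JAR path below are placeholders for whatever your driver ships with:

library(RJDBC)
# Force rJava to (re)initialise the JVM before connecting;
# without this, the second connection in a session segfaulted.
rJava::.jinit(force.init = TRUE)
drv <- JDBC("com.amazon.redshift.jdbc.Driver",  # placeholder driver class
            "/path/to/RedshiftJDBC.jar")        # placeholder JAR path
con <- dbConnect(drv, "jdbc:redshift://host:5439/dbname", "user", "password")
dbGetQuery(con, "SELECT 1")
dbDisconnect(con)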

Related

Azure Data Factory - Self-Hosted Integration Runtime - ODBC driver mystery

We are using a Self-Hosted Integration Runtime for Azure Data Factory.
That machine had version 6 of the Exasol ODBC driver installed. We wanted to upgrade the driver, so we deleted the old one and installed the new version 7 driver.
The weird thing is that in the Exasol logs we can now see Data Factory sometimes connecting via driver version 7 and sometimes via driver version 6.
As an experiment, I deleted the Exasol ODBC driver from the machine completely. After that, Data Factory was still able to connect to Exasol using the driver I had just deleted.
It looks like the drivers' DLLs are cached somewhere. What could it be?
Update 1
I captured the following actions in Process Monitor when Data Factory connected to Exasol with the version 6 ODBC driver:
[Process Monitor screenshot]
Where might these C:\Config.Msi\3739be5*.rbf and EXASolution-6.1\ODBC\ DLLs come from? There is no C:\Config.Msi\ directory on the machine.
Update 2
I noticed that when I test the connection via Microsoft Integration Runtime Configuration Manager on the machine, or via a Data Factory Linked Service, the connection is always made with the version 7 ODBC driver.
But when I test the connection via a Data Factory Dataset, the connection is sometimes made with the version 6 ODBC driver.
You could check the registry, but clean it at your own risk. An alternative might be the Sysinternals tools, Process Monitor or Process Explorer, which might help you get to the bottom of this. Install them on the SHIR VM if you are allowed to. Process Monitor in particular is a bit like SQL Profiler (if you've ever used that), so it will be able to tell you which registry keys external processes are using. It will give you a lot of information, so you will have to make judicious use of timestamps and filtering. The proposed steps:
Start a trace using Process Monitor
Start a pipeline using the Exasol driver
Wait until it completes (or at least until you know it has started)
Stop the Process Monitor trace
Spend time going through the millions of records it has captured, filtering down or searching for your process
An alternative would be to build a clean SHIR and install only the new driver. Then swap it in for the old one. You may have to get the new SHIR added to the firewall if this is an issue for you.
Honestly, I would pursue both of these approaches in parallel for a production problem. Procmon / Process Explorer can be quite labour- and time-intensive but should help you get to the bottom of the issue. Building a clean SHIR is probably the safer option in the long term, but it requires new infrastructure.
It may sound silly, but rebooting the server where the SHIR is running solved the problem.
We noticed that this server had been running for more than 30 days and decided to reboot it. Maybe restarting the Integration Runtime service itself would also have helped, but we didn't try that.
Thanks to everyone for your help.

RStudio keeps 'hanging' when trying to establish an Oracle Database connection

I'm working on a little side project in RStudio and I'm trying to import data from an Oracle database. The problem is, whenever I try to establish the connection using the DBI::dbConnect command, it just 'hangs': it won't continue to the next command in my script. I've added a timeout to the dbConnect command, but it doesn't help. In order to exit, I have to shut down RStudio completely.
I've tested the connection using the 'Connections' tab in RStudio (see the screenshot below). As you can see, it is able to establish the connection, so that should mean the parameters are set correctly, right? But when I run it in the script, it just keeps 'hanging' on the dbConnect command.
What can I do?
When it connects successfully to the database, RStudio loads a tree in the Connections pane that presents all schemas and tables. This is useful, but it can be slow.
Depending on the Oracle database I use, this often takes a long time to load, and it takes even longer when the user has low privileges on the database (a user with only one schema, for instance).
I see two ways to investigate further:
run your connection statement in RGui to see if it works correctly there (see the sketch below); if it does, the problem is really linked to RStudio's Connections pane
connect as system/admin or a higher-privileged user to check whether it works better
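To make the first check concrete, here is a minimal sketch of testing the connection from plain R / RGui; the DSN name and credentials are placeholders, and it assumes the odbc package as the backend:

library(DBI)
# Running this in RGui (outside RStudio) keeps the Connections pane
# out of the picture, so a hang here points at the connection itself.
con <- dbConnect(odbc::odbc(),
                 dsn = "OracleDSN",   # placeholder DSN name
                 uid = "user",        # placeholder credentials
                 pwd = "password",
                 timeout = 10)        # fail after 10 seconds instead of hanging
dbDisconnect(con)

If this connects promptly in RGui, the hang is most likely the Connections pane enumerating schemas and tables rather than dbConnect itself.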

Could there be a language issue with ODBC in R with an Access DB?

I am using an R script which connects to a local Access database. For that, I used the 'odbc' package in R and created an ODBC data source in Windows. It works well on my machine.
The issue I have is that it can't connect to the database when running the script on another computer with language settings other than English. Both machines are running 64-bit Windows with 64-bit Access and R. Running the following code:
library(odbc)
con <- dbConnect(odbc::odbc(), "AccessDB")
results in the following error message:
Error in connection_info(ptr) : nanodbc/nanodbc.cpp:1072:
I haven't found a solution yet; I am thinking of using another database.
I received the same error today on a setup that usually works. After downgrading the odbc package to 1.1.6, it works fine again.
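If you want to try the same downgrade, a minimal sketch using the remotes package (my assumption; any method of installing a specific CRAN version will do):

# Install the older odbc release from the CRAN archive
install.packages("remotes")
remotes::install_version("odbc", version = "1.1.6")

Restart R afterwards so the downgraded package is loaded cleanly.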

Error connecting to mongoDB using Mongolite

I'm having issues connecting to my MongoDB via Mongolite, and I'm not sure if it is an issue on my side, or if I need to use a different package to connect to the database. Please keep in mind that I cannot change the software being run by the MongoDB server, and I am a novice when it comes to all of this, so it could just be a silly error on my part.
I've run the following code:
m <- mongo(collection = "test", url="mongodb://22.92.59.149:27017")
As far as I can tell from the Mongolite tutorial (https://jeroen.github.io/mongolite/), this is the correct syntax to connect to the database, but I'm not 100% sure. Regardless, I get the following error:
Error: Server at 22.92.59.149:27017 reports wire version 2,
but this version of libmongoc requires at least 3 (MongoDB 3.0)
From what I can tell, this means that mongolite won't work with my database. If that is the case, what other package should I try to use to connect? Or, if that is not the issue, what am I doing wrong?
Thanks in advance!
As the message says, there is a mismatch between the versions of the client and the server.
More precisely, mongolite relies on a more general driver written in C, libmongoc, and it seems the version automatically installed by the install.packages("mongolite") statement is too recent for your server's version: wire version 2 corresponds to MongoDB 2.6, while this build of libmongoc requires at least MongoDB 3.0.
If you can't change anything server-side, maybe you could try to manually install an older version of libmongoc before installing mongolite, but I'm not confident about compatibility with the R package afterwards.
Maybe you could use RMongo, an older and now-archived package for interacting with Mongo from R, but I'm afraid anything you build on it won't be stable in future R versions.
I'd rather recommend looking at the problem server-side.

Issues with RS-DBI driver in R

I'm having an issue figuring out why I can't connect to a PostgreSQL DB from R. I am able to access the database from the terminal using the psql command, but when connecting through DBI in R I get the following message [with some information redacted]:
RS-DBI driver: (could not connect [username]#[database URI] on dbname "[dbname]"
The connection string works fine in the terminal, and this code works fine on the machine I am porting it from. I have reinstalled the versions of the libraries that match what was on the dev machine, and am still having problems.
Any advice?
Edit:
I was able to get it working by fiddling around with the library(...) statements. It seems changing the order of the DBI and RPostgreSQL libraries has an effect. RPostgreSQL requires DBI, but importing just RPostgreSQL still produced the 'could not connect' error.
To future readers with this issue: fiddle with the order, it may help!
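For those readers, a minimal sketch of loading DBI explicitly before RPostgreSQL, which is one ordering to try; host, database name, and credentials are placeholders:

library(DBI)          # load DBI explicitly first
library(RPostgreSQL)  # then the PostgreSQL backend
con <- dbConnect(PostgreSQL(),
                 host = "localhost",   # placeholder connection details
                 dbname = "mydb",
                 user = "myuser",
                 password = "mypassword")
dbDisconnect(con)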
Just an educated guess: your psql runs on the same machine, so it uses the local connection. The DBI-based methods using the PostgreSQL library will use a network connection, so you actually have to enable that in the corresponding config file.
See e.g. here about pg_hba.conf.
