R Connect to Database WITHOUT Java?

I use the RJDBC package to connect to Amazon Redshift and it mostly works, but I frequently get a Java heap-space error, which is frustrating and inconsistent: sometimes it works fine, sometimes it doesn't. My problem is that my computer can ONLY run 32-bit Java, which in turn forces me to use 32-bit R, which means I can't use all my RAM.
Is there a package that can do this WITHOUT Java?
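For what it's worth, Redshift speaks the PostgreSQL wire protocol, so a libpq-based package such as RPostgres should be able to connect with no Java involved at all. A minimal sketch, with placeholder host, database, and credentials:

library(DBI)
library(RPostgres)

# Connect over the PostgreSQL protocol; no JVM is loaded at any point.
con <- dbConnect(Postgres(),
                 host = "mycluster.abc123.us-east-1.redshift.amazonaws.com",  # placeholder
                 port = 5439,  # Redshift's default port
                 dbname = "mydb",
                 user = "user",
                 password = "password",
                 sslmode = "require")
dbGetQuery(con, "SELECT 1")
dbDisconnect(con)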

Related

cannot allocate memory - RSelenium and EC2

I am trying to implement a Selenium test that performs automated actions on a website (looping through pages). I am using R with the RSelenium package, plus a PostgreSQL database accessed through the DBI package, all running on an AWS EC2 server.
My problem is that a few minutes after the script is launched, my RStudio session freezes (as does my Linux session) and I see a message like "cannot allocate memory".
So this is clearly a memory issue, and by running top I could see that my Selenium Docker container was using most of the resources.
My question is: how can I reduce the amount of memory used by the Selenium test?
IMHO there is no practical way for a test to use less memory than that test actually requires. You can try to simplify the test by breaking it up into two or more smaller tests, and check for memory leaks, as suggested in another answer.
It would be much easier to use the next largest instance type, which has more memory, and shut the instance down when not in use if cost is a concern.
Don't forget driver$close() in your code; if you don't close your driver, you will end up with a lot of Chrome instances.
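A minimal sketch of that pattern with RSelenium, assuming a Selenium server is already running (e.g. in Docker) on localhost:4444 with Chrome:

library(RSelenium)

remDr <- remoteDriver(remoteServerAddr = "localhost",
                      port = 4444L,
                      browserName = "chrome")
remDr$open()

urls <- paste0("https://example.com/page/", 1:10)  # hypothetical page list
for (u in urls) {
  remDr$navigate(u)
  # ... scrape / interact with the page here ...
}

remDr$close()  # release the session so Chrome instances don't pile up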

How to use psql in R? copy_to function of the "rpg" package not working

I am connecting to a PostgreSQL database and I would like to make use of psql commands (especially the \copy command) from within R.
I'm on a Windows client using ODBC drivers to connect to the database. Basically all of the major ODBC packages in R, including the "rpg" package, work to connect to the database and to read and write tables, etc.
Apart from issuing regular SQL queries, the "rpg" package also lets you use psql commands. The copy_to function should send the psql \copy command to the database.
However, when running the function I get the error: "psql not found".
I also have pgAdmin III installed, and running the \copy command there is no problem at all.
Digging deeper, I found that the rpg::copy_to function first runs Sys.which("psql"), which returns "", leading to said error.
Reading this thread made me think that adding the path to the pgAdmin psql.exe would do the trick. So I added the line
psql=C:\Program Files (x86)\pgAdmin III\1.16\psql.exe
in my .Renviron file.
Running Sys.which("psql") still returns "", while Sys.getenv() correctly shows the path to the psql.exe that I specified.
How can I make Sys.which() find psql.exe? That is, assuming this is the correct way to solve the issue in the first place.
I would appreciate any help!
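In case it helps: Sys.which() searches the directories listed on PATH, not arbitrary environment variables, so an environment variable named psql will not be picked up. A sketch of prepending the pgAdmin directory to PATH from within R (directory taken from the question):

psql_dir <- "C:\\Program Files (x86)\\pgAdmin III\\1.16"
Sys.setenv(PATH = paste(psql_dir, Sys.getenv("PATH"), sep = .Platform$path.sep))
Sys.which("psql")  # should now return the full path to psql.exe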

Error connecting to MongoDB using mongolite

I'm having issues connecting to my MongoDB server via mongolite, and I'm not sure whether the problem is on my side or whether I need a different package to connect to the database. Please keep in mind that I cannot change the software run by the MongoDB server, and I am a novice at all of this, so it could just be a silly error on my part.
I've run the following code:
library(mongolite)
m <- mongo(collection = "test", url = "mongodb://22.92.59.149:27017")
As far as I can tell from the Mongolite tutorial (https://jeroen.github.io/mongolite/), this is the correct syntax to connect to the database, but I'm not 100% sure. Regardless, I get the following error:
Error: Server at 22.92.59.149:27017 reports wire version 2,
but this version of libmongoc requires at least 3 (MongoDB 3.0)
From what I can tell, this means that mongolite won't work with my database. If that is the case, what other package should I try in order to connect? And if that is not the issue, what am I doing wrong?
Thanks in advance!
As the message says, there is a version mismatch between the client and the server.
More precisely, mongolite relies on a more general driver written in C, libmongoc, and it seems the version bundled by the install.packages("mongolite") build is too recent for the server's version.
If you can't change anything server-side, you could try to manually install an older version of libmongoc before installing mongolite, but I'm not confident the R package will remain compatible afterwards.
You could also try RMongo, an older, now-archived package for interacting with MongoDB from R, but I'm afraid whatever you build on it won't be stable across future R versions.
I'd rather recommend looking at the problem server-side.
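If the downgrade route is attempted from within R, remotes::install_version() can pull an old release from the CRAN archive; older mongolite releases bundle older libmongoc versions. The version number below is only a guess; check the mongolite changelog for a release whose bundled libmongoc still accepts wire version 2:

install.packages("remotes")
# "1.0" is an assumed version; pick one whose bundled libmongoc
# supports the server's wire version.
remotes::install_version("mongolite", version = "1.0")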

Redshift JDBC connection crashes on second opening in R

I am using the RJDBC package to connect to AWS Redshift from an EC2 ubuntu instance.
I can successfully connect using the JDBC() call, retrieve/insert rows and then close the connection.
However, when I re-open a second connection in the same R session, R crashes with a segmentation fault. This happens in both RStudio and console R. I'm using conda to manage the R installation.
I have tried connecting with the native Redshift jar provided by Amazon and with another jar from Progress Software. I get the same behaviour with both drivers: the first connection is fine, subsequent connections crash.
I've installed the latest JVM, v8. I had seen some other threads suggesting installing v6 as a workaround, but unfortunately that is no longer available on the Oracle site.
My gut feeling is that Java has a weird interaction with R, but I'm at a loss as to how to proceed.
OK, I solved this myself and thought I'd record the solution in case it's useful to others.
The problem was really with rJava not re-initialising the JVM correctly.
I added the following line before opening a database connection:
rJava::.jinit(force.init = TRUE)
Now I can open and close connections without issue using RJDBC.
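For reference, the working pattern looks roughly like this; the jar path, driver class, and credentials below are placeholders, not taken from the question:

library(RJDBC)

rJava::.jinit(force.init = TRUE)  # re-initialise the JVM before connecting

drv <- JDBC(driverClass = "com.amazon.redshift.jdbc42.Driver",
            classPath = "/path/to/RedshiftJDBC42.jar")  # placeholder path
con <- dbConnect(drv, "jdbc:redshift://host:5439/dbname",
                 user = "user", password = "password")  # placeholder credentials
# ... dbGetQuery(con, ...), dbWriteTable(con, ...) ...
dbDisconnect(con)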

R using RODBC in 64-bit mode to connect to 32-bit MS Access 2010

I love the RODBC package, but unfortunately my workplace only has 32-bit Office 2010 installed. This is starting to be an issue, since the primary reason I'm using R is to work with giant datasets, which demand 64-bit mode.
I already know that 64-bit R won't make a 32-bit ODBC connection... but is there a way around this?
One hack I can think of is creating a 32-bit script that grabs the data generated in the 64-bit session and does the database work, but that would require a few data handshakes.
Most elegant would be to somehow, temporarily, switch R into 32-bit mode so it can push the data into the database, then return to 64-bit afterwards.
Is that possible?
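A sketch of the helper-script hack, assuming a standard Windows R installation that ships both architectures under bin/ (all paths and file names here are hypothetical):

# In the 64-bit session: stage the data, then shell out to 32-bit Rscript.
saveRDS(big_df, "C:/temp/to_access.rds")
system(paste0('"C:/Program Files/R/R-3.6.1/bin/i386/Rscript.exe" ',
              '"C:/scripts/load_into_access.R"'))

# load_into_access.R (runs in 32-bit R, where the Access ODBC driver works):
#   library(RODBC)
#   df <- readRDS("C:/temp/to_access.rds")
#   ch <- odbcConnectAccess2007("C:/data/mydb.accdb")
#   sqlSave(ch, df, tablename = "mytable")
#   odbcClose(ch)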
