how to read data from Cassandra (DBeaver) to R - r

I am using Cassandra CQL- system in DBeaver database tool. I want to connect this cassandra to R to read data. Unfortunately the connection takes more time (i waited for more than 2 hours) with RCassandra package. but it does not seem to get connected at all and still loading. Does anyone has any idea on this?
the code as follows:
library(RCassandra)
rc <- RC.connect(host ="********", port = 9042)
RC.login(rc, username = "*****", password = "******")
after this step RC.login, it is still loading for more than 2 hours.
I have also tried using RJDBC package like posted here : How to read data from Cassandra with R?.
library(RJDBC)
drv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
list.files("C:/Program Files/DBeaver/jre/lib",
pattern="jar$",full.names=T))
But this throws error
Error in .jfindClass(as.character(driverClass)[1]) : class not found
None of the answers are working for me from the above link.I am using latest R version 3.4.0 (2017-04-21) and New version of DBeaver : 4.0.4.

For your first approach, which I am less familiar with, should you not have a line that sets the use of the connection?
such as:
library(RCassandra)
c <- RC.connect(host ="52.0.15.195", port = 9042)
RC.login(c, username = "*****", password = "******")
RC.use(c, "some_db")
Did you check logs that you are not getting some silent error while connecting?
For your second approach, your R program is not seeing a driver in a classpath for Java (JMV).
See this entry for help how to fix it.

Related

R - handle error when accessing a database

I'm trying to automate data download from db using RJDBC using a for loop. The database im using automatically closes the connection after every 10mins, so what i want to do is somehow catch the error, remake the connection, and continue the loop. In order to do this i need to capture the error somehow, the problem is, it is not an r error so none of the commands trycatch and similar works. I just get a text on the console telling me:
Error in .jcheck() : No running JVM detected. Maybe .jinit() would help.
How do i handle this in terms of:
if (output == ERROR) {remake connection and run dbQuery} else {run dbQuery}
thanks for any help
You could use the pool package to abstract away the logic of connection management.
This does exactly what you expect regarding connection management with DBI.
It should work with RJDBC which is an implentation of DBI, but I didn't test it with this driver.
libray(pool)
library(RJDBC)
conn <- dbPool(
drv = RJDBC::JDBC(...),
dbname = "mydb",
host = "hostadress",
username = "test",
password = "test"
)
on.exit(poolClose(conn))
dbGetQuery(conn, "select... ")

Problems with connecting to local postgres database from R through RPostgreSQL

I have to following code
drv <- RPostgreSQL::PostgreSQL()
con <- DBI::dbConnect(drv, dbname = 'dbname', user = 'user',
host = 'host.name', port = 5432, password = 'password')
When I run it on server (Ubuntu server 16.04 with latest updates) running the database I get the following error:
Error in .valueClassTest(ans, "data.frame", "dbGetQuery") :
invalid value from generic function ‘dbGetQuery’, class “NULL”, expected “data.frame”
But when I run R from commandline with sudo, it works, when I run it from different my laptop connecting to the DB on the server it also works. So it shouldn't be connection problem. I am thinking about problem with access rights to some libraries/executables/configs on the system? Any help will be appreciated.
When I run the dbConnect multiple times and it ends with the error, when I run drv_info <- RPostgreSQL::dbGetInfo(drv), I still get multiple connectionIds in the drv_info:
drv_info <- RPostgreSQL::dbGetInfo(drv)
> drv_info
$drvName
[1] "PostgreSQL"
$connectionIds
$connectionIds[[1]]
<PostgreSQLConnection>
$connectionIds[[2]]
<PostgreSQLConnection>
$fetch_default_rec
[1] 500
$managerId
<PostgreSQLDriver>
$length
[1] 16
$num_con
[1] 2
$counter
[1] 2
Found a source of confusion, but not necessarily the root problem. (I was assuming RPostgres, while you are using RPostgreSQL (github mirror).)
If you check the source, you'll find that the method is calling postgresqlNewConnection, which includes a call to dbGetQuery. The problem you're seeing is that your call to dbConnect is failing (my guess is at line 100) and returns something unexpected, but postgresqlNewConnection continues.
I see three options for you:
try calling dbConnect(..., forceISOdate=FALSE) to bypass that one call to dbGetQuery (note that this doesn't fix the connection problem, but it at least will not give you the unexpected query error on connection attempt);
raise an issue for the package maintainers; or
switch to using RPostgres, still DBI-based and actively developed (it looks like RPostgreSQL has not had significant commits in the last few years, not sure if that's a sign of good code stability or development stagnation)
One lesson you may take from this is that you should check the value returned from dbConnect; if is.null(con), something is wrong. (Again, this does not solve the connection problem.)

R- mongolite on OS X Sierra- No suitable servers found

I am trying to follow the "Getting started with MongoDB in R" page to get a database up and running. I have mongoDB installed in my PATH so I am able to run mongod from the terminal and open an instance. Though when I open an instance in the background and try running the following commands in R:
library(mongolite)
m <- mongo(collection = "diamonds") #diamonds is a built in dataset
It throws an error after that last statement saying:
Error: No suitable servers found (`serverSelectionTryOnce` set): [Failed to resolve 'localhost']
How do I enable it to find the connection I have open? Or is it something else? Thanks.
It could be that mongolite is looking in the wrong place for the local server. I solved this same problem for myself by explicitly adding the local host address in the connection call:
m <- mongo(collection = "diamonds", url = "mongodb://127.0.0.1")

ODBC Teradata connection Julia

I am trying to connect to Teradata using ODBC in Julia using the following but I am getting an error:
co = ODBC.connect(dsn = "DBS", uid = "me", pwd = "xxx")
Error:
>ArgumentError: function connect does not accept
I've installed the ODBC package and if I run ODBC.listdsns() I can see a list of available DSNs, one of which I am using in my connection string. I've tried a few other variants that are listed in the documentation but I get the same error message.
I have been using the above connection string in R so I'm confident that I can connect with ODBC but I only installed Julia over the weekend so I am still getting to grips with it.
Any help would be much appreciated.

dbConnect with R 3.0 on Ubuntu 12.04 x64 --Error in as.integer(from) : cannot coerce type 'S4' to vector of type 'integer'

Just updated to R 3.0 and updated all the packages, including DBI. To my surprise, a script that I often use stopped working.
I am unable to connect to a MySQL database using dbConnect. The code script instantly, so only a few lines will reproduce the problem
> require("RMySQL")
> m = dbDriver("MySQL")
> dbConnect(m, user = 'user', password = 'pass', dbname = 'dbname', host = 'localhost', client.flag = CLIENT_MULTI_STATEMENTS)
Error in as.integer(from) :
cannot coerce type 'S4' to vector of type 'integer'
Calls: dbConnect ... mysqlNewConnection -> isIdCurrent -> as -> asMethod
Also tried it as:
dbConnect(MySQL(), user = 'user', password = 'pass', dbname = 'dbname', host = 'localhost', client.flag = CLIENT_MULTI_STATEMENTS)
but the same problem
Also tried removing other parameters, but the same issue from the dbDriver.
What changed in the DBI package with the latest update? How can I fix this?
I noticed that the DBI package is orphaned so don't know who to ask.
I had the same issue with R 3.0.1 on ubuntu.
Installing the latest version of the RMySQL-package resolved the problem:
> install.pacakges("RMySQL")
Make sure to restart R after the installation.
I'm still digging into the issue, but I think I've identified multiple causes of this issue. At their root, they all have to do with R expecting an S4 object but getting back an integer instead. I believe these are generally a result of the connection failing to establish.
Why is it failing? One thing I've noticed is that if you fail to close to many of your connections (~16 [see the number of maximum connections specified in the driver handle call] open) DBI won't/can't open a new connection. Make sure you are calling dbDisconnect as needed. Usually, this sort of problem results in a sensible error message, however sometimes results in the above referenced error. If possible access the DB through an abstraction layer, e.g. dplyr as some will monitor the db connections and kill them if they are inactive. Whereas, AFIK if you open a connection in a function and the function breaks, you have no way to close the open connection unless you returned the driver object from your initial call to dbConnect. In this case you have no choice but to restart your instance of R (possibly resetting your machine and clearing your workspace as well).
The other issue I recently encountered is that if RMySQL masks RPostgreSQL, then RPostgreSQL will fail. The reverse does not appear to be the case, but because others have mentioned RPostgreSQL in here as showing the same error message, it seemed worthy of note.
Update
The latest version of RMySQL (0.10.1) seems to have finished off RPostgreSQL - RPostgreSQL now fails to work regardless of load order. The same people working on RMySQL appear to be working on RPostgres (https://github.com/rstats-db/RPostgres) and this conflict appears to be a non-issue if using that package instead of RPostgreSQL. Specifically, use RPostgres::Postgres() in the place of RPostgreSQL::PostgreSQL() when specifying the driver in dbConnect. Other packages, e.g. dplyr, currently assume RPostgreSQL, so this issue can still bite (but it seems a resolution is in the works (https://github.com/rstats-db/RMySQL/issues/28).

Resources