Unable to Connect to Cassandra Database from R using JDBC - r

I am trying to connect R with Cassandra. Following is my code:
library(RJDBC)
#Load in the Cassandra-JDBC driver
cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
list.files("D:/cassandra/lib",pattern="jar$",full.names=T))
#Connect to Cassandra node and Keyspace
casscon <- dbConnect(cassdrv, "jdbc:cassandra://192.168.1.20:9042/demodb")
When I run the above code in R, I get the following error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.sql.SQLNonTransientConnectionException: org.apache.thrift.transport.TTransportException: Read a negative frame size (--2080374784)!
Any ideas how to solve this error?
Thanks in advance!

Related

Can't connect Snowflake to R Error: nanodbc/nanodbc.cpp:1021: 00000: [Snowflake][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function

I'm trying to connect Snowflake to R. I tried the following lines of code in R:
install.packages(c("DBI", "dplyr","dbplyr","odbc"))
library(DBI)
library(dplyr)
library(dbplyr)
library(odbc)
myconn <- DBI::dbConnect(odbc::odbc(), "SNOWFLAKEDSII", uid="username", pwd='pwd')
mydata1 <- DBI::dbGetQuery(myconn,"SELECT * FROM mydata")
head(mydata1)
When I run the line that creates myconn, I keep getting this error:
Error: nanodbc/nanodbc.cpp:1021: 00000: [Snowflake][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function.
Could anyone help me figure out how to fix this?
I'd appreciate your help!
You need to know where the ODBC driver manager library is located on your machine. The file is named libodbcinst.dylib.
You can search for it with: find / -iname libodbcinst.dylib
Once you have the path to that file, edit the config file:
/opt/snowflake/snowflakeodbc/lib/simba.snowflake.ini
or it can also be:
/opt/snowflake/snowflakeodbc/lib/universal/simba.snowflake.ini
Find the line ODBCInstLib=libodbcinst.dylib and change it to:
ODBCInstLib=<full_path_to_the_file>/libodbcinst.dylib
where <full_path_to_the_file> is the directory reported by the find command above.
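The same edit can be scripted from R. The following is a minimal sketch, not part of the original answer: set_odbcinst_lib is a hypothetical helper name, and the sample file contents and the library path are placeholders that should be replaced with what find reported on your machine.

```r
# Hypothetical helper: rewrite the ODBCInstLib line of simba.snowflake.ini.
# `ini_lines` is the file content as a character vector (one element per line);
# `lib_path` is the full path to libodbcinst.dylib reported by `find`.
set_odbcinst_lib <- function(ini_lines, lib_path) {
  is_target <- grepl("^\\s*ODBCInstLib\\s*=", ini_lines)
  ini_lines[is_target] <- paste0("ODBCInstLib=", lib_path)
  ini_lines
}

# Demonstration on made-up, in-memory content (assumed for illustration only).
example_ini <- c("[Driver]", "ODBCInstLib=libodbcinst.dylib", "LogLevel=0")
patched <- set_odbcinst_lib(example_ini, "/usr/local/lib/libodbcinst.dylib")
print(patched)
```

To apply it to the real file, read the file with readLines(), pass the result through the helper, and write it back with writeLines(), ideally after making a backup copy.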

I am getting a class not found error when I try to connect R with AWS Redshift

I am trying to connect R with Redshift using the JDBC template they provide on their website.
I downloaded the latest version of the Redshift JDBC driver and called JDBC(), but it's not working.
install.packages("RJDBC",dep=TRUE)
library(RJDBC)
download.file('https://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC42-1.2.10.1009.jar','RedshiftJDBC42-1.2.10.1009.jar')
driver_redshift <- JDBC("com.amazon.redshift.jdbc42.Driver",
"RedshiftJDBC41-1.1.9.1009.jar", identifier.quote="`")
I am getting an error that says Error in .jfindClass(as.character(driverClass)[1]) : class not found
Try downloading the driver in binary mode:
download.file('https://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC42-1.2.10.1009.jar','RedshiftJDBC42-1.2.10.1009.jar', mode="wb");
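Then make sure that you're referring to the correct jar (the call in the question points at RedshiftJDBC41-1.1.9.1009.jar, which is not the file that was downloaded):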
driver <- JDBC("com.amazon.redshift.jdbc42.Driver", "RedshiftJDBC42-1.2.10.1009.jar", identifier.quote="`")
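Both causes mentioned above (a jar that is missing or corrupted, and a jar that does not match the driver class name) can be checked before calling JDBC(). The following is a rough sketch under the assumption that a jar is an ordinary zip archive that base R's unzip() can list; jar_has_class is a hypothetical helper name, not part of RJDBC.

```r
# Hypothetical helper: TRUE if `jar_path` exists, is readable as a zip archive,
# and contains the .class entry for `driver_class`; FALSE otherwise.
jar_has_class <- function(jar_path, driver_class) {
  if (!file.exists(jar_path)) return(FALSE)
  # "com.amazon.redshift.jdbc42.Driver" -> "com/amazon/redshift/jdbc42/Driver.class"
  entry <- paste0(gsub(".", "/", driver_class, fixed = TRUE), ".class")
  entries <- tryCatch(
    utils::unzip(jar_path, list = TRUE)$Name,
    error = function(e) character(0),    # unreadable or corrupted archive
    warning = function(w) character(0)
  )
  entry %in% entries
}

# File and class names taken from the answer above.
print(jar_has_class("RedshiftJDBC42-1.2.10.1009.jar",
                    "com.amazon.redshift.jdbc42.Driver"))
```

If this prints FALSE, the jar is either not in the working directory, was not downloaded in binary mode, or does not contain that class.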

R Redshift Error in .jfindClass

I'm reading the how-to on connecting Redshift to R and am getting an error. Any ideas?
source - https://aws.amazon.com/blogs/big-data/connecting-r-with-amazon-redshift/
After the driver <- line I get this error:
driver <- JDBC("com.amazon.redshift.jdbc41.Driver", "RedshiftJDBC41-1.1.9.1009.jar", identifier.quote="`")
Error in .jfindClass(as.character(driverClass)[1]) : class not found
This error went away when I downloaded and used the 42 driver instead of the 41 driver:
download.file('http://s3.amazonaws.com/redshift-downloads/drivers/RedshiftJDBC42-1.2.1.1001.jar','RedshiftJDBC42-1.2.1.1001.jar')
Hopefully this will help someone. I'm on Windows 7.
Ray

Error in .jfindClass(as.character(driverClass)[1]) : class not found - Hive R

I am connected to a remote R server built on the x86_64-redhat-linux-gnu (64-bit) platform. The R version installed on this server is 3.3.1. I want to connect to a remote Hive database from this R server so that I can extract data and do some analysis on it. I am trying the following:
options( java.parameters = "-Xmx8g" )
library(rJava)
library(RJDBC)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver",
"/home/username/R/x86_64-redhat-linux-gnu-library/3.3/hive-jdbc-0.10.0.jar",
identifier.quote="`")
I am getting the error Error in .jfindClass(as.character(driverClass)[1]) : class not found. I downloaded the jar file and put it in /home/username/R/x86_64-redhat-linux-gnu-library/3.3/. This is the only jar file I have downloaded. That directory contains three folders (DBI, rJava, and RJDBC) and the file hive-jdbc-0.10.0.jar.
I have not downloaded anything else so far. Is there anything else I need to download to resolve this error?
Another attempt I tried was:
hivedrv <- JDBC("org.apache.hadoop.hive.jdbc.HiveDriver",
c(list.files("/home/username/R/x86_64-redhat-linux-gnu-library/3.3/",pattern="jar$",full.names=T),
list.files("/home/username/R/x86_64-redhat-linux-gnu-library/3.3/",pattern="jar$",full.names=T)))
This ran without any error. But when I try the following command,
hivecon <- dbConnect(hivedrv, "jdbc:hive://hostname:portname/", "username", "password")
I get the following error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException
I'm not sure how to solve this problem. Can anybody please help me connect the R server to the Hive database? Any information would be helpful.

How could R use RJDBC to connect to Hive?

I'm using hadoop-2.2.0 and hive-0.12. I used the following steps to try to connect to Hive in RStudio:
library("DBI")
library("rJava")
library("RJDBC")
for(l in list.files('/PATH/TO/hive/lib/')){ .jaddClassPath(paste("/PATH/TO/hive/lib/",l,sep=""))}
for(l in list.files('/PATH/TO/hadoop/')){ .jaddClassPath(paste("/PATH/TO/hadoop/",l,sep=""))}
options( java.parameters = "-Xmx8g" )
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", "/PATH/TO/hive/lib/hive-jdbc.jar")
conn <- dbConnect(drv, "jdbc:hive2://HOST:PORT", USER, PASSWD)
But I got the following error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
Any tips will be appreciated.
The problem is solved.
I loaded all of the jar files in the Hadoop directory, and then I could connect to Hive.
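One way to load every jar under a directory, including jars in subdirectories that a non-recursive list.files() call would miss, is sketched below. This is an illustration rather than the exact code used above: collect_jars is a hypothetical helper name, the paths are the same placeholders as in the question, and the rJava calls are left as comments because they need a working Java installation.

```r
# Hypothetical helper: return the full paths of all .jar files found
# anywhere under the given directories (searched recursively).
collect_jars <- function(dirs) {
  jars <- unlist(lapply(dirs, list.files, pattern = "\\.jar$",
                        full.names = TRUE, recursive = TRUE))
  unique(as.character(jars))
}

# Assumed usage with the placeholder paths from the question:
# library(rJava)
# .jinit()
# .jaddClassPath(collect_jars(c("/PATH/TO/hive/lib", "/PATH/TO/hadoop")))
```

The resulting vector can also be passed as the classPath argument of JDBC() instead of a single jar path.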
You can also connect to HiveServer2 from R using the RHive package.
Below are the commands that I used:
library(RHive)
Sys.setenv(HIVE_HOME="/usr/local/hive")
Sys.setenv(HADOOP_HOME="/usr/local/hadoop")
rhive.env(ALL=TRUE)
rhive.init()
rhive.connect("localhost")
