I am trying to follow an example given on "http://www.datastax.com/dev/blog/big-analytics-with-r-cassandra-and-hive" to connect R with Cassandra. Following is my code:
library(RJDBC)
#Load in the Cassandra-JDBC driver
cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver", list.files("D:/cassandra/lib",pattern="jar$",full.names=T))
#Connect to Cassandra node and Keyspace
casscon <- dbConnect(cassdrv, "jdbc:cassandra://127.0.0.1:9042/demodb")
When I run the above code in R, I get the following error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.sql.SQLNonTransientConnectionException: org.apache.thrift.transport.TTransportException: Read a negative frame size (-2113929216)!
In the Cassandra server window, I get the following error for the above code:
ERROR 14:41:26,671 Unexpected exception during request
java.lang.ArrayIndexOutOfBoundsException: 34
at org.apache.cassandra.transport.Message$Type.fromOpcode(Message.java:106)
at org.apache.cassandra.transport.Frame$Decoder.decode(Frame.java:168)
at org.jboss.netty.handler.codec.frame.FrameDecoder.callDecode(FrameDecoder.java:425)
at org.jboss.netty.handler.codec.frame.FrameDecoder.messageReceived(FrameDecoder.java:303)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:268)
at org.jboss.netty.channel.Channels.fireMessageReceived(Channels.java:255)
at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:88)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:109)
at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:312)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:90)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
I tried changing the port from 9042 to 9160, but in that case the request does not reach the server at all.
I also tried increasing thrift_framed_transport_size_in_mb from 15 to 500, but the error is the same.
Cassandra is otherwise running fine, and the database can be connected to and updated easily through DevCenter.
R version: R-3.1.0
Cassandra version: 2.0.8
Operating System: Windows XP
Firewall: off
Finally, I was able to connect to Cassandra from R. I followed these steps:
I updated Java 7 and R to their latest versions.
Then I reinstalled RJDBC, rJava, and DBI.
Then I used the following code and connected successfully:
library(RJDBC)
drv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver", list.files("D:/cassandra/lib/",pattern="jar$",full.names=T))
.jaddClassPath("D:/mysql-connector-java-3.1.14/cassandra-clientutil-1.0.2.jar")
conn <- dbConnect(drv, "jdbc:cassandra://127.0.0.1:9160/demodb")
res <- dbGetQuery(conn, "select * from emp")
# print values
res
Disclaimer: I'm a statistician/bioinformatician by training, so I'm quite new to networks, servers, and databases.
System: Macbook Pro (M1 chip).
I'm trying to connect to an SQL Server database remotely via R and RStudio.
To start, I ran the following commands in terminal (as seen here https://learn.microsoft.com/en-us/sql/connect/odbc/linux-mac/install-microsoft-odbc-driver-sql-server-macos?view=sql-server-ver16):
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
brew tap microsoft/mssql-release https://github.com/Microsoft/homebrew-mssql-release
brew update
HOMEBREW_NO_ENV_FILTERING=1 ACCEPT_EULA=Y brew install msodbcsql18 mssql-tools18
The code I'm running in RStudio is as follows (as seen here https://db.rstudio.com/getting-started/connect-to-database):
library(DBI)
library(odbc)
con <- DBI::dbConnect(odbc::odbc(),
Driver = "ODBC Driver 18 for SQL Server",
Server = "xxx.xxx.xxx.xx",
Database = "Dbname",
UID = "username",
PWD = "password",
Port = 3306,
.connection_string ="TrustServerCertificate=yes")
The above gives me the following error:
Error: nanodbc/nanodbc.cpp:1021: 00000: [unixODBC][Driver Manager]Data
source name not found and no default driver specified
I can't find any help related to the errors I'm getting at https://db.rstudio.com/getting-started.
A slightly different piece of code gives me a different error:
con <- DBI::dbConnect(odbc::odbc(),
.connection_string = "Driver={ODBC Driver 18 for SQL Server};Uid=username;Pwd=password;Host=xxx.xxx.xxx.xx;Port=3306;Database=Dbname;TrustServerCertificate=yes;")
Error: nanodbc/nanodbc.cpp:1021: 00000: [Microsoft][ODBC Driver 18 for
SQL Server]Neither DSN nor SERVER keyword supplied [Microsoft][ODBC
Driver 18 for SQL Server]Invalid connection string attribute
What is a Server Keyword as referred to in the second error? Is the server supposed to be an IP address as I've indicated in the code?
Does the use of ODBC Driver matter? How can I tell if I'm using the right one?
Am I off the mark with any of the information I'm feeding into dbConnect()?
Any tips welcome.
Thanks.
Thanks to @AlwaysLearning for pointing out the mistake in my second chunk of code: Host= should have been Server=.
As for the first chunk, I don't know why it wouldn't work.
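For anyone hitting the same error, here is a minimal sketch of the corrected connection string, assuming the fix is just to swap Host= for Server=. The helper name build_mssql_conn_string is hypothetical and not part of odbc or DBI. As far as I know, the Microsoft ODBC driver expects a non-default port to be appended to Server as "host,port" rather than passed as a separate Port= keyword, so the sketch does that; the host, port, and credentials are the placeholders from the question.

```r
# Hypothetical helper (not part of odbc/DBI): assembles an ODBC connection
# string for the Microsoft SQL Server driver using the Server keyword.
build_mssql_conn_string <- function(server, database, uid, pwd,
                                    port = NULL,
                                    driver = "ODBC Driver 18 for SQL Server",
                                    trust_cert = TRUE) {
  # Assumption: the port is appended to the server as "host,port"
  target <- if (is.null(port)) server else paste0(server, ",", port)
  parts <- c(
    Driver = paste0("{", driver, "}"),
    Server = target,
    Database = database,
    Uid = uid,
    Pwd = pwd,
    TrustServerCertificate = if (trust_cert) "yes" else "no"
  )
  # Join as key=value pairs separated by semicolons, with a trailing semicolon
  paste0(paste0(names(parts), "=", parts, collapse = ";"), ";")
}

conn_str <- build_mssql_conn_string("xxx.xxx.xxx.xx", "Dbname",
                                    "username", "password", port = 3306)
print(conn_str)

# To connect (requires the odbc and DBI packages and an installed driver):
# con <- DBI::dbConnect(odbc::odbc(), .connection_string = conn_str)
# To see which driver names the driver manager actually knows about:
# odbc::odbcListDrivers()
```

If odbc::odbcListDrivers() does not list "ODBC Driver 18 for SQL Server", the first error ("Data source name not found and no default driver specified") is probably a driver registration problem rather than a problem with the dbConnect() arguments.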
A colleague at work is having trouble connecting to a database with the odbc package, and I am trying to find help for him.
He is trying to query an Oracle database from R instead of running our traditional SAS programs, but he has not been successful. We are trying to find out what is causing the error messages below. Can someone help?
Attempt 1:
library(RJDBC)
library(glue)
#Get the Oracle JDBC driver
jdbcDriver = JDBC("oracle.jdbc.OracleDriver",
classPath="C:/instantclient_19_10/ojdbc8.jar")
#Create connection string to the Database we want
connect.string <-
glue("jdbc:oracle:thin:@//{host}:{port}/{sid}",
host = "stdbprd01.states.bls.gov",
port = 1521,
sid = "lausonep")
print(connect.string)
#Establish connection to your database
con <- dbConnect(jdbcDriver,
connect.string,
user = "username",
password = rstudioapi::askForPassword("Database password"))
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.sql.SQLRecoverableException: Listener refused the connection with the following error:
ORA-12514, TNS:listener does not currently know of service requested in connect descriptor
Attempt 2:
library(odbc)
con <- DBI::dbConnect(odbc::odbc(),
driver="Oracle in OraClient12Home1",
database="lausprd",
uid="aakre_n",
pwd="!QAZ1qaz#WSX2wsx",
host="stdbprd01.states.bls.gov",
port=1521
)
Error: nanodbc/nanodbc.cpp:1021: IM006: [Oracle][ODBC][ORA]-12560: TNS:protocol adapter error
[Microsoft][ODBC Driver Manager] Driver’s SQLSetConnectAttr failed
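One thing that may be worth checking for Attempt 1: with the Oracle thin driver, a URL of the form @//host:port/name treats the last part as a service name, while a SID is normally written as @host:port:SID. ORA-12514 would be consistent with passing a SID where a service name is expected, although that is only a guess about this setup. A minimal sketch of both forms follows; the helper name oracle_jdbc_url is hypothetical.

```r
# Hypothetical helper: builds an Oracle thin-driver JDBC URL in either the
# service-name form or the SID form.
oracle_jdbc_url <- function(host, port, name, type = c("service", "sid")) {
  type <- match.arg(type)
  if (type == "service") {
    # Service-name syntax: jdbc:oracle:thin:@//host:port/service_name
    sprintf("jdbc:oracle:thin:@//%s:%s/%s", host, port, name)
  } else {
    # SID syntax: jdbc:oracle:thin:@host:port:SID
    sprintf("jdbc:oracle:thin:@%s:%s:%s", host, port, name)
  }
}

# Values taken from the question; whether "lausonep" is a SID or a service
# name is an assumption that the database administrator would need to confirm.
print(oracle_jdbc_url("stdbprd01.states.bls.gov", 1521, "lausonep", "service"))
print(oracle_jdbc_url("stdbprd01.states.bls.gov", 1521, "lausonep", "sid"))
```

Either string can be passed to dbConnect() in place of connect.string to see which form the listener accepts.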
I'm trying to connect RStudio to Amazon Redshift via JDBC and this is what I tried to run:
driver <- JDBC("com.amazon.redshift.jdbc42.Driver", "~/Downloads/RedshiftJDBC42-1.2.1.1001.jar", identifier.quote="`")
# url <- "<JDBCURL>:<PORT>/<DBNAME>?user=<USER>&password=<PW>
url <- "jdbc:redshift://<cluster-name>.<xxxxxx>.us-east-1.redshift.amazonaws.com:5439/<dbname>?user=<username>&password=<password>"
conn <- dbConnect(driver, url)
When executing dbConnect(), I get the following error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.sql.SQLException: [Amazon](500150) Error setting/closing connection: Operation timed out.
Any idea what is causing this and how to fix it?
Update: There was a problem with access through security groups. If you're having a similar issue, check the inbound rules of your security group and make sure they allow access to Redshift via your IP.
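A quick way to tell a network or security-group problem apart from a driver problem is to test whether the port is reachable at all before involving JDBC. The following is a minimal sketch using base R only; the function name can_reach is hypothetical, and the host in the usage comment is the placeholder from the question.

```r
# Hypothetical helper: returns TRUE if a TCP connection to host:port can be
# opened within `timeout` seconds, FALSE otherwise.
can_reach <- function(host, port, timeout = 5) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # refused, timed out, or host not resolvable
  )
  if (is.null(con)) {
    return(FALSE)
  }
  close(con)
  TRUE
}

# Usage (replace the placeholder with the real cluster endpoint):
# can_reach("<cluster-name>.<xxxxxx>.us-east-1.redshift.amazonaws.com", 5439)
```

If this returns FALSE, the JDBC driver never gets a chance to talk to Redshift, which points to inbound rules, firewalls, or VPN settings rather than the R code.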
I want to use RSQLServer instead of RODBC to connect to a database called 'Mkt_DW'. I think my server hostname is my machine DHX number - that's what is returned when I query the hostname in SQL Server 2008 using:
SELECT HOST_NAME() AS HostName, SUSER_NAME() LoggedInUser
I then enter the following code into RStudio:
library(RSQLServer)
library(DBI)
driver <- dbDriver("SQLServer")
url <- "DHX32510;Database=Mkt_DW;Trusted_Connection=TRUE;"
conn <- dbConnect(driver, url)
I get the following error:
Error in rJava::.jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", url, :
java.sql.SQLException: Network error IOException: Connection refused: connect
Can anyone tell me what I'm doing wrong?
Thanks,
Neil
This happens because it cannot find the 'sql.yaml' file.
See the note from the package author:
"See ?SQLServer. It will look for the YAML file in the following location by default: Sys.getenv("HOME")"
https://github.com/imanuelcostigan/RSQLServer/issues/57
I have been trying to use the RJDBC package to connect R (on my local machine) to Hive (on a server), and I am seeing this error:
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.sql.SQLException: org.apache.thrift.TApplicationException: Invalid method name: 'execute'
I copied the jars that are running on the server directly to my local machine, so the driver versions should not be the problem. I also tried earlier versions of the RJDBC package, but that did not work either.
I would really appreciate any ideas/suggestions.
My script:
#
hive_connection <- function( hostname, port, lib_dir, hive_jars){
library(RJDBC)
library(DBI)
library(rJava)
library(Rserve)
# lib_dir: directory containing the jars & drivers
hive_class_path <- file.path( lib_dir, hive_jars )
drv <- JDBC( 'org.apache.hadoop.hive.jdbc.HiveDriver', classPath= hive_class_path, "`" )
server <- sprintf( 'jdbc:hive://%s:%s', hostname, port )
return ( dbConnect( drv, server, 'hive','hive') )
}
conn <- hive_connection('hostname',9083,'lib_dir', list.files('lib_dir'))
This is related to the driver and the port. I was facing the same error while connecting to Hive with the JDBC driver. Once I found the right driver and the right Hive service and port, it worked fine.
Try
drv <- JDBC( 'org.apache.hadoop.hive.jdbc.HiveDriver', list.files(lib_dir, pattern="jar$", full.names=T) )
I resolved the same issue with the following two changes:
Change 1:
drv <- JDBC( 'org.apache.hive.jdbc.HiveDriver', classPath= hive_class_path, "`" )
The change is in the driver class name: notice that I took out .hadoop.
Change 2:
server <- sprintf( 'jdbc:hive2://%s:%s', hostname, port )
I added "2" to the scheme in the connection URL in order to connect to HiveServer2.
I got the detailed explanation by reading this: http://jayunit100.blogspot.com/2013/12/the-anatomy-of-jdbc-connection-in-hive.html
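Putting the two changes together, the driver class and the URL scheme have to match the kind of Hive server being targeted. Here is a minimal sketch of that pairing; the helper name hive_jdbc_settings is hypothetical, and port 10000 in the usage comment is the usual HiveServer2 default, which is an assumption about the cluster (9083, used in the question, is normally the metastore port).

```r
# Hypothetical helper: returns the driver class and JDBC URL that belong
# together for either HiveServer2 or the original HiveServer.
hive_jdbc_settings <- function(hostname, port,
                               server_version = c("hiveserver2", "hiveserver1")) {
  server_version <- match.arg(server_version)
  if (server_version == "hiveserver2") {
    list(
      driver_class = "org.apache.hive.jdbc.HiveDriver",
      url = sprintf("jdbc:hive2://%s:%s", hostname, port)
    )
  } else {
    list(
      driver_class = "org.apache.hadoop.hive.jdbc.HiveDriver",
      url = sprintf("jdbc:hive://%s:%s", hostname, port)
    )
  }
}

settings <- hive_jdbc_settings("hostname", 10000)
print(settings)

# To connect (requires RJDBC and the Hive client jars in hive_class_path):
# drv  <- RJDBC::JDBC(settings$driver_class, classPath = hive_class_path,
#                     identifier.quote = "`")
# conn <- DBI::dbConnect(drv, settings$url, "hive", "hive")
```

Mixing the old driver class with a HiveServer2 endpoint, or the reverse, is one way to end up with Thrift errors such as "Invalid method name: 'execute'".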