Connecting to Hive (Kerberos enabled) with the R RJDBC package from RStudio on Windows

The following issue occurs when trying to connect to Hive 2 (Kerberos authentication is enabled) using the R RJDBC package. The Simba driver is used to connect to Hive.
hiveConnection <- dbConnect(hiveJDBC, "jdbc:hive2://xxxx:10000/default;AuthMech=1;KrbRealm=xx.yy.com;KrbHostFQDN=dddd.yy.com;KrbServiceName=hive")
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.sql.SQLException: [Simba]HiveJDBCDriver Invalid operation: Unable to obtain Principal Name for authentication ;

Make sure kinit has been issued and a Kerberos ticket has been generated (check with klist).
Make sure the right Java version for the given R version (32/64-bit) is available on the classpath.
Make sure the right slf4j jars for your Java version are available.
All of these steps should resolve the issue, assuming your code does not have logic issues.
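A minimal sketch of the full sequence, assuming the Simba Hive JDBC jars sit in C:/Simba/HiveJDBC (a hypothetical path) and that the driver class is com.simba.hive.jdbc41.HS2Driver (the class name can differ between driver versions):

# check from R that a valid Kerberos ticket already exists (obtained earlier with kinit)
system("klist")

options(java.parameters = "-Xmx2g")
library(rJava)
library(RJDBC)

# hypothetical jar directory; place the Simba driver jars and matching slf4j jars here
jars <- list.files("C:/Simba/HiveJDBC", pattern = "jar$", full.names = TRUE)
hiveJDBC <- JDBC("com.simba.hive.jdbc41.HS2Driver", classPath = jars, identifier.quote = "`")

hiveConnection <- dbConnect(hiveJDBC,
  "jdbc:hive2://xxxx:10000/default;AuthMech=1;KrbRealm=xx.yy.com;KrbHostFQDN=dddd.yy.com;KrbServiceName=hive")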

Related

Connecting Oracle to R issue

I am trying to connect Oracle to RStudio using the "ROracle" package. I've installed the package and loaded it. I also have the "DBI" package installed and loaded.
I am using dbConnect(dbDriver("Oracledrivername"), oracle_schema, oracle_password, dbname="dbname") to connect to my Oracle schema, but I am getting this error:
Error in h(simpleError(msg,call)):
error in evaluating the argument 'drv' in selecting a method for function 'dbConnect'
I then tried to narrow it down by testing dbDriver("Oracledrivername") by itself and the Error I get is:
Error: Couldn't find driver Oracledrivername
Things I have done to attempt to fix this:
I tested my connection to "Oracledrivername" in the ODBC Data Source Administrator, and the connection was OK.
The RStudio I am using is 64-bit, the Oracle client is v12.1.0 64-bit, and the ODBC driver was set up as 64-bit.
I have set the ORACLE_HOME location to C:\ORACLE12_64BIT\product\12.1.0\client_1
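For comparison, a minimal sketch of the ROracle call pattern: ROracle registers its driver under the name "Oracle" rather than the ODBC data source name, which is why dbDriver("Oracledrivername") is not found. The connect string below is a hypothetical Easy Connect string:

library(DBI)
library(ROracle)

# ROracle's driver is requested by the name "Oracle"
drv <- dbDriver("Oracle")

# the username, password and connect string are placeholders
con <- dbConnect(drv,
                 username = "oracle_schema",
                 password = "oracle_password",
                 dbname   = "//dbhost:1521/service_name")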

Simba Athena ODBC: unable to use SQLGetPrivateProfileString functions

This is very strange. I want to set up a connection from RStudio to my instance of AWS Athena.
I am using unixODBC as the driver manager, and I succeeded in testing the connection using isql -v 'Simba Athena'. However, when I test the connection in RStudio with...
con <- DBI::dbConnect(
odbc::odbc(),
"Simba Athena"
)
... it gives me the error Error: nanodbc/nanodbc.cpp:1021: 00000: [Simba][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function. Any clue about this? I am a bit stuck.
It is basically not finding the correct ODBC library. By default, Simba's setup file at /Library/simba/athenaodbc/lib/simba.athenaodbc.ini references libodbc.dylib, but it should reference libodbcinst.dylib, at least on macOS.
This solved my problem.
I got the same error when linking with the static library "libodbc.a"; however, I was able to connect once I switched to linking with the dynamic library "libodbc.so".
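A sketch of the change described above, assuming the setup file uses Simba's usual ODBCInstLib key and that the driver manager is installed under /usr/local/lib (both the key name and the path may differ on your system):

[Driver]
; point the driver at the ODBC installer library rather than the driver manager library
ODBCInstLib=/usr/local/lib/libodbcinst.dylib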

Error in .jfindClass(as.character(driverClass)[1]) : class not found - Hive R

I am connected to a remote R server which is built on the x86_64-redhat-linux-gnu (64-bit) platform. The R version installed on this server is 3.3.1. I want to connect to a remote Hive database using this R server so that I can extract data and do some analysis on it. I am trying the following:
options( java.parameters = "-Xmx8g" )
library(rJava)
library(RJDBC)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver",
"/home/username/R/x86_64-redhat-linux-gnu-library/3.3/hive-jdbc-0.10.0.jar",
identifier.quote="`")
I am getting the error Error in .jfindClass(as.character(driverClass)[1]) : class not found. I downloaded the jar file and kept it in the path /home/username/R/x86_64-redhat-linux-gnu-library/3.3/. I have downloaded only this jar file. Inside the /home/username/R/x86_64-redhat-linux-gnu-library/3.3/ path I have three folders, DBI, rJava and RJDBC, and the file hive-jdbc-0.10.0.jar.
Apart from this I have not downloaded anything else for now. Is there anything else I need to download in order for this error to be resolved?
Another attempt which I tried was,
hivedrv <- JDBC("org.apache.hadoop.hive.jdbc.HiveDriver",
c(list.files("/home/username/R/x86_64-redhat-linux-gnu-library/3.3/",pattern="jar$",full.names=T),
list.files("/home/username/R/x86_64-redhat-linux-gnu-library/3.3/",pattern="jar$",full.names=T)))
which ran without any error. But when I try the following command,
hivecon <- dbConnect(hivedrv, "jdbc:hive://hostname:portname/", "username", "password")
I am getting the following error,
Error in .jcall(drv@jdrv, "Ljava/sql/Connection;", "connect", as.character(url)[1], :
java.lang.NoClassDefFoundError: org/apache/hadoop/hive/metastore/api/MetaException
Not sure how to solve this problem. Can anybody please help me in connecting the R server to Hive database? Any information would be helpful.
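A minimal sketch of one way to set up the classpath, assuming the Hive client jars and their dependencies (hive-metastore, hive-service, hadoop-common, libthrift, and so on) have been collected into a hypothetical directory /home/username/hive-jars; the NoClassDefFoundError for MetaException typically points to a missing hive-metastore jar:

library(rJava)
library(RJDBC)

# put every Hive/Hadoop client jar on the driver classpath, not just hive-jdbc
cp <- list.files("/home/username/hive-jars", pattern = "jar$", full.names = TRUE)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver", classPath = cp, identifier.quote = "`")

# HiveServer2 URLs use the jdbc:hive2:// scheme
con <- dbConnect(drv, "jdbc:hive2://hostname:10000/default", "username", "password")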

Shiny Server cannot use RODBC to connect to DB2 but RStudio can in a Docker Container

I am working on deploying a Shiny application in a Docker container onto Bluemix. I am using the rocker/shiny Docker image (https://hub.docker.com/r/rocker/shiny/) as my starting point. I have installed unixODBC-dev, RODBC, the IBM Data Server Driver package, the ibmdbR library for R, and all needed dependencies. My only problem is that when I try to access the Shiny app from a web browser it fails to execute; the error is:
Warning in odbcDriverConnect("DSN=BLUDB", :
[RODBC] ERROR: state 01000, code 0, message [unixODBC][Driver Manager]Can't open lib '/root/db2_cli_odbc_driver/dsdriver/odbc_cli_driver/linuxamd64/clidriver/lib/libdb2o.so' : file not found
Warning in odbcDriverConnect("DSN=BLUDB; :
ODBC connection failed
Error in idaInit(con) : con is not an open connection, please use idaConnect() to create an open connection to the data base.
Initially I had this same problem whenever I would try to use isql to connect to the database or try to connect from RStudio. I used ldd on that library file and found what was missing, and that fixed making connections from the command line and RStudio. However, my Shiny Server still gives me the same error. Is there anything I am missing?
I ended up solving the problem myself; it turns out the libraries were not accessible to shiny-server, which was running as a service. I moved the DB2 ODBC drivers over to /usr/local/lib to make them accessible. I also ran ldd on the library mentioned in the error message and found that I had to install libxml2 as well. After doing that, I simply changed my odbcinst.ini file at /etc to reference the new location of the DB2 library, and now it all works! Hopefully anyone else trying to deploy Shiny apps that rely on connecting to a DB2 database will find this useful.
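A sketch of what the updated /etc/odbcinst.ini entry might look like after the move, assuming the driver is registered under the name "DB2" and that libdb2o.so now sits directly under /usr/local/lib (both are illustrative):

[DB2]
Description=IBM DB2 ODBC driver
Driver=/usr/local/lib/libdb2o.so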

Kerberos connection error to Hive2 using JDBC in R

I used to be able to run R code to pull a Hive table using JDBC under Cloudera CDH 4.5. However, after upgrading to CDH 5.3 I get the connection error below (failed to find any Kerberos tgt); it seems it can no longer connect to the cluster.
The Hive server has been upgraded to HiveServer2/Beeline.
Please see the code and error log below. Any experience and advice on how to fix this? Thanks.
options(width=120)
options( java.parameters = "-Xmx4g" )
query="select * from Hive_table"
user="user1"
passw="xxxxxxx"
hiveQuerytoDataFrame<-function(user,passw,query){
  library(RJDBC)
  .jaddClassPath("/opt/cloudera/parcels/CDH/lib/hive/lib/hive-jdbc-0.10.0-cdh5.3.3.jar")
  drv <- JDBC("org.apache.hive.jdbc.HiveDriver",classPath = list.files("/opt/cloudera/parcels/CDH/lib/",pattern="jar$",full.names=T, recursive = TRUE),identifier.quote="`")
  conn <- dbConnect(drv,"jdbc:hive2://server.domain.<>.com:10000/default;principal=hive/server.domain.com@SERVER.DOMAIN.COM",user,passw)
  #dbListTables(conn)
  jdbc_out<-dbGetQuery(conn,query)
  str(jdbc_out)
  return(jdbc_out)
}
Log:
ERROR transport.TSaslTransport: SASL negotiation failure
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
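A hedged sketch of one common approach to the "Failed to find any Kerberos tgt" message: it usually means the JVM cannot see a valid ticket cache, so obtain a ticket before the JVM starts and relax the JAAS credential restriction. The keytab path and flags below are assumptions, not a confirmed fix for this cluster:

# set JVM options before rJava/RJDBC start the JVM
options(java.parameters = c("-Xmx4g",
                            "-Djavax.security.auth.useSubjectCredsOnly=false"))

# obtain a Kerberos ticket first (hypothetical keytab path)
system("kinit -kt /home/user1/user1.keytab user1@SERVER.DOMAIN.COM")

library(RJDBC)
drv <- JDBC("org.apache.hive.jdbc.HiveDriver",
            classPath = list.files("/opt/cloudera/parcels/CDH/lib/",
                                   pattern = "jar$", full.names = TRUE, recursive = TRUE),
            identifier.quote = "`")
conn <- dbConnect(drv,
  "jdbc:hive2://server.domain.com:10000/default;principal=hive/server.domain.com@SERVER.DOMAIN.COM")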
