I am not able to connect to an Oracle database from my R/Python scripts; the code follows. The tnsping utility is able to resolve the database alias using LDAP, so I am pasting the tnsping output as well.
TNSPing output
C:\Windows\System32>tnsping UHK00500_SECCOMPAS_APPL
TNS Ping Utility for 32-bit Windows: Version 11.2.0.2.0 - Production on 12-APR-2013 10:26:26
Copyright (c) 1997, 2010, Oracle. All rights reserved.
Used parameter files:
c:\apps\oracle\network\admin\sqlnet.ora
Used LDAP adapter to resolve the alias
Attempting to contact (DESCRIPTION = (SDU = 8192) (TDU = 8192) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = PHKLOD2002-SCAN.ap.hedani.net)(PORT = 1522)) (LOAD_BALANCE = on) (FAILOVER = on ) ) (CONNECT_DATA = (SERVICE_NAME = UHK00500_SECCOMPAS_APPL.WORLD) (FAILOVER_MODE = (TYPE = session) (METHOD = basic) (RETRIES = 20) (DELAY = 5))))
OK (60 msec)
R script output
Oracle 11g driver
chan <- odbcDriverConnect("driver=Oracle in OraHome112_32;DBQ=UHK00500_SECCOMPAS_APPL;UID=toolkit;PWD=**")
Warning messages:
1: In odbcDriverConnect("driver=Oracle in OraHome112_32;DBQ=UHK00500_SECCOMPAS_APPL;UID=toolkit;PWD=**") :
[RODBC] ERROR: state 08004, code 12154, message [Oracle][ODBC][Ora]ORA-12154: TNS:could not resolve the connect identifier specified
2: In odbcDriverConnect("driver=Oracle in OraHome112_32;DBQ=UHK00500_SECCOMPAS_APPL;UID=toolkit;PWD=**") :
ODBC connection failed
Microsoft ODBC driver output
chan <- odbcDriverConnect("Driver={Microsoft ODBC for Oracle};Server=UHK00500_SECCOMPAS_APPL;Uid=toolkit;Pwd=**")
Warning messages:
1: In odbcDriverConnect("Driver={Microsoft ODBC for Oracle};Server=UHK00500_SECCOMPAS_APPL;Uid=toolkit;Pwd=*") :
[RODBC] ERROR: state 08001, code 12154, message [Microsoft][ODBC driver for Oracle][Oracle]ORA-12154: TNS:could not resolve the connect identifier specified
2: In odbcDriverConnect("Driver={Microsoft ODBC for Oracle};Server=UHK00500_SECCOMPAS_APPL;Uid=toolkit;Pwd=**") :
ODBC connection failed
Can someone please advise what I should check here to correct this issue?
Not sure what the issue was, but after restarting my R instance the connection was fine.
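If it recurs: ORA-12154 from ODBC while tnsping succeeds usually means the R process does not see the same Oracle environment as the shell; variables such as TNS_ADMIN are read once at session start, which would also explain why a restart helped. A minimal diagnostic sketch; the Easy Connect string is assembled from the tnsping output above and assumes the driver accepts EZCONNECT syntax in DBQ:
library(RODBC)
# does this R session see the same Oracle environment as the shell?
Sys.getenv(c("TNS_ADMIN", "PATH"))
# bypass alias resolution entirely with an Easy Connect descriptor
# (host/port/service copied from the tnsping output above)
chan <- odbcDriverConnect(paste0(
  "Driver={Oracle in OraHome112_32};",
  "DBQ=PHKLOD2002-SCAN.ap.hedani.net:1522/UHK00500_SECCOMPAS_APPL.WORLD;",
  "UID=toolkit;PWD=**"))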
Related
I'm trying to connect to a database via SSL using the code suggested here: https://github.com/ropensci/ssh/issues/13
I listed the dummy code below that shows how I enable the connection and try to query some data. The solution works great for 'smaller' queries.
However, when I try to get 'larger' data, the query fails and R gives back the following error:
System failure for: recv() from user (Connection reset by peer), followed by a fetching error: Failed to fetch row: SSL error: decryption failed or bad record mac (see output in the code snippet).
According to the first error message, I suppose the error occurs server-side ('reset by peer' --> What does "connection reset by peer" mean?).
Is that true or is there a way to fix this error on the client side (in R)?
# read the private key and start the SSH tunnel in a background R process
ssh::ssh_read_key(file = ssh::ssh_home("id_rsa"), password = "rsa_password")
cmd <- "session <- ssh::ssh_connect('user@host:port'); ssh::ssh_tunnel(session, port = 5432, target = '127.0.0.1:5432')"
pid <- sys::r_background(std_out = TRUE, args = c("-e", cmd))
# connect to Postgres through the local end of the tunnel
dbcon <- DBI::dbConnect(drv = RPostgres::Postgres(),
                        dbname = "db_name",
                        host = "127.0.0.1",
                        port = 5432,
                        user = "db_user",
                        password = "db_password",
                        base::list(sslmode = "require"),
                        service = NULL)
# example of a working query
res <- DBI::dbGetQuery(conn = dbcon, statement = "SELECT * FROM small_table")
# example of a non-working query (see R output below)
res <- DBI::dbGetQuery(conn = dbcon, statement = "SELECT * FROM large_table;")
## R-output
# Tunneled 31897311 bytes...Error: System failure for: recv() from user (Connection reset by peer)
# Execution halted
# Error: Failed to fetch row: SSL error: decryption failed or bad record mac
# Warning message:
# Disconnecting from unused ssh session. Please use ssh_disconnect()
"Connection reset by peer" means that whatever you have tried connecting to has responded in an RST flag, meaning that they have reset the connection.
I am trying to connect to an Oracle database using the odbc package in R, but the connection attempt throws a caught segfault and we are unable to connect. Details are below. We have configured unixODBC on the server, and with the RODBC package we are able to establish the connection.
library(odbc)
Target2Conn<-dbConnect(odbc::odbc(), dsn = "PEGA_DEV", uid = "username", pwd = "username_123")
*** caught segfault ***
address 0x38, cause 'memory not mapped'
Traceback:
1: odbc_connect(connection_string, timezone = timezone, timezone_out = timezone_out, encoding = encoding, bigint = bigint, timeout = timeout)
2: OdbcConnection(dsn = dsn, ..., timezone = timezone, timezone_out = timezone_out, encoding = encoding, bigint = bigint, timeout = timeout, driver = driver, server = server, database = database, uid = uid, pwd = pwd, dbms.name = dbms.name, .connection_string = .connection_string)
3: .local(drv, ...)
4: dbConnect(odbc::odbc(), dsn = "PEGA_DEV", uid = "username", pwd = "username_123")
5: dbConnect(odbc::odbc(), dsn = "PEGA_DEV", uid = "username", pwd = "username_123")
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection:
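For reference, the RODBC route that the question says does work would look like this; a minimal sketch with the same placeholder DSN and credentials:
library(RODBC)
ch <- odbcConnect("PEGA_DEV", uid = "username", pwd = "username_123")
sqlQuery(ch, "SELECT 1 FROM dual")  # quick sanity check against Oracle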
I am trying to access a database, HWSD.mdb, which can be downloaded here: http://webarchive.iiasa.ac.at/Research/LUC/External-World-soil-database/HTML/HWSD_Data.html?sb=4
I have a Linux machine and I tried the following, but I get an error message:
require(MonetDB.R)
Mydb <- src_monetdb("*/HWSD.mdb")
Error in socketConnection(host = host, port = port, blocking = TRUE, open = "r+b", :
cannot open the connection
In addition: Warning message:
In socketConnection(host = host, port = port, blocking = TRUE, open = "r+b", :
localhost:50000 cannot be opened
Can anyone help me out with this?
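A note on what the error says: src_monetdb() opens a socket to a running MonetDB server (localhost:50000 by default); it cannot open a file path, and HWSD.mdb is a Microsoft Access file, not a MonetDB database. One hedged alternative is to export the Access table to CSV and load it into an embedded MonetDBLite database; a sketch, with file paths as placeholders:
library(DBI)
library(MonetDBLite)  # embedded MonetDB, no server process required
con <- dbConnect(MonetDBLite::MonetDBLite(), "~/hwsd_db")   # creates/opens a local db directory
dbWriteTable(con, "hwsd", read.csv("HWSD_DATA.csv"))        # CSV exported from HWSD.mdb
dbGetQuery(con, "SELECT COUNT(*) FROM hwsd")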
I'm having a problem configuring the Oracle ODBC driver.
The dialog page is blank, and when I enter the TNS name as XE I get the following error:
unable to connect SQLState=08004
My tnsnames.ora file is:
KPI_SERVER=
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST =localhost)(PORT =1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = XE)
)
)
The connection is successful in SQL Developer with the following data:
hostname: localhost
port number: 1521
service: XE
and the TNS_ADMIN variable is set to: C:\oracle_odbc\tnsnames
PATH is set to: C:\oracle_odbc
What did I do wrong?
Thank you for your time.
Let's establish some order. Open a command prompt, then:
1. launch echo %TNS_ADMIN%: is the result C:\oracle_odbc\?
2. launch dir C:\oracle_odbc\: does the listing include tnsnames.ora?
3. launch type C:\oracle_odbc\tnsnames.ora: is the result the content of the "my tnsnames file is" section of your initial post?
If all the answers are yes, can you retry launching sqlplus.exe dbuser/dbpassword@KPI_SERVER?
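The same checks can be done from R, plus one observation grounded in the post itself: the tnsnames.ora shown defines the alias KPI_SERVER, while the ODBC dialog was given XE (XE is only the SERVICE_NAME inside the descriptor). A sketch, assuming the paths from the question:
Sys.getenv("TNS_ADMIN")                      # should name the directory, e.g. C:\oracle_odbc
file.exists("C:/oracle_odbc/tnsnames.ora")   # the file itself must be named tnsnames.ora
# the name to enter in the ODBC dialog is the left-hand side of the entry:
# KPI_SERVER, not XE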
Recently I found out about the great dplyr.spark.hive package that enables dplyr frontend operations with a Spark or Hive backend.
There is information on how to install this package in the package's README:
options(repos = c("http://r.piccolboni.info", unlist(options("repos"))))
install.packages("dplyr.spark.hive")
and there are also many examples of how to work with dplyr.spark.hive when one is already connected to hiveServer2; check this.
But I am not able to connect to hiveServer2, so I cannot benefit from the great power of this package...
I've tried the commands below, but they did not work out. Does anyone have a solution or a comment on what I am doing wrong?
> library(dplyr.spark.hive,
+ lib.loc = '/opt/wpusers/mkosinski/R/x86_64-redhat-linux-gnu-library/3.1')
Warning: changing locked binding for ‘over’ in ‘dplyr’ whilst loading ‘dplyr.spark.hive’
Warning: changing locked binding for ‘partial_eval’ in ‘dplyr’ whilst loading ‘dplyr.spark.hive’
Warning: changing locked binding for ‘default_op’ in ‘dplyr’ whilst loading ‘dplyr.spark.hive’
Warning messages:
1: replacing previous import by ‘purrr::%>%’ when loading ‘dplyr.spark.hive’
2: replacing previous import by ‘purrr::order_by’ when loading ‘dplyr.spark.hive’
>
> Sys.setenv(SPARK_HOME = "/opt/spark-1.5.0-bin-hadoop2.4")
> Sys.setenv(HIVE_SERVER2_THRIFT_BIND_HOST = 'tools-1.hadoop.srv')
> Sys.setenv(HIVE_SERVER2_THRIFT_PORT = '10000')
>
> my_db = src_SparkSQL()
Error in .jfindClass(as.character(driverClass)[1]) : class not found
>
> my_db = src_SparkSQL(host = 'jdbc:hive2://tools-1.hadoop.srv:10000/loghost;auth=noSasl',
+ port = 10000)
Error in .jfindClass(as.character(driverClass)[1]) : class not found
>
> my_db = src_SparkSQL(start.server = TRUE)
Error in start.server() :
Couldn't start thrift server:org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 running as process 37580. Stop it first.
In addition: Warning message:
running command 'cd /opt/tech/prj_bdc/pmozie_status/user_topics;/opt/spark-1.5.0-bin-hadoop2.4/sbin/start-thriftserver.sh ' had status 1
>
> my_db = src_SparkSQL(start.server = TRUE,
+ list(spark.num.executors='5', spark.executor.cores='5', master="yarn-client"))
Error in start.server() :
Couldn't start thrift server:org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 running as process 37580. Stop it first.
In addition: Warning message:
running command 'cd /opt/tech/prj_bdc/pmozie_status/user_topics;/opt/spark-1.5.0-bin-hadoop2.4/sbin/start-thriftserver.sh ' had status 1
EDIT 2
I have set more system variables, as shown below, but now I receive a warning telling me that some kind of Java logging configuration is not specified, although I think it is:
> library(dplyr.spark.hive,
+ lib.loc = '/opt/wpusers/mkosinski/R/x86_64-redhat-linux-gnu-library/3.1')
Warning messages:
1: replacing previous import by ‘purrr::%>%’ when loading ‘dplyr.spark.hive’
2: replacing previous import by ‘purrr::order_by’ when loading ‘dplyr.spark.hive’
3: package ‘SparkR’ was built under R version 3.2.1
>
> Sys.setenv(SPARK_HOME = "/opt/spark-1.5.0-bin-hadoop2.4")
> Sys.setenv(HIVE_SERVER2_THRIFT_BIND_HOST = 'tools-1.hadoop.srv')
> Sys.setenv(HIVE_SERVER2_THRIFT_PORT = '10000')
> Sys.setenv(HADOOP_JAR = "/opt/spark-1.5.0-bin-hadoop2.4/lib/spark-assembly-1.5.0-hadoop2.4.0.jar")
> Sys.setenv(HADOOP_HOME="/usr/share/hadoop")
> Sys.setenv(HADOOP_CONF_DIR="/etc/hadoop")
> Sys.setenv(PATH='/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/share/hadoop/bin:/opt/hive/bin')
>
>
> my_db = src_SparkSQL()
log4j:WARN No appenders could be found for logger (org.apache.hive.jdbc.Utils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
My log4j properties file is not empty:
-bash-4.2$ wc /etc/hadoop/log4j.properties
179 432 6581 /etc/hadoop/log4j.properties
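For what it's worth, the log4j warning comes from the JVM that rJava starts for RJDBC, which does not have /etc/hadoop on its classpath, so it never sees that log4j.properties. It is cosmetic, but a hedged way to silence it is to point log4j at the file before the JVM initializes:
# must run before any rJava-based package starts the JVM in this session
options(java.parameters = "-Dlog4j.configuration=file:/etc/hadoop/log4j.properties")
library(dplyr.spark.hive)  # the first JVM start picks up java.parameters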
EDIT 3
My exact call to src_SparkSQL() is:
> detach("package:SparkR", unload=TRUE)
Warning message:
package ‘SparkR’ was built under R version 3.2.1
> detach("package:dplyr", unload=TRUE)
> library(dplyr.spark.hive, lib.loc = '/opt/wpusers/mkosinski/R/x86_64-redhat-linux-gnu-library/3.1')
Warning: changing locked binding for ‘over’ in ‘dplyr’ whilst loading ‘dplyr.spark.hive’
Warning: changing locked binding for ‘partial_eval’ in ‘dplyr’ whilst loading ‘dplyr.spark.hive’
Warning: changing locked binding for ‘default_op’ in ‘dplyr’ whilst loading ‘dplyr.spark.hive’
Warning messages:
1: replacing previous import by ‘purrr::%>%’ when loading ‘dplyr.spark.hive’
2: replacing previous import by ‘purrr::order_by’ when loading ‘dplyr.spark.hive’
> Sys.setenv(HADOOP_JAR = "/opt/spark-1.5.0-bin-hadoop2.4/lib/spark-assembly-1.5.0-hadoop2.4.0.jar")
> Sys.setenv(HIVE_SERVER2_THRIFT_BIND_HOST = 'tools-1.hadoop.srv')
> Sys.setenv(HIVE_SERVER2_THRIFT_PORT = '10000')
> my_db = src_SparkSQL()
log4j:WARN No appenders could be found for logger (org.apache.hive.jdbc.Utils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
And then the process never finishes.
Those same settings work for beeline with these parameters:
beeline -u "jdbc:hive2://tools-1.hadoop.srv:10000/loghost;auth=noSasl" -n mkosinski --outputformat=tsv --incremental=true -f sql_statement.sql > sql_output
but I am not able to pass the user name and dbname to src_SparkSQL(),
so I tried to manually use the code from inside that function, but I hit the same problem: the code below also never finishes.
host = 'tools-1.hadoop.srv'
port = 10000
driverclass = "org.apache.hive.jdbc.HiveDriver"
Sys.setenv(HADOOP_JAR = "/opt/spark-1.5.0-bin-hadoop2.4/lib/spark-assembly-1.5.0-hadoop2.4.0.jar")
library(RJDBC)
dr = JDBC(driverclass, Sys.getenv("HADOOP_JAR"))
url = paste0("jdbc:hive2://", host, ":", port)
class = "Hive"
con.class = paste0(class, "Connection") # class = "Hive"
# dbConnect_retry =
# function(dr, url, retry){
# if(retry > 0)
# tryCatch(
# dbConnect(drv = dr, url = url),
# error =
# function(e) {
# Sys.sleep(0.1)
# dbConnect_retry(dr = dr, url = url, retry - 1)})
# else dbConnect(drv = dr, url = url)}
#################
##con = new(con.class, dbConnect_retry(dr, url, retry = 100))
#################
con = new(con.class, dbConnect(dr, url, user = "mkosinski", dbname = "loghost"))
Maybe the url should also contain /loghost, the dbname?
I now see that you tried multiple things and got multiple errors. Let me comment on them error by error.
my_db = src_SparkSQL()
Error in .jfindClass(as.character(driverClass)[1]) : class not found
The RJDBC object could not be created. Unless we solve this, nothing else will work, workarounds or not. Have you set HADOOP_JAR, for instance with
Sys.setenv(HADOOP_JAR = "../spark/assembly/target/scala-2.10/spark-assembly-1.5.0-hadoop2.6.0.jar")? Sorry, I seem to have skipped this in the instructions. Will fix.
my_db = src_SparkSQL(host = 'jdbc:hive2://tools-1.hadoop.srv:10000/loghost;auth=noSasl',
+ port = 10000)
Error in .jfindClass(as.character(driverClass)[1]) : class not found
Same problem. Please note that the host and port arguments do not accept URL syntax, just a host name and a port number; the URL is formed internally.
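In other words, a sketch using the values from the question (with the class-not-found problem fixed first via HADOOP_JAR):
Sys.setenv(HADOOP_JAR = "/opt/spark-1.5.0-bin-hadoop2.4/lib/spark-assembly-1.5.0-hadoop2.4.0.jar")
my_db <- src_SparkSQL(host = "tools-1.hadoop.srv", port = 10000)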
my_db = src_SparkSQL(start.server = TRUE)
Error in start.server() :
Couldn't start thrift server:org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 running as process 37580. Stop it first.
In addition: Warning message:
running command 'cd /opt/tech/prj_bdc/pmozie_status/user_topics;/opt/spark-1.5.0-bin-hadoop2.4/sbin/start-thriftserver.sh ' had status 1
Stop the thriftserver first, or connect to the existing one; either way, you still have to fix the class-not-found problem.
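Spark's sbin directory ships a companion stop script; a sketch, with the path taken from the warning message above:
system("/opt/spark-1.5.0-bin-hadoop2.4/sbin/stop-thriftserver.sh")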
my_db = src_SparkSQL(start.server = TRUE,
+ list(spark.num.executors='5', spark.executor.cores='5', master="yarn-client"))
Error in start.server() :
Couldn't start thrift server:org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 running as process 37580. Stop it first.
In addition: Warning message:
running command 'cd /opt/tech/prj_bdc/pmozie_status/user_topics;/opt/spark-1.5.0-bin-hadoop2.4/sbin/start-thriftserver.sh ' had status 1
Same as above.
Plan:
1. Set HADOOP_JAR. Find the host and port of the running thriftserver, if not the defaults. Try src_SparkSQL with start.server = FALSE. If happy, quit; else go to step 2.
2. Stop the existing thriftserver. Try src_SparkSQL again with start.server = TRUE.
Let me know how things go.
The problem was that I didn't specify the proper classPath needed inside the JDBC() function that creates the driver. Parameters for classPath in the dplyr.spark.hive package are passed via the HADOOP_JAR environment variable.
To use JDBC as a driver to hiveServer2 (through the Thrift protocol), one needs to add at least these three .jars with Java classes to create a proper driver:
hive-jdbc-1.0.0-standalone.jar
hadoop/common/lib/commons-configuration-1.6.jar
hadoop/common/hadoop-common-2.4.1.jar
The versions are arbitrary and should be compatible with the locally installed versions of Hive, Hadoop and hiveServer2.
They need to be joined with .Platform$path.sep (as described here):
classPath = c("system_path1_to_hive/hive/lib/hive-jdbc-1.0.0-standalone.jar",
              "system_path1_to_hadoop/hadoop/common/lib/commons-configuration-1.6.jar",
              "system_path1_to_hadoop/hadoop/common/hadoop-common-2.4.1.jar")
Sys.setenv(HADOOP_JAR = paste0(classPath, collapse = .Platform$path.sep))
Then, when HADOOP_JAR is set, one has to be careful with the hiveServer2 url. In my case it had to be
host = 'tools-1.hadoop.srv'
port = 10000
url = paste0("jdbc:hive2://", host, ":", port, "/loghost;auth=noSasl")
and finally the proper connection to hiveServer2 using the RJDBC package is
Sys.setenv(HADOOP_HOME="/usr/share/hadoop/share/hadoop/common/")
Sys.setenv(HIVE_HOME = '/opt/hive/lib/')
host = 'tools-1.hadoop.srv'
port = 10000
url = paste0("jdbc:hive2://", host, ":", port, "/loghost;auth=noSasl")
driverclass = "org.apache.hive.jdbc.HiveDriver"
library(RJDBC)
.jinit()
dr2 = JDBC(driverclass,
           classPath = c("/opt/hive/lib/hive-jdbc-1.0.0-standalone.jar",
                         # "/opt/hive/lib/commons-configuration-1.6.jar",
                         "/usr/share/hadoop/share/hadoop/common/lib/commons-configuration-1.6.jar",
                         "/usr/share/hadoop/share/hadoop/common/hadoop-common-2.4.1.jar"),
           identifier.quote = "`")
dbConnect(dr2, url, username = "mkosinski") -> cont
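A quick sanity check on the resulting connection (a sketch; the statement is arbitrary HiveQL):
DBI::dbGetQuery(cont, "show databases")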