Spark SQL ODBC Connection not connected - odbc

I have built the Spark source using the following command:
mvn -Pyarn -Phadoop-2.5 -Dhadoop.version=2.5.2 -Phive -Phive-1.1.0 -Phive-thriftserver -DskipTests clean package
I started the Thrift server using the following command:
spark-submit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 --master local[*] file:///c:/spark-1.3.1/sql/hive-thriftserver/target/spark-hive-thriftserver_2.10-1.3.1.jar
Connected to the Thrift server from Beeline using the following connection URL:
jdbc:hive2://localhost:10000
Created a table named people using the following queries:
Create table people(Name String);
Load data local inpath 'C:\spark-1.3.1\examples\src\main\resources\people.txt' overwrite into table people;
How do I read this table from a C# application using an ODBC connection or the Thrift library?
I have used the following code snippet to read the table, with C# classes generated by Thrift and the Thrift DLL:
Console.WriteLine("Thrift hive server for Spark SQL Connection....");
TSocket hiveSocket = new TSocket("localhost", 10000);
TBinaryProtocol protocol =new TBinaryProtocol(hiveSocket);
ThriftHive.Client client = new ThriftHive.Client(protocol);
if (!hiveSocket.IsOpen)
{
hiveSocket.Open();
}
Console.WriteLine("Thrift server connected");
client.execute("select * from people1");
But I cannot execute the query.

It is not throwing any error or exception because there probably was no error: the execute() call succeeded. You just need to retrieve the results explicitly with client.fetchAll().
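For example, here is a minimal sketch, assuming the client classes were generated from Hive's hive_service.thrift (where fetchAll() returns each result row as a string); the namespace of the generated ThriftHive classes may differ in your project:

using System;
using Thrift.Protocol;
using Thrift.Transport;

class ReadPeopleTable
{
    static void Main()
    {
        // Connect to the Spark SQL Thrift server started above
        TSocket socket = new TSocket("localhost", 10000);
        TBinaryProtocol protocol = new TBinaryProtocol(socket);
        ThriftHive.Client client = new ThriftHive.Client(protocol);
        socket.Open();

        client.execute("select * from people");
        // fetchAll() retrieves the result set; each row comes back as one string
        foreach (string row in client.fetchAll())
        {
            Console.WriteLine(row);
        }
        socket.Close();
    }
}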

Related

How do I load data from my SQLite DB into RStudio?

I created a database with SQL for a school project. Currently I'm stuck at importing this data into RStudio. I put my DB file in this directory: /Users/milanpatty/Documents/Business/Semester_2/R/Proftaak. When I tried to make a connection with this code:
db <- dbConnect(SQLite(), dbname = 'Festivate.db')
I got this error:
Warning message:
Couldn't set synchronous mode: file is not a database
Use `synchronous` = NULL to turn off this warning.
By the way, I use DB Browser for SQLite as an interface. Could anybody help me with this problem?
Edit: I have installed the RSQLite library in R.

database protocol 'sqlite' not supported - Failed to initialize zdb connection pool()

I am using libzdb (Database Connection Pool Library) with an SQLite database. I am getting the following exception:
Failed to start connection pool - database protocol 'sqlite' not supported
After ConnectionPool_start() the code goes into static int _fillPool(T P), where it fails at the following statement:
Connection_T con = Connection_new(P, &P->error);
My connection URL is as follows:
sqlite:///home/ZDB_TESTING/zdb-test/testDb.db
Kindly help me with this problem.
This means that SQLite support is not compiled into the libzdb library. If you are installing from a distribution, make sure you select a libzdb package built with SQLite. If you built libzdb yourself from source, make sure the output of ./configure says SQLite3: ENABLED. Otherwise you need to install SQLite on your system first.
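As a quick check after rebuilding, a minimal sketch along these lines should start the pool without the protocol error; the database path is the one from the question, and the include path may need adjusting to your install:

#include <stdio.h>
#include <zdb.h>   /* libzdb umbrella header; may be <zdb/zdb.h> depending on install */

int main(void)
{
    URL_T url = URL_new("sqlite:///home/ZDB_TESTING/zdb-test/testDb.db");
    ConnectionPool_T pool = ConnectionPool_new(url);
    /* Throws "database protocol 'sqlite' not supported" here if libzdb was built without SQLite */
    ConnectionPool_start(pool);
    Connection_T con = ConnectionPool_getConnection(pool);
    ResultSet_T rs = Connection_executeQuery(con, "select 1");
    while (ResultSet_next(rs))
        printf("%d\n", ResultSet_getInt(rs, 1));
    Connection_close(con);
    ConnectionPool_stop(pool);
    ConnectionPool_free(&pool);
    URL_free(&url);
    return 0;
}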

SparkR job(R script) submit using spark-submit fails in BigInsights Hadoop cluster

I have created an IBM BigInsights service with a Hadoop cluster of 5 nodes (including Apache Spark with SparkR). I am trying to use SparkR to connect to a Cloudant DB, fetch some data, and do some processing.
The SparkR job (R script) submitted using spark-submit fails in the BigInsights Hadoop cluster.
I created a SparkR script and ran the following command:
-bash-4.1$ spark-submit --master local[2] test_sparkr.R
16/08/07 17:43:40 WARN SparkConf: The configuration key 'spark.yarn.applicationMaster.waitTries' has been deprecated as of Spark 1.3 and and may be removed in the future. Please use the new key 'spark.yarn.am.waitTime' instead.
Error: could not find function "sparkR.init"
Execution halted
-bash-4.1$
The content of the test_sparkr.R file is:
# Creating SparkContext and connecting to Cloudant DB
sc <- sparkR.init(sparkEnv = list("cloudant.host"="<<cloudant-host-name>>", "cloudant.username"="<<cloudant-user-name>>", "cloudant.password"="<<cloudant-password>>", "jsonstore.rdd.schemaSampleSize"="-1"))
# Database to be connected to extract the data
database <- "testdata"
# Creating Spark SQL Context
sqlContext <- sparkRSQL.init(sc)
# Creating DataFrame for the "testdata" Cloudant DB
testDataDF <- read.df(sqlContext, database, header='true', source = "com.cloudant.spark",inferSchema='true')
How do I install the spark-cloudant connector in IBM BigInsights and resolve this issue? Help would be much appreciated.
I believe that the spark-cloudant connector isn’t for R yet.
Hopefully I can update this answer when it is!

Unable to Connect to Database using Custom Connection String with ODBC.jl

I'm having trouble connecting to a database with the ODBC.jl package. I can't tell if the problem is with my setup (more likely) or the package. The problem is that ODBC.jl can't seem to locate the correct ODBC driver.
> using ODBC
> ODBC.listdrivers()
/path/to/generic/odbc/
But I need to use a different driver than the one picked up from above.
I'm trying to use a custom connection string as follows:
>ODBC.DSN("DRIVER=path/to/driver/i/want;SERVER=myserver;USER=myuser;PASSWORD=mypass;DATABASE=somedb;")
which returns this:
[ODBC] IM002: [unixODBC][Driver Manager]Data source name not found, and no default driver specified
ERROR: ODBC.ODBCError("ODBC.API.SQLDriverConnect(dbc,window_handle,conn_string,out_conn.ptr,BUFLEN,out_buff,driver_prompt) failed; return code: -1 => SQL_ERROR ")
My understanding is that I should be able to specify the driver as done above, but this does not give the desired connection.
I have .odbc.ini and .odbcinst.ini files set up in my home directory, which I believe are working correctly. I'm on a SUSE Enterprise distro. When connecting via isql I have no problems.
Any help is appreciated.

Problems connecting remotely to PostgreSQL on Heroku from R using RPostgreSQL

I'm using the RPostgreSQL 0.4 library (compiled on R 2.15.3) on R 2.15.2 under Windows 7 64-bit to interface to PostgreSQL. This works fine when connecting to my PostgreSQL databases on localhost. I'm trying to get my R code to run with a remote PostgreSQL database on Heroku. I can connect to Heroku's PostgreSQL database from the psql command shell on my machine, and it connects without a problem. I get the message:
psql (9.2.3, server 9.1.9)
WARNING: psql version 9.2, server version 9.1.
Some psql features might not work.
WARNING: Console code page (437) differs from Windows code page (1252)
8-bit characters might not work correctly. See psql reference
page "Notes for Windows users" for details.
SSL connection (cipher: DHE-RSA-AES256-SHA, bits: 256)
Clearly, psql uses SSL to connect. When I try to connect using the RPostgreSQL library routine dbConnect(), however, supplying exactly the same credentials via dbname=, host=, port=, user=, and password=, the connection fails with the complaint:
Error in postgresqlNewConnection(drv, ...) :
RS-DBI driver: (could not connect <user>#<hostname> on dbname <dbname>)
Calls: source ... .valueClassTest -> is -> is -> postgresqlNewConnection -> .Call
Execution halted
I know that Heroku insists on an SSL connection if you want to access their database remotely, so it seems likely that the R interface routine dbConnect() isn't trying SSL. Is there something else that I can do to get a remote connection from R to PostgreSQL on Heroku to work?
To get the JDBC URL for your Heroku instance:
Get your hostname, username, and password using the heroku pg:credentials command.
Your jdbc URL is going to be:
jdbc:postgresql://[hostname]/[database]?user=[user]&password=[password]&ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory
Proceed as you would normally with JDBC.
Apparently there is a way using RJDBC. See:
http://ryepup.unwashedmeme.com/blog/2010/11/17/working-with-r-postgresql-ssl-and-mssql/
Please note that in order to connect to a Heroku database with JDBC externally, it is important to set the sslfactory parameter as well. Hopefully the Heroku team will update their documentation to reflect this.
String dbUri = "jdbc:postgresql://ec2-54-243-202-174.compute-1.amazonaws.com:5432/**xxxxxxx**";
Properties props = new Properties();
props.setProperty("user", "**xxxxx**");
props.setProperty("password", "**xxxxx**");
props.setProperty("ssl", "true");//ssl to be set true
props.setProperty("sslfactory", "org.postgresql.ssl.NonValidatingFactory");// sslfactory to be set as shown above
Connection c = DriverManager.getConnection(dbUri, props);
See the answer to a related question at https://stackoverflow.com/a/38942581. The suggestion there of using RPostgres (https://github.com/rstats-db/RPostgres) instead of RPostgreSQL resolved this same issue for me.
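For reference, here is a minimal sketch of connecting with RPostgres; the host, database name, and credentials below are placeholders for your own Heroku values, and sslmode = "require" is what forces the SSL connection that Heroku expects:

library(DBI)
# RPostgres passes libpq options straight through, so sslmode can be set directly
con <- dbConnect(
  RPostgres::Postgres(),
  dbname   = "mydb",                                     # placeholder
  host     = "ec2-xx-xx-xx-xx.compute-1.amazonaws.com",  # placeholder Heroku host
  port     = 5432,
  user     = "myuser",                                   # placeholder
  password = "mypassword",                               # placeholder
  sslmode  = "require"                                   # Heroku requires SSL for external connections
)
dbGetQuery(con, "SELECT version()")
dbDisconnect(con)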
