sqoop query to get sql server data into cloudera manager - cloudera

The following Sqoop import is not working:
sqoop import --connect 'jdbc:sqlserver://IP address;username=user;password=pswd;database=Master' --table [Person].[BusinessEntityContact] --target-dir /home/ubuntu/hdfs/dir
Reference: http://mapredit.blogspot.com/2011/10/sqoop-and-microsoft-sql-server.html
Error screenshot: http://i.stack.imgur.com/W5mBB.png

Your error log shows a SQLServerException that says "Connect timed out. Verify the connection properties.". Please check that the machine where you run this command can reach the SQL Server host on the MSSQL port, 1433. Then add the number of map tasks ("-m") to your command.

Nirmale,
can you try this from your Unix box:
curl http://131.107.174.121:1433
If you get "Empty reply from server", the port is reachable. If you get an error like "couldn't connect to host", check with your SQL Server admin which port SQL Server is listening on.
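The same reachability check can be sketched in Python with only the standard library. This is a minimal sketch, not part of the original answer: the function name `port_is_open` and the default timeout are assumptions chosen for illustration.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the TCP handshake; closing it right away
        # is enough to show that something is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

Calling `port_is_open("131.107.174.121", 1433)` from the machine that runs Sqoop would play the same role as the curl test: a False result points to a network, firewall, or port problem rather than a Sqoop problem.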

The best way to check is with the sqoop list-tables command, something like the following:
sqoop list-tables --connect 'jdbc:sqlserver://IP address;database=Master' --username user --password pswd

Related

[DataDirect][ODBC lib] Driver Manager Message file not found. Please check for the value of InstallDir in your odbc.ini in Informatica

I am using Informatica and have a SingleStore DB that I am trying to connect to.
I am able to log in to the SingleStore DB using the SingleStore ODBC driver, as shown below.
SingleStore version: 8.0.5
SingleStore ODBC driver version: 1.1.1
SingleStore is self-managed.
[abc@rnd-2 ~]$ isql SingleStore-server
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> ^C
When I try to connect Informatica to SingleStore using an ODBC connection, I get this error:
Message Code: WRT_8001
Message: Error connecting to database...
WRT_8001 [Session s_test Username dev DB Error -1
[DataDirect][ODBC lib] Driver Manager Message file not found. Please check for the value of InstallDir in your odbc.ini.
Database driver error...
Function Name : Connect
Database driver error...
Function Name : Connect
Database Error: Failed to connect to database using user [dev] and connection string [SingleStore-server].]
My location of odbc.ini file: /etc/odbc.ini
odbc.ini
[SingleStore_server]
Description=SingleStore server
Driver=/home/abc/singlestore-connector-odbc-1.1.1-centos7-amd64/libssodbca.so
SERVER=<>
USER=<>
PASSWORD=<>
DATABASE=<>
PORT=<>
I added path in .bash_profile, but still getting same error:
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
export ODBCINI=/etc/odbc.ini
Please let me know how to resolve this error.
Ref link: https://knowledge.informatica.com/s/article/577839?language=en_US
https://knowledge.informatica.com/s/article/Error-connecting-to-database-DataDirect-ODBC-lib-Driver-Manager-Message-file-not-found-Please-check-for-the-value-of-InstallDir-in-your-odbc-ini-while?language=en_US
https://docs.singlestore.com/managed-service/en/developer-resources/connect-with-application-development-tools/connect-with-odbc/the-singlestore-odbc-driver.html
Regarding export ODBCINI=/etc/odbc.ini: I have seen that Informatica always uses its own ODBC drivers. Can you please check whether SingleStore drivers are available in the /<INFA_HOME>/ODBCX.version/odbc.ini file? If yes, I highly recommend using it.
If yes, please also test the ODBC driver against your DB with the Informatica-provided tool: $INFA_HOME/tools/debugtools/ssgodbc -d dsn -u username -p password [-v]. This confirms that the ODBC setup itself has no issues.
You can find more about this in the linked article.
If no, please make sure you have installed the correct version of the SingleStore ODBC driver (32- or 64-bit) and that the Informatica user has RWX permission on it. Then:
Add the driver path to LD_LIBRARY_PATH: LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:$HOME/server_dir:$ODBCHOME/lib
Set ODBCINI=/etc/odbc.ini
Grant access: chmod 777 /etc/odbc.ini
See whether the ssgodbc tool is able to establish a connection.
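Before running ssgodbc, the basics of the checklist above can be verified with a short script. The following is a rough sketch under stated assumptions: `check_dsn` is a hypothetical helper, not an Informatica or SingleStore tool, and it only checks that the DSN section exists and that its Driver library is readable by the current user.

```python
import configparser
import os


def check_dsn(ini_text, dsn):
    """Return a list of problems found for `dsn` in odbc.ini-style text."""
    # strict=False tolerates duplicate keys; interpolation=None keeps '%' literal.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section(dsn):
        return [f"DSN [{dsn}] not found; defined sections: {parser.sections()}"]
    problems = []
    # Option names are matched case-insensitively by configparser.
    driver = parser.get(dsn, "Driver", fallback=None)
    if driver is None:
        problems.append("no Driver entry")
    elif not os.access(driver, os.R_OK):
        problems.append(f"driver library not readable: {driver}")
    return problems
```

Run it as the Informatica user with the contents of /etc/odbc.ini and the DSN name used in the connection string. Section names are compared exactly, so a check like this would flag a difference such as `SingleStore_server` in odbc.ini versus `SingleStore-server` in the connection string, as the listings in the question appear to show.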
Please see the following examples of integrating SingleStore data with Informatica:
JDBC - https://www.cdata.com/kb/tech/singlestore-jdbc-informatica-cloud.rst
ODBC - https://www.cdata.com/kb/tech/singlestore-odbc-informatica.rst

Running Apache Airflow with Azure SQL server as backend DB

I'm trying to run Airflow with an Azure SQL database as the backend, using an mssql+pyodbc connection string (all relevant drivers have been installed).
Airflow is able to connect to the DB and create tables, i.e. airflow initdb runs successfully, but I'm facing issues when running airflow scheduler. As a result, triggered tasks always stay in the "running" state.
This is the error I get while running airflow scheduler:
sqlalchemy.exc.ProgrammingError: (pyodbc.ProgrammingError) ('42000', "[42000] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Incorrect syntax near '1'. (102) (SQLExecDirectW)")
[SQL: SELECT dag.dag_id AS dag_dag_id
FROM dag
WHERE dag.is_paused IS 1 AND dag.dag_id IN (?)]
[parameters: ('example_http_operator',)]
(Background on this error at: http://sqlalche.me/e/13/f405)
I'm using apache-airflow==1.10.11.
If you were able to run airflow + azure SQL DB with any configuration please feel free to jump in.
I found a document that discusses the configuration for running Airflow with Azure SQL DB. Maybe it's helpful for you.
Ref: Setting up Airflow on Azure & connecting to MS SQL Server
This post also gives some configuration details: Apache Airflow - Connection issue to MS SQL Server using pymssql + SQLAlchemy
For MSSQL as the backend DB, there is a workaround in Airflow#10713. I am using apache-airflow==1.10.15 and solved the same error as yours.
The suggested command is below, but I made the changes with vi instead of running the sed commands.
RUN sed -i 's/import copy/import copy,sqlalchemy/g' /usr/local/lib/python3.6/site-packages/airflow/models/dag.py \
    && sed -i 's/DagModel.is_paused.is_(True)/DagModel.is_paused == sqlalchemy.sql.expression.true()/g' /usr/local/lib/python3.6/site-packages/airflow/models/dag.py
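For anyone who prefers to script the edit instead of using sed or vi, the two substitutions can be sketched in Python. This is only an illustration of the same text replacements: `patch_dag_source` is a hypothetical helper name, and it works on the source text of dag.py as a string rather than touching the file.

```python
def patch_dag_source(source):
    """Apply the two substitutions from the sed commands to dag.py source text."""
    # First sed: make the sqlalchemy module importable inside dag.py.
    source = source.replace("import copy", "import copy,sqlalchemy")
    # Second sed: compare with '==' so SQL Server gets '= 1' instead of 'IS 1'.
    source = source.replace(
        "DagModel.is_paused.is_(True)",
        "DagModel.is_paused == sqlalchemy.sql.expression.true()",
    )
    return source
```

Like the sed version, this is not idempotent: applying it twice would produce `import copy,sqlalchemy,sqlalchemy`, so it should be run once against an unmodified file.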

SSH within R script to get to MySQL Database

I am trying to connect to a MySQL server that can only be reached from one particular server. I want to connect through this restricting server without being physically connected to it.
From the command line this is doable by creating an SSH connection, after which I can run MySQL commands. For example:
ssh myUsername@Hostname
myUsername@Hostname's password:
[myUsername@Host ~]$ mysql -h mySQLHost -u mySQLUsername -p mySQLPassword
However, I wish to connect to the MySQL database from within R, so that I can send queries and read tables into my current R session. Usually I would run an R session from the command line, but the server does not have R installed.
For example, I have this snippet of code that works when I am physically connected to the server (filled-in information changed):
myDB <- dbConnect(MySQL(), user="mySQLUsername", password="mySQLPassword", dbname="myDbname", host="mySQLHost")
In essence, I want to run this same command through a pipe, so that the myDB object is a working mySQL connection.
I have been trying to pipe my way into the restricting server from within R, and have been able to read in a csv file. For example:
dat <- read.table(pipe('ssh myUsername@Hostname "cat /path/to/your/file"'))
This prompts me for my password, and the table is read (as suggested here). However, I am unsure how to translate this to a MySQL connection. For example, should I make the pipe part of the host argument? That was my first thought, but I have been unable to make that work.
Any help would be appreciated.
I accomplish a similar task with Postgres using SSH tunneling. Effectively, what you're doing with an SSH tunnel is saying "establish a connection to the remote server, and make a port from that server available as a port on my local machine."
You can set up an SSH tunnel using the following command on your local machine:
ssh -L local_port:localhost:remote_port username@remote_host
Specifically, what you're doing with this command is creating a Local Port Forwarding SSH tunnel, which is taking the port you'd connect to directly on the machine with your database installed (remote_port), and securely sending it to the machine you have R installed on as local_port.
For example, for a database server with the following options:
hostname: 192.168.1.3
username: mysql
server mysql port: 3306
You could use the following command (at the command line, or in R using system2) to create a tunnel to port 9000 on your machine:
ssh -L 9000:localhost:3306 mysql@192.168.1.3
Depending on what your exact DBI connection looks like in R, you may have to edit the connection configuration slightly to make it connect to your newly created tunneling port. The reason why I use a different localhost port is that it prevents conflicts with a local version of the database, if you've got one.
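The tunnel command for the example above can be assembled programmatically. The following is a minimal sketch in Python: the helper name `tunnel_command` is an assumption for illustration, and it only builds the argument list for the same `ssh -L` invocation shown in the answer.

```python
def tunnel_command(local_port, remote_port, user, host):
    """Build the argument list for a local port forwarding SSH tunnel."""
    for port in (local_port, remote_port):
        if not 0 < int(port) < 65536:
            raise ValueError(f"invalid port: {port}")
    # local_port on this machine is forwarded to remote_port on the SSH host.
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-L", forward, f"{user}@{host}"]
```

The resulting list can be passed to `subprocess.Popen` (or the equivalent string to `system2` in R) to open the tunnel. The database client then connects to port 9000 on the local machine instead of the remote host. With MySQL clients it is usually safer to give the host as 127.0.0.1 rather than localhost, since localhost is often treated as a request for the local Unix socket.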

Issue in oozie installation: Error: IO_ERROR : java.net.UnknownHostException: master

I tried to install Oozie on my PC, and it looks like it installed successfully:
oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL
But when running an oozie job it is showing below error
Error: IO_ERROR : java.net.UnknownHostException: master
Can you please suggest what could be the reason?
The reason for this failure can be examined further in oozie.log, which is usually located in the /var/log/oozie folder on a CDH distribution. Alternatively, the log location can be found with this command:
ps -ef | grep oozie
and look for "-Doozie.log.dir=..."
And the oozie host needs to be reachable from the host where the command line is invoked. Try telnet to oozie port to make sure connection is good. An example session is like this:
[cloudera@localhost hadoop-yarn]$ telnet master 11000
Trying 127.0.0.1...
Connected to master.
Escape character is '^]'.
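A java.net.UnknownHostException generally means the host name could not be resolved, so it can help to check name resolution separately from the port test. Here is a small sketch in Python using only the standard library. The function name `resolve` is an assumption, and the result only reflects the resolver configuration (for example /etc/hosts and DNS) of the machine where it runs.

```python
import socket


def resolve(hostname):
    """Return the IPv4 address hostname resolves to, or None if it is unknown."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised for unknown hosts.
        return None
```

If `resolve("master")` returns None on the machine that submits the job, the name needs to be added to /etc/hosts or DNS, or the host names in the job configuration need to be changed to something that resolves.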

How to verify if SFTP access has been granted on a server

How can we verify that SFTP access has been granted on a server, without installing any software/tools?
Most servers have curl and scp installed, which you can use to log into an SFTP server. To test if your credentials work using curl, you could do this:
$ curl -u username sftp://example.org/
Enter host password for user 'username':
Enter your password and if it works you'll get a listing of files (like ls -al), if it doesn't work you'll get an error like this:
curl: (67) Authentication failure
You could also try using scp:
$ scp username@example.org:testing .
Password:
scp: testing: No such file or directory
This verifies that you were able to log in, but that it couldn't find the testing file. If you weren't able to log in, you'd get a message like this:
Permission denied, please try again.
Received disconnect from example.org: 2: ...error message...
One of the many ways to check for SFTP access using password-based authentication:
sftp username@serverName
or
sftp username@serverIP
Then enter the password.
If it fails you will get a "Permission denied, please try again." message; otherwise you will be let into the server and see the prompt:
sftp>
You can test it fully works with commands like ls, mkdir etc.
Try logging in.
Not being snarky -- that really is probably the simplest way. By "verify[ing] that SFTP access has been granted," what you're really doing is checking whether a particular login/password pair is recognized by the server.
Alternatively, other than doing the "sftp -v" command mentioned above, you can always cat the SSH/SFTP logs stored on any server running sshd and direct them to a file for viewing.
A command set like the following would work, where 1.1.1 would be the /24 of the block you are trying to search.
cd /var/log/
cat secure.4 secure.3 secure.2 secure.1 secure |grep sshd| grep -v 1.1.1> /tmp/secure.sshd.txt
gzip -9 /tmp/secure.sshd.txt
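The grep pipeline above can also be expressed as a small filter. This is a sketch in Python with a hypothetical helper name, `sshd_lines`. Unlike `grep -v 1.1.1`, where each dot matches any character, it treats the excluded prefix as literal text.

```python
def sshd_lines(lines, exclude_prefix):
    """Keep lines mentioning sshd that do not contain the literal exclude_prefix."""
    return [
        line
        for line in lines
        if "sshd" in line and exclude_prefix not in line
    ]
```

To apply it, read the lines of the secure log files in /var/log and pass them in together with the address prefix to exclude. Including a trailing dot in the prefix (for example "1.1.1.") avoids accidentally excluding addresses that merely contain the same digits.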
G'day,
What about telnet on to port 115 (if we're talking Simple FTP) and see what happens when you connect. If you don't get refused try sending a USER command, then a PASS command, and then a QUIT command.
HTH
cheers,
In SFTP, authentication can be of the following types:
1. Password-based authentication
2. Key-based authentication
If you are going for key-based authentication, you have to prepare the setup accordingly and then proceed with the login procedure. If key-based authentication fails, it automatically asks for a password, meaning it switches to password-based mode. By the way, if you want to verify, you can use this on Linux:
ssh -v user@IP
It will show you all the debug messages. If authentication passes you will be logged in; otherwise you will get "Permission denied". Hope this helps.
