Airflow - MSSQL connection works on UI but fails in DAG run

I'm on Airflow 2.2.4 (Postgres and Celery).
I have a connection created for Microsoft SQL Server. When I test the connection, the UI asks me for the API auth user/password (Basic Auth) and then shows a green flash: "Connection successfully tested".
However, when I use the same connection ID in an operator to run some SQL queries, I get the error below.
pymssql._mssql.MSSQLDatabaseException: (20009, b'DB-Lib error message 20009, severity 9:\nUnable to connect: Adaptive Server is unavailable or does not exist (<IP>)\nNet-Lib error during Connection timed out (110)\n
DB-Lib error message 20009, severity 9:\nUnable to connect: Adaptive Server is unavailable or does not exist (<IP>)\n
Net-Lib error during Connection timed out (110)\n')
Here is the relevant code from my custom operator:
def get_update_num(self):
    conn = None
    try:
        query = """
            UPDATE SOME_TABLE
            SET COL_A = COL_A+1
            OUTPUT INSERTED.COL_A
            WHERE COL_ID='12345'
        """
        self.log.info(f"SQL = {query}")
        conn_id = self.conn_id  # This is equal to MSSQL_CONNECTION
        hook = MsSqlHook(mssql_conn_id=conn_id)
        conn = hook.get_conn()
        hook.set_autocommit(conn, True)
        cursor = conn.cursor()
        cursor.execute(query)
        row = cursor.fetchone()
        self.log.info(f"row = {row}")
        return row[0]
    except Exception as e:
        message = "Error: Could not run SQL"
        # Chain the original exception so the real cause stays in the log
        raise AirflowException(message) from e
    finally:
        # Close only if a connection was actually opened
        if conn:
            conn.close()
Any help would be much appreciated.
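One thing worth ruling out, offered as a hedged sketch rather than a confirmed diagnosis: error 110 is a TCP-level timeout, and the DAG task runs on a Celery worker, which may sit on a different host or network than the process that handled the UI test. The helper below only checks whether a TCP connection can be opened from wherever it is executed. The name can_reach and its arguments are illustrative assumptions, not part of the original operator.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling it from inside a task (for example via a PythonOperator) with the host from the connection and the SQL Server port, 1433 by default, would show whether the worker itself can reach the server.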

Related

Airflow Error callback "on_failure_callback" is not executing all the lines in the function

I have a problem using on_failure_callback. My error callback function performs two HTTP POST requests, with a logging.error() message between them. I notice that only the first one is executed. Is there a delay or something else that I am missing here?
Please help.
def custom_failure_function(context):
    logging.error("These task instances ahhh")
    to_json = json.loads(t_teams)
    var1 = json.dumps(to_json)
    print(var1)
    r = requests.post('https://myteamschannel/teams', data=var1, verify=False)
    logging.error("hello")
    runID = 'OPERATION_CONTEXT .OCV8.TEST2 alarm_object 193'
    headers = {'Content-Type': 'text/xml'}
    alarmRequest='<soapenv:Envelope xmlns:soapenv=\"http://schemas.xmlsoap.org/soap/envelope/\" xmlns:oper=\"http://172.19.146.147:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object\"><soapenv:Header xmlns:wsa=\"http://www.w3.org/2005/08/addressing\"><wsu:Timestamp xmlns:wsu=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\"><wsu:Created>2014-05-22T11:57:38.267Z</wsu:Created><wsu:Expires>2014-05-22T12:02:38.000Z</wsu:Expires></wsu:Timestamp><wsse:Security xmlns:wsse=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd\" soapenv:mustUnderstand=\"1\"><wsse:UsernameToken xmlns:wsu=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-utility-1.0.xsd\"><wsse:Username>girws</wsse:Username><wsse:Password Type=\"http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-username-token-profile-1.0#PasswordText\">Temip</wsse:Password></wsse:UsernameToken></wsse:Security></soapenv:Header> <soapenv:Body> <oper:Set_Request xmlns:oper=\"http://172.19.146.147:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object\"><EntitySpec><Natural> '+ runID + '</Natural></EntitySpec><Arguments> <Attribute_Values><Filtering_Type>' + 'AUTOFAIL' + '</Filtering_Type></Attribute_Values></Arguments></oper:Set_Request> </soapenv:Body> </soapenv:Envelope>'
    r = requests.post('http://myerrorappli:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object', header=headers, data=alarmRequest, verify=True)
    logging.error("FAILED TASK")
    logging.error("============================================")
My Airflow logs are below. They stop at the "hello" message and never print "FAILED TASK".
*** Reading local file: /data/airflow//logs/MOE_TEST_DAG/TeamsTest/2021-10-02T08:24:14.821970+00:00/3.log
[2021-10-02 10:24:36,535] {MOE_TEST.py:132} ERROR - These task instances ahhh
[2021-10-02 10:24:36,987] {MOE_TEST.py:138} ERROR - hello
From your description, the problem is most likely in the second requests.post() call. Try adding a timeout to the request and catching the exceptions it can raise. Note also that requests expects the keyword headers=, not header=; passing header= raises a TypeError, which would stop the callback at that line:
def custom_failure_function(context):
    ...
    try:
        # requests takes headers= (plural); header= raises TypeError
        r = requests.post('http://myerrorappli:7180/TeMIP_WS/services/OPERATION_CONTEXT-alarm_object', headers=headers,
                          data=alarmRequest, verify=True, timeout=5)
    except requests.Timeout:
        logging.error("request timeout")
    except requests.ConnectionError:
        logging.error("request connection error")
    logging.error("FAILED TASK")
    logging.error("============================================")
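A related pattern, sketched here under assumptions: when a callback performs several independent notifications, each one can be wrapped so that a failure in one (a timeout, a bad keyword argument, anything else) is logged and the remaining ones still run. The helper name run_steps and the (name, callable) structure are hypothetical and not part of Airflow or requests.

```python
import logging


def run_steps(steps):
    """Run each (name, callable) pair in order.

    A failing step is logged with its traceback and recorded,
    but it does not prevent the following steps from running.
    """
    failed = []
    for name, step in steps:
        try:
            step()
        except Exception:
            # logging.exception includes the traceback of the active exception
            logging.exception("callback step %r failed", name)
            failed.append(name)
    return failed
```

Inside a failure callback, each POST request would become one step (for example a small function or lambda that performs the request), so the final log lines are reached even if the second request raises.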

Connection reset by peer when trying to connect to Postgres via SSL using R

I'm trying to connect to a database via SSL using the code suggested here: https://github.com/ropensci/ssh/issues/13
The dummy code below shows how I open the connection and try to query some data. The solution works great for 'smaller' queries.
However, when I try to fetch 'larger' data, the query fails and R returns the following error:
System failure for: recv() from user (Connection reset by peer), followed by a fetching error, Failed to fetch row: SSL error: decryption failed or bad record mac (see the output in the code snippet).
According to the first error message, I suppose the error occurs server-side ('reset by peer' --> What does "connection reset by peer" mean?).
Is that true, or is there a way to fix this error on the client side (in R)?
ssh::ssh_read_key(file = ssh::ssh_home("id_rsa"), password = "rsa_password")
cmd <- "session <- ssh::ssh_connect('user@host:port');ssh::ssh_tunnel(session, port = 5432, target = '127.0.0.1:5432')"
pid <- sys::r_background(std_out = T, args = c("-e", cmd))
dbcon <- DBI::dbConnect(drv = RPostgres::Postgres(),
                        dbname = "db_name",
                        host = "127.0.0.1",
                        port = 5432,
                        user = "db_user",
                        password = "db_password",
                        base::list(sslmode="require"),
                        service = NULL)
# example of working query
res <- DBI::dbGetQuery(conn = dbcon, statement = "SELECT * FROM small_table") #
# example of non-working query (see R output)
res <- DBI::dbGetQuery(conn = dbcon, statement = "SELECT * FROM large_table;") #
## R output
# Tunneled 31897311 bytes...Error: System failure for: recv() from user (Connection reset by peer)
# Execution halted
# Error: Failed to fetch row: SSL error: decryption failed or bad record mac
# Warning message:
# Disconnecting from unused ssh session. Please use ssh_disconnect()
"Connection reset by peer" means that the remote end you were connected to responded with a TCP packet that has the RST flag set, i.e. the other side reset (abruptly closed) the connection.

How to write unittest cases for checking database connection with SQL server in Python?

I have just started using unittest in Python to write test cases. I have a function that makes the connection to SQL Server.
sql_connection.py
def getConnection():
    connection = pyodbc.connect("Driver={ODBC Driver 13 for SQL Server};"
                                "Server="+appConfig['sql_server']['server']+";"
                                "Database="+appConfig['sql_server']['database']+";"
                                "UID="+appConfig['sql_server']['uid']+";"
                                "PWD="+appConfig['sql_server']['password']+";"
                                "Trusted_Connection=no;",
                                )
    return connection
I have tried the test case below to check whether the database connects or not.
test_connection.py
import pyodbc
getConnection1=getConnection()
class TestDatabseConnection(unittest.TestCase):
    def test_getConnection(self):
        try:
            db_connection = getConnection1.connection
        except pyodbc.Error as ex:
            sqlstate = ex.args[1]
            print(sqlstate)
            self.fail(
                "getConnection() raised pyodbc.OperationalError. " +
                "Connection to database failed. Detailed error message: " + sqlstate)
        self.assertIsNone(db_connection)
But I am still not able to get it to succeed.
======================================================================
ERROR: test_getConnection (__main__.TestDatabseConnection)
----------------------------------------------------------------------
Traceback (most recent call last):
File "test_connection.py", line 23, in test_getConnection
db_connection = getConnection1.connection
AttributeError: 'pyodbc.Connection' object has no attribute 'connection'
----------------------------------------------------------------------
Ran 1 test in 0.001s
FAILED (errors=1)
Please help me out with this.
A unit test for your getConnection could look like the one below. I would suggest using patch and Mock from unittest.mock. In a unit test you are only interested in testing the functionality of getConnection itself, and you should mock all other function calls within it. If you want to exercise pyodbc.connect for real, I would suggest a functional test that actually connects to the database, which would no longer be a unit test. For more information on patch and Mock, check out the unittest.mock docs. These are very powerful and make unit testing fun and easy!
import unittest
from unittest.mock import patch, Mock
import pyodbc

def getConnection():
    appConfig = {'sql_server': {'server':'', 'database':'', 'uid':'', 'password':''}}
    connection = pyodbc.connect("Driver={ODBC Driver 13 for SQL Server};"
                                "Server="+appConfig['sql_server']['server']+";"
                                "Database="+appConfig['sql_server']['database']+";"
                                "UID="+appConfig['sql_server']['uid']+";"
                                "PWD="+appConfig['sql_server']['password']+";"
                                "Trusted_Connection=no;",
                                )
    return connection

# The patch target assumes this file is saved as pyodbc_example.py
@patch('pyodbc_example.pyodbc')
class TestDatabseConnection(unittest.TestCase):
    def test_getConnection(self, pyodbc_mock):
        # The class-level patch passes the mocked pyodbc module to each test method
        pyodbc_mock.connect.return_value = Mock()
        connection = getConnection()
        self.assertEqual(connection, pyodbc_mock.connect.return_value)
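The same idea can be tried without pyodbc installed. The following is a minimal, self-contained sketch in which a hypothetical stand-in object named driver plays the role of the pyodbc module; driver, CONFIG, get_connection and the config values are all assumptions made for illustration. The stand-in connect function always fails, so a passing test shows that the mock, and not a real connection, was used.

```python
import unittest
from types import SimpleNamespace
from unittest.mock import Mock, patch


def _real_connect(connection_string):
    # Stand-in for a real driver call; it always fails in this sketch
    raise RuntimeError("no database available in this sketch")


# Hypothetical stand-in for the driver module (pyodbc in the question)
driver = SimpleNamespace(connect=_real_connect)

CONFIG = {"server": "db.example", "database": "sales", "uid": "app", "password": "secret"}


def get_connection(config):
    """Build the connection string and hand it to the driver."""
    connection_string = (
        "Driver={ODBC Driver 13 for SQL Server};"
        f"Server={config['server']};"
        f"Database={config['database']};"
        f"UID={config['uid']};"
        f"PWD={config['password']};"
        "Trusted_Connection=no;"
    )
    return driver.connect(connection_string)


class TestGetConnection(unittest.TestCase):
    def test_returns_what_the_driver_returns(self):
        # Replace driver.connect only for the duration of the with-block
        with patch.object(driver, "connect", return_value=Mock()) as connect_mock:
            connection = get_connection(CONFIG)
        self.assertIs(connection, connect_mock.return_value)
        connect_mock.assert_called_once()
        # The first positional argument is the connection string that was built
        self.assertIn("Server=db.example;", connect_mock.call_args[0][0])
```

Saved to a file, the test can be run with python -m unittest; it checks both the return value and the connection string that get_connection builds.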

Resource 7bed8adc-9ed9-49dc-b15e-6660e2fc3285 transitioned to failure state ERROR when use openstacksdk to create_server

When I create the OpenStack server, I get the exception below:
Resource 7bed8adc-9ed9-49dc-b15e-6660e2fc3285 transitioned to failure state ERROR
My code is below:
server_args = {
    "name": server_name,
    "image_id": image_id,
    "flavor_id": flavor_id,
    "networks": [{"uuid": network.id}],
    "admin_password": admin_password,
}
try:
    server = user_conn.conn.compute.create_server(**server_args)
    server = user_conn.conn.compute.wait_for_server(server)
except Exception as e:  # this is where I catch the exception
    raise e
When create_server is called, my server_args data is as follows:
{'flavor_id': 'd4424892-4165-494e-bedc-71dc97a73202', 'networks': [{'uuid': 'da4e3433-2b21-42bb-befa-6e1e26808a99'}], 'admin_password': '123456', 'name': '133456', 'image_id': '60f4005e-5daf-4aef-a018-4c6b2ff06b40'}
My openstacksdk version is 0.9.18.
In the end, I found that the flavor was too big for the OpenStack compute node, so I switched to a smaller flavor and the server was created successfully.

SQLITE_ERROR: Connection is closed when connecting from Spark via JDBC to SQLite database

I am using Apache Spark 1.5.1 and trying to connect to a local SQLite database named clinton.db. Creating a data frame from a table of the database works fine but when I do some operations on the created object, I get the error below which says "SQL error or missing database (Connection is closed)". Funny thing is that I get the result of the operation nevertheless. Any idea what I can do to solve the problem, i.e., avoid the error?
Start command for spark-shell:
../spark/bin/spark-shell --master local[8] --jars ../libraries/sqlite-jdbc-3.8.11.1.jar --classpath ../libraries/sqlite-jdbc-3.8.11.1.jar
Reading from the database:
val emails = sqlContext.read.format("jdbc").options(Map("url" -> "jdbc:sqlite:../data/clinton.sqlite", "dbtable" -> "Emails")).load()
Simple count (fails):
emails.count
Error:
15/09/30 09:06:39 WARN JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
at org.sqlite.core.DB.newSQLException(DB.java:890)
at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1$$anonfun$8.apply(JDBCRDD.scala:358)
at org.apache.spark.TaskContextImpl$$anon$1.onTaskCompletion(TaskContextImpl.scala:60)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:79)
at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:77)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:77)
at org.apache.spark.scheduler.Task.run(Task.scala:90)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
res1: Long = 7945
I got the same error today, and the important line is just before the exception:
15/11/30 12:13:02 INFO jdbc.JDBCRDD: closed connection
15/11/30 12:13:02 WARN jdbc.JDBCRDD: Exception closing statement
java.sql.SQLException: [SQLITE_ERROR] SQL error or missing database (Connection is closed)
at org.sqlite.core.DB.newSQLException(DB.java:890)
at org.sqlite.core.CoreStatement.internalClose(CoreStatement.java:109)
at org.sqlite.jdbc3.JDBC3Statement.close(JDBC3Statement.java:35)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.org$apache$spark$sql$execution$datasources$jdbc$JDBCRDD$$anon$$close(JDBCRDD.scala:454)
So Spark succeeded in closing the JDBC connection, and then failed to close the JDBC statement.
Looking at the source, close() is called twice:
Line 358 (org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD, Spark 1.5.1)
context.addTaskCompletionListener{ context => close() }
Line 469
override def hasNext: Boolean = {
  if (!finished) {
    if (!gotNext) {
      nextValue = getNext()
      if (finished) {
        close()
      }
      gotNext = true
    }
  }
  !finished
}
If you look at the close() method (line 443)
def close() {
  if (closed) return
you can see that it checks the variable closed, but that value is never set to true.
If I see it correctly, this bug is still in the master. I have filed a bug report.
Source: JDBCRDD.scala (line numbers differ slightly)
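To illustrate why the missing assignment matters, here is a toy sketch of a guarded close() in Python. It is not Spark's code or its eventual fix; the class TrackedResource and its release_count counter are invented for the illustration. The guard only works because the flag is set to true inside close(), so a second call returns early instead of touching already released resources.

```python
class TrackedResource:
    """Toy resource whose close() is safe to call more than once."""

    def __init__(self):
        self.closed = False
        self.release_count = 0

    def close(self):
        if self.closed:
            return
        # Set the flag so that a repeated call returns early above
        self.closed = True
        # Stands in for closing the statement and the connection
        self.release_count += 1
```

With the flag assignment removed, every call would run the release logic again, which mirrors the second close() on an already closed connection described above.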

Resources