Best practices for managing connections for database polling using pyodbc? - odbc

I need to run a stored procedure on an Azure SQL Managed Instance every 10 seconds from a Python application. The specific call to cursor.execute() happens in a class that extends threading.Thread like so:
import logging
import threading

import pyodbc

class Parser(threading.Thread):
    def __init__(self, name, event, interface, config):
        threading.Thread.__init__(self)
        self.name = name
        self.stopped = event
        self.interface = interface
        self.config = config
        self.connection_string = config['connection_string']
        self.cnxn = pyodbc.connect(self.connection_string)

    def run(self):
        while not self.stopped.wait(10):
            try:
                cursor = self.cnxn.cursor()
                cursor.execute("exec dbo.myStoredProcedure")
            except Exception as e:
                logging.error(e)
My current challenge is that the above thread does not recover gracefully from interruptions to network connectivity. My goal is to have the thread continue to run and re-attempt every 10 seconds until connectivity is restored, then recover gracefully.
Is the best practice here to delete and recreate the connection with every pass of the while loop?
Should I be using ConnectRetryCount or ConnectRetryInterval in my connection string?
While debugging, I have found that even after connectivity is restored, pyodbc.connect() still fails with ODBC error 08S01 (Communication link failure).
I have looked at the solution proposed in this post but don't see how to apply that solution to a continuous polling architecture.
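One pattern that could address this (a minimal sketch, not a verified fix for the 08S01 behaviour described above) is to open a short-lived connection on every pass of the loop, so a dropped link only affects a single poll and the next tick starts from a clean state. The timeout value below is an arbitrary assumption; the stored procedure and config key are the ones from the question:

import logging
import threading

import pyodbc


class Parser(threading.Thread):
    def __init__(self, name, event, interface, config):
        threading.Thread.__init__(self)
        self.name = name
        self.stopped = event
        self.connection_string = config['connection_string']

    def run(self):
        while not self.stopped.wait(10):
            cnxn = None
            try:
                # Open a fresh connection for this poll; if the network is down,
                # connect() fails, we log it, and the next 10-second tick retries.
                cnxn = pyodbc.connect(self.connection_string, timeout=5)
                cursor = cnxn.cursor()
                cursor.execute("exec dbo.myStoredProcedure")
                cnxn.commit()
            except pyodbc.Error as e:
                logging.error("Poll failed, retrying in 10 seconds: %s", e)
            finally:
                if cnxn is not None:
                    try:
                        cnxn.close()
                    except pyodbc.Error:
                        pass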

Related

start_server() method in python asyncio module

This question is really about the different coroutines in base_events.py and streams.py that deal with network connections, network servers, and their higher-level API equivalents under Streams, but since it's not really clear how to group these functions, I am going to use start_server() to explain what I don't understand about these coroutines and haven't found online (unless I missed something obvious).
When running the following code, I am able to create a server that is able to handle incoming messages from a client and I also periodically print out the number of tasks that the EventLoop is handling to see how the tasks work. What I'm surprised about is that after creating a server, the task is in the finished state not too long after the program starts. I expected that a task in the finished state was a completed task that no longer does anything other than pass back the results or exception.
However, of course this is not quite true: the event loop is still running and handling incoming messages from clients, and the application is still running. The monitor, however, shows that all tasks are finished and no new task is dispatched to handle a new incoming message.
So my question is this:
What is going on underneath asyncio that I am missing that explains the behavior I am seeing? For example, I would have expected a task (or a task created for each message) handling incoming messages to be in the pending state.
Why is asyncio.Task.all_tasks() passing back finished tasks? I would have thought that once a task has completed it is garbage collected (so long as no other references to it exist).
I have seen similar behavior with other asyncio functions, such as using create_connection() with a websocket to a site. I know that at the end of these coroutines the result is usually a tuple such as (reader, writer) or (transport, protocol), but I don't understand how it all ties together or what other documentation/code to read to get more insight. Any help is appreciated.
import asyncio
from pprint import pprint


async def echo_message(reader, writer):
    data = await reader.read(1000)
    message = data.decode()
    addr = writer.get_extra_info('peername')
    print('Received %r from %r' % (message, addr))

    print('Send: %r' % message)
    writer.write(message.encode())
    await writer.drain()

    print('Close the client socket')
    writer.close()


async def monitor():
    while True:
        tasks = asyncio.Task.all_tasks()
        pprint(tasks)
        await asyncio.sleep(60)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.create_task(monitor())
    loop.create_task(asyncio.start_server(echo_message, 'localhost', 7777, loop=loop))
    loop.run_forever()
Outputs:
###
# Soon after starting the application, monitor prints out:
###
{<Task pending coro=<start_server() running ...>,
<Task pending coro=<monitor() running ...>,
<Task pending coro=<BaseEventLoop._create_server_getaddrinfo() running ...>}
###
# After, things initialized and the server has started and the next print out is:
###
{<Task finished coro=<start_server() done ...>,
<Task pending coro=<monitor() running ...>,
<Task finished coro=<BaseEventLoop._create_server_getaddrinfo() done ...>}
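For reference, a small sketch (using the Python 3.7+ stream API rather than the exact code above) of why the start_server() task can be finished while the server keeps serving: the coroutine only binds the listening socket and returns an asyncio.Server, and a handler task exists only while a client connection is actually being processed.

import asyncio


async def handler(reader, writer):
    # A handler task like this exists only for the lifetime of one connection.
    data = await reader.read(1000)
    writer.write(data)
    await writer.drain()
    writer.close()


async def main():
    # Awaiting start_server() just binds the listening socket and returns an
    # asyncio.Server; the start_server() coroutine (and any Task wrapping it)
    # is finished from this point on, even though the server keeps running.
    server = await asyncio.start_server(handler, 'localhost', 7777)
    async with server:
        await server.serve_forever()


# asyncio.run(main())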

Autobahn+twisted reconnecting

I have a series of clients which need to be connected constantly to my server via ws protocol. For a number of different reasons, the connections occasionally drop. This is acceptable, but when it happens I'd like my clients to reconnect.
Currently my temporary workaround is to have a parent process launch the client and, when it detects a connection drop, terminate it (the client never handles any critical data; there are no side effects to SIGKILL-ing it) and respawn a new client. While this does the job, I'd very much prefer to fix the actual problem.
This is roughly my client:
from autobahn.twisted.websocket import WebSocketClientProtocol, WebSocketClientFactory
from twisted.internet import reactor
from threading import Thread
from time import sleep
class Client:
    def __init__(self):
        self._kill = False
        self.factory = WebSocketClientFactory("ws://0.0.0.0")
        self.factory.openHandshakeTimeout = 60  # ensures handshake doesn't time out due to technical limitations
        self.factory.protocol = self._protocol_factory()
        self._conn = reactor.connectTCP("0.0.0.0", 1234, self.factory)
        reactor.run()

    def _protocol_factory(self):
        class ClientProtocol(WebSocketClientProtocol):
            def onConnect(self, response):
                Thread(target=_self.mainloop, daemon=True).start()

            def onClose(self, was_clean, code, reason):
                _self.on_cleanup()

        _self = self
        return ClientProtocol

    def on_cleanup(self):
        self._kill = True
        sleep(30)
        # Wait for self.mainloop to finish.
        # It is guaranteed to exit within 30 seconds of setting the _kill flag.
        self._kill = False
        self._conn = reactor.connectTCP("0.0.0.0", 1234, self.factory)

    def mainloop(self):
        while not self._kill:
            sleep(1)  # does some work
This code makes the client work correctly until the first connection drop, at which point it attempts to reconnect. No exceptions are raised during the process; everything appears to have gone correctly client-side, onConnect is called and a new mainloop starts, but the server never receives that second handshake. The client seems to think it is connected, though.
What am I doing wrong? Why could this be happening?
I'm not a twisted expert and can't really tell what you are doing wrong, but I'm currently working with Autobahn in a project and I solved the reconnection problems using the ReconnectingClientFactory. Maybe you want to check some examples of the use of ReconnectingClientFactory with Autobahn.
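A rough sketch of that approach, assuming the host/port from the question and the commonly documented pattern of mixing Autobahn's WebSocketClientFactory with Twisted's ReconnectingClientFactory (untested against the code above):

from autobahn.twisted.websocket import WebSocketClientProtocol, WebSocketClientFactory
from twisted.internet import reactor
from twisted.internet.protocol import ReconnectingClientFactory


class MyClientProtocol(WebSocketClientProtocol):
    def onOpen(self):
        # Connection and WebSocket handshake succeeded: reset the backoff delay.
        self.factory.resetDelay()


class MyClientFactory(WebSocketClientFactory, ReconnectingClientFactory):
    protocol = MyClientProtocol

    def clientConnectionFailed(self, connector, reason):
        # Let ReconnectingClientFactory retry with exponential backoff.
        self.retry(connector)

    def clientConnectionLost(self, connector, reason):
        self.retry(connector)


if __name__ == '__main__':
    factory = MyClientFactory("ws://0.0.0.0:1234")
    reactor.connectTCP("0.0.0.0", 1234, factory)
    reactor.run()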

SQLite.NET PCL Busy Exception

We are using the SQLite.NET PCL in a Xamarin application.
When putting the database under pressure by doing inserts into multiple tables we are seeing BUSY exceptions being thrown.
Can anyone explain what the difference is between BUSY and LOCKED? And what causes the database to be BUSY?
Our code uses a single connection to the database created using the following code:
var connectionString = new SQLiteConnectionString(GetDefaultConnectionString(),
                                                  _databaseConfiguration.StoreTimeAsTicks);
var connectionWithLock = new SQLiteConnectionWithLock(new SQLitePlatformAndroid(), connectionString);
return new SQLiteAsyncConnection(() => { return connectionWithLock; });
So our problem turned out to be that, although we had ensured within the class we'd written that it only created a single connection to the database, we hadn't ensured that this class was a singleton, so we were still creating multiple connections to the database. Once we ensured it was a singleton, the busy errors stopped.
What I've taken from this is:
Locked means you have multiple threads trying to access the database; the code is inherently not thread safe.
Busy means you have a thread waiting on another thread to complete; your code is thread safe but you are seeing contention in using the database.
...current operation cannot proceed because the required resources are locked...
I am assuming that you are using async-style inserts and are on different threads and thus an insert is timing out waiting for the lock of a different insert to complete. You can use synchronous inserts to avoid this condition. I personally avoid this, when needed, by creating a FIFO queue and consuming that queue synchronously on a dedicated thread. You could also handle the condition by retrying your transaction X number of times before letting the Exception ripple up.
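For illustration only, the dedicated-writer/FIFO-queue pattern described above sketched in Python rather than C# (the surrounding question is SQLite.NET/Xamarin, so the queue name, table, and stop sentinel here are all assumptions, not SQLite.NET API): all writes are funnelled through a single thread, so the database never sees two writers competing for the lock.

import queue
import sqlite3
import threading

write_queue = queue.Queue()
_STOP = object()  # sentinel used to shut the writer down


def writer(db_path):
    # A single thread owns the connection and performs every write in order.
    conn = sqlite3.connect(db_path)
    while True:
        item = write_queue.get()
        if item is _STOP:
            break
        sql, params = item
        conn.execute(sql, params)
        conn.commit()
    conn.close()


if __name__ == '__main__':
    threading.Thread(target=writer, args=('app.db',)).start()
    # Producers on any thread enqueue work instead of touching the connection:
    write_queue.put(("CREATE TABLE IF NOT EXISTS logs (msg TEXT)", ()))
    write_queue.put(("INSERT INTO logs (msg) VALUES (?)", ("hello",)))
    write_queue.put(_STOP)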
SQLiteBusyException is a special exception that is thrown whenever SQLite returns SQLITE_BUSY or SQLITE_IOERR_BLOCKED error code. These codes mean that the current operation cannot proceed because the required resources are locked.
When a timeout is set via SQLiteConnection.setBusyTimeout(long), SQLite will attempt to get the lock during the specified timeout before returning this error.
Ref: http://www.sqlite.org/lockingv3.html
Ref: http://sqlite.org/capi3ref.html#sqlite3_busy_timeout
I have applied the following solution, which works in my case (a mobile app).
Use the sqlitepclraw.bundle_green NuGet package with SQLitePCL.
Try to use a single connection throughout the app.
After creating the SQLiteConnection, apply a busy timeout using the following call:
var connection = new SQLiteConnection(databasePath: path);
SQLite3.BusyTimeout(connection.Handle, 5000); // 5000 millisecond.

Workflow hosted inside WorkflowApplication, aborting on persistence

I have been trying to resolve a somewhat intermittent issue with a long-running state machine hosted inside a WorkflowApplication. I can step through the workflow and it behaves as expected, transitioning through the states, until a bookmark is reached which persists the workflow. However, the workflow is then aborted and I get the following message:
The execution of an InstancePersistenceCommand was interrupted because the instance owner registration for owner ID 'ba26f4e9-f38b-4179-aa09-31ab9f8af337' has become invalid. This error indicates that the in-memory copy of all instances locked by this owner have become stale and should be discarded, along with the InstanceHandles. Typically, this error is best handled by restarting the host.
The Sql Instance store is initialised as follows:
SqlStore = new SqlWorkflowInstanceStore(ConfigurationManager.ConnectionStrings["SqlInstanceStore"].ConnectionString);
SqlStore.HostLockRenewalPeriod = TimeSpan.FromSeconds(15);
SqlStore.InstanceCompletionAction = InstanceCompletionAction.DeleteAll;
handle = SqlStore.CreateInstanceHandle();
InstanceView sqlView = SqlStore.Execute(handle, new CreateWorkflowOwnerCommand(), TimeSpan.FromSeconds(30));
SqlStore.DefaultInstanceOwner = sqlView.InstanceOwner;
WorkflowHost = new WorkflowApplication(WorkflowDefinition, inputs);
WorkflowHost.Run();
To create the bookmark:
context.CreateBookmark(bkmk, OnResume);
The exception doesn't really provide enough information to help troubleshooting this issue. Any help would be appreciated.
I managed to resolve this by using an overload of CreateInstanceHandle to which I now pass a new GUID. I still need to understand this a bit more.
handle = SqlStore.CreateInstanceHandle(Guid.NewGuid());

Correct usage of session with asynchronous servers in SQLAlchemy

Background:
We have a Python web application which uses SQLAlchemy as its ORM. We currently run this application with Gunicorn (sync worker). This application is only used to serve long-running requests (i.e. serving big files; please don't advise using X-Sendfile/X-Accel-Redirect, because the response is generated dynamically from the Python app).
With Gunicorn sync workers, when we run 8 workers only 8 requests are served simultaneously. Since all of these responses are IO bound, we want to switch to an asynchronous worker type to get better throughput.
We have switched the worker type from sync to eventlet in the Gunicorn configuration file. Now we can respond to all of the requests simultaneously, but another mysterious (mysterious to me) problem has occurred.
In the application we have a scoped session object in module level. Following code is from our orm.py file:
uri = 'mysql://%s:%s@%s/%s?charset=utf8&use_unicode=1' % (
    config.MYSQL_USER,
    config.MYSQL_PASSWD,
    config.MYSQL_HOST,
    config.MYSQL_DB,
)

engine = create_engine(uri, echo=False)
session = scoped_session(sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=engine,
    query_cls=CustomQuery,
    expire_on_commit=False
))
Our application uses the session like this:
from putio.models import session
f = session.query(File).first()
f.name = 'asdf'
session.add(f)
session.commit()
While we were using the sync worker, the session was used by one request at a time. After we switched to the async eventlet worker, all requests in the same worker share the same session, which is not desired. When the session is committed in one request, or an exception happens, all other requests fail because the session is shared.
The SQLAlchemy documentation says that scoped_session is used to provide separate sessions in threaded environments. AFAIK, requests in async workers run in the same thread.
Question:
We want separate sessions for each request in the async worker. What is the correct way of using sessions with async workers in SQLAlchemy?
Use scoped_session's scopefunc argument.
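A sketch of what that could look like with eventlet (keying the session registry on the current greenlet is an assumption about the intended scope, not something spelled out in the answer; the connection URI below is a placeholder):

from greenlet import getcurrent
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# Placeholder URI; in the application above it is built from the config.MYSQL_* values.
engine = create_engine('mysql://user:password@localhost/mydb?charset=utf8', echo=False)

# By default, scoped_session keys its registry on the current thread, which is
# why all eventlet requests in one worker ended up sharing a session. Passing
# scopefunc=getcurrent keys the registry on the current greenlet instead, so
# each request (each greenlet) gets its own session.
session = scoped_session(
    sessionmaker(autocommit=False, autoflush=False, bind=engine),
    scopefunc=getcurrent,
)

# Each request should still call session.remove() when it finishes, so the
# greenlet's session is disposed of.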
