Correct usage of session with asynchronous servers in SQLAlchemy

Background:
We have a Python web application which uses SQLAlchemy as its ORM. We currently run this application with Gunicorn (sync workers). The application is only used to serve LONG-RUNNING requests (i.e. serving big files; please don't advise using X-Sendfile/X-Accel-Redirect, because the response is generated dynamically by the Python app).
With Gunicorn sync workers, when we run 8 workers only 8 requests are served simultaneously. Since all of these responses are IO bound, we want to switch to an asynchronous worker type to get better throughput.
We have switched the worker type from sync to eventlet in the Gunicorn configuration file. Now we can respond to all requests simultaneously, but another mysterious (mysterious to me) problem has occurred.
In the application we have a scoped session object at module level. The following code is from our orm.py file:
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

uri = 'mysql://%s:%s@%s/%s?charset=utf8&use_unicode=1' % (
    config.MYSQL_USER,
    config.MYSQL_PASSWD,
    config.MYSQL_HOST,
    config.MYSQL_DB,
)

engine = create_engine(uri, echo=False)

session = scoped_session(sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=engine,
    query_cls=CustomQuery,
    expire_on_commit=False
))
Our application uses the session like this:
from putio.models import session
f = session.query(File).first()
f.name = 'asdf'
session.add(f)
session.commit()
While we were using the sync worker, the session was used by only one request at a time. After we switched to the async eventlet worker, all requests in the same worker share the same session, which is not desired. When the session is committed in one request, or an exception happens, all other requests fail because the session is shared.
The SQLAlchemy documentation says that scoped_session is used to get separate sessions in threaded environments. AFAIK, requests in async workers all run in the same thread.
Question:
We want separate sessions for each request in the async worker. What is the correct way of using the session with async workers in SQLAlchemy?

Use scoped_session's scopefunc argument.
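With the eventlet worker each request runs in its own greenthread, so the greenthread can serve as the session scope. Below is a minimal sketch of what orm.py could look like, assuming greenlet is importable (it is a dependency of eventlet); the URI is a placeholder and the teardown comment is illustrative, not part of the question's code:

from greenlet import getcurrent
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# placeholder URI; in the real app it is built from config as shown above
uri = 'mysql://user:password@localhost/mydb?charset=utf8&use_unicode=1'
engine = create_engine(uri, echo=False)

session = scoped_session(
    sessionmaker(
        autocommit=False,
        autoflush=False,
        bind=engine,
        expire_on_commit=False,
    ),
    scopefunc=getcurrent,  # scope sessions to the current greenlet instead of the thread
)

# Call session.remove() when each request finishes (e.g. in a teardown hook)
# so the greenlet's session is released cleanly.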

Related

grpc-swift: How to set timeout for an RPC in Swift?

I am using https://github.com/grpc/grpc-swift for inter-process communication. I have a GRPC server written in Go that listens on a unix domain socket, and a macOS app written in Swift that communicates with it over the socket.
Let's say the Go server process is not running and I make an RPC call from my Swift program. The default timeout before the call will fail is 20 seconds, but I would like to shorten it to 1 second. I am trying to do something like this:
let callOptions = CallOptions(timeLimit: .seconds(1)) // <-- Does not compile
This fails with compile error Type 'TimeLimit' has no member 'seconds'.
What is the correct way to decrease the timeout interval for Swift GRPC calls?
As the error says, TimeLimit doesn't have a member seconds. The seconds function you are trying to access is on TimeAmount. So if you want to use a deadline, you will need to use:
CallOptions(timeLimit: .deadline(.now() + .seconds(1)))
Here, .now() is on NIODeadline, and it has a + operator defined for adding a TimeAmount (check here).
and for a timeout:
CallOptions(timeLimit: .timeout(.seconds(1)))
Note that I'm not an expert in Swift, but I checked in TimeLimitTests.swift and that seems to be the idea.

Best practices for managing connections for database polling using pyodbc?

I need to run a stored procedure on an Azure SQL Managed Instance every 10 seconds from a Python application. The specific call to cursor.execute() happens in a class that extends threading.Thread like so:
import logging
import threading

import pyodbc

class Parser(threading.Thread):
    def __init__(self, name, event, interface, config):
        threading.Thread.__init__(self)
        self.name = name
        self.stopped = event
        self.interface = interface
        self.config = config
        self.connection_string = config['connection_string']
        self.cnxn = pyodbc.connect(self.connection_string)

    def run(self):
        while not self.stopped.wait(10):
            try:
                cursor = self.cnxn.cursor()
                cursor.execute("exec dbo.myStoredProcedure")
            except Exception as e:
                logging.error(e)
My current challenge is that the above thread does not recover gracefully from interruptions to network connectivity. My goal is to have the thread continue to run and re-attempt every 10 seconds until connectivity is restored, then recover gracefully.
Is the best practice here to delete and recreate the connection with every pass of the while loop?
Should I be using ConnectRetryCount or ConnectRetryInterval in my connection string?
While debugging I have found that even after connectivity is restored, pyodbc.connect() still fails with ODBC error 08S01 Communication link failure.
I have looked at the solution proposed in this post but don't see how to apply that solution to a continuous polling architecture.
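For illustration, here is a minimal sketch of the "recreate the connection on every pass" option the question raises. The connection string key and stored procedure name are taken from the snippet above; the class name and the 5-second login timeout are arbitrary choices for the sketch, not a recommendation:

import logging
import threading

import pyodbc

class PollingParser(threading.Thread):
    def __init__(self, event, config):
        threading.Thread.__init__(self)
        self.stopped = event
        self.connection_string = config['connection_string']

    def run(self):
        # Open a fresh connection on each pass and always close it, so a dropped
        # link on one iteration cannot poison the next attempt 10 seconds later.
        while not self.stopped.wait(10):
            try:
                cnxn = pyodbc.connect(self.connection_string, timeout=5)
                try:
                    cnxn.cursor().execute("exec dbo.myStoredProcedure")
                    cnxn.commit()
                finally:
                    cnxn.close()
            except pyodbc.Error as e:
                logging.error(e)  # log and retry on the next tick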

Flask-socketIO + Kafka as a background process

What I want to do
I have an HTTP API service, written in Flask, which is a template used to build instances of different services. As such, this template needs to be generalizable to handle use cases that do and do not include Kafka consumption.
My goal is to have an optional Kafka consumer running in the background of the API template. I want any service that needs it to be able to read data from a Kafka topic asynchronously, while also independently responding to HTTP requests as it usually does. These two processes (Kafka consuming, HTTP request handling) aren't related, except that they'll be happening under the hood of the same service.
What I've written
Here's my setup:
# ./create_app.py
from flask import Flask
from flask_socketio import SocketIO

socketio = None

def create_app(kafka_consumer_too=False):
    """
    Return a Flask app object, with or without a Kafka-ready SocketIO object as well
    """
    app = Flask('my_service')
    app.register_blueprint(special_http_handling_blueprint)
    if kafka_consumer_too:
        global socketio
        socketio = SocketIO(app=app, message_queue='kafka://localhost:9092', channel='some_topic')
        from .blueprints import kafka_consumption_blueprint
        app.register_blueprint(kafka_consumption_blueprint)
        return app, socketio
    return app
My run.py is:
# ./run.py
from . import create_app
app, socketio = create_app(kafka_consumer_too=True)
if __name__ == "__main__":
    socketio.run(app, debug=True)
And here's the Kafka consumption blueprint I've written, which is where I think it should be handling the stream events:
# ./blueprints/kafka_consumption_blueprint.py
from flask import Blueprint

from ..create_app import socketio

kafka_consumption_blueprint = Blueprint('kafka_consumption', __name__)

@socketio.on('message')
def handle_message(message):
    print('received message: ' + message)
What it currently does
With the above, my HTTP requests are being handled fine when I curl localhost:5000. The problem is that, when I write to the some_topic Kafka topic (on port 9092), nothing is showing up. I have a CLI Kafka consumer running in another shell, and I can see that the messages I'm sending on that topic are showing up. So it's the Flask app that's not reacting: no messages are being consumed by handle_message().
What am I missing here? Thanks in advance.
I think you are interpreting the meaning of the message_queue argument incorrectly.
This argument is used when you have multiple server instances. These instances communicate with each other through the configured message queue. The queue is 100% internal; there is nothing that you, as a user of the library, can do with it.
If you want to build some sort of pub/sub mechanism, you have to implement the listener for it in your application.
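For example, such a listener could be based on the kafka-python package and run as a Flask-SocketIO background task, forwarding Kafka records to connected clients. The topic and broker address below come from the question; the function name and everything else are illustrative:

from kafka import KafkaConsumer  # kafka-python package, assumed to be installed

def kafka_listener(socketio):
    consumer = KafkaConsumer('some_topic', bootstrap_servers='localhost:9092')
    for record in consumer:
        message = record.value.decode('utf-8')
        print('received message: ' + message)
        socketio.emit('message', message)  # relay to SocketIO clients if desired

# started once, e.g. inside create_app() after constructing socketio:
# socketio.start_background_task(kafka_listener, socketio)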

F# Http.AsyncRequestStream just 'hangs' on long queries

I am working with:
let callTheAPI = async {
    printfn "\t\t\tMAKING REQUEST at %s..." (System.DateTime.Now.ToString("yyyy-MM-ddTHH:mm:ss"))
    let! response = Http.AsyncRequestStream(url,query,headers,httpMethod,requestBody)
    printfn "\t\t\t\tREQUEST MADE."
}
And
let cts = new System.Threading.CancellationTokenSource()
let timeout = 1000*60*4//4 minutes (4 mins no grace)
cts.CancelAfter(timeout)
Async.RunSynchronously(callTheAPI,timeout,cts.Token)
use respStrm = response.ResponseStream
respStrm.Flush()
writeLinesTo output (responseLines respStrm)
This is to call a web API (REST). The let! response = Http.AsyncRequestStream(url,query,headers,httpMethod,requestBody) line just hangs on certain queries, particularly ones that take a long time (>4 minutes). This is why I have made it async and put a 4-minute timeout on it. (I collect the calls that time out and make them again with smaller time-range parameters.)
I started with Http.RequestStream from FSharp.Data, but I couldn't add a timeout to it, so the script would just 'hang'.
I have looked at the API's IIS server and the application pool Worker Process active requests in IIS manager and I can see the requests come in and go again. They then 'vanish' and the F# script hangs. I can't find an error message anywhere on the script side or server side.
I included the Flush() and removed the timeout and it still hung. (Removing the Async in the process)
Additional:
Successful calls are made. Failed calls can be followed by successful calls. However, it seems to reach a point where all the calls time out, and they do so without even reaching the server any more (Worker Process Active Requests doesn't show the query).
Update:
I made the .fsx script output the queries and ran them through IRM with no issues (I have a timeout and it never locks up). I have a suspicion that there is an issue with FSharp.Data.Http.
Async.RunSynchronously blocks. Read the remarks section in the docs: RunSynchronously. Instead, use Async.AwaitTask.

ActiveMQ 5.2.0 + REST + HTTP POST = java.lang.OutOfMemoryError

First off, I am a newbie when it comes to JMS & ActiveMQ.
I have been looking into a messaging solution to serve as middleware for a message producer that will insert XML messages into a queue via HTTP POST. The producer is an existing system written in C++ that cannot be modified (so Java and the C++ API are out).
Using the "demo" examples and some trial and error, I have cobbled together a working example of what I want to do (on a windows box).
The web.xml I configured in a test directory under "webapps" specifies that the HTTP POST messages received from the producer are to be handled by the MessageServlet.
I added a line for the test app in "activemq.xml" ('ow' is the test app dir).
I created a test script to "insert" messages into the queue which works well.
The problem I am running into is that as I continue to insert messages via REST/HTTP POST, the memory consumption and thread count used by ActiveMQ continue to rise (it happens whether I have timely consumers or slow/non-existent ones).
When memory consumption gets to around 250 MB and the thread count exceeds 5000 (as shown in Windows Task Manager), ActiveMQ crashes and I see this in the log:
Exception in thread "ActiveMQ Transport Initiator: vm://localhost#3564" java.lang.OutOfMemoryError: unable to create new native thread
It is as if Jetty is spawning a new thread to handle each HTTP POST and the thread never dies.
I did look at this page:
http://activemq.apache.org/javalangoutofmemory.html
and tried the suggestions there, but that didn't fix the problem (although I didn't fully understand the implications of the change either).
Does anyone have any ideas?
Thanks!
Bruce Loth
PS - I included the "test message producer" Python script below for what it is worth. It sends batches of 100 messages; I ran it repeatedly from the command line while watching the memory consumption and thread count of ActiveMQ in Task Manager.
def foo():
    import httplib, urllib
    body = "<?xml version='1.0' encoding='UTF-8'?>\n \
        <ROOT>\n \
        [snip: xml deleted to save space]
        </ROOT>"
    headers = {"content-type": "text/xml",
               "content-length": str(len(body))}
    conn = httplib.HTTPConnection("127.0.0.1:8161")
    conn.request("POST", "/ow/message/RDRCP_Inbox?type=queue", body, headers)
    response = conn.getresponse()
    print response.status, response.reason
    data = response.read()
    conn.close()
## end method definition

## Begin test code
count = 0
while count < 100:
    # Test with batches of 100 msgs
    count += 1
    foo()
The error is not directly caused by ActiveMQ but by the Java runtime. Take a look here:
http://activemq.apache.org/javalangoutofmemory.html
It explains how to increase the Java heap size. There is also interesting material about WHY this happens and what you might do to prevent it. ActiveMQ is pretty good but needs some tuning here and there in the config files.
You may want to add the following to the URL's query string:
JMSDeliveryMode=persistent
Otherwise, by definition (read "by default"), the messages would be kept in AMQ's memory.
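With the test script above, that would presumably just mean appending the parameter to the POST path (assuming the REST servlet accepts it as this answer describes):

# same request as in the test script, with persistent delivery requested
conn.request("POST",
             "/ow/message/RDRCP_Inbox?type=queue&JMSDeliveryMode=persistent",
             body, headers)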
