Python: How to add a signal handler to an asyncio HTTP server

My HTTP server runs using the aiohttp.web module:
import asyncio as aio
import aiohttp.web as web
server = web.Application()
server.add_routes([...])
web.run_app(server, port=8080)
The code inside web.run_app uses the main event loop. In simple apps it handles the KeyboardInterrupt exception and exits when Ctrl+C is pressed. However, I also need to terminate all of my threads, which aiohttp.web won't do, so the programme doesn't exit.
How do I override the default signal handler of aiohttp.web.Application?

Thanks to @user4815162342 in the comments, there are 2 solutions to the problem:
Solution 1: Add daemon=True:
my_thread = Thread(target=..., daemon=True)
Solution 2: Wrap web.run_app in a try block:
try:
    web.run_app(server,...)
finally:
    terminate_threads()
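Solution 2 leaves terminate_threads() undefined. A minimal sketch of one way to fill it in follows, assuming the worker threads can poll a shared threading.Event. The names stop_event, worker, start_threads, serve and main are hypothetical and not part of aiohttp.

```python
import threading

stop_event = threading.Event()  # hypothetical flag that every worker polls


def worker():
    # Stand-in for real work: loop until asked to stop.
    while not stop_event.is_set():
        stop_event.wait(0.1)


def start_threads(count=2):
    threads = [threading.Thread(target=worker) for _ in range(count)]
    for thread in threads:
        thread.start()
    return threads


def terminate_threads(threads, timeout=5.0):
    # Ask every worker to stop, then wait for each one to finish.
    stop_event.set()
    for thread in threads:
        thread.join(timeout)


def serve(run_server):
    # run_server is any blocking callable, e.g. a wrapper around web.run_app.
    threads = start_threads()
    try:
        run_server()
    finally:
        terminate_threads(threads)
    return threads


def main():
    # Imported here so the helpers above need only the standard library.
    from aiohttp import web

    server = web.Application()
    serve(lambda: web.run_app(server, port=8080))
```

Calling main() starts the workers, runs the server, and joins the workers once web.run_app returns after Ctrl+C. Unlike daemon=True, this gives each thread a chance to finish its current step before the process exits.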

Related

How to get the Server instance in Streamlit version 1.12?

Until version 1.11, Streamlit had the following way to access the current Server instance:
from streamlit.server.server import Server
Server.get_current()
Now, in version 1.12, it changed to:
from streamlit.web.server import Server
It's ok. But the method get_current() was removed from the class Server.
So, is there another way to get the server instance?
In case there is no other way (if there is, please tell me), the server instance can be found in the object list of the garbage collector:
import gc
for obj in gc.get_objects():
    if type(obj) is Server:
        server = obj
        break
They removed the singleton in this PR.
Here is one internal way to access the object, by fetching it from the closure variables of a signal handler that streamlit registers in its run() method:
import signal
import typing as T
from streamlit.web.server import Server
def get_streamlit_server() -> T.Optional[Server]:
    """
    Get the active streamlit server object. Must be called within a running
    streamlit session.
    Easy access to this object was removed in streamlit 1.12:
    https://github.com/streamlit/streamlit/pull/4966
    """
    # In the run() method in `streamlit/web/bootstrap.py`, a signal handler is registered
    # with the server as a closure. Fetch that signal handler.
    streamlit_signal_handler = signal.getsignal(signal.SIGQUIT)
    # Iterate through the closure variables and return the server if found.
    # A handler without closure variables has no cells to inspect.
    for cell in getattr(streamlit_signal_handler, '__closure__', None) or ():
        if isinstance(cell.cell_contents, Server):
            return cell.cell_contents
    return None
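The closure lookup itself does not depend on streamlit. As a rough illustration, the sketch below reproduces it with the standard library only. FakeServer, make_handler, find_in_closure and demo are hypothetical names, and SIGTERM is used in place of SIGQUIT purely for portability.

```python
import signal


class FakeServer:
    """Hypothetical stand-in for streamlit's Server class."""


def make_handler(server):
    # The inner function refers to `server`, so Python stores it in a closure cell.
    def handler(signum, frame):
        print("stopping", server)

    return handler


def find_in_closure(func, wanted_type):
    # Handlers such as signal.SIG_DFL have no __closure__ attribute at all.
    for cell in getattr(func, "__closure__", None) or ():
        try:
            value = cell.cell_contents
        except ValueError:  # the cell exists but is still empty
            continue
        if isinstance(value, wanted_type):
            return value
    return None


def demo():
    # Register a handler, then recover the object it closes over.
    server = FakeServer()
    previous = signal.signal(signal.SIGTERM, make_handler(server))
    try:
        return find_in_closure(signal.getsignal(signal.SIGTERM), FakeServer) is server
    finally:
        # Restore the earlier handler (None means it was not set from Python).
        signal.signal(signal.SIGTERM, signal.SIG_DFL if previous is None else previous)
```

Because this relies on how the handler happens to be written, it can stop working whenever the library changes that handler.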

start_server() method in python asyncio module

This question is really about the different coroutines in base_events.py and streams.py that deal with network connections and network servers, and their higher-level equivalents under Streams. Since it's not really clear how to group these functions, I am going to use start_server() to explain what I don't understand about these coroutines and haven't found online (unless I missed something obvious).
When running the following code, I am able to create a server that is able to handle incoming messages from a client and I also periodically print out the number of tasks that the EventLoop is handling to see how the tasks work. What I'm surprised about is that after creating a server, the task is in the finished state not too long after the program starts. I expected that a task in the finished state was a completed task that no longer does anything other than pass back the results or exception.
However, of course this is not true, the EventLoop is still running and handling incoming messages from clients and the application is still running. Monitor however shows that all tasks are completed and no new task is dispatched to handle a new incoming message.
So my question is this:
What is going on underneath asyncio that I am missing that explains the behavior I am seeing? For example, I would have expected a task (or tasks created for each message) that is handling incoming messages in the pending state.
Why is asyncio.Task.all_tasks() passing back finished tasks? I would have thought that once a task has completed it is garbage collected (so long as there are no other references to it).
I have seen similar behavior with the other asyncio functions like using create_connection() with a websocket from a site. I know at the end of these coroutines, their result is usually a tuple such as (reader, writer) or (transport, protocol) but I don't understand how it all ties together or what other documentation/code to read to give me more insight. Any help is appreciated.
import asyncio
from pprint import pprint
# start_server calls the callback with (reader, writer) only, so there is no self.
async def echo_message(reader, writer):
    data = await reader.read(1000)
    message = data.decode()
    addr = writer.get_extra_info('peername')
    print('Received %r from %r' % (message, addr))
    print('Send: %r' % message)
    writer.write(message.encode())
    await writer.drain()
    print('Close the client socket')
    writer.close()
async def monitor():
    while True:
        tasks = asyncio.Task.all_tasks()
        pprint(tasks)
        await asyncio.sleep(60)
if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.create_task(monitor())
    # loop is keyword-only in start_server
    loop.create_task(asyncio.start_server(echo_message, 'localhost', 7777, loop=loop))
    loop.run_forever()
Outputs:
###
# Soon after starting the application, monitor prints out:
###
{<Task pending coro=<start_server() running ...>,
<Task pending coro=<monitor() running ...>,
<Task pending coro=<BaseEventLoop._create_server_getaddrinfo() running ...>}
###
# After, things initialized and the server has started and the next print out is:
###
{<Task finished coro=<start_server() done ...>,
<Task pending coro=<monitor() running ...>,
<Task finished coro=<BaseEventLoop._create_server_getaddrinfo() done ...>}
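One way to look at the behaviour described above is the small experiment below, a sketch assuming Python 3.7 or later. start_server() is a coroutine that returns a Server object once the listening sockets are set up, so the task wrapping it finishes early. According to the asyncio documentation, a coroutine callback is then scheduled as a task for each client connection. The handler_tasks list, the port 0 choice and the in-process client are assumptions added for the demonstration.

```python
import asyncio

handler_tasks = []  # hypothetical: records the task each connection runs in


async def echo_message(reader, writer):
    handler_tasks.append(asyncio.current_task())
    data = await reader.read(1000)
    writer.write(data)
    await writer.drain()
    writer.close()


async def main():
    # Wrap start_server in a task, as the question does with loop.create_task.
    start_task = asyncio.create_task(
        asyncio.start_server(echo_message, "127.0.0.1", 0)  # port 0: any free port
    )
    server = await start_task  # the task is finished once the server is listening
    port = server.sockets[0].getsockname()[1]

    # Connect as a client from the same process and send one message.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello")
    await writer.drain()
    reply = await reader.read(1000)
    writer.close()
    await writer.wait_closed()

    server.close()
    return reply, start_task.done()


if __name__ == "__main__":
    print(asyncio.run(main()))
    print(handler_tasks)
```

The per-connection task only exists while a client is being served, so a monitor that prints once a minute would rarely catch it. Note also that asyncio.all_tasks(), the replacement for asyncio.Task.all_tasks(), only returns tasks that are not yet done.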

Channels consumer blocks normal HTTP in Django?

I am running a development server locally
python manage.py runserver 8000
Then I run a script which consumes the Consumer below
from channels.generic.websocket import AsyncJsonWebsocketConsumer
class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        import time
        time.sleep(99999999)
        await self.accept()
Everything runs fine and the consumer sleeps for a long time as expected. However I am not able to access http://127.0.0.1:8000/ from the browser.
The problem is bigger in real life, since the consumer needs to make an HTTP request to the same server and essentially ends up in a deadlock.
Is this the expected behaviour? How do I allow calls to my server while a slow consumer is running?
Since this is an async function, you should be using asyncio's sleep.
import asyncio
from channels.generic.websocket import AsyncJsonWebsocketConsumer
class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await asyncio.sleep(99999999)
        await self.accept()
If you use time.sleep, you will put the entire Python thread to sleep.
This also applies when you make your upstream HTTP request: you need to use an asyncio HTTP library, not a synchronous one. (Basically, you should be awaiting anything that is expected to take any time.)
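The difference can be seen without Django or Channels. The sketch below uses only asyncio and counts how often a background "heartbeat" coroutine gets to run while a consumer sleeps. All function names are hypothetical, and the third variant, which pushes a blocking call onto a worker thread with run_in_executor, is an extra option that the answer above does not mention.

```python
import asyncio
import time


async def blocking_consumer(delay):
    time.sleep(delay)  # never yields, so the event loop is stuck


async def cooperative_consumer(delay):
    await asyncio.sleep(delay)  # yields to the loop while waiting


async def offloaded_consumer(delay):
    # Run an unavoidable blocking call in the default thread pool instead.
    loop = asyncio.get_running_loop()
    await loop.run_in_executor(None, time.sleep, delay)


async def heartbeat_count(consumer, delay=0.3, interval=0.05):
    """Count how many heartbeats fire while `consumer` is running."""
    beats = 0

    async def heartbeat():
        nonlocal beats
        while True:
            await asyncio.sleep(interval)
            beats += 1

    task = asyncio.create_task(heartbeat())
    await consumer(delay)
    task.cancel()
    return beats


if __name__ == "__main__":
    for consumer in (blocking_consumer, cooperative_consumer, offloaded_consumer):
        print(consumer.__name__, asyncio.run(heartbeat_count(consumer)))
```

With the blocking consumer the heartbeat never gets a turn, which is the same effect as the browser request that cannot be served.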

Flask-socketIO + Kafka as a background process

What I want to do
I have an HTTP API service, written in Flask, which is a template used to build instances of different services. As such, this template needs to be generalizable to handle use cases that do and do not include Kafka consumption.
My goal is to have an optional Kafka consumer running in the background of the API template. I want any service that needs it to be able to read data from a Kafka topic asynchronously, while also independently responding to HTTP requests as it usually does. These two processes (Kafka consuming, HTTP request handling) aren't related, except that they'll be happening under the hood of the same service.
What I've written
Here's my setup:
# ./create_app.py
from flask import Flask
from flask_socketio import SocketIO
socketio = None
def create_app(kafka_consumer_too=False):
    """
    Return a Flask app object, with or without a Kafka-ready SocketIO object as well
    """
    app = Flask('my_service')
    app.register_blueprint(special_http_handling_blueprint)
    if kafka_consumer_too:
        global socketio
        socketio = SocketIO(app=app, message_queue='kafka://localhost:9092', channel='some_topic')
        from .blueprints import kafka_consumption_blueprint
        app.register_blueprint(kafka_consumption_blueprint)
        return app, socketio
    return app
My run.py is:
# ./run.py
from . import create_app
app, socketio = create_app(kafka_consumer_too=True)
if __name__=="__main__":
    socketio.run(app, debug=True)
And here's the Kafka consumption blueprint I've written, which is where I think it should be handling the stream events:
# ./blueprints/kafka_consumption_blueprint.py
from flask import Blueprint
from ..create_app import socketio
kafka_consumption_blueprint = Blueprint('kafka_consumption', __name__)
@socketio.on('message')
def handle_message(message):
    print('received message: ' + message)
What it currently does
With the above, my HTTP requests are being handled fine when I curl localhost:5000. The problem is that, when I write to the some_topic Kafka topic (on port 9092), nothing is showing up. I have a CLI Kafka consumer running in another shell, and I can see that the messages I'm sending on that topic are showing up. So it's the Flask app that's not reacting: no messages are being consumed by handle_message().
What am I missing here? Thanks in advance.
I think you are interpreting the meaning of the message_queue argument incorrectly.
This argument is used when you have multiple server instances. These instances communicate with each other through the configured message queue. This queue is 100% internal; there is nothing that you, as a user of the library, can do with the message queue.
If you wanted to build some sort of pub/sub mechanism, then you have to implement the listener for that in your application.
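A rough sketch of what "implement the listener in your application" could look like is shown below. It deliberately avoids any specific Kafka client: messages is any iterable standing in for a consumer object, and start_listener is a hypothetical helper, not part of Flask-SocketIO.

```python
import threading


def start_listener(messages, handle_message):
    """Feed every item of `messages` to `handle_message` on a daemon thread.

    `messages` stands in for a Kafka consumer; any iterable works here.
    """

    def run():
        for message in messages:
            handle_message(message)

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    received = []
    listener = start_listener(["first", "second"], received.append)
    listener.join(timeout=2)
    print(received)
```

In a real service the iterable would be a consumer from a Kafka client library, subscribed to some_topic and started once inside create_app() when kafka_consumer_too is true. The HTTP handlers keep running on their own threads in the meantime.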

Autobahn+twisted reconnecting

I have a series of clients which need to be connected constantly to my server via ws protocol. For a number of different reasons, the connections occasionally drop. This is acceptable, but when it happens I'd like my clients to reconnect.
Currently my temporary workaround is to have a parent process launch the client and when it detects connection drop, terminate it (client never handles any critical data, there are no side effects to sigkill-ing it) and respawn a new client. While this does the job, I'd very much prefer to fix the actual problem.
This is roughly my client:
from autobahn.twisted.websocket import WebSocketClientProtocol, WebSocketClientFactory
from twisted.internet import reactor
from threading import Thread
from time import sleep
class Client:
    def __init__(self):
        self._kill = False
        self.factory = WebSocketClientFactory("ws://0.0.0.0")
        self.factory.openHandshakeTimeout = 60 # ensures handshake doesnt timeout due to technical limitations
        self.factory.protocol = self._protocol_factory()
        self._conn = reactor.connectTCP("0.0.0.0", 1234, self.factory)
        reactor.run()
    def _protocol_factory(self):
        class ClientProtocol(WebSocketClientProtocol):
            def onConnect(self, response):
                Thread(target=_self.mainloop, daemon=True).start()
            def onClose(self, was_clean, code, reason):
                _self.on_cleanup()
        _self = self
        return ClientProtocol
    def on_cleanup(self):
        self._kill = True
        sleep(30)
        # Wait for self.mainloop to finish.
        # It is guaranteed to exit within 30 seconds of setting _kill flag
        self._kill = False
        self._conn = reactor.connectTCP("0.0.0.0", 1234, self.factory)
    def mainloop(self):
        while not self._kill:
            sleep(1) # does some work
With this code the client works correctly until the first connection drop, at which point it attempts to reconnect. No exceptions are raised during the process and everything appears to go correctly on the client side: onConnect is called and a new mainloop starts, but the server never receives that second handshake. The client seems to think it is connected, though.
What am I doing wrong? Why could this be happening?
I'm not a twisted expert and can't really tell what you are doing wrong, but I'm currently working with Autobahn in a project and I solved the reconnection problems using the ReconnectingClientFactory. Maybe you want to check some examples of the use of ReconnectingClientFactory with Autobahn.
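For reference, a minimal sketch of the ReconnectingClientFactory approach is given below, assuming autobahn and twisted are installed. The names build_factory and ReconnectingWebSocketFactory, the 30 second maxDelay and the address in main() are illustrative assumptions, not taken from the question.

```python
def build_factory(url):
    # Imports are deferred so the module loads even where autobahn is missing.
    from autobahn.twisted.websocket import (
        WebSocketClientFactory,
        WebSocketClientProtocol,
    )
    from twisted.internet.protocol import ReconnectingClientFactory

    class ClientProtocol(WebSocketClientProtocol):
        def onOpen(self):
            # A successful handshake resets the retry back-off.
            self.factory.resetDelay()

    class ReconnectingWebSocketFactory(WebSocketClientFactory, ReconnectingClientFactory):
        protocol = ClientProtocol
        maxDelay = 30  # assumed cap, in seconds, on the retry delay

        def clientConnectionFailed(self, connector, reason):
            self.retry(connector)

        def clientConnectionLost(self, connector, reason):
            self.retry(connector)

    return ReconnectingWebSocketFactory(url)


def main():
    from twisted.internet import reactor

    factory = build_factory("ws://127.0.0.1:1234")  # hypothetical address
    reactor.connectTCP("127.0.0.1", 1234, factory)
    reactor.run()
```

With this arrangement the factory schedules the reconnect itself, so onClose no longer needs to sleep on the reactor thread or call reactor.connectTCP by hand.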
