I have a resource-intensive async method that I want to run as a background task. Example code for it looks like this:
class EventsTrigger:
    @staticmethod
    async def trigger_task(id: str, run_e2e: bool = False):
        try:
            add_status_for_task(id)
            result1, result2 = await task(id)
            update_status_for_task(id, result1, result2)
        except Exception:
            update_status_for_task(id, 'FAIL')
#router.post("/task")
async def trigger_task(background_tasks: BackgroundTasks):
background_tasks.add_task(EventsTrigger.trigger_task)
return {'msg': 'Task submitted!'}
When I trigger this endpoint, I expect an instant response: {'msg': 'Task submitted!'}. Instead, the API response is delayed until the task completes. I am following this documentation from FastAPI.
fastapi: v0.70.0
python: v3.8.10
I believe the issue is similar to what is described here.
I'd appreciate help in making this a non-blocking call.
What I have learned from the GitHub issues:
You can't use async def for task functions (which will run in the background).
In the background process you can't access the coroutine, so your async/await will not work.
You can still try without async/await, as sketched below. If that also doesn't work, you should go for an alternative.
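A minimal sketch of that sync variant, assuming the question's status helpers are plain blocking functions; the stub bodies and task_sync() are hypothetical stand-ins:

import time

# Sketch only: the helpers below stand in for the question's own functions,
# and task_sync() is a hypothetical synchronous version of task().
def add_status_for_task(id):
    print('status added for %s' % id)

def update_status_for_task(id, *results):
    print('status for %s updated: %s' % (id, results))

def task_sync(id):
    time.sleep(5)  # simulate the resource-intensive work
    return 'r1', 'r2'

# Plain def, no async: FastAPI runs sync background tasks in a threadpool,
# so a blocking body like this does not stall the event loop.
def trigger_task_sync(id: str, run_e2e: bool = False):
    try:
        add_status_for_task(id)
        result1, result2 = task_sync(id)
        update_status_for_task(id, result1, result2)
    except Exception:
        update_status_for_task(id, 'FAIL')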
Alternative Background Solution
Celery is a production-ready task queue, so you can easily configure and run the background task using your_task_function.delay(*args, **kwargs).
Note that Celery also doesn't support async in background tasks, so whatever you write to run in the background needs to be sync code.
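For example, a minimal sketch (the broker URL and task body are illustrative assumptions, not part of the original question):

# tasks.py - minimal Celery setup; point the broker at your own Redis/RabbitMQ.
from celery import Celery

app = Celery('tasks', broker='redis://localhost:6379/0')

@app.task
def trigger_task(id: str, run_e2e: bool = False):
    # Must be sync code: a Celery worker runs this outside the web process.
    print('running task %s' % id)

# From the FastAPI endpoint, enqueue without blocking the request:
# trigger_task.delay('some-id')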
Good Luck :)
Unfortunately you seem to have oversimplified your example, so it is a little hard to tell what is going wrong.
But the important question is: are add_status_for_task() or update_status_for_task() blocking? Because if they are (and it seems like that is the case), then obviously you're going to have issues. When you run code with async/await, all the code inside of it needs to be async as well.
This would make your code look more like:
async def trigger_task(id: str, run_e2e: bool = False):
    try:
        await add_status_for_task(id)
        result1, result2 = await task(id)
        await update_status_for_task(id, result1, result2)
    except Exception:
        await update_status_for_task(id, 'FAIL')
#router.post("/task/{task_id}")
async def trigger_task(task_id: str, background_tasks: BackgroundTasks):
background_tasks.add_task(EventsTrigger.trigger_task, task_id)
return {'msg': 'Task submitted!'}
How are you running your app?
According to the uvicorn docs, it runs with 1 worker by default, which means only one process will serve requests at a time.
Try configuring your uvicorn to run with more workers.
https://www.uvicorn.org/deployment/
$ uvicorn example:app --port 5000 --workers THE_AMOUNT_OF_WORKERS
or
uvicorn.run("example:app", host="127.0.0.1", port=5000, workers=THE_AMOUNT_OF_WORKERS)
This question really concerns the various coroutines in base_events.py and streams.py that deal with network connections, network servers, and their higher-level API equivalents under Streams. Since it is not really clear how to group these functions, I am going to use start_server() to explain what I don't understand about these coroutines and haven't found online (unless I missed something obvious).
When running the following code, I am able to create a server that handles incoming messages from a client, and I also periodically print out the tasks that the event loop is handling to see how the tasks work. What I'm surprised about is that not long after the program starts, the server's task is in the finished state. I expected that a task in the finished state was a completed task that no longer does anything other than pass back its result or exception.
However, this is clearly not true: the event loop is still running and handling incoming messages from clients, and the application is still running. The monitor, however, shows that those tasks are finished and no new task is dispatched to handle a new incoming message.
So my question is this:
What is going on underneath asyncio that I am missing and that explains the behavior I am seeing? For example, I would have expected a task (or a task created for each message) handling incoming messages to be in the pending state.
Why is asyncio.Task.all_tasks() passing back finished tasks? I would have thought that once a task has completed it is garbage collected (as long as nothing else references it).
I have seen similar behavior with other asyncio functions, such as using create_connection() with a websocket from a site. I know that at the end of these coroutines the result is usually a tuple such as (reader, writer) or (transport, protocol), but I don't understand how it all ties together or what other documentation/code to read to give me more insight. Any help is appreciated.
import asyncio
from pprint import pprint

async def echo_message(reader, writer):
    data = await reader.read(1000)
    message = data.decode()
    addr = writer.get_extra_info('peername')
    print('Received %r from %r' % (message, addr))
    print('Send: %r' % message)
    writer.write(message.encode())
    await writer.drain()
    print('Close the client socket')
    writer.close()

async def monitor():
    # Periodically dump the set of tasks known to the event loop.
    while True:
        tasks = asyncio.Task.all_tasks()
        pprint(tasks)
        await asyncio.sleep(60)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.create_task(monitor())
    loop.create_task(asyncio.start_server(echo_message, 'localhost', 7777, loop=loop))
    loop.run_forever()
Outputs:
###
# Soon after starting the application, monitor prints out:
###
{<Task pending coro=<start_server() running ...>,
<Task pending coro=<monitor() running ...>,
<Task pending coro=<BaseEventLoop._create_server_getaddrinfo() running ...>}
###
# After things have initialized and the server has started, the next printout is:
###
{<Task finished coro=<start_server() done ...>,
<Task pending coro=<monitor() running ...>,
<Task finished coro=<BaseEventLoop._create_server_getaddrinfo() done ...>}
I am running a development server locally
python manage.py runserver 8000
Then I run a script which connects to the consumer below:
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        import time
        time.sleep(99999999)
        await self.accept()
Everything runs fine and the consumer sleeps for a long time, as expected. However, I am not able to access http://127.0.0.1:8000/ from the browser.
The problem is bigger in real life, since the consumer needs to make an HTTP request to the same server - and essentially ends up in a deadlock.
Is this the expected behaviour? How do I allow calls to my server while a slow consumer is running?
Since this is an async function, you should be using asyncio's sleep.
import asyncio
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await asyncio.sleep(99999999)
        await self.accept()
If you use time.sleep, you will block the entire Python thread.
This also applies when you make your upstream HTTP request: you need to use an asyncio HTTP library, not a synchronous one. (Basically, you should be awaiting anything that is expected to take any time.)
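For example, a sketch using aiohttp (the library choice and the target URL are illustrative assumptions, not part of the original setup):

import aiohttp
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await self.accept()
        # Awaited, non-blocking HTTP request: the event loop stays free to
        # serve other connections while waiting for the response.
        async with aiohttp.ClientSession() as session:
            async with session.get('http://127.0.0.1:8000/') as resp:
                body = await resp.text()
        await self.send_json({'status': resp.status, 'length': len(body)})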
I have an async Task with a method signature defined like this:
public async Task<bool> HandleFooAsync()
When executing this task asynchronously and discarding the result, exceptions happening inside the task do not show up in our logs.
_ = _slackResponseService.HandleFooAsync();
When I await the execution of the task, I do see the error in our logs:
var result = await _slackResponseService.HandleFooAsync();
Is this expected behaviour? Is there a way to achieve a solution in between: "do not wait for the result, but log errors nevertheless"? We invested hours debugging our logging setup, just to learn that the setup is correct, but that a discard in .NET means everything is discarded - even the exceptions we would have logged. That was quite a new perspective for us, coming from a Python background.
Our logging setup follows the default logging setup for dotnet core 3 https://learn.microsoft.com/en-us/aspnet/core/fundamentals/logging/?view=aspnetcore-3.1
Yes, it is expected behavior. A call made that way can be considered an anti-pattern. You can read about it in C# Async Antipatterns.
You need something called "fire and forget". One implementation can be found in the AsyncAwaitBestPractices repo (available on NuGet too).
A Task in .NET and .NET Core is meant to be awaited. If it is not awaited, the scope might be destroyed before the async method has finished.
If you want to run tasks in the background and not wait for a result, you can use BackgroundService in .NET Core, or a third party such as Hangfire, which supports fire-and-forget jobs out of the box.
https://medium.com/@daniel.sagita/backgroundservice-for-a-long-running-work-3debe8f8d25b
https://www.hangfire.io/
One solution is to subscribe to the TaskScheduler.UnobservedTaskException event. It is not ideal because the event is raised when the faulted Task is garbage collected, which may happen long after the actual fault.
Another solution could be to use an extension method every time a task is fired and forgotten. Like this:
_slackResponseService.HandleFooAsync().FireAndForget();
Here is a basic implementation of the FireAndForget method:
public static async void FireAndForget(this Task task)
{
    try
    {
        await task;
    }
    catch (Exception ex)
    {
        // Log the exception here, e.g. via your logging framework.
    }
}
Let's say I have 10 domains, and every domain needs a delay between requests (to avoid DoS situations and IP banning).
I was thinking about async Twisted code that calls into a class: a request (made with the requests module) gets delay(500), then another request to the same domain gets delay(250), and so on.
How do I achieve that static delay, and store somewhere something like a queue for every domain (class)?
It's a custom web scraper; Twisted is TCP-based, but this shouldn't make a difference. I don't want the code, but the knowledge.
While using asyncio for async:
import asyncio

async def nested(x):
    print(x)
    await asyncio.sleep(1)

async def main():
    # Schedule nested() to run soon concurrently
    # with "main()".
    for x in range(100):
        await asyncio.sleep(1)
        task = asyncio.create_task(nested(x))
        # "task" can now be used to cancel "nested()", or
        # can simply be awaited to wait until it is complete:
        await task

asyncio.run(main())
With await task in main(), it will print every 2 s.
Without the await in nested(), it will print every 1 s.
Without await task in main(), the prints are no longer spaced by nested()'s sleep, even though asyncio.sleep is declared.
It is quite hard to reason about if you are new to async.
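For the per-domain delay the question asks about, here is a minimal asyncio sketch (an illustration rather than Twisted: DomainThrottle, the 0.5 s delay, and the fetch() stub are all invented for the example):

import asyncio
import time

class DomainThrottle:
    # One instance per domain: serializes requests to that domain and
    # enforces a fixed minimum delay between them.
    def __init__(self, delay: float):
        self.delay = delay
        self.lock = asyncio.Lock()
        self.last_request = 0.0

    async def wait(self):
        async with self.lock:
            remaining = self.delay - (time.monotonic() - self.last_request)
            if remaining > 0:
                await asyncio.sleep(remaining)
            self.last_request = time.monotonic()

throttles = {}  # maps domain -> DomainThrottle

async def fetch(domain: str, path: str):
    throttle = throttles.setdefault(domain, DomainThrottle(delay=0.5))
    await throttle.wait()
    print('%.2f requesting %s%s' % (time.monotonic(), domain, path))
    # ...perform the actual request with an async HTTP client here...

async def main():
    # Requests to the same domain are spaced 0.5 s apart;
    # different domains proceed independently.
    await asyncio.gather(*(fetch('example.com', '/page/%d' % i) for i in range(3)))

asyncio.run(main())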
I am trying to use asyncio and the await/async keywords with Python 3.5.
I'm fairly new to asynchronous programming in Python; most of my experience with it has been in NodeJS. I seem to be doing everything right except for calling my startup function to initiate the program.
Below is some fictitious code that waters it down to where my confusion is, because my code base is rather large and consists of several local modules.
import asyncio

async def get_data():
    foo = await <retrieve some data>
    return foo

async def run():
    await get_data()

run()
but I receive this asyncio warning:
RuntimeWarning: coroutine 'run' was never awaited
I understand what this warning is telling me, but I'm confused as to how I am supposed to await the call to the function in order to run my program.
You should create an event loop manually and run the coroutine in it, as shown in the documentation:
import asyncio

async def hello_world():
    print("Hello World!")

loop = asyncio.get_event_loop()
loop.run_until_complete(hello_world())
loop.close()
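Applied to the question's example, the same pattern would be (a sketch; run() is the coroutine from the snippet above):

loop = asyncio.get_event_loop()
loop.run_until_complete(run())
loop.close()

On Python 3.7+ the shorthand asyncio.run(run()) does the same thing, but since the question targets Python 3.5, the explicit event loop is the way to go.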