Can't run asyncio.wait(...) on a list of futures - python-3.6

I'm submitting coroutines to an event loop running in a separate thread. This works well when I wait on each future in sequence with future.result(). Now I want to wait on the first completed future in a list of futures. I'm trying to use asyncio.wait(...) for that, but I appear to be using it incorrectly.
Below is a simplified example. I'm getting the exception TypeError: An asyncio.Future, a coroutine or an awaitable is required at the line done, pending = future.result().
This works if I pass [c1, c2, c3] to asyncio.wait([c1, c2, c3], return_when=asyncio.FIRST_COMPLETED), but I am submitting tasks at random times, so I can only gather the set of futures, not the original tasks. And the documentation clearly states that you can use futures.
coroutine asyncio.wait(futures, *, loop=None, timeout=None, return_when=ALL_COMPLETED)
Wait for the Futures and coroutine objects given by the sequence futures to complete. Coroutines will be wrapped in Tasks. Returns two sets of Future: (done, pending).
import asyncio
import threading
async def generate():
    await asyncio.sleep(10)
    return 'Hello'
def run_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()
event_loop = asyncio.get_event_loop()
threading.Thread(target=lambda: run_loop(event_loop)).start()
c1 = generate() # submitted at a random time
c2 = generate() # submitted at a random time
c3 = generate() # submitted at a random time
f1 = asyncio.run_coroutine_threadsafe(c1, event_loop)
f2 = asyncio.run_coroutine_threadsafe(c2, event_loop)
f3 = asyncio.run_coroutine_threadsafe(c3, event_loop)
all_futures = [f1, f2, f3]
# I'm doing something wrong in these 3 lines
waitable = asyncio.wait(all_futures, return_when=asyncio.FIRST_COMPLETED)
future = asyncio.run_coroutine_threadsafe(waitable, event_loop)
done, pending = future.result() # This returns my TypeError exception
for d in done:
    print(d.result())

asyncio.wait expects asyncio futures and works inside an event loop. To wait for multiple concurrent.futures futures (and outside of an event loop), use concurrent.futures.wait instead:
done, pending = concurrent.futures.wait(
    all_futures, return_when=concurrent.futures.FIRST_COMPLETED)
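A self-contained sketch of this approach follows. The helper name wait_for_first, the delay parameter of generate, and the short delays are illustrative assumptions, not part of the question's code. It shows that the futures returned by run_coroutine_threadsafe can be passed to concurrent.futures.wait from the calling thread:

```python
import asyncio
import concurrent.futures
import threading


async def generate(i, delay):
    # Stand-in for real work; `delay` is an illustrative parameter.
    await asyncio.sleep(delay)
    return 'Hello {}'.format(i)


def wait_for_first():
    # Run a fresh event loop in a background thread.
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        futures = [
            asyncio.run_coroutine_threadsafe(generate(i, delay), loop)
            for i, delay in [(1, 0.3), (2, 0.03), (3, 0.3)]
        ]
        # Blocks the calling thread (not the event loop) until one finishes.
        done, pending = concurrent.futures.wait(
            futures, return_when=concurrent.futures.FIRST_COMPLETED)
        first = [f.result() for f in done]
        # Let the remaining futures finish so the loop can shut down cleanly.
        rest = [f.result() for f in pending]
        return first, rest
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
```

Calling wait_for_first() returns two lists: the results that were ready when the first future completed, and the results of the futures that were still pending at that point.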
Note that your idea would have worked if you had access to the underlying asyncio futures. For example (untested):
async def submit(coro):
    # submit the coroutine and return the asyncio task (future)
    # (asyncio.create_task requires Python 3.7+; on 3.6 use asyncio.ensure_future)
    return asyncio.create_task(coro)
# ...generate() as before
# note that we use result() to get to the asyncio futures:
f1 = asyncio.run_coroutine_threadsafe(submit(c1), event_loop).result()
f2 = asyncio.run_coroutine_threadsafe(submit(c2), event_loop).result()
f3 = asyncio.run_coroutine_threadsafe(submit(c3), event_loop).result()
# these should be waitable by submitting wait() to the event loop
done, pending = asyncio.run_coroutine_threadsafe(
    asyncio.wait([f1, f2, f3], return_when=asyncio.FIRST_COMPLETED),
    event_loop).result()

This answer was instrumental in helping answer this:
Create generator that yields coroutine results as the coroutines finish
asyncio.wait(...) can't take the futures returned by asyncio.run_coroutine_threadsafe, because those are concurrent.futures.Future objects rather than asyncio futures; it accepts only asyncio futures, tasks, and coroutines. One way to handle this is with a callback: when a future finishes, it adds itself to a thread-safe queue, and you pull from that queue. The example below fixes the problem in the question:
import asyncio
import threading
import queue
import random
async def generate(i):
    await asyncio.sleep(random.randint(5, 10))
    return 'Hello {}'.format(i)
def run_loop(loop):
    asyncio.set_event_loop(loop)
    loop.run_forever()
def done(fut):
    # called when a future finishes; hands it to the consumer thread
    q.put(fut)
event_loop = asyncio.get_event_loop()
threading.Thread(target=lambda: run_loop(event_loop)).start()
q = queue.Queue()
c1 = generate(1)
c2 = generate(2)
c3 = generate(3)
f1 = asyncio.run_coroutine_threadsafe(c1, event_loop)
f2 = asyncio.run_coroutine_threadsafe(c2, event_loop)
f3 = asyncio.run_coroutine_threadsafe(c3, event_loop)
f1.add_done_callback(done)
f2.add_done_callback(done)
f3.add_done_callback(done)
# q.get() blocks until the next future has completed
print(q.get().result())
print(q.get().result())
print(q.get().result())
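As a possible alternative to the hand-written queue, the standard library's concurrent.futures.as_completed yields futures in the order they finish. The sketch below assumes a hypothetical helper results_in_completion_order and a delay parameter on generate that are not part of the answer above:

```python
import asyncio
import concurrent.futures
import threading


async def generate(i, delay):
    # Stand-in for real work; `delay` is an illustrative parameter.
    await asyncio.sleep(delay)
    return 'Hello {}'.format(i)


def results_in_completion_order(jobs):
    # `jobs` is a list of (label, delay) pairs (a hypothetical input shape).
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    try:
        futures = [
            asyncio.run_coroutine_threadsafe(generate(i, delay), loop)
            for i, delay in jobs
        ]
        # as_completed blocks the calling thread and yields each future
        # as soon as it has a result.
        return [f.result()
                for f in concurrent.futures.as_completed(futures)]
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
```

With jobs such as [(1, 0.6), (2, 0.1), (3, 0.3)], the shortest delay should come back first. This only covers a list of futures that is known up front; the queue approach still suits futures that are submitted at arbitrary later times.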

Related

Async POST in Python 3.6+ in while loop

The scenario is the following.
I'm capturing frames from a local webcam using OpenCV.
I would like to POST every single frame in a while loop with the following logic:
url = "http://www.to.service"
while True:
    cap = cv2.VideoCapture(0)
    try:
        _, frame = cap.read()
        frame_data = do_something(frame)
        async_post(url, frame_data)
    except KeyboardInterrupt:
        break
I tried with asyncio and aiohttp with the following, but without success
async def post_data(session, url, data):
    async with session.post(url) as response:
        return await response.text()
async def main():
    async with aiohttp.ClientSession() as session:
        cap = cv2.VideoCapture(0)
        while True:
            try:
                _, frame = cap.read()
                frame_data = do_something(frame) # get a dict
                await post_data(session, url, frame_data) # post the dict
            except KeyboardInterrupt:
                break
        cap.release()
        cv2.destroyAllWindows()
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
As far as I understand, the logic presented here might not suit asynchronous requests, because I cannot, in principle, build a list of tasks to be gathered.
I hope this is clear enough. Sorry in advance if it's not.
Any help is much appreciated.
Cheers

async code running synchronously, doesn't seem to have any lines that will be blocking

Running on Windows 10, Python 3.6.3, running inside PyCharm IDE, this code:
import asyncio
import json
import datetime
import time
from aiohttp import ClientSession
async def get_tags():
    url_tags = f"{BASE_URL}tags?access_token={token}"
    async with ClientSession() as session:
        async with session.get(url_tags) as response:
            return await response.read()
async def get_trips(vehicles):
    url_trips = f"{BASE_URL}fleet/trips?access_token={token}"
    for vehicle in vehicles:
        body_trips = {"groupId": groupid, "vehicleId": vehicle['id'], "startMs": int(start_ms), "endMs": int(end_ms)}
        async with ClientSession() as session:
            async with session.post(url_trips, json=body_trips) as response:
                yield response.read()
async def main():
    tags = await get_tags()
    tag_list = json.loads(tags.decode('utf8'))['tags']
    veh = tag_list[0]['vehicles'][0:5]
    return [await v async for v in get_trips(veh)]
t1 = time.time()
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
t2 = time.time()
print(t2 - t1)
seems to be running completely synchronously, time increases linearly as size of loop increases. Following examples from a book I read, "Using Asyncio in Python 3", the code should be asynchronous; am I missing something here? Similar code in C# completes in a few seconds with about 2,000 requests, takes about 14s to run 20 requests here (6s to run 10).
Edit:
re-wrote some code:
async def get_trips(vehicle):
    url_trips = f"{BASE_URL}fleet/trips?access_token={token}"
    #for vehicle in vehicles:
    body_trips = {"groupId": groupid, "vehicleId": vehicle['id'], "startMs": int(start_ms), "endMs": int(end_ms)}
    async with ClientSession() as session:
        async with session.post(url_trips, json=body_trips) as response:
            res = await response.read()
    return res
t1 = time.time()
loop = asyncio.new_event_loop()
x = loop.run_until_complete(get_tags())
tag_list = json.loads(x.decode('utf8'))['tags']
veh = tag_list[0]['vehicles'][0:10]
tasks = []
for v in veh:
    tasks.append(loop.create_task(get_trips(v)))
loop.run_until_complete(asyncio.wait(tasks))
t2 = time.time()
print(t2 - t1)
This is in fact running asynchronously, but now I can't use the return value from my get_trips function and I don't really see a clear way to use it. Nearly all tutorials I see just have the result being printed, which is basically useless. A little confused on how async is supposed to work in Python, and why some things with the async keyword attached to them run synchronously while others don't.
Simple question is: how do I add the return result of a task to a list or dictionary? More advanced question, could someone explain why my code in the first example runs synchronously while the code in the 2nd part runs asynchronously?
Edit 2:
replacing:
loop.run_until_complete(asyncio.wait(tasks))
with:
x = loop.run_until_complete(asyncio.gather(*tasks))
Fixes the simple problem; now just curious why the async list comprehension doesn't run asynchronously
now just curious why the async list comprehension doesn't run asynchronously
Because your comprehension iterates over an async generator which produces a single task, which you then await immediately, thus killing the parallelism. That is roughly equivalent to this:
for vehicle in vehicles:
    trips = await fetch_trips(vehicle)
    # do something with trips
To make it parallel, you can use wait or gather as you've already discovered, but those are not mandatory. As soon as you create a task, it will run in parallel. For example, this should work as well:
# step 1: create the tasks and store (task, vehicle) pairs in a list
tasks = [(loop.create_task(get_trips(v)), v)
         for v in vehicles]
# step 2: await them one by one, while others are running:
for t, v in tasks:
    trips = await t
    # do something with trips for vehicle v
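The difference can be measured with a small stand-in that needs no network access. In this sketch, get_trips is replaced by a hypothetical coroutine that just sleeps for 0.1 s, and the helper names sequential, with_tasks, and timed are made up for illustration:

```python
import asyncio
import time


async def get_trips(vehicle):
    # Stand-in for the HTTP request in the question; the delay is arbitrary.
    await asyncio.sleep(0.1)
    return 'trips for {}'.format(vehicle)


async def sequential(vehicles):
    # Each await finishes before the next coroutine even starts.
    return [await get_trips(v) for v in vehicles]


async def with_tasks(vehicles):
    # ensure_future schedules every coroutine right away...
    tasks = [asyncio.ensure_future(get_trips(v)) for v in vehicles]
    # ...so awaiting them one by one only collects the results.
    return [await t for t in tasks]


def timed(coro_fn, vehicles):
    # Run one variant on its own event loop and report the elapsed time.
    loop = asyncio.new_event_loop()
    try:
        start = time.monotonic()
        result = loop.run_until_complete(coro_fn(vehicles))
        return result, time.monotonic() - start
    finally:
        loop.close()
```

Running timed(sequential, vehicles) and timed(with_tasks, vehicles) on the same five vehicles should give identical results, with the sequential version taking roughly five times as long.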

How can I schedule awaitables for sequential execution without awaiting?

Suppose, I have a few async functions, f1, f2 and f3. I want to execute these functions in a sequential order. The easiest way to do this would be to await on them:
async def foo():
    await f1()
    # Do something else
    await f2()
    # Do something else
    await f3()
    # Do something else
However, I don't care about the results of these async functions, and I would like to continue with the execution of the rest of the function after scheduling the async functions.
From the asyncio tasks documentation it seems that asyncio.ensure_future() can help me with this. I used the following code to test this out, and the synchronous parts of foo() run as per my expectations. However, bar() never executes past asyncio.sleep().
import asyncio
async def bar(name):
    print(f'Sleep {name}')
    await asyncio.sleep(3)
    print(f'Wakeup {name}')
async def foo():
    print('Enter foo')
    for i in [1, 2, 3]:
        asyncio.ensure_future(bar(i))
        print(f'Scheduled bar({i}) for execution')
    print('Exit foo')
loop = asyncio.get_event_loop()
loop.run_until_complete(foo())
The output for the above code:
Enter foo
Scheduled bar(1) for execution
Scheduled bar(2) for execution
Scheduled bar(3) for execution
Exit foo
Sleep 1
Sleep 2
Sleep 3
So, what is the proper method to do what I'm looking for?
I have a few async functions, f1, f2 and f3. I want to execute these functions in a sequential order. [...] I would like to continue with the execution of the rest of the function after scheduling the async functions.
The straightforward way to do this is by using a helper function and letting it run it in the background:
async def foo():
    async def run_fs():
        await f1()
        await f2()
        await f3()
    loop = asyncio.get_event_loop()
    loop.create_task(run_fs())
    # proceed with stuff that foo needs to do
    ...
create_task submits a coroutine to the event loop. You can also use ensure_future for that, but create_task is preferred when spawning a coroutine.
The code in the question has two issues: first, the functions are not run sequentially, but in parallel. This is fixed as shown above, by running a single async function in the background that awaits the three in order. The second problem is that in asyncio run_until_complete(foo()) only waits for foo() to finish, not also for the tasks spawned by foo (though there are asyncio alternatives that address this). If you want run_until_complete(foo()) to wait for run_fs to finish, foo has to await it itself.
Fortunately, that is trivial to implement - just add another await at the end of foo(), awaiting the task created for run_fs earlier. If the task is already done by that point, the await will exit immediately, otherwise it will wait.
async def foo():
    async def run_fs():
        await f1()
        await f2()
        await f3()
    loop = asyncio.get_event_loop()
    f_task = loop.create_task(run_fs())
    # proceed with stuff that foo needs to do
    ...
    # finally, ensure that the fs have finished by the time we return
    await f_task
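To see the ordering this produces, here is a runnable sketch in which f1, f2 and f3 are replaced by a hypothetical step coroutine that records when it starts and ends. The log list and the run_demo helper are illustrative additions:

```python
import asyncio


async def step(name, log):
    # Hypothetical stand-in for f1, f2 and f3 from the question.
    log.append('start ' + name)
    await asyncio.sleep(0.05)
    log.append('end ' + name)


async def foo(log):
    async def run_fs():
        # One background task awaits the steps in order.
        for name in ('f1', 'f2', 'f3'):
            await step(name, log)

    task = asyncio.ensure_future(run_fs())
    # foo carries on immediately; run_fs starts once foo yields to the loop.
    log.append('foo continues')
    # Awaiting the task keeps run_until_complete from returning too early.
    await task
    log.append('foo done')


def run_demo():
    log = []
    loop = asyncio.new_event_loop()
    try:
        loop.run_until_complete(foo(log))
    finally:
        loop.close()
    return log
```

The log returned by run_demo() starts with 'foo continues', then shows each step starting and ending strictly one after another, and ends with 'foo done'.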

why this asynchronous code won't break while loop?

I use tornado asynchronous http client, but it doesn't work.
from tornado.concurrent import Future
from tornado.httpclient import AsyncHTTPClient
import time
def async_fetch_future(url):
    http_client = AsyncHTTPClient()
    my_future = Future()
    fetch_future = http_client.fetch(url)
    fetch_future.add_done_callback(
        lambda f: my_future.set_result(f.result()))
    return my_future
future = async_fetch_future(url)
while not future.done():
    print '.....'
print future.result()
You must run the event loop to allow asynchronous things to happen. You can replace this while loop with print(IOLoop.current().run_sync(lambda: async_fetch_future(url))). Note, however, that manually handling Future objects like this is generally unnecessary: async_fetch_future can return the Future from AsyncHTTPClient.fetch directly, and if it needs to do something else, it would be more idiomatic to decorate async_fetch_future with @tornado.gen.coroutine and use yield.
If you want to do something other than just print dots in the while loop, you should probably use a coroutine that periodically does yield tornado.gen.moment:
@gen.coroutine
def main():
    future = async_fetch_future(url)
    while not future.done():
        print('...')
        yield gen.moment
    print(yield future)
IOLoop.current().run_sync(main)

How to treat and deal with event loops?

Is a loop.close() needed prior to returning async values in the below code?
import asyncio
async def request_url(url):
    return url
def fetch_urls(x):
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(asyncio.gather(*[request_url(url) for url in x]))
That is, should fetch_urls be like this instead?:
def fetch_urls(x):
    loop = asyncio.get_event_loop()
    results = loop.run_until_complete(asyncio.gather(*[request_url(url) for url in x]))
    loop.close()
    return results
If the loop.close() is needed, then how can fetch_urls be called again without raising the exception: RuntimeError: Event loop is closed?
A previous post states that it is good practice to close the loops and start new ones however it does not specify how new loops can be opened?
You can also keep the event loop alive and close it at the end of your program, using run_until_complete more than once:
import asyncio
async def request_url(url):
    return url
def fetch_urls(loop, urls):
    tasks = [request_url(url) for url in urls]
    return loop.run_until_complete(asyncio.gather(*tasks))
loop = asyncio.get_event_loop()
try:
    print(fetch_urls(loop, ['a1', 'a2', 'a3']))
    print(fetch_urls(loop, ['b1', 'b2', 'b3']))
    print(fetch_urls(loop, ['c1', 'c2', 'c3']))
finally:
    loop.close()
No, the async function (request_url in this case) should not be closing the event loop. loop.run_until_complete will stop (but not close) the event loop as soon as the future passed to it completes.
fetch_urls should be the second version: it gets an event loop, runs it until the gathered requests have finished, and then closes it with loop.close().
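On the question of how a new loop can be opened after closing one: asyncio.new_event_loop() creates a fresh loop each time. The following is a minimal sketch of that idea, not the approach of either answer above; the fetch_all wrapper is an illustrative addition that makes sure gather runs inside the new loop:

```python
import asyncio


async def request_url(url):
    return url


async def fetch_all(urls):
    # gather is awaited inside a coroutine so it binds to the running loop.
    return await asyncio.gather(*[request_url(url) for url in urls])


def fetch_urls(urls):
    # Create a fresh loop per call and always close it afterwards.
    loop = asyncio.new_event_loop()
    try:
        return loop.run_until_complete(fetch_all(urls))
    finally:
        loop.close()
```

Because every call builds its own loop, fetch_urls can be called repeatedly without hitting RuntimeError: Event loop is closed. On Python 3.7 and later, asyncio.run(fetch_all(urls)) does much the same thing: it creates a new loop, runs the coroutine, and closes the loop.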
