How to deal with event loops? - cpython

Is a loop.close() needed prior to returning async values in the below code?
import asyncio

async def request_url(url):
    return url

def fetch_urls(x):
    loop = asyncio.get_event_loop()
    return loop.run_until_complete(asyncio.gather(*[request_url(url) for url in x]))
That is, should fetch_urls be like this instead?
def fetch_urls(x):
    loop = asyncio.get_event_loop()
    results = loop.run_until_complete(asyncio.gather(*[request_url(url) for url in x]))
    loop.close()
    return results
If the loop.close() is needed, then how can fetch_urls be called again without raising the exception: RuntimeError: Event loop is closed?
A previous post states that it is good practice to close loops and start new ones, but it does not specify how new loops can be opened.

You can also keep the event loop alive and close it at the end of your program, using run_until_complete more than once:

import asyncio

async def request_url(url):
    return url

def fetch_urls(loop, urls):
    tasks = [request_url(url) for url in urls]
    return loop.run_until_complete(asyncio.gather(*tasks))

loop = asyncio.get_event_loop()
try:
    print(fetch_urls(loop, ['a1', 'a2', 'a3']))
    print(fetch_urls(loop, ['b1', 'b2', 'b3']))
    print(fetch_urls(loop, ['c1', 'c2', 'c3']))
finally:
    loop.close()

No, the async function (request_url in this case) should not be closing the event loop. loop.run_until_complete will stop the event loop as soon as the awaited future completes.
fetch_urls should be the second version -- that is, it should get an event loop, run it until there is nothing left to do, and then close it with loop.close().
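Since this version closes the loop inside fetch_urls, the next call has to open a fresh loop first. A minimal sketch of one way to do that (names taken from the question; this is one approach among several, and on Python 3.7+ asyncio.run() performs the same create-run-close cycle for you):

import asyncio

async def request_url(url):
    return url

async def fetch_all(urls):
    return await asyncio.gather(*[request_url(url) for url in urls])

def fetch_urls(x):
    loop = asyncio.new_event_loop()  # open a brand-new loop for this call
    asyncio.set_event_loop(loop)
    try:
        return loop.run_until_complete(fetch_all(x))
    finally:
        loop.close()  # safe: the next call creates another loop

print(fetch_urls(['a1', 'a2']))
print(fetch_urls(['b1', 'b2']))  # no "Event loop is closed" error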

Related

How to get return values from event loop functions

I am a rookie at async.
So, my goal is to get an array of responses from the event loop. I don't actually understand where the return html_text value goes after all.
Can you, please, correct my code or offer an alternative solution?
urls = ['https://www.google.com', 'https://www.youtube.com/']

async def r_get(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url, headers=headers) as resp:
            html_text = await resp.text(encoding='utf-8')
            return html_text

urls = [asyncio.ensure_future(r_get(url)) for url in urls]
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.gather(*urls))
run_until_complete returns the value returned by the coroutine it runs. In turn, gather returns a list of the return values of the coroutines it awaits to completion. When the two are used together, run_until_complete(gather(a(), b(), c())) will return a list of the values returned by a(), b(), and c() respectively.
In your case, just pick up the results by assigning them to a variable:
results = loop.run_until_complete(asyncio.gather(*urls))
# ...use results...
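Putting it together, a minimal sketch (sharing one ClientSession across requests is a common pattern, not something the answer requires; the question's headers argument is omitted for brevity):

import asyncio
import aiohttp

urls = ['https://www.google.com', 'https://www.youtube.com/']

async def r_get(session, url):
    async with session.get(url) as resp:
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as session:
        tasks = [asyncio.ensure_future(r_get(session, url)) for url in urls]
        return await asyncio.gather(*tasks)  # list of html_text, in request order

loop = asyncio.get_event_loop()
results = loop.run_until_complete(main())
for url, html_text in zip(urls, results):
    print(url, len(html_text))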

Async POST in Python 3.6+ in while loop

The scenario is the following.
I'm capturing frames from a local webcam using OpenCV.
I would like to POST every single frame in a while loop with the following logic:
url = "http://www.to.service"
while True:
    cap = cv2.VideoCapture(0)
    try:
        _, frame = cap.read()
        frame_data = do_something(frame)
        async_post(url, frame_data)
    except KeyboardInterrupt:
        break
I tried with asyncio and aiohttp as follows, but without success:
async def post_data(session, url, data):
    async with session.post(url, json=data) as response:
        return await response.text()

async def main():
    async with aiohttp.ClientSession() as session:
        cap = cv2.VideoCapture(0)
        while True:
            try:
                _, frame = cap.read()
                frame_data = do_something(frame)  # get a dict
                await post_data(session, url, frame_data)  # post the dict
            except KeyboardInterrupt:
                break
        cap.release()
        cv2.destroyAllWindows()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
As far as I understood, the logic presented here might not be adequate for asynchronous requests, since I cannot, in principle, fill a list of tasks to be gathered up front.
I hope this is clear enough. Sorry in advance if it's not.
Any help is much appreciated.
Cheers
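One way around "I cannot fill a list of tasks up front": tasks can also be created on the fly with asyncio.ensure_future, so each frame's POST starts without waiting for the previous one to finish. A rough sketch under the question's assumptions (do_something and the service URL are placeholders from the question; catching KeyboardInterrupt inside a running loop is unreliable and is kept only to mirror the original code):

import asyncio
import aiohttp
import cv2

url = "http://www.to.service"

async def post_data(session, url, data):
    async with session.post(url, json=data) as response:
        return await response.text()

async def main():
    pending = set()
    async with aiohttp.ClientSession() as session:
        cap = cv2.VideoCapture(0)
        try:
            while True:
                _, frame = cap.read()
                frame_data = do_something(frame)  # placeholder from the question
                # start the POST without awaiting it; keep a handle so it can be drained later
                task = asyncio.ensure_future(post_data(session, url, frame_data))
                pending.add(task)
                task.add_done_callback(pending.discard)
                await asyncio.sleep(0)  # yield control so in-flight POSTs make progress
        except KeyboardInterrupt:
            pass
        finally:
            if pending:
                await asyncio.gather(*pending)  # drain the remaining requests
            cap.release()

loop = asyncio.get_event_loop()
loop.run_until_complete(main())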

async code running synchronously, doesn't seem to have any lines that will be blocking

Running on Windows 10, Python 3.6.3, running inside PyCharm IDE, this code:
import asyncio
import json
import datetime
import time
from aiohttp import ClientSession

async def get_tags():
    url_tags = f"{BASE_URL}tags?access_token={token}"
    async with ClientSession() as session:
        async with session.get(url_tags) as response:
            return await response.read()

async def get_trips(vehicles):
    url_trips = f"{BASE_URL}fleet/trips?access_token={token}"
    for vehicle in vehicles:
        body_trips = {"groupId": groupid, "vehicleId": vehicle['id'], "startMs": int(start_ms), "endMs": int(end_ms)}
        async with ClientSession() as session:
            async with session.post(url_trips, json=body_trips) as response:
                yield response.read()

async def main():
    tags = await get_tags()
    tag_list = json.loads(tags.decode('utf8'))['tags']
    veh = tag_list[0]['vehicles'][0:5]
    return [await v async for v in get_trips(veh)]

t1 = time.time()
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
t2 = time.time()
print(t2 - t1)
seems to run completely synchronously: time increases linearly with the size of the loop. Following examples from a book I read, "Using Asyncio in Python 3", the code should be asynchronous; am I missing something here? Similar code in C# completes in a few seconds with about 2,000 requests; here it takes about 14s to run 20 requests (6s to run 10).
Edit:
re-wrote some code:
async def get_trips(vehicle):
    url_trips = f"{BASE_URL}fleet/trips?access_token={token}"
    # for vehicle in vehicles:
    body_trips = {"groupId": groupid, "vehicleId": vehicle['id'], "startMs": int(start_ms), "endMs": int(end_ms)}
    async with ClientSession() as session:
        async with session.post(url_trips, json=body_trips) as response:
            res = await response.read()
            return res

t1 = time.time()
loop = asyncio.new_event_loop()
x = loop.run_until_complete(get_tags())
tag_list = json.loads(x.decode('utf8'))['tags']
veh = tag_list[0]['vehicles'][0:10]
tasks = []
for v in veh:
    tasks.append(loop.create_task(get_trips(v)))
loop.run_until_complete(asyncio.wait(tasks))
t2 = time.time()
print(t2 - t1)
This is in fact running asynchronously, but now I can't use the return value from my get_trips function, and I don't really see a clear way to use it. Nearly all tutorials I see just print the result, which is basically useless. I'm a little confused about how async is supposed to work in Python, and why some things with the async keyword attached run synchronously while others don't.
The simple question is: how do I add the return result of a task to a list or dictionary? The more advanced question: could someone explain why my code in the first example runs synchronously while the code in the second part runs asynchronously?
Edit 2:
replacing:
loop.run_until_complete(asyncio.wait(tasks))
with:
x = loop.run_until_complete(asyncio.gather(*tasks))
fixes the simple problem; now I'm just curious why the async list comprehension doesn't run asynchronously.
"now just curious why the async list comprehension doesn't run asynchronously"
Because your comprehension iterates over an async generator that produces one task at a time, and you await each one immediately, which kills the parallelism. It is roughly equivalent to this:

for vehicle in vehicles:
    trips = await get_trips(vehicle)
    # do something with trips
To make it parallel, you can use wait or gather as you've already discovered, but those are not mandatory. As soon as you create a task, it will run in parallel. For example, this should work as well:
# step 1: create the tasks and store (task, vehicle) pairs in a list
tasks = [(loop.create_task(get_trips(v)), v)
         for v in vehicles]

# step 2: await them one by one, while others are running:
for t, v in tasks:
    trips = await t
    # do something with trips for vehicle v
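Equivalently, if only the results are needed, gather preserves input order, so they can be zipped back to the vehicles afterwards. A short sketch reusing the question's names (it assumes, as the question's body_trips does, that each vehicle dict has an 'id' key):

trips_per_vehicle = loop.run_until_complete(
    asyncio.gather(*(get_trips(v) for v in veh)))
results = dict(zip((v['id'] for v in veh), trips_per_vehicle))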

why won't this asynchronous code break the while loop?

I use the Tornado asynchronous HTTP client, but it doesn't work.
from tornado.httpclient import AsyncHTTPClient
from tornado.concurrent import Future

def async_fetch_future(url):
    http_client = AsyncHTTPClient()
    my_future = Future()
    fetch_future = http_client.fetch(url)
    fetch_future.add_done_callback(
        lambda f: my_future.set_result(f.result()))
    return my_future

future = async_fetch_future(url)
while not future.done():
    print('.....')
print(future.result())
You must run the event loop to allow asynchronous things to happen. You can replace this while loop with print(IOLoop.current().run_sync(lambda: async_fetch_future(url))) (but also note that manually handling Future objects like this is generally unnecessary; async_fetch_future can return the Future from AsyncHTTPClient.fetch directly, and if it needs to do something else it would be more idiomatic to decorate async_fetch_future with @tornado.gen.coroutine and use yield).
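That simplification, as a sketch (the URL is a placeholder):

from tornado.httpclient import AsyncHTTPClient
from tornado.ioloop import IOLoop

def async_fetch_future(url):
    # fetch() already returns a Future, so no manual chaining is needed
    return AsyncHTTPClient().fetch(url)

response = IOLoop.current().run_sync(lambda: async_fetch_future('http://example.com'))
print(response.body)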
If you want to do something other than just print dots in the while loop, you should probably use a coroutine that periodically does yield tornado.gen.moment:
from tornado import gen
from tornado.ioloop import IOLoop

@gen.coroutine
def main():
    future = async_fetch_future(url)
    while not future.done():
        print('...')
        yield gen.moment
    print((yield future))

IOLoop.current().run_sync(main)

HashDict and OTP GenServer context within Elixir

I am having trouble using HashDict within OTP. I would like to use one GenServer process to put and a different one to fetch. When I try to implement this, I can put and fetch items from the HashDict when calling from the same GenServer; it works perfectly (MyServerB in the example below). But when I use one GenServer to put and a different one to fetch, the fetch does not work. Why is this? Presumably it's because I need to pass the HashDict data structure around between the different processes?
Code example below:
I use a simple call to send some state to MyServerB:
MyServerB.add_update(state)
For MyServerB I have implemented the HashDict as follows:
defmodule MyServerB do
  use GenServer

  def start_link do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init([]) do
    # Initialise HashDict to store state
    d = HashDict.new
    {:ok, d}
  end

  # Client API
  def add_update(update) do
    GenServer.cast __MODULE__, {:add, update}
  end

  def get_state(key) do
    GenServer.call __MODULE__, {:get, key}
  end

  # Server callbacks
  def handle_cast({:add, update}, dict) do
    %{key: key} = update
    # store the whole update under its key
    dict = HashDict.put(dict, key, update)
    {:noreply, dict}
  end

  def handle_call({:get, some_key}, _from, dict) do
    value = HashDict.fetch!(dict, some_key)
    {:reply, value, dict}
  end
end
So if from another process I use MyServerB.get_state(some_key), I don't seem to be able to return the contents of the HashDict...
UPDATE:
So if I use ETS I have something like this:
def init(_) do
  ets = :ets.new(:my_table, [:ordered_set, :named_table])
  {:ok, ets}
end

def handle_cast({:add, update}, state) do
  # assuming the update carries a key and a value
  %{key: key, value: value} = update
  :ets.insert(:my_table, {key, value})
  {:noreply, state}
end

def handle_call({:get, some_key}, _from, state) do
  sum = :ets.foldl(fn
    {key, value}, acc when key == some_key -> value + acc
    _, acc -> acc
  end, 0, :my_table)
  {:reply, sum, state}
end
So again, the cast works: when I check with Observer I can see the table filling up with my key-value pairs. However, when I try the call it returns nothing. So I'm wondering if I'm handling the state incorrectly? Any help gratefully received!
Thanks
Your problem is with this statement:
I would like to use one GenServer process to put and a different one to fetch.
In Elixir processes cannot share state, so you cannot have one process own data and another process read it directly. You could, for example, store the HashDict in one process and have the other process send it a message asking for data. That would make it appear as you describe, although behind the scenes every transaction would still go through the first process. There are techniques for doing this in a distributed/concurrent fashion so that multiple cores are utilized, but that may be more work than you're looking to do at the moment.
Take a look at ETS, which will allow you to create a public table and access the data from multiple processes.
ETS is the way to go. Sharing a HashDict as state between GenServers is not possible.
I really don't know how you are testing your code, but ETS by default has read and write concurrency set to false. If you have no problem with reading or writing concurrently, you can change your init function to:
def init(_) do
  ets = :ets.new :my_table, [:ordered_set, :named_table,
                             read_concurrency: true,
                             write_concurrency: true]
  {:ok, ets}
end
Hope this helps.
