Problem with async iterations in telethon - asynchronous

I have an async Telethon program that runs iterations inside a TelegramClient class. The problem is that the connection to Telegram starts only after all the iterations have finished, for example after 5 minutes. Is it possible to make the connection start immediately?

Related

Polling multiple SQS messages using Airflow SQSSensor

I am using these SQSSensor settings to poll messages:
fetch_sqs_message = SQSSensor(
    task_id="...",
    sqs_queue="...",
    aws_conn_id="aws_default",
    max_messages=10,
    wait_time_seconds=30,
    poke_interval=60,
    timeout=300,
    dag=dag
)
I would assume that every time it polls, it should fetch up to 10 messages. My queue had around 5 when I tested this.
But each time I trigger the DAG, it only polls 1 message at a time, which I found out from the SQS message count.
Why is it doing this? How can I get it to poll as many messages as possible?
Recently, a new feature has been added to SQSSensor so that the sensor can poll SQS multiple times instead of only once.
You can check out this merged PR
For example, if num_batches is set to 3, SQSSensor will poll the queue 3 times before returning the results.
Disclaimer: I contributed to this feature.
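To illustrate the idea of polling in several batches, here is a minimal pure-Python sketch. It does not use Airflow or boto3: FakeQueue, per_call_limit and poll_batches are hypothetical names invented for this example, and the one-message-per-call limit simply mimics what the question observed.

```python
from collections import deque


class FakeQueue:
    """Hypothetical stand-in for an SQS queue; not the boto3 or Airflow API."""

    def __init__(self, messages, per_call_limit=1):
        self._messages = deque(messages)
        # Mimics a receive call handing back fewer messages than requested.
        self._per_call_limit = per_call_limit

    def receive(self, max_messages):
        count = min(max_messages, self._per_call_limit, len(self._messages))
        return [self._messages.popleft() for _ in range(count)]

    def __len__(self):
        return len(self._messages)


def poll_batches(queue, max_messages, num_batches):
    """Call receive() num_batches times and return everything collected."""
    collected = []
    for _ in range(num_batches):
        collected.extend(queue.receive(max_messages))
    return collected


queue_once = FakeQueue([f"msg-{i}" for i in range(5)])
single = poll_batches(queue_once, max_messages=10, num_batches=1)

queue_thrice = FakeQueue([f"msg-{i}" for i in range(5)])
triple = poll_batches(queue_thrice, max_messages=10, num_batches=3)

print(len(single), len(triple))  # 1 3
```

With one batch only a single message comes back even though max_messages is 10; with three batches the same queue yields three messages in one run.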

start_server() method in python asyncio module

This question is really about the various coroutines in base_events.py and streams.py that deal with network connections, network servers and their higher-level API equivalents under Streams. Since it's not really clear how to group these functions, I am going to use start_server() to explain what I don't understand about these coroutines and haven't found online (unless I missed something obvious).
When running the following code, I am able to create a server that is able to handle incoming messages from a client and I also periodically print out the number of tasks that the EventLoop is handling to see how the tasks work. What I'm surprised about is that after creating a server, the task is in the finished state not too long after the program starts. I expected that a task in the finished state was a completed task that no longer does anything other than pass back the results or exception.
However, of course this is not true, the EventLoop is still running and handling incoming messages from clients and the application is still running. Monitor however shows that all tasks are completed and no new task is dispatched to handle a new incoming message.
So my question is this:
What is going on underneath asyncio that I am missing that explains the behavior I am seeing? For example, I would have expected a task (or tasks created for each message) that is handling incoming messages in the pending state.
Why is asyncio.Task.all_tasks() passing back finished tasks? I would have thought that once a task has completed it is garbage collected (so long as there are no other references to it).
I have seen similar behavior with the other asyncio functions like using create_connection() with a websocket from a site. I know at the end of these coroutines, their result is usually a tuple such as (reader, writer) or (transport, protocol) but I don't understand how it all ties together or what other documentation/code to read to give me more insight. Any help is appreciated.
import asyncio
from pprint import pprint


async def echo_message(reader, writer):
    # Client-connected callback: asyncio passes only (reader, writer).
    data = await reader.read(1000)
    message = data.decode()
    addr = writer.get_extra_info('peername')
    print('Received %r from %r' % (message, addr))
    print('Send: %r' % message)
    writer.write(message.encode())
    await writer.drain()
    print('Close the client socket')
    writer.close()


async def monitor():
    while True:
        # asyncio.Task.all_tasks() is the pre-3.9 API (deprecated in 3.7,
        # removed in 3.9); it also lists tasks that have already finished.
        tasks = asyncio.Task.all_tasks()
        pprint(tasks)
        await asyncio.sleep(60)


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.create_task(monitor())
    # loop is a keyword-only argument of start_server() in these versions.
    loop.create_task(asyncio.start_server(echo_message, 'localhost', 7777, loop=loop))
    loop.run_forever()
Outputs:
###
# Soon after starting the application, monitor prints out:
###
{<Task pending coro=<start_server() running ...>,
<Task pending coro=<monitor() running ...>,
<Task pending coro=<BaseEventLoop._create_server_getaddrinfo() running ...>}
###
# After, things initialized and the server has started and the next print out is:
###
{<Task finished coro=<start_server() done ...>,
<Task pending coro=<monitor() running ...>,
<Task finished coro=<BaseEventLoop._create_server_getaddrinfo() done ...>}
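One way to read the output above: start_server() only creates the listening socket and registers it with the event loop, so the task wrapping it finishes as soon as setup is done, while the loop keeps calling the callback for each new connection. The sketch below checks this on current Python versions; the names demo and setup, and the use of port 0 to pick a free port, are choices made for this example only.

```python
import asyncio


async def echo_message(reader, writer):
    # Runs once per client connection, after start_server() has returned.
    data = await reader.read(1000)
    writer.write(data)
    await writer.drain()
    writer.close()


async def demo():
    # Wrap start_server() in a task, as the question does.
    setup = asyncio.create_task(
        asyncio.start_server(echo_message, "127.0.0.1", 0)
    )
    server = await setup  # the setup task is finished from here on
    port = server.sockets[0].getsockname()[1]

    # The server still answers although its setup task is already done.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"hello")
    await writer.drain()
    reply = await reader.read(1000)
    writer.close()

    server.close()
    return setup.done(), reply


setup_done, reply = asyncio.run(demo())
print(setup_done, reply)
```

The handler for a connection lives only for the duration of that connection, so a monitor that prints once every 60 seconds will rarely catch it in the pending state.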

C# Tasks created by async/await are not creating separate Threads. How else does it work to have a picture in mind?

According to "If async-await doesn't create any additional threads, then how does it make applications responsive?",
a C# task, executed by await ... doesn't create a separate thread for the target Task. However, I observed that such a task is not always executed on the same thread, but can switch its thread.
I still do not understand what's going on.
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Threading.Tasks;

public class TestProgram
{
    private static async Task HandleClient(TcpClient clt)
    {
        using NetworkStream ns = clt.GetStream();
        using StreamReader sr = new StreamReader(ns);
        while (true)
        {
            string msg = await sr.ReadLineAsync();
            if (msg == null) break; // ReadLineAsync returns null once the client disconnects
            Console.WriteLine($"Received in {System.Threading.Thread.CurrentThread.ManagedThreadId} :({msg.Length} bytes):\n{msg}");
        }
    }

    private static async Task AcceptConnections(int port)
    {
        TcpListener listener = new TcpListener(IPAddress.Parse("127.0.0.1"), port);
        listener.Start();
        while (true)
        {
            var client = await listener.AcceptTcpClientAsync().ConfigureAwait(false);
            Console.WriteLine($"Accepted connection for port {port}");
            // Not awaited on purpose: each client is handled concurrently.
            var task = HandleClient(client);
        }
    }

    public async static Task Main(string[] args)
    {
        var task1 = AcceptConnections(5000);
        var task2 = AcceptConnections(5001);
        await Task.WhenAll(task1, task2).ConfigureAwait(false);
    }
}
This example code creates two listeners, for ports 5000 and 5001. Each of them can accept multiple connections and read independently from the socket created.
Maybe it is not "nice", but it works, and I observed that messages received from different sockets are sometimes handled on the same thread, and that the thread used for execution even changes.
Accepted connection for port 5000
Accepted connection for port 5000
Accepted connection for port 5001
Received new message in 5 :(17 bytes):
Port-5000 Message from socket-1
Received new message in 7 :(18 bytes):
Port-5000 Message from socket-1
Received new message in 7 :(18 bytes):
Port-5000 Message from socket-1
Received new message in 7 :(20 bytes):
Port-5000 Message from socket-2
Received new message in 7 :(18 bytes):
Port-5000 Message from socket-2
Received new message in 7 :(18 bytes):
Port-5001 Message from socket-3
Received new message in 8 :(17 bytes):
Port-5001 Message from socket-3
(text manually edited for clarity; byte lengths are not accurate)
If there is heavy load (I haven't tested it yet), how many threads would be involved in executing those parallel tasks? I have heard about a thread pool, but do not know how to influence it.
Or is it totally wrong to ask that, and I do not have to care at all about which particular thread is used and how many of them are involved?
> a C# task, executed by await ... doesn't create a separate thread for the target Task.
One important correction: a task is not "executed" by await. Asynchronous tasks are already in-progress by the time they're returned. await is used by the consuming code to perform an "asynchronous wait"; i.e., pause the current method and resume it when that task has completed.
> I observed that such a task is not always executed on the same thread, but can switch its thread.
> I observed that messages received from different sockets are sometimes handled on the same thread, and that the thread used for execution even changes.
The task isn't "executed" anywhere. But the code in the async method does have to run, and it has to run on a thread. await captures a "context" when it pauses the method, and when the task completes it uses that context to resume executing the method. Console apps don't have a context, so the method resumes on any available thread pool thread.
> If there is heavy load (I haven't tested it yet), how many threads would be involved in executing those parallel tasks? I have heard about a thread pool, but do not know how to influence it.
> Or is it totally wrong to ask that, and I do not have to care at all about which particular thread is used and how many of them are involved?
You usually do not have to know; as long as your code isn't blocking thread pool threads you're generally fine. It's important to note that zero threads are being used while doing I/O, e.g., while listening/accepting a new TCP socket. There's no thread being blocked there. Thread pool threads are only borrowed when they're needed.
For the most part, you don't have to worry about it. But if you need to, the thread pool has several knobs for tweaking.

Channels consumer blocks normal HTTP in Django?

I am running a development server locally
python manage.py runserver 8000
Then I run a script that connects to the consumer below:
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        import time
        time.sleep(99999999)
        await self.accept()
Everything runs fine and the consumer sleeps for a long time as expected. However I am not able to access http://127.0.0.1:8000/ from the browser.
The problem is bigger in real life, since the consumer needs to make an HTTP request to the same server and essentially ends up in a deadlock.
Is this the expected behaviour? How do I allow calls to my server while a slow consumer is running?
Since this is an async function, you should be using asyncio's sleep:
import asyncio
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class MyConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        await asyncio.sleep(99999999)
        await self.accept()
If you use time.sleep, you put the entire Python thread to sleep, so nothing else that runs on it can make progress.
The same applies to your upstream HTTP request: you need to use an asyncio HTTP library, not a synchronous one. (Basically, you should be awaiting anything that is expected to take any time.)
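The difference can be seen without Django at all. The following is a small sketch using plain asyncio; blocking_handler, cooperative_handler, other_request and delay_for are hypothetical names for this example, and other_request stands in for the unrelated HTTP request that could not get through.

```python
import asyncio
import time


async def blocking_handler():
    time.sleep(0.3)  # blocks the event loop thread; nothing else can run


async def cooperative_handler():
    await asyncio.sleep(0.3)  # hands control back to the loop while waiting


async def other_request(started_at):
    # Stands in for an unrelated request served by the same event loop.
    started_at.append(time.monotonic())


async def delay_for(handler):
    """Return how long other_request had to wait before it could start."""
    started_at = []
    begin = time.monotonic()
    await asyncio.gather(handler(), other_request(started_at))
    return started_at[0] - begin


blocked = asyncio.run(delay_for(blocking_handler))
cooperative = asyncio.run(delay_for(cooperative_handler))
print(f"time.sleep:    other request waited {blocked:.2f}s")
print(f"asyncio.sleep: other request waited {cooperative:.2f}s")
```

With time.sleep the second coroutine cannot start until the sleep is over; with asyncio.sleep it starts almost immediately.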

async twisted, synchronous requests per domain (with delay)

Let's say I have 10 domains, but every domain needs a delay between requests (to avoid DoS situations and IP banning).
I was thinking about async Twisted calling a class in which requests made with the requests module get delay(500), but then another request to the same domain makes it delay(250), and so on.
How can I achieve that fixed delay and store something like a queue for every domain (class)?
It's a custom web scraper; Twisted is TCP, but this shouldn't make a difference. I don't want the code, but knowledge.
When using asyncio for async:
import asyncio

async def nested(x):
    print(x)
    await asyncio.sleep(1)

async def main():
    # Schedule nested() to run soon concurrently
    # with "main()".
    for x in range(100):
        await asyncio.sleep(1)
        task = asyncio.create_task(nested(x))
        # "task" can now be used to cancel "nested()", or
        # can simply be awaited to wait until it is complete:
        await task

asyncio.run(main())
With await in main(), it prints every 2 s.
Without the await in nested(), it prints every 1 s.
Without await task in main(), it prints every 0 s, even though asyncio.sleep is declared.
This is really hard to maintain if you are new to async.
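For the per-domain delay the question asks about, one possible shape in asyncio (not Twisted) is a lock plus a "last request" timestamp per domain. This is only a sketch under assumptions: DomainThrottle, wait_turn and fetch are hypothetical names, the example URLs are placeholders, and the actual HTTP call is left out.

```python
import asyncio
import time
from collections import defaultdict
from urllib.parse import urlparse


class DomainThrottle:
    """Hypothetical helper: enforces a minimum gap between requests per domain."""

    def __init__(self, delay):
        self.delay = delay
        self._locks = defaultdict(asyncio.Lock)  # one lock per domain
        self._last = {}  # domain -> time of the last request

    async def wait_turn(self, url):
        domain = urlparse(url).netloc
        # The lock serialises callers for the same domain only.
        async with self._locks[domain]:
            last = self._last.get(domain)
            if last is not None:
                remaining = self.delay - (time.monotonic() - last)
                if remaining > 0:
                    await asyncio.sleep(remaining)
            self._last[domain] = time.monotonic()


async def fetch(throttle, url, log):
    await throttle.wait_turn(url)
    # A real scraper would issue the HTTP request here; this sketch only logs.
    log.append((urlparse(url).netloc, time.monotonic()))


async def main():
    throttle = DomainThrottle(delay=0.2)
    log = []
    urls = [
        "http://a.example/1",
        "http://a.example/2",
        "http://b.example/1",
        "http://a.example/3",
    ]
    await asyncio.gather(*(fetch(throttle, u, log) for u in urls))
    return log


log = asyncio.run(main())
for domain, stamp in log:
    print(domain, round(stamp - log[0][1], 2))
```

Requests to a.example end up spaced by roughly the configured delay, while the request to b.example is not held back by them.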
