How to find the number of active tokio tasks?

I would like to get the count of active running tokio tasks. In Python, I can use len(asyncio.all_tasks()), which returns the unfinished tasks for the current running loop. I would like to know if there is an equivalent in tokio.
Here is some sample code:
use std::time::Duration;
use tokio; // 1.24.1
use tokio::time::sleep;
fn active_tasks() -> usize {
todo!("get active task somehow")
}
#[tokio::main]
async fn main() {
tokio::spawn(async { sleep(Duration::from_secs(5)).await });
tokio::spawn(async { sleep(Duration::from_secs(1)).await });
tokio::spawn(async { sleep(Duration::from_secs(3)).await });
println!("t = 0, running = {}", active_tasks());
sleep(Duration::from_secs(2)).await;
println!("t = 2, running = {}", active_tasks());
sleep(Duration::from_secs(4)).await;
println!("t = 6, running = {}", active_tasks());
}
I expect the program above to print the number of active tasks. Since main itself is a tokio task, I would not be surprised to see the following output:
t = 0, running = 4
t = 2, running = 3
t = 6, running = 1
active_tasks() can be an async function if required.

I was hoping that the unstable RuntimeMetrics would be able to solve this for you, but it seems designed for a different purpose. I don't believe Tokio will be able to handle this for you.
With that said, here's a potential solution to achieve a similar result:
use std::{
future::Future,
sync::{Arc, Mutex},
time::Duration,
};
use tokio::time::sleep;
struct ThreadManager {
thread_count: Arc<Mutex<usize>>,
}
impl ThreadManager {
#[must_use]
fn new() -> Self {
Self {
thread_count: Arc::new(Mutex::new(0)),
}
}
fn spawn<T>(&self, future: T)
where
T: Future + Send + 'static,
T::Output: Send + 'static,
{
// Increment the internal count just before the thread starts.
let count = Arc::clone(&self.thread_count);
*count.lock().unwrap() += 1;
tokio::spawn(async move {
let result = future.await;
// Once we've executed the future, let's decrement this thread.
*count.lock().unwrap() -= 1;
result
});
}
fn thread_count(&self) -> usize {
// Get a copy of the current thread count.
*self.thread_count.lock().unwrap()
}
}
#[tokio::main]
async fn main() {
let manager = ThreadManager::new();
manager.spawn(async { sleep(Duration::from_secs(5)).await });
manager.spawn(async { sleep(Duration::from_secs(1)).await });
manager.spawn(async { sleep(Duration::from_secs(3)).await });
println!("t = 0, running = {}", manager.thread_count());
sleep(Duration::from_secs(2)).await;
println!("t = 2, running = {}", manager.thread_count());
sleep(Duration::from_secs(4)).await;
println!("t = 6, running = {}", manager.thread_count());
}
And the result is:
t = 0, running = 3
t = 2, running = 2
t = 6, running = 0
This does approximately what you're describing. To get a little closer to what you're looking for, you can combine the manager with lazy_static and wrap it in a free function such as spawn. You can also start the counter at 1 to account for the main task.
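One limitation of decrementing after future.await is that the decrement is skipped if the future panics or the task is aborted. A possible variation, sketched below with only the standard library, moves the decrement into a Drop guard. The names TaskCounter and TaskGuard are hypothetical and not part of Tokio.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical shared counter of live tasks (not a Tokio type).
#[derive(Clone, Default)]
pub struct TaskCounter(Arc<AtomicUsize>);

/// Hypothetical guard: decrements the counter when it is dropped.
pub struct TaskGuard(Arc<AtomicUsize>);

impl TaskCounter {
    /// Increments the count and returns a guard that undoes it on drop.
    pub fn enter(&self) -> TaskGuard {
        self.0.fetch_add(1, Ordering::SeqCst);
        TaskGuard(Arc::clone(&self.0))
    }

    /// Returns the number of guards currently alive.
    pub fn active(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }
}

impl Drop for TaskGuard {
    fn drop(&mut self) {
        // Runs on normal completion, on unwinding, and when the owner is dropped early.
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}
```

Under this assumption, the spawn wrapper would call counter.enter() before tokio::spawn and move the guard into the async block (for example, let _guard = guard; as its first statement), so the count drops whenever the future itself is dropped.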

Related

reusing futures::stream::Stream and modifying state of each element

I have a futures::stream::Stream which produces elements in the form <(State, impl std::fmt::Binary)> (Binary is an arbitrary placeholder for a trait I want to use):
let peers = (0..10).map(move |peer| async move {
let delay = core::time::Duration::from_secs(2); // should be 'random'
tokio::time::sleep(delay).await;
if peer % 2 == 0 {
stream::iter(Ok::<i32, std::io::Error>(peer).into_iter())
} else {
let custom_error = std::io::Error::new(std::io::ErrorKind::Other, "oh no!");
stream::iter(Err::<i32, std::io::Error>(custom_error).into_iter())
}
})
.collect::<FuturesUnordered<_>>()
.flatten()
.map(|peer| (State { foo: "foo".into(), bar: "bar".into() }, peer));
The peers above should correspond to a stream of peers which are connected successfully. Since I do not know until runtime how many peers are connected I can't store this in a Vec<(State, impl ...)> or similar.
Is it possible to somehow do a series of tasks concurrently which modifies the State internally for each peer where the task completion is determined by the first peer that completes the task? Similar to a race for each task.
I thought the following might work:
use futures::{
stream::{self, FuturesUnordered},
StreamExt, FutureExt,
};
#[derive(Debug, Clone)]
struct State {
foo: String,
bar: String
}
#[tokio::main]
async fn main() {
let futures = (0..10).map(move |peer| async move {
let mut delay = core::time::Duration::from_secs(2);
if peer == 0 {
delay = core::time::Duration::from_secs(100); // slow peer
}
tokio::time::sleep(delay).await;
if peer % 2 == 0 {
stream::iter(Ok::<i32, std::io::Error>(peer).into_iter())
} else {
let custom_error = std::io::Error::new(std::io::ErrorKind::Other, "oh no!");
stream::iter(Err::<i32, std::io::Error>(custom_error).into_iter())
}
})
.collect::<FuturesUnordered<_>>()
.flatten()
.map(|peer| (State { foo: "foo".into(), bar: "bar".into() }, peer));
// first task
let notify = std::rc::Rc::new(tokio::sync::Notify::new());
let futures = futures.map(|(mut state, x)| {
let notify = notify.clone();
async move {
tokio::select! {
biased;
_ = async {
println!("processing task #1 for peer {:b} with state {:?}", x, state);
let delay = core::time::Duration::from_secs(2);
tokio::time::sleep(delay).await;
state.foo = "test1".to_owned();
notify.notify_waiters();
} => {
(state, x)
}
_ = notify.notified() => { (state, x) }
}
}
}).buffer_unordered(10).collect::<Vec<(State, _)>>().await;
// second task
let notify = std::rc::Rc::new(tokio::sync::Notify::new());
let futures = stream::iter(futures);
let futures = futures.map(|(mut state, x)| {
let notify = notify.clone();
async move {
tokio::select! {
biased;
_ = async {
println!("processing task #2 for peer {:b} with state {:?}", x, state);
let delay = core::time::Duration::from_secs(2);
tokio::time::sleep(delay).await;
notify.notify_waiters();
} => {
(state, x)
}
_ = notify.notified() => { (state, x) }
}
}
}).buffer_unordered(10).collect::<Vec<(State, _)>>().await;
}
But it will be stuck on the first task because it is waiting for the slow peer with 100 seconds delay. Ideally, I want to prematurely finish the collect once the task is done. I have tried using take_until with notify.notified():
let futures = futures.map(|(mut state, x)| {
let notify = notify.clone();
async move {
tokio::select! {
...
}
}
}).buffer_unordered(10).take_until(notify.notified()).collect::<Vec<(State, _)>>().await;
but this will discard the other peers and leave only 1 peer in futures. I think this is because the outer notify.notified() takes precedence over the inner notify.notified() used in the tokio::select! statement.
Is there a way to reuse a futures::stream::Stream and simultaneously modify the elements which I have tried doing above?
Or is there a more idiomatic solution to what I am trying to achieve here?

How to run multiple Tokio async tasks in a loop without using tokio::spawn?

I built an LED clock that also displays weather. My program does a couple of different things in a loop, each with a different interval:
updates the LEDs every 50ms,
checks the light level (to adjust the brightness) every 1 second,
fetches weather every 10 minutes,
actually some more, but that's irrelevant.
Updating the LEDs is the most critical: I don't want this to be delayed when e.g. weather is being fetched. This should not be a problem as fetching weather is mostly an async HTTP call.
Here's the code that I have:
let mut measure_light_stream = tokio::time::interval(Duration::from_secs(1));
let mut update_weather_stream = tokio::time::interval(WEATHER_FETCH_INTERVAL);
let mut update_leds_stream = tokio::time::interval(UPDATE_LEDS_INTERVAL);
loop {
tokio::select! {
_ = measure_light_stream.tick() => {
let light = lm.get_light();
light_smooth.sp = light;
},
_ = update_weather_stream.tick() => {
let fetched_weather = weather_service.get(&config).await;
// Store the fetched weather for later access from the displaying function.
weather_clock.weather = fetched_weather.clone();
},
_ = update_leds_stream.tick() => {
// Some code here that actually sets the LEDs.
// This code accesses the weather_clock, the light level etc.
},
}
}
I realised the code doesn't do what I wanted it to do - fetching the weather blocks the execution of the loop. I see why - the docs of tokio::select! say the other branches are cancelled as soon as the update_weather_stream.tick() expression completes.
How do I do this in such a way that while fetching the weather is waiting on network, the LEDs are still updated? I figured out I could use tokio::spawn to start a separate non-blocking "thread" for fetching weather, but then I have problems with weather_service not being Send, let alone weather_clock not being shareable between threads. I don't want this complication, I'm fine with everything running in a single thread, just like what select! does.
Reproducible example
use std::time::Duration;
use tokio::time::{interval, sleep};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let mut slow_stream = interval(Duration::from_secs(3));
let mut fast_stream = interval(Duration::from_millis(200));
// Note how access to this data is straightforward, I do not want
// this to get more complicated, e.g. care about threads and Send.
let mut val = 1;
loop {
tokio::select! {
_ = fast_stream.tick() => {
println!(".{}", val);
},
_ = slow_stream.tick() => {
println!("Starting slow operation...");
// The problem: During this await the dots are not printed.
sleep(Duration::from_secs(1)).await;
val += 1;
println!("...done");
},
}
}
}
You can use tokio::join! to run multiple async operations concurrently within the same task.
Here's an example:
use std::cell::Cell;
use std::time::Duration;
async fn measure_light(halt: &Cell<bool>) {
while !halt.get() {
let light = lm.get_light();
// ....
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
async fn blink_led(halt: &Cell<bool>) {
while !halt.get() {
// LED blinking code
tokio::time::sleep(UPDATE_LEDS_INTERVAL).await;
}
}
async fn poll_weather(halt: &Cell<bool>) {
while !halt.get() {
let weather = weather_service.get(&config).await;
// ...
tokio::time::sleep(WEATHER_FETCH_INTERVAL).await;
}
}
// example on how to terminate execution
async fn terminate(halt: &Cell<bool>) {
tokio::time::sleep(Duration::from_secs(10)).await;
halt.set(true);
}
#[tokio::main]
async fn main() {
let halt = Cell::new(false);
tokio::join!(
measure_light(&halt),
blink_led(&halt),
poll_weather(&halt),
terminate(&halt),
);
}
If you're using tokio::net::TcpStream or other non-blocking IO, then it should allow for concurrent execution.
I've added a Cell flag for halting execution as an example. You can use the same technique to share any mutable state between join branches.
EDIT: Same thing can be done with tokio::select!. The main difference with your code is that the actual "business logic" is inside the futures awaited by select.
select allows you to drop unfinished futures instead of waiting for them to exit on their own (so halt termination flag is not necessary).
#[tokio::main]
async fn main() {
tokio::select! {
_ = measure_light() => {},
_ = blink_led() => {},
_ = poll_weather() => {},
}
}
Here's a concrete solution, based on the second part of stepan's answer:
use std::time::Duration;
use tokio::time::sleep;
#[tokio::main]
async fn main() {
// Cell is an acceptable complication when accessing the data.
let val = std::cell::Cell::new(1);
tokio::select! {
_ = async {loop {
println!(".{}", val.get());
sleep(Duration::from_millis(200)).await;
}} => {},
_ = async {loop {
println!("Starting slow operation...");
// During this await the dots keep being printed by the other branch.
sleep(Duration::from_secs(1)).await;
val.set(val.get() + 1);
println!("...done");
sleep(Duration::from_secs(3)).await;
}} => {},
}
}

Not noticing any performance improvement when using async

I have a small program that executes aws s3 CLI commands with different arguments. I'm using std::process::Command, and each command makes a network call and returns a response. At first I had this synchronous, single-threaded implementation:
fn make_call<'a>(_name: &'a str, _bucket_poll: &mut BucketPoll<'a>) -> Option<BucketDetails<'a>> {
let invoke_result = invoke_network_call(_name);
let mut bucket = BucketDetails::new(_name);
match invoke_result {
Ok(invoke_str) => {
bucket.output = invoke_str;
_bucket_poll.insert_bucket(bucket.clone());
_bucket_poll.successful_count += 1;
Some(bucket)
}
Err(_) => {
_bucket_poll.insert_bucket(bucket);
None
}
}
}
// I invoke this function in sequential order, something like
make_call("name_1", &mut bucket_poll);
make_call("name_2", &mut bucket_poll);
make_call("name_3", &mut bucket_poll);
Because I don't really care in which order these calls are executed, I decided to learn Tokio to help with performance. I changed the make_call function to be async:
async fn make_call_race() -> ExecutionResult {
let bucket_poll = BucketPoll::new();
let bucket_poll_guard = Arc::new(Mutex::new(bucket_poll));
loop {
let bucket_details = tokio::select! {
Some(bucket_details) = make_call_async("name_1", &bucket_poll_guard) => bucket_details,
Some(bucket_details) = make_call_async("name_2", &bucket_poll_guard) => bucket_details,
Some(bucket_details) = make_call_async("name_3", &bucket_poll_guard) => bucket_details,
Some(bucket_details) = make_call_async("name_4", &bucket_poll_guard) => bucket_details,
else => { break }
};
success_printer(bucket_details);
}
// more printing, no more network calls
ExecutionResult::Success
}
make_call_async is essentially the same as make_call:
async fn make_call_async<'a>(
_name: &'a str,
_bucket_poll_guard: &'a Arc<Mutex<BucketPoll<'a>>>,
) -> Option<BucketDetails<'a>> {
{
if let Ok(bucket_poll_guard) = _bucket_poll_guard.lock() {
if bucket_poll_guard.has_polled(_name) {
return None;
}
}
}
let invoke_result = invoke_network_call(_name);
let mut bucket = BucketDetails::new(_name);
match invoke_result {
Ok(invoke_str) => {
bucket.output = invoke_str;
{
if let Ok(mut bucket_poll_guard) = _bucket_poll_guard.lock() {
bucket_poll_guard.insert_bucket(bucket.clone());
bucket_poll_guard.successful_count += 1;
}
}
Some(bucket)
}
Err(_) => {
{
if let Ok(mut bucket_poll_guard) = _bucket_poll_guard.lock() {
bucket_poll_guard.insert_bucket(bucket);
}
}
None
}
}
}
When I run the async version, I do see that my network calls are made in a random order, but I do not notice any speedup. I increased the number of network calls to about 50 invocations, but the runtime is nearly the same, if not slightly worse. As I am new to async programming and Rust in general, I would like to understand why my async implementation does not seem to offer any improvement.
Extra:
Here is the invoke_network_call method:
fn invoke_network_call(_name: &str) -> core::result::Result<String, AwsCliError> {
let output = Command::new("aws")
.arg("s3")
.arg("ls")
.arg(_name)
.output()
.expect("Could not list s3 objects");
if !output.status.success() {
err_printer(format!("Failed to list s3 objects for bucket {}.", _name));
return Err(AwsCliError);
}
let output_str = get_stdout_string_from_output(&output);
Ok(output_str)
}
EDIT: yorodm's comment makes sense. What I did was use Tokio's Command instead of std::process's Command and made the invoke_network_call async. This reduced my runtime by half. Thank you!
You could rewrite invoke_network_call using an async version of Command.
async fn invoke_network_call(_name: &str) -> core::result::Result<String, AwsCliError> {
let output = tokio::process::Command::new("aws")
.arg("s3")
.arg("ls")
.arg(_name)
.output()
.await
.expect("Could not list s3 objects");
if !output.status.success() {
err_printer(format!("Failed to list s3 objects for bucket {}.", _name));
return Err(AwsCliError);
}
let output_str = get_stdout_string_from_output(&output);
Ok(output_str)
}
This removes the blocking std::process::Command call. That said, if you're going to access AWS services, I would suggest going with rusoto.
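One way to see why the std::process version gained nothing: an async fn whose body never awaits a pending operation runs to completion inside a single poll, so its blocking work occupies the polling thread and nothing else in the same task can make progress. The sketch below illustrates this using only the standard library. blocking_call is a hypothetical stand-in for the real command invocation, and the hand-written no-op waker replaces a real runtime.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread;
use std::time::{Duration, Instant};

/// Waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical stand-in for a blocking call such as
/// `std::process::Command::output`.
async fn blocking_call() -> u32 {
    thread::sleep(Duration::from_millis(50)); // blocks the polling thread
    42
}

/// Polls `blocking_call` exactly once and reports how long that poll took.
fn poll_once_timed() -> (Poll<u32>, Duration) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(blocking_call());
    let start = Instant::now();
    let result = fut.as_mut().poll(&mut cx);
    (result, start.elapsed())
}
```

The single poll already returns Poll::Ready and takes at least as long as the sleep, which suggests that the future never yielded control. A future that awaits a truly asynchronous operation would instead return Poll::Pending and let the other select! branches run.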

How can I stop running synchronous code when the future wrapping it is dropped?

I have asynchronous code which calls synchronous code that takes a while to run, so I followed the suggestions outlined in What is the best approach to encapsulate blocking I/O in future-rs?. However, my asynchronous code has a timeout, after which I am no longer interested in the result of the synchronous calculation:
use std::{thread, time::Duration};
use tokio::{task, time}; // 0.2.10
// This takes 1 second
fn long_running_complicated_calculation() -> i32 {
let mut sum = 0;
for i in 0..10 {
thread::sleep(Duration::from_millis(100));
eprintln!("{}", i);
sum += i;
// Interruption point
}
sum
}
#[tokio::main]
async fn main() {
let handle = task::spawn_blocking(long_running_complicated_calculation);
let guarded = time::timeout(Duration::from_millis(250), handle);
match guarded.await {
Ok(s) => panic!("Sum was calculated: {:?}", s),
Err(_) => eprintln!("Sum timed out (expected)"),
}
}
Running this code shows that the timeout fires, but the synchronous code also continues to run:
0
1
Sum timed out (expected)
2
3
4
5
6
7
8
9
How can I stop running the synchronous code when the future wrapping it is dropped?
I don't expect that the compiler will magically be able to stop my synchronous code. I've annotated a line with "interruption point" where I'd be required to manually put some kind of check to exit early from my function, but I don't know how to easily get a notification that the result of spawn_blocking (or ThreadPool::spawn_with_handle, for pure futures-based code) has been dropped.
You can pass an atomic boolean, which you then use to flag the task as needing cancellation. (I'm not sure I'm using an appropriate Ordering for the load/store calls; that probably needs some more consideration.)
Here's a modified version of your code that outputs
0
1
Sum timed out (expected)
2
Interrupted...
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::{thread, time::Duration};
use tokio::{task, time}; // 0.2.10
// This takes 1 second
fn long_running_complicated_calculation(flag: &AtomicBool) -> i32 {
let mut sum = 0;
for i in 0..10 {
thread::sleep(Duration::from_millis(100));
eprintln!("{}", i);
sum += i;
// Interruption point
if !flag.load(Ordering::Relaxed) {
eprintln!("Interrupted...");
break;
}
}
sum
}
#[tokio::main]
async fn main() {
let some_bool = Arc::new(AtomicBool::new(true));
let some_bool_clone = some_bool.clone();
let handle =
task::spawn_blocking(move || long_running_complicated_calculation(&some_bool_clone));
let guarded = time::timeout(Duration::from_millis(250), handle);
match guarded.await {
Ok(s) => panic!("Sum was calculated: {:?}", s),
Err(_) => {
eprintln!("Sum timed out (expected)");
some_bool.store(false, Ordering::Relaxed);
}
}
}
It's not really possible to get this to happen automatically on the dropping of the futures / handles with current Tokio. Some work towards this is being done in http://github.com/tokio-rs/tokio/issues/1830 and http://github.com/tokio-rs/tokio/issues/1879.
However, you can get something similar by wrapping the futures in a custom type.
Here's an example which looks almost the same as the original code, but with the addition of a simple wrapper type in a module. It would be even more ergonomic if I implemented Future<T> on the wrapper type that just forwards to the wrapped handle, but that was proving tiresome.
mod blocking_cancelable_task {
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use tokio::task;
pub struct BlockingCancelableTask<T> {
pub h: Option<tokio::task::JoinHandle<T>>,
flag: Arc<AtomicBool>,
}
impl<T> Drop for BlockingCancelableTask<T> {
fn drop(&mut self) {
eprintln!("Dropping...");
self.flag.store(false, Ordering::Relaxed);
}
}
impl<T> BlockingCancelableTask<T>
where
T: Send + 'static,
{
pub fn new<F>(f: F) -> BlockingCancelableTask<T>
where
F: FnOnce(&AtomicBool) -> T + Send + 'static,
{
let flag = Arc::new(AtomicBool::new(true));
let flag_clone = flag.clone();
let h = task::spawn_blocking(move || f(&flag_clone));
BlockingCancelableTask { h: Some(h), flag }
}
}
pub fn spawn<F, T>(f: F) -> BlockingCancelableTask<T>
where
T: Send + 'static,
F: FnOnce(&AtomicBool) -> T + Send + 'static,
{
BlockingCancelableTask::new(f)
}
}
use std::sync::atomic::{AtomicBool, Ordering};
use std::{thread, time::Duration};
use tokio::time; // 0.2.10
// This takes 1 second
fn long_running_complicated_calculation(flag: &AtomicBool) -> i32 {
let mut sum = 0;
for i in 0..10 {
thread::sleep(Duration::from_millis(100));
eprintln!("{}", i);
sum += i;
// Interruption point
if !flag.load(Ordering::Relaxed) {
eprintln!("Interrupted...");
break;
}
}
sum
}
#[tokio::main]
async fn main() {
let mut h = blocking_cancelable_task::spawn(long_running_complicated_calculation);
let guarded = time::timeout(Duration::from_millis(250), h.h.take().unwrap());
match guarded.await {
Ok(s) => panic!("Sum was calculated: {:?}", s),
Err(_) => {
eprintln!("Sum timed out (expected)");
}
}
}
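The forwarding Future implementation mentioned above could look roughly like the following sketch. It uses only the standard library and a hypothetical wrapper named CancelOnDrop. It assumes the wrapped future is Unpin, which keeps the implementation free of unsafe pin projection.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll};

/// Hypothetical wrapper: forwards polling to `inner` and clears `flag` on drop.
pub struct CancelOnDrop<F> {
    inner: F,
    flag: Arc<AtomicBool>,
}

impl<F> CancelOnDrop<F> {
    pub fn new(inner: F, flag: Arc<AtomicBool>) -> Self {
        Self { inner, flag }
    }
}

impl<F: Future + Unpin> Future for CancelOnDrop<F> {
    type Output = F::Output;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `F: Unpin` makes the wrapper `Unpin`, so the inner future can be re-pinned safely.
        Pin::new(&mut self.inner).poll(cx)
    }
}

impl<F> Drop for CancelOnDrop<F> {
    fn drop(&mut self) {
        // Tell the blocking code to stop at its next interruption point.
        self.flag.store(false, Ordering::Relaxed);
    }
}
```

With such a wrapper, the handle returned by spawn_blocking could be stored as inner (assuming that handle type is Unpin) and the wrapper passed directly to time::timeout, without the h.h.take().unwrap() step.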

How to write a simple Rust asynchronous proxy using futures "0.3" and hyper "0.13.0-alpha.4"?

I am trying to rewrite the proxy example from the Asynchronous Programming in Rust book by migrating to:
futures-preview = { version = "0.3.0-alpha.19", features = ["async-await"]}
hyper = "0.13.0-alpha.4"
from:
futures-preview = { version = "=0.3.0-alpha.17", features = ["compat"] }
hyper = "0.12.9"
The current example converts the returned Future from a futures 0.3 into a futures 0.1, because hyper = "0.12.9" is not compatible with futures 0.3's async/await.
My code:
use {
futures::future::{FutureExt, TryFutureExt},
hyper::{
rt::run,
service::{make_service_fn, service_fn},
Body, Client, Error, Request, Response, Server, Uri,
},
std::net::SocketAddr,
std::str::FromStr,
};
fn forward_uri<B>(forward_url: &'static str, req: &Request<B>) -> Uri {
let forward_uri = match req.uri().query() {
Some(query) => format!("{}{}?{}", forward_url, req.uri().path(), query),
None => format!("{}{}", forward_url, req.uri().path()),
};
Uri::from_str(forward_uri.as_str()).unwrap()
}
async fn call(
forward_url: &'static str,
mut _req: Request<Body>,
) -> Result<Response<Body>, hyper::Error> {
*_req.uri_mut() = forward_uri(forward_url, &_req);
let url_str = forward_uri(forward_url, &_req);
let res = Client::new().get(url_str).await;
res
}
async fn run_server(forward_url: &'static str, addr: SocketAddr) {
let forwarded_url = forward_url;
let serve_future = service_fn(move |req| call(forwarded_url, req).boxed());
let server = Server::bind(&addr).serve(serve_future);
if let Err(err) = server.await {
eprintln!("server error: {}", err);
}
}
fn main() {
// Set the address to run our socket on.
let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
let url = "http://127.0.0.1:9061";
let futures_03_future = run_server(url, addr);
run(futures_03_future);
}
First, I receive this error for server in run_server function:
the trait tower_service::Service<&'a
hyper::server::tcp::addr_stream::AddrStream> is not implemented for
hyper::service::service::ServiceFn<[closure#src/main.rs:35:35: 35:78
forwarded_url:_], hyper::body::body::Body>
Also, I cannot use hyper::rt::run because it might have been implemented differently in hyper = 0.13.0-alpha.4.
I will be grateful if you tell me your ideas on how to fix it.
According to this issue, to create a new service for each connection you need to create a MakeService in hyper = "0.13.0-alpha.4". You can create a MakeService from a closure by using make_service_fn.
"Also, I cannot use hyper::rt::run because it might have been implemented differently in hyper = 0.13.0-alpha.4."
Correct. Under the hood, hyper::rt::run was calling tokio::run. It has been removed from the API, though I don't currently know the reason. You can run your future by calling tokio::run yourself or by using the #[tokio::main] attribute. To do this, you need to add tokio to your Cargo.toml:
#this is the version of tokio inside hyper "0.13.0-alpha.4"
tokio = "=0.2.0-alpha.6"
then change your run_server like this:
async fn run_server(forward_url: &'static str, addr: SocketAddr) {
let server = Server::bind(&addr).serve(make_service_fn(move |_| {
async move { Ok::<_, Error>(service_fn(move |req| call(forward_url, req))) }
}));
if let Err(err) = server.await {
eprintln!("server error: {}", err);
}
}
and main:
#[tokio::main]
async fn main() {
// Set the address to run our socket on.
let addr = SocketAddr::from(([127, 0, 0, 1], 3000));
let url = "http://www.google.com:80"; // I have tested with Google
run_server(url, addr).await
}

Resources