I'm facing a scenario where I need to run async code from the drop handler of an object. The whole application runs in a tokio async context, so I know that the drop handler is called with an active tokio Runtime, but unfortunately drop itself is a sync function.
Ideally, I'd like a solution that works on both multi-thread and current-thread runtimes, but if that doesn't exist, then I'm ok with a solution that blocks the dropping thread and relies on other threads to drive the futures.
I considered several options, but I'm not sure which is best, and I don't fully understand their trade-offs. For these examples, let's assume my struct has an async terminate(&mut self) function that I would like to call from drop().
struct MyClass;
impl MyClass {
async fn terminate(&mut self) {}
}
Option 1: tokio::runtime::Handle::block_on
impl Drop for MyClass {
fn drop(&mut self) {
tokio::runtime::Handle::current().block_on(self.terminate());
}
}
This seems to be the most straightforward approach, but unfortunately it panics with
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
see playground
I'm a bit confused by this since I thought Handle::block_on would use the currently running runtime but it seems this tries to start a new runtime? What is going on here?
Also, according to the documentation of Handle::block_on, this cannot drive IO threads. So I guess blocking this thread is a risk - if too many objects are destructed at the same time, each blocking a thread, and those futures wait for IO work, then this will deadlock.
Option 2: futures::executor::block_on
impl Drop for MyClass {
fn drop(&mut self) {
futures::executor::block_on(self.terminate());
}
}
see playground
This seems to work fine. If I understand this correctly, then it spawns a new non-tokio executor on the current thread and has that thread drive the future. Is this an issue? Does this cause conflicts between the running tokio executor and the new futures executor?
Also, can this actually drive IO threads, avoiding the issue of option 1? Or can it happen that those IO threads are still waiting on the tokio executor?
Option 3: tokio::task::spawn with futures::executor::block_on
impl Drop for MyClass {
fn drop(&mut self) {
let task = tokio::task::spawn(self.terminate());
futures::executor::block_on(task);
}
}
see playground
This should have the tokio runtime drive the termination future, while the futures executor only blocks the current thread until the tokio runtime has finished? Is this safer than Option 2, and does it cause fewer conflicts between the runtimes? Unfortunately, this ran into a lifetime issue I couldn't figure out:
error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
--> src/main.rs:8:44
|
7 | fn drop(&mut self) {
| --------- this data with an anonymous lifetime `'_`...
8 | let task = tokio::task::spawn(self.terminate());
| ---- ^^^^^^^^^
| |
| ...is used here...
|
note: ...and is required to live as long as `'static` here
--> src/main.rs:8:20
|
8 | let task = tokio::task::spawn(self.terminate());
| ^^^^^^^^^^^^^^^^^^
note: `'static` lifetime requirement introduced by this bound
--> /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.17.0/src/task/spawn.rs:127:28
|
127 | T: Future + Send + 'static,
| ^^^^^^^
I also tried to fix this with LocalSet but couldn't get it to work. Any way to make this work?
Option 3b
I was, however, able to make it work if I make terminate() take self by value and wrap MyClass into a Wrapper. Not pretty but maybe better than Option 2 because it uses the tokio runtime to drive the future?
struct MyClass;
impl MyClass {
async fn terminate(self) {}
}
struct Wrapper(Option<MyClass>);
impl Drop for Wrapper {
fn drop(&mut self) {
if let Some(v) = self.0.take() {
let task = tokio::task::spawn(v.terminate());
futures::executor::block_on(task).unwrap();
}
}
}
see playground
Is this a good approach? Is it actually important that the tokio runtime drives the drop future or is the simpler Option 2 better? Any ways to make option 3b prettier / easier to use?
Option 4: Background Task
I found this option here: https://stackoverflow.com/a/68851788/829568
It basically spawns a background task in the constructor of the object that waits for a trigger and runs async drop code when triggered. The drop implementation then triggers it and runs a busy waiting loop until it is finished.
This seems overly complex and also more error prone than the other options here. Or is this actually the best solution?
Side question on exhausting worker threads
Except for option 1, all of these options block a tokio worker thread to wait for the async drop to complete. In a multi threaded runtime, this will go well most of the time, but could in theory lock up all worker threads if multiple destructors run in parallel - and IIUC then we would have a deadlock with no thread making progress. Option 1 seems somewhat better but the block_on documentation says it can only drive non-IO futures. So it could still lock up if too many destructors do IO work. Is there a way to tell tokio to increase the number of worker threads by one? If we do that for each thread we block, would that avoid this issue?
Option 5: new runtime in new thread
impl Drop for MyClass {
fn drop(&mut self) {
std::thread::scope(|s| {
s.spawn(|| {
let runtime = tokio::runtime::Builder::new_multi_thread()
.build()
.unwrap();
runtime.block_on(self.terminate());
});
});
}
}
see playground
This seems to work and attempts to avoid the issue of blocking worker threads by running the drop task on a new runtime in a new thread. This new thread should, hopefully, be able to drive IO tasks. But does this actually fully solve the problem? What if the drop task depends on an IO task that is running on the main tokio executor? I think this may still have a chance of causing the program to lock up indefinitely.
If you want to "do something" without exclusive mutable access to MyClass, maybe using a oneshot channel to trigger the async work would help? This is somewhat similar to Option 4.
You can send some extra state through the channel too.
use std::time::Duration;
use tokio::{
runtime::Runtime,
sync::oneshot::{self, Receiver, Sender},
time::interval,
};
struct MyClass {
tx: Option<Sender<()>>, // can have SomeStruct instead of ()
// my_state: Option<SomeStruct>
}
impl MyClass {
pub async fn new() -> Self {
println!("MyClass::new()");
let (tx, mut rx) = oneshot::channel();
tokio::task::spawn(async move {
let mut interval = interval(Duration::from_millis(100));
println!("drop wait loop starting...");
loop {
tokio::select! {
_ = interval.tick() => println!("Another 100ms"),
msg = &mut rx => {
println!("should process drop here");
break;
}
}
}
});
Self { tx: Some(tx) }
}
}
impl Drop for MyClass {
fn drop(&mut self) {
println!("drop()");
self.tx.take().unwrap().send(()).unwrap();
// self.tx.take().unwrap().send(self.my_state.take().unwrap()).unwrap();
}
}
#[tokio::main]
async fn main() {
let class = MyClass::new().await;
}
This prints most of the time:
MyClass::new()
drop()
drop wait loop starting...
should process drop here
Sometimes the process exits before the receiving side task gets a chance to run. In a program that does not exit right away, this should be fine.
The select! with interval.tick() is not necessary; it is only there to show the task staying alive. A tokio oneshot Receiver is itself a future, so the task could simply await rx instead.
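The "exits before the side task runs" problem goes away if drop() also waits for the worker. Here is a minimal std-only sketch of that shape, with a thread and std::sync::mpsc swapped in for the tokio task and oneshot channel so it builds without extra crates. The cleaned_up flag and run_demo helper are hypothetical names used only for illustration. Joining inside drop() blocks the dropping thread, which is the same trade-off discussed in the question.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Sender};
use std::sync::Arc;
use std::thread::{self, JoinHandle};

/// Hypothetical resource whose cleanup runs on a background worker.
struct MyClass {
    tx: Option<Sender<()>>,
    worker: Option<JoinHandle<()>>,
}

impl MyClass {
    fn new(cleaned_up: Arc<AtomicBool>) -> Self {
        let (tx, rx) = mpsc::channel::<()>();
        let worker = thread::spawn(move || {
            // Blocks until drop() sends the signal (or the sender is dropped).
            let _ = rx.recv();
            // The real cleanup work would go here.
            cleaned_up.store(true, Ordering::SeqCst);
        });
        Self { tx: Some(tx), worker: Some(worker) }
    }
}

impl Drop for MyClass {
    fn drop(&mut self) {
        if let Some(tx) = self.tx.take() {
            // Ignore the error: the worker may already have exited.
            let _ = tx.send(());
        }
        if let Some(worker) = self.worker.take() {
            // Wait so the cleanup finishes before drop() returns.
            let _ = worker.join();
        }
    }
}

/// Creates and drops a MyClass, then reports whether the cleanup ran.
fn run_demo() -> bool {
    let cleaned_up = Arc::new(AtomicBool::new(false));
    let obj = MyClass::new(Arc::clone(&cleaned_up));
    drop(obj);
    cleaned_up.load(Ordering::SeqCst)
}

fn main() {
    println!("cleanup ran: {}", run_demo());
}
```

Using if let instead of unwrap() in drop() avoids a panic inside the destructor if the worker has already gone away.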
Related
Currently I'm checking out Actix, a Rust-based actor framework. I'm also using Actix web to build a REST API. I'm familiar with actor-based architecture from working with Akka; however, I'm not able to start a simple async task inside my handler.
It's simplified, but I have the following code:
#[post("/upload")]
pub async fn upload_images(
app_config: web::Data<AppConfig>,
mut payload: Multipart,
) -> Result<HttpResponse> {
// ... transforms the multipart form into images ...
for img in img_vec {
app_config.image_processor_addr.do_send(ResizeImage{
img_id: img._id,
img_format: img.format,
image_buffer: img.image.bytes,
});
};
Ok(HttpResponse::Ok().content_type(ContentType::plaintext()).body(format!("Inserted {} images.", vec_len)))
}
As you can see, I receive a multipart upload which consists of images, which I then send to an image processing actor to perform a resize on the images.
And this is the simplified code for the handling of the message ResizeImage for the ImageProcessor actor:
impl Handler<ResizeImage> for ImageProcessor {
type Result = ();
fn handle(&mut self, msg: ResizeImage, _: &mut Self::Context) -> Self::Result {
let thumbnail_col = self.thumbnail_col.clone();
let img_col = self.img_col.clone();
let img_format: ImageFormat = msg.img_format.clone().into();
log::info!("Parsing image {} with actor {}.", msg.img_id, self.id);
let actor_task_fut = Box::pin(async move {
// ... parses the image here ...
});
match Arbiter::current().spawn(actor_task_fut) {
true => log::info!("Sent task to arbiter."),
false => log::error!("Failed to send task to arbiter!"),
}
}
}
The idea is that I would resolve the web handler, and the resize task would be done async on the actor thread. However, this works on the first call, but when I call the same endpoint before all the images from the previous call are parsed, it doesn't resolve immediately, it waits till the actor has resized the previous batch.
I was under the impression that the messages would be sent to the actor mailbox and the handler code would not need to wait for anything, since I'm using do_send, which according to the documentation doesn't wait for the answer. Using Akka I can easily do something similar, and it seems to work. Am I missing something here? Is the way I'm handling async inside the actor thread wrong?
I tried the example of actix-multipart with actix-web v3.3.2 and actix-multipart v0.3.0.
For a minimal example,
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};
#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
while let Ok(Some(mut field)) = payload.try_next().await {
let content_type = field.content_disposition().unwrap();
let filename = content_type.get_filename().unwrap();
println!("filename = {}", filename);
while let Some(chunk) = field.next().await {
let data = chunk.unwrap();
println!("Read a chunk.");
}
println!("Done");
}
HttpResponse::Ok().finish()
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().service(save_file))
.bind("0.0.0.0:8080")?
.run()
.await
}
This works well, but I want to process the form data asynchronously. So I tried this instead:
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};
#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
actix_web::rt::spawn(async move {
while let Ok(Some(mut field)) = payload.try_next().await {
let content_type = field.content_disposition().unwrap();
let filename = content_type.get_filename().unwrap();
println!("filename = {}", filename);
while let Some(chunk) = field.next().await {
let data = chunk.unwrap();
println!("Read a chunk.");
}
println!("Done");
}
});
HttpResponse::Ok().finish()
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().service(save_file))
.bind("0.0.0.0:8080")?
.run()
.await
}
(Added actix_web::rt::spawn to save_file.)
But this did not work: the message "Done" was never printed. Fewer "Read a chunk." lines were printed in the second case than in the first, so I guess that field.next().await stops making progress for some reason before all the data has been read.
I do not know much about asynchronous programming, so I am not sure why field.next() does not work inside actix_web::rt::spawn.
My questions are: why does this happen, and how can I make it work with actix_web::rt::spawn?
When you make this call:
actix_web::rt::spawn(async move {
// do things...
});
spawn returns a JoinHandle which is used to poll the task. When you drop that handle (by not binding it to anything), the task is "detached", i.e., it runs in the background.
The actix documentation is not particularly helpful here, but actix uses the tokio runtime under the hood. A key issue is that in tokio, spawned tasks are not guaranteed to complete. The executor needs to know, somehow, that it should perform work on that future. In your second example, the spawned task is never .awaited, nor does it communicate with any other task via channels.
Most likely, the spawned task is never polled and does not make any progress. In order to ensure that it completes, you can either .await the JoinHandle (which will drive the task to completion) or .await some other Future that depends on work in the spawned task (usually by using a channel).
As for your more general goal, the work is already being performed asynchronously! Most likely, actix is doing roughly what you tried to do in your second example: upon receiving a request, it spawns a task to handle the request and polls it repeatedly (as well as the other active requests) until it completes, then sends a response.
use std::thread;
use tokio::task; // 0.3.4
#[tokio::main]
async fn main() {
thread::spawn(|| {
task::spawn(async {
println!("123");
});
})
.join();
}
When compiling I get a warning:
warning: unused `std::result::Result` that must be used
--> src/main.rs:6:5
|
6 | / thread::spawn(|| {
7 | | task::spawn(async {
8 | | println!("123");
9 | | });
10 | | })
11 | | .join();
| |____________^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
And when executing I get an error:
thread '<unnamed>' panicked at 'must be called from the context of Tokio runtime configured with either `basic_scheduler` or `threaded_scheduler`', src/main.rs:7:9
The key piece is that you need to get a Tokio Handle. This is a reference to a Runtime and it allows you to spawn asynchronous tasks from outside of the runtime.
When using #[tokio::main], the simplest way to get a Handle is to call Handle::current before spawning another thread, then give the handle to each thread that might want to start an asynchronous task:
use std::thread;
use tokio::runtime::Handle; // 0.3.4
#[tokio::main]
async fn main() {
let threads: Vec<_> = (0..3)
.map(|thread_id| {
let handle = Handle::current();
thread::spawn(move || {
eprintln!("Thread {} started", thread_id);
for task_id in 0..3 {
handle.spawn(async move {
eprintln!("Thread {} / Task {}", thread_id, task_id);
});
}
eprintln!("Thread {} finished", thread_id);
})
})
.collect();
for t in threads {
t.join().expect("Thread panicked");
}
}
You could also create a global, mutable singleton of a Mutex<Option<Handle>>, initialize it to None, then set it to Some early in your tokio::main function. Then, you can grab that global variable, unwrap it, and clone the Handle when you need it:
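use once_cell::sync::Lazy; // 1.5.2
use std::sync::Mutex;
use tokio::runtime::Handle;
static HANDLE: Lazy<Mutex<Option<Handle>>> = Lazy::new(Default::default);
// Early in the tokio::main function:
*HANDLE.lock().unwrap() = Some(Handle::current());
// Later, wherever a Handle is needed:
let handle = HANDLE.lock().unwrap().as_ref().unwrap().clone();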
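On newer toolchains (Rust 1.70 or later), std::sync::OnceLock can express the same set-once global without once_cell or a Mutex. The sketch below is std-only, so it uses a hypothetical Handle struct as a stand-in for tokio::runtime::Handle. The init_handle and global_handle helpers are illustrative names, not part of any library.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for `tokio::runtime::Handle`, so the sketch builds with std only.
#[derive(Clone, Debug, PartialEq)]
struct Handle {
    name: String,
}

/// Set once at startup, then read from any thread without a mutex.
static HANDLE: OnceLock<Handle> = OnceLock::new();

/// Stores the handle; returns false if one was already set.
fn init_handle(handle: Handle) -> bool {
    HANDLE.set(handle).is_ok()
}

/// Returns a clone of the stored handle, or None if init_handle was never called.
fn global_handle() -> Option<Handle> {
    HANDLE.get().cloned()
}

fn main() {
    init_handle(Handle { name: "main-runtime".to_string() });
    // Any other thread can now fetch its own clone.
    let from_thread = std::thread::spawn(global_handle).join().unwrap();
    println!("{:?}", from_thread);
}
```

With the real type, the stored value would be Handle::current(), captured early in the tokio::main function.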
See also:
How do I add tasks to a Tokio event loop that is running on another thread?
How do I synchronously return a value calculated in an asynchronous Future in stable Rust?
How to create a dedicated threadpool for CPU-intensive work in Tokio?
How do I create a global, mutable singleton?
I have a job processing app that exposes a web API to add jobs and process them but the API request should not wait for the job to finish (it could take a while). I use Server-Sent Events to broadcast the job result. This means the main API server is executing inside main with #[tokio::main], but where should I be running the job executor? In the job executor, I will have plenty of waiting: things like downloading. They will interfere with the web API server. The crucial question is how do I even start both executions in parallel?
In this scenario, you need to create a separate thread with thread::spawn, inside which you create a Tokio executor. The error you get is because there is no Tokio executor (runtime) inside your second thread. You need to create one manually and tell it to run your tasks. The easiest way is to use the Runtime API:
use tokio::runtime::Runtime; // 0.2.23
// Create the runtime
let rt = Runtime::new().unwrap();
// Spawn a future onto the runtime
rt.spawn(async {
println!("now running on a worker thread");
});
In your main thread, an executor is already available with the use of #[tokio::main]. Prior to the addition of this attribute, the runtime was created manually.
If you want to stick with the async/await philosophy, you can use join:
use tokio; // 0.2.23
#[tokio::main]
async fn main() {
let (_, _) = tokio::join!(start_server_listener(), start_job_processor());
}
This is why most answers are questioning your approach. Although very rare, I believe there are scenarios where you want an async runtime to be on another thread while also having the benefit to manually configure the runtime.
I am trying to write a UDP client in Rust that establishes a socket connection to a remote server, listens for incoming messages (and then processes the data), is also able to send messages, and disconnects after a given time. I would like to use the new async/await syntax in tokio and spawn a task that takes care of reading and processing the incoming messages, while keeping the socket in the main task to send messages in parallel, especially at the end of the protocol to close the connection.
How can I avoid moving the socket into the spawned task? Is there a way to borrow it in that task, maybe through a reference? I looked through answers to similar questions but could not understand them, as they apply to the version of tokio without the new syntax and I am an absolute beginner in Rust.
I can move the socket into the spawned function, but then it is of course no longer available to the code outside, which needs to send messages in parallel.
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
let remote_addr: SocketAddr = "...:xxxx".parse()?;
let local_addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0,0,0,0)), 0);
let mut socket = UdpSocket::bind(&local_addr)?;
socket.connect(&remote_addr)?;
// do some protocol work with the socket to establish a connection
tokio::spawn(async move {
let mut buf = [0; 1024];
loop {
let l = match socket.recv(&mut buf).await {
// socket closed
Ok(l) if l == 0 => {
println!("socket closed");
return;
},
Ok(l) => l,
Err(e) => {
println!("failed to read from socket; err = {:?}", e);
return;
}
};
let data = buf[..l].to_vec();
println!("Received {} bytes:\n{:#x?}", l, data);
}
});
// here I would like to use the socket again to send messages and to do the disconnect protocol, i.e.
let len = socket.recv(.....
When I use the socket afterwards, I get the error that the variable moved due to use in generator and gets dropped at the end of the spawn task (which it should not). Later use of socket says value borrowed after move, which is clear, but how can I avoid it?
I would appreciate if somebody could help me with this beginner question, especially in the context of the new async/await syntax of tokio. Thanks!
Well, I solved the problem in a different way. Tokio's UdpSocket can be split into a receiving half and a sending half. I run each half in a separate task and then use multiple tokio::sync::mpsc::channel instances to communicate between the two tasks and the main task.
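As far as I know, the split method was removed from UdpSocket in later tokio releases because send and recv now take &self, so the socket can be shared through an Arc instead. The sketch below shows that sharing shape with std::net::UdpSocket and a thread, which have the same &self signatures, so it builds without tokio. The round_trip helper and the local echo peer are hypothetical and stand in for the remote server.

```rust
use std::net::UdpSocket;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Sends `msg` to a local echo peer and returns what the receiver thread read.
fn round_trip(msg: &[u8]) -> std::io::Result<Vec<u8>> {
    // A local peer standing in for the remote server (hypothetical setup).
    let peer = UdpSocket::bind("127.0.0.1:0")?;
    let socket = Arc::new(UdpSocket::bind("127.0.0.1:0")?);
    socket.connect(peer.local_addr()?)?;
    // Timeouts keep the sketch from hanging if a datagram is lost.
    socket.set_read_timeout(Some(Duration::from_secs(2)))?;
    peer.set_read_timeout(Some(Duration::from_secs(2)))?;

    // The receiver thread gets its own Arc clone; the socket itself is not moved away.
    let rx_socket = Arc::clone(&socket);
    let receiver = thread::spawn(move || -> std::io::Result<Vec<u8>> {
        let mut buf = [0u8; 1024];
        let n = rx_socket.recv(&mut buf)?;
        Ok(buf[..n].to_vec())
    });

    // The original handle is still usable here for sending.
    socket.send(msg)?;

    // The peer echoes the datagram back to the sender.
    let mut buf = [0u8; 1024];
    let (n, from) = peer.recv_from(&mut buf)?;
    peer.send_to(&buf[..n], from)?;

    receiver.join().expect("receiver thread panicked")
}

fn main() -> std::io::Result<()> {
    let echoed = round_trip(b"hello")?;
    println!("received {} bytes", echoed.len());
    Ok(())
}
```

In a tokio version, the thread would become a tokio::spawn task holding the Arc clone, and send and recv would be awaited.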
I have a main function, where I create a Tokio runtime and run two futures on it.
use tokio;
fn main() {
let mut runtime = tokio::runtime::Runtime::new().unwrap();
runtime.spawn(MyMegaFutureNumberOne {});
runtime.spawn(MyMegaFutureNumberTwo {});
// Some code to 'join' them after receiving an OS signal
}
How do I receive a SIGTERM, wait for all unfinished tasks (NotReadys) and exit the application?
Dealing with signals is tricky and it would be too broad to explain how to handle all possible cases. The implementation of signals is not standard across platforms, so my answer is specific to Linux. If you want to be more cross-platform, use the POSIX function sigaction combined with pause; this will offer you more control.
One way to achieve what you want is to use the tokio_signal crate to catch signals, like this: (doc example)
extern crate futures;
extern crate tokio;
extern crate tokio_signal;
use futures::prelude::*;
use futures::Stream;
use std::time::{Duration, Instant};
use tokio_signal::unix::{Signal, SIGINT, SIGTERM};
fn main() -> Result<(), Box<dyn ::std::error::Error>> {
let mut runtime = tokio::runtime::Runtime::new()?;
let sigint = Signal::new(SIGINT).flatten_stream();
let sigterm = Signal::new(SIGTERM).flatten_stream();
let stream = sigint.select(sigterm);
let deadline = tokio::timer::Delay::new(Instant::now() + Duration::from_secs(5))
.map(|()| println!("5 seconds are over"))
.map_err(|e| eprintln!("Failed to wait: {}", e));
runtime.spawn(deadline);
let (item, _rest) = runtime
.block_on_all(stream.into_future())
.map_err(|_| "failed to wait for signals")?;
let item = item.ok_or("received no signal")?;
if item == SIGINT {
println!("received SIGINT");
} else {
assert_eq!(item, SIGTERM);
println!("received SIGTERM");
}
Ok(())
}
This program will wait for all current tasks to complete and will catch the selected signals. This doesn't seem to work on Windows as it instantly shuts down the program.
The other answer is for Tokio version 0.1.x, which is very old. For Tokio version 1.x.y, the official Tokio tutorial has a page on this topic: Graceful shutdown
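The general shape of a graceful shutdown is: signal every worker to stop, then wait until all of them have finished. One way to do the waiting is to give each worker a channel sender and wait for the channel to close. Here is a rough std-only sketch of that idea, with threads in place of tokio tasks and an AtomicBool in place of a real signal handler. The run_and_shutdown function is a hypothetical name used only for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Starts `n` workers, signals shutdown, and waits until every worker has finished.
/// Returns how many workers reported completion.
fn run_and_shutdown(n: usize) -> usize {
    let shutdown = Arc::new(AtomicBool::new(false));
    let (done_tx, done_rx) = mpsc::channel::<usize>();

    for id in 0..n {
        let shutdown = Arc::clone(&shutdown);
        let done_tx = done_tx.clone();
        thread::spawn(move || {
            // Work loop: check the flag between units of work.
            while !shutdown.load(Ordering::SeqCst) {
                thread::sleep(Duration::from_millis(5));
            }
            // Report completion; the sender clone is dropped when the thread ends.
            let _ = done_tx.send(id);
        });
    }
    // Drop the original sender so the channel closes once all workers exit.
    drop(done_tx);

    // In a real program this would be set when SIGTERM or SIGINT arrives.
    shutdown.store(true, Ordering::SeqCst);

    // iter() ends only when every sender clone has been dropped.
    done_rx.iter().count()
}

fn main() {
    println!("{} workers shut down cleanly", run_and_shutdown(3));
}
```

In tokio 1.x the flag would typically be replaced by awaiting tokio::signal::ctrl_c() or a Unix signal stream, and the workers would be spawned tasks.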