How to run an asynchronous task from a non-main thread in Tokio? - asynchronous

use std::thread;
use tokio::task; // 0.3.4
#[tokio::main]
async fn main() {
thread::spawn(|| {
task::spawn(async {
println!("123");
});
})
.join();
}
When compiling I get a warning:
warning: unused `std::result::Result` that must be used
--> src/main.rs:6:5
|
6 | / thread::spawn(|| {
7 | | task::spawn(async {
8 | | println!("123");
9 | | });
10 | | })
11 | | .join();
| |____________^
|
= note: `#[warn(unused_must_use)]` on by default
= note: this `Result` may be an `Err` variant, which should be handled
And when executing I get an error:
thread '<unnamed>' panicked at 'must be called from the context of Tokio runtime configured with either `basic_scheduler` or `threaded_scheduler`', src/main.rs:7:9

The key piece is that you need to get a Tokio Handle. This is a reference to a Runtime and it allows you to spawn asynchronous tasks from outside of the runtime.
When using #[tokio::main], the simplest way to get a Handle is via Handle::current before spawning another thread then give the handle to each thread that might want to start an asynchronous task:
use std::thread;
use tokio::runtime::Handle; // 0.3.4
#[tokio::main]
async fn main() {
let threads: Vec<_> = (0..3)
.map(|thread_id| {
let handle = Handle::current();
thread::spawn(move || {
eprintln!("Thread {} started", thread_id);
for task_id in 0..3 {
handle.spawn(async move {
eprintln!("Thread {} / Task {}", thread_id, task_id);
});
}
eprintln!("Thread {} finished", thread_id);
})
})
.collect();
for t in threads {
t.join().expect("Thread panicked");
}
}
You could also create a global, mutable singleton of a Mutex<Option<Handle>>, initialize it to None, then set it to Some early in your tokio::main function. Then, you can grab that global variable, unwrap it, and clone the Handle when you need it:
use once_cell::sync::Lazy; // 1.5.2
static HANDLE: Lazy<Mutex<Option<Handle>>> = Lazy::new(Default::default);
*HANDLE.lock().unwrap() = Some(Handle::current());
let handle = HANDLE.lock().unwrap().as_ref().unwrap().clone();
See also:
How do I add tasks to a Tokio event loop that is running on another thread?
How do I synchronously return a value calculated in an asynchronous Future in stable Rust?
How to create a dedicated threadpool for CPU-intensive work in Tokio?
How do I create a global, mutable singleton?

I have a job processing app that exposes a web API to add jobs and process them but the API request should not wait for the job to finish (it could take a while). I use Server-Sent Events to broadcast the job result. This means the main API server is executing inside main with #[tokio::main], but where should I be running the job executor? In the job executor, I will have plenty of waiting: things like downloading. They will interfere with the web API server. The crucial question is how do I even start both executions in parallel?
In this scenario, you need to create a separate thread with thread::spawn inside which you will create a Tokio executor. The error you get is that inside your second thread, there is no Tokio executor (runtime). You need to create one manually and tell it to run your tasks. The easier way is to use the Runtime API:
use tokio::runtime::Runtime; // 0.2.23
// Create the runtime
let rt = Runtime::new().unwrap();
// Spawn a future onto the runtime
rt.spawn(async {
println!("now running on a worker thread");
});
In your main thread, an executor is already available with the use of #[tokio::main]. Prior to the addition of this attribute, the runtime was created manually.
If you want to stick with the async/await philosophy, you can use join:
use tokio; // 0.2.23
#[tokio::main]
async fn main() {
let (_, _) = tokio::join!(start_server_listener(), start_job_processor());
}
This is why most answers are questioning your approach. Although very rare, I believe there are scenarios where you want an async runtime to be on another thread while also having the benefit to manually configure the runtime.

Related

Problem with handling async inside of an actix-rust actor

Currently I'm checking out Actix, a Rust based actor framework. I'm also using Actix web to build a REST API. Now, I'm familiar with actor based architecture from working with Akka, however, I'm not being able to start a simple async task inside of my handler.
It's simplified, but I have the following code:
#[post("/upload")]
pub async fn upload_images(
app_config: web::Data<AppConfig>,
mut payload: Multipart,
) -> Result<HttpResponse> {
... transforms the multipart form into images...
for img in img_vec {
app_config.image_processor_addr.do_send(ResizeImage{
img_id: img._id,
img_format: img.format,
image_buffer: img.image.bytes,
});
};
Ok(HttpResponse::Ok().content_type(ContentType::plaintext()).body(format!("Inserted {} images.", vec_len)))
}
As you can see, I receive a multipart upload which consists of images, which I then send to an image processing actor to perform a resize on the images.
And this is the simplified code for the handling of the message ResizeImage for the ImageProcessor actor:
impl Handler<ResizeImage> for ImageProcessor {
type Result = ();
fn handle(&mut self, msg: ResizeImage, _: &mut Self::Context) -> Self::Result {
let thumbnail_col = self.thumbnail_col.clone();
let img_col = self.img_col.clone();
let img_format: ImageFormat = msg.img_format.clone().into();
log::info!("Parsing image {} with actor {}.", msg.img_id, self.id);
let actor_task_fut = Box::pin(async move {
... parses the image here...
});
match Arbiter::current().spawn(actor_task_fut) {
true => log::info!("Sent task to arbiter."),
false => log::error!("Failed to send task to arbiter!"),
}
}
}
The idea is that I would resolve the web handler, and the resize task would be done async on the actor thread. However, this works on the first call, but when I call the same endpoint before all the images from the previous call are parsed, it doesn't resolve immediately, it waits till the actor has resized the previous batch.
I was under the impression that the messages would be sent to the actor mailbox and then the handler code would not need to wait for anything, since I'm using do_sent, which the documentation states that it doesn't await for the answer. Using Akka I can easily do something similar, and it seems to work. Am I missing something here? Is the way I'm handling async inside the actor thread wrong?

Rust Async Drop

I'm facing a scenario where I need to run async code from the drop handler of an object. The whole application runs in a tokio async context, so I know that the drop handler is called with an active tokio Runtime, but unfortunately drop itself is a sync function.
Ideally, I'd like a solution that works on both multi-thread and current-thread runtimes, but if that doesn't exist, then I'm ok with a solution that blocks the dropping thread and relies on other threads to drive the futures.
I considered multiple options but I'm not sure which approach is best or understand much about their trade offs. For these examples, let's assume my class has an async terminate(&mut self) function that I would like to be called from drop().
struct MyClass;
impl MyClass {
async fn terminate(&mut self) {}
}
Option 1: tokio::runtime::Handle::block_on
impl Drop for MyClass {
fn drop(&mut self) {
tokio::runtime::Handle::current().block_on(self.terminate());
}
}
This seems to be the most straightforward approach, but unfortunately it panics with
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
see playground
I'm a bit confused by this since I thought Handle::block_on would use the currently running runtime but it seems this tries to start a new runtime? What is going on here?
Also, according to the documentation of Handle::block_on, this cannot drive IO threads. So I guess blocking this thread is a risk - if too many objects are destructed at the same time, each blocking a thread, and those futures wait for IO work, then this will deadlock.
Option 2: futures::executor::block_on
impl Drop for MyClass {
fn drop(&mut self) {
futures::executor::block_on(self.terminate());
}
}
see playground
This seems to work fine. If I understand this correctly, then it spawns a new non-tokio executor on the current thread and has that thread drive the future. Is this an issue? Does this cause conflicts between the running tokio executor and the new futures executor?
Also, can this actually drive IO threads, avoiding the issue of option 1? Or can it happen that those IO threads are still waiting on the tokio executor?
Option 3: tokio::task::spawn with futures::executor::block_on
impl Drop for MyClass {
fn drop(&mut self) {
let task = tokio::task::spawn(self.terminate());
futures::executor::block_on(task);
}
}
see playground
This should have the tokio runtime drive the termination future while the futures runtime only blocks the current thread to wait until the tokio runtime finished? Is this safer than option 2 and causes fewer conflicts between the runtimes? Unfortunately, this ran into a lifetime issue I couldn't figure out.:
error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
--> src/main.rs:8:44
|
7 | fn drop(&mut self) {
| --------- this data with an anonymous lifetime `'_`...
8 | let task = tokio::task::spawn(self.terminate());
| ---- ^^^^^^^^^
| |
| ...is used here...
|
note: ...and is required to live as long as `'static` here
--> src/main.rs:8:20
|
8 | let task = tokio::task::spawn(self.terminate());
| ^^^^^^^^^^^^^^^^^^
note: `'static` lifetime requirement introduced by this bound
--> /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.17.0/src/task/spawn.rs:127:28
|
127 | T: Future + Send + 'static,
| ^^^^^^^
I also tried to fix this with LocalSet but couldn't get it to work. Any way to make this work?
Option 3b
I was, however, able to make it work if I make terminate() take self by value and wrap MyClass into a Wrapper. Not pretty but maybe better than Option 2 because it uses the tokio runtime to drive the future?
struct MyClass;
impl MyClass {
async fn terminate(self) {}
}
struct Wrapper(Option<MyClass>);
impl Drop for Wrapper {
fn drop(&mut self) {
if let Some(v) = self.0.take() {
let task = tokio::task::spawn(v.terminate());
futures::executor::block_on(task).unwrap();
}
}
}
see playground
Is this a good approach? Is it actually important that the tokio runtime drives the drop future or is the simpler Option 2 better? Any ways to make option 3b prettier / easier to use?
Option 4: Background Task
I found this option here: https://stackoverflow.com/a/68851788/829568
It basically spawns a background task in the constructor of the object that waits for a trigger and runs async drop code when triggered. The drop implementation then triggers it and runs a busy waiting loop until it is finished.
This seems overly complex and also more error prone than the other options here. Or is this actually the best solution?
Side question on exhausting worker threads
Except for option 1, all of these options block a tokio worker thread to wait for the async drop to complete. In a multi threaded runtime, this will go well most of the time, but could in theory lock up all worker threads if multiple destructors run in parallel - and IIUC then we would have a deadlock with no thread making progress. Option 1 seems somewhat better but the block_on documentation says it can only drive non-IO futures. So it could still lock up if too many destructors do IO work. Is there a way to tell tokio to increase the number of worker threads by one? If we do that for each thread we block, would that avoid this issue?
Option 5: new runtime in new thread
impl Drop for MyClass {
fn drop(&mut self) {
std::thread::scope(|s| {
s.spawn(|| {
let runtime = tokio::runtime::Builder::new_multi_thread()
.build()
.unwrap();
runtime.block_on(self.terminate());
});
});
}
}
see playground
This seems to work and attempts to avoid the issue of blocking worker threads by running the drop task on a new runtime in a new thread. This new thread should, hopefully, be able to drive IO tasks. But does this actually fully solve the problem? What if the drop task depends on an IO task that is running on the main tokio executor? I think this may still have a chance of causing the program to lock up indefinitely.
If you want to "do something", without exclusive mutable access to MyClass, maybe using oneshot channels to trigger async compute would work? Somewhat similar to the option #4.
You can send some extra state through the channel too.
use std::time::Duration;
use tokio::{
runtime::Runtime,
sync::oneshot::{self, Receiver, Sender},
time::interval,
};
struct MyClass {
tx: Option<Sender<()>>, // can have SomeStruct instead of ()
// my_state: Option<SomeStruct>
}
impl MyClass {
pub async fn new() -> Self {
println!("MyClass::new()");
let (tx, mut rx) = oneshot::channel();
tokio::task::spawn(async move {
let mut interval = interval(Duration::from_millis(100));
println!("drop wait loop starting...");
loop {
tokio::select! {
_ = interval.tick() => println!("Another 100ms"),
msg = &mut rx => {
println!("should process drop here");
break;
}
}
}
});
Self { tx: Some(tx) }
}
}
impl Drop for MyClass {
fn drop(&mut self) {
println!("drop()");
self.tx.take().unwrap().send(()).unwrap();
// self.tx.take().unwrap().send(self.my_state.take().unwrap()).unwrap();
}
}
#[tokio::main]
async fn main() {
let class = MyClass::new().await;
}
This prints most of the time:
MyClass::new()
drop()
drop wait loop starting...
should process drop here
sometimes the process exists before the receiving side-task gets a chance to spawn. But if you have a non-exiting code, should be fine.
Not sure if the select! interval.tick is necessary, though unfortunately oneshot channel has no async blocking receive method.

Hyper cannot find Server module

I'm writing a "hello world" HTTP server with Hyper, however I am not able to find the Server and rt modules when trying to import them.
When invoking cargo run, then I see this error message:
26 | let server = hyper::Server::bind(&addr).serve(router);
| ^^^^^^ could not find `Server` in `hyper`
I must be missing something obvious about Rust and Hyper. What I am trying to do is writing something as dry/simple as possible with just the HTTP layer and some basic routes. I would like to include as little as possible 3rd party dependencies e.g avoiding Tokio which I think involves async behavior, but I am not sure about the context as I am new to Rust.
Looks like I must use futures, so I included this dependency and perhaps futures only work with the async reserved word (which I am not sure if it comes from Tokio or Rust itself).
What confuses me is that in the Hyper examples I do see imports like use hyper::{Body, Request, Response, Server};, so that Server thing must be there, somewhere.
These are the dependencies in Cargo.toml:
hyper = "0.14.12"
serde_json = "1.0.67"
futures = "0.3.17"
This is the code in main.rs:
use futures::future;
use hyper::service::service_fn;
use hyper::{Body, Method, Response, StatusCode};
use serde_json::json;
fn main() {
let router = || {
service_fn(|req| match (req.method(), req.uri().path()) {
(&Method::GET, "/foo") => {
let mut res = Response::new(
Body::from(json!({"message": "bar"}).to_string())
);
future::ok(res)
},
(_, _) => {
let mut res = Response::new(
Body::from(json!({"content": "route not found"}).to_string())
);
*res.status_mut() = StatusCode::NOT_FOUND;
future::ok(res)
}
})
};
let addr = "127.0.0.1:8080".parse::<std::net::SocketAddr>().unwrap();
let server = hyper::Server::bind(&addr).serve(router); // <== this line fails to compile
// hyper::rt::run(server.map_err(|e| {
// eprintln!("server error: {}", e);
// }));
}
How do I make the code above compile and run?
According to documentation, you are missing one module namespace in your call hyper::server::Server:
let server = hyper::server::Server::bind(&addr).serve(router)
In order to use server you need to activate the feature flag in cargo:
hyper = { version = "0.14.12", features = ["server"] }

Spawn reading data from multipart in actix-web

I tried the example of actix-multipart with actix-web v3.3.2 and actix-multipart v0.3.0.
For a minimal example,
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};
#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
while let Ok(Some(mut field)) = payload.try_next().await {
let content_type = field.content_disposition().unwrap();
let filename = content_type.get_filename().unwrap();
println!("filename = {}", filename);
while let Some(chunk) = field.next().await {
let data = chunk.unwrap();
println!("Read a chunk.");
}
println!("Done");
}
HttpResponse::Ok().finish()
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().service(save_file))
.bind("0.0.0.0:8080")?
.run()
.await
}
This works well, but I want to do with form data asynchronously. So I tried instead:
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};
#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
actix_web::rt::spawn(async move {
while let Ok(Some(mut field)) = payload.try_next().await {
let content_type = field.content_disposition().unwrap();
let filename = content_type.get_filename().unwrap();
println!("filename = {}", filename);
while let Some(chunk) = field.next().await {
let data = chunk.unwrap();
println!("Read a chunk.");
}
println!("Done");
}
});
HttpResponse::Ok().finish()
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().service(save_file))
.bind("0.0.0.0:8080")?
.run()
.await
}
(Added actix_web::rt::spawn to save_file.)
But this did not work -- the message "Done" never printed. The number of "Read a chunk" displayed in the second case was less than the first case, so I guess that field.next().await cannot terminate for some reason before completing reading all data.
I do not know so much about asynchronous programming, so I am not sure why field.next() did not work in actix_web::rt::spawn.
My question are: why is it, and how can I do with actix_web::rt::spawn?
When you make this call:
actix_web::rt::spawn(async move {
// do things...
});
spawn returns a JoinHandle which is used to poll the task. When you drop that handle (by not binding it to anything), the task is "detached", i.e., it runs in the background.
The actix documentation is not particularly helpful here, but actix uses the tokio runtime under the hood. A key issue is that in tokio, spawned tasks are not guaranteed to complete. The executor needs to know, somehow, that it should perform work on that future. In your second example, the spawned task is never .awaited, nor does it communicate with any other task via channels.
Most likely, the spawned task is never polled and does not make any progress. In order to ensure that it completes, you can either .await the JoinHandle (which will drive the task to completion) or .await some other Future that depends on work in the spawned task (usually by using a channel).
As for your more general goal, the work is already being performed asynchronously! Most likely, actix is doing roughly what you tried to do in your second example: upon receiving a request, it spawns a task to handle the request and polls it repeatedly (as well as the other active requests) until it completes, then sends a response.

Using the Saturn Framework, how can I get a reference to the Websockets hub outside of a particular request?

I'm building an application for a toy problem to learn more about SAFE. I have some background processes running server-side and occasionally they need to send a message unprompted to the connected clients. This means that I need a reference to the SocketHub from outside of any particular request.
Currently I have a mutable variable which I pass a value to when the Channel is joined:
let mainChannel = channel {
join (fun ctx socketId ->
task {
printfn "Connected! Main Socket Id: %O" socketId
let hub = ctx.GetService<Channels.ISocketHub>()
webSocketHub <- Some hub // Passing the reference to a mutable variable
task {
do! Task.Delay 500
let m = (socketId |> (SetChannelSocketId >> GameData))
do! (harderSendMessage socketId "message" m "Problem sending SocketId")
} |> ignore
return Channels.Ok })
}
However, it seems to me like there should be a better way to get access to the hub - I just can't figure it out.

Resources