Spawn reading data from multipart in actix-web - asynchronous

I tried the example of actix-multipart with actix-web v3.3.2 and actix-multipart v0.3.0.
Here is a minimal example:
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};

#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
    while let Ok(Some(mut field)) = payload.try_next().await {
        let content_type = field.content_disposition().unwrap();
        let filename = content_type.get_filename().unwrap();
        println!("filename = {}", filename);
        while let Some(chunk) = field.next().await {
            let data = chunk.unwrap();
            println!("Read a chunk.");
        }
        println!("Done");
    }
    HttpResponse::Ok().finish()
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().service(save_file))
        .bind("0.0.0.0:8080")?
        .run()
        .await
}
This works well, but I want to handle the form data asynchronously, so I tried this instead:
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};

#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
    actix_web::rt::spawn(async move {
        while let Ok(Some(mut field)) = payload.try_next().await {
            let content_type = field.content_disposition().unwrap();
            let filename = content_type.get_filename().unwrap();
            println!("filename = {}", filename);
            while let Some(chunk) = field.next().await {
                let data = chunk.unwrap();
                println!("Read a chunk.");
            }
            println!("Done");
        }
    });
    HttpResponse::Ok().finish()
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    HttpServer::new(|| App::new().service(save_file))
        .bind("0.0.0.0:8080")?
        .run()
        .await
}
(I added actix_web::rt::spawn to save_file.)
But this did not work -- the message "Done" was never printed. Fewer "Read a chunk." messages were displayed in the second case than in the first, so I guess that field.next().await stops making progress, for some reason, before all the data has been read.
I do not know much about asynchronous programming, so I am not sure why field.next() did not work inside actix_web::rt::spawn.
My questions are: why does this happen, and how can I make it work with actix_web::rt::spawn?

When you make this call:
actix_web::rt::spawn(async move {
    // do things...
});
spawn returns a JoinHandle which is used to poll the task. When you drop that handle (by not binding it to anything), the task is "detached", i.e., it runs in the background.
The actix documentation is not particularly helpful here, but actix uses the tokio runtime under the hood. A key issue is that in tokio, spawned tasks are not guaranteed to complete. The executor needs to know, somehow, that it should perform work on that future. In your second example, the spawned task is never .awaited, nor does it communicate with any other task via channels.
Most likely, the spawned task is never polled and does not make any progress. In order to ensure that it completes, you can either .await the JoinHandle (which will drive the task to completion) or .await some other Future that depends on work in the spawned task (usually by using a channel).
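For illustration, here is a minimal sketch of the channel variant, reusing the question's code. The oneshot channel comes from the futures crate, which is already a dependency; note that depending on the actix-rt version, rt::spawn may not return a JoinHandle at all, so the channel approach also sidesteps that:

use actix_multipart::Multipart;
use actix_web::{post, HttpResponse};
use futures::channel::oneshot;
use futures::{StreamExt, TryStreamExt};

#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
    let (tx, rx) = oneshot::channel::<()>();
    actix_web::rt::spawn(async move {
        while let Ok(Some(mut field)) = payload.try_next().await {
            while let Some(chunk) = field.next().await {
                let _data = chunk.unwrap();
                println!("Read a chunk.");
            }
        }
        println!("Done");
        let _ = tx.send(()); // signal completion to the handler
    });
    // Awaiting the receiver ties the response to the spawned task, so the
    // request (and its payload stream) stays alive until reading finishes.
    let _ = rx.await;
    HttpResponse::Ok().finish()
}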
As for your more general goal, the work is already being performed asynchronously! Most likely, actix is doing roughly what you tried to do in your second example: upon receiving a request, it spawns a task to handle the request and polls it repeatedly (as well as the other active requests) until it completes, then sends a response.

Related

How to send large custom struct over HTTP in Rust lang using reqwest, tokio and actix_web

Issue
I have a client that needs to send the following custom data structure to an API:
#[derive(Serialize, Deserialize)]
pub struct FheSum {
    pub server_keys: ServerKey,
    pub value1: FheUint8,
    pub value2: FheUint8,
}
The code for the client is the following:
let fhe_post: FheSum = FheSum {
    server_keys: server_keys.to_owned(),
    value1: value_api.to_owned(),
    value2: value_api_2.to_owned(),
};
let client = reqwest::blocking::Client::builder()
    .timeout(None)
    .build().unwrap();
let response = client
    .post("http://127.0.0.1:8000/computesum")
    .json(&fhe_post)
    .send().unwrap();
let response_json: Result<FheSumResult, reqwest::Error> = response.json();
match response_json {
    Ok(j) => {
        let result_api: u8 = FheUint8::decrypt(&j.result, &client_keys);
        println!("Final Result: {}", result_api)
    },
    Err(e) => println!("{:}", e),
};
In the API, I have the following definition of an HttpServer:
HttpServer::new(|| {
    let json_cfg = actix_web::web::JsonConfig::default()
        .limit(std::usize::MAX);
    App::new()
        .app_data(json_cfg)
        .service(integers::computesum)
})
.client_disconnect_timeout(std::time::Duration::from_secs(3000))
.client_request_timeout(std::time::Duration::from_secs(3000))
.max_connection_rate(std::usize::MAX)
.bind(("127.0.0.1", 8000))?
.run()
.await
And the associated endpoint the client is trying to access:
#[post("/computesum")]
pub async fn computesum(req: Json<FheSum>) -> HttpResponse {
let req: FheSum = req.into_inner();
let recovered: FheSum = FheSum::new(
req.server_keys,
req.value1,
req.value2,
);
set_server_key(recovered.server_keys);
let result_api_enc: FheSumResult = FheSumResult::new(recovered.value1 + recovered.value2);
HttpResponse::Ok()
.content_type(ContentType::json())
.json(&result_api_enc)
}
Problem
The structs are the same in both the client and the server. This code works when using common data types such as Strings; the issue arises with these data structures. The memory they occupy, obtained with mem::size_of_val (which returns a size in bytes), is the following:
Size 1: 2488
Size 2: 32
Size 3: 32
The results are in bytes, so, given the limit established in the HttpServer, this shouldn't be an issue. Timeouts have also been set to much higher values than commonly needed.
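(Note that mem::size_of_val reports only the shallow, stack-level size of a value, not any heap data behind it, so these numbers may understate the serialized payload. A quick illustration:)

// size_of_val sees only the Vec's (pointer, length, capacity) header --
// 24 bytes on a 64-bit target -- regardless of how much heap data it owns.
let v: Vec<u8> = vec![0u8; 1_000_000];
assert_eq!(std::mem::size_of_val(&v), 24);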
Even with these changes, the client always shows Killed and never displays the answer from the server, giving no clues about what the problem might be.
The client process is killed before it is able to process the server's response. I want to find a way to send these custom data types to the HTTP server without the connection closing before the operation has finished.
I have already tried different client libraries, such as the acw crate, apart from reqwest, and the result is the same. I have also tried using reqwest in non-blocking mode, and the error persists.

Problem with handling async inside of an actix-rust actor

I'm currently checking out Actix, a Rust-based actor framework. I'm also using Actix Web to build a REST API. I'm familiar with actor-based architecture from working with Akka; however, I'm not able to start a simple async task inside my handler.
Simplified, I have the following code:
#[post("/upload")]
pub async fn upload_images(
app_config: web::Data<AppConfig>,
mut payload: Multipart,
) -> Result<HttpResponse> {
... transforms the multipart form into images...
for img in img_vec {
app_config.image_processor_addr.do_send(ResizeImage{
img_id: img._id,
img_format: img.format,
image_buffer: img.image.bytes,
});
};
Ok(HttpResponse::Ok().content_type(ContentType::plaintext()).body(format!("Inserted {} images.", vec_len)))
}
As you can see, I receive a multipart upload which consists of images, which I then send to an image processing actor to perform a resize on the images.
And this is the simplified code for the handling of the message ResizeImage for the ImageProcessor actor:
impl Handler<ResizeImage> for ImageProcessor {
    type Result = ();

    fn handle(&mut self, msg: ResizeImage, _: &mut Self::Context) -> Self::Result {
        let thumbnail_col = self.thumbnail_col.clone();
        let img_col = self.img_col.clone();
        let img_format: ImageFormat = msg.img_format.clone().into();
        log::info!("Parsing image {} with actor {}.", msg.img_id, self.id);
        let actor_task_fut = Box::pin(async move {
            // ... parses the image here ...
        });
        match Arbiter::current().spawn(actor_task_fut) {
            true => log::info!("Sent task to arbiter."),
            false => log::error!("Failed to send task to arbiter!"),
        }
    }
}
The idea is that I resolve the web handler immediately, and the resize work is done asynchronously on the actor thread. This works on the first call, but when I call the same endpoint before all the images from the previous call have been processed, it doesn't resolve immediately; it waits until the actor has resized the previous batch.
I was under the impression that the messages would be sent to the actor's mailbox and that the handler code would not need to wait for anything, since I'm using do_send, which the documentation states does not await the answer. With Akka I can easily do something similar, and it seems to work. Am I missing something here? Is the way I'm handling async inside the actor thread wrong?

Rust Async Drop

I'm facing a scenario where I need to run async code from the drop handler of an object. The whole application runs in a tokio async context, so I know that the drop handler is called with an active tokio Runtime, but unfortunately drop itself is a sync function.
Ideally, I'd like a solution that works on both multi-thread and current-thread runtimes, but if that doesn't exist, then I'm ok with a solution that blocks the dropping thread and relies on other threads to drive the futures.
I considered multiple options, but I'm not sure which approach is best and I don't understand much about their trade-offs. For these examples, let's assume my class has an async terminate(&mut self) function that I would like to be called from drop().
struct MyClass;

impl MyClass {
    async fn terminate(&mut self) {}
}
Option 1: tokio::runtime::Handle::block_on
impl Drop for MyClass {
    fn drop(&mut self) {
        tokio::runtime::Handle::current().block_on(self.terminate());
    }
}
This seems to be the most straightforward approach, but unfortunately it panics with
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
see playground
I'm a bit confused by this since I thought Handle::block_on would use the currently running runtime but it seems this tries to start a new runtime? What is going on here?
Also, according to the documentation of Handle::block_on, this cannot drive IO threads. So I guess blocking this thread is a risk - if too many objects are destructed at the same time, each blocking a thread, and those futures wait for IO work, then this will deadlock.
Option 2: futures::executor::block_on
impl Drop for MyClass {
    fn drop(&mut self) {
        futures::executor::block_on(self.terminate());
    }
}
see playground
This seems to work fine. If I understand this correctly, then it spawns a new non-tokio executor on the current thread and has that thread drive the future. Is this an issue? Does this cause conflicts between the running tokio executor and the new futures executor?
Also, can this actually drive IO threads, avoiding the issue of option 1? Or can it happen that those IO threads are still waiting on the tokio executor?
Option 3: tokio::task::spawn with futures::executor::block_on
impl Drop for MyClass {
    fn drop(&mut self) {
        let task = tokio::task::spawn(self.terminate());
        futures::executor::block_on(task);
    }
}
see playground
This should have the tokio runtime drive the termination future, while the futures executor only blocks the current thread until the tokio runtime has finished? Is this safer than option 2, causing fewer conflicts between the runtimes? Unfortunately, this ran into a lifetime issue I couldn't figure out:
error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
 --> src/main.rs:8:44
  |
7 |     fn drop(&mut self) {
  |             --------- this data with an anonymous lifetime `'_`...
8 |         let task = tokio::task::spawn(self.terminate());
  |                                       ---- ^^^^^^^^^
  |                                       |
  |                                       ...is used here...
  |
note: ...and is required to live as long as `'static` here
 --> src/main.rs:8:20
  |
8 |         let task = tokio::task::spawn(self.terminate());
  |                    ^^^^^^^^^^^^^^^^^^
note: `'static` lifetime requirement introduced by this bound
 --> /playground/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.17.0/src/task/spawn.rs:127:28
  |
127 |     T: Future + Send + 'static,
  |                        ^^^^^^^
I also tried to fix this with LocalSet but couldn't get it to work. Any way to make this work?
Option 3b
I was, however, able to make it work if I make terminate() take self by value and wrap MyClass into a Wrapper. Not pretty but maybe better than Option 2 because it uses the tokio runtime to drive the future?
struct MyClass;

impl MyClass {
    async fn terminate(self) {}
}

struct Wrapper(Option<MyClass>);

impl Drop for Wrapper {
    fn drop(&mut self) {
        if let Some(v) = self.0.take() {
            let task = tokio::task::spawn(v.terminate());
            futures::executor::block_on(task).unwrap();
        }
    }
}
see playground
Is this a good approach? Is it actually important that the tokio runtime drives the drop future or is the simpler Option 2 better? Any ways to make option 3b prettier / easier to use?
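One possible answer to the last question, as a sketch only (DropGuard and its API are illustrative names, not an existing library type): the wrapper can be made generic over the value and its async teardown, so the Option 3b pattern becomes reusable:

use std::future::Future;

// Hypothetical generic wrapper: owns a value and an async teardown function,
// and runs the teardown on the tokio runtime when dropped (Option 3b, generalized).
struct DropGuard<T, F, Fut>
where
    T: Send + 'static,
    F: FnOnce(T) -> Fut,
    Fut: Future<Output = ()> + Send + 'static,
{
    value: Option<T>,
    teardown: Option<F>,
}

impl<T, F, Fut> DropGuard<T, F, Fut>
where
    T: Send + 'static,
    F: FnOnce(T) -> Fut,
    Fut: Future<Output = ()> + Send + 'static,
{
    fn new(value: T, teardown: F) -> Self {
        Self { value: Some(value), teardown: Some(teardown) }
    }
}

impl<T, F, Fut> Drop for DropGuard<T, F, Fut>
where
    T: Send + 'static,
    F: FnOnce(T) -> Fut,
    Fut: Future<Output = ()> + Send + 'static,
{
    fn drop(&mut self) {
        if let (Some(v), Some(f)) = (self.value.take(), self.teardown.take()) {
            // Spawn on tokio so the runtime drives the future; block this
            // thread only until that spawned task completes.
            let task = tokio::task::spawn(f(v));
            let _ = futures::executor::block_on(task);
        }
    }
}

// Usage: let guard = DropGuard::new(MyClass, |c| c.terminate());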
Option 4: Background Task
I found this option here: https://stackoverflow.com/a/68851788/829568
It basically spawns a background task in the constructor of the object that waits for a trigger and runs async drop code when triggered. The drop implementation then triggers it and runs a busy waiting loop until it is finished.
This seems overly complex and also more error-prone than the other options here. Or is this actually the best solution?
Side question on exhausting worker threads
Except for option 1, all of these options block a tokio worker thread to wait for the async drop to complete. In a multi threaded runtime, this will go well most of the time, but could in theory lock up all worker threads if multiple destructors run in parallel - and IIUC then we would have a deadlock with no thread making progress. Option 1 seems somewhat better but the block_on documentation says it can only drive non-IO futures. So it could still lock up if too many destructors do IO work. Is there a way to tell tokio to increase the number of worker threads by one? If we do that for each thread we block, would that avoid this issue?
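One mitigation worth sketching for that side question (assuming a multi-thread runtime): tokio::task::block_in_place hands the current worker's queued tasks to other workers before blocking, so the blocked worker does not starve the runtime, and Handle::block_on no longer panics inside it. A minimal sketch reusing MyClass from above:

impl Drop for MyClass {
    fn drop(&mut self) {
        // Only valid on the multi-thread runtime; panics on current_thread.
        tokio::task::block_in_place(|| {
            // Inside block_in_place, this no longer fails with
            // "Cannot start a runtime from within a runtime".
            tokio::runtime::Handle::current().block_on(self.terminate());
        });
    }
}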
Option 5: new runtime in new thread
impl Drop for MyClass {
    fn drop(&mut self) {
        std::thread::scope(|s| {
            s.spawn(|| {
                let runtime = tokio::runtime::Builder::new_multi_thread()
                    .build()
                    .unwrap();
                runtime.block_on(self.terminate());
            });
        });
    }
}
see playground
This seems to work and attempts to avoid the issue of blocking worker threads by running the drop task on a new runtime in a new thread. This new thread should, hopefully, be able to drive IO tasks. But does this actually fully solve the problem? What if the drop task depends on an IO task that is running on the main tokio executor? I think this may still have a chance of causing the program to lock up indefinitely.
If you want to "do something", without exclusive mutable access to MyClass, maybe using oneshot channels to trigger async compute would work? Somewhat similar to the option #4.
You can send some extra state through the channel too.
use std::time::Duration;
use tokio::{
    runtime::Runtime,
    sync::oneshot::{self, Receiver, Sender},
    time::interval,
};

struct MyClass {
    tx: Option<Sender<()>>, // can have SomeStruct instead of ()
    // my_state: Option<SomeStruct>
}

impl MyClass {
    pub async fn new() -> Self {
        println!("MyClass::new()");
        let (tx, mut rx) = oneshot::channel();
        tokio::task::spawn(async move {
            let mut interval = interval(Duration::from_millis(100));
            println!("drop wait loop starting...");
            loop {
                tokio::select! {
                    _ = interval.tick() => println!("Another 100ms"),
                    msg = &mut rx => {
                        println!("should process drop here");
                        break;
                    }
                }
            }
        });
        Self { tx: Some(tx) }
    }
}

impl Drop for MyClass {
    fn drop(&mut self) {
        println!("drop()");
        self.tx.take().unwrap().send(()).unwrap();
        // self.tx.take().unwrap().send(self.my_state.take().unwrap()).unwrap();
    }
}

#[tokio::main]
async fn main() {
    let class = MyClass::new().await;
}
This prints most of the time:
MyClass::new()
drop()
drop wait loop starting...
should process drop here
Sometimes the process exits before the receiving side task gets a chance to spawn, but if your code does not exit immediately, it should be fine.
The select! on interval.tick is only needed if you want to do periodic work while waiting; the oneshot Receiver itself implements Future, so the task can simply await it, as in the sketch below.
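A minimal variant along those lines, without the interval:

use tokio::sync::oneshot;

struct MyClass {
    tx: Option<oneshot::Sender<()>>,
}

impl MyClass {
    pub async fn new() -> Self {
        let (tx, rx) = oneshot::channel();
        tokio::task::spawn(async move {
            // Receiver implements Future: resolves when drop() sends,
            // or errs if the sender is dropped without sending.
            let _ = rx.await;
            println!("should process drop here");
        });
        Self { tx: Some(tx) }
    }
}

impl Drop for MyClass {
    fn drop(&mut self) {
        if let Some(tx) = self.tx.take() {
            let _ = tx.send(());
        }
    }
}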

In Firebase and Kotlin, in case of no network connection is there an easy way to handle endless network request looping?

In the case of network connectivity loss, the following code just loops endlessly and keeps making API calls. Is there a way to cancel with a timeout (for example, 5000 ms) using the Firebase API? Or would I have to write my own coroutine to handle this?
fun updateUserFieldInDB(
    collectionPath: String,
    strArr: ArrayList<String>,
    onSuccess: (() -> Unit),
    onFail: (() -> Unit)
) {
    val fbUser = Firebase.auth.currentUser
    if (fbUser == null) {
        Log.i(TAG, "user is null....")
        return
    }
    val db = Firebase.firestore
    when (strArr.size) {
        2 -> {
            db.collection(collectionPath).document(fbUser.uid).update(strArr[0], strArr[1])
                .addOnSuccessListener {
                    onSuccess()
                }
                .addOnFailureListener {
                    onFail()
                }
        }
    }
}
The onSuccess and onFail completion handlers for Firestore only fire once the write operation has been committed or rejected on the server. You should only use them if you're interested in detecting that situation, in which case the looping is to be expected.
If you only care whether the write operation was recorded by the Firestore client (in its local cache), the best way to detect that is when the update(strArr[0], strArr[1]) call completes.
So pretty much: when the next line of code executes, the write has been recorded locally; when the completion listeners fire, the write has been handled on the server.

How to borrow/avoid a move of a socket in tokio::spawn(async

I am trying to write a UDP client in Rust which establishes a socket connection to a remote server, listens for incoming messages (and processes the data), is also able to send messages, and then disconnects after a given time. I would like to use the new async/await syntax in tokio and spawn a task that takes care of reading and processing incoming messages, while keeping the socket in the main task to send messages in parallel, and especially to run the protocol that closes the connection at the end.
How can I avoid moving the socket into the spawned task? Is there a way to borrow it in that task, maybe through a reference? I looked through answers to similar questions but could not understand them, as they apply to the version of tokio without the new syntax, and I am an absolute beginner in Rust.
I can move the socket into the spawned function, but then it is of course no longer available to the code outside, which needs to send messages in parallel.
#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let remote_addr: SocketAddr = "...:xxxx".parse()?;
    let local_addr = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(0, 0, 0, 0)), 0);
    let mut socket = UdpSocket::bind(&local_addr)?;
    socket.connect(&remote_addr)?;
    // do some protocol work with the socket to establish a connection
    tokio::spawn(async move {
        let mut buf = [0; 1024];
        loop {
            let l = match socket.recv(&mut buf).await {
                // socket closed
                Ok(l) if l == 0 => {
                    println!("socket closed");
                    return;
                }
                Ok(l) => l,
                Err(e) => {
                    println!("failed to read from socket; err = {:?}", e);
                    return;
                }
            };
            let data = buf[..l].to_vec();
            println!("Received {} bytes:\n{:#x?}", l, data);
        }
    });
    // here I would like to use the socket again to send messages and to do the disconnect protocol, i.e.
    let len = socket.recv(.....
When I use the socket afterwards, I get the error that the variable moved due to use in the generator and is dropped at the end of the spawned task (which it should not be). The later use of the socket reports value borrowed after move, which is clear, but how can I avoid it?
I would appreciate if somebody could help me with this beginner question, especially in the context of the new async/await syntax of tokio. Thanks!
Well, I solved the problem in a different way. Tokio's UdpSocket can be split into a receiving and a sending part. I run both of them in separate tasks and then use several tokio::sync::mpsc channels to communicate between the two tasks and the main task.
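For completeness, a minimal sketch of that approach, assuming the tokio 0.2-era API where UdpSocket::split() yields a RecvHalf and a SendHalf (the addresses and the message are placeholders):

use std::error::Error;
use tokio::net::UdpSocket;
use tokio::sync::mpsc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn Error>> {
    let socket = UdpSocket::bind("0.0.0.0:0").await?;
    socket.connect("127.0.0.1:8080").await?;
    let (mut recv_half, mut send_half) = socket.split();

    // Receiving task: forwards each datagram to the main task over a channel.
    let (mut tx, mut rx) = mpsc::channel::<Vec<u8>>(32);
    tokio::spawn(async move {
        let mut buf = [0u8; 1024];
        loop {
            match recv_half.recv(&mut buf).await {
                Ok(l) => {
                    if tx.send(buf[..l].to_vec()).await.is_err() {
                        return; // main task hung up
                    }
                }
                Err(e) => {
                    println!("failed to read from socket; err = {:?}", e);
                    return;
                }
            }
        }
    });

    // The send half stays available here for sending and the disconnect protocol.
    send_half.send(b"hello").await?;
    if let Some(data) = rx.recv().await {
        println!("Received {} bytes:\n{:#x?}", data.len(), data);
    }
    Ok(())
}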
