Problem with handling async inside of an actix-rust actor - asynchronous

Currently I'm checking out Actix, a Rust based actor framework. I'm also using Actix web to build a REST API. Now, I'm familiar with actor based architecture from working with Akka, however, I'm not being able to start a simple async task inside of my handler.
It's simplified, but I have the following code:
#[post("/upload")]
pub async fn upload_images(
app_config: web::Data<AppConfig>,
mut payload: Multipart,
) -> Result<HttpResponse> {
... transforms the multipart form into images...
for img in img_vec {
app_config.image_processor_addr.do_send(ResizeImage{
img_id: img._id,
img_format: img.format,
image_buffer: img.image.bytes,
});
};
Ok(HttpResponse::Ok().content_type(ContentType::plaintext()).body(format!("Inserted {} images.", vec_len)))
}
As you can see, I receive a multipart upload which consists of images, which I then send to an image processing actor to perform a resize on the images.
And this is the simplified code for the handling of the message ResizeImage for the ImageProcessor actor:
impl Handler<ResizeImage> for ImageProcessor {
type Result = ();
fn handle(&mut self, msg: ResizeImage, _: &mut Self::Context) -> Self::Result {
let thumbnail_col = self.thumbnail_col.clone();
let img_col = self.img_col.clone();
let img_format: ImageFormat = msg.img_format.clone().into();
log::info!("Parsing image {} with actor {}.", msg.img_id, self.id);
let actor_task_fut = Box::pin(async move {
... parses the image here...
});
match Arbiter::current().spawn(actor_task_fut) {
true => log::info!("Sent task to arbiter."),
false => log::error!("Failed to send task to arbiter!"),
}
}
}
The idea is that I would resolve the web handler, and the resize task would be done async on the actor thread. However, this works on the first call, but when I call the same endpoint before all the images from the previous call are parsed, it doesn't resolve immediately, it waits till the actor has resized the previous batch.
I was under the impression that the messages would be sent to the actor mailbox and then the handler code would not need to wait for anything, since I'm using do_sent, which the documentation states that it doesn't await for the answer. Using Akka I can easily do something similar, and it seems to work. Am I missing something here? Is the way I'm handling async inside the actor thread wrong?

Related

How to send large custom struct over HTTP in Rust lang using reqwest, tokio and actix_web

Issue
I have a client that needs to send the following custom data structure to an API:
#[derive(Serialize, Deserialize)]
pub struct FheSum {
pub server_keys: ServerKey,
pub value1: FheUint8,
pub value2: FheUint8,
}
The code for the client is the following:
let fhe_post: FheSum = FheSum {
server_keys: server_keys.to_owned(),
value1: value_api.to_owned(),
value2: value_api_2.to_owned(),
};
let client = reqwest::blocking::Client::builder()
.timeout(None)
.build().unwrap();
let response = client
.post("http://127.0.0.1:8000/computesum")
.json(&fhe_post)
.send().unwrap();
let response_json: Result<FheSumResult, reqwest::Error> = response.json();
match response_json {
Ok(j) => {
let result_api: u8 = FheUint8::decrypt(&j.result, &client_keys);
println!("Final Result: {}", result_api)
},
Err(e) => println!("{:}", e),
};
In the API, I have the following definition of an HttpServer:
HttpServer::new(|| {
let json_cfg = actix_web::web::JsonConfig::default()
.limit(std::usize::MAX);
App::new()
.app_data(json_cfg)
.service(integers::computesum)
})
.client_disconnect_timeout(std::time::Duration::from_secs(3000))
.client_request_timeout(std::time::Duration::from_secs(3000))
.max_connection_rate(std::usize::MAX)
.bind(("127.0.0.1", 8000))?
.run()
.await
And the associated endpoint the client is trying to access:
#[post("/computesum")]
pub async fn computesum(req: Json<FheSum>) -> HttpResponse {
let req: FheSum = req.into_inner();
let recovered: FheSum = FheSum::new(
req.server_keys,
req.value1,
req.value2,
);
set_server_key(recovered.server_keys);
let result_api_enc: FheSumResult = FheSumResult::new(recovered.value1 + recovered.value2);
HttpResponse::Ok()
.content_type(ContentType::json())
.json(&result_api_enc)
}
Problem
The structs are the same in both the client and the server. This code works when using common data types such as Strings. The issue is when using this data structures. The memory occupied, obtained with mem::size_of_val which returns the size in bytes, is the following:
Size 1: 2488
Size 2: 32
Size 3: 32
The result has been obtained in bytes, so, given the limit established in the HttpServer, this shouldn't be an issue. Timeouts have also been set at much higher values than commonly needed.
Even with this changes, the client always shows Killed, and doesn't display the answer from the server, not giving any clues on what the problem might be.
The client is killing the process before being able to process the server's response. I want to find a way to send these custom data types to the HTTP server without the connection closing before the operation has finished.
I have already tried different libraries for the client such as the acw crate, apart from reqwest and the result is the same. I have also tried not using reqwest in blocking mode, and the error persists.

In Firebase and Kotlin, in case of no network connection is there an easy way to handle endless network request looping?

In the case of network connectivity loss, the following code just loops endlessly and keeps making API calls. Is there a way to cancel with a timeout (for example, 5000 ms) using Firebase API? Or would I have to make my own Coroutine to handle this?
fun updateUserFieldInDB(
collectionPath: String,
strArr: ArrayList<String>,
onSuccess: (() -> Unit),
onFail: (() -> Unit)
) {
val fbUser = Firebase.auth.currentUser
if (fbUser == null) {
Log.i(TAG, "user is null....")
return
}
val db = Firebase.firestore
when (strArr.size) {
2 -> {
db.collection(collectionPath).document(fbUser.uid).update(strArr[0], strArr[1])
.addOnSuccessListener {
onSuccess()
}
.addOnFailureListener {
onFail()
}
}
}
}
The onSuccess ad onFail completion handlers for Firestore only fire once the write operation has been committed or rejected on the server. You should only use them if you're interested in detecting that situation, in which case the looping is to be expected.
If you only care whether the write operation was recorded by the Firestore client (in its local cache), the best way to detect that is when the update(strArr[0], strArr[1]) call completes.
So pretty much: when the next line of code executes, the write has been recorded locally; when the completion listeners fire, the write has been handled on the server.

Spawn reading data from multipart in actix-web

I tried the example of actix-multipart with actix-web v3.3.2 and actix-multipart v0.3.0.
For a minimal example,
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};
#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
while let Ok(Some(mut field)) = payload.try_next().await {
let content_type = field.content_disposition().unwrap();
let filename = content_type.get_filename().unwrap();
println!("filename = {}", filename);
while let Some(chunk) = field.next().await {
let data = chunk.unwrap();
println!("Read a chunk.");
}
println!("Done");
}
HttpResponse::Ok().finish()
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().service(save_file))
.bind("0.0.0.0:8080")?
.run()
.await
}
This works well, but I want to do with form data asynchronously. So I tried instead:
use actix_multipart::Multipart;
use actix_web::{post, web, App, HttpResponse, HttpServer};
use futures::{StreamExt, TryStreamExt};
#[post("/")]
async fn save_file(mut payload: Multipart) -> HttpResponse {
actix_web::rt::spawn(async move {
while let Ok(Some(mut field)) = payload.try_next().await {
let content_type = field.content_disposition().unwrap();
let filename = content_type.get_filename().unwrap();
println!("filename = {}", filename);
while let Some(chunk) = field.next().await {
let data = chunk.unwrap();
println!("Read a chunk.");
}
println!("Done");
}
});
HttpResponse::Ok().finish()
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
HttpServer::new(|| App::new().service(save_file))
.bind("0.0.0.0:8080")?
.run()
.await
}
(Added actix_web::rt::spawn to save_file.)
But this did not work -- the message "Done" never printed. The number of "Read a chunk" displayed in the second case was less than the first case, so I guess that field.next().await cannot terminate for some reason before completing reading all data.
I do not know so much about asynchronous programming, so I am not sure why field.next() did not work in actix_web::rt::spawn.
My question are: why is it, and how can I do with actix_web::rt::spawn?
When you make this call:
actix_web::rt::spawn(async move {
// do things...
});
spawn returns a JoinHandle which is used to poll the task. When you drop that handle (by not binding it to anything), the task is "detached", i.e., it runs in the background.
The actix documentation is not particularly helpful here, but actix uses the tokio runtime under the hood. A key issue is that in tokio, spawned tasks are not guaranteed to complete. The executor needs to know, somehow, that it should perform work on that future. In your second example, the spawned task is never .awaited, nor does it communicate with any other task via channels.
Most likely, the spawned task is never polled and does not make any progress. In order to ensure that it completes, you can either .await the JoinHandle (which will drive the task to completion) or .await some other Future that depends on work in the spawned task (usually by using a channel).
As for your more general goal, the work is already being performed asynchronously! Most likely, actix is doing roughly what you tried to do in your second example: upon receiving a request, it spawns a task to handle the request and polls it repeatedly (as well as the other active requests) until it completes, then sends a response.

Always return Ok HttpResponse then do work in actix-web handler

I have a handler to initiate a password reset. It always returns a successful 200 status code, so that an attacker cannot use it to find out which email addresses are stored in the database. The problem is, if an email is in the database, it'll take a while for the request to be fulfilled (blocking user lookup and sending the actual email with a reset token). If the user is not in the db, the request returns very quickly, so an attacked would know the email is not there.
How would I go about returning the HTTP response right away while processing the request in the background?
pub async fn forgot_password_handler(
email_from_path: web::Path<String>,
pool: web::Data<Pool>,
redis_client: web::Data<redis::Client>,
) -> HttpResponse {
let conn: &PgConnection = &pool.get().unwrap();
let email_address = &email_from_path.into_inner();
// search for user with email address in users table
match users.filter(email.eq(email_address)).first::<User>(conn) {
Ok(user) => {
// some stuff omitted.. this is what happens:
// create random token for user and store a hash of it in redis (it'll expire after some time)
// send email with password reset link and token (not hashed) to client
// then return with
HttpResponse::Ok().finish(),
}
_ => HttpResponse::Ok().finish(),
}
}
You can use an Actix Arbiter to schedule an asynchronous task:
use actix::Arbiter;
async fn do_the_database_stuff(
email: String,
pool: web::Data<Pool>,
redis_client: web::Data<redis::Client>)
{
// async database code here
}
pub async fn forgot_password_handler(
email_from_path: web::Path<String>,
pool: web::Data<Pool>,
redis_client: web::Data<redis::Client>,
) -> HttpResponse {
let email = email_from_path.clone();
Arbiter::spawn(async {
do_the_database_stuff(
email,
pool,
redis_client
);
});
HttpResponse::Ok().finish()
}
If your database code is blocking, to prevent hogging the long-lived Actix worker threads, you could instead create a new Arbiter, with its own thread:
fn do_the_database_stuff(email: String) {
// blocking database code here
}
pub async fn forgot_password_handler(email_from_path: String) -> HttpResponse {
let email = email_from_path.clone();
Arbiter::new().exec_fn(move || {
async move {
do_the_database_stuff(email).await;
};
});
HttpResponse::Ok().finish()
}
This may be a bit more work because Pool and redis::Client are unlikely to be safe to share between threads, so you will have to solve that too. That's why I didn't include them in the example code.
It's better to use Arbiters than be tempted to spawn a new native thread with std::thread. If you mix the two, you can end up accidentally including code that messes up the worker. For example using std::thread::sleep in an async context would pause unrelated tasks that just happen to be scheduled on the same worker, and may not even have any effect on the task you intended.
Finally, you might also consider an architectural change. If you factor database-heavy tasks into their own microservices, you would solve this problem automatically. The web handler can then just send a message (Kafka, RabbitMQ, ZMQ, HTTP, or whatever you choose) and immediately return. This will let you scale the microservices independently of the webserver - 10x web server instances doesn't have to mean 10x database connections, if you only need one instance for the password reset service.

Using the Saturn Framework, how can I get a reference to the Websockets hub outside of a particular request?

I'm building an application for a toy problem to learn more about SAFE. I have some background processes running server-side and occasionally they need to send a message unprompted to the connected clients. This means that I need a reference to the SocketHub from outside of any particular request.
Currently I have a mutable variable which I pass a value to when the Channel is joined:
let mainChannel = channel {
join (fun ctx socketId ->
task {
printfn "Connected! Main Socket Id: %O" socketId
let hub = ctx.GetService<Channels.ISocketHub>()
webSocketHub <- Some hub // Passing the reference to a mutable variable
task {
do! Task.Delay 500
let m = (socketId |> (SetChannelSocketId >> GameData))
do! (harderSendMessage socketId "message" m "Problem sending SocketId")
} |> ignore
return Channels.Ok })
}
However, it seems to me like there should be a better way to get access to the hub - I just can't figure it out.

Resources