In Rust, what is gained from making a function 'async'? - asynchronous

I know that an async function will return a Future, and that calling .await on that future will repeatedly call poll() on it until it completes, but I don't understand the difference between that and a regular function.
Does it give any benefit to make a function async if it already works without it?
I have two versions of code that do the same thing. Consider the "experiment" functions as main functions running with #[tokio::main].
One version calls a regular function to create code that can run asynchronously with tokio::spawn() (playground link, though it seems to not show the output well):
pub async fn experiment1() {
spawn_task(1);
spawn_task(2);
loop {
println!("3");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
fn spawn_task(n: i32) {
tokio::spawn(async move {
loop {
println!("{n}");
tokio::time::sleep(Duration::from_secs(1)).await;
}
});
}
The output of this is below (3 numbers print out every second, this is 3 seconds worth of output). Each loops runs concurrently.
3
1
2
3
2
1
3
1
2
The other version uses async and await (playground link, though it seems to not show the output well):
pub async fn experiment3() {
spawn_task(1).await;
spawn_task(2).await;
loop {
println!("3");
tokio::time::sleep(Duration::from_secs(1)).await;
}
}
async fn spawn_task(n: i32) {
tokio::spawn(async move {
loop {
println!("{n}");
tokio::time::sleep(Duration::from_secs(1)).await;
}
});
}
The output of this is below (3 numbers print out every second, this is 3 seconds worth of output). Each loops also runs concurrently.
3
1
2
3
1
2
3
2
1

When you only run synchronous code like in your examples the only
difference is that you change the return type from plain T to
impl Future<Output = T>. You might want to use this to use otherwise
synchronous functions in places where an asynchronous function is
expected.
The real benefit comes in when you actually use asynchronous code
inside of it:
fn fun() {
// this is not allowed because we're not inside an async block or function
tokio::time::sleep(Duration::from_secs(1)).await;
}
The above function is not allowed, to make it work we have to declare
it as an async function:
async fn fun() {
// we can use await here
tokio::time::sleep(Duration::from_secs(1)).await;
}

Related

How to run multiple intervals in tauri with tokio

currently, I am building a small application with Rust and Tauri but I've got the following issue that I need to solve:
Things that I want to do simultaneously:
Checking every 10 sec if a specific application is running
Polling every second data from SharedMemory via winapi
Both of them are working fine but I tried to refactor stuff and now I've got the following problem:
When my frontend sends me an event that the application is ready (or inside .on_page_load() I want to start both processes I mentioned before:
#[tauri::command]
async fn app_ready(window: tauri::Window) {
let is_alive = false; // I think this needs to be a mutex or a mutex that is wrapped around Arc::new()
tokio::join!(
poll_acc_process(&window, &is_alive),
handle_phycics(&window, &is_alive),
);
}
Visual Studio Code is complaining about the following stuff: future cannot be sent between threads safely within impl futures::Future<Output = ()>, the trait std::marker::Send is not implemented for *mut c_void
c_void is the handle of CreateFileMappingW from the winapi crate.
async fn poll_acc_process(window: &Window, is_alive: &bool) {
loop {
window.emit("acc_process", is_alive).unwrap();
tokio::time::sleep(time::Duration::from_secs(10));
}
}
async fn handle_phycics(window: &Window, is_alive: &bool) {
while is_alive {
let s_handle = get_statics_mapped_file(); // _handle represents c_void here
let s_memory = get_statics_mapview_of_file(s_handle);
window
.emit("update_statistics", Statics::new_from_memory(s_memory))
.unwrap();
let p_handle = get_physics_mapped_file(); // _handle represents c_void here
let physics = get_physics_mapview_of_file(p_handle);
window.emit("update_physics", physics).unwrap();
if physics.current_max_rpm != 0 {
let g_handle = get_graphics_mapped_file(); // _handle represents c_void here
let g_memory = get_graphics_mapview_of_file(g_handle);
window
.emit("update_graphics", Graphics::new_from_mem(g_memory))
.unwrap();
}
tokio::time::sleep(time::Duration::from_secs(1)).await;
}
}
Is it possible to solve my problem somehow this way or should I try another approach?

Techniques for turning recursive functions into iterators in Rust?

I'm struggling to turn a simple recursive function into a simple iterator. The problem is that the recursive function maintains state in its local variables and call stack -- and to turn this into a rust iterator means basically externalizing all the function state into mutable properties on some custom iterator struct. It's quite a messy endeavor.
In a language like javascript or python, yield comes to the rescue. Are there any techniques in Rust to help manage this complexity?
Simple example using yield (pseudocode):
function one_level(state, depth, max_depth) {
if depth == max_depth {
return
}
for s in next_states_from(state) {
yield state_to_value(s);
yield one_level(s, depth+1, max_depth);
}
}
To make something similar work in Rust, I'm basically creating a Vec<Vec<State>> on my iterator struct, to reflect the data returned by next_states_from at each level of the call stack. Then for each next() invocation, carefully popping pieces off of this to restore state. I feel like I may be missing something.
You are performing a (depth-limited) depth-first search on your state graph. You can do it iteratively by using a single stack of unprocessed subtrees(depending on your state graph structure).
struct Iter {
stack: Vec<(State, u32)>,
max_depth: u32,
}
impl Iter {
fn new(root: State, max_depth: u32) -> Self {
Self {
stack: vec![(root, 0)],
max_depth
}
}
}
impl Iterator for Iter {
type Item = u32; // return type of state_to_value
fn next(&mut self) -> Option<Self::Item> {
let (state, depth) = self.stack.pop()?;
if depth < self.max_depth {
for s in next_states_from(state) {
self.stack.push((s, depth+1));
}
}
return Some(state_to_value(state));
}
}
There are some slight differences to your code:
The iterator yields the value of the root element, while your version does not. This can be easily fixed using .skip(1)
Children are processed in right-to-left order (reversed from the result of next_states_from). Otherwise, you will need to reverse the order of pushing the next states (depending on the result type of next_states_from you can just use .rev(), otherwise you will need a temporary)

How can I join all the futures in a vector without cancelling on failure like join_all does?

I've got a Vec of futures created from calling an async function. After adding all the futures to the vector, I'd like to wait on the whole set, getting either a list of results, or a callback for each one that finishes.
I could simply loop or iterate over the vector of futures and call .await on each future and that would allow me to handle errors correctly and not have futures::future::join_all cancel the others, but I'm sure there's a more idiomatic way to accomplish this task.
I also want to be able to handle the futures as they complete so if I get enough information from the first few, I can cancel the remaining incomplete futures and not wait for them and discard their results, error or not. This would not be possible if I iterated over the vector in order.
What I'm looking for is a callback (closure, etc) that lets me accumulate the results as they come in so that I can handle errors appropriately or cancel the remainder of the futures (from within the callback) if I determine I don't need the rest of them.
I can tell that's asking for a headache from the borrow checker: trying to modify a future in a Vec in a callback from the async engine.
There's a number of Stack Overflow questions and Reddit posts that explain how join_all joins on a list of futures but cancels the rest if one fails, and how async engines may spawn threads or they may not or they're bad design if they do.
Use futures::select_all:
use futures::future; // 0.3.4
type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
type Result<T, E = Error> = std::result::Result<T, E>;
async fn might_fail(fails: bool) -> Result<i32> {
if fails {
Ok(42)
} else {
Err("boom".into())
}
}
async fn many() -> Result<i32> {
let raw_futs = vec![might_fail(true), might_fail(false), might_fail(true)];
let unpin_futs: Vec<_> = raw_futs.into_iter().map(Box::pin).collect();
let mut futs = unpin_futs;
let mut sum = 0;
while !futs.is_empty() {
match future::select_all(futs).await {
(Ok(val), _index, remaining) => {
sum += val;
futs = remaining;
}
(Err(_e), _index, remaining) => {
// Ignoring all errors
futs = remaining;
}
}
if sum > 42 {
// Early exit
return Ok(sum);
}
}
Ok(sum)
}
This will poll all the futures in the collection, returning the first one that is not pending, its index, and the remaining pending or unpolled futures. You can then match on the Result and handle the success or failure case.
Call select_all inside of a loop. This gives you the ability to exit early from the function. When you exit, the futs vector is dropped and dropping a future cancels it.
As mentioned in comments by #kmdreko, #John Kugelman and #e-net4-the-comment-flagger, use join_all from the futures crate version 0.3. The join_all in futures version 0.2 has the cancel behavior described in the question, but join_all in futures crate version 0.3 does not.
For a quick/hacky solution without having to mess with select_all. You can use join_all, but instead of returning Result<T> return Result<Result<T>> - always returning Ok for the top-level result
Or you can use join_all
The example from the documentation:
use futures::future::join_all;
async fn foo(i: u32) -> u32 { i }
let futures = vec![foo(1), foo(2), foo(3)];
assert_eq!(join_all(futures).await, [1, 2, 3]);

What is the purpose of async/await in Rust?

In a language like C#, giving this code (I am not using the await keyword on purpose):
async Task Foo()
{
var task = LongRunningOperationAsync();
// Some other non-related operation
AnotherOperation();
result = task.Result;
}
In the first line, the long operation is run in another thread, and a Task is returned (that is a future). You can then do another operation that will run in parallel of the first one, and at the end, you can wait for the operation to be finished. I think that it is also the behavior of async/await in Python, JavaScript, etc.
On the other hand, in Rust, I read in the RFC that:
A fundamental difference between Rust's futures and those from other languages is that Rust's futures do not do anything unless polled. The whole system is built around this: for example, cancellation is dropping the future for precisely this reason. In contrast, in other languages, calling an async fn spins up a future that starts executing immediately.
In this situation, what is the purpose of async/await in Rust? Seeing other languages, this notation is a convenient way to run parallel operations, but I cannot see how it works in Rust if the calling of an async function does not run anything.
You are conflating a few concepts.
Concurrency is not parallelism, and async and await are tools for concurrency, which may sometimes mean they are also tools for parallelism.
Additionally, whether a future is immediately polled or not is orthogonal to the syntax chosen.
async / await
The keywords async and await exist to make creating and interacting with asynchronous code easier to read and look more like "normal" synchronous code. This is true in all of the languages that have such keywords, as far as I am aware.
Simpler code
This is code that creates a future that adds two numbers when polled
before
fn long_running_operation(a: u8, b: u8) -> impl Future<Output = u8> {
struct Value(u8, u8);
impl Future for Value {
type Output = u8;
fn poll(self: Pin<&mut Self>, _ctx: &mut Context) -> Poll<Self::Output> {
Poll::Ready(self.0 + self.1)
}
}
Value(a, b)
}
after
async fn long_running_operation(a: u8, b: u8) -> u8 {
a + b
}
Note that the "before" code is basically the implementation of today's poll_fn function
See also Peter Hall's answer about how keeping track of many variables can be made nicer.
References
One of the potentially surprising things about async/await is that it enables a specific pattern that wasn't possible before: using references in futures. Here's some code that fills up a buffer with a value in an asynchronous manner:
before
use std::io;
fn fill_up<'a>(buf: &'a mut [u8]) -> impl Future<Output = io::Result<usize>> + 'a {
futures::future::lazy(move |_| {
for b in buf.iter_mut() { *b = 42 }
Ok(buf.len())
})
}
fn foo() -> impl Future<Output = Vec<u8>> {
let mut data = vec![0; 8];
fill_up(&mut data).map(|_| data)
}
This fails to compile:
error[E0597]: `data` does not live long enough
--> src/main.rs:33:17
|
33 | fill_up_old(&mut data).map(|_| data)
| ^^^^^^^^^ borrowed value does not live long enough
34 | }
| - `data` dropped here while still borrowed
|
= note: borrowed value must be valid for the static lifetime...
error[E0505]: cannot move out of `data` because it is borrowed
--> src/main.rs:33:32
|
33 | fill_up_old(&mut data).map(|_| data)
| --------- ^^^ ---- move occurs due to use in closure
| | |
| | move out of `data` occurs here
| borrow of `data` occurs here
|
= note: borrowed value must be valid for the static lifetime...
after
use std::io;
async fn fill_up(buf: &mut [u8]) -> io::Result<usize> {
for b in buf.iter_mut() { *b = 42 }
Ok(buf.len())
}
async fn foo() -> Vec<u8> {
let mut data = vec![0; 8];
fill_up(&mut data).await.expect("IO failed");
data
}
This works!
Calling an async function does not run anything
The implementation and design of a Future and the entire system around futures, on the other hand, is unrelated to the keywords async and await. Indeed, Rust has a thriving asynchronous ecosystem (such as with Tokio) before the async / await keywords ever existed. The same was true for JavaScript.
Why aren't Futures polled immediately on creation?
For the most authoritative answer, check out this comment from withoutboats on the RFC pull request:
A fundamental difference between Rust's futures and those from other
languages is that Rust's futures do not do anything unless polled. The
whole system is built around this: for example, cancellation is
dropping the future for precisely this reason. In contrast, in other
languages, calling an async fn spins up a future that starts executing
immediately.
A point about this is that async & await in Rust are not inherently
concurrent constructions. If you have a program that only uses async &
await and no concurrency primitives, the code in your program will
execute in a defined, statically known, linear order. Obviously, most
programs will use some kind of concurrency to schedule multiple,
concurrent tasks on the event loop, but they don't have to. What this
means is that you can - trivially - locally guarantee the ordering of
certain events, even if there is nonblocking IO performed in between
them that you want to be asynchronous with some larger set of nonlocal
events (e.g. you can strictly control ordering of events inside of a
request handler, while being concurrent with many other request
handlers, even on two sides of an await point).
This property gives Rust's async/await syntax the kind of local
reasoning & low-level control that makes Rust what it is. Running up
to the first await point would not inherently violate that - you'd
still know when the code executed, it would just execute in two
different places depending on whether it came before or after an
await. However, I think the decision made by other languages to start
executing immediately largely stems from their systems which
immediately schedule a task concurrently when you call an async fn
(for example, that's the impression of the underlying problem I got
from the Dart 2.0 document).
Some of the Dart 2.0 background is covered by this discussion from munificent:
Hi, I'm on the Dart team. Dart's async/await was designed mainly by
Erik Meijer, who also worked on async/await for C#. In C#, async/await
is synchronous to the first await. For Dart, Erik and others felt that
C#'s model was too confusing and instead specified that an async
function always yields once before executing any code.
At the time, I and another on my team were tasked with being the
guinea pigs to try out the new in-progress syntax and semantics in our
package manager. Based on that experience, we felt async functions
should run synchronously to the first await. Our arguments were
mostly:
Always yielding once incurs a performance penalty for no good reason. In most cases, this doesn't matter, but in some it really
does. Even in cases where you can live with it, it's a drag to bleed a
little perf everywhere.
Always yielding means certain patterns cannot be implemented using async/await. In particular, it's really common to have code like
(pseudo-code here):
getThingFromNetwork():
if (downloadAlreadyInProgress):
return cachedFuture
cachedFuture = startDownload()
return cachedFuture
In other words, you have an async operation that you can call multiple times before it completes. Later calls use the same
previously-created pending future. You want to ensure you don't start
the operation multiple times. That means you need to synchronously
check the cache before starting the operation.
If async functions are async from the start, the above function can't use async/await.
We pleaded our case, but ultimately the language designers stuck with
async-from-the-top. This was several years ago.
That turned out to be the wrong call. The performance cost is real
enough that many users developed a mindset that "async functions are
slow" and started avoiding using it even in cases where the perf hit
was affordable. Worse, we see nasty concurrency bugs where people
think they can do some synchronous work at the top of a function and
are dismayed to discover they've created race conditions. Overall, it
seems users do not naturally assume an async function yields before
executing any code.
So, for Dart 2, we are now taking the very painful breaking change to
change async functions to be synchronous to the first await and
migrating all of our existing code through that transition. I'm glad
we're making the change, but I really wish we'd done the right thing
on day one.
I don't know if Rust's ownership and performance model place different
constraints on you where being async from the top really is better,
but from our experience, sync-to-the-first-await is clearly the better
trade-off for Dart.
cramert replies (note that some of this syntax is outdated now):
If you need code to execute immediately when a function is called
rather than later on when the future is polled, you can write your
function like this:
fn foo() -> impl Future<Item=Thing> {
println!("prints immediately");
async_block! {
println!("prints when the future is first polled");
await!(bar());
await!(baz())
}
}
Code examples
These examples use the async support in Rust 1.39 and the futures crate 0.3.1.
Literal transcription of the C# code
use futures; // 0.3.1
async fn long_running_operation(a: u8, b: u8) -> u8 {
println!("long_running_operation");
a + b
}
fn another_operation(c: u8, d: u8) -> u8 {
println!("another_operation");
c * d
}
async fn foo() -> u8 {
println!("foo");
let sum = long_running_operation(1, 2);
another_operation(3, 4);
sum.await
}
fn main() {
let task = foo();
futures::executor::block_on(async {
let v = task.await;
println!("Result: {}", v);
});
}
If you called foo, the sequence of events in Rust would be:
Something implementing Future<Output = u8> is returned.
That's it. No "actual" work is done yet. If you take the result of foo and drive it towards completion (by polling it, in this case via futures::executor::block_on), then the next steps are:
Something implementing Future<Output = u8> is returned from calling long_running_operation (it does not start work yet).
another_operation does work as it is synchronous.
the .await syntax causes the code in long_running_operation to start. The foo future will continue to return "not ready" until the computation is done.
The output would be:
foo
another_operation
long_running_operation
Result: 3
Note that there are no thread pools here: this is all done on a single thread.
async blocks
You can also use async blocks:
use futures::{future, FutureExt}; // 0.3.1
fn long_running_operation(a: u8, b: u8) -> u8 {
println!("long_running_operation");
a + b
}
fn another_operation(c: u8, d: u8) -> u8 {
println!("another_operation");
c * d
}
async fn foo() -> u8 {
println!("foo");
let sum = async { long_running_operation(1, 2) };
let oth = async { another_operation(3, 4) };
let both = future::join(sum, oth).map(|(sum, _)| sum);
both.await
}
Here we wrap synchronous code in an async block and then wait for both actions to complete before this function will be complete.
Note that wrapping synchronous code like this is not a good idea for anything that will actually take a long time; see What is the best approach to encapsulate blocking I/O in future-rs? for more info.
With a threadpool
// Requires the `thread-pool` feature to be enabled
use futures::{executor::ThreadPool, future, task::SpawnExt, FutureExt};
async fn foo(pool: &mut ThreadPool) -> u8 {
println!("foo");
let sum = pool
.spawn_with_handle(async { long_running_operation(1, 2) })
.unwrap();
let oth = pool
.spawn_with_handle(async { another_operation(3, 4) })
.unwrap();
let both = future::join(sum, oth).map(|(sum, _)| sum);
both.await
}
The purpose of async/await in Rust is to provide a toolkit for concurrency—same as in C# and other languages.
In C# and JavaScript, async methods start running immediately, and they're scheduled whether you await the result or not. In Python and Rust, when you call an async method, nothing happens (it isn't even scheduled) until you await it. But it's largely the same programming style either way.
The ability to spawn another task (that runs concurrent with and independent of the current task) is provided by libraries: see async_std::task::spawn and tokio::task::spawn.
As for why Rust async is not exactly like C#, well, consider the differences between the two languages:
Rust discourages global mutable state. In C# and JS, every async method call is implicitly added to a global mutable queue. It's a side effect to some implicit context. For better or worse, that's not Rust's style.
Rust is not a framework. It makes sense that C# provides a default event loop. It also provides a great garbage collector! Lots of things that come standard in other languages are optional libraries in Rust.
Consider this simple pseudo-JavaScript code that fetches some data, processes it, fetches some more data based on the previous step, summarises it, and then prints a result:
getData(url)
.then(response -> parseObjects(response.data))
.then(data -> findAll(data, 'foo'))
.then(foos -> getWikipediaPagesFor(foos))
.then(sumPages)
.then(sum -> console.log("sum is: ", sum));
In async/await form, that's:
async {
let response = await getData(url);
let objects = parseObjects(response.data);
let foos = findAll(objects, 'foo');
let pages = await getWikipediaPagesFor(foos);
let sum = sumPages(pages);
console.log("sum is: ", sum);
}
It introduces a lot of single-use variables and is arguably worse than the original version with promises. So why bother?
Consider this change, where the variables response and objects are needed later on in the computation:
async {
let response = await getData(url);
let objects = parseObjects(response.data);
let foos = findAll(objects, 'foo');
let pages = await getWikipediaPagesFor(foos);
let sum = sumPages(pages, objects.length);
console.log("sum is: ", sum, " and status was: ", response.status);
}
And try to rewrite it in the original form with promises:
getData(url)
.then(response -> Promise.resolve(parseObjects(response.data))
.then(objects -> Promise.resolve(findAll(objects, 'foo'))
.then(foos -> getWikipediaPagesFor(foos))
.then(pages -> sumPages(pages, objects.length)))
.then(sum -> console.log("sum is: ", sum, " and status was: ", response.status)));
Each time you need to refer back to a previous result, you need to nest the entire structure one level deeper. This can quickly become very difficult to read and maintain, but the async/await version does not suffer from this problem.

sequential execution chaining of async operations in F#

Is there any primitive in the langage to compose async1 then async2, akin to what parallel does for parallel execution planning ?
to further clarify, I have 2 async computations
let toto1 = Async.Sleep(1000)
let toto2 = Async.Sleep(1000)
I would like to create a new async computation made of the sequential composition of toto1 and toto2
let toto = Async.Sequential [|toto1; toto2|]
upon start, toto would run toto1 then toto2, and would end after 2000 time units
The async.Bind operation is the basic primitive that asynchronous workflows provide for sequential composition - in the async block syntax, that corresponds to let!. You can use that to express sequential composition of two computations (as demonstrated by Daniel).
However, if you have an operation <|> that Daniel defined, than that's not expressive enough to implement async.Bind, because when you compose things sequentially using async.Bind, the second computation may depend on the result of the first one. The <e2> may use v1:
async.Bind(<e1>, fun v1 -> <e2>)
If you were writing <e1> <|> <e2> then the two operations have to be independent. This is a reason why the libraries are based on Bind - because it is more expressive form of sequential composition than the one you would get if you were following the structure of Async.Parallel.
If you want something that behaves like Async.Parallel and takes an array, then the easiest option is to implement that imperatively using let! in a loop (but you could use recursion and lists too):
let Sequential (ops:Async<'T>[]) = async {
let res = Array.zeroCreate ops.Length
for i in 0 .. ops.Length - 1 do
let! value = ops.[i]
res.[i] <- value
return res }
I'm not sure what you mean by "primitive." Async.Parallel is a function. Here are a few ways to run two asyncs:
In parallel:
Async.Parallel([|async1; async2|])
or
async {
let! child = Async.StartChild async2
let! result1 = child
let! result2 = async1
return [|result1; result2|]
}
Sequentially:
async {
let! result1 = async1
let! result2 = async2
return [|result1; result2|]
}
You could return tuples in the last two. I kept the return types the same as the first one.
I would say let! and do! in an async { } block is as close as you'll get to using a primitive for this.
EDIT
If all this nasty syntax is getting to you, you could define a combinator:
let (<|>) async1 async2 =
async {
let! r1 = async1
let! r2 = async2
return r1, r2
}
and then do:
async1 <|> async2 |> Async.RunSynchronously

Resources