F#: avoid race condition when calling Task.WhenAny?

So after working quite a bit with F#'s Async model, I think I've wrapped my head around it and its differences from C#'s Task model. One of the things I really like about the way they interoperate is Async.AwaitTask:
...
public Task<int> SomeCsharpApiAsync() {
...
}
...
let someFsharpAsyncCode = async {
let! someResult = Async.AwaitTask (SomeClass.SomeCsharpApiAsync())
return someResult
}
With the code above, the line that calls AwaitTask not only waits for the task to complete, but also binds its result to someResult, in what I believe is an atomic operation. Very neat!
However, I'm having trouble achieving the same thing (atomicity, and so no race conditions) when using Task.WhenAny:
...
public Task<int> SomeCsharpApiAsync(string someParam) {
...
}
...
let someFsharpAsyncCode = async {
let task1 = SomeClass.SomeCsharpApiAsync "foo"
let task2 = SomeClass.SomeCsharpApiAsync "bar"
let! fastestTask = Async.AwaitTask (Task.WhenAny([task1;task2]))
return fastestTask.Result
}
The problem with the code above is that AwaitTask returns a task (the fastest one), not its result, so there's an extra layer of indirection, which I resolve by calling .Result later. But I've run into problems with this, and I think the culprit is a race condition: the fastest task may have a result at that moment, but by the time I retrieve it, the task's cancellationToken might have been cancelled! Or was the fastest task fastest precisely because it was the one cancelled first? I'm not sure, and I'm a bit lost on how to figure out what's happening here.
Maybe I can avoid the problem altogether by using an F# alternative to C#'s Task.WhenAny?

See if this implementation of Async.WhenAny might fix your problem:
https://github.com/tpetricek/TryJoinads/blob/master/src/FSharp.Joinads/Async.fs#L9
See also this blog: http://tomasp.net/blog/joinads-async-implement.aspx/
This example is very similar to your code, but maybe there is a subtle difference: https://github.com/microsoft/fsharplu/blob/master/FSharpLu/Async.fs#L59
And here is another brief example with a different purpose, but still might be helpful:
http://www.fssnip.net/hx/title/AsyncAwaitTask-with-timeouts
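For reference, here is a minimal F# sketch of the same idea without any extra library (reusing the hypothetical SomeClass from the question): instead of reading .Result from the winner, await the winning task again with Async.AwaitTask. Task.WhenAny returns the first task to complete in any state (success, faulted, or cancelled), and a completed task never changes state afterwards, so re-awaiting the winner either yields its result or re-raises its exception/cancellation cleanly:

open System.Threading.Tasks

let someFsharpAsyncCode = async {
    let task1 = SomeClass.SomeCsharpApiAsync "foo"
    let task2 = SomeClass.SomeCsharpApiAsync "bar"
    // Task.WhenAny yields the first task to complete, whatever its final state.
    let! fastestTask = Async.AwaitTask (Task.WhenAny([task1; task2]))
    // Awaiting the already-completed winner returns its result immediately,
    // or re-raises its exception / cancellation instead of racing on .Result.
    return! Async.AwaitTask fastestTask
}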

Related

Calling async function from closure

I want to await an async function inside a closure used in an iterator. The function requiring the closure is called inside a struct implementation. I can't figure out how to do this.
This code simulates what I'm trying to do:
struct MyType {}

impl MyType {
    async fn foo(&self) {
        println!("foo");
        (0..2).for_each(|v| {
            self.bar(v).await;
        });
    }

    async fn bar(&self, v: usize) {
        println!("bar: {}", v);
    }
}

#[tokio::main]
async fn main() {
    let mt = MyType {};
    mt.foo().await;
}
Obviously, this will not work since the closure is not async, giving me:
error[E0728]: `await` is only allowed inside `async` functions and blocks
--> src/main.rs:8:13
|
7 | (0..2).for_each(|v| {
| --- this is not `async`
8 | self.bar(v).await;
| ^^^^^^^^^^^^^^^^^ only allowed inside `async` functions and blocks
After looking for an answer on how to call an async function from a non-async one, I ended up with this:
tokio::spawn(async move {
    self.bar(v).await;
});
But now I'm hitting lifetime issues instead:
error[E0759]: `self` has an anonymous lifetime `'_` but it needs to satisfy a `'static` lifetime requirement
--> src/main.rs:4:18
|
4 | async fn foo(&self) {
| ^^^^^
| |
| this data with an anonymous lifetime `'_`...
| ...is captured here...
...
8 | tokio::spawn(async move {
| ------------ ...and is required to live as long as `'static` here
This also doesn't surprise me since from what I understand the Rust compiler cannot know how long a thread will live. Given this, the thread spawned with tokio::spawn might outlive the type MyType.
The first fix I came up with was to make bar an associated function, copy everything I need inside my closure, and pass it by value with MyType::bar(copies_from_self), but this gets ugly since there's a lot of copying. It also feels like a workaround for not knowing how lifetimes work.
I was instead trying to use futures::executor::block_on which works for simple tasks like the one in this post:
(0..2).for_each(|v| {
    futures::executor::block_on(self.bar(v));
});
But when putting this into my real-life example, where I use a third-party library1 which also uses tokio, things no longer work. After reading the documentation, I realise that #[tokio::main] is a macro that eventually wraps everything in block_on, so doing this results in nested block_on calls. This might be the reason why one of the async methods called in bar just stops working without any error or logging (it works without block_on, so it shouldn't be anything in the code itself). I reached out to the authors, who said that I could use for_each(|i| async move { ... }), which made me even more confused.
(0..2).for_each(|v| async move {
    self.bar(v).await;
});
Will result in the compilation error
expected `()`, found opaque type
which I think makes sense since I'm now returning a future and not (). My naive approach to this was to try and await the future with something like this:
(0..2).for_each(|v| {
    async move {
        self.bar(v).await;
    }
    .await
});
But that takes me back to square one, resulting in the following compilation error, which also makes sense since I'm back to using await inside the synchronous closure:
`await` is only allowed inside `async` functions and blocks
This discovery also makes it hard for me to make use of answers such as the ones found here and here.
The question after all this cargo cult programming is basically, is it possible, and if so how do I call my async function from the closure (and preferably without spawning a thread to avoid lifetime problems) in an iterator? If this is not possible, what would an idiomatic implementation for this look like?
1 This is the library/method used.
Iterator::for_each expects a synchronous closure, thus you can't use .await in it (not directly at least), nor can you return a future from it.
One solution is to just use a for loop instead of .for_each:
for v in 0..2 {
    self.bar(v).await;
}
The more general approach is to use streams instead of iterators, since those are the asynchronous equivalent (and the equivalent methods on streams are typically asynchronous as well). This would work not only for for_each but for most other iterator methods:
use futures::prelude::*;

futures::stream::iter(0..2)
    .for_each(|v| async move {
        self.bar(v).await;
    })
    .await;
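Putting that together, here is a minimal sketch of the question's foo rewritten with the stream-based for_each (assuming the futures and tokio crates, as in the question):

use futures::prelude::*;

struct MyType {}

impl MyType {
    async fn foo(&self) {
        println!("foo");
        // stream::iter turns the range into a Stream; its for_each takes a
        // closure returning a future and awaits each one in turn.
        futures::stream::iter(0..2)
            .for_each(|v| async move {
                self.bar(v).await;
            })
            .await;
    }

    async fn bar(&self, v: usize) {
        println!("bar: {}", v);
    }
}

#[tokio::main]
async fn main() {
    let mt = MyType {};
    mt.foo().await;
}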

Should I return await in Rust?

In JavaScript, async code is written with Promises and async/await syntax similar to that of Rust. It is generally considered redundant (and therefore discouraged) to return and await a Promise when it can simply be returned (i.e., when an async function is executed as the last thing in another function):
async function myFn() { /* ... */ }

async function myFn2() {
  // do setup work
  return await myFn()
  // ^ this is not necessary when we can just return the Promise
}
I am wondering whether a similar pattern applies in Rust. Should I prefer this:
pub async fn my_function(
    &mut self,
) -> Result<()> {
    // do synchronous setup work
    self.exec_command(
        /* ... */
    )
    .await
}
Or this:
pub fn my_function(
    &mut self,
) -> impl Future<Output = Result<()>> {
    // do synchronous setup work
    self.exec_command(
        /* ... */
    )
}
The former feels more ergonomic to me, but I suspect that the latter might be more performant. Is this the case?
One semantic difference between the two variants is that in the first variant the synchronous setup code will run only when the returned future is awaited, while in the second variant it will run as soon as the function is called:
let fut = x.my_function();
// in the second variant, the synchronous setup has finished by now
...
let val = fut.await; // in the first variant, it runs here
For the difference to be noticeable, the synchronous setup code must have side effects, and there needs to be a delay between calling the async function and awaiting the future it returns.
Unless you have specific reason to execute the preamble immediately, go with the async function, i.e. the first variant. It makes the function slightly more predictable, and makes it easier to add more awaits later as the function is refactored.
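To make that semantic difference concrete, here is a small sketch (hypothetical function names, using a tokio runtime only to drive the futures) showing when the synchronous setup runs in each variant:

use std::future::Future;

// Variant 1: an async fn. Nothing, including the setup line, runs until the
// returned future is awaited.
async fn lazy_setup() -> i32 {
    println!("setup (variant 1) runs at .await");
    42
}

// Variant 2: a plain fn returning impl Future. The setup line runs as soon as
// the function is called; only the async block is deferred.
fn eager_setup() -> impl Future<Output = i32> {
    println!("setup (variant 2) runs at call time");
    async { 42 }
}

#[tokio::main]
async fn main() {
    let fut1 = lazy_setup();  // nothing printed yet
    let fut2 = eager_setup(); // "setup (variant 2) ..." printed here
    println!("both futures created");
    println!("{} {}", fut1.await, fut2.await); // variant 1's setup prints during fut1.await
}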
There is no real difference between the two since async just resolves down to impl Future<Output=Result<T, E>>. I don't believe there is any meaningful performance difference between the two, at least in my empirical usage of both.
If you are asking for preference in style then in my opinion the first one is preferred as the types are clearer to me and I agree it is more ergonomic.

How do I execute an async/await function without using any external dependencies?

I am attempting to create simplest possible example that can get async fn hello() to eventually print out Hello World!. This should happen without any external dependency like tokio, just plain Rust and std. Bonus points if we can get it done without ever using unsafe.
#![feature(async_await)]

async fn hello() {
    println!("Hello, World!");
}

fn main() {
    let task = hello();
    // Something beautiful happens here, and `Hello, World!` is printed on screen.
}
I know async/await is still a nightly feature, and it is subject to change in the foreseeable future.
I know there is a whole lot of Future implementations, I am aware of the existence of tokio.
I am just trying to educate myself on the inner workings of standard library futures.
My helpless, clumsy endeavours
My vague understanding is that, first off, I need to Pin task down. So I went ahead and
let pinned_task = Pin::new(&mut task);
but
the trait `std::marker::Unpin` is not implemented for `std::future::GenFuture<[static generator#src/main.rs:7:18: 9:2 {}]>`
so I thought, of course, I probably need to Box it, so I'm sure it won't move around in memory. Somewhat surprisingly, I get the same error.
What I could get so far is
let pinned_task = unsafe {
    Pin::new_unchecked(&mut task)
};
which is obviously not something I should do. Even so, let's say I got my hands on the Pinned Future. Now I need to poll() it somehow. For that, I need a Waker.
So I tried to look around on how to get my hands on a Waker. On the doc it kinda looks like the only way to get a Waker is with another new_unchecked that accepts a RawWaker. From there I got here and from there here, where I just curled up on the floor and started crying.
This part of the futures stack is not intended to be implemented by many people. The rough estimate that I have seen is that maybe there will be 10 or so actual implementations.
That said, you can fill in the basic aspects of an executor that is extremely limited by following the function signatures needed:
async fn hello() {
    println!("Hello, World!");
}

fn main() {
    drive_to_completion(hello());
}

use std::{
    future::Future,
    ptr,
    task::{Context, Poll, RawWaker, RawWakerVTable, Waker},
};

fn drive_to_completion<F>(f: F) -> F::Output
where
    F: Future,
{
    let waker = my_waker();
    let mut context = Context::from_waker(&waker);

    let mut t = Box::pin(f);

    loop {
        match t.as_mut().poll(&mut context) {
            Poll::Ready(v) => return v,
            Poll::Pending => panic!("This executor does not support futures that are not ready"),
        }
    }
}

type WakerData = *const ();

unsafe fn clone(_: WakerData) -> RawWaker {
    my_raw_waker()
}
unsafe fn wake(_: WakerData) {}
unsafe fn wake_by_ref(_: WakerData) {}
unsafe fn drop(_: WakerData) {}

static MY_VTABLE: RawWakerVTable = RawWakerVTable::new(clone, wake, wake_by_ref, drop);

fn my_raw_waker() -> RawWaker {
    RawWaker::new(ptr::null(), &MY_VTABLE)
}

fn my_waker() -> Waker {
    unsafe { Waker::from_raw(my_raw_waker()) }
}
Starting at Future::poll, we see we need a Pinned future and a Context. Context is created from a Waker which needs a RawWaker. A RawWaker needs a RawWakerVTable. We create all of those pieces in the simplest possible ways:
Since we aren't trying to support the Pending case, we never need to actually do anything for it and can instead panic. This also means that the implementations of wake can be no-ops.
Since we aren't trying to be efficient, we don't need to store any data for our waker, so clone and drop can basically be no-ops as well.
The easiest way to pin the future is to Box it, but this isn't the most efficient possibility.
If you wanted to support Pending, the simplest extension is a busy loop, polling forever. A slightly more efficient solution is to have a global flag that indicates that someone has called wake, and to block on it becoming true.
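For illustration, here is a minimal sketch of that busy-loop extension (it reuses the imports and the my_waker() helper from the code above; drive_with_busy_loop is just a hypothetical name):

// Assumes the imports and the my_waker() helper from the answer above.
fn drive_with_busy_loop<F>(f: F) -> F::Output
where
    F: Future,
{
    let waker = my_waker();
    let mut context = Context::from_waker(&waker);
    let mut t = Box::pin(f);

    loop {
        match t.as_mut().poll(&mut context) {
            Poll::Ready(v) => return v,
            // Busy-wait: poll again right away. A real executor would park the
            // thread here and rely on the waker being called to unpark it.
            Poll::Pending => std::hint::spin_loop(),
        }
    }
}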

Converting a collection of Awaitables to a time-ordered AsyncGenerator in Hack

I am trying to implement Rx stream/observable merging with Hack async, and a core step is described by the title. A code version of this step would look something like this:
<?hh // strict
async function foo(Awaitable<Iterable<T>> $collection): Awaitable<void> {
  $ordered_generator = async_collection_to_gen($collection); // (**)
  foreach ($ordered_generator await as $v) {
    // do something with each awaited value in the time-order they are resolved
  }
}
However, after mulling it over, I don't think I can write the starred (**) function. I've found that, one way or another, the implementations I've tried require functionality akin to JS's Promise.race, which resolves when the first of a collection of Promises resolves/rejects. However, all of Hack's Awaitable collection helpers create an Awaitable of a fully resolved collection. Furthermore, Hack doesn't allow an async call made from an async function to go un-awaited, which I've also found to be necessary.
Is it possible to anyone's knowledge?
This is possible actually! I dug around and stumbled upon a fork of asio-utilities by #jano implementing an AsyncPoll class. See PR for usage. It does exactly as I hoped.
So it turns out, there is an Awaitable called ConditionWaitHandle with succeed and fail methods* that can be invoked by any context (so long as the underlying WaitHandle hasn't expired yet), forcing the ConditionWaitHandle to resolve with the passed values.
I gave the code a hard look, and underneath it all, it works by successive Awaitable races, which ConditionWaitHandle permits. More specifically, the collection of Awaitables is wrapped in an AwaitAllWaitHandle (aka \HH\Asio\v), which resolves only when the slowest Awaitable does, and that is nested within a ConditionWaitHandle. Each Awaitable is awaited in an async function that triggers the common ConditionWaitHandle, concluding the race. This is repeated until the Awaitables have all resolved.
Here's a more compact implementation of a race using the same philosophy:
<?hh

function wait(int $i): Awaitable<void> {
  return Race::wrap(async { await HH\Asio\usleep($i); return $i; });
}

// $wait_handle = null;
class Race {
  public static ?ConditionWaitHandle $handle = null;

  public static async function wrap<T>(Awaitable<T> $v): Awaitable<void> {
    $ret = await $v;
    $handle = self::$handle;
    invariant(!is_null($handle), '');
    $handle->succeed($ret);
  }
}

Race::$handle = ConditionWaitHandle::create(
  \HH\Asio\v(
    Vector {
      (wait(1))->getWaitHandle(),
      (wait(1000000))->getWaitHandle()
    }
  )
);

printf("%d microsecond `wait` wins!", \HH\Asio\join(Race::$handle));
Very elegant solution, thanks #jano!
*(the resemblance to promises/deferreds intensifies)
I am curious how premature completion via ConditionWaitHandle meshes with the philosophy that all Awaitables should run to completion.

How does C# 5.0 async work?

I'm trying to grok how C# 5's new async feature works. Suppose I want to develop an atomic increment function for incrementing an integer in a fictitious IntStore. Multiple calls are made to this function in one thread only.
async void IncrementKey(string key) {
    int i = await IntStore.Get(key);
    IntStore.Set(key, i + 1);
}
It seems to me that this function is flawed. Two calls to IncrementKey could get the same number back from IntStore (say 5), and then set it to 6, thus losing one of the increments?
How could this be re-written, if IntStore.Get is asynchronous (returns Task) in order to work correctly?
Performance is critical, is there a solution that avoids locking?
If you are sure you are calling your function from only one thread, then there shouldn't be any problem, because only one call to IntStore.Get can be awaiting at a time. This is because, given:
await IncrementKey("AAA");
await IncrementKey("BBB");
the second IncrementKey won't be executed until the first IncrementKey has finished. The code will be converted to a state machine. If you don't trust it, change the IntStore.Get(key) to:
async Task<int> IntStore(string str) {
    Console.WriteLine("Starting IntStore");
    await TaskEx.Delay(10000);
    return 0;
}
You'll see that the second Starting IntStore will be written 10 seconds after the first.
To quote from http://blogs.msdn.com/b/ericlippert/archive/2010/10/29/asynchronous-programming-in-c-5-0-part-two-whence-await.aspx: The “await” operator ... means “if the task we are awaiting has not yet completed then sign up the rest of this method as the continuation of that task, and then return to your caller immediately; the task will invoke the continuation when it completes.”
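To make the difference visible, here is a small self-contained sketch (with a hypothetical in-memory IntStore, and IncrementKey declared as async Task rather than async void so it can be awaited): awaiting each call in sequence never loses an increment, while starting the calls without awaiting them lets both reads happen before either write.

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class IntStore
{
    static readonly ConcurrentDictionary<string, int> store = new ConcurrentDictionary<string, int>();

    public static async Task<int> Get(string key)
    {
        await Task.Delay(10); // simulate asynchronous I/O
        store.TryGetValue(key, out var value);
        return value;
    }

    public static void Set(string key, int value) => store[key] = value;
}

class Program
{
    // Returns Task (not void) so callers can await it.
    static async Task IncrementKey(string key)
    {
        int i = await IntStore.Get(key);
        IntStore.Set(key, i + 1);
    }

    static async Task Main()
    {
        // Sequential awaits: the second Get only starts after the first Set.
        await IncrementKey("k");
        await IncrementKey("k");
        Console.WriteLine(await IntStore.Get("k")); // 2

        // Concurrent, un-awaited calls: both Gets may read the same value,
        // so one increment can be lost.
        await Task.WhenAll(IncrementKey("k"), IncrementKey("k"));
        Console.WriteLine(await IntStore.Get("k")); // 3 or 4
    }
}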
