Shouldn't I call next on futures::stream::Unfold?

I'm calling next multiple times on a Stream returned by this function: https://github.com/sdroege/rtsp-server/blob/96dbaf00a7111c775348430a64d6a60f16d66445/src/listener/message_socket.rs#L43:
pub(crate) fn async_read<R: AsyncRead + Unpin + Send>(
    read: R,
    max_size: usize,
) -> impl Stream<Item = Result<Message<Body>, ReadError>> + Send {
    //...
    futures::stream::unfold(Some(state), move |mut state| async move {
        //...
    })
}
sometimes it works, but sometimes I get:
thread 'main' panicked at 'Unfold must not be polled after it returned `Poll::Ready(None)`', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.13/src/stream/unfold.rs:115:21
The error comes from https://docs.rs/futures-util/0.3.2/src/futures_util/stream/unfold.rs.html#112
but I couldn't understand why. Shouldn't I be free to call next on a Stream, in a loop?

This panic exists for a reason: it most likely means that you are doing something wrong. When a stream returns Poll::Ready(None), the stream is complete (much like an Iterator returning None, as has been commented), so calling next again after that point polls a stream that has already finished.
However, if you are sure that this is what you want to do, you can call stream.fuse() (StreamExt::fuse), which silences the panic and simply returns Poll::Ready(None) forever once the stream has ended.
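The same "stop at the first None" contract can be sketched with std's Iterator, which is used here only so that the example needs no external crate; it is an analogy, not the futures implementation. Flaky and drain_then_poll_again are hypothetical names made up for this illustration.

```rust
/// Hypothetical iterator: returns None on every third call, then resumes.
/// The Iterator contract permits this, which is why callers should stop
/// at the first None (or fuse the iterator).
struct Flaky {
    calls: u32,
}

impl Iterator for Flaky {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        self.calls += 1;
        if self.calls % 3 == 0 {
            None
        } else {
            Some(self.calls)
        }
    }
}

/// Drains `it` until the first None, then asks for one more item.
fn drain_then_poll_again<I: Iterator<Item = u32>>(mut it: I) -> (Vec<u32>, Option<u32>) {
    let mut seen = Vec::new();
    while let Some(v) = it.next() {
        seen.push(v);
    }
    // Calling next() after the end: the result depends on the implementation,
    // unless the iterator was fused.
    (seen, it.next())
}
```

Called with Flaky { calls: 0 }, the extra call after the end yields a value again; called with Flaky { calls: 0 }.fuse(), it keeps returning None. Fusing a stream plays the same role for Unfold, except that the unfused stream panics instead of resuming.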

Related

Type mismatch in async method

I have an asynchronous method I'm writing which is supposed to asynchronously query for a port until it finds one, or time out at 5 minutes;
member this.GetPort(): Async<Port> = this._GetPort(DateTime.Now)
member this._GetPort(startTime: DateTime): Async<Port> = async {
    match this._TryGetOpenPort() with
    | Some(port) -> port
    | None -> do
        if (DateTime.Now - startTime).TotalMinutes >= 5 then
            raise (Exception "Unable to open a port")
        else
            do! Async.Sleep(100)
            let! result = this._GetPort(startTime)
            result}
member this._TryGetOpenPort(): Option<Port> =
    // etc.
However, I'm getting some strange type inconsistencies in _GetPort; the function says I'm returning a type of Async<unit> instead of Async<Port>.
It's a little unintuitive, but one way to make your code work would be this:
member private this.GetPort(startTime: DateTime) =
    async {
        match this.TryGetOpenPort() with
        | Some port ->
            return port
        | None ->
            if (DateTime.Now - startTime).TotalMinutes >= 5 then
                raise (Exception "Unable to open a port")
            do! Async.Sleep(100)
            let! result = this.GetPort(startTime)
            return result
    }
member private this.TryGetOpenPort() = failwith "yeet" // TODO
I took the liberty to clean up a few things and make the member private, since that seems to be what you're largely going after here with a more detailed internal way to get the port.
The reason why your code wasn't compiling was because you were inconsistent in what you were returning from the computation:
In the case of Some(port) you were missing a return keyword, which is required to lift the value back into an Async<Port>.
Your if expression where you raise an exception had an else branch but you weren't returning from both. In this case, since you clearly don't wish to return anything and just raise an exception, you can omit the else and make it an imperative program flow just like in non-async code.
The other thing you may wish to consider down the road is if throwing an exception is what you want, or if just returning a Result<T,Err> or an option is the right call. Exceptions aren't inherently bad, but often a lot of F# programming leads to avoiding their use if there's a good way to ascribe meaning to a type that wraps your return value.
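The Result-returning alternative mentioned above can be sketched as follows. This is Rust rather than F#, and it blocks the calling thread instead of using an async workflow, so treat it only as an illustration of the return type. The names get_port, PortTimeout and try_open are hypothetical.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical error type: the deadline passed before a port was found.
#[derive(Debug, PartialEq)]
struct PortTimeout;

/// Polls `try_open` until it yields a port or `timeout` has elapsed.
/// The failure is reported through the return type instead of an exception.
fn get_port<F>(mut try_open: F, timeout: Duration, interval: Duration) -> Result<u16, PortTimeout>
where
    F: FnMut() -> Option<u16>,
{
    let start = Instant::now();
    loop {
        if let Some(port) = try_open() {
            return Ok(port);
        }
        if start.elapsed() >= timeout {
            return Err(PortTimeout);
        }
        // Blocking sleep; an async version would await a timer here instead.
        thread::sleep(interval);
    }
}
```

The caller then has to handle both cases explicitly, for example with a match on Ok(port) and Err(PortTimeout), rather than relying on an exception handler somewhere up the stack.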

How to tell if an async computation is sent to the thread pool?

I was recently informed that in
async {
    return! async { return "hi" } }
|> Async.RunSynchronously
|> printfn "%s"
the nested Async<'T> (async { return "hi" }) would not be sent to the thread pool for evaluation, whereas in
async {
    use ms = new MemoryStream [| 0x68uy; 0x69uy |]
    use sr = new StreamReader (ms)
    return! sr.ReadToEndAsync () |> Async.AwaitTask }
|> Async.RunSynchronously
|> printfn "%s"
the nested Async<'T> (sr.ReadToEndAsync () |> Async.AwaitTask) would be. What is it about an Async<'T> that decides whether it's sent to the thread pool when it's executed in an asynchronous operation like let! or return!? In particular, how would you define one which is sent to the thread pool? What code do you have to include in the async block, or in the lambda passed into Async.FromContinuations?
TL;DR: It's not quite like that. The async itself doesn't "send" anything to the thread pool. All it does is just run continuations until they stop. And if one of those continuations decides to continue on a new thread - well, that's when thread switching happens.
Let's set up a small example to illustrate what happens:
open System
open System.Threading
let log str = printfn $"{str}: thread = {Thread.CurrentThread.ManagedThreadId}"
let f = async {
    log "1"
    let! x = async { log "2"; return 42 }
    log "3"
    do! Async.Sleep(TimeSpan.FromSeconds(3.0))
    log "4"
}
log "starting"
f |> Async.StartImmediate
log "started"
Console.ReadLine()
If you run this script, it will print starting, then 1, 2, 3, then started, then wait 3 seconds, and then print 4, and all of them except 4 will have the same thread ID. You can see that everything until Async.Sleep is executed synchronously on the same thread, but after that async execution stops and the main program execution continues, printing started and then blocking on ReadLine. By the time Async.Sleep wakes up and wants to continue execution, the original thread is already blocked on ReadLine, so the async computation gets to continue running on a new one.
What's going on here? How does this function?
First, the way the async computation is structured is in "continuation-passing style". It's a technique where every function doesn't return its result to the caller, but calls another function instead, passing the result as its parameter.
Let me illustrate with an example:
// "Normal" style:
let f x = x + 5
let g x = x * 2
printfn "%d" (f (g 3)) // prints 11
// Continuation-passing style:
let f x next = next (x + 5)
let g x next = next (x * 2)
g 3 (fun res1 -> f res1 (fun res2 -> printfn "%d" res2))
This is called "continuation-passing" because the next parameters are called "continuations" - i.e. they're functions that express how the program continues after calling f or g. And yes, this is exactly what Async.FromContinuations means.
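For comparison, here is a minimal sketch of the same two styles in Rust. The function names are hypothetical; the continuation is the next parameter, just as in the F# version.

```rust
// Direct style: each function returns its result to the caller.
fn add5(x: i32) -> i32 {
    x + 5
}

fn double(x: i32) -> i32 {
    x * 2
}

// Continuation-passing style: each function hands its result to `next`
// and returns whatever `next` returns.
fn add5_cps<R>(x: i32, next: impl FnOnce(i32) -> R) -> R {
    next(x + 5)
}

fn double_cps<R>(x: i32, next: impl FnOnce(i32) -> R) -> R {
    next(x * 2)
}

fn direct() -> i32 {
    add5(double(3))
}

fn with_continuations() -> i32 {
    // The innermost continuation simply returns the final value.
    double_cps(3, |res1| add5_cps(res1, |res2| res2))
}
```

Both direct() and with_continuations() evaluate to 11; only the way control flows between the functions differs.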
Seeming very silly and roundabout on the surface, what this allows us to do is for each function to decide when, how, or even if its continuation happens. For example, our f function from above could be doing something asynchronous instead of just plain returning the result:
let f x next = httpPost "http://calculator.com/add5" x next
Coding it in continuation-passing style would allow such a function to not block the current thread while the request to calculator.com is in flight. What's wrong with blocking the thread, you ask? I'll refer you to the original answer that prompted your question in the first place.
Second, when you write those async { ... } blocks, the compiler gives you a little help. It takes what looks like a step-by-step imperative program and "unrolls" it into a series of continuation-passing calls. The "breaking" points for this unfolding are all the constructs that end with a bang - let!, do!, return!.
The above async block, for example, would look something like this (F#-ish pseudocode):
let return42 onDone =
    log "2"
    onDone 42
let f onDone =
    log "1"
    return42 (fun x ->
        log "3"
        Async.Sleep (3 seconds) (fun () ->
            log "4"
            onDone ()
        )
    )
Here, you can plainly see that the return42 function simply calls its continuation right away, thus making the whole thing from log "1" to log "3" completely synchronous, whereas the Async.Sleep function doesn't call its continuation right away, instead scheduling it to be run later (in 3 seconds) on the thread pool. That's where the thread switching happens.
And here, finally, lies the answer to your question: in order to have the async computation jump threads, your callback passed to Async.FromContinuations should do anything but call the success continuation immediately.
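The difference between a continuation that is called right away and one that is scheduled for later can be sketched in Rust. This is an analogy under stated assumptions: a freshly spawned thread stands in for the .NET thread pool, and immediate, deferred and observed_threads are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread::{self, JoinHandle, ThreadId};
use std::time::Duration;

/// Calls the continuation right away, on the caller's thread.
fn immediate(next: impl FnOnce(ThreadId)) {
    next(thread::current().id())
}

/// Schedules the continuation to run later on another thread
/// (a stand-in for a timer firing on a thread pool).
fn deferred(delay: Duration, next: impl FnOnce(ThreadId) + Send + 'static) -> JoinHandle<()> {
    thread::spawn(move || {
        thread::sleep(delay);
        next(thread::current().id())
    })
}

/// Returns (caller thread, thread of the immediate continuation,
/// thread of the deferred continuation).
fn observed_threads() -> (ThreadId, ThreadId, ThreadId) {
    let caller = thread::current().id();

    let mut sync_id = None;
    immediate(|id| sync_id = Some(id));

    let (tx, rx) = mpsc::channel();
    let handle = deferred(Duration::from_millis(10), move |id| tx.send(id).unwrap());
    let deferred_id = rx.recv().unwrap();
    handle.join().unwrap();

    (caller, sync_id.unwrap(), deferred_id)
}
```

The first two thread IDs are equal and the third differs: nothing "sent" the work anywhere, the deferred function just chose to invoke its continuation from a different thread.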
A few notes for further investigation
The onDone technique in the above example is technically called "monadic bind", and indeed in real F# programs it's represented by the async.Bind method. This answer might also be of help understanding the concept.
The above is a bit of an oversimplification. In reality the async execution is a bit more complicated than that. Internally it uses a technique called "trampoline", which in plain terms is just a loop that runs a single thunk on every turn, but crucially, the running thunk can also "ask" it to run another thunk, and if it does, the loop will do so, and so on, forever, until the next thunk doesn't ask to run another thunk, and then the whole thing finally stops.
I specifically used Async.StartImmediate to start the computation in my example, because Async.StartImmediate will do just what it says on the tin: it will start running the computation immediately, right there. That's why everything ran on the same thread as the main program. There are many alternative starting functions in the Async module. For example, Async.Start will start the computation on the thread pool. The lines from log "1" to log "3" will still all happen synchronously, without thread switching between them, but it will happen on a different thread from log "starting" and log "started". In this case thread switching will happen before the async computation even starts, so it doesn't count.
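The trampoline idea from the notes above can be sketched generically in Rust. This shows only the general technique, not how the F# async runtime implements it; Step, run and sum_to are hypothetical names.

```rust
/// One turn of the trampoline: either a final value, or a thunk to run next.
enum Step {
    Done(u64),
    More(Box<dyn FnOnce() -> Step>),
}

/// The trampoline itself: a loop that runs one thunk per turn until a
/// thunk stops asking for another one.
fn run(mut step: Step) -> u64 {
    loop {
        match step {
            Step::Done(value) => return value,
            Step::More(thunk) => step = thunk(),
        }
    }
}

/// Sums n + (n - 1) + ... + 1 without growing the call stack:
/// each step returns the next thunk instead of calling it.
fn sum_to(n: u64, acc: u64) -> Step {
    if n == 0 {
        Step::Done(acc)
    } else {
        Step::More(Box::new(move || sum_to(n - 1, acc + n)))
    }
}
```

Because every thunk returns to the loop before the next one runs, run(sum_to(100_000, 0)) completes with a constant stack depth, where a plain recursive version of the same depth could overflow the stack.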

How can I join all the futures in a vector without cancelling on failure like join_all does?

I've got a Vec of futures created from calling an async function. After adding all the futures to the vector, I'd like to wait on the whole set, getting either a list of results, or a callback for each one that finishes.
I could simply loop or iterate over the vector of futures and call .await on each future and that would allow me to handle errors correctly and not have futures::future::join_all cancel the others, but I'm sure there's a more idiomatic way to accomplish this task.
I also want to be able to handle the futures as they complete so if I get enough information from the first few, I can cancel the remaining incomplete futures and not wait for them and discard their results, error or not. This would not be possible if I iterated over the vector in order.
What I'm looking for is a callback (closure, etc) that lets me accumulate the results as they come in so that I can handle errors appropriately or cancel the remainder of the futures (from within the callback) if I determine I don't need the rest of them.
I can tell that's asking for a headache from the borrow checker: trying to modify a future in a Vec in a callback from the async engine.
There's a number of Stack Overflow questions and Reddit posts that explain how join_all joins on a list of futures but cancels the rest if one fails, and how async engines may spawn threads or they may not or they're bad design if they do.
Use futures::future::select_all:
use futures::future; // 0.3.4
type Error = Box<dyn std::error::Error + Send + Sync + 'static>;
type Result<T, E = Error> = std::result::Result<T, E>;
// Returns Ok(42) when `succeeds` is true and an error otherwise.
async fn might_fail(succeeds: bool) -> Result<i32> {
    if succeeds {
        Ok(42)
    } else {
        Err("boom".into())
    }
}
async fn many() -> Result<i32> {
    let raw_futs = vec![might_fail(true), might_fail(false), might_fail(true)];
    // select_all requires Unpin futures, so box and pin each one.
    let unpin_futs: Vec<_> = raw_futs.into_iter().map(Box::pin).collect();
    let mut futs = unpin_futs;
    let mut sum = 0;
    while !futs.is_empty() {
        match future::select_all(futs).await {
            (Ok(val), _index, remaining) => {
                sum += val;
                futs = remaining;
            }
            (Err(_e), _index, remaining) => {
                // Ignoring all errors
                futs = remaining;
            }
        }
        if sum > 42 {
            // Early exit
            return Ok(sum);
        }
    }
    Ok(sum)
}
This will poll all the futures in the collection, returning the first one that is not pending, its index, and the remaining pending or unpolled futures. You can then match on the Result and handle the success or failure case.
Call select_all inside of a loop. This gives you the ability to exit early from the function. When you exit, the futs vector is dropped and dropping a future cancels it.
As mentioned in comments by @kmdreko, @John Kugelman and @e-net4-the-comment-flagger, use join_all from the futures crate version 0.3. The join_all in futures version 0.2 has the cancel behavior described in the question, but join_all in futures crate version 0.3 does not.
For a quick/hacky solution without having to mess with select_all, you can use join_all, but instead of returning Result<T>, return Result<Result<T>> and always return Ok for the top-level result.
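To see why an outer Ok helps with a combinator that stops at the first error, here is an analogy using std iterators instead of futures (an assumption made so that the sketch needs no external crate): collecting into a Result stops at the first Err, while wrapping every item in Ok first keeps all of the inner results. The function names are hypothetical.

```rust
type Outcome = Result<i32, String>;

/// Hypothetical results of three operations, one of which failed.
fn outcomes() -> Vec<Outcome> {
    vec![Ok(1), Err("boom".to_string()), Ok(3)]
}

/// Collecting into Result<Vec<_>, _> short-circuits on the first Err.
fn short_circuit() -> Result<Vec<i32>, String> {
    outcomes().into_iter().collect()
}

/// Wrapping each item in an outer Ok means the collection never sees an
/// Err, so every inner result (success or failure) is preserved.
fn keep_all() -> Result<Vec<Outcome>, String> {
    outcomes().into_iter().map(Ok).collect()
}
```

short_circuit() yields only the error, whereas keep_all() yields all three inner results, which can then be inspected one by one.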
Or you can use join_all
The example from the documentation:
use futures::future::join_all;
async fn foo(i: u32) -> u32 { i }
let futures = vec![foo(1), foo(2), foo(3)];
assert_eq!(join_all(futures).await, [1, 2, 3]);

Why do I have to wrap an Async<T> into another async workflow and let! it?

I'm trying to understand async workflows in F# but I found one part that I really don't understand.
The following code works fine:
let asynWorkflow = async{
    let! result = Stream.TryOpenAsync(partition) |> Async.AwaitTask
    return result
}
let stream = Async.RunSynchronously asynWorkflow
             |> fun openResult -> if openResult.Found then openResult.Stream else Stream(partition)
I define an async workflow where TryOpenAsync returns a Task<StreamOpenResult> type. I convert it to Async<StreamOpenResult> with Async.AwaitTask. (Side quest: "Await"Task? It doesn't await anything, it just converts it, doesn't it? I think it has nothing to do with Task.Wait or the await keyword.) I "await" it with let! and return it.
To start the workflow I use RunSynchronously which should start the workflow and return the result (bind it). On the result I check if the Stream is Found or not.
But now to my first question. Why do I have to wrap the TryOpenAsync call in another async computation and let! ("await") it?
E.g. the following code does not work:
let asynWorkflow = Stream.TryOpenAsync(partition) |> Async.AwaitTask
let stream = Async.RunSynchronously asynWorkflow
             |> fun openResult -> if openResult.Found then openResult.Stream else Stream(partition)
I thought that AwaitTask makes it an Async<T> and that RunSynchronously should start it and then give me the result. What am I missing?
My second question is: why isn't there an "Async.Let!" function available? Maybe because it does not work; or, better, why doesn't it work with the following code?
let ``let!`` task = async{
    let! result = task |> Async.AwaitTask
    return result
}
let stream = Async.RunSynchronously ( ``let!`` (Stream.TryOpenAsync(partition)) )
             |> fun openResult -> if openResult.Found then openResult.Stream else Stream(partition)
I just insert the TryOpenAsync as a parameter but it does not work. By saying does not work I mean the whole FSI will hang. So it has something to do with my async/"await".
--- Update:
Result of working code in FSI:
>
Real: 00:00:00.051, CPU: 00:00:00.031, GC gen0: 0, gen1: 0, gen2: 0
val asynWorkflow : Async<StreamOpenResult>
val stream : Stream
Result of not working code in FSI:
>
And you cannot execute anything in the FSI anymore
--- Update 2
I'm using Streamstone. Here the C# example: https://github.com/yevhen/Streamstone/blob/master/Source/Example/Scenarios/S04_Write_to_stream.cs
and here the Stream.TryOpenAsync: https://github.com/yevhen/Streamstone/blob/master/Source/Streamstone/Stream.Api.cs#L192
I can't tell you why the second example doesn't work without knowing what Stream and partition are and how they work.
However, I want to take this opportunity to point out that the two examples are not strictly equivalent.
F# async is kind of like a "recipe" for what to do. When you write async { ... }, the resulting computation is just sitting there, not actually doing anything. It's more like declaring a function than like issuing a command. Only when you "start" it by calling something like Async.RunSynchronously or Async.Start does it actually run. A corollary is that you can start the same async workflow multiple times, and it's going to be a new workflow every time. Very similar to how IEnumerable works.
C# Task, on the other hand, is more like a "reference" to an async computation that is already running. The computation starts as soon as you call Stream.TryOpenAsync(partition), and it's impossible to obtain a Task instance before the task actually starts. You can await the resulting Task multiple times, but each await will not result in a fresh attempt to open a stream. Only the first await will actually wait for the task's completion, and every subsequent one will just return you the same remembered result.
In the async/reactive lingo, F# async is what you call "cold", while C# Task is referred to as "hot".
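The cold/hot distinction can be sketched in Rust with a closure standing in for the cold "recipe" and an already spawned thread standing in for the hot, already running task. This is an analogy only, not the .NET implementation, and cold_vs_hot is a hypothetical name.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Returns how many times the cold and the hot computation actually ran.
fn cold_vs_hot() -> (usize, usize) {
    // "Cold": a recipe. Nothing happens until it is called,
    // and every call performs the work again.
    let cold_runs = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&cold_runs);
    let cold = move || {
        counter.fetch_add(1, Ordering::SeqCst);
        42
    };
    let first = cold();
    let second = cold();

    // "Hot": the work starts as soon as the handle exists, and it runs once.
    let hot_runs = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&hot_runs);
    let hot = thread::spawn(move || {
        counter.fetch_add(1, Ordering::SeqCst);
        42
    });
    let result = hot.join().unwrap();

    assert_eq!((first, second, result), (42, 42, 42));
    (cold_runs.load(Ordering::SeqCst), hot_runs.load(Ordering::SeqCst))
}
```

The cold closure ran twice because it was invoked twice; the hot computation ran once, when it was created. One difference from a Task: a JoinHandle can be joined only once, whereas a Task can be awaited repeatedly and returns the remembered result each time.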
The second code block looks like it should work to me. It does run if I provide dummy implementations for Stream and StreamOpenResult.
You should avoid using Async.RunSynchronously wherever possible because it defeats the purpose of async. Put all of this code within a larger async block and then you will have access to the StreamOpenResult:
async {
    let! openResult = Stream.TryOpenAsync(partition) |> Async.AwaitTask
    let stream = if openResult.Found then openResult.Stream else Stream(partition)
    return () // now do something with the stream
}
You may need to put an Async.Start or Async.RunSynchronously at the very outer edge of your program to actually run it, but it's better if you have the async (or convert it to a Task) and pass it to some other code (e.g. a web framework) that can call it in a non-blocking manner.
Not that I want to answer your question with another question, but: why are you doing code like this anyway? That might help to understand it. Why not just:
let asyncWorkflow = async {
    let! result = Stream.TryOpenAsync(partition) |> Async.AwaitTask
    if result.Found then return result.Stream else return Stream(partition) }
There's little point in creating an async workflow only to immediately call RunSynchronously on it - it's similar to calling .Result on a Task - it just blocks the current thread until the workflow returns.

XCTest: how to check whether an asynchronous element exists

I work on test automation for an app that communicates with the server. The app has 7 pre-defined strings. Depending on the info the server returns, which is not deterministic and depends on external factors, the app places one to three of the seven pre-defined strings in a table view as hittable static texts. The user has a choice which of those strings to tap.
To automate this test I need an asynchronous way to determine in the test code which of the 7 pre-defined strings actually appear on the screen.
I cannot use element.exists because it takes time for static texts to appear and I do not want to call sleep() because that would slow down the test.
So I tried to use XCTestExpectation but got a problem. XCTest always fails when waitForExpectationsWithTimeout() times out.
To illustrate the problem I wrote a simple test program:
func testExample() {
    let element = XCUIApplication().staticTexts["Email"]
    let gotText = haveElement(element)
    print("Got text: \(gotText)")
}
func haveElement(element: XCUIElement) -> Bool{
    var elementExists = true
    let expectation = self.expectationForPredicate(
        NSPredicate(format: "exists == true"),
        evaluatedWithObject: element,
        handler: nil)
    self.waitForExpectationsWithTimeout(NSTimeInterval(5)) { error in
        elementExists = error == nil
    }
    return elementExists
}
The test always fails with
Assertion Failure: Asynchronous wait failed: Exceeded timeout of 5 seconds, with unfulfilled expectations: "Expect predicate `exists == 1` for object "Email" StaticText".
I also tried
func haveElement(element: XCUIElement) -> Bool {
    var elementExists = false
    let actionExpectation = self.expectationWithDescription("Expected element")
    dispatch_async(dispatch_get_main_queue()) {
        while true {
            if element.exists {
                actionExpectation.fulfill()
                elementExists = true
                break
            } else {
                sleep(1)
            }
        }
    }
    self.waitForExpectationsWithTimeout(NSTimeInterval(5)) { error in
        elementExists = error == nil
    }
    return elementExists
}
In this case the test always fails with a "Stall on main thread." error.
So the question is: how do I check for the presence of an asynchronous UI element that may or may not appear within a specified time, without the test failing on timeout?
Thank you.
You're overcomplicating the test. If you're communicating with a server, there is unnecessary variability in your tests -- my suggestion is to use stubbed network data for each case.
You can get a brief introduction to stubbing network data here:
http://masilotti.com/ui-testing-stub-network-data/
You will eliminate the randomness in the test that comes from the server's response time, as well as the randomness of which strings appear. Create test cases that cover each case (i.e., how the app responds when you tap on each individual string).

Resources