Why do I have to wrap an Async<T> into another async workflow and let! it? - asynchronous

I'm trying to understand async workflows in F# but I found one part that I really don't understand.
The following code works fine:
let asynWorkflow = async{
let! result = Stream.TryOpenAsync(partition) |> Async.AwaitTask
return result
}
let stream = Async.RunSynchronously asynWorkflow
|> fun openResult -> if openResult.Found then openResult.Stream else Stream(partition)
I define a async workflow where TryOpenAsync returns a Task<StreamOpenResult> type. I convert it to Async<StreamOpenResult> with Async.AwaitTask. (Side quest: "Await"Task? It doesn't await it just convert it, does it? I think it has nothing to do with Task.Wait or the await keyword). I "await" it with let! and return it.
To start the workflow I use RunSynchronously which should start the workflow and return the result (bind it). On the result I check if the Stream is Found or not.
But now to my first question. Why do I have to wrap the TryOpenAsync call in another async computation and let! ("await") it?
E.g. the following code does not work:
let asynWorkflow = Stream.TryOpenAsync(partition) |> Async.AwaitTask
let stream = Async.RunSynchronously asynWorkflow
|> fun openResult -> if openResult.Found then openResult.Stream else Stream(partition)
I thought the AwaitTask makes it an Async<T> and RunSynchronously should start it. Then use the result. What do I miss?
My second question is why is there any "Async.Let!" function available? Maybe because it does not work or better why doesn't it work with the following code?
let ``let!`` task = async{
let! result = task |> Async.AwaitTask
return result
}
let stream = Async.RunSynchronously ( ``let!`` (Stream.TryOpenAsync(partition)) )
|> fun openResult -> if openResult.Found then openResult.Stream else Stream(partition)
I just insert the TryOpenAsync as a parameter but it does not work. By saying does not work I mean the whole FSI will hang. So it has something to do with my async/"await".
--- Update:
Result of working code in FSI:
>
Real: 00:00:00.051, CPU: 00:00:00.031, GC gen0: 0, gen1: 0, gen2: 0
val asynWorkflow : Async<StreamOpenResult>
val stream : Stream
Result of not working code in FSI:
>
And you cannot execute anything in the FSI anymore
--- Update 2
I'm using Streamstone. Here the C# example: https://github.com/yevhen/Streamstone/blob/master/Source/Example/Scenarios/S04_Write_to_stream.cs
and here the Stream.TryOpenAsync: https://github.com/yevhen/Streamstone/blob/master/Source/Streamstone/Stream.Api.cs#L192

I can't tell you why the second example doesn't work without knowing what Stream and partition are and how they work.
However, I want to take this opportunity to point out that the two examples are not strictly equivalent.
F# async is kind of like a "recipe" for what to do. When you write async { ... }, the resulting computation is just sitting there, not actually doing anything. It's more like declaring a function than like issuing a command. Only when you "start" it by calling something like Async.RunSynchronously or Async.Start does it actually run. A corollary is that you can start the same async workflow multiple times, and it's going to be a new workflow every time. Very similar to how IEnumerable works.
C# Task, on the other hand, is more like a "reference" to an async computation that is already running. The computation starts as soon as you call Stream.TryOpenAsync(partition), and it's impossible to obtain a Task instance before the task actually starts. You can await the resulting Task multiple times, but each await will not result in a fresh attempt to open a stream. Only the first await will actually wait for the task's completion, and every subsequent one will just return you the same remembered result.
In the async/reactive lingo, F# async is what you call "cold", while C# Task is referred to as "hot".

The second code block looks like it should work to me. It does run it if I provide dummy implementations for Stream and StreamOpenResult.
You should avoid using Async.RunSynchronously wherever possible because it defeats the purpose of async. Put all of this code within a larger async block and then you will have access to the StreamOpenResult:
async {
let! openResult = Stream.TryOpenAsync(partition) |> Async.AwaitTask
let stream = if openResult.Found then openResult.Stream else Stream(partition)
return () // now do something with the stream
}
You may need to put a Async.Start or Async.RunSynchronously at the very outer edge of your program to actually run it, but it's better if you have the async (or convert it to a Task) and pass it to some other code (e.g. a web framework) that can call it in a non-blocking manner.

Not that I want to answer your question with another question, but: why are you doing code like this anyway? That might help to understand it. Why not just:
let asyncWorkflow = async {
let! result = Stream.TryOpenAsync(partition) |> Async.AwaitTask
if result.Found then return openResult.Stream else return Stream(partition) }
There's little point in creating an async workflow only to immediately call RunSynchronously on it - it's similar to calling .Result on a Task - it just blocks the current thread until the workflow returns.

Related

Running Parallel async functions and waiting for the result in F# in a loop

I have a function that I want to run 5 at a time from a chunked list and then wait a second so as not to hit an Api rate limit. I know the code is wrong but it is as far as I have got.
let ordersChunked = mOrders |> List.chunkBySize 5
for fiveOrders in ordersChunked do
let! tasks =
fiveOrders
|> List.map (fun (order) -> trackAndShip(order) )
|> Async.Parallel
Async.Sleep(1000) |> ignore
trackAndShip is an async function not a task and I get a little confused about the two.
Some of the answers I have read add |> Async.RunSynchronously to the end of it - and I do that at the top level but I don't feel it is good practice here ( and I would need to convert async to Task? )
This is probably a very simple answer if you are familiar with async / tasks but I am still "finding my feet" in this area.
Unlike C#'s tasks, Async computation do not start when they are created and you have to explicitly start them. Furthermore, even if you would start the Async operation, it won't do what you expect because the sleep will not execute because you just ignore it.
From the let! in your example code, I'm going to assume you're the code snippet is inside a computation expression. So the following may be what you want
let asyncComputation =
async {
let ordersChunked = [] |> List.chunkBySize 5
for fiveOrders in ordersChunked do
let! tasks =
fiveOrders
|> List.map (fun (order) -> trackAndShip(order) )
|> Async.Parallel
do! Async.Sleep(1000) // Note the do! here, without it your sleep will be ignored
}
Async.RunSynchronously asyncComputation
Note that the Async.Sleep is now chained into the CE by using do!. The Async.RunSynchronously call is one way to start the async computation: it runs the async operation and blocks the current thread while waiting for the async operation's result. There are many other ways of running it described in the documentation.

How to Simplify Asynchronous Programming in F#

I come from a C# background having used async/ await. I am trying to find a "less verbose" way of programming using a library.
( specifically the Microsoft Playwright library for browser automation )
let (~~) = Async.AwaitTask
let getLastPageNumber(page: IPage) =
let playwright = ~~Playwright.CreateAsync() |> Async.RunSynchronously
let browser = ~~playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions(Headless = false )) |> Async.RunSynchronously
~~page.GotoAsync("https://go.xero.com/BankRec/BankRec.aspx?accountID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&page1") |> Async.RunSynchronously |> ignore
let lastPageLink = ~~page.QuerySelectorAsync("#mainPagerEnd") |> Async.RunSynchronously
if lastPageLink = null then
//this is the last page
1
else
let lastPageNumber = ~~lastPageLink.GetAttributeAsync("href") |> Async.RunSynchronously
lastPageNumber |> int
I have shortened things a bit using the alias ~~ for Async.AwaitTask but it seems to be a lot of code to do something that was a lot easier in C#.
Async.RunSynchronously should only be used as a very last resort because it blocks a thread to perform the computation, which defeats the purpose of using async/tasks.
The F# equivalent of C#'s async/await is to use F#'s Async type, and the async computation expression. However, if you're using a .NET library which uses the .NET Task type then you can use the TaskBuilder.fs library which has a task computation expression.
Then you would write the function like this:
open FSharp.Control.Tasks
let getLastPageNumber(page: IPage) = task {
let! playwright = Playwright.CreateAsync()
let! browser = playwright.Chromium.LaunchAsync(new BrowserTypeLaunchOptions(Headless = false ))
let! _ = page.GotoAsync("https://go.xero.com/BankRec/BankRec.aspx?accountID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx&page1")
let! lastPageLink = page.QuerySelectorAsync("#mainPagerEnd")
if lastPageLink = null then
//this is the last page
return 1
else
let! lastPageNumber = lastPageLink.GetAttributeAsync("href")
return lastPageNumber |> int
}
Inside the computation expression (also called a builder) let! is used to await tasks.
Note that this function now returns Task<int> rather than int, so the caller would probably need to follow a similar pattern and propogate the task further.
You can read more about using async in F# here. Some of that knowledge can also be applied to the task computation expression.

How to tell if an async computation is sent to the thread pool?

I was recently informed that in
async {
return! async { return "hi" } }
|> Async.RunSynchronously
|> printfn "%s"
the nested Async<'T> (async { return 1 }) would not be sent to the thread pool for evaluation, whereas in
async {
use ms = new MemoryStream [| 0x68uy; 0x69uy |]
use sr = new StreamReader (ms)
return! sr.ReadToEndAsync () |> Async.AwaitTask }
|> Async.RunSynchronously
|> printfn "%s"
the nested Async<'T> (sr.ReadToEndAsync () |> Async.AwaitTask) would be. What is it about an Async<'T> that decides whether it's sent to the thread pool when it's executed in an asynchronous operation like let! or return!? In particular, how would you define one which is sent to the thread pool? What code do you have to include in the async block, or in the lambda passed into Async.FromContinuations?
TL;DR: It's not quite like that. The async itself doesn't "send" anything to the thread pool. All it does is just run continuations until they stop. And if one of those continuations decides to continue on a new thread - well, that's when thread switching happens.
Let's set up a small example to illustrate what happens:
let log str = printfn $"{str}: thread = {Thread.CurrentThread.ManagedThreadId}"
let f = async {
log "1"
let! x = async { log "2"; return 42 }
log "3"
do! Async.Sleep(TimeSpan.FromSeconds(3.0))
log "4"
}
log "starting"
f |> Async.StartImmediate
log "started"
Console.ReadLine()
If you run this script, it will print, starting, then 1, 2, 3, then started, then wait 3 seconds, and then print 4, and all of them except 4 will have the same thread ID. You can see that everything until Async.Sleep is executed synchronously on the same thread, but after that async execution stops and the main program execution continues, printing started and then blocking on ReadLine. By the time Async.Sleep wakes up and wants to continue execution, the original thread is already blocked on ReadLine, so the async computation gets to continue running on a new one.
What's going on here? How does this function?
First, the way the async computation is structured is in "continuation-passing style". It's a technique where every function doesn't return its result to the caller, but calls another function instead, passing the result as its parameter.
Let me illustrate with an example:
// "Normal" style:
let f x = x + 5
let g x = x * 2
printfn "%d" (f (g 3)) // prints 11
// Continuation-passing style:
let f x next = next (x + 5)
let g x next = next (x * 2)
g 3 (fun res1 -> f res1 (fun res2 -> printfn "%d" res2))
This is called "continuation-passing" because the next parameters are called "continuations" - i.e. they're functions that express how the program continues after calling f or g. And yes, this is exactly what Async.FromContinuations means.
Seeming very silly and roundabout on the surface, what this allows us to do is for each function to decide when, how, or even if its continuation happens. For example, our f function from above could be doing something asynchronous instead of just plain returning the result:
let f x next = httpPost "http://calculator.com/add5" x next
Coding it in continuation-passing style would allow such function to not block the current thread while the request to calculator.com is in flight. What's wrong with blocking the thread, you ask? I'll refer you to the original answer that prompted your question in the first place.
Second, when you write those async { ... } blocks, the compiler gives you a little help. It takes what looks like a step-by-step imperative program and "unrolls" it into a series of continuation-passing calls. The "breaking" points for this unfolding are all the constructs that end with a bang - let!, do!, return!.
The above async block, for example, would look somethiing like this (F#-ish pseudocode):
let return42 onDone =
log "2"
onDone 42
let f onDone =
log "1"
return42 (fun x ->
log "3"
Async.Sleep (3 seconds) (fun () ->
log "4"
onDone ()
)
)
Here, you can plainly see that the return42 function simply calls its continuation right away, thus making the whole thing from log "1" to log "3" completely synchronous, whereas the Async.Sleep function doesn't call its continuation right away, instead scheduling it to be run later (in 3 seconds) on the thread pool. That's where the thread switching happens.
And here, finally, lies the answer to your question: in order to have the async computation jump threads, your callback passed to Async.FromContinuations should do anything but call the success continuation immediately.
A few notes for further investigation
The onDone technique in the above example is technically called "monadic bind", and indeed in real F# programs it's represented by the async.Bind method. This answer might also be of help understanding the concept.
The above is a bit of an oversimplification. In reality the async execution is a bit more complicated than that. Internally it uses a technique called "trampoline", which in plain terms is just a loop that runs a single thunk on every turn, but crucially, the running thunk can also "ask" it to run another thunk, and if it does, the loop will do so, and so on, forever, until the next thunk doesn't ask to run another thunk, and then the whole thing finally stops.
I specifically used Async.StartImmediate to start the computation in my example, because Async.StartImmediate will do just what it says on the tin: it will start running the computation immediately, right there. That's why everything ran on the same thread as the main program. There are many alternative starting functions in the Async module. For example, Async.Start will start the computation on the thread pool. The lines from log "1" to log "3" will still all happen synchronously, without thread switching between them, but it will happen on a different thread from log "start" and log "starting". In this case thread switching will happen before the async computation even starts, so it doesn't count.

In F#, how to always execute a bound Async process?

I have the following methods in F#:
let GetAllPatientNamesAsync() =
async {
let data = context.GetAllPatientNamesAsync()
return! Async.AwaitTask data
}
let getAllPatientNames = Async.RunSynchronously (FsNetwork.GetAllPatientNamesAsync())
This works the FIRST time:
{ m with names = getAllPatientNames}
But once getAllPatientNames has a value, calling it again does not execute the Async.RunSynchronously (FsNetwork.GetAllPatientNamesAsync()) -- rather getAllPatientNames returns its original value.
Since FsNetwork.GetAllPatientNamesAsync() is accessing the database, which is changing, after the first read getAllPatientNames is wrong.
Ofcourse, this works correctly no matter how many times I use it:
{ m with names = Async.RunSynchronously (FsNetwork.GetAllPatientNamesAsync()) }.
How should I write
let getAllPatientNames = Async.RunSynchronously (FsNetwork.GetAllPatientNamesAsync())
such that it ALWAYS executes the Async process? What am I doing wrong?
TIA
The problem is that
getAllPatientNames
is a a value and not a function. Make a it a function by changing it like below
let getAllPatientNames () = Async.RunSynchronously (FsNetwork.GetAllPatientNamesAsync())

F# Make Async<Async<MyTpe>[]> to Async<MyType>[]

I get some list of data from a HTTP call. I then know what values to get for another HTTP call. I would like to have everything be asynchronous. But I need to use this data with Expecto's testCaseAsync : string -> Async<unit> -> Test. So, my goal is to get a signature like so Async<Item>[]
So, I would like to get a list of testCaseAsync.
So, I basically have something like this:
// Async<Async<Item>[]>
let getAsyncCalls =
async {
let! table = API.getTable ()
// Async<Item>[]
let items =
table.root
|> Array.map (fun x -> API.getItem x.id)
return item
}
If I run them in parallel I get:
// Async<Item[]>
let getAsyncCalls =
async {
let! table = API.getTable ()
// Item[]
let! items =
table.root
|> Array.map (fun x -> API.getItem x.id)
return item
}
So, that doesn't get me to Async<Item>[]. I'm not sure if this is possible. I would like to avoid Async.RunSynchronously for the API.getTable call since that can lead to deadlocks, right? It will most likely be called from a cached value (memoized) so I'm not sure that will make a difference.
I guess I'll keep working on it unless someone else is more clever than me :-) Thanks in advance!
In general, you cannot turn Async<Async<T>[]> into Async<T>[]. The problem is that to even get the length of the array, you need to perform some operation asynchronously, so there is no way to "lift" the array outside of the async. If you knew the length of the array in advance, then you can make this work.
The following function turns Async<'T[]> into Async<'T>[] provided that you give it the length of the array. As you figured out, the returned asyncs need to somehow share access to the one top-level async. The easiest way of doing this I can think of is to use a task. Adapting that for your use case should be easy:
let unwrapAsyncArray (asyncs:Async<'T[]>) len =
let task = asyncs |> Async.StartAsTask
Array.init len (fun i -> async {
let! res = Async.AwaitTask task
if res.Length <> len then failwith "Wrong length!"
return res.[i] }
)

Resources