Select First Async Result - asynchronous

Is there an async operator to get the value first returned by two asynchronous values (Async<_>)?
For example, given two Async<_> values where one A1 returns after 1 second and A2 returns after 2 seconds, then I want the result from A1.
The reason is I want to implement an interleave function for asynchronous sequences, so that if there are two asynchronous sequences "defined" like this (with space indicating time as with marble diagrams):
S1 = -+-----+------------+----+
S2 = ---+-------+----------+-----+
Then I want to generate a new asynchronous sequence that acts like this:
S3 = -+-+---+---+--------+-+--+--+
Interleave S1 S2 = S3
But two do that, I probably need a kind of async select operator to select select values.
I think this would be like "select" in Go, where you can take the first available value from two channels.
TPL has a function called Task.WhenAny - I probably need something similar here.

I don't think the operator is available in the F# library. To combine this from existing operations, you could use Async.StartAsTask and then use the existing Task.WhenAny operator. However, I'm not exactly sure how that would behave with respect to cancellation.
The other option is to use the Async.Choose operator implemented on F# Snippets web site. This is not particularly elegant, but it should do the trick! To make the answer stand-alone, the code is attached below.
/// Creates an asynchronous workflow that non-deterministically returns the
/// result of one of the two specified workflows (the one that completes
/// first). This is similar to Task.WaitAny.
static member Choose(a, b) : Async<'T> =
Async.FromContinuations(fun (cont, econt, ccont) ->
// Results from the two
let result1 = ref (Choice1Of3())
let result2 = ref (Choice1Of3())
let handled = ref false
let lockObj = new obj()
let synchronized f = lock lockObj f
// Called when one of the workflows completes
let complete () =
let op =
synchronized (fun () ->
// If we already handled result (and called continuation)
// then ignore. Otherwise, if the computation succeeds, then
// run the continuation and mark state as handled.
// Only throw if both workflows failed.
match !handled, !result1, !result2 with
| true, _, _ -> ignore
| false, (Choice2Of3 value), _
| false, _, (Choice2Of3 value) ->
handled := true
(fun () -> cont value)
| false, Choice3Of3 e1, Choice3Of3 e2 ->
handled := true;
(fun () ->
econt (new AggregateException
("Both clauses of a choice failed.", [| e1; e2 |])))
| false, Choice1Of3 _, Choice3Of3 _
| false, Choice3Of3 _, Choice1Of3 _
| false, Choice1Of3 _, Choice1Of3 _ -> ignore )
op()
// Run a workflow and write result (or exception to a ref cell
let run resCell workflow = async {
try
let! res = workflow
synchronized (fun () -> resCell := Choice2Of3 res)
with e ->
synchronized (fun () -> resCell := Choice3Of3 e)
complete() }
// Start both work items in thread pool
Async.Start(run result1 a)
Async.Start(run result2 b) )

Tomas already answered the precise question. However, you might be interested to know that my Hopac library for F# directly supports Concurrent ML -style first-class, higher-order, selective events, called alternatives, which directly provide a choose -combinator and provide a more expressive concurrency abstraction mechanism than Go's select statement.
Regarding your more specific problem of interleaving two asynchronous sequences, I recently started experimenting with ideas on how Rx-style programming could be done with Hopac. One potential approach I came up with is to define a kind of ephemeral event streams. You can find the experimental code here:
Alts.fsi
Alts.fs
As you can see, one of the operations defined for event streams is merge. What you are looking for may be slightly different semantically, but would likely be straightforward to implement using Hopac -style alternatives (or Concurrent ML -style events).

Related

Hiding a System.Random instance by returning a function

The following code comes from Stylish F# 6: Crafting Elegant Functional Code for .NET 6 listing 9-13:
let randomByte =
let r = System.Random()
fun () ->
r.Next(0, 255) |> byte
// E.g. A3-52-31-D2-90-E6-6F-45-1C-3F-F2-9B-7F-58-34-44-
for _ in 0..15 do
printf "%X-" (randomByte())
printfn ""
The author states, "Although we call randomByte() multiple times, only one System.Random() instance is created."
I understand randomByte returns a function that does not create a System.Random() instance, but it seems to me multiple System.Random() instances would be created each time through the for-do-loop anyway.
I would appreciate an explanation of how multiple instances of System.Random() are not created in this case.
The key point is that randomByte is not a function. It's a value with some complex initialization logic. Like, for example, I could write:
let x = 5
Or I could write:
let x =
let fourtyTwo = 42
let thirtySeven = 37
fourtyTwo - thirtySeven
And these would be equivalent. Both declare a value named x and equal to 5. I hope you can see how the expression fourtyTwo - thirtySeven is evaluated only once, not every time somebody gets the value of x.
And so it works with randomByte too: it's a value with non-trivial initialization logic. During that value's initialization, first it creates an instance of System.Random, and then it creates an anonymous function that closes over that instance, and this anonymous function becomes the value of randomByte.

Railway oriented programming with Async operations

Previously asked similar question but somehow I'm not finding my way out, attempting again with another example.
The code as a starting point (a bit trimmed) is available at https://ideone.com/zkQcIU.
(it has some issue recognizing Microsoft.FSharp.Core.Result type, not sure why)
Essentially all operations have to be pipelined with the previous function feeding the result to the next one. The operations have to be async and they should return error to the caller in case an exception occurred.
The requirement is to give the caller either result or fault. All functions return a Tuple populated with either Success type Article or Failure with type Error object having descriptive code and message returned from the server.
Will appreciate a working example around my code both for the callee and the caller in an answer.
Callee Code
type Article = {
name: string
}
type Error = {
code: string
message: string
}
let create (article: Article) : Result<Article, Error> =
let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
request.Method <- "GET"
try
use response = request.GetResponse() :?> HttpWebResponse
use reader = new StreamReader(response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
Ok ((new DataContractJsonSerializer(typeof<Article>)).ReadObject(memoryStream) :?> Article)
with
| :? WebException as e ->
use reader = new StreamReader(e.Response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
Error ((new DataContractJsonSerializer(typeof<Error>)).ReadObject(memoryStream) :?> Error)
Rest of the chained methods - Same signature and similar bodies. You can actually reuse the body of create for update, upload, and publish to be able to test and compile code.
let update (article: Article) : Result<Article, Error>
// body (same as create, method <- PUT)
let upload (article: Article) : Result<Article, Error>
// body (same as create, method <- PUT)
let publish (article: Article) : Result<Article, Error>
// body (same as create, method < POST)
Caller Code
let chain = create >> Result.bind update >> Result.bind upload >> Result.bind publish
match chain(schemaObject) with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Edit
Based on the answer and matching it with Scott's implementation (https://i.stack.imgur.com/bIxpD.png), to help in comparison and in better understanding.
let bind2 (switchFunction : 'a -> Async<Result<'b, 'c>>) =
fun (asyncTwoTrackInput : Async<Result<'a, 'c>>) -> async {
let! twoTrackInput = asyncTwoTrackInput
match twoTrackInput with
| Ok s -> return! switchFunction s
| Error err -> return Error err
}
Edit 2 Based on F# implementation of bind
let bind3 (binder : 'a -> Async<Result<'b, 'c>>) (asyncResult : Async<Result<'a, 'c>>) = async {
let! result = asyncResult
match result with
| Error e -> return Error e
| Ok x -> return! binder x
}
Take a look at the Suave source code, and specifically the WebPart.bind function. In Suave, a WebPart is a function that takes a context (a "context" is the current request and the response so far) and returns a result of type Async<context option>. The semantics of chaining these together are that if the async returns None, the next step is skipped; if it returns Some value, the next step is called with value as the input. This is pretty much the same semantics as the Result type, so you could almost copy the Suave code and adjust it for Result instead of Option. E.g., something like this:
module AsyncResult
let bind (f : 'a -> Async<Result<'b, 'c>>) (a : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> = async {
let! r = a
match r with
| Ok value ->
let next : Async<Result<'b, 'c>> = f value
return! next
| Error err -> return (Error err)
}
let compose (f : 'a -> Async<Result<'b, 'e>>) (g : 'b -> Async<Result<'c, 'e>>) : 'a -> Async<Result<'c, 'e>> =
fun x -> bind g (f x)
let (>>=) a f = bind f a
let (>=>) f g = compose f g
Now you can write your chain as follows:
let chain = create >=> update >=> upload >=> publish
let result = chain(schemaObject) |> Async.RunSynchronously
match result with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Caution: I haven't been able to verify this code by running it in F# Interactive, since I don't have any examples of your create/update/etc. functions. It should work, in principle — the types all fit together like Lego building blocks, which is how you can tell that F# code is probably correct — but if I've made a typo that the compiler would have caught, I don't yet know about it. Let me know if that works for you.
Update: In a comment, you asked whether you need to have both the >>= and >=> operators defined, and mentioned that you didn't see them used in the chain code. I defined both because they serve different purposes, just like the |> and >> operators serve different purposes. >>= is like |>: it passes a value into a function. While >=> is like >>: it takes two functions and combines them. If you would write the following in a non-AsyncResult context:
let chain = step1 >> step2 >> step3
Then that translates to:
let asyncResultChain = step1AR >=> step2AR >=> step3AR
Where I'm using the "AR" suffix to indicate versions of those functions that return an Async<Result<whatever>> type. On the other hand, if you had written that in a pass-the-data-through-the-pipeline style:
let result = input |> step1 |> step2 |> step3
Then that would translate to:
let asyncResult = input >>= step1AR >>= step2AR >>= step3AR
So that's why you need both the bind and compose functions, and the operators that correspond to them: so that you can have the equivalent of either the |> or the >> operators for your AsyncResult values.
BTW, the operator "names" that I picked (>>= and >=>), I did not pick randomly. These are the standard operators that are used all over the place for the "bind" and "compose" operations on values like Async, or Result, or AsyncResult. So if you're defining your own, stick with the "standard" operator names and other people reading your code won't be confused.
Update 2: Here's how to read those type signatures:
'a -> Async<Result<'b, 'c>>
This is a function that takes type A, and returns an Async wrapped around a Result. The Result has type B as its success case, and type C as its failure case.
Async<Result<'a, 'c>>
This is a value, not a function. It's an Async wrapped around a Result where type A is the success case, and type C is the failure case.
So the bind function takes two parameters:
a function from A to an async of (either B or C)).
a value that's an async of (either A or C)).
And it returns:
a value that's an async of (either B or C).
Looking at those type signatures, you can already start to get an idea of what the bind function will do. It will take that value that's either A or C, and "unwrap" it. If it's C, it will produce an "either B or C" value that's C (and the function won't need to be called). If it's A, then in order to convert it to an "either B or C" value, it will call the f function (which takes an A).
All this happens within an async context, which adds an extra layer of complexity to the types. It might be easier to grasp all this if you look at the basic version of Result.bind, with no async involved:
let bind (f : 'a -> Result<'b, 'c>) (a : Result<'a, 'c>) =
match a with
| Ok val -> f val
| Error err -> Error err
In this snippet, the type of val is 'a, and the type of err is 'c.
Final update: There was one comment from the chat session that I thought was worth preserving in the answer (since people almost never follow chat links). Developer11 asked,
... if I were to ask you what Result.bind in my example code maps to your approach, can we rewrite it as create >> AsyncResult.bind update? It worked though. Just wondering i liked the short form and as you said they have a standard meaning? (in haskell community?)
My reply was:
Yes. If the >=> operator is properly written, then f >=> g will always be equivalent to f >> bind g. In fact, that's precisely the definition of the compose function, though that might not be immediately obvious to you because compose is written as fun x -> bind g (f x) rather than as f >> bind g. But those two ways of writing the compose function would be exactly equivalent. It would probably be very instructive for you to sit down with a piece of paper and draw out the function "shapes" (inputs & outputs) of both ways of writing compose.
Why do you want to use Railway Oriented Programming here? If you just want to run a sequence of operations and return information about the first exception that occurs, then F# already provides a language support for this using exceptions. You do not need Railway Oriented Programming for this. Just define your Error as an exception:
exception Error of code:string * message:string
Modify the code to throw the exception (also note that your create function takes article but does not use it, so I deleted that):
let create () = async {
let ds = new DataContractJsonSerializer(typeof<Error>)
let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
request.Method <- "GET"
try
use response = request.GetResponse() :?> HttpWebResponse
use reader = new StreamReader(response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
return ds.ReadObject(memoryStream) :?> Article
with
| :? WebException as e ->
use reader = new StreamReader(e.Response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
return raise (Error (ds.ReadObject(memoryStream) :?> Error)) }
And then you can compose functions just by sequencing them in async block using let! and add exception handling:
let main () = async {
try
let! created = create ()
let! updated = update created
let! uploaded = upload updated
Debug.WriteLine(uploaded.name)
with Error(code, message) ->
Debug.WriteLine(code + ":" + message) }
If you wanted more sophisticated exception handling, then Railway Oriented Programming might be useful and there is certainly a way of integrating it with async, but if you just want to do what you described in your question, then you can do that much more easily with just standard F#.

get any non-error element from a list of deferred in OCaml/Async

Suppose I have a function such as:
query_server : Server.t -> string Or_error.t Deferred.t
Then I produce a list of deferred queries:
let queries : string Or_error.t Deferred.t list = List.map servers ~f:query_server
How to get the result of the first query that doesn't fail (or some error otherwise). Basically, I'd like a function such as:
any_non_error : 'a Or_error.t Deferred.t list -> 'a Or_error.t
Also, I'm not sure how to somehow aggregate the errors. Maybe my function needs an extra parameter such as Error.t -> Error.t -> Error.t or is there a standard way to combine errors?
A simple approach would be to use Deferred.List that contains list operations lifted into the Async monad, basically a container interface in the Kleisli category. We will try each server in order until the first one is ready, e.g.,
let first_non_error =
Deferred.List.find ~f:(fun s -> query_server s >>| Result.is_ok)
Of course, it is not any_non_error, as the processing is sequential. Also, we are losing the error information (though the latter is very easy to fix).
So to make it parallel, we will employ the following strategy. We will have two deferred computations, the first will run all queries in parallel and wait until all are ready, the second will become determined as soon as an Ok result is received. If the first one happens before the last one, then it means that all servers failed. So let's try:
let query_servers servers =
let a_success,got_success = Pipe.create () in
let all_errors = Deferred.List.map ~how:`Parallel servers ~f:(fun s ->
query_server s >>| function
| Error err as e -> e
| Ok x as ok -> Pipe.write_without_pushback x; ok) in
Deferred.any [
Deferred.any all_errors;
Pipe.read a_success >>= function
| `Ok x -> Ok x
| `Eof -> assert false
]

How can I throttle a large array of async workflows passed to Async.Parallel

I have an array holding a large number of small async database queries; for example:
// I actually have a more complex function that
// accepts name/value pairs for query parameters.
let runSql connString sql = async {
use connection = new SqlConnection(connString)
use command = new SqlCommand(sql, connection)
do! connection.OpenAsync() |> Async.AwaitIAsyncResult |> Async.Ignore
return! command.ExecuteScalarAsync() |> Async.AwaitTask
}
let getName (id:Guid) = async {
// I actually use a parameterized query
let querySql = "SELECT Name FROM Entities WHERE ID = '" + id.ToString() + "'"
return! runSql connectionString querySql
}
let ids : Guid array = getSixtyThousandIds()
let asyncWorkflows = ids |> Array.map getName
//...
Now, the problem: The next expression runs all 60K workflows at once, flooding the server. This leads to many of the SqlCommands timing out; it also typically causes out of memory exceptions in the client (which is F# interactive) for reasons I do not understand and (not needing to understand them) have not investigated:
//...
let names =
asyncWorkflows
|> Async.Parallel
|> Async.RunSynchronously
I've written a rough-and-ready function to batch the requests:
let batch batchSize asyncs = async {
let batches = asyncs
|> Seq.mapi (fun i a -> i, a)
|> Seq.groupBy (fst >> fun n -> n / batchSize)
|> Seq.map (snd >> Seq.map snd)
|> Seq.map Async.Parallel
let results = ref []
for batch in batches do
let! result = batch
results := (result :: !results)
return (!results |> List.rev |> Seq.collect id |> Array.ofSeq)
}
To use this function, I replace Async.Parallel with batch 20 (or another integer value):
let names =
asyncWorkflows
|> batch 20
|> Async.RunSynchronously
This works reasonably well, but I would prefer to have a system that starts each new async as soon as one completes, so rather than successive batches of size N starting after each previous batch of size N has finished, I am always awaiting N active SqlCommands (until I get to the end, of course).
Questions:
Am I reinventing the wheel? In other words, are there library functions that do this already? (Would it be profitable to look into exploiting ParallelEnumerable.WithDegreeOfParallelism somehow?)
If not, how should I implement a continuous queue instead of a series of discrete batches?
I am not primarily seeking suggestions to improve the existing code, but such suggestions will nonetheless be received with interest and gratitude.
FSharpx.Control offers an Async.ParallelWithThrottle function. I'm not sure if it is the best implementation as it uses SemaphoreSlim. But the ease of use is great and since my application doesn't need top performance it works well enough for me. Although since it is a library if someone knows how to make it better it is always a nice thing to make libraries top performers out of the box so the rest of us can just use the code that works and just get our work done!
Async.Parallel had support for throttling added in FSharp v 4.7. You do:
let! results = Async.Parallel(workflows, maxDegreeOfParallelism = dop)
if doing more than 1200 workflows concurrently in FSharp.Core versions <= 6.0.5, see this resolved issue
Proposal for a more explicit API

Why would disposal of resources be delayed when using the "use" binding within an async computation expression?

I've got an agent which I set up to do some database work in the background. The implementation looks something like this:
let myAgent = MailboxProcessor<AgentData>.Start(fun inbox ->
let rec loop =
async {
let! data = inbox.Receive()
use conn = new System.Data.SqlClient.SqlConnection("...")
data |> List.map (fun e -> // Some transforms)
|> List.sortBy (fun (_,_,t,_,_) -> t)
|> List.iter (fun (a,b,c,d,e) ->
try
... // Do the database work
with e -> Log.error "Yikes")
return! loop
}
loop)
With this I discovered that if this was called several times in some amount of time I would start getting SqlConnection objects piling up and not being disposed, and eventually I would run out of connections in the connection pool (I don't have exact metrics on how many "several" is, but running an integration test suite twice in a row could always cause the connection pool to run dry).
If I change the use to a using then things are disposed properly and I don't have a problem:
let myAgent = MailboxProcessor<AgentData>.Start(fun inbox ->
let rec loop =
async {
let! data = inbox.Receive()
using (new System.Data.SqlClient.SqlConnection("...")) <| fun conn ->
data |> List.map (fun e -> // Some transforms)
|> List.sortBy (fun (_,_,t,_,_) -> t)
|> List.iter (fun (a,b,c,d,e) ->
try
... // Do the database work
with e -> Log.error "Yikes")
return! loop
}
loop)
It seems that the Using method of the AsyncBuilder is not properly calling its finally function for some reason, but it's not clear why. Does this have something to do with how I've written my recursive async expression, or is this some obscure bug? And does this suggest that utilizing use within other computation expressions could produce the same sort of behavior?
This is actually the expected behavior - although not entirely obvious!
The use construct disposes of the resource when the execution of the asynchronous workflow leaves the current scope. This is the same as the behavior of use outside of asynchronous workflows. The problem is that recursive call (outside of async) or recursive call using return! (inside async) does not mean that you are leaving the scope. So in this case, the resource is disposed of only after the recursive call returns.
To test this, I'll use a helper that prints when disposed:
let tester () =
{ new System.IDisposable with
member x.Dispose() = printfn "bye" }
The following function terminates the recursion after 10 iterations. This means that it keeps allocating the resources and disposes of all of them only after the entire workflow completes:
let rec loop(n) = async {
if n < 10 then
use t = tester()
do! Async.Sleep(1000)
return! loop(n+1) }
If you run this, it will run for 10 seconds and then print 10 times "bye" - this is because the allocated resources are still in scope during the recursive calls.
In your sample, the using function delimits the scope more explicitly. However, you can do the same using nested asynchronous workflow. The following only has the resource in scope when calling the Sleep method and so it disposes of it before the recursive call:
let rec loop(n) = async {
if n < 10 then
do! async {
use t = tester()
do! Async.Sleep(1000) }
return! loop(n+1) }
Similarly, when you use for loop or other constructs that restrict the scope, the resource is disposed immediately:
let rec loop(n) = async {
for i in 0 .. 10 do
use t = tester()
do! Async.Sleep(1000) }

Resources