get any non-error element from a list of deferred in OCaml/Async - asynchronous

Suppose I have a function such as:
query_server : Server.t -> string Or_error.t Deferred.t
Then I produce a list of deferred queries:
let queries : string Or_error.t Deferred.t list = List.map servers ~f:query_server
How to get the result of the first query that doesn't fail (or some error otherwise). Basically, I'd like a function such as:
any_non_error : 'a Or_error.t Deferred.t list -> 'a Or_error.t
Also, I'm not sure how to somehow aggregate the errors. Maybe my function needs an extra parameter such as Error.t -> Error.t -> Error.t or is there a standard way to combine errors?

A simple approach would be to use Deferred.List that contains list operations lifted into the Async monad, basically a container interface in the Kleisli category. We will try each server in order until the first one is ready, e.g.,
let first_non_error =
Deferred.List.find ~f:(fun s -> query_server s >>| Result.is_ok)
Of course, it is not any_non_error, as the processing is sequential. Also, we are losing the error information (though the latter is very easy to fix).
So to make it parallel, we will employ the following strategy. We will have two deferred computations, the first will run all queries in parallel and wait until all are ready, the second will become determined as soon as an Ok result is received. If the first one happens before the last one, then it means that all servers failed. So let's try:
let query_servers servers =
let a_success,got_success = Pipe.create () in
let all_errors = Deferred.List.map ~how:`Parallel servers ~f:(fun s ->
query_server s >>| function
| Error err as e -> e
| Ok x as ok -> Pipe.write_without_pushback x; ok) in
Deferred.any [
Deferred.any all_errors;
Pipe.read a_success >>= function
| `Ok x -> Ok x
| `Eof -> assert false
]

Related

Understanding side effects with monadic traversal

I am trying to properly understand how side effects work when traversing a list in F# using monadic style, following Scott's guide here
I have an AsyncSeq of items, and a side-effecting function that can return a Result<'a,'b> (it is saving the items to disk).
I get the general idea - split the head and tail, apply the func to the head. If it returns Ok then recurse through the tail, doing the same thing. If an Error is returned at any point then short circuit and return it.
I also get why Scott's ultimate solution uses foldBack rather than fold - it keeps the output list in the same order as the input as each processed item is prepended to the previous.
I can also follow the logic:
The result from the list's last item (processed first as we are using foldback) will be passed as the accumulator to the next item.
If it is an Error and the next item is Ok, the next item is discarded.
If the next item is an Error, it replaces any previous results and becomes the accumulator.
That means by the time you have recursed over the entire list from right to left and ended up at the start, you either have an Ok of all of the results in the correct order or the most recent Error (which would have been the first to occur if we had gone left to right).
The thing that confuses me is that surely, since we are starting at the end of the list, all side effects of processing every item will take place, even if we only get back the last Error that was created?
This seems to be confirmed here as the print output starts with [5], then [4,5], then [3,4,5] etc.
The thing that confuses me is that this isn't what I see happening when I use AsyncSeq.traverseChoiceAsync from the FSharpx lib (which I wrapped to process Result instead of Choice). I see side effects happening from left to right, stopping on the first error, which is what I want to happen.
It also looks like Scott's non-tail recursive version (which doesn't use foldBack and just recurses over the list) goes from left to right? The same goes for the AsyncSeq version. That would explain why I see it short circuit on the first error but surely if it completes Ok then the output items would be reversed, which is why we normally use foldback?
I feel I am misunderstanding or misreading something obvious! Could someone please explain it to me? :)
Edit:
rmunn has given a really great comprehensive explanation of the AsyncSeq traversal below. The TLDR was that
Scott's initial implementation and the AsyncSeq traverse both do go from left to right as I thought and so only process until they hit an error
they keep their contents in order by prepending the head to the processed tail rather than prepending each processed result to the previous (which is what the built in F# fold does).
foldback would keep things in order but would indeed execute every case (which could take forever with an async seq)
It's pretty simple: traverseChoiceAsync isn't using foldBack. Yes, with foldBack the last item would be processed first, so that by the time you get to the first item and discover that its result is Error you'd have triggered the side effects of every item. Which is, I think, precisely why whoever wrote traverseChoiceAsync in FSharpx chose not to use foldBack, because they wanted to ensure that side effects would be triggered in order, and stop at the first Error (or, in the case of the Choice version of the function, the first Choice2Of2 — but I'll pretend from this point on that that function was written to use the Result type.)
Let's look at the traverseChoieAsync function in the code you linked to, and read through it step-by-step. I'll also rewrite it to use Result instead of Choice, because the two types are basically identical in function but with different names in the DU, and it'll be a little easier to tell what's going on if the DU cases are called Ok and Error instead of Choice1Of2 and Choice2Of2. Here's the original code:
let rec traverseChoiceAsync (f:'a -> Async<Choice<'b, 'e>>) (s:AsyncSeq<'a>) : Async<Choice<AsyncSeq<'b>, 'e>> = async {
let! s = s
match s with
| Nil -> return Choice1Of2 (Nil |> async.Return)
| Cons(a,tl) ->
let! b = f a
match b with
| Choice1Of2 b ->
return! traverseChoiceAsync f tl |> Async.map (Choice.mapl (fun tl -> Cons(b, tl) |> async.Return))
| Choice2Of2 e ->
return Choice2Of2 e }
And here's the original code rewritten to use Result. Note that it's a simple rename, and none of the logic needs to be changed:
let rec traverseResultAsync (f:'a -> Async<Result<'b, 'e>>) (s:AsyncSeq<'a>) : Async<Result<AsyncSeq<'b>, 'e>> = async {
let! s = s
match s with
| Nil -> return Ok (Nil |> async.Return)
| Cons(a,tl) ->
let! b = f a
match b with
| Ok b ->
return! traverseChoiceAsync f tl |> Async.map (Result.map (fun tl -> Cons(b, tl) |> async.Return))
| Error e ->
return Error e }
Now let's step through it. The whole function is wrapped inside an async { } block, so let! inside this function means "unwrap" in an async context (essentially, "await").
let! s = s
This takes the s parameter (of type AsyncSeq<'a>) and unwraps it, binding the result to a local name s that henceforth will shadow the original parameter. When you await the result of an AsyncSeq, what you get is the first element only, while the rest is still wrapped in an async that needs to be further awaited. You can see this by looking at the result of the match expression, or by looking at the definition of the AsyncSeq type:
type AsyncSeq<'T> = Async<AsyncSeqInner<'T>>
and AsyncSeqInner<'T> =
| Nil
| Cons of 'T * AsyncSeq<'T>
So when you do let! x = s when s is of type AsyncSeq<'T>, the value of x will either be Nil (when the sequence has run to its end) or it will be Cons(head, tail) where head is of type 'T and tail is of type AsyncSeq<'T>.
So after this let! s = s line, our local name s now refers to an AsyncSeqInner type, which contains the head item of the sequence (or Nil if the sequence was empty), and the rest of the sequence is still wrapped in an AsyncSeq so it has yet to be evaluated (and, crucially, its side effects have not yet happened).
match s with
| Nil -> return Ok (Nil |> async.Return)
There's a lot happening in this line, so it'll take a bit of unpacking, but the gist is that if the input sequence s had Nil as its head, i.e. had reached its end, then that's not an error, and we return an empty sequence.
Now to unpack. The outer return is in an async keyword, so it takes the Result (whose value is Ok something) and turns it into an Async<Result<something>>. Remembering that the return type of the function is declared as Async<Result<AsyncSeq>>, the inner something is clearly an AsyncSeq type. So what's going on with that Nil |> async.Return? Well, async isn't an F# keyword, it's the name of an instance of AsyncBuilder. Inside a computation expression foo { ... }, return x is translated into foo.Return(x). So calling async.Return x is just the same as writing async { return x }, except that it avoids nesting a computation expression inside another computation expression, which would be a little nasty to try and parse mentally (and I'm not 100% sure the F# compiler allows it syntactically). So Nil |> async.Return is async.Return Nil which means it produces a value of Async<x> where x is the type of the value Nil. And as we just saw, this Nil is a value of type AsyncSeqInner, so Nil |> async.Return produces an Async<AsyncSeqInner>. And another name for Async<AsyncSeqInner> is AsyncSeq. So this whole expression produces an Async<Result<AsyncSeq>> that has the meaning of "We're done here, there are no more items in the sequence, and there was no error".
Phew. Now for the next line:
| Cons(a,tl) ->
Simple: if the next item in the AsyncSeq named s was a Cons, we deconstruct it so that the actual item is now called a, and the tail (another AsyncSeq) is called tl.
let! b = f a
This calls f on the value we just got out of s, and then unwraps the Async part of f's return value, so that b is now a Result<'b, 'e>.
match b with
| Ok b ->
More shadowed names. Inside this branch of the match, b now names a value of type 'b rather than a Result<'b, 'e>.
return! traverseResultAsync f tl |> Async.map (Result.map (fun tl -> Cons(b, tl) |> async.Return))
Hoo boy. That's too much to tackle at once. Let's write this as if the |> operators were lined up on separate lines, and then we'll go through each step one at a time. (Note that I've wrapped an extra pair of parentheses around this, just to clarify that it's the final result of this whole expression that will be passed to the return! keyword).
return! (
traverseResultAsync f tl
|> Async.map (
Result.map (
fun tl -> Cons(b, tl) |> async.Return)))
I'm going to tackle this expression from the inside out. The inner line is:
fun tl -> Cons(b, tl) |> async.Return
The async.Return thing we've already seen. This is a function that takes a tail (we don't currently know, or care, what's inside that tail, except that by the necessity of the type signature of Cons it must be an AsyncSeq) and turns it into an AsyncSeq that is b followed by the tail. I.e., this is like b :: tl in a list: it sticks b onto the front of the AsyncSeq.
One step out from that innermost expression is:
Result.map
Remember that the function map can be thought of in two ways: one is "take a function and run it against whatever is "inside" this wrapper". The other is "take a function that operates on 'T and make it into a function that operates on Wrapper<'T>". (If you don't have both of those clear in your mind yet, https://sidburn.github.io/blog/2016/03/27/understanding-map is a pretty good article to help grok that concept). So what this is doing is taking a function of type AsyncSeq -> AsyncSeq and turning it into a function of type Result<AsyncSeq> -> Result<AsyncSeq>. Alternately, you could think of it as taking a Result<tail> and calling fun tail -> ... against that tail result, then re-wrapping the result of that function in a new Result. Important: Because this is using Result.map (Choice.mapl in the original) we know that if tail is an Error value (or if the Choice was a Choice2Of2 in the original), the function will not be called. So if traverseResultAsync produces a result that starts with an Error value, it's going to produce an <Async<Result<foo>>> where the value of Result<foo> is an Error, and so the value of the tail will be discarded. Keep that in mind for later.
Okay, next step out.
Async.map
Here, we have a Result<AsyncSeq> -> Result<AsyncSeq> function produced by the inner expression, and this converts it to an Async<Result<AsyncSeq>> -> Async<Result<AsyncSeq>> function. We've just talked about this, so we don't need to go over how map works again. Just remember that the effect of this Async<Result<AsyncSeq>> -> Async<Result<AsyncSeq>> function that we've built up will be the following:
Await the outer async.
If the result is Error, return that Error.
If the result is Ok tail, produce an Ok (Cons (b, tail)).
Next line:
traverseResultAsync f tl
I probably should have started with this, because this will actually run first, and then its value will be passed into the Async<Result<AsyncSeq>> -> Async<Result<AsyncSeq>> function that we've just analysed.
So what this whole thing will do is to say "Okay, we took the first part of the AsyncSeq we were handed, and passed it to f, and f produced an Ok result with a value we're calling b. So now we need to process the rest of the sequence similarly, and then, if the rest of the sequence produces an Ok result, we'll stick b on the front of it and return an Ok sequence with contents b :: tail. BUT if the rest of the sequence produces an Error, we'll throw away the value of b and just return that Error unchanged."
return!
This just takes the result we just got (either an Error or an Ok (b :: tail), already wrapped in an Async) and returns it unchanged. But note that the call to traverseResultAsync is NOT tail-recursive, because its value had to be passed into the Async.map (...) expression first.
And now we still have one more bit of traverseResultAsync to look at. Remember when I said "Keep that in mind for later"? Well, that time has arrived.
| Error e ->
return Error e }
Here we're back in the match b with expression. If b was an Error result, then no further recursive calls are made, and the whole traverseResultAsync returns an Async<Result> where the Result value is Error. And if we were currently nested deep inside a recursion (i.e., we're in the return! traverseResultAsync ... expression), then our return value will be Error, which means the result of the "outer" call, as we've kept in mind, will also be Error, discarding any other Ok results that might have happened "before".
Conclusion
And so the effect of all of that is:
Step through the AsyncSeq, calling f on each item in turn.
The first time f returns Error, stop stepping through, throw away any previous Ok results, and return that Error as the result of the whole thing.
If f never returns Error and instead returns Ok b every time, return an Ok result that contains an AsyncSeq of all those b values, in their original order.
Why are they in their original order? Because the logic in the Ok case is:
If sequence was empty, return an empty sequence.
Split into head and tail.
Get value b from f head.
Process the tail.
Stick value b in front of the result of processing the tail.
So if we started with (conceptually) [a1; a2; a3], which actually looks like Cons (a1, Cons (a2, Cons (a3, Nil))) we'll end up with Cons (b1, Cons (b2, Cons (b3, Nil))) which translates to the conceptual sequence [b1; b2; b3].
See #rmunn's great answer above for the explanation. I just wanted to post a little helper for anyone that reads this in the future, it allows you to use the AsyncSeq traverse with Results instead of the old Choice type it was written with:
let traverseResultAsyncM (mapping : 'a -> Async<Result<'b,'c>>) source =
let mapping' =
mapping
>> Async.map (function
| Ok x -> Choice1Of2 x
| Error e -> Choice2Of2 e)
AsyncSeq.traverseChoiceAsync mapping' source
|> Async.map (function
| Choice1Of2 x -> Ok x
| Choice2Of2 e -> Error e)
Also here is a version for non-async mappings:
let traverseResultM (mapping : 'a -> Result<'b,'c>) source =
let mapping' x = async {
return
mapping x
|> function
| Ok x -> Choice1Of2 x
| Error e -> Choice2Of2 e
}
AsyncSeq.traverseChoiceAsync mapping' source
|> Async.map (function
| Choice1Of2 x -> Ok x
| Choice2Of2 e -> Error e)

How to use memoize over sequence

let memoize (sequence: seq<'a>) =
let cache = Dictionary()
seq {for i in sequence ->
match cache.TryGetValue i with
| true, v -> printf "cached"
| false,_ -> cache.Add(i ,i)
}
I will call my memoize function inside this function :
let isCached (input:seq<'a>) : seq<'a> = memoize input
If the given sequence item is cached it should print cached otherwise it will continue to add sequence value to cache.
Right now I have problems with types.
When I try to call my function like this :
let seq1 = seq { 1 .. 10 }
isCached seq1
It throws an error
"The type int does not match the type unit"
I want my function to work generic even though I return printfn. Is it possible to achieve that? And while adding value to the cache is it appropriate to give the same value to tuple?
eg:
| false,_ -> cache.Add(i ,i)
I think the problem is that your memoize function does not actually return the item from the source sequence as a next element of the returned sequence. Your version only adds items to the cache, but then it returns unit. You can fix that by writing:
let memoize (sequence: seq<'a>) =
let cache = Dictionary()
seq {for i in sequence do
match cache.TryGetValue i with
| true, v -> printf "cached"
| false,_ -> cache.Add(i ,i)
yield i }
I used explicit yield rather than -> because I think that makes the code more readable. With this change, the code runs as expected for me.
Tomas P beat me to the punch, but I'll post this up anyway just in case it helps.
I'm not too sure what you are trying to achieve here, but I'll say a few things that I think might help.
Firstly, the type error. Your isCached function is defined as taking a seq of type 'a, and returning a seq of type 'a. As written in your question, right now it takes a seq of type 'a, and returns a sequence of type unit. If you try modifying the output specification to seq<'b> (or actually just omitting it altogether and letting type inference do it), you should overcome the type error. This probably still won't do what you want, since you aren't actually returning the cache from that function (you can just add cache as the final line to return it). Thus, try something like:
let memoize (sequence: seq<'a>) =
let cache = Dictionary()
for i in sequence do
match cache.TryGetValue i with
| true, v -> printf "cached"
| false,_ -> cache.Add(i ,i)
cache
let isCached (input:seq<'a>) : seq<'b> = memoize input
All this being said, if you are expecting to iterate over the same sequence a lot, it might be best just to use the library function Seq.cache.
Finally, with regards to using the value as the key in the dictionary... There's nothing stopping you from doing that, but it's really fairly pointless. If you already have a value, then you shouldn't need to look it up in the dictionary. If you are just trying to memoize the sequence, then use the index of the given element as the key. Or use the specific input as the key and the output from that input as the value.

Railway oriented programming with Async operations

Previously asked similar question but somehow I'm not finding my way out, attempting again with another example.
The code as a starting point (a bit trimmed) is available at https://ideone.com/zkQcIU.
(it has some issue recognizing Microsoft.FSharp.Core.Result type, not sure why)
Essentially all operations have to be pipelined with the previous function feeding the result to the next one. The operations have to be async and they should return error to the caller in case an exception occurred.
The requirement is to give the caller either result or fault. All functions return a Tuple populated with either Success type Article or Failure with type Error object having descriptive code and message returned from the server.
Will appreciate a working example around my code both for the callee and the caller in an answer.
Callee Code
type Article = {
name: string
}
type Error = {
code: string
message: string
}
let create (article: Article) : Result<Article, Error> =
let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
request.Method <- "GET"
try
use response = request.GetResponse() :?> HttpWebResponse
use reader = new StreamReader(response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
Ok ((new DataContractJsonSerializer(typeof<Article>)).ReadObject(memoryStream) :?> Article)
with
| :? WebException as e ->
use reader = new StreamReader(e.Response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
Error ((new DataContractJsonSerializer(typeof<Error>)).ReadObject(memoryStream) :?> Error)
Rest of the chained methods - Same signature and similar bodies. You can actually reuse the body of create for update, upload, and publish to be able to test and compile code.
let update (article: Article) : Result<Article, Error>
// body (same as create, method <- PUT)
let upload (article: Article) : Result<Article, Error>
// body (same as create, method <- PUT)
let publish (article: Article) : Result<Article, Error>
// body (same as create, method < POST)
Caller Code
let chain = create >> Result.bind update >> Result.bind upload >> Result.bind publish
match chain(schemaObject) with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Edit
Based on the answer and matching it with Scott's implementation (https://i.stack.imgur.com/bIxpD.png), to help in comparison and in better understanding.
let bind2 (switchFunction : 'a -> Async<Result<'b, 'c>>) =
fun (asyncTwoTrackInput : Async<Result<'a, 'c>>) -> async {
let! twoTrackInput = asyncTwoTrackInput
match twoTrackInput with
| Ok s -> return! switchFunction s
| Error err -> return Error err
}
Edit 2 Based on F# implementation of bind
let bind3 (binder : 'a -> Async<Result<'b, 'c>>) (asyncResult : Async<Result<'a, 'c>>) = async {
let! result = asyncResult
match result with
| Error e -> return Error e
| Ok x -> return! binder x
}
Take a look at the Suave source code, and specifically the WebPart.bind function. In Suave, a WebPart is a function that takes a context (a "context" is the current request and the response so far) and returns a result of type Async<context option>. The semantics of chaining these together are that if the async returns None, the next step is skipped; if it returns Some value, the next step is called with value as the input. This is pretty much the same semantics as the Result type, so you could almost copy the Suave code and adjust it for Result instead of Option. E.g., something like this:
module AsyncResult
let bind (f : 'a -> Async<Result<'b, 'c>>) (a : Async<Result<'a, 'c>>) : Async<Result<'b, 'c>> = async {
let! r = a
match r with
| Ok value ->
let next : Async<Result<'b, 'c>> = f value
return! next
| Error err -> return (Error err)
}
let compose (f : 'a -> Async<Result<'b, 'e>>) (g : 'b -> Async<Result<'c, 'e>>) : 'a -> Async<Result<'c, 'e>> =
fun x -> bind g (f x)
let (>>=) a f = bind f a
let (>=>) f g = compose f g
Now you can write your chain as follows:
let chain = create >=> update >=> upload >=> publish
let result = chain(schemaObject) |> Async.RunSynchronously
match result with
| Ok article -> Debug.WriteLine(article.name)
| Error error -> Debug.WriteLine(error.code + ":" + error.message)
Caution: I haven't been able to verify this code by running it in F# Interactive, since I don't have any examples of your create/update/etc. functions. It should work, in principle — the types all fit together like Lego building blocks, which is how you can tell that F# code is probably correct — but if I've made a typo that the compiler would have caught, I don't yet know about it. Let me know if that works for you.
Update: In a comment, you asked whether you need to have both the >>= and >=> operators defined, and mentioned that you didn't see them used in the chain code. I defined both because they serve different purposes, just like the |> and >> operators serve different purposes. >>= is like |>: it passes a value into a function. While >=> is like >>: it takes two functions and combines them. If you would write the following in a non-AsyncResult context:
let chain = step1 >> step2 >> step3
Then that translates to:
let asyncResultChain = step1AR >=> step2AR >=> step3AR
Where I'm using the "AR" suffix to indicate versions of those functions that return an Async<Result<whatever>> type. On the other hand, if you had written that in a pass-the-data-through-the-pipeline style:
let result = input |> step1 |> step2 |> step3
Then that would translate to:
let asyncResult = input >>= step1AR >>= step2AR >>= step3AR
So that's why you need both the bind and compose functions, and the operators that correspond to them: so that you can have the equivalent of either the |> or the >> operators for your AsyncResult values.
BTW, the operator "names" that I picked (>>= and >=>), I did not pick randomly. These are the standard operators that are used all over the place for the "bind" and "compose" operations on values like Async, or Result, or AsyncResult. So if you're defining your own, stick with the "standard" operator names and other people reading your code won't be confused.
Update 2: Here's how to read those type signatures:
'a -> Async<Result<'b, 'c>>
This is a function that takes type A, and returns an Async wrapped around a Result. The Result has type B as its success case, and type C as its failure case.
Async<Result<'a, 'c>>
This is a value, not a function. It's an Async wrapped around a Result where type A is the success case, and type C is the failure case.
So the bind function takes two parameters:
a function from A to an async of (either B or C)).
a value that's an async of (either A or C)).
And it returns:
a value that's an async of (either B or C).
Looking at those type signatures, you can already start to get an idea of what the bind function will do. It will take that value that's either A or C, and "unwrap" it. If it's C, it will produce an "either B or C" value that's C (and the function won't need to be called). If it's A, then in order to convert it to an "either B or C" value, it will call the f function (which takes an A).
All this happens within an async context, which adds an extra layer of complexity to the types. It might be easier to grasp all this if you look at the basic version of Result.bind, with no async involved:
let bind (f : 'a -> Result<'b, 'c>) (a : Result<'a, 'c>) =
match a with
| Ok val -> f val
| Error err -> Error err
In this snippet, the type of val is 'a, and the type of err is 'c.
Final update: There was one comment from the chat session that I thought was worth preserving in the answer (since people almost never follow chat links). Developer11 asked,
... if I were to ask you what Result.bind in my example code maps to your approach, can we rewrite it as create >> AsyncResult.bind update? It worked though. Just wondering i liked the short form and as you said they have a standard meaning? (in haskell community?)
My reply was:
Yes. If the >=> operator is properly written, then f >=> g will always be equivalent to f >> bind g. In fact, that's precisely the definition of the compose function, though that might not be immediately obvious to you because compose is written as fun x -> bind g (f x) rather than as f >> bind g. But those two ways of writing the compose function would be exactly equivalent. It would probably be very instructive for you to sit down with a piece of paper and draw out the function "shapes" (inputs & outputs) of both ways of writing compose.
Why do you want to use Railway Oriented Programming here? If you just want to run a sequence of operations and return information about the first exception that occurs, then F# already provides a language support for this using exceptions. You do not need Railway Oriented Programming for this. Just define your Error as an exception:
exception Error of code:string * message:string
Modify the code to throw the exception (also note that your create function takes article but does not use it, so I deleted that):
let create () = async {
let ds = new DataContractJsonSerializer(typeof<Error>)
let request = WebRequest.Create("http://example.com") :?> HttpWebRequest
request.Method <- "GET"
try
use response = request.GetResponse() :?> HttpWebResponse
use reader = new StreamReader(response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
return ds.ReadObject(memoryStream) :?> Article
with
| :? WebException as e ->
use reader = new StreamReader(e.Response.GetResponseStream())
use memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(reader.ReadToEnd()))
return raise (Error (ds.ReadObject(memoryStream) :?> Error)) }
And then you can compose functions just by sequencing them in async block using let! and add exception handling:
let main () = async {
try
let! created = create ()
let! updated = update created
let! uploaded = upload updated
Debug.WriteLine(uploaded.name)
with Error(code, message) ->
Debug.WriteLine(code + ":" + message) }
If you wanted more sophisticated exception handling, then Railway Oriented Programming might be useful and there is certainly a way of integrating it with async, but if you just want to do what you described in your question, then you can do that much more easily with just standard F#.

Concatenate a vector of vectors of strings

I'm trying to write a function that receives a vector of vectors of strings and returns all vectors concatenated together, i.e. it returns a vector of strings.
The best I could do so far has been the following:
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let vals : Vec<&String> = vecs.iter().flat_map(|x| x.into_iter()).collect();
vals.into_iter().map(|v: &String| v.to_owned()).collect()
}
However, I'm not happy with this result, because it seems I should be able to get Vec<String> from the first collect call, but somehow I am not able to figure out how to do it.
I am even more interested to figure out why exactly the return type of collect is Vec<&String>. I tried to deduce this from the API documentation and the source code, but despite my best efforts, I couldn't even understand the signatures of functions.
So let me try and trace the types of each expression:
- vecs.iter(): Iter<T=Vec<String>, Item=Vec<String>>
- vecs.iter().flat_map(): FlatMap<I=Iter<Vec<String>>, U=???, F=FnMut(Vec<String>) -> U, Item=U>
- vecs.iter().flat_map().collect(): (B=??? : FromIterator<U>)
- vals was declared as Vec<&String>, therefore
vals == vecs.iter().flat_map().collect(): (B=Vec<&String> : FromIterator<U>). Therefore U=&String.
I'm assuming above that the type inferencer is able to figure out that U=&String based on the type of vals. But if I give the expression the explicit types in the code, this compiles without error:
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let a: Iter<Vec<String>> = vecs.iter();
let b: FlatMap<Iter<Vec<String>>, Iter<String>, _> = a.flat_map(|x| x.into_iter());
let c = b.collect();
print_type_of(&c);
let vals : Vec<&String> = c;
vals.into_iter().map(|v: &String| v.to_owned()).collect()
}
Clearly, U=Iter<String>... Please help me clear up this mess.
EDIT: thanks to bluss' hint, I was able to achieve one collect as follows:
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
vecs.into_iter().flat_map(|x| x.into_iter()).collect()
}
My understanding is that by using into_iter I transfer ownership of vecs to IntoIter and further down the call chain, which allows me to avoid copying the data inside the lambda call and therefore - magically - the type system gives me Vec<String> where it used to always give me Vec<&String> before. While it is certainly very cool to see how the high-level concept is reflected in the workings of the library, I wish I had any idea how this is achieved.
EDIT 2: After a laborious process of guesswork, looking at API docs and using this method to decipher the types, I got them fully annotated (disregarding the lifetimes):
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let a: Iter<Vec<String>> = vecs.iter();
let f : &Fn(&Vec<String>) -> Iter<String> = &|x: &Vec<String>| x.into_iter();
let b: FlatMap<Iter<Vec<String>>, Iter<String>, &Fn(&Vec<String>) -> Iter<String>> = a.flat_map(f);
let vals : Vec<&String> = b.collect();
vals.into_iter().map(|v: &String| v.to_owned()).collect()
}
I'd think about: why do you use iter() on the outer vec but into_iter() on the inner vecs? Using into_iter() is actually crucial, so that we don't have to copy first the inner vectors, then the strings inside, we just receive ownership of them.
We can actually write this just like a summation: concatenate the vectors two by two. Since we always reuse the allocation & contents of the same accumulation vector, this operation is linear time.
To minimize time spent growing and reallocating the vector, calculate the space needed up front.
fn concat_vecs(vecs: Vec<Vec<String>>) -> Vec<String> {
let size = vecs.iter().fold(0, |a, b| a + b.len());
vecs.into_iter().fold(Vec::with_capacity(size), |mut acc, v| {
acc.extend(v); acc
})
}
If you do want to clone all the contents, there's already a method for that, and you'd just use vecs.concat() /* -> Vec<String> */
The approach with .flat_map is fine, but if you don't want to clone the strings again you have to use .into_iter() on all levels: (x is Vec<String>).
vecs.into_iter().flat_map(|x| x.into_iter()).collect()
If instead you want to clone each string you can use this: (Changed .into_iter() to .iter() since x here is a &Vec<String> and both methods actually result in the same thing!)
vecs.iter().flat_map(|x| x.iter().map(Clone::clone)).collect()

Select First Async Result

Is there an async operator to get the value first returned by two asynchronous values (Async<_>)?
For example, given two Async<_> values where one A1 returns after 1 second and A2 returns after 2 seconds, then I want the result from A1.
The reason is I want to implement an interleave function for asynchronous sequences, so that if there are two asynchronous sequences "defined" like this (with space indicating time as with marble diagrams):
S1 = -+-----+------------+----+
S2 = ---+-------+----------+-----+
Then I want to generate a new asynchronous sequence that acts like this:
S3 = -+-+---+---+--------+-+--+--+
Interleave S1 S2 = S3
But two do that, I probably need a kind of async select operator to select select values.
I think this would be like "select" in Go, where you can take the first available value from two channels.
TPL has a function called Task.WhenAny - I probably need something similar here.
I don't think the operator is available in the F# library. To combine this from existing operations, you could use Async.StartAsTask and then use the existing Task.WhenAny operator. However, I'm not exactly sure how that would behave with respect to cancellation.
The other option is to use the Async.Choose operator implemented on F# Snippets web site. This is not particularly elegant, but it should do the trick! To make the answer stand-alone, the code is attached below.
/// Creates an asynchronous workflow that non-deterministically returns the
/// result of one of the two specified workflows (the one that completes
/// first). This is similar to Task.WaitAny.
static member Choose(a, b) : Async<'T> =
Async.FromContinuations(fun (cont, econt, ccont) ->
// Results from the two
let result1 = ref (Choice1Of3())
let result2 = ref (Choice1Of3())
let handled = ref false
let lockObj = new obj()
let synchronized f = lock lockObj f
// Called when one of the workflows completes
let complete () =
let op =
synchronized (fun () ->
// If we already handled result (and called continuation)
// then ignore. Otherwise, if the computation succeeds, then
// run the continuation and mark state as handled.
// Only throw if both workflows failed.
match !handled, !result1, !result2 with
| true, _, _ -> ignore
| false, (Choice2Of3 value), _
| false, _, (Choice2Of3 value) ->
handled := true
(fun () -> cont value)
| false, Choice3Of3 e1, Choice3Of3 e2 ->
handled := true;
(fun () ->
econt (new AggregateException
("Both clauses of a choice failed.", [| e1; e2 |])))
| false, Choice1Of3 _, Choice3Of3 _
| false, Choice3Of3 _, Choice1Of3 _
| false, Choice1Of3 _, Choice1Of3 _ -> ignore )
op()
// Run a workflow and write result (or exception to a ref cell
let run resCell workflow = async {
try
let! res = workflow
synchronized (fun () -> resCell := Choice2Of3 res)
with e ->
synchronized (fun () -> resCell := Choice3Of3 e)
complete() }
// Start both work items in thread pool
Async.Start(run result1 a)
Async.Start(run result2 b) )
Tomas already answered the precise question. However, you might be interested to know that my Hopac library for F# directly supports Concurrent ML -style first-class, higher-order, selective events, called alternatives, which directly provide a choose -combinator and provide a more expressive concurrency abstraction mechanism than Go's select statement.
Regarding your more specific problem of interleaving two asynchronous sequences, I recently started experimenting with ideas on how Rx-style programming could be done with Hopac. One potential approach I came up with is to define a kind of ephemeral event streams. You can find the experimental code here:
Alts.fsi
Alts.fs
As you can see, one of the operations defined for event streams is merge. What you are looking for may be slightly different semantically, but would likely be straightforward to implement using Hopac -style alternatives (or Concurrent ML -style events).

Resources