Why would disposal of resources be delayed when using the "use" binding within an async computation expression? - asynchronous

I've got an agent which I set up to do some database work in the background. The implementation looks something like this:
let myAgent = MailboxProcessor<AgentData>.Start(fun inbox ->
let rec loop =
async {
let! data = inbox.Receive()
use conn = new System.Data.SqlClient.SqlConnection("...")
data |> List.map (fun e -> // Some transforms)
|> List.sortBy (fun (_,_,t,_,_) -> t)
|> List.iter (fun (a,b,c,d,e) ->
try
... // Do the database work
with e -> Log.error "Yikes")
return! loop
}
loop)
With this I discovered that if this was called several times in some amount of time I would start getting SqlConnection objects piling up and not being disposed, and eventually I would run out of connections in the connection pool (I don't have exact metrics on how many "several" is, but running an integration test suite twice in a row could always cause the connection pool to run dry).
If I change the use to a using then things are disposed properly and I don't have a problem:
let myAgent = MailboxProcessor<AgentData>.Start(fun inbox ->
let rec loop =
async {
let! data = inbox.Receive()
using (new System.Data.SqlClient.SqlConnection("...")) <| fun conn ->
data |> List.map (fun e -> // Some transforms)
|> List.sortBy (fun (_,_,t,_,_) -> t)
|> List.iter (fun (a,b,c,d,e) ->
try
... // Do the database work
with e -> Log.error "Yikes")
return! loop
}
loop)
It seems that the Using method of the AsyncBuilder is not properly calling its finally function for some reason, but it's not clear why. Does this have something to do with how I've written my recursive async expression, or is this some obscure bug? And does this suggest that utilizing use within other computation expressions could produce the same sort of behavior?

This is actually the expected behavior - although not entirely obvious!
The use construct disposes of the resource when the execution of the asynchronous workflow leaves the current scope. This is the same as the behavior of use outside of asynchronous workflows. The problem is that recursive call (outside of async) or recursive call using return! (inside async) does not mean that you are leaving the scope. So in this case, the resource is disposed of only after the recursive call returns.
To test this, I'll use a helper that prints when disposed:
let tester () =
{ new System.IDisposable with
member x.Dispose() = printfn "bye" }
The following function terminates the recursion after 10 iterations. This means that it keeps allocating the resources and disposes of all of them only after the entire workflow completes:
let rec loop(n) = async {
if n < 10 then
use t = tester()
do! Async.Sleep(1000)
return! loop(n+1) }
If you run this, it will run for 10 seconds and then print 10 times "bye" - this is because the allocated resources are still in scope during the recursive calls.
In your sample, the using function delimits the scope more explicitly. However, you can do the same using nested asynchronous workflow. The following only has the resource in scope when calling the Sleep method and so it disposes of it before the recursive call:
let rec loop(n) = async {
if n < 10 then
do! async {
use t = tester()
do! Async.Sleep(1000) }
return! loop(n+1) }
Similarly, when you use for loop or other constructs that restrict the scope, the resource is disposed immediately:
let rec loop(n) = async {
for i in 0 .. 10 do
use t = tester()
do! Async.Sleep(1000) }

Related

How can I create an F# async from a C# method with a callback?

Suppose I have some C# code that takes a callback:
void DoSomething(Action<string> callback);
Now, I want to use this in F#, but wrap it in an async. How would I go about this?
// Not real code
let doSomething = async {
let mutable result = null
new Action(fun x -> result <- x) |> Tasks.DoSomething
// Wait for result to be assigned
return result
}
For example, suppose DoSomething looks like this:
module Tasks
let DoSomething callback =
callback "Hello"
()
Then the output of the following should be "Hello":
let wrappedDoSomething = async {
// Call DoSomething somehow
}
[<EntryPoint>]
let main argv =
async {
let! resultOfDoSomething = wrappedDoSomething
Console.WriteLine resultOfDoSomething
return ()
} |> Async.RunSynchronously
0
The function Async.FromContinuations is, so to say, the "lowest level" of Async. All other async combinators can be expressed in terms of it.
It is the lowest level in the sense that it directly encodes the very nature of async computations - the knowledge of what to do in the three possible cases: (1) a successful completion of the previous computation step, (2) a crash of the previous computation step, and (3) cancellation from outside. These possible cases are expressed as the three function-typed arguments of the function that you pass to Async.FromContinuations. For example:
let returnFive =
Async.FromContinuations( fun (succ, err, cancl) ->
succ 5
)
async {
let! res = returnFive
printfn "%A" res // Prints "5"
}
|> Async.RunSynchronously
Here, my function fun (succ, err, cancl) -> succ 5 has decided that it has completed successfully, and calls the succ continuation to pass its computation result to the next step.
In your case, the function DoSomething expresses only one of the three cases - i.e. "what to do on successful completion". Once you're inside the callback, it means that whatever DoSomething was doing, has completed successfully. That's when you need to call the succ continuation:
let doSometingAsync =
Async.FromContinuations( fun (succ, err, cancl) ->
Tasks.DoSomething( fun res -> succ res )
)
Of course, you can avoid a nested lambda-expression fun res -> succ res by passing succ directly into DoSomething as callback. Unfortunately, you'll have to explicitly specify which type of Action to use for wrapping it, which negates the advantage:
let doSometingAsync =
Async.FromContinuations( fun (succ, err, cancl) ->
Tasks.DoSomething( System.Action<string> succ )
)
As an aside, note that this immediately uncovered a hole in the DoSomething's API: it ignores the error case. What happens if DoSomething fails to do whatever it was meant to do? There is no way you'd know about it, and the whole async workflow will just hang. Or, even worse: the process will exit immediately (depending on how the crash happens).
If you have any control over DoSomething, I suggest you address this issue.
You can try something like:
let doSomething callback = async {
Tasks.DoSomething(callback)
}
If your goal is to define the callback in the method you could do something like:
let doSomething () = async {
let callback = new Action<string>(fun result -> printfn "%A" result )
Tasks.DoSomething(callback)
}
If your goal is to have the result of the async method be used in the DoSomething callback you could do something like:
let doSomething =
Async.StartWithContinuations(
async {
return result
},
(fun result -> Tasks.DoSomething(result)),
(fun _ -> printfn "Deal with exception."),
(fun _ -> printfn "Deal with cancellation."))

F# Make Async<Async<MyTpe>[]> to Async<MyType>[]

I get some list of data from a HTTP call. I then know what values to get for another HTTP call. I would like to have everything be asynchronous. But I need to use this data with Expecto's testCaseAsync : string -> Async<unit> -> Test. So, my goal is to get a signature like so Async<Item>[]
So, I would like to get a list of testCaseAsync.
So, I basically have something like this:
// Async<Async<Item>[]>
let getAsyncCalls =
async {
let! table = API.getTable ()
// Async<Item>[]
let items =
table.root
|> Array.map (fun x -> API.getItem x.id)
return item
}
If I run them in parallel I get:
// Async<Item[]>
let getAsyncCalls =
async {
let! table = API.getTable ()
// Item[]
let! items =
table.root
|> Array.map (fun x -> API.getItem x.id)
return item
}
So, that doesn't get me to Async<Item>[]. I'm not sure if this is possible. I would like to avoid Async.RunSynchronously for the API.getTable call since that can lead to deadlocks, right? It will most likely be called from a cached value (memoized) so I'm not sure that will make a difference.
I guess I'll keep working on it unless someone else is more clever than me :-) Thanks in advance!
In general, you cannot turn Async<Async<T>[]> into Async<T>[]. The problem is that to even get the length of the array, you need to perform some operation asynchronously, so there is no way to "lift" the array outside of the async. If you knew the length of the array in advance, then you can make this work.
The following function turns Async<'T[]> into Async<'T>[] provided that you give it the length of the array. As you figured out, the returned asyncs need to somehow share access to the one top-level async. The easiest way of doing this I can think of is to use a task. Adapting that for your use case should be easy:
let unwrapAsyncArray (asyncs:Async<'T[]>) len =
let task = asyncs |> Async.StartAsTask
Array.init len (fun i -> async {
let! res = Async.AwaitTask task
if res.Length <> len then failwith "Wrong length!"
return res.[i] }
)

How can I throttle a large array of async workflows passed to Async.Parallel

I have an array holding a large number of small async database queries; for example:
// I actually have a more complex function that
// accepts name/value pairs for query parameters.
let runSql connString sql = async {
use connection = new SqlConnection(connString)
use command = new SqlCommand(sql, connection)
do! connection.OpenAsync() |> Async.AwaitIAsyncResult |> Async.Ignore
return! command.ExecuteScalarAsync() |> Async.AwaitTask
}
let getName (id:Guid) = async {
// I actually use a parameterized query
let querySql = "SELECT Name FROM Entities WHERE ID = '" + id.ToString() + "'"
return! runSql connectionString querySql
}
let ids : Guid array = getSixtyThousandIds()
let asyncWorkflows = ids |> Array.map getName
//...
Now, the problem: The next expression runs all 60K workflows at once, flooding the server. This leads to many of the SqlCommands timing out; it also typically causes out of memory exceptions in the client (which is F# interactive) for reasons I do not understand and (not needing to understand them) have not investigated:
//...
let names =
asyncWorkflows
|> Async.Parallel
|> Async.RunSynchronously
I've written a rough-and-ready function to batch the requests:
let batch batchSize asyncs = async {
let batches = asyncs
|> Seq.mapi (fun i a -> i, a)
|> Seq.groupBy (fst >> fun n -> n / batchSize)
|> Seq.map (snd >> Seq.map snd)
|> Seq.map Async.Parallel
let results = ref []
for batch in batches do
let! result = batch
results := (result :: !results)
return (!results |> List.rev |> Seq.collect id |> Array.ofSeq)
}
To use this function, I replace Async.Parallel with batch 20 (or another integer value):
let names =
asyncWorkflows
|> batch 20
|> Async.RunSynchronously
This works reasonably well, but I would prefer to have a system that starts each new async as soon as one completes, so rather than successive batches of size N starting after each previous batch of size N has finished, I am always awaiting N active SqlCommands (until I get to the end, of course).
Questions:
Am I reinventing the wheel? In other words, are there library functions that do this already? (Would it be profitable to look into exploiting ParallelEnumerable.WithDegreeOfParallelism somehow?)
If not, how should I implement a continuous queue instead of a series of discrete batches?
I am not primarily seeking suggestions to improve the existing code, but such suggestions will nonetheless be received with interest and gratitude.
FSharpx.Control offers an Async.ParallelWithThrottle function. I'm not sure if it is the best implementation as it uses SemaphoreSlim. But the ease of use is great and since my application doesn't need top performance it works well enough for me. Although since it is a library if someone knows how to make it better it is always a nice thing to make libraries top performers out of the box so the rest of us can just use the code that works and just get our work done!
Async.Parallel had support for throttling added in FSharp v 4.7. You do:
let! results = Async.Parallel(workflows, maxDegreeOfParallelism = dop)
if doing more than 1200 workflows concurrently in FSharp.Core versions <= 6.0.5, see this resolved issue
Proposal for a more explicit API

sequential execution chaining of async operations in F#

Is there any primitive in the langage to compose async1 then async2, akin to what parallel does for parallel execution planning ?
to further clarify, I have 2 async computations
let toto1 = Async.Sleep(1000)
let toto2 = Async.Sleep(1000)
I would like to create a new async computation made of the sequential composition of toto1 and toto2
let toto = Async.Sequential [|toto1; toto2|]
upon start, toto would run toto1 then toto2, and would end after 2000 time units
The async.Bind operation is the basic primitive that asynchronous workflows provide for sequential composition - in the async block syntax, that corresponds to let!. You can use that to express sequential composition of two computations (as demonstrated by Daniel).
However, if you have an operation <|> that Daniel defined, than that's not expressive enough to implement async.Bind, because when you compose things sequentially using async.Bind, the second computation may depend on the result of the first one. The <e2> may use v1:
async.Bind(<e1>, fun v1 -> <e2>)
If you were writing <e1> <|> <e2> then the two operations have to be independent. This is a reason why the libraries are based on Bind - because it is more expressive form of sequential composition than the one you would get if you were following the structure of Async.Parallel.
If you want something that behaves like Async.Parallel and takes an array, then the easiest option is to implement that imperatively using let! in a loop (but you could use recursion and lists too):
let Sequential (ops:Async<'T>[]) = async {
let res = Array.zeroCreate ops.Length
for i in 0 .. ops.Length - 1 do
let! value = ops.[i]
res.[i] <- value
return res }
I'm not sure what you mean by "primitive." Async.Parallel is a function. Here are a few ways to run two asyncs:
In parallel:
Async.Parallel([|async1; async2|])
or
async {
let! child = Async.StartChild async2
let! result1 = child
let! result2 = async1
return [|result1; result2|]
}
Sequentially:
async {
let! result1 = async1
let! result2 = async2
return [|result1; result2|]
}
You could return tuples in the last two. I kept the return types the same as the first one.
I would say let! and do! in an async { } block is as close as you'll get to using a primitive for this.
EDIT
If all this nasty syntax is getting to you, you could define a combinator:
let (<|>) async1 async2 =
async {
let! r1 = async1
let! r2 = async2
return r1, r2
}
and then do:
async1 <|> async2 |> Async.RunSynchronously

Monadic Retry logic w/ F# and async?

I've found this snippet:
http://fssnip.net/8o
But I'm working not only with retriable functions, but also with asynchronous such, and I was wondering how I make this type properly. I have a tiny piece of retryAsync monad that I'd like to use as a replacement for async computations, but that contains retry logic, and I'm wondering how I combine them?
type AsyncRetryBuilder(retries) =
member x.Return a = a // Enable 'return'
member x.ReturnFrom a = x.Run a
member x.Delay f = f // Gets wrapped body and returns it (as it is)
// so that the body is passed to 'Run'
member x.Bind expr f = async {
let! tmp = expr
return tmp
}
member x.Zero = failwith "Zero"
member x.Run (f : unit -> Async<_>) : _ =
let rec loop = function
| 0, Some(ex) -> raise ex
| n, _ ->
try
async { let! v = f()
return v }
with ex -> loop (n-1, Some(ex))
loop(retries, None)
let asyncRetry = AsyncRetryBuilder(4)
Consuming code is like this:
module Queue =
let desc (nm : NamespaceManager) name = asyncRetry {
let! exists = Async.FromBeginEnd(name, nm.BeginQueueExists, nm.EndQueueExists)
let beginCreate = nm.BeginCreateQueue : string * AsyncCallback * obj -> IAsyncResult
return! if exists then Async.FromBeginEnd(name, nm.BeginGetQueue, nm.EndGetQueue)
else Async.FromBeginEnd(name, beginCreate, nm.EndCreateQueue)
}
let recv (client : MessageReceiver) timeout =
let bRecv = client.BeginReceive : TimeSpan * AsyncCallback * obj -> IAsyncResult
asyncRetry {
let! res = Async.FromBeginEnd(timeout, bRecv, client.EndReceive)
return res }
Error is:
This expression was expected to have type Async<'a> but here has type 'b -> Async<'c>
Your Bind operation behaves like a normal Bind operation of async, so your code is mostly a re-implementation (or wrapper) over async. However, your Return does not have the right type (it should be 'T -> Async<'T>) and your Delay is also different than normal Delay of async. In general, you should start with Bind and Return - using Run is a bit tricky, because Run is used to wrap the entire foo { .. } block and so it does not give you the usual nice composability.
The F# specification and a free chapter 12 from Real-World Functional Programming both show the usual types that you should follow when implementing these operations, so I won't repeat that here.
The main issue with your approach is that you're trying to retry the computation only in Run, but the retry builder that you're referring to attempts to retry each individual operation called using let!. Your approach may be sufficient, but if that's the case, you can just implement a function that tries to run normal Async<'T> and retries:
let RetryRun count (work:Async<'T>) = async {
try
// Try to run the work
return! work
with e ->
// Retry if the count is larger than 0, otherwise fail
if count > 0 then return! RetryRun (count - 1) work
else return raise e }
If you actually want to implement a computation builder that will implicitly try to retry every single asynchronous operation, then you can write something like the following (it is just a sketch, but it should point you in the right direction):
// We're working with normal Async<'T> and
// attempt to retry it until it succeeds, so
// the computation has type Async<'T>
type RetryAsyncBuilder() =
member x.ReturnFrom(comp) = comp // Just return the computation
member x.Return(v) = async { return v } // Return value inside async
member x.Delay(f) = async { return! f() } // Wrap function inside async
member x.Bind(work, f) =
async {
try
// Try to call the input workflow
let! v = work
// If it succeeds, try to do the rest of the work
return! f v
with e ->
// In case of exception, call Bind to try again
return! x.Bind(work, f) }

Resources