Terminate an Lwt thread from another thread - tcp

Here I have a function which accept a TCP connection and run two Lwt threads handle_connection and send_message. Each time that a connection is terminated I am notified in the handle_connection thread, so I can terminate it loops, but then I want to terminate the whole join <&> thread to go further with another connection in the next serv recursive call.
let create_server sock =
let rec serve () =
Lwt_unix.accept sock
>>= (fun (fd, _) ->
connection := true;
let ic = Lwt_io.of_fd ~mode:Lwt_io.Input fd in
let oc = Lwt_io.of_fd ~mode:Lwt_io.Output fd in
handle_connection ic oc <&> send_message oc)
>>= serve
in
serve ()
The question is, how can I force the send_message thread to terminate each time that handle_connection terminates ?
let handle_connection ic oc =
Lwt.on_failure (handle_message ic oc "client") (fun e ->
Logs.err (fun m -> m "%s" (Printexc.to_string e)));
Logs_lwt.info (fun m -> m "New connection")
let rec send_message oc =
let* s = read_console () in
Lwt_io.write_line oc s >>= fun _ -> send_message oc)
I have already tried to use Lwt.choose instead of Lwt.join, it passes to the next connection when client is disconnected but the send_message thread is still running on the terminated connection.

I hesitate to comment on this as I suspect that you already know this, but as a matter of general principle the most fundamental way in which you conditionally wait on a promise in Lwt is to construct the promise using Lwt.wait, and then bind on the promise using operator let* or operator >>= until the promise's resolver either fulfils the promise by means of Lwt.wakeup_later or (in your special case) rejects it using Lwt.wakeup_later_exn. Alternatively in the latter case you could construct the promise using Lwt.task and reject the promise directly by cancellation with Lwt.cancel, but I believe cancellation is now deprecated or at least discouraged.
There is a Lwt.pick function which, on a promise being fulfilled, will cancel any others which are bound by the pick, but that is the inverse of what you want. What this means is that I think you are going to have to restructure your code to expose the conditional promise.

Related

How to tell if an async computation is sent to the thread pool?

I was recently informed that in
async {
return! async { return "hi" } }
|> Async.RunSynchronously
|> printfn "%s"
the nested Async<'T> (async { return 1 }) would not be sent to the thread pool for evaluation, whereas in
async {
use ms = new MemoryStream [| 0x68uy; 0x69uy |]
use sr = new StreamReader (ms)
return! sr.ReadToEndAsync () |> Async.AwaitTask }
|> Async.RunSynchronously
|> printfn "%s"
the nested Async<'T> (sr.ReadToEndAsync () |> Async.AwaitTask) would be. What is it about an Async<'T> that decides whether it's sent to the thread pool when it's executed in an asynchronous operation like let! or return!? In particular, how would you define one which is sent to the thread pool? What code do you have to include in the async block, or in the lambda passed into Async.FromContinuations?
TL;DR: It's not quite like that. The async itself doesn't "send" anything to the thread pool. All it does is just run continuations until they stop. And if one of those continuations decides to continue on a new thread - well, that's when thread switching happens.
Let's set up a small example to illustrate what happens:
let log str = printfn $"{str}: thread = {Thread.CurrentThread.ManagedThreadId}"
let f = async {
log "1"
let! x = async { log "2"; return 42 }
log "3"
do! Async.Sleep(TimeSpan.FromSeconds(3.0))
log "4"
}
log "starting"
f |> Async.StartImmediate
log "started"
Console.ReadLine()
If you run this script, it will print, starting, then 1, 2, 3, then started, then wait 3 seconds, and then print 4, and all of them except 4 will have the same thread ID. You can see that everything until Async.Sleep is executed synchronously on the same thread, but after that async execution stops and the main program execution continues, printing started and then blocking on ReadLine. By the time Async.Sleep wakes up and wants to continue execution, the original thread is already blocked on ReadLine, so the async computation gets to continue running on a new one.
What's going on here? How does this function?
First, the way the async computation is structured is in "continuation-passing style". It's a technique where every function doesn't return its result to the caller, but calls another function instead, passing the result as its parameter.
Let me illustrate with an example:
// "Normal" style:
let f x = x + 5
let g x = x * 2
printfn "%d" (f (g 3)) // prints 11
// Continuation-passing style:
let f x next = next (x + 5)
let g x next = next (x * 2)
g 3 (fun res1 -> f res1 (fun res2 -> printfn "%d" res2))
This is called "continuation-passing" because the next parameters are called "continuations" - i.e. they're functions that express how the program continues after calling f or g. And yes, this is exactly what Async.FromContinuations means.
Seeming very silly and roundabout on the surface, what this allows us to do is for each function to decide when, how, or even if its continuation happens. For example, our f function from above could be doing something asynchronous instead of just plain returning the result:
let f x next = httpPost "http://calculator.com/add5" x next
Coding it in continuation-passing style would allow such function to not block the current thread while the request to calculator.com is in flight. What's wrong with blocking the thread, you ask? I'll refer you to the original answer that prompted your question in the first place.
Second, when you write those async { ... } blocks, the compiler gives you a little help. It takes what looks like a step-by-step imperative program and "unrolls" it into a series of continuation-passing calls. The "breaking" points for this unfolding are all the constructs that end with a bang - let!, do!, return!.
The above async block, for example, would look somethiing like this (F#-ish pseudocode):
let return42 onDone =
log "2"
onDone 42
let f onDone =
log "1"
return42 (fun x ->
log "3"
Async.Sleep (3 seconds) (fun () ->
log "4"
onDone ()
)
)
Here, you can plainly see that the return42 function simply calls its continuation right away, thus making the whole thing from log "1" to log "3" completely synchronous, whereas the Async.Sleep function doesn't call its continuation right away, instead scheduling it to be run later (in 3 seconds) on the thread pool. That's where the thread switching happens.
And here, finally, lies the answer to your question: in order to have the async computation jump threads, your callback passed to Async.FromContinuations should do anything but call the success continuation immediately.
A few notes for further investigation
The onDone technique in the above example is technically called "monadic bind", and indeed in real F# programs it's represented by the async.Bind method. This answer might also be of help understanding the concept.
The above is a bit of an oversimplification. In reality the async execution is a bit more complicated than that. Internally it uses a technique called "trampoline", which in plain terms is just a loop that runs a single thunk on every turn, but crucially, the running thunk can also "ask" it to run another thunk, and if it does, the loop will do so, and so on, forever, until the next thunk doesn't ask to run another thunk, and then the whole thing finally stops.
I specifically used Async.StartImmediate to start the computation in my example, because Async.StartImmediate will do just what it says on the tin: it will start running the computation immediately, right there. That's why everything ran on the same thread as the main program. There are many alternative starting functions in the Async module. For example, Async.Start will start the computation on the thread pool. The lines from log "1" to log "3" will still all happen synchronously, without thread switching between them, but it will happen on a different thread from log "start" and log "starting". In this case thread switching will happen before the async computation even starts, so it doesn't count.

How to interrupt await'ing in a loop

Consider this function receiving datagrams from UDP socket:
async fn recv_multiple(socket: &async_std::net::UdpSocket) -> Vec<String> {
let mut buf = [0; 1024];
let mut v = vec![];
while let Ok((amt, _src)) = socket.recv_from(&mut buf).await {
v.push(String::from_utf8_lossy(&buf[..amt]).to_string());
}
v
}
And then it is executed for example like this:
let fut = recv_multiple(&socket);
async_std::task::block_on(fut);
How to add timeout functionality so that after exactly 1 minute a Vec<String> (empty or not) is returned from recv_multiple? I need to collect incoming datagrams for a minute and then return what was captured during that time. Please note that I don't need to timeout a single recv_from operation because datagrams can appear very often and the function will never timeout.
I've found a couple of partial solutions but they don't fit well:
async_std::future::timeout times out a Future but partially filled Vec is discarded and only Err is returned
async_std::future::Future::timeout same story, partial output is discarded
As I see it I should pass some kind of a "timeout" Future inside this function and in while let I have to await on both socket and "timeout" Future. Then if socket fires we proceed but if "timeout" fires then we return from the function. But I'm not sure how to accomplish this.
Instead of timeouting on recv_multiple you can timeout on socket.recv_from inside recv_multiple. It'll require to recalculate remaining time after each iteration. And when remaining time is 0 then you can return your Vec<_>.

F# events not working inside Async workflow

I want to do a Post-Fire-Reply to an agent. Basically the agent triggers an event then replies to the caller. However I either keep getting a timeout error or the events do not fire correctly. I tried doing Post-Fire, that stopped the timeout errors but the events do not fire.
let evt = new Event<int>()
let stream = evt.Publish
type Agent<'T> = MailboxProcessor<'T>
type Fire = Fire of int
let agent = Agent.Start(fun inbox ->
let rec loop() = async {
let! msg = inbox.Receive()
let (Fire i) = msg
evt.Trigger i }
loop())
let on i fn =
stream
|> Observable.filter (fun x -> x = i)
|> Observable.filter (fun x -> x <> 1)
|> Observable.subscribe (fun x -> fn x)
let rec collatz n =
printfn "%d" n
on n (fun i ->
if (i % 2 = 0) then collatz (i/2)
else collatz (3*n + 1)) |> ignore
agent.Post (Fire n) // this does not work
// evt.Trigger n // this does works
collatz 13
This is a simple experiment that repeatedly creates a function to find the next number in the Collatz series and then calls itself to return the value until it reaches 1.
What seems to happen is that the trigger only fires once. I tried experimenting with every combination of Async.RunSynchronously / Async.Start / StartChild / SynchronizationContext that I could think of but no progress. I found a blog similar to what I am doing but that didn't help me neither
EDIT
Thank you Fyodor Soikin for pointing out my oversight. The original problem still remains in that I wish to both fire events and reply with a result, but get a timeout.
let evt = new Event<int>()
let stream = evt.Publish
type Agent<'T> = MailboxProcessor<'T>
type Command =
| Fire of int
| Get of int * AsyncReplyChannel<int>
let agent = Agent.Start(fun inbox ->
let rec loop() = async {
let! msg = inbox.Receive()
match msg with
| Fire i -> evt.Trigger i
| Get (i,ch) ->
evt.Trigger i
ch.Reply(i)
return! loop() }
loop())
let on i fn =
stream
|> Observable.filter (fun x -> x = i)
|> Observable.filter (fun x -> x <> 1)
|> Observable.subscribe (fun x -> fn x)
let rec collatz n =
printfn "%d" n
on n (fun i ->
if (i % 2 = 0) then collatz (i/2)
else collatz (3*n + 1)) |> ignore
agent.PostAndReply (fun ch -> (Get (n, ch))) |> ignore // timeout
agent.PostAndAsyncReply (fun ch -> (Get (n, ch))) |> Async.Ignore |> Async.Start // works but I need the result
agent.PostAndAsyncReply (fun ch -> (Get (n, ch))) |> Async.RunSynchronously |> ignore // timeout
collatz 13
Your loop function doesn't loop. It receives the first message, triggers the event, and then just... exits. Never attempts to receive a second message.
You need to make that function work continuously: process the first message, then go right back to receive the next one, then go receive the next one, and so on. Like this:
let agent = Agent.Start(fun inbox ->
let rec loop() = async {
let! msg = inbox.Receive()
let (Fire i) = msg
evt.Trigger i
return! loop() }
loop())
Edit
Since you've reached your limit on questions, I will answer your edit here.
The reason you're getting timeouts in your second snippet is that you have a deadlock in your code. Let's trace the execution to see that.
THREAD 1: The agent is started.
THREAD 2: The first collatz call.
THREAD 2: The first collatz call posts a message to the agent.
THREAD 1: The agent receives the message.
THREAD 1: The agent triggers the event.
THREAD 1: As a result of the event, the second collatz call happens.
THREAD 1: The second collatz call posts a message to the agent.
THREAD 1: The second collatz call starts waiting for the agent to respond.
And this is where the execution ends. The agent cannot respond at this point (in fact, it cannot even receive the next message!), because its instruction pointer is still inside evt.Trigger. The evt.Trigger call hasn't yet returned, so the loop function hasn't yet recursed, so the inbox.Receive function hasn't yet been called, so the second message is still waiting in the agent's queue.
So you get yourself a classic deadlock: collatz is waiting for the agent to receive its message, but the agent is waiting for collatz to finish handling the event.
The simplest, dumbest solution to this would be to just trigger the event asynchronously:
async { evt.Trigger i } |> Async.Start
This will make sure that the event handler is executed not "right there", but asynchronously, possibly on a different thread. This will in turn allow the agent not to wait for the event to be processed before it can continue its own execution loop.
In general though, when dealing with multithreading and asynchrony, one should never call unknown code directly. The agent should never directly call evt.Trigger, or anything else that it doesn't control, because that code might be waiting on the agent itself (which is what happened in your case), thus introducing the deadlock.

Select First Async Result

Is there an async operator to get the value first returned by two asynchronous values (Async<_>)?
For example, given two Async<_> values where one A1 returns after 1 second and A2 returns after 2 seconds, then I want the result from A1.
The reason is I want to implement an interleave function for asynchronous sequences, so that if there are two asynchronous sequences "defined" like this (with space indicating time as with marble diagrams):
S1 = -+-----+------------+----+
S2 = ---+-------+----------+-----+
Then I want to generate a new asynchronous sequence that acts like this:
S3 = -+-+---+---+--------+-+--+--+
Interleave S1 S2 = S3
But two do that, I probably need a kind of async select operator to select select values.
I think this would be like "select" in Go, where you can take the first available value from two channels.
TPL has a function called Task.WhenAny - I probably need something similar here.
I don't think the operator is available in the F# library. To combine this from existing operations, you could use Async.StartAsTask and then use the existing Task.WhenAny operator. However, I'm not exactly sure how that would behave with respect to cancellation.
The other option is to use the Async.Choose operator implemented on F# Snippets web site. This is not particularly elegant, but it should do the trick! To make the answer stand-alone, the code is attached below.
/// Creates an asynchronous workflow that non-deterministically returns the
/// result of one of the two specified workflows (the one that completes
/// first). This is similar to Task.WaitAny.
static member Choose(a, b) : Async<'T> =
Async.FromContinuations(fun (cont, econt, ccont) ->
// Results from the two
let result1 = ref (Choice1Of3())
let result2 = ref (Choice1Of3())
let handled = ref false
let lockObj = new obj()
let synchronized f = lock lockObj f
// Called when one of the workflows completes
let complete () =
let op =
synchronized (fun () ->
// If we already handled result (and called continuation)
// then ignore. Otherwise, if the computation succeeds, then
// run the continuation and mark state as handled.
// Only throw if both workflows failed.
match !handled, !result1, !result2 with
| true, _, _ -> ignore
| false, (Choice2Of3 value), _
| false, _, (Choice2Of3 value) ->
handled := true
(fun () -> cont value)
| false, Choice3Of3 e1, Choice3Of3 e2 ->
handled := true;
(fun () ->
econt (new AggregateException
("Both clauses of a choice failed.", [| e1; e2 |])))
| false, Choice1Of3 _, Choice3Of3 _
| false, Choice3Of3 _, Choice1Of3 _
| false, Choice1Of3 _, Choice1Of3 _ -> ignore )
op()
// Run a workflow and write result (or exception to a ref cell
let run resCell workflow = async {
try
let! res = workflow
synchronized (fun () -> resCell := Choice2Of3 res)
with e ->
synchronized (fun () -> resCell := Choice3Of3 e)
complete() }
// Start both work items in thread pool
Async.Start(run result1 a)
Async.Start(run result2 b) )
Tomas already answered the precise question. However, you might be interested to know that my Hopac library for F# directly supports Concurrent ML -style first-class, higher-order, selective events, called alternatives, which directly provide a choose -combinator and provide a more expressive concurrency abstraction mechanism than Go's select statement.
Regarding your more specific problem of interleaving two asynchronous sequences, I recently started experimenting with ideas on how Rx-style programming could be done with Hopac. One potential approach I came up with is to define a kind of ephemeral event streams. You can find the experimental code here:
Alts.fsi
Alts.fs
As you can see, one of the operations defined for event streams is merge. What you are looking for may be slightly different semantically, but would likely be straightforward to implement using Hopac -style alternatives (or Concurrent ML -style events).

Why would disposal of resources be delayed when using the "use" binding within an async computation expression?

I've got an agent which I set up to do some database work in the background. The implementation looks something like this:
let myAgent = MailboxProcessor<AgentData>.Start(fun inbox ->
let rec loop =
async {
let! data = inbox.Receive()
use conn = new System.Data.SqlClient.SqlConnection("...")
data |> List.map (fun e -> // Some transforms)
|> List.sortBy (fun (_,_,t,_,_) -> t)
|> List.iter (fun (a,b,c,d,e) ->
try
... // Do the database work
with e -> Log.error "Yikes")
return! loop
}
loop)
With this I discovered that if this was called several times in some amount of time I would start getting SqlConnection objects piling up and not being disposed, and eventually I would run out of connections in the connection pool (I don't have exact metrics on how many "several" is, but running an integration test suite twice in a row could always cause the connection pool to run dry).
If I change the use to a using then things are disposed properly and I don't have a problem:
let myAgent = MailboxProcessor<AgentData>.Start(fun inbox ->
let rec loop =
async {
let! data = inbox.Receive()
using (new System.Data.SqlClient.SqlConnection("...")) <| fun conn ->
data |> List.map (fun e -> // Some transforms)
|> List.sortBy (fun (_,_,t,_,_) -> t)
|> List.iter (fun (a,b,c,d,e) ->
try
... // Do the database work
with e -> Log.error "Yikes")
return! loop
}
loop)
It seems that the Using method of the AsyncBuilder is not properly calling its finally function for some reason, but it's not clear why. Does this have something to do with how I've written my recursive async expression, or is this some obscure bug? And does this suggest that utilizing use within other computation expressions could produce the same sort of behavior?
This is actually the expected behavior - although not entirely obvious!
The use construct disposes of the resource when the execution of the asynchronous workflow leaves the current scope. This is the same as the behavior of use outside of asynchronous workflows. The problem is that recursive call (outside of async) or recursive call using return! (inside async) does not mean that you are leaving the scope. So in this case, the resource is disposed of only after the recursive call returns.
To test this, I'll use a helper that prints when disposed:
let tester () =
{ new System.IDisposable with
member x.Dispose() = printfn "bye" }
The following function terminates the recursion after 10 iterations. This means that it keeps allocating the resources and disposes of all of them only after the entire workflow completes:
let rec loop(n) = async {
if n < 10 then
use t = tester()
do! Async.Sleep(1000)
return! loop(n+1) }
If you run this, it will run for 10 seconds and then print 10 times "bye" - this is because the allocated resources are still in scope during the recursive calls.
In your sample, the using function delimits the scope more explicitly. However, you can do the same using nested asynchronous workflow. The following only has the resource in scope when calling the Sleep method and so it disposes of it before the recursive call:
let rec loop(n) = async {
if n < 10 then
do! async {
use t = tester()
do! Async.Sleep(1000) }
return! loop(n+1) }
Similarly, when you use for loop or other constructs that restrict the scope, the resource is disposed immediately:
let rec loop(n) = async {
for i in 0 .. 10 do
use t = tester()
do! Async.Sleep(1000) }

Resources