Given an asynchronous IMFSourceReader connected to a synchronous-only IMFTransform:
For the IMFSourceReaderCallback::OnReadSample() callback, is it a good idea not to call IMFTransform::ProcessInput directly within OnReadSample, but instead to push the produced sample onto a queue so that another thread calls the transform's ProcessInput?
Or would I just be replicating work that source readers typically do internally? Put another way, does work done within OnReadSample risk blocking further decoding work in the source reader that could otherwise have happened asynchronously?
So I am suggesting something like:
WorkQueue transformInputs;
...
// Called back async
HRESULT OnReadSampleCallback(... IMFSample* sample)
{
    // Push sample and return immediately
    Push(transformInputs, sample);
    return S_OK;
}
// Different worker thread awoken for transformInputs queue samples
void OnTransformInputWork()
{
    // Transform object is not async capable
    transform->ProcessInput(0, Pop(transformInputs), 0);
    ...
}
This is touched on, but not elaborated on, under 'Implementing the Callback Interface' here:
https://learn.microsoft.com/en-us/windows/win32/medfound/using-the-source-reader-in-asynchronous-mode
Or is it completely dependent on whatever the source reader sets up internally and not easily determined?
It is not a good idea to perform a long blocking operation in IMFSourceReaderCallback::OnReadSample. Nothing fatal or serious will happen, but this is not the intended usage.
Given your previous question about audio format conversion, though, audio sample conversion is fast enough to happen in such a callback.
Also, although it is not documented and depends on the actual implementation, ProcessInput is often instant and only takes a reference to the input data; ProcessOutput would be the computationally expensive call in this case. If you don't call ProcessOutput right there in the same callback, you might run into a situation where the MFT no longer accepts input, so you'd have to implement a queue anyway.
With all this in mind, you would just do the processing in the callback and accept the performance impact as long as your processing is not too heavy; otherwise you would switch to a queue.
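If you do end up with a queue, the handoff itself is a plain producer/consumer arrangement. Below is a minimal, language-agnostic sketch of it, written in Java rather than against Media Foundation. Every name in it (CallbackHandoff, onSample, finish, the String "samples") is hypothetical, and the transform is just a function passed in, so treat it as an illustration of the shape, not as MF code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.UnaryOperator;

// Hypothetical sketch: strings stand in for samples, a function for the transform.
final class CallbackHandoff {
    // An empty Optional is the signal for the worker to stop.
    private final BlockingQueue<Optional<String>> inputs = new LinkedBlockingQueue<>();
    private final List<String> outputs = new ArrayList<>();
    private final Thread worker;

    CallbackHandoff(UnaryOperator<String> transform) {
        worker = new Thread(() -> {
            try {
                // The synchronous transform only ever runs on this worker thread.
                for (Optional<String> s = inputs.take(); s.isPresent(); s = inputs.take()) {
                    outputs.add(transform.apply(s.get()));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
    }

    // Plays the role of the read callback: enqueue and return immediately.
    void onSample(String sample) {
        inputs.add(Optional.of(sample));
    }

    // Lets the worker drain what is queued, stops it, and returns the results.
    List<String> finish() {
        inputs.add(Optional.empty());
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }
        return outputs;
    }
}
```

The queue here is unbounded for brevity; a real pipeline would probably want a bounded one so that a slow transform applies back-pressure instead of accumulating samples.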
I have a function that looks like this:
fun <R> map(block: (T) -> R): Result<R> { ... }
and I'd like to make a suspending version:
suspend fun <R> mapAsync(block: suspend (T) -> R): Result<R> { ... }
The logic in both bodies is identical, but one suspends and one doesn't.
I don't want to have this duplicated logic. The only way I found for this to work is to have the map function call to the mapAsync function and then wrap the result in runBlocking:
fun <R> map(block: (T) -> R): Result<R> =
    runBlocking { mapAsync { block(it) } }
So I have two questions:
Are there any performance considerations in taking a "normal" function, passing it as a suspend parameter, and then blocking until the result is ready?
Based on what I've read, it sounds like the initial thread keeps "doing the work" inside the suspend block until it hits the first suspend point. Then, the continuation is put into the wait queue and the initial thread is free to perform other work.
However, in this case, there isn't any "real" suspend point because the actual function is just (T) -> R, though I don't know if the compiler can tell that.
I'm worried that this setup is actually utilizing another thread from the pool that is just notifying my first thread to wake up...
Is there a better way to have a suspend and non-suspend set of functions utilize the same code?
You have encountered the infamous "colored function" problem. The two worlds are indeed separate and, while you can add a superficial layer that unifies them, you can't get it at zero performance cost. This is so fundamental that, even assuming that your suspend block never actually suspends, and the wrapping layer leverages that assumption and doesn't even use runBlocking on it, you will still pay the price of "being ready to suspend". The price isn't huge, though: it means creating a small object per each suspend fun call that holds the data that would normally reside on the thread's native call stack. In your case only the outer block is suspendable, so that's just one such object.
runBlocking runs the coroutine on the thread where you called it and it will finish synchronously on the same thread unless it suspends itself. Therefore your case where you'd have some synchronous code in a suspend block wouldn't suffer an additional performance hit from thread coordination.
If the coroutine does suspend itself, then there will have to be some external worker thread which will react to the event that allows the coroutine to resume, and there will have to be some coordination between that thread and your original runBlocking thread. This is a fundamental mechanism that's there with or without coroutines.
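For comparison, here is a rough analogue of the same bridging idea on the JVM without coroutines. It is Java with CompletableFuture, not Kotlin, so it says nothing about how runBlocking is implemented; it only illustrates that a synchronous facade over a single asynchronous implementation need not involve a second thread when nothing actually waits. Box, map and mapAsync are hypothetical names standing in for the question's Result type and functions.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical container standing in for the question's Result type.
final class Box<T> {
    private final T value;

    Box(T value) {
        this.value = value;
    }

    T get() {
        return value;
    }

    // Single implementation of the mapping logic, written in asynchronous style.
    <R> CompletableFuture<Box<R>> mapAsync(Function<T, CompletableFuture<R>> block) {
        return block.apply(value).thenApply(r -> new Box<>(r));
    }

    // Synchronous facade: the plain function is wrapped in an already-completed
    // future, so the chain runs on the calling thread and join() does not wait.
    <R> Box<R> map(Function<T, R> block) {
        return this.<R>mapAsync(v -> CompletableFuture.completedFuture(block.apply(v))).join();
    }
}
```

The cost that remains is the allocation of the future objects, which loosely mirrors the "being ready to suspend" price described above.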
Your approach is correct: runBlocking was specifically designed to bridge blocking and non-blocking code. From the documentation:
Runs new coroutine and blocks current thread interruptibly until its
completion. This function should not be used from coroutine. It is
designed to bridge regular blocking code to libraries that are written
in suspending style, to be used in main functions and in tests.
https://kotlin.github.io/kotlinx.coroutines/kotlinx-coroutines-core/kotlinx.coroutines/run-blocking.html
Further reading:
https://github.com/Kotlin/kotlinx.coroutines/blob/master/docs/basics.md#bridging-blocking-and-non-blocking-worlds
And some interesting videos by Roman Elizarov:
https://youtu.be/_hfBv0a09Jc
https://youtu.be/a3agLJQ6vt8
Polly has several retry policies, for example WaitAndRetryForever. I looked in the documentation but couldn't find what exactly is used to make the thread wait until the next retry. I guess Polly uses System.Timers for this, or is it something completely different? Thanks for any help.
Asynchronous executions (fooAsyncPolicy.ExecuteAsync(...)) wait with Task.Delay(...), freeing the thread the caller was using while the delay occurs.
Synchronous executions (fooSyncPolicy.Execute(...)) wait between retries in a cancellable thread-blocking manner. This means that, for the synchronous (a):
action();
compared to the synchronous (b):
policy.Execute(action);
the following three things all hold:
(1) both (a) and (b) block progress from continuing (subsequent code does not run) until the statement has completed;
(2) (b) executes action on the same thread that (a) originally would have;
(3) (b) expresses exceptions (if Policy operation does not intervene) in the same/similar-as-possible way that (a) originally would have.
These semantics (1) (2) (3) are intentional, to keep synchronously executing code with Polly as similar in semantics/behaviour (surrounding code needs little adjustment) as executing code without Polly.
Anticipating a follow-up question: wouldn't it be possible to write synchronous Polly, Policy.Handle<T>().WaitAndRetry(...).Execute(action), so that it didn't block a thread while waiting before retrying? Yes, but no solution has been found that is preferable to letting the caller control transitions to TPL Tasks or async/await and then using Polly's ExecuteAsync(...).
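To make the difference between the two waiting styles concrete, here is a small sketch of both. It is Java, not C#, and it is not Polly's code: retryBlocking and retryAsync are hypothetical helpers written only to show a thread-blocking wait next to a scheduled, non-blocking one. It assumes Java 9 or later for CompletableFuture.delayedExecutor and failedFuture.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

final class RetrySketch {
    // Synchronous style: the calling thread itself sleeps between attempts.
    static <T> T retryBlocking(Supplier<T> action, int retries, long delayMillis) {
        for (int attempt = 0; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                if (attempt >= retries) {
                    throw e; // retries exhausted: surface the original exception
                }
                try {
                    Thread.sleep(delayMillis); // blocks this thread, but is interruptible
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry", interrupted);
                }
            }
        }
    }

    // Asynchronous style: the next attempt is scheduled after the delay and the
    // caller's thread is free in the meantime.
    static <T> CompletableFuture<T> retryAsync(Supplier<CompletableFuture<T>> action,
                                               int retries, long delayMillis) {
        return action.get()
                .<CompletableFuture<T>>handle((value, error) -> {
                    if (error == null) {
                        return CompletableFuture.completedFuture(value);
                    }
                    if (retries <= 0) {
                        return CompletableFuture.failedFuture(error);
                    }
                    Executor timer = CompletableFuture.delayedExecutor(delayMillis, TimeUnit.MILLISECONDS);
                    return CompletableFuture.runAsync(() -> { }, timer)
                            .thenCompose(ignored -> retryAsync(action, retries - 1, delayMillis));
                })
                .thenCompose(next -> next);
    }
}
```

In the blocking version the action always runs on the caller's thread and the final exception comes out unchanged, which matches points (2) and (3) above; the asynchronous version gives both of those up in exchange for not holding a thread during the delay.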
I am using the Async Http Client library (with Netty) to make asynchronous HTTP GET requests to a RESTful API. Since I want to preserve the non-blocking behavior, I am returning instances of CompletableFuture<T> as the result of the HTTP GET requests. So, where a RESTful API endpoint returns a JSON array, I am returning a CompletableFuture<T[]>.
Yet, according to Erik Meijer's classification in The Four Essential Effects In Programming, I consider Stream<T> better suited as the result of a Java method that makes an asynchronous HTTP GET request and returns a JSON array. In this case we can see Stream<T> as the Observable<T> equivalent, which is the result of an asynchronous computation that returns many values.
So, considering that resp holds the response, I can get a CompletableFuture<Stream<T>> as follows:
CompletableFuture<T[]> resp = …
return resp.thenApply(Arrays::stream);
However, I was wondering how I can convert the CompletableFuture<Stream<T>> resp to a Stream<T> without waiting for the computation to complete (i.e. I do NOT want to block on a get() invocation).
I would like to have the same result as the following expression, but WITHOUT blocking on get():
return resp.thenApply(Arrays::stream).get();
You can build a Stream<T> that will defer the call to the Future<T> get() method, just like this:
CompletableFuture<T[]> resp = ...
return Stream
    .of(resp) // Stream<CompletableFuture<T[]>>
    .flatMap(f -> Arrays.stream(f.join())); // Stream<T>
To simplify usage, instead of get() I am using join() to avoid checked exceptions.
As long as the result of the asynchronous computation is passed as an array, you can’t benefit from the Stream API here, as the Stream operation can’t start processing elements until the array has been handed off, which implies the entire completion of the asynchronous job.
Unless you rewrite your asynchronous job to publish individual elements of the arrays, e.g. via queue, you can only defer the synchronization up to the point when the terminal operation of the stream has commenced. In other words, you can chain the intermediate operations to the Stream before you have to wait for the completion of the asynchronous job. Since chaining is not an expensive operation, the gain will be very small.
If you still want to do it, Miguel Gamboa’s solution, Stream.of(resp).flatMap(f -> Arrays.stream(f.join())) will do and it’s concise. Unfortunately, it may have performance drawbacks which outweigh any benefit of deferring the join operation. Streaming over arrays works smoothly as arrays have a predictable length and support for balanced splitting, whereas the nested stream not only lacks these features, the current implementation even lacks short-circuiting processing.
So instead of utilizing flatMap for deferring the Stream creation, it’s preferable to go one level deeper, where deferred Stream creation is directly supported:
static <T> Stream<T> getStream(CompletableFuture<T[]> resp) {
    return StreamSupport.stream(() -> Arrays.spliterator(resp.join()),
        Spliterator.ORDERED|Spliterator.SIZED|Spliterator.SUBSIZED|Spliterator.IMMUTABLE,
        false);
}
This creates a Stream which defers the join operation until the terminal operation starts, but still has the performance characteristics of an array-based Stream. But the code is obviously more complicated, and the benefit is still, as said, only the possibility to chain intermediate operations while the array-providing asynchronous operation is still running.
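To see the deferral in action, here is a small self-contained demo built around the getStream method. The class name DeferredStreamDemo and the sample data are made up for illustration; the future is completed by hand to stand in for the HTTP response arriving.

```java
import java.util.Arrays;
import java.util.Spliterator;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class DeferredStreamDemo {
    static <T> Stream<T> getStream(CompletableFuture<T[]> resp) {
        return StreamSupport.stream(
                () -> Arrays.spliterator(resp.join()),
                Spliterator.ORDERED | Spliterator.SIZED | Spliterator.SUBSIZED | Spliterator.IMMUTABLE,
                false);
    }

    public static void main(String[] args) {
        CompletableFuture<String[]> resp = new CompletableFuture<>();
        // Chaining intermediate operations does not call join(), so this
        // line returns even though the future is not complete yet.
        Stream<Integer> lengths = getStream(resp).map(String::length);
        // Simulate the asynchronous job finishing.
        resp.complete(new String[] {"a", "bb", "ccc"});
        // Only the terminal operation invokes the supplier and hence join().
        System.out.println(lengths.reduce(0, Integer::sum)); // prints 6
    }
}
```

Had the terminal operation been started before resp.complete(...), it would have blocked in join(), which is exactly the limitation described above.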
Just to get it straight in my head. Consider this example bit of Erlang code:
test() ->
    receive
        {From, whatever} ->
            %% do something
            test();
        {From, somethingelse} ->
            %% do something else
            test()
    end.
Isn't the test() call just a goto?
I ask this because in C we learned that if you do a function call, the return location is always put on the stack. I can't imagine this is the case in Erlang here, since it would result in a stack overflow.
We had two different ways of calling functions: goto and gosub.
goto just steered the program flow somewhere else, and gosub remembered where you came from so you could return.
Given this way of thinking, I can look at Erlang's recursion more easily: if I just read test() as a goto, then there is no problem at all.
Hence my question: isn't Erlang just using a goto instead of remembering the return address on a stack?
EDIT:
Just to clarify my point:
I know gotos can be used in some languages to jump all over the place. But just suppose that instead of doing someFunction() you could also do: goto someFunction()
In the first example the flow returns; in the second example the flow just continues in someFunction and never returns.
So we limit the normal GOTO behaviour by only being able to jump to method starting points.
If you see it like this, then the Erlang recursive function call looks like a goto.
(a goto in my opinion is a function call without the ability to return where you came from). Which is exactly what is happening in the Erlang example.
A tail recursive call is more of a "return and immediately call this other function" than a goto because of the housekeeping that's performed.
Addressing your newest points: recording the return point is just one bit of housekeeping that's performed when a function is called. The return point is stored in the stack frame, the rest of which must be allocated and initialized (in a normal call), including the local variables and parameters. With tail recursion, a new frame doesn't need to be allocated and the return point doesn't need to be stored (the previous value works fine), but the rest of the housekeeping needs to be performed.
There's also housekeeping that needs to be performed when a function returns, which includes discarding locals and parameters (as part of the stack frame) and returning to the call point. During a tail-recursive call, the locals for the current function are discarded before invoking the new function, but no return happens.
Rather like how threads allow for lighter-weight context switching than processes, tail calls allow for lighter-weight function invocation, since some of the housekeeping can be skipped.
The "goto &NAME" statement in Perl is closer to what you're thinking of, but not quite, as it discards locals. Parameters are kept around for the newly invoked function.
One more, simple difference: a tail call can only jump to a function entry point, while a goto can jump most anywhere (some languages restrict the target of a goto, such as C, where goto can't jump outside a function).
You are correct: the Erlang compiler will detect that it is a tail-recursive call and, instead of growing the stack, it reuses the current function's stack space.
Furthermore it is also able to detect circular tail-recursion, e.g.
test() -> ..., test2().
test2() -> ..., test3().
test3() -> ..., test().
will also be optimized.
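Java does not eliminate tail calls, so the effect has to be simulated there, but simulating it shows what "reusing the stack space" buys you. The sketch below uses a trampoline: each function returns a description of its tail call instead of making it, and a loop runs the calls one after another. Step, TrampolineDemo and the integer results are all invented for the illustration; this is not how the Erlang VM implements it.

```java
import java.util.function.Supplier;

// A step is either a final result or a deferred "tail call" to run next.
final class Step {
    final Supplier<Step> next; // null when the computation is finished
    final int result;

    private Step(Supplier<Step> next, int result) {
        this.next = next;
        this.result = result;
    }

    static Step done(int result) {
        return new Step(null, result);
    }

    static Step call(Supplier<Step> next) {
        return new Step(next, 0);
    }
}

final class TrampolineDemo {
    // Three functions that call each other in a cycle, each in tail position.
    static Step test(int n) {
        return n == 0 ? Step.done(1) : Step.call(() -> test2(n - 1));
    }

    static Step test2(int n) {
        return n == 0 ? Step.done(2) : Step.call(() -> test3(n - 1));
    }

    static Step test3(int n) {
        return n == 0 ? Step.done(3) : Step.call(() -> test(n - 1));
    }

    // Runs the steps in a loop, so the stack depth stays constant
    // no matter how many times the cycle goes round.
    static int run(Step step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.result;
    }
}
```

TrampolineDemo.run(TrampolineDemo.test(1_000_000)) completes with a flat stack, whereas three directly mutually recursive Java methods would typically overflow the stack at that depth.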
The "unfortunate" side-effect of this is that when you are tracing function calls, you will not be able to see each invocation of a tail-recursive function, only the entry and exit points.
You've got two questions here.
First, no, you're not in any danger of overrunning the stack in this case because these calls to test() are both tail-recursive.
Second, no, function calls are not goto, they're function calls. :) The thing that makes goto problematic is that it bypasses any structure in your code. You can jump out of statements, jump into statements, bypass assignments...all kinds of screwiness. Function calls don't have this problem because they have an obvious flow of control.
I think the difference here is between a "real" goto and what can in some cases seem like a goto. In some special cases the compiler can detect that it is free to clean up the stack of the current function before calling another function. This is when the call is the last call in a function. The difference is, of course, that as in any other call you can pass arguments to the new function.
As others have pointed out, this optimisation is not restricted to recursive calls but applies to all last calls. This is used in the "classic" way of programming FSMs.
It's a goto in the same way that if is a goto and while is a goto. It is implemented using (the moral equivalent of) goto, but it does not expose the full shoot-self-in-foot potential of goto directly to the programmer.
In fact, these recursive functions are the ultimate GOTO according to Guy Steele.
Here's a more general answer, which supersedes my earlier answer based on call-stacks. Since the earlier answer has been accepted, I won't replace the text.
Prologue
Some architectures don't have things they call "functions" that are "called", but do have something analogous (messaging may call them "methods" or "message handlers"; event based architectures have "event handlers" or simply "handlers"). I'll be using the terms "code block" and "invocation" for the general case, though (strictly speaking) "code block" can include things that aren't quite functions. You can substitute the appropriately inflected form of "call" for "invocation" or "invoke", as I might in a few places. The features of an architecture that describe invocation are sometimes called "styles", as in "continuation passing style" (CPS), though this isn't precisely an official term. To keep things from being too abstract, we'll examine call stack, continuation passing, messaging (à la OOP) and event handling invocation styles. I should specify the models I'm using for these styles, but I'm leaving them out in the interest of space.
Invocation Features
or, C Is For Continuation, Coordination and Context, That's Good Enough For Me
Hohpe identifies three nicely alliterative invocation features of the call-stack style: Continuation, Coordination, Context (all capitalized to distinguish them from other uses of the words).
Continuation decides where execution will continue when a code block finishes. The "Continuation" feature is related to "first-class continuations" (often simply called "continuations", including by me), in that continuations make the Continuation feature visible and manipulable at a programmatic level.
Coordination means code doesn't execute until the data it needs is ready. Within a single call stack, you get Coordination for free because the program counter won't return to a function until a called function finishes. Coordination becomes an issue in (e.g.) concurrent and event-driven programming, the former because a data producer may fall behind a data consumer and the latter because when a handler fires an event, the handler continues immediately without waiting for a response.
Context refers to the environment that is used to resolve names in a code block. It includes allocation and initialization of the local variables, parameters and return value(s). Parameter passing is also covered by the calling convention (keeping up the alliteration); for the general case, you could split Context into a feature that covers locals, one that covers parameters and another for return values. For CPS, return values are covered by parameter passing.
The three features aren't necessarily independent; invocation style determines their interrelationships. For instance, Coordination is tied to Continuation under the call-stack style. Continuation and Context are connected in general, since return values are involved in Continuation.
Hohpe's list isn't necessarily exhaustive, but it will suffice to distinguish tail-calls from gotos. Warning: I might go off on tangents, such as exploring invocation space based on Hohpe's features, but I'll try to contain myself.
Invocation Feature Tasks
Each invocation feature involves tasks to be completed when invoking a code block. For Continuation, invoked code blocks are naturally related by a chain of invoking code. When a code block is invoked, the current invocation chain (or "call chain") is extended by placing a reference (an "invocation reference") to the invoking code at the end of the chain (this process is described more concretely below). Taking into account invocation also involves binding names to code blocks and parameters, we see even non-bondage-and-discipline languages can have the same fun.
Tail Calls
or, The Answer
or, The Rest Is Basically Unnecessary
Tail calling is all about optimizing Continuation, and it's a matter of recognizing when the main Continuation task (recording an invocation reference) can be skipped. The other feature tasks stand on their own. A "goto" represents optimizing away tasks for Continuation and Context. That's pretty much why a tail call isn't a simple "goto". What follows will flesh out what tail calls look like in various invocation styles.
Tail Calls In Specific Invocation Styles
Different styles arrange invocation chains in different structures, which I'll call a "tangle", for lack of a better word. Isn't it nice that we've gotten away from spaghetti code?
With a call-stack, there's only one invocation chain in the tangle; extending the chain means pushing the program counter. A tail call means no program counter push.
Under CPS, the tangle consists of the extant continuations, which form a reverse arborescence (a directed tree where every edge points towards a central node), where each path back to the center is an invocation chain (note: if the program entry point is passed a "null" continuation, the tangle can be a whole forest of reverse arborescences). One particular chain is the default, which is where an invocation reference is added during invocation. Tail calls won't add an invocation reference to the default invocation chain. Note that "invocation chain" here is basically synonymous with "continuation", in the sense of "first class continuation".
Under message passing, the invocation chain is a chain of blocked methods, each waiting for a response from the method before it in the chain. A method that invokes another is a "client"; the invoked method is a "supplier" (I'm purposefully not using "service", though "supplier" isn't much better). A messaging tangle is a set of unconnected invocation chains. This tangle structure is rather like having multiple thread or process stacks. When the method merely echos another method's response as its own, the method can have its client wait on its supplier rather than itself. Note that this gives a slightly more general optimization, one that involves optimizing Coordination as well as Continuation. If the final portion of a method doesn't depend on a response (and the response doesn't depend on the data processed in the final portion), the method can continue once it's passed on its client's wait dependency to its supplier. This is analogous to launching a new thread, where the final portion of the method becomes the thread's main function, followed by a call-stack style tail call.
What About Event Handling Style?
With event handling, invocations don't have responses and handlers don't wait, so "invocation chains" (as used above) isn't a useful concept. Instead of a tangle, you have priority queues of events, which are owned by channels, and subscriptions, which are lists of listener-handler pairs. In some event driven architectures, channels are properties of listeners; every listener owns exactly one channel, so channels become synonymous with listeners. Invoking means firing an event on a channel, which invokes all subscribed listener-handlers; parameters are passed as properties of the event. Code that would depend on a response in another style becomes a separate handler under event handling, with an associated event. A tail call would be a handler that fires the event on another channel and does nothing else afterwards. Tail call optimization would involve re-subscribing listeners for the event from the second channel to the first, or possibly having the handler that fired the event on the first channel instead fire on the second channel (an optimization made by the programmer, not the compiler/interpreter). Here's what the former optimization looks like, starting with the un-optimized version.
1. Listener Alice subscribes to event "inauguration" on BBC News, using handler "party"
2. Alice fires event "election" on channel BBC News
3. Bob is listening for "election" on BBC News, so Bob's "openPolls" handler is invoked
4. Bob subscribes to event "inauguration" on channel CNN.
5. Bob fires event "voting" on channel CNN
6. Other events are fired & handled. Eventually, one of them ("win", for example) fires event "inauguration" on CNN.
7. Bob's barred handler fires "inauguration" on BBC News
8. Alice's inauguration handler is invoked.
And the optimized version:
1. Listener Alice subscribes to event "inauguration" on BBC News
2. Alice fires event "election" on channel BBC News
3. Bob is listening for "election" on BBC News, so Bob's "openPolls" handler is invoked
4. Bob subscribes anyone listening for "inauguration" on BBC News to the inauguration event on CNN*.
5. Bob fires event "voting" on channel CNN
6. Other events are fired & handled. Eventually, one of them fires event "inauguration" on CNN.
7. Alice's inauguration handler is invoked for the inauguration event on CNN.
Note tail calls are trickier (untenable?) under event handling because they have to take into account subscriptions. If Alice were later to unsubscribe from "inauguration" on BBC News, the subscription to inauguration on CNN would also need to be canceled. Additionally, the system must ensure it doesn't inappropriately invoke a handler multiple times for a listener. In the above optimized example, what if there's another handler for "inauguration" on CNN that fires "inauguration" on BBC News? Alice's "party" handler will be invoked twice, which may get her in trouble at work. One solution is to have *Bob unsubscribe all listeners from "inauguration" on BBC News in step 4, but then you introduce another bug wherein Alice will miss inauguration events that don't come via CNN. Maybe she wants to celebrate both the U.S. and British inaugurations. These problems arise because there are distinctions I'm not making in the model, possibly based on types of subscriptions. For instance, maybe there's a special kind of one-shot subscription (like System-V signal handlers) or some handlers unsubscribe themselves, and tail call optimization is only applied in these cases.
What's next?
You could go on to more fully specify invocation feature tasks. From there, you could figure out what optimizations are possible, and when they can be used. Perhaps other invocation features could be identified. You could also think of more examples of invocation styles. You could also explore the dependencies between invocation features. For instance, synchronous and asynchronous invocation involve explicitly coupling or uncoupling Continuation and Coordination. It never ends.
Get all that? I'm still trying to digest it myself.
References:
Hohpe, Gregor; "Event-Driven Architecture"
Sugalski, Dan; "CPS and tail calls--two great tastes that taste great together"
In this case it is possible to do tail-call optimization, since we don't need to do more work or make use of local variables. So the compiler will convert this into a loop.
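As an illustration of that conversion, here is the same computation written both ways. It is in Java, which does not perform the optimization itself, so the loop version is written out by hand; sumTo and sumToLoop are made-up examples, not anything from the Erlang code above.

```java
final class TailCallAsLoop {
    // Tail-recursive form: the recursive call is the last thing the method does,
    // so nothing in the current frame is needed once the call is made.
    static long sumTo(long n, long acc) {
        if (n == 0) {
            return acc;
        }
        return sumTo(n - 1, acc + n);
    }

    // Loop form: "calling again" becomes rebinding the parameters and jumping
    // back to the top, which is what tail-call optimization amounts to.
    static long sumToLoop(long n, long acc) {
        while (n != 0) {
            acc = acc + n;
            n = n - 1;
        }
        return acc;
    }
}
```

Both return the same result for the same arguments; the difference is that the recursive Java version uses one stack frame per step, while the loop (like the optimized Erlang call) uses one frame in total.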
(a goto in my opinion is a function call without the ability to return where you came from). Which is exactly what is happening in the erlang example.
That is not what's happening in Erlang, you CAN return to where you came from.
The calls are tail-recursive, which means that it is "sort of" a goto. Make sure you understand what tail-recursion is before you attempt to understand or write any code. Reading Joe Armstrong's book probably isn't a bad idea if you are new to Erlang.
Conceptually, in the case where you call yourself using test(), a call is made to the start of the function using whatever parameters you pass (none in this example), but nothing more is added to the stack. So all your variables are thrown away and the function starts fresh, but you didn't push a new return pointer onto the stack. So it's like a hybrid between a goto and a traditional imperative-language-style function call like you'd have in C or Java. But there IS still one entry on the stack from the very first call from the calling function. So when you eventually exit by returning a value rather than doing another test(), that return location is popped from the stack and execution resumes in your calling function.