Vert.x verticle instances confusing - asynchronous

I was confused by the Vert.x instance count. The first time I read the docs, I thought the instance count meant the number of event-looping threads.
As I dug into the source code (Vert.x 2.1.2), I found that a verticle instance means a task in the event-loop thread group. The event-loop thread always waits on the selector and runs tasks.
Then the first question comes:
Is it necessary to have verticle instances in Vert.x? The verticle is run only once by one event loop. To be more precise, the event-loop thread runs the verticle's start method and then throws it away; it works like an entry point, and that is all.
My second Question is:
How to collect the results of multiple events?
Scenario
send multiple queries on the event bus with the same handler instance
the handler waits for every callback and modifies flags
when the flags reach the threshold, do some jobs
Problem
When multiple events call back, there is a chance that multiple event-loop threads will execute the handler, so there is a race condition and the jobs may be run multiple times. How can I avoid it?
Any solutions will be appreciated.

Is it necessary to have verticle instances in Vert.x?
No, this is not required. You do not have to create any instances of the Verticle class yourself.
How to collect the results of multiple events?
Problem
When multiple events call back, there is a chance that multiple event-loop threads will execute the handler, so there is a race condition and the jobs may be run multiple times. How can I avoid it?
Each query sent over the event bus has its own corresponding Handler object. They do not share the same Handler instance. For every response to a query, its corresponding Handler object's handle() method is called. Hence, there is no room for a race condition over a specific Handler object.
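Additionally, all handlers registered by a single verticle run on that verticle's one event-loop thread, so a plain counter is safe for the scatter-gather pattern described in the question. Here is a minimal sketch of that counting pattern, written with Python's asyncio rather than the Vert.x API (the query function and names are assumptions for illustration):

```python
import asyncio

async def query(q):
    # Stand-in for an event-bus round trip; purely illustrative.
    await asyncio.sleep(0)
    return f"result:{q}"

async def scatter_gather(queries, threshold):
    results = []
    done = asyncio.Event()

    def on_reply(task):
        # Each query got its own callback, and all callbacks run on the
        # single loop thread, so appending and counting need no locking.
        results.append(task.result())
        if len(results) >= threshold:
            done.set()  # threshold reached: trigger the job exactly once

    for q in queries:
        asyncio.ensure_future(query(q)).add_done_callback(on_reply)
    await done.wait()
    return results

collected = asyncio.run(scatter_gather(["a", "b", "c"], threshold=3))
```

The key design point mirrors the answer above: one callback per query, all dispatched on one thread, so the "do the job once" check is just an integer comparison.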

Related

Is signal R async in its networking (not in its calls)

We are using SignalR to fetch data for 20 tiles.
We call FetchData() on all 20 tiles at the same time; each then fires off a message on SignalR to request its data (each tile has subscribed to get the answers).
We find that each tile populates its data one at a time, as if SignalR only fetches the next tile's response after the first tile has completed.
I know this is super high level, but in my mind it worked like an AJAX request, where if I fired off 20 requests in a row they would all return out of order at random.
By default, SignalR only allows a single execution from a client at a time. So if you fire off 20 concurrent calls to invoke a hub method, it will run those in sequence (there are exceptions to this rule, but that's the idea). This article covers the relevant configuration: https://learn.microsoft.com/en-us/aspnet/core/signalr/configuration?view=aspnetcore-6.0&tabs=dotnet#configure-server-options. You'd want to increase MaximumParallelInvocationsPerClient to something other than 1.
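To see why raising that limit matters, here is a hypothetical sketch (not SignalR itself) that models the per-client limit as a semaphore: with a limit of 1, five 50 ms invocations take roughly 250 ms in sequence; with a limit of 5 they overlap.

```python
import asyncio
import time

async def client_invocations(max_parallel, n, duration):
    # The semaphore plays the role of the per-client invocation limit
    # (MaximumParallelInvocationsPerClient in SignalR; the analogy is ours).
    sem = asyncio.Semaphore(max_parallel)

    async def invoke(i):
        async with sem:
            await asyncio.sleep(duration)  # simulated hub method work

    t0 = time.monotonic()
    await asyncio.gather(*(invoke(i) for i in range(n)))
    return time.monotonic() - t0

serial = asyncio.run(client_invocations(1, 5, 0.05))    # limit of 1
parallel = asyncio.run(client_invocations(5, 5, 0.05))  # limit of 5
```

With the limit at 1 the calls queue up exactly like the tiles in the question, populating one at a time.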

How to control the number of threads when executing an Asynchronous Activity in WF 4

I am creating a workflow in WF 4 with a ParallelForEach activity that iterates over a collection of items. For each item in the collection, I execute a custom asynchronous activity so that multiple items are processed in parallel.
This works for me, but I am concerned about the number of threads used, since each asynchronous activity instance is executed on its own thread. Is there a way to configure or control the number of threads that get launched when the ParallelForEach activity executes in the mechanism described above?
"since each Asynchronous activity instance is getting executed on its own thread" — says who? Certainly not the docs.
ParallelForEach enumerates its values and schedules the Body for every value it enumerates. It only schedules the Body; how the Body executes depends on whether it goes idle.
If the Body does not go idle, the bodies execute in reverse order, because the scheduled activities are handled as a stack: the last scheduled activity executes first.
For example, if you have a collection of {1, 2, 3, 4} in a ParallelForEach and use a WriteLine as the Body to write each value out, you get 4, 3, 2, 1 printed to the console. This is because WriteLine does not go idle, so after the 4 WriteLine activities are scheduled, they execute with stack behavior (last in, first out).
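The stack-based scheduling described above can be sketched in a few lines of Python (a conceptual model of the scheduler's behavior as described in this answer, not WF's actual implementation):

```python
def parallel_foreach(values, body):
    # Schedule the Body for every value; nothing executes while scheduling.
    stack = [lambda v=v: body(v) for v in values]
    results = []
    while stack:
        # The scheduler pops the most recently scheduled activity first.
        results.append(stack.pop()())
    return results

order = parallel_foreach([1, 2, 3, 4], lambda v: v)
# order == [4, 3, 2, 1], matching the WriteLine example above
```

If a body could "go idle" and re-enter the stack later, the interleaving would change; that is where the apparent parallelism comes from.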
Parallelism of execution occurs only when an activity creates a bookmark and goes idle. Even then, two activities aren't actually executing at the same time: one or more have just stopped executing, allowing others to run in order. Understandably confusing, given the name, but that's it.
In any event, when you're relying on the framework to parallelize for you, don't worry about how many threads it's using. It probably has everything under control, at least until you know it doesn't.
Will is correct: ParallelForEach does not require a new thread for each branch. If you are doing blocking I/O, that code should occur in an AsyncCodeActivity so that you aren't unnecessarily blocking. If you want CPU-bound work to run in parallel with other activities, you will need to either wrap it in an AsyncCodeActivity or use InvokeMethod { RunAsynchronously = true }, in which case the framework will take care of running the work on a background thread.
The SynchronizationContext extensibility point is intended for cases where you have a particular existing threading model that you need WF to integrate with. Prime examples include ASP.NET's threading environment and Windows Presentation Foundation/WinForms (e.g. if you wanted an activity to work correctly).

How do Goliath or EventMachine switch context?

Assume I have I/O-bound operations, with callbacks (or em-synchrony).
How does EM switch to process the next request while keeping the previous one waiting for its callback?
How does it keep Thread.current variables isolated?
How can I emulate long-running jobs?
1. How does EM switch to process the next request while keeping the previous one waiting for its callback?
In any reactor pattern there is a single thread of execution, meaning only one thing can execute at a time. If the reactor is processing a request on that main thread, then no other request can intervene (cooperative scheduling). The request can relinquish control voluntarily (in EM, we have EM.next_tick { # block }), by scheduling an operation in the future such as a timer (EM.add_timer { # block }), or by making another I/O operation.
For example, if you're using EM-Synchrony and you make an HTTP request (via em-http), then when the request is dispatched, the fiber is paused and a callback is created for you under the hood. When the request is finished, the callback is invoked by EventMachine via an internal callback, and control returns back to your request. For a more in-depth look:
http://www.igvita.com/2010/03/22/untangling-evented-code-with-ruby-fibers/
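The pause-and-resume dance can be sketched with a Python generator standing in for a Ruby Fiber (an analogy, not EM-Synchrony's actual mechanism): the request yields control at the I/O point, and the "reactor" resumes it when the fake response arrives.

```python
def http_request(url):
    # Dispatch the request, then pause; the reactor resumes us with the
    # response body once the (simulated) I/O completes.
    response = yield ("io", url)
    return response

def reactor_run(url):
    fiber = http_request(url)
    kind, pending = next(fiber)  # fiber is now paused at the I/O point
    try:
        # The "callback" fires: resume the fiber with the response.
        fiber.send(f"body of {pending}")
    except StopIteration as finished:
        return finished.value

result = reactor_run("http://example.com")
```

While the fiber is paused at the yield, a real reactor is free to run other requests; that is the whole trick.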
2. How does it keep Thread.current variables isolated?
No magic. In Ruby we have thread-local variables: Thread.current[:foo] = bar. Fibers have the same semantics, although what sometimes catches people off guard is that the same mechanism is used, i.e. Thread.current[:foo] = bar.
See here: http://devblog.avdi.org/2012/02/02/ruby-thread-locals-are-also-fiber-local/
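Python has an analogous facility (this is an analogy to Ruby's fiber-locals, not the same mechanism): contextvars give each asyncio task its own copy of a variable, so concurrent requests don't clobber each other's state.

```python
import asyncio
import contextvars

# Each asyncio task gets its own copy of this variable, much like
# Ruby's fiber-local Thread.current[:foo].
current_user = contextvars.ContextVar("current_user")

async def handle(name):
    current_user.set(name)
    await asyncio.sleep(0)     # another task runs in the meantime
    return current_user.get()  # still sees its own value, not the other's

async def main():
    return await asyncio.gather(handle("a"), handle("b"))

users = asyncio.run(main())
# users == ["a", "b"]: each task kept its own value despite interleaving
```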
3. How can I emulate long-running jobs?
Best approach: move them outside of the reactor. This is true of any reactor system.
a) Create a job queue and push the work to another process.
b) EM does provide EM.defer, which spawns another thread.
Choose (a) over (b) whenever possible; you'll thank yourself later.

Out of order race condition

When a workflow has a receive activity that occurs after another receive activity, and the second receive activity is called first, the workflow blocks the caller for 1 minute before timing out.
I want the workflow to return immediately when there is no matching workflow instance.
I do not want to change the timeout on the client, as some calls may legitimately take a while.
This is a known issue in WF4; at least, I am not aware of it having been fixed yet.

Qt/C++, QObject::connect() effect on currently executed function?

I always use QObject::connect() in all my applications, but its effect is not clear to me when my program is currently inside a function. Suppose I have the following code:
int main() {
    // other stuff here
    QObject::connect(xxx, SIGNAL(yyy()), this, SLOT(zzz()));
}
void aFunction()
{
    // a bunch of code here
    // I am here when suddenly the signal connected via QObject::connect() is emitted
    // another bunch of code here
}
I assume that when the signal is emitted, execution leaves aFunction() to run zzz(). What will happen to the remaining code in aFunction()?
Thanks.
I can understand the confusion; coming from procedural to event-based programming gave me the same experience you are having now.
Short answer:
In a single-threaded environment, the slot zzz() will be executed after aFunction() finishes. In fact, the signal probably gets emitted after aFunction() finishes.
In a multi-threaded environment, the same thing happens, except the slot runs "some time after", not immediately after.
The key to understanding this is the event loop. QApplication::exec() runs a loop forever, polling for events. Each new event is handled, signals get emitted, and depending on the fifth argument of QObject::connect(), a Qt::ConnectionType, the connected slot ultimately runs. You can read the QObject documentation for more detail.
So your aFunction() probably gets called after some signal; after it finishes, control is back in the event loop, and only then does your "suddenly emitted" signal actually get emitted and zzz() executed.
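A tiny sketch of this sequencing, using a Python deque as a stand-in for Qt's event queue (purely an analogy, not Qt's API): emitting while a handler runs just posts an event, and the loop runs the slot only after the current handler returns.

```python
from collections import deque

events = deque()  # stand-in for the thread's event queue
log = []

def emit(slot):
    # A queued "emit" just posts to the event queue; nothing runs yet.
    events.append(slot)

def a_function():
    log.append("aFunction start")
    emit(zzz)                    # the "suddenly emitted" signal
    log.append("aFunction end")  # aFunction still runs to completion

def zzz():
    log.append("zzz")

emit(a_function)
while events:                    # the event loop
    events.popleft()()
# log == ["aFunction start", "aFunction end", "zzz"]
```

The remaining code in aFunction() is never abandoned; zzz() simply waits its turn in the queue.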
Even in a multi-threaded environment, inter-thread signals and slots work with Qt::QueuedConnection, which basically posts the emitted signal to the corresponding thread, so that when that thread's event loop comes around to process it, it is executed sequentially as well.
Ultimately, remember that computers execute sequences of code. Whether execution is semi-parallel (e.g. time sharing, pipelining) or truly parallel (e.g. multiple cores or CPUs), code is ultimately run in sequence, or made to appear sequential, so that no handler is executed twice across multiple execution units.
There is no "suddenly".