I'm using an Azure environment and developing in .NET.
I am running a web app (ClientApp) that takes client data to perform a series of calculations. The calculations are performance intensive, so they are running on a separate web app (CalcApp).
Currently, the ClientApp sends the calculation request to the CalcApp. The requests from every client are put into a common queue and run one at a time, FIFO. My goal is to create separate queues for each client and run several calculations concurrently.
I am thinking of using Azure Service Bus queues to accomplish this. On the ClientApp, the service bus client would check for an existing queue for that client and create one if needed. On the CalcApp, the app would periodically check for existing queues. If it finds a new queue, it would create a new QueueClient that uses OnMessageAsync() with RunCalculationsAsync() as the callback.
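To make the idea concrete, here is a minimal sketch of what that could look like with the classic Microsoft.ServiceBus SDK (the one that exposes OnMessageAsync()); the connection string, the per-client queue naming, and RunCalculationsAsync are placeholder assumptions, not code from the actual apps:

```csharp
// Sketch only: per-client queues with the classic WindowsAzure.ServiceBus SDK.
using System;
using System.Threading.Tasks;
using Microsoft.ServiceBus;
using Microsoft.ServiceBus.Messaging;

class CalcQueueSketch
{
    const string ConnectionString = "<service-bus-connection-string>"; // placeholder

    // ClientApp side: make sure a queue exists for this client, then enqueue the request.
    public static async Task SendCalculationRequestAsync(string clientId, BrokeredMessage request)
    {
        var ns = NamespaceManager.CreateFromConnectionString(ConnectionString);
        string queuePath = $"calc-{clientId}"; // assumed naming scheme

        if (!await ns.QueueExistsAsync(queuePath))
            await ns.CreateQueueAsync(queuePath);

        var sender = QueueClient.CreateFromConnectionString(ConnectionString, queuePath);
        await sender.SendAsync(request);
    }

    // CalcApp side: attach a message pump to a newly discovered client queue.
    public static void StartListening(string queuePath, Func<BrokeredMessage, Task> runCalculationsAsync)
    {
        var receiver = QueueClient.CreateFromConnectionString(ConnectionString, queuePath);
        receiver.OnMessageAsync(runCalculationsAsync, new OnMessageOptions { AutoComplete = true });
    }
}
```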
Is this feasible or even a good idea?
I would consider using multiple consumers instead, perhaps with a topic denoting the "client" if you need to differentiate the type of processing based on which client originated it. Each client can add an entry into the queue, and the consumers "fight" over the messages. There is no chance of the same message being processed twice if you follow this approach.
I'm not sure having multiple queues is necessary.
Here is more information on the Competing Consumers pattern.
https://msdn.microsoft.com/en-us/library/dn568101.aspx
You could also build one consumer and spawn multiple threads. In this model, you would have one queue and one consumer, but still have the ability to run more than one calculation at a time. Ultimately, though, competing consumers is far more scalable, and you can use a combination of both strategies.
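As a rough sketch of that single-consumer variant (same classic SDK assumed), OnMessageOptions.MaxConcurrentCalls lets one QueueClient process several messages in parallel:

```csharp
// Sketch only: one queue, one consumer, several concurrent calculations.
using System;
using System.Threading.Tasks;
using Microsoft.ServiceBus.Messaging;

class SingleQueueConsumer
{
    public static void Start(string connectionString, string queuePath,
                             Func<BrokeredMessage, Task> runCalculationsAsync)
    {
        var client = QueueClient.CreateFromConnectionString(connectionString, queuePath);
        var options = new OnMessageOptions
        {
            AutoComplete = true,
            MaxConcurrentCalls = Environment.ProcessorCount // handle several messages at once
        };
        client.OnMessageAsync(runCalculationsAsync, options);
    }
}
```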
PROBLEM
Our PROCESSING SERVICE is serving UI, API, and internal clients and listening for commands from Kafka.
A few API clients might create a lot of generation tasks (one task is N messages) in a short time. With Kafka, we can't control how the commands are distributed, because each command goes to a partition that is consumed by exactly one processing instance (aka worker). Thus, UI requests could be left waiting too long while API requests are being processed.
In an ideal implementation, we should handle all tasks evenly, regardless of their size. The capacity of the processing service is distributed among all active tasks, and even if the cluster is heavily loaded, we can always be sure that a newly arrived task will start processing almost immediately, at least before the processing of all other tasks ends.
SOLUTION
Instead, we want an architecture with separate queues per combination of customer and endpoint. This gives us much better isolation, as well as the ability to dynamically adjust throughput on a per-customer basis.
On the producer side:
- a task comes from the client
- immediately create a queue for this task
- send all of the task's messages to this queue
On the consumer side:
- one process constantly updates the list of queues
- the other processes follow this list and consume, for example, one message from each queue (see the sketch after this list)
- scale consumers as needed
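If RabbitMQ were used for this, the consumer side could look roughly like the sketch below, pulling at most one message from each known queue per pass. The queue-discovery process is assumed to exist separately and simply feed queueNames; the host name and handler are illustrative.

```csharp
// Sketch only: round-robin over a dynamic list of RabbitMQ queues.
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading;
using RabbitMQ.Client;

class RoundRobinConsumer
{
    public static void Run(IReadOnlyList<string> queueNames, Action<string, string> handle)
    {
        var factory = new ConnectionFactory { HostName = "localhost" }; // assumed broker
        using var connection = factory.CreateConnection();
        using var channel = connection.CreateModel();

        while (true)
        {
            foreach (var queue in queueNames)
            {
                // BasicGet pulls at most one message; null means the queue is currently empty.
                BasicGetResult result = channel.BasicGet(queue, false); // false = manual ack
                if (result == null) continue;

                handle(queue, Encoding.UTF8.GetString(result.Body.ToArray()));
                channel.BasicAck(result.DeliveryTag, false);
            }
            Thread.Sleep(100); // avoid a tight loop when all queues are empty
        }
    }
}
```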
QUESTION
Is there any common solution to such a problem, using RabbitMQ or any other tooling? Historically, we use Kafka on the project, so an approach that uses it would be amazing, but we can use any technology for the solution.
Why not use Spark to execute the messages within the task? What I'm thinking is that each worker creates a Spark context that then parallelizes the messages. The function that is mapped can be based on which Kafka topic the user is consuming. I suspect, however, that your queues might contain tasks with a mixture of messages: UI, API calls, etc. This will result in a more complex mapping function. If you're not using a standalone cluster and are using YARN or something similar, you can change the queuing method that the Spark master is using.
As I understood the problem, you want to create request isolation per customer using dynamically allocated queues, which would allow each customer's tasks to be executed independently. The problem looks similar to the head-of-line blocking issue in networking.
Dynamically allocating queues is difficult. It can also lead to an explosion in the number of queues, which can be a burden on the infrastructure, and some queues could be empty or carry very little load. RabbitMQ won't help here; it is a queue with a different protocol than Kafka.
One alternative is to use a custom partitioner in Kafka that can look at the partition load and balance tasks based on that load. This works if the tasks are independent in nature and no state store is maintained in the worker.
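As a rough illustration of that idea with Confluent.Kafka, the producer can pick the least-loaded partition itself and produce directly to it (not a plugged-in partitioner, but the same effect); getPartitionLoad is a hypothetical application-level backlog metric:

```csharp
// Sketch only: load-aware partition selection done manually at produce time.
using System;
using System.Linq;
using System.Threading.Tasks;
using Confluent.Kafka;

class LoadAwareProducer
{
    public static async Task SendAsync(IProducer<string, string> producer, string topic,
                                       int partitionCount, string key, string value,
                                       Func<int, long> getPartitionLoad)
    {
        // Pick the partition that currently has the smallest backlog.
        int target = Enumerable.Range(0, partitionCount)
                               .OrderBy(getPartitionLoad)
                               .First();

        await producer.ProduceAsync(
            new TopicPartition(topic, new Partition(target)),
            new Message<string, string> { Key = key, Value = value });
    }
}
```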
The other alternative would be to load balance at the customer level. In this case you select a dedicated set of predefined queues for a set of customers: customers with certain IDs will be served by a given set of queues. The downside is that some queues can have less load than others. This solution is similar to Virtual Output Queuing in networking.
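A small sketch of that customer-level scheme, assuming each customer is pinned to one of a fixed pool of queues by a stable hash of the customer ID:

```csharp
// Sketch only: map a customer ID to one of a fixed set of queues.
using System;
using System.Security.Cryptography;
using System.Text;

static class CustomerQueueRouter
{
    public static int QueueIndexFor(string customerId, int queueCount)
    {
        // Use a stable hash rather than string.GetHashCode, which can vary per process.
        using var sha = SHA256.Create();
        byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(customerId));
        int value = BitConverter.ToInt32(hash, 0) & int.MaxValue;
        return value % queueCount;
    }
}
```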
My understanding is that the partitioning of the messages does not ensure an even load balance. I think you should avoid over-engineering and building custom machinery on top of the Kafka partitioner; instead, think about a good partitioning key that allows you to use Kafka efficiently.
I have a situation where I host a high-RPS, highly available service that receives requests, aka commands. These commands have to be sent to N downstream clients, which actually execute them. Each downstream client is a separate microservice and has different constraints, such as mode (sync, async), execution cadence, etc.
Should a slow downstream client build the logic to receive all requests and execute them in batches as it sees fit? Or should my service build the logic to talk to slow and fast clients by maintaining state for commands across downstream clients? Share your opinions.
Not enough info to give any prescriptive advice, but I'd start with dividing the tasks into async and sync first. Those are 2 completely different workloads that, most likely, would require different implementation stacks. I'll give you an idea of what you can start with in the world of AWS...
Not knowing what you mean by async, I'd default to a message-bus setup. In that case you can use something like Amazon Kinesis or Kafka for ingestion purposes, kicking off a Lambda function or an EC2 instance. If the clients need to be notified of a finished job, they can either long-poll an SQS queue, subscribe to an SNS topic, or use MQTT with WebSockets for a long-running connection.
The sync tasks are easier, since it's all about processing power. Just make sure you have your EC2 instances in an auto-scaling group behind an ALB or API Gateway to scale out, and in, appropriately.
This is a very simple answer since I don't have any details needed to be more precise, but this should give you an idea of where to get started.
With Rebus, I'd like to be able to publish from one application and subscribe from multiple applications, where each message only gets processed by one subscriber, round robin, as described here:
https://www.rabbitmq.com/tutorials/tutorial-two-dotnet.html
Is this possible with rebus?
Yes, but the words "publish" and "subscribe" are confusing here, because this is not "Publish/Subscribe" as described in the literature: pub/sub pretty much means that you do not care how many listeners there are.
What you want, is to send a message, and then you want the sent messages distributed among multiple consumers.
With Rebus, that is definitely possible. However, the way you do it may depend slightly on which transport you are using.(*)
Most transports have pretty good support for the competing consumers pattern, where you simply start multiple processes, probably running on multiple machines, that consume messages from the same queue.
This way, each message gets processed exactly once, and each consumer will receive messages at a rate that suits that particular consumer.
(*) MSMQ is not good at distributing load when multiple processes are taking messages off of the same queue, especially not when the processes are running on machines other than the one where the queue is.
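For a concrete picture, a worker set up for competing consumers with Rebus could look roughly like this (RabbitMQ transport assumed; the connection string, queue name, and CalcRequest type are placeholders). Every worker process runs the same code against the same input queue:

```csharp
// Sketch only: a Rebus worker; run several copies of this process for competing consumers.
using System;
using System.Threading.Tasks;
using Rebus.Activation;
using Rebus.Config;

class Worker
{
    static void Main()
    {
        using var activator = new BuiltinHandlerActivator();

        activator.Handle<CalcRequest>(async request =>
        {
            // do the actual work here
            await Task.CompletedTask;
        });

        Configure.With(activator)
            .Transport(t => t.UseRabbitMq("amqp://localhost", "calc-input"))
            .Start();

        Console.ReadLine(); // keep the worker process alive
    }
}

class CalcRequest { public string Payload { get; set; } }
```

The sending side would then map CalcRequest to that same queue (for example with type-based routing) and call bus.Send; each sent message is picked up by exactly one of the running workers.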
Is there a standard name for this kind of design, and are there any existing frameworks in .NET I can make use of?
Multiple process.exe instances are running on the server. There is a ProcessPoolManager which is responsible for spawning these exes on an as-needed basis. Clients send tasks to a queue; the PoolManager reads the task queue, has each task processed in one of the spawned process.exe instances, and puts the response back onto a queue. The client then gets an async response from the queue when a response is available.
Do you know what kind of design this is, and how to achieve it with any existing frameworks?
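As a very rough sketch of the setup being described, the pool manager's queue-and-spawn loop could look like this in .NET; worker.exe, the task format, and passing the task as a command-line argument are hypothetical simplifications:

```csharp
// Sketch only: a pool manager that reads tasks from a queue, runs each in a
// spawned worker process, and puts the worker's output on a response queue.
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

class ProcessPoolManager
{
    private readonly Channel<string> _tasks = Channel.CreateUnbounded<string>();
    private readonly Channel<string> _responses = Channel.CreateUnbounded<string>();

    public ValueTask EnqueueTaskAsync(string task) => _tasks.Writer.WriteAsync(task);
    public ValueTask<string> ReadResponseAsync() => _responses.Reader.ReadAsync();

    public async Task RunAsync(int maxWorkers)
    {
        var throttle = new SemaphoreSlim(maxWorkers); // cap concurrent worker processes

        await foreach (var task in _tasks.Reader.ReadAllAsync())
        {
            await throttle.WaitAsync();
            _ = Task.Run(async () =>
            {
                try
                {
                    // Spawn a worker process for this task (hypothetical worker.exe).
                    var psi = new ProcessStartInfo("worker.exe", task)
                    {
                        RedirectStandardOutput = true,
                        UseShellExecute = false
                    };
                    using var worker = Process.Start(psi);
                    string output = await worker.StandardOutput.ReadToEndAsync();
                    worker.WaitForExit();
                    await _responses.Writer.WriteAsync(output);
                }
                finally { throttle.Release(); }
            });
        }
    }
}
```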
Suppose we were to implement a network application, such as a chat with a central server and several clients: we assume that all communication must go through the central server, which then has to pick up messages from some clients and forward them to the target clients, and so on.
Regardless of the technology used (sockets, web services, etc.), it is possible to think of this as some producer threads (that generate messages) and some consumer threads (that read messages).
For example, you could use a single queue for both incoming and outgoing messages, but with a single queue you couldn't receive and send messages simultaneously, because only one thread at a time can access the queue.
Perhaps it would be more appropriate to use two queues: for example, this article explains a way to manage a double queue so that producers and consumers can work almost simultaneously. This scenario may be fine if there is only one producer and one consumer, but what if there are many clients:
How can the central server receive data simultaneously from multiple input streams?
How can the central server send data simultaneously to multiple output streams?
To solve this problem, my idea is to use a double queue for each client: on the central server, each client connection would be associated with two queues, one for incoming messages from that client and one for outgoing messages addressed to that client. In this way the central server could send and receive data simultaneously on almost all of the client connections...
There are probably other ways to manage the queues... What are the parameters that determine how many queues are needed, and how should they be organized? Are there cases that do not need any queue at all?
To me, this idea of using a queue per client or multiple queues per client seems to miss the point. First of all, it is absolutely possible to build a queue which can be accessed simultaneously by 2 threads (one can be enqueueing an item while a different one is dequeueing another item). If you want to know how, post a specific question about that.
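For example, .NET's built-in ConcurrentQueue<T> already allows one thread to enqueue while another dequeues; a minimal sketch:

```csharp
// Sketch only: one shared queue, one producer thread and one consumer thread.
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class SharedQueueDemo
{
    static async Task Main()
    {
        var queue = new ConcurrentQueue<string>();

        var producer = Task.Run(() =>
        {
            for (int i = 0; i < 1000; i++)
                queue.Enqueue($"message {i}");
        });

        var consumer = Task.Run(() =>
        {
            int seen = 0;
            while (seen < 1000)
                if (queue.TryDequeue(out _))
                    seen++;
            Console.WriteLine($"consumed {seen} messages");
        });

        await Task.WhenAll(producer, consumer);
    }
}
```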
Second, even if we assume that only 1 thread at a time can access a single queue, and even if we assume that the server will be receiving or sending data to/from all the clients simultaneously, it still doesn't follow that you need a different queue for each client. To avoid limiting system performance, you just need to allow enough concurrency to utilize all the server's CPUs. Even with a single, system-wide queue, if dequeueing/enqueueing messages is fast enough compared to the other work the server is doing, it might not be a bottleneck. (And with an efficient implementation, simply inserting an item or removing an item from a queue should be very fast. It's a very simple operation.) For that message queue to become the bottleneck limiting performance, either you would need a LOT of CPUs, or everything else the server was doing would have to be very fast. In that case, you could work out some scheme with 2 or 4 system-wide queues, to allow 2x or 4x more concurrency.
The whole idea of using work queues in a multi-threaded system is that they 1) allow multiple consumers to all grab work from a single location, so producers can "dump" whatever work they need done at that single location without worrying about which consumer will do it, and 2) function as a load-balancing mechanism for the consumers. (Additionally, a work queue can act as a "buffer" if producers temporarily generate work too fast for the consumers.) If you have a dedicated pair of producer-consumer threads for each client, it calls into question why you need to use queues at all. Why not just do a synchronous "pass off" from dedicated producer to corresponding dedicated consumer? Or, why not use a single thread per client which acts as both producer and consumer? Using queues in the way which you are proposing doesn't seem to really gain anything.