I am working on a SignalR application and I intend to make it scalable using Azure Service Bus and Azure autoscale. However, based on my expected user base, I anticipate that 90% of the time, my application will only have one instance running.
I would like to only have the backplane active if there is more than one instance, since the backplane architecture increases the travel time of a message and the message bus will cost me money. I definitely recognize that the travel time and costs are very small, but there's no reason to have them, if there's no reason to have them.
Q: Is it possible to make the service bus backplane for SignalR dynamic so that it can be enabled and disabled based on need?
Possible? Probably, but this is uncharted territory and there's no telling what would break if you actually implemented a bus that dynamically scaled out on demand. Sounds like a cool experiment though...
Related
We have a bus reservation system running in GKE in which different threads handle the creation of reservations. Because of that, CRUD Java methods can sometimes run simultaneously against the same bus, so that only the LAST of the simultaneous updates is saved in our DB (the other simultaneous updates are lost).
Even if the probability is low (the simultaneous updates need to be really close, within 1-2 seconds), we need to avoid this. My question is about how to approach the solution:
Lock the bus object and return error to the other simultaneous requests
In-memory map or Redis cache to track the bus requests
Use GCP Pub/Sub, Kafka or RabbitMQ as a queue system.
Try to focus the efforts on reducing the simultaneous time window (from 1-2 seconds down to milliseconds)
Others?
Also, we are worried that GKE request-handling scalability may become an issue in the future. If we manage a considerably higher number of buses, would we need to implement a queue system between the client and the server? Or will the GKE load balancer & ambassador already manage it for us? If we need a queue system in the future, could it also be used for the collision problem we are facing now?
Last, the reservation requests from the client often take a while. Therefore, we are changing the requests to be handled asynchronously, with a long-polling approach from the client to learn the task status. Could we link this solution to the current problem? For example, by using the Redis cache or the queue system to track the task status? Or should we try to keep the requests synchronous and focus on reducing the processing time (which may be quite difficult)?
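To make the first option in the list above concrete, here is a minimal sketch of "lock the bus object and return an error to the other simultaneous requests". It is written in Python purely for illustration (the system in question is Java), and the names BusLockRegistry, BusBusyError and update are hypothetical. It also assumes a single process: with several GKE pods the lock would have to live in something shared, which is where the Redis option comes in.

```python
import threading
from collections import defaultdict


class BusBusyError(Exception):
    """Raised when another request is already updating the same bus."""


class BusLockRegistry:
    """Keeps one lock per bus id (hypothetical helper, in-process only)."""

    def __init__(self):
        self._guard = threading.Lock()              # protects the dict itself
        self._locks = defaultdict(threading.Lock)   # bus_id -> lock

    def _lock_for(self, bus_id):
        with self._guard:
            return self._locks[bus_id]

    def update(self, bus_id, apply_update):
        """Run apply_update() only if nobody else is updating this bus."""
        lock = self._lock_for(bus_id)
        # Non-blocking acquire: fail fast instead of waiting for the other update.
        if not lock.acquire(blocking=False):
            raise BusBusyError(f"bus {bus_id} is already being updated")
        try:
            return apply_update()
        finally:
            lock.release()
```

A caller would catch BusBusyError and translate it into an error response (or a retry) for the client. Updates to different buses do not block each other, since each bus id gets its own lock.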
I am developing an ASP.NET application that will be hosted as an Azure web app. Part of the app will continuously record multiple web-based cameras by retrieving a snapshot every N seconds. I would like to design the app so that the processes that record the cameras can be run on multiple instances. I would like it to load balance between all instances, but not duplicate effort for any one camera.
For example, if I have 100 cameras, and am running on 2 instances, I want each instance to get 50 cameras to process. If I have 5 instances, each instance should get 20 cameras to process. As I add cameras or scale instances up/down I would like for the system to load balance the work evenly.
If it's feasible, I would rather not spin up dedicated VMs just for processing cameras, due to increased cost.
I'm somewhat familiar with Akka.NET, Hangfire, and WebJobs, but am unclear if these will help in this scenario. I have used Hangfire and WebJobs to do background processing, but not with this sort of load-balancing requirement. Will these or some other framework or tool help me load balance these background tasks evenly across Azure Web App Instances? How should I go about setting up these or another framework to do this?
I honestly don't think you want to try to "balance" the servers. I think you just want to make sure the work is well distributed. If I were you, I would use a queue system like SQS to queue up all of the cameras that need a snapshot and let each instance worker dequeue one at a time and process it.
A good approach could be to have a master server responsible for queueing up the snapshots, and then have all of your worker servers simply work out of this shared queue. Even if one server happens to process more than the others, that is fine since the others were working out of the same queue. It just means that this server was able to process its jobs more quickly than the others.
To be honest, there are a lot of ways to approach this. You could do something as simple as having a shared list of your cameras, with a timestamp for the last snapshot, and work off of that. Each server would request a camera by looking at the list, finding one that is stale, updating its timestamp, and then performing the snapshot for that camera. The downside to something like this is that you are going to struggle with non-atomic operations and the possibility of multiple workers making the request at the same time and both working on the same camera. These are the type of things that a queue system will help you with, because as soon as one of those queue items is in flight, it is no longer available to other workers. And also, because each server is responsible for invalidating its items once it has finished, if a server were to crash mid-snapshot, this work would simply go back into the queue.
No matter which solution you choose, it is going to boil down to having a central system/list for serving up stale cameras.
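As a rough illustration of the "in flight" behaviour described above, here is a small in-memory sketch in Python. It is not how SQS or any Azure queue is implemented; CameraQueue, visibility_timeout and the injectable clock are assumptions made up for the example. The point is only that a dequeued camera is hidden from other workers, and reappears if the worker never confirms it finished.

```python
import threading
import time
from collections import deque


class CameraQueue:
    """Toy work queue with in-flight tracking (hypothetical, single process)."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._ready = deque()      # cameras waiting for a worker
        self._in_flight = {}       # camera_id -> deadline for completion
        self._timeout = visibility_timeout
        self._clock = clock

    def enqueue(self, camera_id):
        with self._lock:
            self._ready.append(camera_id)

    def dequeue(self):
        """Hand out one camera, or None if nothing is available right now."""
        with self._lock:
            self._requeue_expired()
            if not self._ready:
                return None
            camera_id = self._ready.popleft()
            # While in flight, no other worker can receive this camera.
            self._in_flight[camera_id] = self._clock() + self._timeout
            return camera_id

    def complete(self, camera_id):
        """Called by the worker once the snapshot has been stored."""
        with self._lock:
            self._in_flight.pop(camera_id, None)

    def _requeue_expired(self):
        # A worker that crashed never calls complete(); its work comes back.
        now = self._clock()
        for camera_id, deadline in list(self._in_flight.items()):
            if deadline <= now:
                del self._in_flight[camera_id]
                self._ready.append(camera_id)
```

Each worker would loop on dequeue(), take the snapshot, and call complete(). A real queue service gives you the same guarantees across machines, which this sketch does not.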
The Azure WebJobs SDK uses the Storage Account you set up to balance the work between the various instances that are running your Jobs. You can gain finer control by using a Queue to divide up the work that needs doing and then scale your App Service Plan based on the Queue length.
Here's a rough picture of that architecture: [architecture diagram not reproduced]
I wanted to check the viability of using Message-Oriented Middleware (MOM) technology such as JMS, ActiveMQ or RabbitMQ to handle asynchronous processing within a single web application, i.e. the publisher and the subscriber to the MOM server will be contained in the same web application.
The rationale behind this design is to offload some of the heavy-duty processing as a background asynchronous operation. The publisher in this case is the server-side real-time web service method, which needs to respond almost instantly (in under 1 second) to the calling web service client; it emits the message on a MOM topic. The subscriber is contained in the same web application as the publisher and uses the message to asynchronously process the more complex, slightly more time-consuming (5-7 seconds) functionality.
With this design we can avoid having to spawn new threads within the application server container for handling the heavy duty complex processing functionality.
Is using a MOM server overkill in this case, where the message publisher and message subscriber are contained in the same web server address space? From what I have read, MOM technology is used mainly for inter-application communication, and I wanted to check whether it is fine to use MOM for intra-application communication.
Let me know your thoughts.
Thanks,
Perhaps you will not think it is a good example, but in the JEE world using JMS for intra-application communication is quite common. Spawning new threads is considered a bad practice, and message-driven beans make consuming messages relatively easy and give you transaction support. A compliant application server like GlassFish has JMS on board, so producing and consuming messages does not involve socket communication, as it would with a standalone ActiveMQ. But there might be reasons to have a standalone JMS server, e.g. if there is a cluster of consumers and you want the active instances to take over work from the failed ones... but then the standalone JMS server becomes the single point of failure and now you want a cluster of them and so on.
One significant feature of JMS is (optional) message persistence. You may be concerned that the long-running task fails for some reason and the client's request will be lost. But persistent messages are much more expensive as they cause disk IO.
From what you've described I can tell that of the usual features of MOM (asynchronous processing, guaranteed delivery, order of messages) you only need asynchronous processing. So if guarantees are not important I would use some kind of a thread pool.
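For what the thread pool alternative might look like, here is a minimal sketch in Python (the question is about a Java web application, so take it as an illustration of the shape, not as container-specific advice). The names handle_request and heavy_processing, the pool size of 4, and the shortened sleep are all assumptions for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool size; tune it to what the server can actually spare.
executor = ThreadPoolExecutor(max_workers=4)


def heavy_processing(order_id):
    """Stand-in for the 5-7 second task, shortened so the sketch runs fast."""
    time.sleep(0.05)
    return f"processed {order_id}"


def handle_request(order_id):
    """Respond right away and let a pooled thread finish the slow part."""
    future = executor.submit(heavy_processing, order_id)
    # The future is returned only so the caller can inspect it if needed.
    return {"status": "accepted", "order_id": order_id}, future
```

The trade-off is the one mentioned above: work submitted to an in-memory pool is lost if the process dies, whereas a persistent message would survive. If that does not matter, the pool avoids running a broker at all.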
I have an existing application (WPF) that monitors OPC Servers and alarms. There is a requirement for this to be accessible via a browser so that users can view the status of alarms etc remotely. I'm feeling out of my depth (I'm not a Web developer) and I just need some advice on the best technology to accomplish this.
I've written several WCF Services, but all these have done is, via a function call, crunch some data sending back a result.
This 'service' will have to be persistent and able to be interrogated by x number of clients. For example, a client will need to be able to connect, stay connected and be informed of events as and when they happen. This has been a major problem in the past when I've developed WCF services (channel faults etc.), and I've learnt to only keep a connection open for as long as it's needed. Is a WCF Service the best option in this case (as opposed to a normal Windows Service)?
I need to be able to 'push' information from the service to clients. So, someone navigates to a webpage, and the page shows in real time what is happening in the service. Do I need to use timers? This could be a big problem if session state cannot be maintained.
I've read about the Observer design pattern, but can this be implemented in ASP.NET, and how would ASP.NET connect (and remain connected) to a remote Windows service? Again, do I have to resort to timers?
I apologise if this appears vague, but the situation boils down to the following:
A process that's continually running (somewhere), receiving connections from remote clients (desktop/web), and then keeping the clients informed as events (alarms going off etc) occur.
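For reference, the Observer pattern mentioned above boils down to something like the following sketch. It is in Python rather than .NET and only covers the in-process part: AlarmMonitor, subscribe and raise_alarm are hypothetical names, and how a notification actually reaches a remote browser (the transport) is deliberately left out.

```python
class AlarmMonitor:
    """Subject: the long-running process that knows when alarms occur."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        """Register an observer; returns a function that unsubscribes it."""
        self._subscribers.append(callback)
        return lambda: self._subscribers.remove(callback)

    def raise_alarm(self, alarm):
        # Push the event to every observer instead of having them poll on a timer.
        for callback in list(self._subscribers):
            callback(alarm)
```

Each connected client would be represented by one callback that forwards the alarm over whatever connection it holds; unsubscribing when the client disconnects keeps the list from growing.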
We have a core Windows service hosting around 9 WCF services and acting as a client to another 3 WCF services. We have a front-end website that communicates with this Windows service through WCF.
At some point, the Windows service executes some heavy operations, which results in 100% CPU utilization, usually split 60-40 between the Windows service and SQL Server.
This is where the WCF requests between the website and the Windows service time out, and this results in a very unresponsive UI.
I am looking for a way to make sure any UI-related WCF calls get executed anyway and take the highest priority.
Our main problem is that we need to stick with this deployment scenario, where the Windows service, the website and SQL Server are all running on one machine. We are required to maintain a responsive UI even with 100% CPU utilization. I am not sure where to start looking for a fix for that ...
It sounds like you should split your service endpoint onto two separate hosts, one for high volume, or process-intensive operations and one for low latency operations. The high-volume endpoint would process from a queue offline, and the low-latency endpoint would handle requests synchronously from the UI.
The kind of problems you are having are typical of when you try to balance the conflicting resource needs of high volume and low latency together in the same process.
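A very reduced sketch of that split, in Python and inside one process just to show the shape: heavy work is only queued by the caller and processed offline by a worker, while the UI-facing call does nothing but a cheap lookup. It does not reproduce the separate hosting suggested above, and submit_heavy, get_status and the sum-of-squares "work" are made-up placeholders.

```python
import queue
import threading

heavy_jobs = queue.Queue()   # high-volume side: processed offline
results = {}                 # what the low-latency side reads from


def background_worker():
    """Drain the queue; a None job is the signal to stop."""
    while True:
        job = heavy_jobs.get()
        try:
            if job is None:
                return
            job_id, numbers = job
            results[job_id] = sum(n * n for n in numbers)  # placeholder heavy work
        finally:
            heavy_jobs.task_done()


def start_worker():
    worker = threading.Thread(target=background_worker, daemon=True)
    worker.start()
    return worker


def submit_heavy(job_id, numbers):
    """Called by the UI path: enqueue and return immediately."""
    heavy_jobs.put((job_id, numbers))
    return "queued"


def get_status(job_id):
    """Low-latency call: never waits for the heavy work."""
    return "done" if job_id in results else "pending"
```

With the heavy endpoint on its own host (or at least its own process with a lower priority), the UI-facing calls stay this cheap regardless of how long the queue gets.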
If you cannot scale out in this way then I can't really suggest much you can do about it and must apologize for not answering your question directly.
Another thing you could look at is making everything asynchronous and using a pattern such as CQRS to separate your read and write requirements.