Azure Redis Cache automatically clearing PubSub channels - asp.net

I have four different Redis Cache subscriptions set up in Azure. I also have four App Services that each use one of those Redis Cache subscriptions. The four App Services/Redis Cache subscriptions are for the same code base, but different environments. I use a test, staging, live-east coast, and live-west coast environment.
The code running in each app service is exactly the same.
I have an ASP.NET Core Web API project that uses StackExchange.Redis. In my Web API project, I connect to the Redis subscription set up in Azure that corresponds to the environment for the App Service. As part of the startup process for my Web API project, I open up four PubSub channels.
For the test, staging, and live-west coast environments, the four PubSub channels get created and work just fine. I can connect to the Redis console through Azure and run the PUBSUB CHANNELS command and see the four channels I create through code.
For some reason, on the live-east coast Redis subscription, only one of the PubSub channels shows up. I can also verify that only one channel is actually open. My front-end that calls the Web API has logic that publishes messages to the Redis PubSub. These publishes do not work on the live-east coast App Service. If I restart the App Service or reboot Redis, I can sometimes get all four PubSub channels to show up and work properly. Anytime I deploy new code to my live-east coast App Service, after the service boots back up, only one of the channels gets created.
For some reason Redis is closing three of my PubSub channels. Again, this only happens in one of my four Redis subscriptions/App Services. The other three work flawlessly.
I've made sure that all the settings for my four Redis subscriptions and four App Services are identical. I've tried rebooting and redeploying several times and I just can't get that live-east coast Redis subscription to keep all four PubSub channels open.
Has anyone experienced anything like this? Has anyone seen Azure Redis Cache randomly closing their PubSub channels?

It is possible that the clients subscribed to those channels have either died or never successfully connected. Once the subscriber count reaches zero, PUBSUB CHANNELS won't show that channel anymore. Try running PUBSUB NUMSUB <channel_name> to verify that there are subscribers, and run CLIENT LIST to see how many clients hold subscriptions (each entry shows something like sub=1).
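If you'd rather verify this from code than from the Azure console, here is a minimal diagnostic sketch using the Python redis-py client (the host name, access key, and channel names below are placeholders; the same PUBSUB NUMSUB and CLIENT LIST commands are available from StackExchange.Redis as well):
import redis
# Azure Redis Cache accepts SSL connections on port 6380.
# Host name and access key are placeholders for your cache's values.
r = redis.Redis(host='<cache-name>.redis.cache.windows.net', port=6380,
                ssl=True, password='<access-key>')
# One (channel, subscriber_count) pair per named channel;
# a count of 0 means the subscriber died or never connected.
print(r.pubsub_numsub('channel-1', 'channel-2', 'channel-3', 'channel-4'))
# Each client entry carries a sub=<n> field with its subscription count.
for client in r.client_list():
    print(client['addr'], client['sub'])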

Related

load balancing/routing an application made with socketio and flask

I'm a bit of a noob when it comes to deploying web apps and wanted to make sure a little app I'm building will work with the tech I'm trying to use.
I have some experience with Flask, but have only ever used the test server. My understanding is that with nginx or Apache, if I write a Flask app, each user who visits my website could get a different instance of the Flask app; exactly how that works is a little confusing to me.
The app I want to make is similar to chatrooms/a game like "among us". When a user comes to the website, they join a big "lobby" and can either join a "room" that already exists, or launch a new room and generate a code/ID that they can pass to their friends so that their friends can join the same session (I think a socketio "room" can be used for this).
However, if each client is connected to their own Flask instance, will every server instance be able to see the "rooms" on the other instances? Suppose my app becomes really popular and I want to scale the lobby across multiple machines/AWS instances in the future: is there anything I can do now to ensure this works? Or is scaling across multiple machines equivalent to scaling across instances on a single machine as far as the flask-socketio/nginx stack is concerned?
Basically, how do I ensure that the lobby part of the code is scalable? Is there anything I need to do to ensure every user has the ability to connect to rooms with other users even if they get a different instance of the Flask app?
I will answer this question specifically with regards to the Socket.IO service. Other features of your application or third-party services that you use may need their own support for horizontal scaling.
With Flask-SocketIO, scaling from one to two or more instances requires an additional piece: a message queue, typically either Redis or RabbitMQ, although there are a few more options.
As you clearly stated in your question, when the whole server is in a single instance, data such as which room(s) each connected client is in are readily available in the memory of the single process hosting the application.
When you scale to two or more instances, your clients are going to be partitioned and randomly assigned to one of your servers. So the participants that are in the same room will likely end up spread across multiple servers.
To make things work, the server instances all connect to the message queue and use messages to coordinate complex actions such as broadcasts to a room.
So in short, to scale from one to more instances, all you need to do is deploy a message queue and change the Flask-SocketIO server initialization to indicate the location of the queue. For example, here is the single-instance server instantiation:
from flask_socketio import SocketIO
socketio = SocketIO(app)
And here is the initialization with a Redis message queue running on localhost at the default port, 6379:
from flask_socketio import SocketIO
socketio = SocketIO(app, message_queue='redis://')
The application code does not need to be changed; Flask-SocketIO does all the coordination between instances for you by posting messages on the queue.
Note that it does not really matter if the instances are hosted on the same server or on different ones. All that matters is that they connect to the same message queue so that they can communicate.
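A side benefit of the queue, as a minimal sketch (the event name and room below are made up): any process that can reach the queue can emit to connected clients, even one that serves no Socket.IO connections itself, such as a background worker.
from flask_socketio import SocketIO
# No app argument: this instance only posts to the queue; it serves no clients.
external = SocketIO(message_queue='redis://')
external.emit('lobby_update', {'rooms': ['abc123']}, to='lobby')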

Google Cloud Pub/Sub Publisher Lifecycle

I'm running a Cloud Pub/Sub PublisherClient instance as a Singleton in an ASP.NET web application (.NET Standard 2). Does this retain a persistent HTTPS connection to the specified Cloud Pub/Sub Topic and should I call the ShutdownAsync method explicitly, or just let the connection be severed when the app pool recycles?
I'm running this in conjunction with Quartz.NET, publishing messages to Pub/Sub in relatively small batches every 30 seconds. This seems to introduce server affinity in a 3-node Azure Load Balancer cluster, where the majority of traffic is routed to a single node after running for an hour or more. I'm not 100% sure about best practices here.
Using Pub/Sub C# NuGet package V1 1.0 and Quartz NuGet 3.0.7
I assume you're using the PublisherClient from that package. Per the sample documentation, the PublisherClient instance should be shut down after use; this ensures that locally queued messages get sent. See also the ShutdownAsync documentation.

How does Azure SignalR handle Application Server scaling?

We have an existing web service that we are modifying such that when certain events happen within the service they can be published to users who are interested. We are using Azure SignalR Service as our mechanism for relaying messages from our service to interested users. Currently, our architecture looks like this:
Our SignalR application server has only one hub and we are currently running three instances of the application server. I have labeled these Hub Instance 01, Hub Instance 02, and Hub Instance 03 in the diagram above.
Each instance of our existing web service opens one connection to the Azure SignalR service. After reading the Azure SignalR Service internals docs I have come to understand that each client connection to the Azure SignalR service goes through a one-time mapping to an application server (or Hub Instance in this case). In the diagram I have modeled that by showing a colored link coming from either the existing web service instance or a user, and another link of the same color and style coming out of the Azure SignalR service and into a single Hub Instance.
Our primary concern is that the connection from the existing web service instance into the Azure SignalR service (the solid green and solid blue links in the diagram) could become saturated if we're trying to send too many events. Our plan to mitigate that concern was to open multiple connections from each web service instance to the Azure SignalR service. Then, within our code, we would simply round-robin through each of the connections as we send messages.
Our concern with that approach is that we don't know how those connections to the Azure SignalR service are going to be mapped to Hub Instances. We could end up in a situation like the one below, where one or two Hub Instances end up taking the brunt of our traffic.
In this diagram we can see:
Each instance of the existing web service has opened multiple connections to the Azure Signal R service. Unfortunately, Hub Instance 01 and Hub Instance 03 have been assigned the majority of those connections. That means that they'll be taking the brunt of our traffic and will eventually start to run hot.
This leads me to the following questions:
Is there anything we can do in our existing web service to make sure that the connections we establish to the Azure SignalR service are evenly spread out across the Hub Instances?
What can we do if one of our Hub Instances starts running hot? It seems like just adding another Hub Instance isn't going to be helpful because only new clients will be assigned to that instance. Is there a way to have the Azure SignalR service re-balance connections when a new Hub Instance comes online?
How are client connections affected if an application server instance goes down (e.g., for deploying updates)? Are the client connections terminated, and is the client then supposed to reconnect?
Within the Azure SignalR service, how are connections balanced if the SignalR Service cluster itself needs to scale up or down?
We're facing a similar issue, and from what I've read in the Microsoft docs they suggest incorporating a backplane using Redis or Service Bus into the architecture to manage the connections.

NServiceBus messaging across private networks

I was assigned the re-architecture of a legacy (medical) product which controls several external devices. In the current architecture, we have several such stations in each customer's network, where each station processes its own data, and they all share some of that data via a central server (that talks to the DB and BLOB storage).
I'm planning the new architecture such that it will allow more scenarios, such as monitoring the stations through a web interface, and allowing data processing to be scalable by adding additional servers.
This led me to choose NServicebus as the messaging and communication infrastructure. And I pretty much have a clear view of the new architecture.
However, another factor was recently added to the equation by my manager. He requires that the machine that communicates with the devices (hardware), will not be under the IT policies of the customer. The reason behind this, as I understand, is that we don't want the customer's IT to control OS updates, security, permissions and other settings, because we want full control over that machine in order to work properly with our hardware.
My manager thus added a requirement that this machine will be disconnected from the customer's LAN.
If I still want to deploy NServiceBus on that separated machine (because I want to pub/sub async messages to other machines, some on the customer's LAN and some not), will it require some special deployment? Will it require an NServiceBus gateway?
EDIT: I removed the other (1st) question, as it wasn't relevant to the scope of StackOverflow.
Regarding question 2: yes, it would require the use of a "Gateway"; however, the current NServiceBus Gateway implementation does not support pub/sub, so you would have to look at alternatives.

How to use ServiceStack Redis in a web application to take advantage of the pub/sub paradigm

I am interested in the pub/sub paradigm in order to provide a notifications system (i.e., like Facebook's), especially in a web application which has publishers (in several web applications on the same IIS web server) and one or more subscribers in charge of displaying the notifications on the web for the front-end user.
I found Redis, which seems to be a great server that provides interesting features: caching (like Memcached), pub/sub, and queues.
Unfortunately, I didn't find any examples in a web context (ASP.NET, with Ajax/jQuery), except ones using WebSockets and NodeJS, but I don't want to use those (too early). I guess I need a process (subscriber) which receives messages from the publishers, but I don't see how to do that in a web application (pub/sub works fine with unit tests).
EDIT: we currently use .NET (ASP.NET Forms) and are trying out the ServiceStack.Redis library (http://www.servicestack.net/)
Actually Redis Pub/Sub handles this scenario quite well; as Redis is an async, non-blocking server, it can hold many connections cheaply and it scales well.
Salvatore (aka Mr Redis :) describes the O(1) time complexity of Publish and Subscribe operations:
You can consider the work of subscribing/unsubscribing as a constant time operation, O(1) for both subscribing and unsubscribing (actually PSUBSCRIBE does more work than this if you are subscribed already to many patterns with the same client).
...
About memory, it is similar or smaller than the one used by a key, so you should not have problems to subscribe to millions of channels even in a small server.
So Redis is more than capable and designed for this scenario, but the problem, as Tom pointed out, is that to maintain a persistent connection users will need long-running connections (aka http-push / long-poll), and each active user will take up their own thread. Holding a thread isn't great for scalability, and technologically you would be better off using a non-blocking HTTP server like Manos de Mono or node.js, which are both async and non-blocking and can handle this scenario. Note: WebSockets are more efficient for real-time notifications over HTTP, so ideally you would use them if the user's browser supports them, and fall back to regular HTTP if not (or fall back to using Flash for WebSockets on the client).
So it's not Redis or its Pub/Sub that doesn't scale here; the limit is the number of concurrent connections a threaded HTTP server like IIS or Apache can hold. That said, you can still support a fair number of concurrent users with IIS (this post suggests 3000), and since IIS is the bottleneck and not Redis, you can easily just add an extra IIS server into the mix and distribute the load.
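To illustrate the worker-process side of this, here is a minimal subscriber sketch using the Python redis-py client (the channel name is made up; ServiceStack.Redis exposes the same SUBSCRIBE operation for C#):
import redis
r = redis.Redis()
p = r.pubsub()
p.subscribe('notifications')
# listen() blocks on a persistent connection, which is why this belongs
# in a long-running worker process rather than a stateless web request.
for message in p.listen():
    if message['type'] == 'message':
        print(message['channel'], message['data'])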
For this application, I would strongly suggest using SignalR, which is a .NET framework that enables real-time push to connected clients.
Redis publish/subscribe is not designed for this scenario - it requires a persistent connection to redis, which you have if you are writing a worker process but not when you are working with stateless web requests.
A publish/subscribe system that works for end users over http takes a little more work, but not too much - the simplest approach is to use a sorted set for each channel and record the time a user last got notifications. You could also do it with a list recording subscribers for each channel and write to the inbox list of each of those users whenever a notification is added.
With either of those methods a user can retrieve their new notifications very quickly. It will be a form of polling rather than true push notifications, but you aren't really going to get away from that due to the nature of http.
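As a concrete sketch of the sorted-set approach (the key scheme and function names are made up, shown here with the Python redis-py client; the equivalent ZADD/ZRANGEBYSCORE calls exist in ServiceStack.Redis):
import json
import time
import redis
r = redis.Redis()
def publish_notification(channel, payload):
    # Score each notification with the time it was published.
    r.zadd('notifications:' + channel, {json.dumps(payload): time.time()})
def get_new_notifications(channel, last_seen):
    # Everything published since the caller last polled.
    items = r.zrangebyscore('notifications:' + channel, last_seen, '+inf')
    return [json.loads(i) for i in items]
On each poll, the client passes the timestamp of its previous poll as last_seen and stores the current time for the next one.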
Technically you could use redis pub/sub with long-running http connections, but if every user needs their own thread with active redis and http connections, scalability won't be very good.
