load balancing/routing an application made with socketio and flask - nginx

I'm a bit of noob when it comes to deploying web apps and wanted to make sure a little app I'm building will work with the tech I'm trying to use.
I have some experience with flask, but have only ever used the test server. My understanding is that with nginx or apache, if I write a flask app, each user who visits my website could get a different instance of the flask app, exactly how that will work is a little confusing to me.
The app I want to make is similar to chatrooms/a game like "among us". When a user comes to the website, they join a big "lobby" and can either join a "room" that already exists, or launch a new room and generate a code/ID that they can pass to their friends so that their friends can join the same session (I think a socketio "room" can be used for this).
However, if each client is connected to their own flask instance, will every server instance be able to see the "rooms" on the other instances? Suppose my app becomes really popular and I want to scale the lobby across multiple machines/AWS instances in the future, is there anything I can do now to ensure this works? Or is scaling across multiple machines equivalent to scaling across instances on a single machine as far as the flask-socketio/nginx stack is concerned.
Basically, how do I ensure that the lobby part of the code is scalable. Is there anything I need to do to ensure every user has the ability to connect to rooms with other users even if they get a different instance of the flask app?

I will answer this question specifically with regards to the Socket.IO service. Other features of your application or third-party services that you use may need their own support for horizontal scaling.
With Flask-SocketIO scaling from one to two or more instances requires an additional piece, a message queue, which typically is either Redis or RabbitMQ, although there are a few more options.
As you clearly stated in your question, when the whole server is in a single instance, data such as which room(s) each connected client is in are readily available in the memory of the single process hosting the application.
When you scale to two or more instances, your clients are going to be partitioned and randomly assigned to one of your servers. So you will likely end up having the participants that are in a room also spread across multiple servers.
To make things work, the server instances all connect to the message queue and use messages to coordinate complex actions such as broadcasts to a room.
So in short, to scale from one to more instances, all you need to do is deploy a message queue, and change the Flask-SocketIO server to indicate the location of the queue. For example, here is the single instance server instantiation:
from flask_socketio import SocketIO
socketio = SocketIO(app)
And here is the initialization with a Redis message queue running on localhost's default 6379 port:
from flask_socketio import SocketIO
socketio = SocketIO(app, message_queue='redis://')
The application code does not need to be changed, Flask-SocketIO does all the coordination between instances for you by posting message on the queue.
Note that it does not really matter if the instances are hosted in the same server or in different ones. All that matters is that they connect to the same message queue so that they can communicate.

Related

uWSGI and Flask: keep objects in memory between requests

My stack is uWSGI, flask and nginx currently. I have a need to store data between requests (basically I receive push notifications from another service about events to the server and I want to store those events in the server memory, so client can just query server every n milliseconds, to receive latest update).
Normally this would not work, because of many reasons. One is a good deployment requires you to have several processes in uwsgi in production (and even maybe several machines to scale this out). But my case is very specific: I'm building a web app for a piece of hardware (You can think of your home router configuration page as a good example). This means no need to scale. I also do not have a database (at least not a traditional one) and probably normally 1-2 clients simultaneously.
if I specify --processes 1 --threads 4 in uwsgi, is this enough to ensure the data is kept in the memory as a single instance? Or do I also need to use --threads 1?
I'm also aware that some web servers clear memory randomly from time to time and restart the hosted app. Does nginx/uwsgi do that and where can I read about the rules?
I'd also welcome advises on how to design all of this, if there are better ways to handle this. Please note that I do not consider using any persistant storage for this - this does not worth the effort and may be even impossible due to hardware limitations.
Just to clarify: When I'm talking about one instance of data, I'm thinking of my app.py executing exactly one time and keeping the instances defined there for as long as the server lives.
If you don't need data to persist past a server restart, why not just build a cache object into you application that can do push and pop operations?
A simple array of objects should suffice, one flask route pushes new data to the array and another can pop the data off the array.

Will Docker suffice for a Shiny app with ~100 connections or do I need Shiny Proxy?

I'm looking for a free and open source option for serving out a Shiny appl to ~100 of my students simultaneously. I tried to do this with Shiny Server Open and it throttled. Users got a message like
Too Many Users
Sorry, but this application has exceeded its quota of concurrent users. Please try again later.
After searching on that error message I now know that I can increase the number of concurrent connections, but I'm afraid of bottlenecks due to R's single threaded-ness. I'm aware of Shiny Proxy and I've been experimenting with this, but it seems like it may contain an extra layer of complexity that I don't need.
I've served Shiny apps before with Docker (but not to this large of an audience), so I'm wondering if it will be sufficient.
My question is this: if I don't need authentication (user logins), will Docker suffice for a single page application for ~100 simultaneous connections? Or do I really need Shiny Proxy?
Corollary: how can I test this and ensure that it will work (outside of getting in front of a 100 student class and testing on the fly)?
Do you care if they all share the same underlying R process?
The open-source version of shiny-server allows you to serve apps, but they all share a single R process. So if your app has a long-running simulation, while one user runs that, it would tie-up your R thread and block the other users until it finishes running.
I don't know that there is a limit on the concurrent connections, if you don't mind them sharing the R process as described above. You can try increasing the simple_scheduler setting, see Section 3.1.2 Simple Scheduler, in the documentation for shiny-server.conf (typically at /etc/shiny-server/).
If you don't care about them all being at the same URL, you could just use multiple instances of the open-source shiny-server, for instance in docker containers hosted on your machine at different ports.
If you want to do something like load-balance between instances of your application (horizontally scaling behind a single URL), you'll need either shiny-server pro, ShinyProxy, or to use a load balancer with sticky-sessions. This is because shiny apps handle state in-memory in a R session, so if you try to send your students to a URL and that URL is backed by n instances of your app, but there is no stickiness guarantee, than an individual student's action will not necessarily be on the same instance each time, and the apps won't work as you expect.
Shiny-Server pro and ShinyProxy handle this stickiness for you with cookies and headers. Depending on your cloud-services provider, they probably support a browser cookie which would work as long as you don't need your students to be able to open multiple tabs of your app with different instances.

easy server and client communication

I want to create a program for my desktop and an app for my android. Both of them will do the same, just on those different devices. They will be something like personal assistants, so I want to put a lot of data into them ( for example contacts, notes and a huge lot of other stuff). All of this data should be saved on a server (at least for the beginning I will use my own Ubuntu server at home).
For the android app I will obviously use java and the database on the server will be a MySQL database, because that's the database I have used for everything. The Windows program will most likely be written in of these languages: Java, C#c C++, as these are the languages I am able to use quite well.
Now to the problem/question: The server should have a good backend which will be communicating with the apps/programs and read/write data in the database, manage the users and all that stuff. But I am not sure how I should approach programming the backend and the "network communication" itself. I would really like to have some relatively easy way to send secured messages between server and clients, but I have no experience in that matter. I do have programming experience in general, but not with backend and network programming.
side notes:
I would like to "scale big". At first this system will only be used by me, but it may be opened to more people or even sold.
Also I would really like to a (partly) self programmed backend on the server, because I could very well use this for a lot of other stuff, like some automation features in my house, which will be implemented.
EDIT: I would like to be able to scale big. I don't need support for hundreds of people at the beginning ;)
You need to research Socket programming. They provide relatively easy, secured network communication. Essentially, you will create some sort of connection or socket listener on your server. The clients will create Sockets, initialize them to connect to a certain IP address and port number, and then connect. Once the server receives these connections, the server creates a Socket for that specific connection, and the two sockets can communicate back and forth.
If you want your server to be able to handle multiple clients, I suggest creating a new Thread every time the server receives a connection, and that Thread will be dedicated to that specific client connection. Having a multi-threaded server where each client has its own dedicated Thread is a good starting point for an efficient server.
Here are some good C# examples of Socket clients and servers: https://msdn.microsoft.com/en-us/library/w89fhyex(v=vs.110).aspx
As a side note, you can also write Android apps in C# with Xamarin. If you did your desktop program and Android app both in C#, you'd be able to write most of the code once and share it between the two apps easily.
I suggest you start learning socket programming by creating very simple client and server applications in order to grasp how they will be communicating in your larger project. Once you can grasp the communication procedures well enough, start designing your larger project.
But I am not sure how I should approach programming the backend and
the "network communication" itself.
Traditionally, a server for your case would be a web server exposing REST API (JSON). All clients need to do http requests and render/parse JSON. REST API is mapped to database calls and exposes some data model. If it was in Java, it would be Jetty web server, Jackson Json parser.
I would really like to have some relatively easy way to send secured
messages between server and clients,
Sending HTTP requests probably the easiest way to communicate with a service. Having it secured is a matter of enabling HTTPS on the server side and implementing some user access authentication and action authorization. Enabling HTTPS with Jetty for Java will require few lines of code. Authentication is usually done via OAuth2 technique, and authorization could be based on ACL. You may go beyond of this and enable encryption of data at rest and employ other practices.
I would like to "scale big". At first this system will only be used by
me, but it may be opened to more people or even sold.
I would like to be able to scale big. I don't need support for
hundreds of people at the beginning
I anticipate scalability can become the main challenge. Depending on how far you want to scale, you may need to go to distributed (Big Data) databases and distributed serving and messaging layers.
Also I would really like to a (partly) self programmed backend on the
server, because I could very well use this for a lot of other stuff,
like some automation features in my house, which will be implemented.
I am not sure what you mean self-programmed. Usually a backend encapsulates some application specific business logic.
It could be a piece of logic between your database and http transport layer.
In more complicated scenario your logic can be put into asynchronous service behind the backend, so the service can do it's job without blocking clients' requests.
And in the most (probably) complicated scenario your backend may do machine learning (for example, if you would like you software stack to learn your home-being habits and automate house accordingly to your expectations without actually coding this automation)
but I have no experience in that matter. I do have programming
experience in general, but not with backend and network programming.
If you can code, writing a backend is not very hard problem. There are a lot of resources. However, you would need time (or money) to learn and to do it, what may distract you from the development of your applications or you may enjoy it.
The alternative to in-house developed of a backend could be a Backend-as-a-Service (BaaS) in cloud or on premises. There are number of product in this market. BaaS will allow you to eliminate the development of the backend entirely (or close to this). At minimum it should do:
REST API to data storage with configurable data model,
security,
scalability,
custom business-logic
Disclaimer: I am a member of webintrinsics.io team, which is a Backend-as-a-Service. Check our website and contact if you need to, we will be able to work with you and help you either with BaaS or with guiding you towards some useful resources.
Good luck with your work!

Server to Server Communication

I'd like to know if there's a way to communicate directly between two (or more) flask-socketio servers. I want to pass information between servers, and have clients connect a single web socket server, which would have all the combined logic and data from the other servers.
I found this example in JS Socket IO Server to Server where the solution was to use a socket.io-client to connect to another server.
I've looked through the Flask-SocketIO documentation, as well as other resources, however it doesn't appear that Flask-SocketIO has a client component to it.
Any suggestions or ideas?
Flask-SocketIO 2.0 can (maybe) do what you want. This is explained in the Using Multiple Workers section of the documentation.
Basically, the servers are configured to connect to a shared message queue service (redis, for example), and then a load balancer in front of them assigns clients to any of the servers in the pool using sticky sessions. Broadcasting operations are coordinated automatically among the servers by passing messages on the queue.
As an additional feature, if you use this set up, you can have any process connect to the message queue to post messages for clients, so for example, you can emit events to clients from a worker or other auxiliary process that is not a SocketIO server.
From your question it is unclear if you were looking to implement something like this, or if you wanted to have the servers communicate for a different reason. Sending of custom messages on the queue is currently not supported, but your question gave me the idea, this might be useful for some scenarios.
As far as using a SocketIO client as in the question you referenced, that shouud also work. You can use this Python package: https://pypi.python.org/pypi/socketIO-client. If you go this route, you can have a server be a client and receive events or join rooms.

Load balance background tasks in Azure Web App

I am developing an ASP.NET application that will be hosted as an Azure web app. Part of the app will continuously record multiple web-based cameras by retrieving a snapshot every N seconds. I would like to design the app so that the processes that record the cameras can be run on multiple instances. I would like it to load balance between all instances, but not duplicate effort for any one camera.
For example, if I have 100 cameras, and am running on 2 instances, I want each instance to get 50 cameras to process. If I have 5 instances, each instance should get 20 cameras to process. As I add cameras or scale instances up/down I would like for the system to load balance the work evenly.
If it's feasible, I would rather not spin up dedicated VMs just for processing cameras, due to increased cost.
I'm somewhat familiar with Akka.NET, Hangfire, and WebJobs, but am unclear if these will help in this scenario. I have used Hangfire and WebJobs to do background processing, but not with this sort of load-balancing requirement. Will these or some other framework or tool help me load balance these background tasks evenly across Azure Web App Instances? How should I go about setting up these or another framework to do this?
I honestly don't think you want to try to "balance" the servers. I think you just want to make sure the work is well distributed. If I were you, I would use a queue system like SQS to queue up all of the cameras that need a snapshot and let each instance worker dequeue one at a time and process it.
A good approach could be to have a master server responsible for queueing up the snapshots, and then have all of your workers servers simply work out of this shared queue. Even if one server happens to process more than the others, that is fine since the others were working out of the same queue. It just means that this server was able to process its jobs more quickly than the others.
To be honest, there are a lot of ways to approach this. You could do something as simple as just having a shared list of your cameras, with a timestamp for the last snapshot, and use this to work off of. Each server would request a camera, they would look at the list and find one that was stale, and then update the timestamp and perform the snapshot for the camera. The downside to something like this is you are going to struggle with non-atomic operations and the possibility of multiple workers making the request at the same time and both working on the same server. These are the type of things that a queue system will help you with, because as soon as one of those queue items are in flight, they will no longer be available. And also, because each server is responsible for invalidating their items once they are finished, if a server were to crash mid-snapshot, this work would simple go back into the queue.
No matter which solution you choose, it is going to boil down to having a central system/list for serving up stale cameras.
The Azure WebJob SDK uses the Storage Account you set up to balance the work between the various instances that are running your Jobs. You can gain finer control by using a Queue to divide up the work that needs doing and then scale your App Service Plan based on the Queue length.
Here's a rough picture of that architecture:

Resources