OpenShift autoscaling for an offline (non-HTTP) application

I'd like to use OpenShift to host an application that consumes data from a queue and writes it to a database, so it won't receive HTTP requests. Is there a way to scale it up automatically? (To minimize the time data spends in the queue.)

Unless you are going to use a paid plan, this application would get idled after 48 hours. You can use the rhc command-line tool to scale your application up and down manually.

How to send 50,000 HTTP requests in a few seconds?

I want to create a load test for a feature of my app. The app uses Google App Engine and a VM. Users send HTTP requests to the App Engine service, and it is realistic for it to receive thousands of requests within a few seconds. So I want to create a load test that sends 20,000-50,000 requests in a timeframe of 1-10 seconds.
How would you solve this problem?
I started by trying Google Cloud Tasks, because it seems perfect for this: you schedule HTTP requests for a specific point in time. The docs say there is a limit of 500 tasks per second per queue, and that if you need more tasks per second you can split the tasks across multiple queues. I did this, but Google Cloud Tasks does not execute all the scheduled tasks at the given time. One queue needs 2-5 minutes to execute 500 requests that are all scheduled for the same second.
I also tried a TypeScript script running asynchronous node-fetch requests, but 5,000 requests take 77 seconds on my MacBook.
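For what it's worth, a 77-second result for 5,000 requests usually points at limited concurrency rather than raw machine speed. Below is a rough do-it-yourself sketch of the same idea in Python with asyncio and aiohttp (the URL, request count, and concurrency cap are placeholders); even then, a single laptop and its network link remain the limiting factor for 50,000 requests in a few seconds, which is where the answers below come in.

```python
# Fire many HTTP requests concurrently and count the successes.
# URL, TOTAL_REQUESTS, and CONCURRENCY are illustrative placeholders.
import asyncio
import aiohttp

URL = "https://example.com/endpoint"   # placeholder target
TOTAL_REQUESTS = 5_000
CONCURRENCY = 500                      # simultaneous connections to keep open

async def fetch(session: aiohttp.ClientSession) -> int:
    async with session.get(URL) as resp:
        await resp.read()
        return resp.status

async def main() -> None:
    connector = aiohttp.TCPConnector(limit=CONCURRENCY)
    async with aiohttp.ClientSession(connector=connector) as session:
        results = await asyncio.gather(
            *(fetch(session) for _ in range(TOTAL_REQUESTS)),
            return_exceptions=True,
        )
    ok = sum(1 for r in results if r == 200)
    print(f"{ok}/{TOTAL_REQUESTS} requests returned HTTP 200")

if __name__ == "__main__":
    asyncio.run(main())
```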
I don't think you can send 50,000 HTTP requests "in a few seconds" from your MacBook; it's better to use a dedicated load-testing tool (which can be deployed onto a GCP virtual machine in order to minimize network latency and traffic costs).
The choice of tool is up to you. Either you need a machine type powerful enough to send 50k requests "in a few seconds" from a single virtual machine, or the tool needs to support a clustered mode so you can kick off several machines that send their requests together at the same moment.
Given that you mention TypeScript, you might want to try the k6 tool (it doesn't run in clustered mode, though) or check out Open Source Load Testing Tools: Which One Should You Use? to see what the other options are; none of them provides a JavaScript API, but several don't require any programming knowledge at all.
A tool you could consider using is siege.
It is Linux-based, and running it inside GCP avoids the additional cost of testing from a system outside GCP.
You could deploy siege on a relatively large machine or a few machines inside GCP.
It is fairly simple to set up, but since you mention that you need 20-50k requests in a span of a few seconds, note that siege by default only allows 255 concurrent connections. You can raise that limit, though, so it can fit your needs.
You would need to experiment with how many connections a machine can establish, since each machine has a limit based on CPU, memory, and the number of network sockets. You could keep increasing the -c number until the machine gives an "Error: system resources exhausted" error or something similar. Experiment with what your virtual machine on GCP can handle.

uWSGI and Flask: keep objects in memory between requests

My stack is currently uWSGI, Flask, and nginx. I need to store data between requests: basically, the server receives push notifications about events from another service, and I want to keep those events in the server's memory so a client can poll the server every n milliseconds to receive the latest updates.
Normally this would not work, for many reasons. One is that a good production deployment runs several uWSGI processes (and maybe even several machines to scale out). But my case is very specific: I'm building a web app for a piece of hardware (think of your home router's configuration page as a good example), so there is no need to scale. I also do not have a database (at least not a traditional one), and there are normally only 1-2 clients connected simultaneously.
If I specify --processes 1 --threads 4 in uWSGI, is this enough to ensure the data is kept in memory as a single instance, or do I also need to use --threads 1?
I'm also aware that some web servers clear memory and restart the hosted app from time to time. Do nginx/uWSGI do that, and where can I read about the rules?
I'd also welcome advice on how to design all of this if there are better ways to handle it. Please note that I am not considering any persistent storage for this; it is not worth the effort and may even be impossible due to hardware limitations.
Just to clarify: when I talk about one instance of the data, I mean my app.py executing exactly once and keeping the objects defined there alive for as long as the server lives.
If you don't need the data to persist past a server restart, why not just build a cache object into your application that can do push and pop operations?
A simple list of objects should suffice: one Flask route pushes new data onto the list and another pops the data off it.
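As a rough sketch of that idea (assuming uWSGI is started with --processes 1 so all threads share one interpreter; the route names and buffer size below are made up for illustration):

```python
# app.py - minimal in-process event buffer shared by every thread of a single
# uWSGI process (e.g. run with: uwsgi --processes 1 --threads 4 ...).
from collections import deque
from threading import Lock

from flask import Flask, jsonify, request

app = Flask(__name__)

events = deque(maxlen=1000)   # bounded buffer so memory can't grow forever
events_lock = Lock()          # keeps the drain atomic across threads

@app.route("/events", methods=["POST"])
def push_event():
    """Called by the push-notification source; appends one event."""
    with events_lock:
        events.append(request.get_json())
    return "", 204

@app.route("/events", methods=["GET"])
def pop_events():
    """Called by the polling client; returns and clears everything buffered."""
    with events_lock:
        drained = list(events)
        events.clear()
    return jsonify(drained)
```

With a single process the buffer exists exactly once; the lock is only there so that --threads 4 can't interleave the drain.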

How to change the idle timeout of an ASP.NET MVC app on my shared hosting provider?

I have a site on a shared hosting provider. The site times out when idle and can take up to 40 seconds to start up again, so I want to increase the idle timeout. Under Manage - Dedicated IIS Application Pool, the idle timeout is set to 5 minutes and I want to increase it. I called my provider and they said I am unable to change the settings with a shared hosting account. Is there another way, such as the web.config file, to increase the timeout?
The app pool is likely being recycled. There's nothing you can do about this on a shared hosting service. What you can do is send requests to the web server every N minutes. If GoDaddy recycles the app pool after 5 minutes of idle time, then send a request to your website every 4 minutes. Each request resets the idle timer, so the pool should not be recycled unless recycling is explicitly triggered (or the host has some other recycle rule in place).
Optionally, you can use a monitoring service that pings and reports on your server. Here are two that may be of use to you: https://uptimerobot.com/ https://www.pingdom.com/
Uptime services that I've tried didn't actually send HTTP requests to the web app, so they didn't help in keeping it alive. However, I found a nice feature in Application Insights called Availability which lets you create recurring tests that actually send GET requests to your website, thus preventing it from recycling.
I explain it more in my blog post here.
Open Task Scheduler on your computer.
Create a scheduled task to run every 4 minutes.
Have it run the PowerShell command Invoke-WebRequest against your site.
It'll have the effect of someone visiting your site every 4 minutes.
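If you'd rather not rely on Task Scheduler, any always-on machine can do the same job with a small script; here is a sketch (the URL and interval are placeholders):

```python
# Keep-alive pinger: request the site every 4 minutes so the 5-minute idle
# timer on the shared host never expires. URL is a placeholder.
import time
import urllib.request

URL = "https://www.example.com/"
INTERVAL_SECONDS = 4 * 60

while True:
    try:
        with urllib.request.urlopen(URL, timeout=30) as response:
            print("ping ok:", response.status)
    except Exception as exc:
        print("ping failed:", exc)
    time.sleep(INTERVAL_SECONDS)
```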

Load balance background tasks in Azure Web App

I am developing an ASP.NET application that will be hosted as an Azure web app. Part of the app will continuously record multiple web-based cameras by retrieving a snapshot every N seconds. I would like to design the app so that the processes that record the cameras can be run on multiple instances. I would like it to load balance between all instances, but not duplicate effort for any one camera.
For example, if I have 100 cameras, and am running on 2 instances, I want each instance to get 50 cameras to process. If I have 5 instances, each instance should get 20 cameras to process. As I add cameras or scale instances up/down I would like for the system to load balance the work evenly.
If it's feasible, I would rather not spin up dedicated VMs just for processing cameras, due to increased cost.
I'm somewhat familiar with Akka.NET, Hangfire, and WebJobs, but am unclear if these will help in this scenario. I have used Hangfire and WebJobs to do background processing, but not with this sort of load-balancing requirement. Will these or some other framework or tool help me load balance these background tasks evenly across Azure Web App Instances? How should I go about setting up these or another framework to do this?
I honestly don't think you want to try to "balance" the servers; you just want to make sure the work is well distributed. If I were you, I would use a queue system (SQS on AWS, or a Storage Queue or Service Bus queue on Azure) to queue up all of the cameras that need a snapshot and let each worker instance dequeue one at a time and process it.
A good approach could be to have a master server responsible for queueing up the snapshots, and then have all of your worker servers simply work out of this shared queue. Even if one server happens to process more than the others, that is fine, since they are all working out of the same queue; it just means that this server was able to process its jobs more quickly than the others.
To be honest, there are a lot of ways to approach this. You could do something as simple as having a shared list of your cameras, with a timestamp for the last snapshot, and work off of that: each server would look at the list, find a camera whose timestamp is stale, update the timestamp, and take the snapshot. The downside of something like this is that you will struggle with non-atomic operations and the possibility of multiple workers grabbing the same camera at the same time. These are the kinds of things a queue system helps with, because as soon as a queue item is in flight it is no longer visible to other workers, and because each worker deletes its item only once it is finished, a server that crashes mid-snapshot simply lets that work go back onto the queue.
No matter which solution you choose, it is going to boil down to having a central system/list for serving up stale cameras.
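As a sketch of that shared-queue approach on Azure (using the azure-storage-queue Python package here for illustration; the queue name, connection string variable, and take_snapshot callback are placeholders):

```python
# Producer/worker sketch over an Azure Storage Queue. A scheduler enqueues the
# camera IDs that are due; every web app instance runs the worker loop, claims
# a message, and deletes it only after the snapshot succeeds, so a crashed
# worker's message becomes visible again after the visibility timeout.
import os
import time

from azure.storage.queue import QueueClient

queue = QueueClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"], "camera-snapshots"
)

def enqueue_due_cameras(camera_ids):
    """Run centrally (or on a timer) to hand out work."""
    for camera_id in camera_ids:
        queue.send_message(camera_id)

def worker_loop(take_snapshot):
    """Run on every instance; take_snapshot(camera_id) is the app's own code."""
    while True:
        for msg in queue.receive_messages(messages_per_page=16, visibility_timeout=60):
            take_snapshot(msg.content)   # msg.content holds the camera ID
            queue.delete_message(msg)    # acknowledge only after success
        time.sleep(1)
```

Because a claimed message is invisible to everyone else until it is deleted or times out, the cameras end up spread across however many instances happen to be running.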
The Azure WebJobs SDK uses the storage account you set up to balance the work between the various instances that are running your jobs. You can gain finer control by using a queue to divide up the work that needs doing and then scaling your App Service Plan based on the queue length.

Launching ad hoc Docker containers: is it recommended to launch a Docker container per request?

Is it recommended to launch a Docker container per request?
I have either lighttpd or nginx running on my web server as a reverse proxy. I support a number of subdomains with very low usage. When a request for a subdomain arrives, I want to start the corresponding Docker container. Preferably I'd like to launch them dynamically, so that if more than one user arrives I would launch one container per user, and/or a shared container (determined by configuration).
Originally I said this should work well for low traffic sites, but upon further thought, no, this is a bad idea.
Each time you launch a Docker container, it adds a read-write layer on top of the image. Even if very little data is written, the layer exists, and each request will generate one. When a single user visits a website, rendering the page generates tens to thousands of requests for CSS, JavaScript, images, fonts, and AJAX calls, and each of these would create one of those read-write layers.
Right now there is no automatic cleanup of the read-write layers; they persist even after the Docker container has exited, and by default nothing is removed.
So, even for a single low-traffic site, you would find your disk usage growing steadily over time. You could add your own automated cleanup.
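If you do go down this road anyway, the leftover layers are avoidable by removing each container when it exits. Here is a sketch with the Docker SDK for Python (the docker package; the image name and command are placeholders), where remove=True is the API counterpart of docker run --rm:

```python
# Launch a short-lived container per request and clean up after it.
# remove=True is the SDK equivalent of `docker run --rm`, so the container's
# read-write layer does not pile up on disk. Image and command are placeholders.
import docker

client = docker.from_env()

def handle_request(payload: str) -> str:
    output = client.containers.run(
        "my-subdomain-image:latest",   # placeholder image name
        ["process-request", payload],  # placeholder command inside the image
        remove=True,                   # delete the container (and its layer) on exit
    )
    return output.decode()
```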
Then there is the second problem: anything uploaded to the website would not be available to any other requests unless it was written to some out-of-container shared storage. That's pretty easy to do with S3 or a separate and persistent database service, but it does start showing the weakness in the "one new Docker container per request" approach. If you're going to have some persistent services, why not make the Docker containers more persistent and run them longer?
