Trade-off between workers and connections - R

I'm subscribed to shinyapps.io on the Basic plan, so I have up to 3 instances, up to 3 workers per instance, and up to 50 connections per worker.
I'm wondering what the difference is between:
2 workers with 5 connections each, and
1 worker with 10 connections.

This is from this help page on tuning Shiny apps:
When should you worry about tuning your applications? You should consider tuning your applications if:
Your application has several requests that are slow and you have enough concurrent usage that people's expectations for responsiveness aren't being met. For example, if some key calculation takes one second and you would like the average response time for your application to be less than two seconds, you will not want more than two concurrent requests per worker.
Possible Diagnosis: The application performance might be limited by R's single-threaded nature. Spreading the load across additional workers should alleviate the issue.
Remedy: Consider lowering the maximum number of connections per worker, and possibly increasing the maximum number of workers. Also consider adding additional Application Instances and aggressively scaling them by tweaking the Instance Load Factor to a lower percentage.
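To see why that rule of thumb holds: R is single-threaded, so concurrent requests on one worker queue up behind each other, and the average response time grows with the number of concurrent requests per worker. A back-of-the-envelope sketch of that arithmetic (an idealized FIFO model; the one-second calculation time is the help page's example):

```python
def avg_response_time(calc_seconds: float, concurrent: int) -> float:
    """Average response time with FIFO queueing on one single-threaded worker.

    With n concurrent requests each taking t seconds, request i completes
    after (i + 1) * t seconds, so the average is t * (n + 1) / 2.
    """
    return calc_seconds * (concurrent + 1) / 2

for n in (1, 2, 3, 5):
    print(f"{n} concurrent requests/worker -> avg ~{avg_response_time(1.0, n):.1f}s")
# 2 concurrent requests keep the average at ~1.5 s, under the 2 s target;
# at 3 concurrent requests the average already reaches 2 s.
```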
The answer to your question probably depends on your apps. If you have a relatively simple app with fast calculations and relatively few concurrent users, you probably won't notice a difference between your two scenarios. However, if you have complex apps as described in the help page, you might notice that having more workers (i.e., more individual R processes serving requests) improves the user experience.
In my experience, where I tend to have complex apps but with few (<10) concurrent users, I haven't noticed a difference from the limited tuning I've done.

R Shiny / shinyapps.io - Worker and Instance settings to maximize performance

Extra information relevant to this: https://shiny.rstudio.com/articles/scaling-and-tuning.html
I am trying to determine the best Worker and Instance settings for my Shiny app to make the user experience as smooth as possible. For the most part, traffic on the app is minimal, but there will be times when traffic is abnormally high, such as when it is being presented to a large audience (maybe 100+ users?).
First, based on personal experience as well as this SO question, I am setting "Max Connections" (the number of concurrent connections allowed per R worker process) to 1. This avoids some strange 'interactions' between connections that share the same R worker.
I have a Professional subscription to shinyapps.io, which means each app has a maximum of 10 instances. In the settings for the app, I have the ability to adjust several values to determine when new instances are launched, when new workers are added or shared, etc.
Consider two scenarios:
I set it up so there are as many instances as possible, with additional workers filling in only once every instance is running. With a max of 10 instances, the first 10 connections would each start their own instance, and the 11th connection would join one of those instances on a new worker.
I set it up to have as few instances as possible, adding workers until a new instance is needed. With a max of 10 workers per instance, the first connection starts the instance, the next 9 start workers within that instance, and the 11th starts a new instance and worker.
What are the pros and cons to using either of these methods?
Does one increase performance? Are they the same?
Does having more workers on an instance slow computational speeds?
Thanks!
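To make the two scenarios concrete, here is a toy sketch of where the first eleven connections land under each strategy (this is not shinyapps.io's actual scheduler, just an editorial illustration with one connection per worker, as in the question):

```python
# Toy model of the two scaling strategies: max 10 instances,
# max 10 workers per instance, 1 connection per worker.

MAX_INSTANCES = 10
MAX_WORKERS_PER_INSTANCE = 10

def assign(n_connections: int, spread_first: bool) -> list[int]:
    """Return the worker count per running instance after n connections."""
    instances: list[int] = []
    for _ in range(n_connections):
        if spread_first:
            # Scenario 1: prefer starting a new instance.
            if len(instances) < MAX_INSTANCES:
                instances.append(1)
            else:
                instances[instances.index(min(instances))] += 1
        else:
            # Scenario 2: pack workers onto existing instances first.
            packable = [i for i, w in enumerate(instances)
                        if w < MAX_WORKERS_PER_INSTANCE]
            if packable:
                instances[packable[0]] += 1
            else:
                instances.append(1)
    return instances

print(assign(11, spread_first=True))   # [2, 1, 1, 1, 1, 1, 1, 1, 1, 1]
print(assign(11, spread_first=False))  # [10, 1]
```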
I'm wondering the same thing, but setting "Max Connections" to 1 seems inefficient to me, since it can go up to 50, can't it?

Jelastic - vertical scaling

I have an application on Tomcat (no horizontal scaling), and I send it 20 requests. Each request executes complex calculations for 6-7 minutes.
I run the same thing on a standalone server with many cores (50+), where each request executes in a thread on a separate logical processor - execution time is 3-4 minutes.
On the Jelastic platform, it scales up to 40-42 cloudlets (even though the max is 92). Is there a way to "influence" scaling, to tell Jelastic that it should use more cloudlets? (I suppose there are fewer than 20 processors behind 42 cloudlets, and time is shared between threads.)
Also, if I send twice as many requests (40), it uses around 65 cloudlets (the max in that case is set to 160).
Can I find anywhere the rules for how Jelastic decides how to scale, and what to do to use more cloudlets?
The overall compute power of your Jelastic node is determined by the cloudlet scaling limit. If you have a compute-intensive workload, you should set this value as high as possible, keeping your budget constraints in mind. (You can also configure load alerts to notify you about persistent excessive usage.)
The cloudlet consumption (in terms of billing) is calculated as the average CPU usage per hour, so if you have a spike of a few minutes - as in your example - you will want to allow it to use as much CPU as it can.
The usage that you see displayed is a function of how your workload is spreading across cores, with potentially artificial limitations on the actual compute power of each core due to the cloudlet limits that you've set. To get the best performance, you have to allow a very high cloudlet limit, because this will give you full access to each core instead of an (effectively) limited clock speed per core.
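As a rough illustration of how the cloudlet limit translates into a CPU budget, assuming Jelastic's standard cloudlet definition of 400 MHz of CPU (and 128 MiB of RAM) per cloudlet - actual figures can vary by provider:

```python
# Rough CPU-budget arithmetic for a Jelastic cloudlet limit. Assumes the
# standard definition of 400 MHz CPU per cloudlet; providers can differ.
MHZ_PER_CLOUDLET = 400

def cpu_budget_ghz(cloudlet_limit: int) -> float:
    return cloudlet_limit * MHZ_PER_CLOUDLET / 1000

for limit in (42, 92, 160):
    print(f"{limit:>3} cloudlets -> ~{cpu_budget_ghz(limit):.1f} GHz total")
# e.g. 20 threads under a 92-cloudlet limit share ~36.8 GHz, i.e. roughly
# 1.8 GHz per thread on average - less than a full dedicated core.
```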
What you set as reserved cloudlets doesn't matter (it's purely a billing construct, with no technical implications), and no cloudlets are ever added to or removed from your server - again, cloudlet consumption is a billing measurement, not a technical mechanism. This means there is no way to instruct Jelastic to use more cloudlets: the consumption you see is purely a result of how your code executes (most likely the way it spreads across cores, given the overall cloudlet limit you've set versus the compute power of the underlying hardware).
There is also a possibility that your performance will be different across different Jelastic providers depending on their hardware choices. For example some will use a high number of cores with lower clock speed, others may prioritise clock speed over the number of cores. Your application performance in that case will depend on how effectively it can be parallelised across cores (multi-threading); if your most intensive chunk of work doesn't parallelise, it will work best with a higher clock speed. Given that you get increased cloudlet consumption with more requests, it sounds like this is at least partially responsible in your case.
You might also consider that if you have better performance from dedicated hardware, you can get Jelastic as a private or hybrid cloud option. They may also offer different hardware options in different regions to cater for different types of workloads. Talk to your Jelastic provider and see what they can offer you.

How to handle scaling when requests per minute go from 500 to 5000 instantly

I have an application that spikes from 500 rpm to 5000 rpm and stays there for 20-30 minutes. I know that's not a ton of requests, but it's the magnitude of the jump that is killing me. AWS EC2 takes 5 minutes to scale up, so that's not helpful when things move so fast. Maybe multiple DBs that handle different pieces of the application?
How would you go about analyzing this and thinking about infrastructure if you will always go from 500 to 5000 rpm or higher within one minute?
This is the graph from my AWS logs:
If you can predict when demand will increase, you can automate provisioning of new instances ahead of time. If you can't, then you need to do proper capacity planning: how many servers/containers do you need running to sustain the load with an acceptable user experience? Determining this will be key.
You should also look at implementing asynchronous messaging patterns that absorb the spike, although this may come with some performance degradation.
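As a sketch of the idea: a bounded queue in front of a fixed worker pool absorbs the burst so it can be drained at a steady rate. This minimal in-process version is illustrative only; in practice you would use a managed broker (e.g., SQS) so the buffer survives instance churn:

```python
# Minimal producer/consumer sketch: a bounded queue absorbs a burst of
# requests so a fixed pool of workers can drain it at a steady rate.
import queue
import threading
import time

jobs = queue.Queue(maxsize=10_000)   # the buffer that absorbs the burst

def worker() -> None:
    while True:
        job = jobs.get()
        time.sleep(0.01)             # stand-in for the real work
        jobs.task_done()

# A fixed-size pool drains the queue at a predictable rate.
for _ in range(8):
    threading.Thread(target=worker, daemon=True).start()

# Simulate the 10x spike: enqueue the burst instead of rejecting it.
for i in range(5_000):
    jobs.put(i)                      # blocks only if the buffer is full

jobs.join()   # callers trade latency for availability during the spike
print("burst drained")
```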
One additional consideration would be moving to a serverless architecture like AWS Lambda. This likely wouldn't fully solve the problem, but it would give you more ability to quickly provision infrastructure on demand.

asp.net mvc 2 and .net 4 parallelism

How can an MVC website benefit from the new parallelism features in .NET 4? Don't websites support parallelism by default, since multiple users access them at the same time? Can someone clarify this?
Executing tasks in parallel is especially useful for long-running tasks. What constitutes a long-running task might differ, but it should be longer than the overhead of spinning up and synchronizing the threads.
So there is no particular benefit for MVC as such, but there is a general benefit for any request that requires more things to be run in parallel.
There's an article from 2007 in MSDN Magazine which outlines some performance aspects of the parallel library.
Example 1: a user hits a page which displays two different graphs, each calculated from a dataset. Executing the calculations in parallel will reduce the overall time to render the page. (Executing individual tasks in parallel.)
Example 2: you need to execute some function on a list of data, and you use Parallel.For to enumerate over the data and execute some code on it in parallel.
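The examples above are .NET (Parallel.For and friends). As an illustrative analogue only, here are the same two patterns sketched in Python; compute_graph is a hypothetical stand-in for the per-graph calculation:

```python
from concurrent.futures import ThreadPoolExecutor

def compute_graph(dataset):
    # Hypothetical stand-in for a real per-graph calculation.
    return sum(dataset)

with ThreadPoolExecutor() as pool:
    # Example 1: two independent calculations for one page, in parallel.
    fut_a = pool.submit(compute_graph, range(1_000_000))
    fut_b = pool.submit(compute_graph, range(2_000_000))
    graph_a, graph_b = fut_a.result(), fut_b.result()

    # Example 2: apply a function across a list of data (Parallel.For style).
    results = list(pool.map(compute_graph, [range(n) for n in (10, 100, 1000)]))

# Note: in CPython the GIL limits CPU-bound threads; for genuinely
# CPU-bound work, swap in ProcessPoolExecutor with the same interface.
```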
You should analyze your application and figure out which parts can be run in parallel, and then test with the new language features if it helps your application or not.
Parallelism happens at the processor level. Parallelism doesn't mean multiple users accessing the application; it means that, for one request, the app can split tasks across multiple processors to do the work. So multiple users accessing the app is a factor in how the server dishes out work, but it is not the core idea of parallelism.
Every request is handled in its own thread, so you get some degree of parallelism by default with ASP.NET.
When considering implementing any kind of parallelism it is important to test your code with and without parallelism. You have to make sure that the overhead of wrapping, distributing and executing each task is not greater than running the tasks in serial. It often is for most of your day-to-day loops. Parallelism is best used in computationally intensive tasks.

design considerations for a WCF service to be accessed 500k times/day

I've been tasked with creating a WCF service that will query a db and return a collection of composite types. Not a complex task in itself, but the service is going to be accessed by several web sites which in total average maybe 500,000 views a day.
Are there any special considerations I need to take into account when designing this?
Thanks!
No special problems on the development side.
Well-designed WCF services can serve thousands of requests per second. Here's a benchmark for WCF showing 22,000 requests per second, using a blade system with 4x HP ProLiant BL460c blades, each with a single quad-core Xeon E5450 CPU. I haven't looked at the complexity or size of the messages being sent, but it sure seems that on a mainstream server from HP, you're going to be able to get 1,000 messages per second or more. And with good design, scale-out will just work. At that peak rate, 500k per day is not particularly stressful for the communications layer built on WCF.
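For a sense of scale, a quick back-of-the-envelope check of what 500,000 requests a day actually means (the 10x peak-to-average ratio is an assumption, not a measurement):

```python
# How demanding is 500,000 requests per day, really?
requests_per_day = 500_000
avg_rps = requests_per_day / (24 * 3600)
peak_rps = avg_rps * 10   # assumed 10x peak-to-average ratio (a guess)
print(f"average: {avg_rps:.1f} req/s, assumed peak: {peak_rps:.0f} req/s")
# ~5.8 req/s on average - orders of magnitude below the benchmark's 22,000.
```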
At the message volume you are working with, you do have to consider operational aspects.
Logging
Most system ops people who oversee WCF systems (and other .NET systems) that I have spoken to use an approach where, in the morning, they want to look at the basic vital signs of the system:
moving averages of request volume: 1min, 1hr, 1day.
comparison of those quantities with historical averages
error/exception rate: 1min, 1hr, 1day
comparison of those quantities with historical averages
If your exceptions are low enough in volume (in most cases they should be), you may wish to log every one of them into a special application event log or some other audit log. This requires some thought - planning for storage of the audits and so on. The reason it's tricky is that in some cases, highly exceptional conditions can lead to very high-volume logging, which exacerbates the exceptional conditions - a snowball effect. You definitely want some throttling on the exception logging to avoid this - a "pop-off valve", if you know what I mean.
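A minimal sketch of such a pop-off valve: cap the number of exceptions logged per time window and count what was suppressed instead (the thresholds here are illustrative):

```python
import time

class LogThrottle:
    """Allow at most `burst` log entries per `interval` seconds."""

    def __init__(self, burst: int = 100, interval: float = 60.0):
        self.burst, self.interval = burst, interval
        self.window_start = time.monotonic()
        self.count = 0
        self.suppressed = 0

    def allow(self) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.interval:
            if self.suppressed:
                print(f"({self.suppressed} log entries suppressed last window)")
            self.window_start, self.count, self.suppressed = now, 0, 0
        self.count += 1
        if self.count <= self.burst:
            return True
        self.suppressed += 1
        return False

throttle = LogThrottle(burst=3, interval=60.0)
for i in range(5):
    if throttle.allow():
        print(f"logged exception {i}")   # stand-in for the audit-log write
```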
Data store
And of course you need to ensure that the data source, whatever it is, can support the volume of queries you are throwing at it. As a matter of good citizenship - and to relieve load on the data store - you may want to implement caching on the service.
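A sketch of that caching idea: a tiny TTL cache in front of the data store, so repeated queries within the TTL are served from memory (the key format and TTL are illustrative):

```python
import time

class TTLCache:
    """Serve repeated queries from memory for `ttl_seconds` to spare the DB."""

    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self.store = {}   # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        hit = self.store.get(key)
        if hit and hit[0] > time.monotonic():
            return hit[1]                          # fresh cache hit
        value = loader(key)                        # fall through to the DB
        self.store[key] = (time.monotonic() + self.ttl, value)
        return value

cache = TTLCache(ttl_seconds=30)
# The loader lambda stands in for the real data-store query.
rows = cache.get_or_load("composites:latest", lambda k: f"db result for {k}")
```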
Network
With the benchmark I cited, the network was a pretty wide open gigabit ethernet. In your environment, the network may be shared, and you'll have to check that the additional load is reasonable.
