Why is writing from R to Azure SQL Server so slow? - r

I am trying to build out some data repositories in an Azure Managed Instance/SQL Server DB. I have been shocked by how slow the write process is from R/RStudio. As an example, it took 65 minutes to write a table to Azure and less than one minute to write the same table to SQL Server on my local machine.
It appears to write about 20 rows per second, regardless of the number of columns (if I refresh a count query within SSMS, it adds about 20 rows each time I run it).
I have read in other threads that this could be due to the performance tier. I saw things about the B, A, and P tiers, but the only information I see on tiers in our account is "General Purpose" and "Business Critical". We have General Purpose (Gen5) with 8 cores and 512 GB of storage, of which we are using less than 10%. While performing one of these write operations from R, the overall CPU utilization is less than 1%.
I am able to read tables from Azure back to R/RStudio quickly. Only writing is significantly hampered.
All of this makes it feel like it's going much slower than it should, as if there were a throttling effect or something. It is so slow that I cannot effectively get historical data there; I let several loads run last night and they all timed out.
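A write rate of roughly 20 rows per second over a WAN usually points at one INSERT (one network round trip) per row rather than at the database tier. Below is a minimal sketch of a batched write with DBI and the odbc package; the server, credentials, table name (`my_table`), data frame (`my_data_frame`), and the `batch_rows` argument (available in recent odbc releases) are placeholders and assumptions to adapt, not a confirmed fix.

```r
library(DBI)
library(odbc)

# Connection details are placeholders; substitute your own server, database,
# and credentials. "ODBC Driver 18 for SQL Server" is assumed to be installed.
con <- dbConnect(
  odbc::odbc(),
  Driver   = "ODBC Driver 18 for SQL Server",
  Server   = "your-instance.database.windows.net",
  Database = "your_db",
  UID      = "your_user",
  PWD      = Sys.getenv("AZURE_SQL_PWD"),
  Port     = 1433
)

# Writing the whole data frame in one call with a large batch size lets the
# driver send many rows per round trip instead of one INSERT per row.
# `batch_rows` is supported by recent versions of the odbc package.
dbWriteTable(
  con,
  name       = "my_table",      # hypothetical target table
  value      = my_data_frame,   # the data frame that took 65 minutes to load
  overwrite  = TRUE,
  batch_rows = 10000
)

dbDisconnect(con)
```

For very large historical backfills, writing the data frame to a flat file and loading it with a bulk tool such as bcp or Azure Data Factory generally beats any row-oriented insert path.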

Related

ASP.Net Core Performance on Azure vs Local

I have a website with a function that generates a report in Excel that is pretty much just a data dump, approx. 16,000 rows, using EPPlus. This report keeps timing out for the users on Azure. The timeout (524) is a Cloudflare limit that triggers if the request takes longer than 100 seconds.
I have optimised the hell out of it using Hashsets and Dictionaries, and it now runs in under 2 seconds on my laptop in Debug. I've also tried publishing with the target runtime as win-x64, in case it's a memory allocation issue.
Initially I thought the bottleneck would be memory. After setting up Application Insights, I can see that the CPU is at 100% while the memory is fairly low, about 300MB. I've bumped the Service Plan up to the P3V2 (14GB RAM & 840 ACU) to test if it's just a resource allocation issue. Even at that level, it takes about 50-60 seconds to produce. I can't run the app at that level, so I need to get it down much lower.
I'm not sure how else to optimise this, or identify the bottleneck. Any ideas?

Trade-off between workers and connections

I'm subscribed to shinyapps.io on the Basic plan, so I have 3 instances, up to 3 workers per instance, and up to 50 connections per worker.
I'm wondering what the difference is between:
2 workers with 5 connections each, and
1 worker with 10 connections.
This is from this help page on tuning Shiny apps:
When should you worry about tuning your applications? You should consider tuning your applications if:
Your application has several requests that are slow and you have enough concurrent usage that people’s expectations for responsiveness aren’t being met. For example, if your response time for some key calculations takes one second and you would like to make sure that the average response time for your application is less than two seconds, you will not want more than two concurrent requests per worker.
Possible Diagnosis: The application performance might be due to R’s single-threaded nature. Spreading the load across additional workers should alleviate the issue.
Remedy: Consider lowering the maximum number of connections per worker, and possibly increasing the maximum number of workers. Also consider adding additional Application Instances and aggressively scaling them by tweaking the Instance Load Factor to a lower percentage.
The answer to your question probably depends on your apps. If you have a relatively simple app with fast calculations and relatively few concurrent users, you probably won't notice a difference between your two scenarios. However, if you have complex apps as described in the help page, you might notice that having more workers (i.e., more separate R processes serving requests) improves the user experience.
In my experience, where I tend to have complex apps but with few (<10) concurrent users, I haven't noticed a difference from the limited tuning I've done. (The short sketch below works through the response-time arithmetic from the quoted guidance.)
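As a rough illustration of the quoted guidance, here is a back-of-the-envelope sketch. It assumes each request needs about one second of R compute and that a single-threaded worker handles queued requests one at a time; the numbers are hypothetical.

```r
request_time <- 1   # assumed seconds of R compute per request
target       <- 2   # desired response-time ceiling in seconds

# On a single-threaded worker, the last of k simultaneous requests waits for
# all k computations, so its response time is k * request_time.
worst_response <- function(k, request_time) k * request_time

worst_response(1:5, request_time)
#> [1] 1 2 3 4 5

worst_response(1:5, request_time) <= target
#> [1]  TRUE  TRUE FALSE FALSE FALSE   # more than 2 concurrent requests per worker misses the goal
```

This is the reasoning behind lowering connections per worker when individual requests are slow: splitting the same users across 2 workers with 5 connections each halves how many requests can queue behind a slow computation, compared with 1 worker with 10 connections.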

Query execution taking time in Presto with Pinot connector

We are using Apache Pinot as the source system and have loaded 10 GB of TPC-H data into Pinot. We are using Presto as the query execution engine, via the Pinot connector.
We are trying a simple configuration. Presto is installed on a CentOS machine with 8 CPUs and 64 GB RAM, with only one worker instance running alongside an embedded coordinator. Pinot is installed on a CentOS machine with 4 CPUs and 64 GB RAM, running one controller, one broker, one server, and one ZooKeeper instance.
Running a query on the Lineitem table involving a GROUP BY ROLLUP takes 23 seconds. Around 20 of those seconds are spent transferring 2.3 GB of data from Pinot to Presto.
Another query, involving a join between Lineitem, Nation, Partsupp, and Region with a GROUP BY CUBE, takes around 2 minutes. Data transfer takes around 25 seconds; most of the remaining time is spent on the join and aggregation computation.
Is this normal performance with Presto-Pinot?
If not, what am I missing?
Do I need to increase hardware? Increase the number of Presto/Pinot processes?
Are there any specific Presto properties I should consider modifying?
Thanks in advance for your help.
Please list the queries so that we can provide a better answer. At a high level, the Presto Pinot connector tries to push down most of the computation (filters, aggregations, group by) to Pinot and minimize the amount of data that needs to be pulled from Pinot.
There are always queries that require a full table scan, where the computation cannot be pushed to Pinot; query latency can be higher in such cases. Pinot recently added a streaming API that can improve the latency further.
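For illustration only, here is a hedged sketch of the two query shapes using R's RPresto package. The host, user, catalog name (`pinot`), and exact queries are assumptions, and whether a given aggregation is actually pushed down depends on the Presto and connector versions in use.

```r
library(DBI)
library(RPresto)

# Placeholder connection details for the single-node setup described above.
con <- dbConnect(
  RPresto::Presto(),
  host    = "http://presto-coordinator",
  port    = 8080,
  user    = "etl",
  catalog = "pinot",
  schema  = "default"
)

# A simple filter plus aggregation can usually be pushed down to Pinot, so only
# the handful of aggregated rows crosses the network.
dbGetQuery(con, "
  SELECT l_returnflag, SUM(l_extendedprice) AS revenue
  FROM lineitem
  WHERE l_returnflag = 'R'
  GROUP BY l_returnflag
")

# ROLLUP/CUBE and multi-table joins are evaluated inside Presto, so the
# connector first has to stream the raw lineitem rows out of Pinot; that
# transfer is where the ~20 seconds reported above is going.
dbGetQuery(con, "
  SELECT l_returnflag, l_linestatus, SUM(l_extendedprice) AS revenue
  FROM lineitem
  GROUP BY ROLLUP (l_returnflag, l_linestatus)
")

dbDisconnect(con)
```

If latency on the scan-heavy queries matters, pre-aggregating inside Pinot (for example with star-tree indexes) and using the streaming API mentioned above are usually more effective than simply adding Presto hardware.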

The Case of the Missing '14 second SQLite database' performance

I have developed a program which uses an SQLite 3.7 ... database; in it there is a rather extensive write/read module that imports, checks, and updates data. This process takes 14 seconds on my PC and I'm pleased as punch with the performance.
I use transactions for everything, with parameters. My PC is an Intel i7 with 18 GB of RAM. I have not set anything in the database. I used SQLite Expert to create the database and the data structures, including tables and columns, and checked that all indexes are created. In other words, it's all OK.
I have since deployed the program/database to 2 other machines. That 14-second process takes over 5 minutes on the other machines: same program, identical data, identical database. The machines are up to date; one is a 3rd-gen Intel i7 bought last week, and the other is quite fast as well, so hardware should not be an issue.
I'm just not understanding what the problem could be. Is it the database itself? I have not set anything on it other than encryption. Remember that I run the same thing and it takes the 14 seconds. Could it be that the database is 'optimised' for my PC, so when I give it to others it's not optimised?
I know I could turn off journaling to get better performance, but that would only speed up the process and would still leave the problem.
Any ideas would be welcome.
EDIT:
I have tested the program on my 7-year-old dual Athlon with 3 GB of RAM, running XP on an HDD, and the procedure took 35 seconds. That is well within tolerable limits, considering. I just don't get what could be making 2 modern machines take 5 minutes.
I have an idea that it's a write issue, as when using a reader they are slower but quite acceptable.
SQLite speed is affected most by how well the disk does random reads and writes; any SSD is much better at this than any rotating disk.
Whenever changes overflow the internal cache, they must be written to disk. You should use PRAGMA cache_size to increase the cache to more than the default 2 MB.
Changed data must be written to disk at the end of every transaction. Make sure that there are as many changes as possible in one transaction.
If much of your processing involves temporary tables or indexes, the speed is affected by the speed of the main disk. If your machines have enough RAM, you can force temporary data to RAM with PRAGMA temp_store.
You should enable Write-Ahead Logging.
Note: the default SQLite distribution does not have encryption.
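The original program isn't written in R, but as an illustration of the settings recommended above, here is a minimal sketch using DBI/RSQLite. The database file, table, and cache size are hypothetical placeholders; the same pragmas apply whatever language the program is written in.

```r
library(DBI)
library(RSQLite)

con <- dbConnect(RSQLite::SQLite(), "app_data.sqlite")  # placeholder file

# Write-ahead logging: readers no longer block writers and commits are cheaper.
dbGetQuery(con, "PRAGMA journal_mode = WAL")   # returns the mode now in effect

# Raise the page cache well above the ~2 MB default (negative value = KiB).
dbExecute(con, "PRAGMA cache_size = -100000")  # ~100 MB

# Keep temporary tables and indexes in RAM instead of on the main disk.
dbExecute(con, "PRAGMA temp_store = MEMORY")

# Hypothetical stand-in for the imported data.
dbExecute(con, "CREATE TABLE IF NOT EXISTS readings (id INTEGER, value REAL)")

# Group the whole import into one transaction so changes are flushed to disk
# once, not once per statement.
dbBegin(con)
dbExecute(con, "INSERT INTO readings (id, value) VALUES (:id, :value)",
          params = list(id = 1:10000, value = runif(10000)))
dbCommit(con)

dbDisconnect(con)
```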

Best way to determine the number of servers needed

How much traffic can one web server handle? What's the best way to see if we're beyond that?
I have an ASP.Net application that has a couple hundred users. Aspects of it are fairly processor-intensive, but thus far we have done fine with only one server to run both SQL Server and the site. It's running Windows Server 2003 on a 3.4 GHz CPU with 3.5 GB of RAM.
But lately I've started to notice slowdowns at various times, and I was wondering what's the best way to determine whether the server is overloaded by the application's usage or whether I need to do something to fix the application (I don't really want to spend a lot of time hunting down little optimizations if I'm just expecting too much from the box).
What you need is some info on capacity planning.
Capacity planning is the process of planning for growth and forecasting peak usage periods in order to meet system and application capacity requirements. It involves extensive performance testing to establish the application's resource utilization and transaction throughput under load. First, you measure the number of visitors the site currently receives and how much demand each user places on the server, and then you calculate the computing resources (CPU, RAM, disk space, and network bandwidth) that are necessary to support current and future usage levels.
If you have access to some profiling tools (such as those in the Team Suite edition of Visual Studio), you can try setting up a testing server and running some synthetic requests against it to see if there's any specific part of the code taking unreasonably long to run.
You should probably check some graphs of CPU and memory usage over time before doing this, to see if it can even be that. (A number akin to the UNIX "load average" could be a useful metric; I don't know if Windows has anything like it. Basically, it's the average number of threads that want CPU time in every time slice.)
Also check the obvious, that you aren't running out of bandwidth.
Measure, measure, measure. Rico Mariani always says this, and he's right.
Measure req/sec, RAM, CPU, Sessions, etc.
You may come up with a caching strategy (Output caching, data caching, caching dependencies, and so on.)
See also how your SQL Server is doing... indexes are a good place to start, but they're not the only thing to look at.
On that hardware, a .NET application should be able to serve about 200-400 requests per second. If you have only a few hundred users, I doubt you are seeing even 2 requests per second, so I think you have a lot of capacity on that box, even with SQL server running.
Without knowing all of the details, I would say no, you will not see any performance improvement by adding servers.
By the way, if you're not using the Output Cache, I would start there.
