Why would bulk inserts cause an ASP.NET application to become unresponsive?

Setup: ASP.net 3.5, Linq-to-Sql. Separate Web and DB servers (each 8-core, 8GB RAM). 4 databases. I am running an insert operation with a few million records into DB4 (using Linq-to-Sql for now, though I might switch to SqlBulkCopy). Logging shows that records are being put in consistently at a rate of 600-700 per second (I am running DataContext.SubmitChanges() every 1000 records to keep the transaction size down). The insert is run during one Http Request (timeout is set pretty high).
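A minimal sketch of that batching pattern (MyDataContext, the Record entity, its fields and sourceRows are hypothetical stand-ins, not the actual code):

    // Sketch of the batching pattern: flush every 1000 records to keep
    // transactions small. MyDataContext, Record and sourceRows are hypothetical.
    using (var db = new MyDataContext(connectionString))
    {
        int count = 0;
        foreach (var source in sourceRows)
        {
            db.Records.InsertOnSubmit(new Record { Name = source.Name, Value = source.Value });

            if (++count % 1000 == 0)
                db.SubmitChanges();   // commit this batch of 1000 inserts
        }
        db.SubmitChanges();           // flush the final partial batch
    }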
The problem is that while this insert operation is running, the web application becomes completely unresponsive (both within different browser windows on my machine, and on other browsers in remote locations).
This insert operation is touching one table in DB4. Most pages will only touch DB1 (so I don't think that it is a locking issue - I also checked in through Management Studio, and no objects are being locked unnecessarily). I have checked out performance stats on both the Web and DB servers, and while they may spike from time to time, throughout the inserts they stay well within the "green".
Any idea about what can be causing the app to become unresponsive or suggestions about things that I should do in order to narrow down the issue?
Responses to suggestions:
Suggestion that inserts are using all DB connections: the inserts are being done off of a different connection string (and DB) than what other pages in the app use. Also, I checked in SSMS, and there is just one connection open for DB4, and one open for DB1 (so it doesn't look like it is running out of connections).
Suggestion that inserts are maxing out CPU on web server: this is the only application on the server (and less than 5 users at any one time). Performance monitor shows CPU staying in between 12%-20%. Memory is hardly being touched.

My first guess would be that you are using up available connections to the database with the insert operations that you are doing and the web applications are waiting to get a connection to the database.
You have a few options.
Look in SSMS and see what you have for open and active connections under regular load, then look again while the inserts are running, and see if that is the problem.
Use a profiling tool such as ANTS profiler to see what is going on with the web application at the time of the slow down, it might help pinpoint the issue.
You could also try manually executing the queries that the web application is using, on the SQL Server and see if you notice a similar behavior.
The other option is a bit less likely, but it could be that the web application doing the bulk insert is taking all of the CPU time from the other web applications on the server, preventing them from responding. If you haven't done so already, split the application out into its own application pool so you can monitor its load.

I don't know about Linq-to-Sql, but NHibernate specifically states that using it for bulk inserts is a bad idea. I have found Array Binding in ADO.NET to be very fast. Here is an article explaining how to do it with Oracle, but it should work with other providers too.
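With ODP.NET (Oracle.DataAccess.Client) the idea looks roughly like this (a sketch; the table, columns and connection string are made up, and other providers expose array binding differently):

    // Sketch of ADO.NET array binding using ODP.NET (Oracle.DataAccess.Client).
    // Table, columns and connection string are hypothetical.
    using Oracle.DataAccess.Client;

    static void InsertMany(string connectionString, int[] ids, string[] names)
    {
        using (var conn = new OracleConnection(connectionString))
        using (var cmd = conn.CreateCommand())
        {
            conn.Open();
            cmd.BindByName = true;
            cmd.CommandText = "INSERT INTO my_table (id, name) VALUES (:id, :name)";
            cmd.ArrayBindCount = ids.Length;   // one execute sends every array element
            cmd.Parameters.Add("id", OracleDbType.Int32).Value = ids;
            cmd.Parameters.Add("name", OracleDbType.Varchar2).Value = names;
            cmd.ExecuteNonQuery();             // a single round trip inserts all rows
        }
    }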

It seems like a bad idea to do long-running operations in a web app (for example, your IIS server can recycle your application for next to no reason). Split your application into a Web App and a Service App, do the long operations in the Service App, and communicate between them via WCF over named pipes.
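A bare-bones sketch of that split (the contract, URI and service names are hypothetical):

    // Sketch: host the long-running import in a separate Service App and call it
    // from the Web App over a named pipe. Contract, URI and class names are hypothetical.
    using System;
    using System.ServiceModel;

    [ServiceContract]
    public interface IImportService
    {
        [OperationContract(IsOneWay = true)]
        void StartImport(string batchId);   // fire-and-forget from the web app
    }

    public class ImportService : IImportService
    {
        public void StartImport(string batchId)
        {
            // ...the multi-million-row insert runs here, outside IIS...
        }
    }

    class ServiceProgram
    {
        static void Main()
        {
            using (var host = new ServiceHost(typeof(ImportService),
                       new Uri("net.pipe://localhost/import")))
            {
                host.AddServiceEndpoint(typeof(IImportService),
                    new NetNamedPipeBinding(), "");
                host.Open();
                Console.ReadLine();   // keep the host process alive
            }
        }
    }

    // Web App side:
    // var factory = new ChannelFactory<IImportService>(
    //     new NetNamedPipeBinding(), "net.pipe://localhost/import");
    // factory.CreateChannel().StartImport("batch-42");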

Eventual Solution: I changed the data insertions from LinqToSql to use SqlBulkCopy via a DataTable. The first time I did this, I got an OutOfMemory exception when trying to build a DataTable with 2 million rows in memory. So I am now adding 50,000 rows at a time, loading them into the DB with SqlBulkCopy (batch size: 10,000), and then clearing the DataTable's Rows collection. I am now getting 2.1 million rows in 108 seconds (about 20,000 per second; the rate last night averaged 200 per second with L2S). With the increased data insertion performance, the app-wide unresponsiveness has gone away.
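A minimal sketch of that chunked SqlBulkCopy pattern (the destination table, columns, sourceRows and connectionString are hypothetical stand-ins):

    // Sketch of the chunked SqlBulkCopy load: build at most 50,000 rows in the
    // DataTable, push them with BatchSize = 10,000, then clear and repeat.
    using System.Data;
    using System.Data.SqlClient;

    var table = new DataTable();
    table.Columns.Add("Id", typeof(int));
    table.Columns.Add("Name", typeof(string));

    using (var bulk = new SqlBulkCopy(connectionString))
    {
        bulk.DestinationTableName = "dbo.TargetTable";
        bulk.BatchSize = 10000;       // rows sent to the server per batch
        bulk.BulkCopyTimeout = 0;     // no timeout for the long-running load

        foreach (var source in sourceRows)
        {
            table.Rows.Add(source.Id, source.Name);

            if (table.Rows.Count == 50000)
            {
                bulk.WriteToServer(table);
                table.Rows.Clear();   // keep memory usage flat
            }
        }

        if (table.Rows.Count > 0)
            bulk.WriteToServer(table);   // final partial chunk
    }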

It is possible that you have a lock statement somewhere in your web application that is blocking some important resource during the whole time you are loading your data into the DB.

Related

100% CPU Usage in ASP.Net

After deploying a new version of a hybrid asp.net web application, Framework 4.5.1, IIS 7.5, we immediately noticed that CPU usage was spiking to 100%.
I followed CPU spike debugging using DebugDiag as described in this article: http://www.iis.net/learn/troubleshoot/performance-issues/troubleshooting-high-cpu-in-an-iis-7x-application-pool
I now have my report, and every one of the threads identified as a high-CPU-usage problem looks like this, with varying thread numbers:
Thread 1576 - .SNIReadSyncOverAsync(SNI_ConnWrapper*, SNI_Packet**, Int32)
I'm guessing this means the culprit is a LINQ to SQL call. The application uses a lot of LINQ to SQL. Unfortunately the DebugDiag report gives no clue as to which LINQ to SQL call is causing the difficulty.
Is there any way to use the information in the DebugDiag report to identify the SQL Server calls that cause the high CPU usage?
We never did find an answer to the question. I was hoping for an answer that would tell us what we could add to the performance monitor data collection to see the actual SQL that was being passed by the threads that were spiking CPU.
Instead we ran SQL Server performance monitor, duly filtered to cover only traffic from the web application, for about a minute. We dumped all the data collected into a table, then examined statement start and end times to identify statements that were taking an inordinate amount of time. From this collection of sluggish statements we identified the SQL call that was spiking CPU.
Oddly enough, the SQL call (selecting the results of an Inline Table-Valued Function) takes 2-3 seconds to complete, but most of that time is taken by SQL Server resetting the connection (sp_reset_connection). The call itself returns in less than a millisecond, and when we execute the same function in SSMS using identical parameters it executes in less than a millisecond. However, this will be the topic of a separate question.

Where do I cache ASP.NET data to avoid Application state being reset?

I have an application that performs complex queries against what amounts to data organized in a "star schema". The gold-owner keeps adding new "axes" to perform searches on, with the result that performance becomes worse over time. Currently, the execution of a search operation, using a stored procedure to do all the work on the SQL server, takes about 2 seconds, which doesn't fit the gold-owner's desire to have the code be interactive (<0.1 sec response time). Looking at the SQL Server query analyzer, the search is IO-bound on 9 table scans of 100,000 records, and then doing brutal joins. Due to the nature of the queries I need to perform and the limitations of SQL, this cannot be improved.
In desperation, I've rewritten the query processor so that it sucks in the 100,000 records into a cache at application start, then perform the complex queries against the cached memory. Loading all the records from the database takes about 12 seconds. This expensive initial load is mitigated by my rewritten query processor. It now only needs to do a single scan through the records, and gives a response time of 0.02 seconds.
This good news is tainted by the gold-owner's discovery that the 12-second hit for populating the cache is being experienced every hour or so. I'm currently storing the data in the ASP.NET application state, as Application["FactTable"]. It seems the application state is being reset after the ASP.NET application is idle for longer than a dozen minutes or so.
If I move the 100,000 records into the ASP.NET application cache, will I be experiencing these evictions just as often, or can I rely on the data remaining in memory for the fast retrievals for longer periods of time? If the ASP.NET cache is also victim to application resets, what other mechanism should I use? A separate app domain hosting an instance of my database cache comes to mind, but I don't want to go down that route unless my other options are closed off.
I realise that you have a lot of data and processing, and that you must have tried a few things to speed this scenario up, but Application state, which is managed by IIS, will be volatile...
Have you thought of running your calculations etc. in another process, i.e. creating a Windows service that periodically runs the queries to organise your data and saves that "flat" data to a database cache? When the user requests the data, they just get the last DB-cached results... and you can then speed this up further by holding those results in Application state, which can simply refresh itself if it gets destroyed.
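Something along these lines could handle that refresh (a sketch; FactRow and LoadFactTable() are hypothetical placeholders for your row type and the query against the database cache table):

    // Sketch: lazily re-populate Application state whenever a restart has wiped it.
    // FactRow and LoadFactTable() are hypothetical placeholders.
    using System.Collections.Generic;
    using System.Web;

    public static class FactTableCache
    {
        public static List<FactRow> Get(HttpApplicationState app)
        {
            var rows = app["FactTable"] as List<FactRow>;
            if (rows != null)
                return rows;

            app.Lock();                        // stop two requests loading at once
            try
            {
                rows = app["FactTable"] as List<FactRow>;
                if (rows == null)
                {
                    rows = LoadFactTable();    // the expensive load from the DB cache table
                    app["FactTable"] = rows;
                }
                return rows;
            }
            finally
            {
                app.UnLock();
            }
        }
    }

Note that the ASP.NET Cache lives in the same application domain, so an application restart clears it just as it clears Application state; the lazy reload (or keeping the pre-computed data in the database, as above) is what actually covers the reset.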

ASP website runs slow when the number of users increases

I need some information from you. I have set Session.Timeout = 540 in the application. Does that affect my application's performance after some time? When the number of users increases it gets very slow; the response time is nearly more than 2 minutes even for a button click. This is hosted on a server in an application pool. I don't know much about application pools. If the session timeout is the problem, I will remove it. Please suggest a way to handle more users.
Job numbers, customer IDs and tasks come from one database. When the user clicks the Start button, the data is saved in another database. This needs to be faster for more users.
I think that you have some page(s) that do work that takes a long time, or that for some reason (or a bug) stay open for longer than usual.
Such a page keeps the session locked and holds the rest of the pages back from responding, because the session lock covers all pages.
Now, together with the increase of the timeout, this page locks everything, and that is where your response time of nearly 2 minutes comes from.
The solution is to locate the page that has the long-running problem and fix it, or make it faster by optimising the process; or, if that page really must run for a long time, disable the session for that one page (for example via its EnableSessionState page directive).
Related:
Web app blocked while processing another web app on sharing same session
What perfmon counters are useful for identifying ASP.NET bottlenecks?
Replacing ASP.Net's session entirely
Trying to make Web Method Asynchronous
Does ASP.NET Web Forms prevent a double click submission?
About the server
On the other hand, if your server suffers from hardware limitations or a bad setup, here is another answer with points that you need to check to make it faster.
Find out where the time is spent
Add a Stopwatch in the method behind the button click that you said takes "more than 2 minutes"; that way you can find which statement spends the most time (see the sketch below).
If it is a query on the DB that costs the time, check your SQL statement.
Are you using "SELECT Count(*)" instead of "SELECT Count(Id)"? The * is always slower. Also, don't use "SELECT * FROM ....".
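A minimal sketch of the Stopwatch idea (the handler and the two method calls are hypothetical stand-ins for your own code-behind):

    // Sketch: time each step inside the slow button-click handler to see where
    // the 2 minutes actually go. LoadJobs() and SaveToSecondDatabase() are
    // hypothetical stand-ins for your own calls.
    using System;
    using System.Diagnostics;

    protected void StartButton_Click(object sender, EventArgs e)
    {
        var sw = Stopwatch.StartNew();
        LoadJobs();
        Debug.WriteLine("LoadJobs: " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        SaveToSecondDatabase();
        Debug.WriteLine("Save: " + sw.ElapsedMilliseconds + " ms");
    }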
Use cache.
There are many ways to cache, both in ASPX pages and in your business layer.
The OutputCache directive is the easiest way.
Also, cache a page (for example a blog post) the first time a user visits it.
Did you use memory paging?
Be careful when doing paging on a GridView or other list. If you just set DataSource = xxx and call DataBind(), even with a PagedDataSource, this is likely in-memory paging, and it costs a lot of performance. Use stored procedures to do the paging instead (see the sketch below).
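A sketch of what calling a database-side paging procedure could look like (the procedure name and parameters are hypothetical; on SQL Server 2005+ the procedure itself would typically use ROW_NUMBER() to return only the requested page):

    // Sketch: ask the database for one page of rows instead of binding everything
    // and letting PagedDataSource page in memory. "dbo.GetItemsPage" is hypothetical.
    using System.Data;
    using System.Data.SqlClient;

    static DataTable GetPage(string connectionString, int pageIndex, int pageSize)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("dbo.GetItemsPage", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@PageIndex", pageIndex);
            cmd.Parameters.AddWithValue("@PageSize", pageSize);

            var page = new DataTable();
            new SqlDataAdapter(cmd).Fill(page);   // only pageSize rows come back
            return page;
        }
    }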
Check your server environment
Where did you deploy the website? Many ISPs limit bandwidth, IIS connection count, and CPU time per account.
If you have RDP access to your server, you can watch CPU and memory usage to see if they are high when many users come to your site. If the site is slow and neither CPU nor memory usage is high, it may be a network bandwidth problem.
Here are some simple steps to narrow down the issue -
1) Get HTTPWatch (there's a free Basic version) and check what's really taking the time from an end-user perspective. Look at the number of requests, the number of resources downloaded, and the payload. If there is nothing to worry about, move on to the next step.
2) If it's not the client, then it's usually the processing time on the server. Jump onto the DB first, since this is easier to eliminate quickly. Look at how many DB calls are made (run the profiler in staging or dev) and see if there are any long-running queries or missing indexes or statistics, and note the IO. If all is well, move on.
3) Check your app code. You could use the built-in profiler in VS.NET or professional tools such as ANTS. If the code is fine then it's your network or the external calls that you make; check your network bandwidth. If you still cannot narrow it down, check your environment/hardware.
The best way to get to it is to apply load - you could use simple tools such as ab.exe (which comes as part of the Apache web server) to put concurrent hits on your server and run the app and DB profilers in the background to get to the issue.
Hope this helps!

Live Data Web Application Design

I'm about to begin designing the architecture of a personal project that has the following characteristics:
Essentially a "game" containing several concurrent users based on a sport.
Matches in this sport are simulated on a regular basis and their results stored in a database.
Users can view the details of a simulated match "live" when it is occurring as well as see results after they have occurred.
I developed a similar web application with a much smaller scope as the previous iteration of this project. In that case, however, I chose to go with SQLite as my DB provider since I also had a redistributable desktop application that could be used to manually simulate matches (and in fact that ran as a standalone simulator outside of the web application). My constraints have now shifted to be only a web application, so I don't have to worry about this additional level of complexity.
My main problem with my previous implementation was handling concurrent requests. I made the mistake of using one database (which was represented by a single file on disk) to power both the simulation aspect (which ran in a separate process on the server) and the web application. Hence, when users were accessing the website concurrently with a live simulation happening, there were all sorts of database access issues since it was getting locked by one process. I fixed this by implementing a cross-process mutex on database operations but this drastically slowed down the performance of the website.
The tools I will be using are:
ASP.NET for the web application.
SQL Server 2008 R2 for the database... probably with an NHibernate layer for object relational mapping.
My question is, how do I design this so I will achieve optimal efficiency as well as concurrent access? Obviously shifting to an actual DB server from a file will have its positives, but do I need to have two servers, one for the simulation process and one for the web server process?
Any suggestions would be appreciated!
Thanks.
You should be fine doing both on the same database. Concurrent access is what modern database engines are designed for. Concurrent reads are usually no problem at all; concurrent writes lock the minimum possible amount of data (a table, or even just a number of rows), not the entire database.
A few things you should keep in mind though:
Use transactions wisely. On the one hand, a transaction is an important tool in making sure your database is always consistent - in short, a transaction either happens completely, or not at all. On the other hand, two concurrent transactions can cause deadlocks, and those buggers can be extremely hard to debug.
Normalize, and use constraints to protect your data integrity. Enforcing foreign keys can save the day, even though it often leads to more cumbersome administration.
Minimize the amount of time spent on data access: don't keep connections around when you don't need them, make absolutely sure you're not leaking any connections, don't fetch data you know you don't need, and do as much data-related processing as possible (especially things that can be solved using joins, subqueries, groupings, views, etc.) in SQL instead of in code.
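For example, a tightly scoped connection and transaction might look like this (a sketch with a hypothetical table and SQL statement; NHibernate sessions follow the same principle):

    // Sketch: open the connection late, scope the transaction tightly, and let
    // the using-blocks return the connection to the pool immediately.
    using System.Data.SqlClient;

    static void RecordMatchResult(string connectionString, int matchId, int home, int away)
    {
        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();
            using (var tx = conn.BeginTransaction())
            using (var cmd = new SqlCommand(
                "UPDATE dbo.Matches SET HomeScore = @h, AwayScore = @a WHERE MatchId = @id",
                conn, tx))
            {
                cmd.Parameters.AddWithValue("@h", home);
                cmd.Parameters.AddWithValue("@a", away);
                cmd.Parameters.AddWithValue("@id", matchId);
                cmd.ExecuteNonQuery();
                tx.Commit();   // the update happens completely or not at all
            }
        }                      // connection is released back to the pool here
    }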

How many is too many databases on SQL Server?

I am working with an application where we store our client data in a separate SQL database for each client. So far this has worked great; there was even a case where some bad code selected the wrong customer IDs from the database, and since the only data in the database belonged to that client, the damage was not as bad as it could have been. My concern is about the number of databases you can realistically have on one SQL Server.
Is there any additional overhead for each new database you create? Will we eventually hit a wall where we have just too many databases on one server? The SQL Server specs say you can have something like 32,000 databases, but is that practical? Does anyone have a large number of databases on one server, and what problems did you encounter?
Thanks,
Frank
The upper limits are
disk space
memory
maintenance
Examples:
Rebuilding indexes for 32k databases? When?
If 10% of 32k databases each has an active set of 100 MB of data in memory at one time, you're already at 320 GB of target server memory
knowing what DB you're connected to
...
The effective limit depends on load, usage, database size etc.
Edit: and bandwidth, as Wyatt Barnett mentioned... I forgot about the network, the bottleneck everyone forgets about...
The biggest problem with all the multiple databases is keeping them all in synch as you make schema changes. As far as realistic number of databases you can have and have the system work well, as usual it depends. It depends on how powerful the server is and how large the databases are. Likely you would want to have multiple servers at some point not just because it will be faster for your clients but because it will put fewer clients at risk at one time if something happens to the server. At what point that is, only your company can decide. Certainly if you start getting a lot of time-outs another server might be indicated (or fixing your poor queries might also do it). Big clients will often pay a premium to be on a separate server, so consider that in your pricing. We had one client so paranoid about their data we had to have a separate server that was not even co-located with the other servers. They paid big bucks for that as we had to rent extra space.
ISPs routinely have one database server that is shared by hundreds or thousands of databases.
Architecturally, this is the right call in general. You've seen the first huge advantage--oftentimes, damage can be limited to a single client and you have near zero risk of a client getting into another client's data. But you are missing the other big advantage--you don't have to keep all the clients on the same database server. When you do get big enough that your server is suffering, you can offload clients onto another box entirely with minimal effort.
I'd also bet you'll run out of bandwidth to manage the databases before your server runs out of steam to handle more databases . . .
What you are really asking about is scalability. Setting up 32,000 databases on one server is possible, but it is probably not advantageous and not recommended.
Read - http://www.sql-server-performance.com/articles/clustering/massive_scalability_p1.aspx
I know this is an old thread, but it's the same structure we've had in place for the past 2 years, and we currently run 1,768 databases over 3 servers.
We have the following setup (not including mirrors and so on):
2 web farm servers and 4 content servers
A SQL instance just for a master database of customers, which is queried by customer ID when they access their web page, to get the server/instance and database name their data resides on. This is then stored in the authentication ticket.
3 SQL servers to host the customer databases, with load spread at creation time based on the current total number of learners across all databases on each server (quickly calculated from a licence-number field in the master database).
On each SQL server there is a smaller master database set up which contains shared static data that is used by all clients, therefore allowing smaller client databases and quicker updating of the content.
The biggest thing, as mentioned above, is keeping the database structures synchronised! For this I ended up programming a small .NET Windows Forms tool that looks up all customers in the master database; you paste in the SQL to execute, and it loops through, getting each database's location and executing the SQL you pasted (a rough sketch of the idea follows).
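The shape of that tool is roughly the following (a sketch only; the master-database schema, column names and connection-string format are assumptions, not the real implementation):

    // Sketch: look up every customer database in the master database, then run
    // the pasted SQL against each one. Schema and column names are hypothetical.
    using System.Collections.Generic;
    using System.Data.SqlClient;

    static void RunOnAllCustomerDatabases(string masterConnectionString, string sql)
    {
        var targets = new List<string>();

        using (var master = new SqlConnection(masterConnectionString))
        using (var lookup = new SqlCommand(
            "SELECT ServerName, DatabaseName FROM dbo.Customers", master))
        {
            master.Open();
            using (var reader = lookup.ExecuteReader())
            {
                while (reader.Read())
                {
                    targets.Add(string.Format(
                        "Server={0};Database={1};Integrated Security=SSPI",
                        reader.GetString(0), reader.GetString(1)));
                }
            }
        }

        foreach (var target in targets)
        {
            using (var conn = new SqlConnection(target))
            using (var apply = new SqlCommand(sql, conn))
            {
                conn.Open();
                apply.ExecuteNonQuery();   // apply the change to this customer DB
            }
        }
    }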
Creating new customers also caused some issues for us, so I ended up programming a management system for our sales people that creates a new database from a backup of an inactive "blank" database; that way we get the latest DB without needing to re-run the entire database creation script. It then inserts the customer details into the master database, with the location of where the database was created, and migrates any old data from an old version of our software. All this is done on a separate instance before moving, which reduces SQL locking.
We are now moving to a single database for our next version of the software, as database redundancy is near impossible with so many databases! This is a huge thing to consider, because SQL Server creates a couple of waiting tasks per database to mirror your data; once you start multiplying the databases it gets out of hand, the system becomes almost solely tasked with synchronising, and it can lock up due to the sheer number of threads. See page 30 of the Microsoft document below:
SQLCAT's Guide to High Availability Disaster Recovery.pdf
I do, however, have doubts about moving to a single database, due to some of the concerns mentioned above, such as having to check in every single procedure that the current customer can access only their own data, and the fact that one little issue (such as table indexing and so on) will now affect every single customer. Also, at the minute our customers are spread over 3 servers; the single database will mean we do have redundancy, but if the error is within the database rather than a server going down, then every single customer is down, not just one customer database.
All in all, it depends what you're doing and whether you want the redundancy; for me, the redundancy is now key, and everything else in a perfect world shouldn't happen (such as an error that causes problems within the database for everyone). We started off expecting only a hundred or so customers to move to the system from the old self-hosted software, and that quickly turned into 200, 500, 1000, 1500... We now have over 750,000 users using our system each year, and in August/September we have over 15,000 concurrent users online (expecting to hit 20,000 this year).
Hope that this is of help to someone along the line :-)
Regards
Liam
