Lock convoys in ASP.NET website

TL;DR:
We refactored for performance and the website got slower. I ran the Concurrency Visualizer, and the graph looks like the lock convoy pattern described on MSDN.
Context
I'm helping with refactoring an ASP.NET website to switch user controls from performing business logic on datasets to performing presentation logic on business objects, and also to reduce the number of database calls made from the user controls.
The issue
We noticed a significant performance drop (hangs/blocking) after introducing changes that we thought would be performance improvements in multiple areas.
We're using Lean Sentry to monitor our websites' performance. According to the hang diagnostics, the thread pool was running out of threads, and (according to the descriptions on the diagnostics page) when the GC runs it prevents more threads from being created. According to the memory diagnostics, the GC heap and Gen 0 were consuming a lot of memory (~9 GB).
What I have done so far
I used the memory profiler in Visual Studio and identified issues with our excessive DataAdapter and DataTable usage. Memory consumption dropped to 3 GB, but that only helped with the GC blocking. The site is still slower than it was before we introduced the changes, and under high load I still see blocking caused by functions like CompilationLock.GetLock() and BuildManager.GetBuildResultFromCacheInternal(). Googling them didn't return anything useful.
This website is compiled on the fly (JIT/dynamic compilation). I assumed the CompilationLock issue might be caused by that and wanted to run the website precompiled, but one of our global Utilities classes caused an ambiguity with some other Utilities class/namespace that I don't know of. I found out that there is a Microsoft.Build.Utilities namespace, but it isn't referenced in our website and I can't reproduce the ambiguity in my own environment when I reference Microsoft.Build myself, so I couldn't get the website running in precompiled mode on the staging server to test this theory.
I made additional changes to reduce memory allocation and the number of database calls, using Visual Studio's memory allocation and instrumentation profilers as a measure, but I didn't notice any improvement in performance.
I used a concurrency profiler to gather more information on thread utilization. I haven't used this tool before, so I'm not sure about my interpretation here. There are multiple threads on each handle, and on one handle I'm seeing 42% contention. The DataAdapter.Fill and SqlHelper.ExecuteReader methods show up most often when the view is set to "Show Just My Code", and WaitForSingleObjectExImplementation shows up most often when it's set to "Show All Code".
I encountered a SO question about ASP.NET websites' performance issues and set EnableSessionState="ReadOnly" for each page, but I didn't notice a difference with this change, either.
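For reference, that change is just the per-page directive setting shown below (a minimal example; the Language and CodeFile/Inherits attributes are illustrative, only EnableSessionState matters here):

```aspx
<%@ Page Language="VB" CodeFile="Product.aspx.vb" Inherits="Product"
    EnableSessionState="ReadOnly" %>
```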
The Concurrency Visualizer documentation and the "Common Patterns for Poorly-Behaved Multithreaded Applications" article helped me identify the issue. My graph doesn't look like serial execution, but I do see 80–90% synchronization, as shown in the Lock Convoys graph. I also checked out a SO question on debugging lock convoys.
Testing approach
I'm using Screaming Frog to crawl the website in order to reproduce the issues, and I'm taking the requests per second and response times reported by both Screaming Frog and Lean Sentry as a performance measure. It might not be the best approach, but the difference is noticeable and reproducible, and it's pretty much all I have at this point.
Architecture of the website
The website was originally coded in VB.NET for .NET Framework 1.0 about 10 years ago and upgraded to .NET Framework 4.6.1 by fixing some compatibility issues; there have been no architectural changes so far. There is a shared SqlHelper class, a collection of shared data access functions like ExecuteDataset or ExecuteDatareader that return a DataSet, a DataReader or a String value. These functions read the connection string information from the web.config file and create new SqlConnection, SqlDataAdapter, SqlDataReader and SqlCommand objects to perform the database operations. The Data Access Layer that consumes this shared class consists of classes for each module (shopping cart, category, product, etc.) that are instantiated in each user control, and their functions map to stored procedures in the database.
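For context, a rough C# sketch of the general shape such a helper usually has; this is illustrative only (not the actual VB.NET code), and the connection-string name is an assumption:

```csharp
using System.Configuration;
using System.Data;
using System.Data.SqlClient;

public static class SqlHelperSketch
{
    // Illustrative: the connection string key "Main" is assumed.
    private static readonly string ConnectionString =
        ConfigurationManager.ConnectionStrings["Main"].ConnectionString;

    // One new connection/command/adapter per call, as described above.
    public static DataSet ExecuteDataset(string storedProcedure, params SqlParameter[] parameters)
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(storedProcedure, connection))
        using (var adapter = new SqlDataAdapter(command))
        {
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.AddRange(parameters);

            var result = new DataSet();
            adapter.Fill(result); // Fill opens and closes the connection itself.
            return result;
        }
    }
}
```

The relevant point for the contention analysis above is that every call builds its own connection, command and adapter, and blocks the request thread while DataAdapter.Fill runs.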
The refactoring
We have introduced some new objects that are instantiated either inside the Page_Load of the related user control or inside the OnItemDataBound event of repeaters, and are then assigned to the public properties of child user controls, which have been refactored to use the object (see the sketch below). However, there are still other child user controls that need multiple data tables, so we decided to store one of the data tables in one of the objects and pass it to the related user controls by assigning it to their public properties.
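A stripped-down C# sketch of that wiring (all type, control and column names are hypothetical; it only illustrates the instantiate-in-OnItemDataBound-and-assign-to-properties pattern described above):

```csharp
using System.Data;
using System.Web.UI;
using System.Web.UI.WebControls;

// Hypothetical business object built from a data row.
public class ProductInfo
{
    public ProductInfo(DataRowView row) { Name = (string)row["Name"]; }
    public string Name { get; private set; }
}

// Hypothetical child user control that exposes public properties for its parent to set.
public partial class ProductSummary : UserControl
{
    public ProductInfo Product { get; set; }
    public DataTable PricingTable { get; set; }
}

// Hypothetical parent user control with a repeater of ProductSummary controls.
public partial class ProductList : UserControl
{
    public DataTable PricingData { get; set; } // set earlier, e.g. in Page_Load

    protected void ProductRepeater_ItemDataBound(object sender, RepeaterItemEventArgs e)
    {
        if (e.Item.ItemType != ListItemType.Item &&
            e.Item.ItemType != ListItemType.AlternatingItem)
            return;

        // One business object per item, handed to the child control via its public properties.
        var child = (ProductSummary)e.Item.FindControl("ProductSummaryCtl");
        child.Product = new ProductInfo((DataRowView)e.Item.DataItem);
        child.PricingTable = PricingData; // the shared DataTable passed down by property
    }
}
```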
I suspect we hurt performance by introducing these objects. Even though database calls and memory consumption seem to be reduced, I'm wondering if the objects are causing the threads to synchronize all the time.
The graph before any refactoring happened:
The graph after all the refactoring I mentioned was applied:
Will you help me identify the issue?

Your problem is rather complex. I think that you have two basic options to resolve your refactoring performance issues:
Revert the code to a point where all or much of the refactoring hadn't yet been done and where you had better performance than you are currently experiencing. Then proceed gradually with the addition of new classes for performance improvements. If a change does not improve performance, undo it and try something else.
Replace some / much of the newly added classes with versions that support the interfaces but lack the performance overhead. Do this selectively to isolate where the performance issues exist. Perhaps, the website has tapped into an unknown performance bug that was not triggered by prior implementations of the added classes.
I would favor option 1, though it may seem counterproductive. It is a bit like punting in U.S. football. Sure, it is nice to just drive down the field. But sometimes the dominant strategy is to punt, get the ball back and try to score on another drive.

Related

Steps for profiling performance of a website running on Azure

I am quite new to running websites in general. I am familiar with statistical profilers for desktop applications, but unsure how to even begin profiling a website as there are a lot of additional potential bottlenecks and I'm not sure what profilers are available for websites.
I have looked around and seen useful suggestions in other questions, but I am not sure they are a very complete solution. The main suggestions are Azure performance counters and the suggestions from this answer.
Summarizing, they are:
Use Firebug to determine rendering time and loading time separately, so one can tell whether one has a rendering issue or a server issue.
If server side:
Test a small static page, like a page with a single GIF. If that is slow, one has a CPU issue. Otherwise, one is probably IO bound or has problems with database performance.
One can use performance counters to check server aspects such as the following (a minimal reading sketch follows the list):
memory
garbage collection
tcp/ip issues
bytes sent / received
requests requested, queued, rejected
request wait time, processing time
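A minimal reading sketch for a few of the counters above (the category, counter and instance names are the standard ones on most web servers, but verify them in Performance Monitor on the target machine before relying on them):

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class CounterSnapshot
{
    static void Main()
    {
        var requestsPerSec = new PerformanceCounter(
            "ASP.NET Applications", "Requests/Sec", "__Total__", true);
        var requestsQueued  = new PerformanceCounter("ASP.NET", "Requests Queued", true);
        var availableMemory = new PerformanceCounter("Memory", "Available MBytes", true);

        // Rate counters return 0 on the first read; sample twice with a delay.
        requestsPerSec.NextValue();
        Thread.Sleep(1000);

        Console.WriteLine("Requests/sec:    {0:F1}", requestsPerSec.NextValue());
        Console.WriteLine("Requests queued: {0}", requestsQueued.NextValue());
        Console.WriteLine("Available MB:    {0}", availableMemory.NextValue());
    }
}
```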
From my naive standpoint, some things that seem to be missing from the list above are the sort of profiling one has for a traditional desktop application, i.e. what the stack looked like for what percentage of the time (in other words, which functions we were spending time in, and in what context). Another missing item is profiling database performance, which seems like it may be different on Azure than in a local environment, especially once one starts dealing with scaling. Another is time spent on requests to third-party services, though maybe that can be done with Azure performance counters(?).
I apologize for the naive nature of this question. What tools and aspects am I missing here to profile an Azure ASP.NET MVC website, and what changes would you make to the above list?
There are a lot of aspects to profiling a site, in terms of database calls, business logic, rendering a view, and even client-side performance (any jQuery that might run, for example).
Stack Overflow's MiniProfiler is one of the easiest things to get going: just install a NuGet package, add some JavaScript includes, and wrap whatever you want to test inside a using() block, and you'll see execution times (including LINQ to SQL and EF). You can even create steps if you want finer-grained timings of individual calls.
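For example, wrapping part of an MVC action in MiniProfiler steps looks roughly like this (a minimal sketch; the controller and the data-access method are made up):

```csharp
using System.Web.Mvc;
using StackExchange.Profiling;

public class ProductsController : Controller
{
    public ActionResult Index()
    {
        // MiniProfiler.Current is set up by the MiniProfiler startup code (e.g. in Global.asax).
        using (MiniProfiler.Current.Step("Load products"))
        {
            var products = LoadProductsFromDatabase(); // hypothetical data call

            using (MiniProfiler.Current.Step("Render view"))
            {
                return View(products);
            }
        }
    }

    private object[] LoadProductsFromDatabase()
    {
        return new object[0]; // placeholder for the real query
    }
}
```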
The nice thing about MiniProfiler is you can enable/disable based on the environment, which makes it suitable for running inside Azure (as opposed to say, the Visual Studio Profiler).
You can also look at Azure Performance Counters, which will give you an idea of system resources, but that isn't profiling in the sense that MiniProfiler is. It will, however, give you an idea of network latency and CPU and memory utilization.
Once you're satisfied there, you can use Chrome's Developer Tools to profile your application on the client side. It'll give you an idea of how well your JavaScript is doing, including CSS selectors and rendering.
Also worth noting, Visual Studio has a really good Profiler in some higher editions that can give you deep insights into your code. Time spent in methods, call counts, etc.
Between these four methods, you should be able to find most bottlenecks, especially for a first pass.

refactoring some legacy code - potential memory leaks

Due to some memory leaks, some memory is not released except when doing an iisreset.
I found some code where I have a class of properties and methods in which only 10% of the methods are specific to the class; the other 90% could be moved to another class.
How bad is that? Is this affecting my memory, given that I am instantiating this class for every user of the application?
It is an ASP.NET application on IIS 6.
I suppose if the methods require no instance state but require an instance to call (i.e. they are instance methods that never reference 'this'), they could be made into static methods and save you one object allocation. However, if you have to allocate the object anyway (as it sounds like you do, for the other "10% of functionality" you mention) that doesn't sound like it will fix the issue.
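To illustrate the idea with hypothetical names (only the instance-to-static change matters here):

```csharp
// Before: an instance is allocated just to call a method that uses no instance state.
public class OrderFormatter
{
    public string FormatTotal(decimal total)
    {
        return total.ToString("C"); // never touches 'this'
    }
}

// After: the same logic as a static method; callers no longer allocate an OrderFormatter.
public static class OrderFormatterHelper
{
    public static string FormatTotal(decimal total)
    {
        return total.ToString("C");
    }
}
```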
You should probably do an analysis of memory usage in your application using the debugger. Rico Mariani has a pretty good blog post about tracking down managed memory leaks here: http://blogs.msdn.com/b/ricom/archive/2004/12/10/279612.aspx. It is old but still relevant. (Note: if you happen to be using .NET 4.0, you'll need to do ".loadby sos clr" instead of ".loadby sos mscorwks" to load the SOS debugger extension in WinDBG.)

EF 4.1 N Tier ASP .Net confusion

I hope you can help me out with any advice, it'd be greatly appreciated.
I've been using EF 4.0 for a while now, using the following object context management technique: http://dotnetslackers.com/articles/ado_net/Managing-Entity-Framework-ObjectContext-lifespan-and-scope-in-n-layered-ASP-NET-applications.aspx. I have a fairly simple setup with a web project connecting to a BLL connecting to a DAL. The web and BLL layers reference the DAL entity objects. It's been functioning fine, but it seems very slow. It's an ASP.NET Web Forms application that uses an existing database model (i.e. not code-first) and points to a SQL Server 2005 DB.
Anyway, I'm now revisiting the architecture as we're getting complaints over screen to screen performance. I've done most of the UI enhancements possible but I think it's just the Save, Redirect and Load using EF that's the sluggish point now.
The site is a series of quote pages for car insurance. I'm now hoping to create the relevant objects in session, i.e. page 1 creates the quote object and populates fields, page 2 adds X additional drivers, page 3 adds claims/convictions, and then the objects are saved to the database. The users will be able to save and exit at any point in the quote process, so the save won't always be at the end. We also need to be able to load page 1's info and update it in memory when they click Next, and finally update the DB when they're ready to finish the quote.
At the moment, we're doing the retrieves and saves on the same page. How would you advise we move to storing in session with a final commit?
I've trawled through various MSDN pages and I'm having trouble putting it all together into the latest 'best practice' for 4.1 for this fairly simple application. I've looked at Julie Lerman's videos, but her n-tier example only did a simple retrieve and 'Add', suspiciously leaving out the Update section, which I suspect is not straightforward. Do you think I should use Self-Tracking Entities (I've read that people are having multiple issues with this, but maybe their architecture is more complex?) or some other way of storing the EF objects in session and making the changes?
Any help/ideas greatly appreciated.
Storing data in Session can be an effective method to minimise IO, but it's not a simple decision. You shouldn't put large amounts of data in Session State and we can't tell from your description what kind of volume of data you're talking about, or the number of concurrent active sessions.
There's also the question of your Session State provider. If you're using a single server, you'll probably use in-process Session State, which is quick, but if you have lots of users and lots of data, you can soon run into memory pressure issues.
If you're using a web farm, you'll have to use shared Session State, possibly using the SQL Server Session State provider, so you'll end up reading from and writing to a database on every interaction, which could be worse.
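For reference, the choice of provider is just configuration in web.config, roughly one of the following (the timeout and connection string are placeholders):

```xml
<!-- In-process: fast, but session data lives in the worker process's memory. -->
<sessionState mode="InProc" timeout="20" />

<!-- SQL Server provider for a web farm: session data is read and written on each request. -->
<sessionState mode="SQLServer"
              sqlConnectionString="Data Source=SESSIONDB;Integrated Security=SSPI"
              timeout="20" />
```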
However, step 1 is to make sure you understand the problem. Don't make assumptions about where your performance problems are and try to redesign those. Use profiling or instrumentation techniques to identify the real bottlenecks and concentrate your efforts on those.
You might be surprised as to where your problems lie. It may well simply be a database optimisation issue.

How do I track down sporadic ASP.NET performance problems in a production environment?

I've had sporadic performance problems with my website for a while now. 90% of the time the site is very fast, but occasionally it is just really, really slow. I mean 5-10 second load times kind of slow. I thought I had narrowed it down to the server I was on, so I migrated everything to a new dedicated server from a completely different web hosting company. But the problems continue.
I guess what I'm looking for is a good tool that'll help me track down the problem, because it's clearly not the hardware. I'd like to be able to log certain events in my ASP.NET code and have that same logger also track server performance/resources at the time. If I can then look back at the logs, I can see exactly what my website was doing at the time of extreme slowness.
Is there a .NET logging system that'll allow me to make calls into it with code while simultaneously tracking performance? What would you recommend?
Every intermittent performance problem I ever had turned out to be caused by something in the database.
You need to check out my blog post Unexplained-SQL-Server-Timeouts-and-Intermittent-Blocking. No, it's not caused by a heavy INSERT or UPDATE process like you would expect.
I would run a database trace for half a day. Yes, the trace has to be done in production, because the problem doesn't usually happen in a low-use environment.
Your trace log rows will have a "Duration" column showing how long each event took. You are looking for the long-running ones, and the ones before them that might be holding up the long-running ones. Once you find the pattern, you need to figure out how things are working.
IIS 7.0 has a built-in ETW tracing capability. ETW is the fastest, lowest-overhead logging available; it is built into the kernel. With respect to IIS, it can log every call. The best part of ETW is that you can include everything in the system and get a holistic picture of the application and the server. For example, you can include the registry, file system and context switching, and get call stacks along with durations.
Here is a basic overview of ETW and one specific to IIS; I also have a few posts on ETW.
I would start by monitoring ASP.NET-related performance counters. You could even add your own counters to your application if you wanted. Also, look at the number of w3wp.exe processes running at the time of the slowdown vs. normal, and look at their memory usage. It sounds to me like a memory leak that eventually results in a termination of the worker process, which of course fixes the problem, temporarily.
You don't provide specifics of what your application is doing in terms of the resources (database, networking, files) that it is using. In addition to the steps from the other posters, I would take a look at anything that is happening "out of process", such as:
Database connections
Files opened
Network shares accessed
...basically anything that is not happening in the ASP.NET process.
I would start off with the following list of items:
Turn on ASP.NET Health Monitoring to start getting some metrics and numbers.
Check the memory utilization on the server. Does recycling IIS periodically remove the issue (a memory leak?)?
ELMAH is a good tool to start looking at the exceptions. Also, go through the logs your application might be generating.
Then, I would look for anti-virus software running at a particular time, or long-running processes that might be slowing down the machine (a database backup schedule, etc.).
HTH.
Of course ultimately I just want to solve the intermittent slowness issues (and I'm not yet sure if I have). But in my initial question I was asking for a rather specific logger.
I never did find an answer for that, so I wrote my own stopwatch threshold logging. It's not quite as detailed as my initial idea, but it has the benefit of being very easy to apply globally to a web application.
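The poster doesn't share the implementation, but the general idea can be sketched as an IHttpModule that times every request and logs only the slow ones (the threshold, the module name and the log destination are assumptions):

```csharp
using System;
using System.Diagnostics;
using System.Web;

// Sketch of stopwatch threshold logging applied globally via an HTTP module.
// Register it in web.config under <system.webServer><modules> (or <httpModules> in classic mode).
public class SlowRequestLoggerModule : IHttpModule
{
    private static readonly TimeSpan Threshold = TimeSpan.FromSeconds(3); // assumed threshold

    public void Init(HttpApplication context)
    {
        context.BeginRequest += (sender, e) =>
        {
            var app = (HttpApplication)sender;
            app.Context.Items["RequestStopwatch"] = Stopwatch.StartNew();
        };

        context.EndRequest += (sender, e) =>
        {
            var app = (HttpApplication)sender;
            var stopwatch = app.Context.Items["RequestStopwatch"] as Stopwatch;
            if (stopwatch == null) return;

            stopwatch.Stop();
            if (stopwatch.Elapsed > Threshold)
            {
                // Replace with your logger of choice (log4net, ELMAH, EventLog, ...).
                Trace.WriteLine(string.Format("Slow request: {0} took {1} ms",
                    app.Context.Request.RawUrl, stopwatch.ElapsedMilliseconds));
            }
        };
    }

    public void Dispose() { }
}
```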
In my experience, performance-related issues are almost always IO related and rarely the CPU.
One way to get a gauge on where things are, without writing instrumentation code or installing software, is to use Performance Monitor in Windows to see where the time is being spent.
Another quick way to get a sense of where problems might be is to run a small load test locally on your machine while a code profiler (like the one built into VS) is attached to the process to tell you where all the time is going. I usually find a few "quick wins" with that approach.

Concurrency ASP.NET best-practices worst-practices

In which cases do you need to watch out for concurrency problems (and use lock, for instance) in ASP.NET?
Are there 'best practices' around on this topic?
Documentation?
Examples?
'worst practices...' or things you've seen that can cause a disaster...?
I'm curious about, for instance, singletons (even though they are considered bad practice - don't start a discussion on this), static functions (do you need to watch out here?), ...?
Since ASP.NET is a web framework and is mainly stateless, there are very few concurrency concerns that need to be addressed.
The only thing that I have ever had to deal with is managing the application cache, but this is easily done with a cache-management type that wraps the .NET caching mechanisms.
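The answer doesn't show that type, but a minimal sketch of such a wrapper around the ASP.NET cache might look like this (the double-checked lock prevents several threads from rebuilding the same expensive entry at once; the names and lifetime are assumptions):

```csharp
using System;
using System.Web;
using System.Web.Caching;

public static class CacheManager
{
    private static readonly object SyncRoot = new object();

    // Returns the cached value for 'key', building and caching it once if it is missing.
    public static T GetOrAdd<T>(string key, Func<T> valueFactory, TimeSpan lifetime) where T : class
    {
        var cached = HttpRuntime.Cache[key] as T;
        if (cached != null) return cached;

        lock (SyncRoot)
        {
            // Re-check inside the lock: another thread may have populated the entry already.
            cached = HttpRuntime.Cache[key] as T;
            if (cached != null) return cached;

            cached = valueFactory();
            HttpRuntime.Cache.Insert(key, cached, null,
                DateTime.UtcNow.Add(lifetime), Cache.NoSlidingExpiration);
            return cached;
        }
    }
}
```

A caller would then do something like CacheManager.GetOrAdd("categories", LoadCategories, TimeSpan.FromMinutes(10)) instead of touching the cache directly.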
One huge problem that caused us a lot of grief was using Modules vs. Classes in our main Web Service. This was before we really knew what we were doing and has since been fixed.
The big problem with using modules is that, by default, any module-level variables are effectively shared across every request handled by the ASP.NET worker process. We pass in multiple datasets, manipulate them, then return them to the client. Because we were using modules, the variables holding these datasets were getting corrupted by multiple calls occurring at the same time.
This was not caught in testing and was difficult to reproduce until we figured out how to properly load test our web services. It took something like 10-20 requests per second before we could reproduce it accurately.
In the end, we just changed all the modules to classes and used those classes instead of calls to the modules. This cleared up the concurrency issue, as each instantiated class had its own copy of the dataset in memory.
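A VB.NET Module compiles down to a type with only shared (static) members, so the failure mode is the same as in this C# sketch (hypothetical names; only the static-versus-instance difference matters):

```csharp
using System.Data;

// Problematic: the field is shared by every request in the worker process,
// so two concurrent calls overwrite each other's dataset.
public static class QuoteProcessorModule
{
    private static DataSet _currentData; // equivalent to a module-level variable

    public static DataSet Process(DataSet input)
    {
        _currentData = input;            // request B can overwrite request A here
        // ... manipulate _currentData ...
        return _currentData;
    }
}

// Fixed: each caller creates an instance, so each request works on its own copy.
public class QuoteProcessor
{
    private DataSet _currentData;

    public DataSet Process(DataSet input)
    {
        _currentData = input;            // per-instance state, not shared across requests
        // ... manipulate _currentData ...
        return _currentData;
    }
}
```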

Resources