Last night I did a load test on a site. I found that one of my shared caches is a bottleneck. I'm using a ReaderWriterLockSlim to control the updates of the data. Unfortunately at one point there are ~200 requests trying to update the data at approximately the same time. This also coincided with CPU usage spikes.
The data being updated is in the ASP.NET Cache. What I'd like to do is skip the cache and hit the database on another machine whenever CPU usage is around 75%.
My problem is that I don't know how expensive it is to create a new performance counter to check the CPU usage. Also, I would probably want the average CPU usage over the last 2 or 3 seconds. However, I can't sit there and calculate the CPU time myself, as that would take longer than it currently takes to update the cache.
Is there an easy way to get the average CPU usage? Are there any drawbacks to this?
I'm also considering totaling the wait count for the lock and switching over to the database once it passes a certain threshold. My concern with this approach is that different hardware might handle more waiting locks with less strain on the system, finding the right threshold would be cumbersome, and it doesn't take into account any other load on the machine. But it's a simple approach, and simple is better 99% of the time.
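Something like this is the wait-count approach I have in mind (just a sketch; GetFromCache/GetFromDatabase are placeholders and the threshold is a number I'd have to tune per machine):

using System.Threading;

public static class SharedData
{
    private static readonly ReaderWriterLockSlim CacheLock = new ReaderWriterLockSlim();
    private const int WaitThreshold = 10; // arbitrary placeholder; would need tuning

    public static MyData Get()
    {
        // If too many writers are already queued on the lock, skip the cache
        // and go straight to the database on the other machine.
        if (CacheLock.WaitingWriteCount > WaitThreshold)
            return GetFromDatabase(); // hypothetical DB fallback

        CacheLock.EnterReadLock();
        try
        {
            return GetFromCache(); // hypothetical cache read
        }
        finally
        {
            CacheLock.ExitReadLock();
        }
    }

    private static MyData GetFromDatabase() { /* ... */ return null; }
    private static MyData GetFromCache() { /* ... */ return null; }
}

public class MyData { }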
This article from Microsoft covers Tuning .NET Application Performance and highlights which counters to collect and compare to determine whether an application is CPU-bound or I/O-bound.
You sound like you want to monitor this during execution and bypass your cache when things get intensive. Would this not just move the intensive processing from the cache calls to your database calls? Surely you have the cache to avoid expensive DB calls.
Are you trying to repopulate an invalidated cache? What is the effect of serving stale data from the cache? You could just lock on the re-populating function and serve stale data to other requests until the process completes.
Based on the above article, we collect the following counter objects during our tests, which gives us all the counters we need to identify the bottlenecks.
.NET CLR Exceptions
.NET CLR Memory
ASP.NET Applications
ASP.NET
Memory
Paging File
Processor
Thread
The sections in the article for CLR Tuning and ASP.NET Tuning highlight the bottlenecks that can occur and suggest configuration changes to improve performance. We certainly made changes to the thread pool settings to get better performance.
Changing and Retrieving Performance Counter Values might help with accessing the existing Processor counter via code but this isn't something I've tried personally.
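For what it's worth, a minimal sketch of reading the existing Processor counter from code might look like the following (untested here; the first NextValue() call always returns 0, so the counter has to be sampled on an interval, and the three-sample window is just one way to approximate "the last 2-3 seconds"):

using System;
using System.Diagnostics;
using System.Threading;

// Sketch: sample "% Processor Time" in the background so request threads
// only read a cached value instead of paying the sampling cost themselves.
public static class CpuMonitor
{
    private static readonly PerformanceCounter Cpu =
        new PerformanceCounter("Processor", "% Processor Time", "_Total");
    private static readonly float[] Samples = new float[3]; // rolling window, ~3 seconds
    private static int _index;
    private static Timer _timer; // kept as a field so the timer isn't garbage collected

    public static void Start()
    {
        Cpu.NextValue(); // first call always returns 0; prime the counter
        _timer = new Timer(_ =>
        {
            Samples[_index++ % Samples.Length] = Cpu.NextValue();
        }, null, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(1));
    }

    // Cheap to call on every request: just averages the cached samples.
    public static float AverageCpu
    {
        get
        {
            float sum = 0;
            foreach (var s in Samples) sum += s;
            return sum / Samples.Length;
        }
    }
}

Requests could then check something like CpuMonitor.AverageCpu > 75 before deciding whether to take the cache lock or go straight to the database.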
We have a problem on our website: seemingly at random (anywhere from every day or so to once every 7-10 days), the website becomes unresponsive.
We have two web servers on Azure, and we use Redis.
I've managed to run dotMemory and catch it when it crashes, and what I observe under "Event handlers leak" is that two items increase in count into the thousands before the website stops working: CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy. Once the site crashes, we get lots of Redis exceptions saying it can't connect to the Redis server. According to the Azure Portal, our Redis server load never goes above 10% at peak times, and we're following all best practices.
I've spent a long time going through our website ensuring that there are no obvious memory leaks, and have patched a few cases that slipped under the radar. Anecdotally, these seem to have improved the website's stability a little. Things we've checked:
All IDisposable objects are now wrapped in using blocks (we were fairly strict about this before, but we did find a few that weren't disposed properly)
Event handlers are unsubscribed - there are very few in our code base
We use WebUserControls pretty heavily. Each one had the current master page passed in as a parameter; we've removed this dependency because we thought it might prevent the GC from collecting the page
Our latest issue is that the web server runs fine, but when we run dotMemory and attach it to the w3wp.exe process, the CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy event leaks increase rapidly until the site crashes! So the crash is reproducible just by running dotMemory. Here is a screenshot of what we saw:
I'm at a loss now. I believe I've exhausted all possibilities of memory leaks in our code base, and our "solution" is to have the app pools recycle every few hours to be on the safe side.
We've even tried upgrading Redis to the Premium tier, and upgraded all drives on the web servers to SSDs to see if that helps, which it doesn't appear to.
Can anyone shed any light on what might be causing these issues?
All IDisposable objects are now wrapped in using blocks (we were fairly strict about this before, but we did find a few that weren't disposed properly)
We can't say a lot about the crash without more information about it, but I have some speculations.
I see 10,000 (!) undisposed objects handled by the finalization queue. Let's start with them: find all of them and add Dispose calls in your app.
I would also recommend checking how many system handles your application uses. There is an OS limit on the number of handles, and once it is exceeded no more file handles, network sockets, etc. can be created. I recommend this especially given the number of undisposed objects.
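Checking the handle count from inside the app is essentially a one-liner; as a rough sketch (the 10,000 threshold here is purely illustrative, not a documented OS limit):

using System.Diagnostics;

static class HandleCheck
{
    // Sketch: log the current w3wp handle count so you can watch it trend upward over time.
    public static void LogHandleCount()
    {
        int handleCount = Process.GetCurrentProcess().HandleCount;
        if (handleCount > 10000) // illustrative threshold only
        {
            Trace.WriteLine("High handle count: " + handleCount);
        }
    }
}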
Also, if you are getting timeouts accessing Redis, get a performance profiler and look at why. I recommend JetBrains dotTrace in Timeline mode to profile your app; it will show thread sleeping, thread contention and much more information that will help you find the root of the problem. You can use the command-line tool to collect profile data so you don't have to install the GUI application on the server.
it causes the CaliEventHandlerDelegateProxy and
ArglessEventHandlerProxy event leaks to increase rapidly
dotMemory doesn't change your application code and doesn't allocate any managed objects in the profiled process. The Microsoft Profiling API injects a DLL (written in C++) into the profiled process; it is the part of dotMemory named the Profiling Core, which plays the role of the "server" (the standalone dotMemory application, written in C#, is the client). The Profiling Core does some work on the gathered data before sending it to the client side. This requires some memory, which is of course allocated in the address space of the profiled process, but it doesn't affect managed memory.
Memory profiling may affect the performance of your application. For example, the profiling API disables concurrent GC while the application is being profiled, and collecting memory allocation data can significantly slow down your application.
Why do you think that CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy are allocated only under dotMemory profiling? Could you please describe how you investigated this?
Event handlers are unsubscribed - there are very few in our code base
When dotMemory reports an event handler as a leak, it means there is only one reference to it - from the event source - and there is no longer any way to unsubscribe from that event. Check all these leaks, find yours, and look at the code to see how it happened. In any case, only 110.3 KB is retained by these objects, so why do you think your site crashed because of them?
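To illustrate the kind of code to look for (all names here are made up), the classic pattern is a short-lived object subscribing to a long-lived event and never unsubscribing; the fix is to unsubscribe when the subscriber is done:

using System;

// Hypothetical long-lived event source (e.g. a static/singleton service).
public static class ConnectionEvents
{
    public static event EventHandler ConnectionLost;

    public static void RaiseConnectionLost()
    {
        var handler = ConnectionLost;
        if (handler != null) handler(null, EventArgs.Empty);
    }
}

// Short-lived subscriber: without Dispose/unsubscribe, the static event
// keeps this instance (and everything it references) alive forever.
public class PageHelper : IDisposable
{
    public PageHelper()
    {
        ConnectionEvents.ConnectionLost += OnConnectionLost;
    }

    private void OnConnectionLost(object sender, EventArgs e) { /* ... */ }

    public void Dispose()
    {
        // Unsubscribing removes the only reference the event source holds.
        ConnectionEvents.ConnectionLost -= OnConnectionLost;
    }
}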
I'm at a loss now, I believe I've exhausted all possibilities of memory leaks in our code base
Take several snapshots over a period when memory consumption is growing, open a full comparison of some of these snapshots, look at all the surviving objects that should not have survived, and find out why they survived. This is the only way to prove that your app doesn't have a memory leak; looking at the code doesn't prove it, sorry.
I hope that if you perform all the activities I recommend (performance profiling; full snapshot and snapshot-comparison investigation, not just the inspections view; checking why there is such a huge number of undisposed objects) you will find and fix the root problem.
I am working on an ASP.NET project and yesterday I saw a piece of code that uses System.Threading.Thread to offload some tasks to a new thread. The thread runs a few SQL statements and logs the result.
Isn't it better to use another approach? For example, a Windows service that performs the SQL batch, with the web page just enqueuing the batch (via WCF).
In general, what are the best practices for multithreading in ASP.NET? Are there justified usages of threads/TPL tasks/etc. in a web page?
My thoughts on using multi-threading in ASP.NET:
ASP.NET recycles the AppDomain for various reasons, for example when you change web.config or periodically to avoid memory leaks. The thing is, you don't know exactly when a recycle will happen. A long-running thread is not suitable because when ASP.NET recycles, it will take your thread down with it. The right approach for this case is to run long-running tasks in a background process via a queue, like you mention.
For short-running, fire-and-forget tasks, TPL or async/await is the most appropriate, because it doesn't block a thread-pool thread that could otherwise be used to serve HTTP requests.
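As a sketch of the fire-and-forget case (assuming .NET 4.5.2+), HostingEnvironment.QueueBackgroundWorkItem at least tells ASP.NET about the work, so a recycle delays shutdown briefly instead of killing the work silently; LogResultAsync here is a placeholder for the SQL-and-log work from the question:

using System.Threading;
using System.Threading.Tasks;
using System.Web.Hosting;

public static class BackgroundWork
{
    // Sketch: queue short, fire-and-forget work through ASP.NET so the runtime
    // knows about it during AppDomain shutdown (requires .NET 4.5.2 or later).
    public static void QueueLogging()
    {
        HostingEnvironment.QueueBackgroundWorkItem(ct => LogResultAsync(ct));
    }

    private static Task LogResultAsync(CancellationToken ct)
    {
        // ... run the SQL statements and log the result (placeholder) ...
        return Task.FromResult(0);
    }
}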
In my opinion this should be solved by raising some kind of flag in the database and having a Windows service periodically check the flag and start the job. If the job runs too frequently, a dedicated queue solution (MSMQ, RabbitMQ, etc.) should be used to avoid overloading the database or letting the table grow too fast. I don't think communicating directly with the Windows service via WCF or anything else is a good idea, because it may result in dropped messages.
That being said, sometimes a project needs to run on shared hosting and cannot set up a dedicated Windows service. In that case a thread is acceptable as a workaround that should be removed as soon as the project grows enough to have its own server.
I believe all other threading in ASP.NET is a sign of a problem, except for using Tasks to represent async operations, or in the extremely rare case where you want to perform a computation in parallel in a web project and you have very few concurrent users (fewer concurrent users than cores).
Why are Tasks useful in ASP.NET?
First reason to use Tasks for async operations is that as of .NET 4.5 async APIs return Tasks :)
Async operations (not to be confused with parallel computations) may be web service calls, database calls, etc. They may be useful for two things:
Fire several of them at once and your job will take only as long as the longest operation; if you fire them sequentially (non-async), they will take as long as the sum of the individual operation times, which is obviously more (see the sketch after this list).
They can improve scalability by releasing the thread executing the page - Node.js style. ASP.NET has supported this forever, but in version 4.5 it is really easy to use; I'll go as far as claiming it is easier than Node.js because of async/await. Releasing the thread is important because you can deplete the thread pool by having threads sit and wait. The result is that your website becomes slow beyond a certain number of users, even though CPU usage is at something like 30%, simply because new requests are waiting in the queue. If you increase the number of threads in the thread pool, you pay the price of constant context switching by the OS. At a certain point you will hit 100% CPU usage, but 40% of it will be spent on context switching. You will increase throughput, but with diminishing returns. A lot of threads also increase the memory footprint.
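A minimal sketch of point 1 above, with made-up method names standing in for a web service call and a database call; starting both operations before awaiting means the total time is roughly the slower of the two, not the sum:

using System.Threading.Tasks;

public class DashboardLoader
{
    // Sketch: start both async operations, then await both; total time is
    // roughly max(T1, T2) instead of T1 + T2. While awaiting, the request
    // thread is released back to the pool (point 2 above).
    public async Task<DashboardData> LoadAsync()
    {
        Task<Customer> customerTask = GetCustomerAsync(); // hypothetical web service call
        Task<Order[]> ordersTask = GetOrdersAsync();      // hypothetical database call

        await Task.WhenAll(customerTask, ordersTask);

        return new DashboardData
        {
            Customer = customerTask.Result,
            Orders = ordersTask.Result
        };
    }

    private Task<Customer> GetCustomerAsync() { return Task.FromResult(new Customer()); }
    private Task<Order[]> GetOrdersAsync() { return Task.FromResult(new Order[0]); }
}

public class Customer { }
public class Order { }
public class DashboardData
{
    public Customer Customer { get; set; }
    public Order[] Orders { get; set; }
}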
To start off - I have 2 separate websites and a database (IIS 7.5, ASP.NET and SQL Server 2008, using Linq-To-SQL for database access).
I have a separate administrative website that sometimes needs to trigger long-running operations (more than 10 seconds) on the database. The problem is that those operations cause the SQL Server process to hit 100% CPU, and then the main customer website can't access the database promptly - there are delays in accessing the database.
I am OK with those administrative operations lasting 2x or 4x or nx times longer since they are lower priority.
I've tried using the CPU Limit setting on the app pool in IIS, but that doesn't help, as the w3wp.exe process never uses much CPU... it's sqlservr.exe that does. Thanks in advance for your suggestions!
If your admin queries are consuming all the CPU on the box, there are almost certainly some tuning opportunities there - likely some indexing optimizations.
If you don't have the time to invest in those right now, and until you get your Resource Governor configuration settled, you can simply reduce their impact to a single CPU by adding the MAXDOP hint to your admin query, which may provide short-term symptom relief:
OPTION (MAXDOP 1);
Yes, it might make you feel a little dirty, but the rest of your CPUs will be freed up to work on your more important queries.
The real answer is to tune your admin queries. Just because it's ok that they run long does not mean it's good for your server or the experience of your users. You'll never be able to completely isolate them from the effects of other queries going on on the box, especially if you are experiencing high CPU that is compensating for slow I/O. I/O does not have any knobs in resource governor - you can only control CPU and memory, and not even 100%.
Sounds like you want to look into Resource Governor, which is built into SQL Server as of SQL Server 2008. The BOL link below should get you started:
http://msdn.microsoft.com/en-us/library/bb933866(v=SQL.100).aspx
Essentially, you can throttle CPU and memory usage for the resource pools and workloads you define. This throttling will only kick in when the server is under load. Be aware that you cannot control disk IO utilization. If the process in your admin database is IO bound and your other DBs share drives you will inevitably still see performance issues and moving databases to separate spindles or query tuning will be necessary.
Example of the classifier function that will ensure the user you define is throttled by the desired resource pool based on workload group:
/* Classifier function */
CREATE FUNCTION dbo.rgov_classifier_db ()
RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
DECLARE @rgWorkloadGrp sysname
IF SUSER_SNAME() = 'adminWebsiteDB'
SET @rgWorkloadGrp = 'workloadGroupName'
ELSE
SET @rgWorkloadGrp = 'defaultWorkloadGroupName'
RETURN @rgWorkloadGrp
END;
GO
/* Register the function with Resource Governor and then start Resource Governor. */
ALTER RESOURCE GOVERNOR
WITH (CLASSIFIER_FUNCTION = dbo.rgov_classifier_db);
GO
ALTER RESOURCE GOVERNOR RECONFIGURE;
GO
I am developing a large data-collecting ASP.NET/Windows service application pair that uses Microsoft SQL Server 2005 through LINQ to SQL.
Performance is always the issue.
Currently the application is divided into multiple larger processing parts, each logging the duration of its work. This is not detailed enough and does not really help us. It would be nice to have some database tables containing statistics that the application collected about its own behavior.
What logging tips and data structures do you recommend to spot the parts that cause performance problems?
Edit:
Mostly I am looking for the parts of the application that can cripple the whole system when used excessively. There are peaks during the day when some parts of the application are under heavy load. Some advanced logging would help me isolate the parts that need more attention and optimization.
Don't use logging for this, use Performance Counters instead. The runtime impact of performance counters is minor and you can simply have them always on. To collect and monitor the performance, you can rely on the existing performance counter infrastructure (perfmon.exe, logman.exe, relog.exe, etc.).
I personally use XML and XSLT to generate the counter definitions. I can then decorate all my code with performance counters tracking which functions are run, average call duration, number of executions, rate of executions per second, and so on. A good choice of counters will give an immediate, accurate performance picture much faster than logging can. While logging can give more insight into certain event paths (i.e. the order of events that leads to a certain state), logging can seldom be 'always on': the impact is significant, not only on performance but most importantly on concurrency, as most existing logging infrastructures add contention.
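As an illustration of the counter approach (not the XML/XSLT generation itself, which is my own tooling), a hand-rolled custom counter category might look like this; the category and counter names are made up, and creating a category requires administrative rights, so it is usually done at install time:

using System.Diagnostics;

public static class AppCounters
{
    private const string Category = "MyApp Processing"; // made-up category name
    private static PerformanceCounter _itemsProcessed;
    private static PerformanceCounter _itemsPerSec;

    // Typically run once at install time (creating a category needs admin rights).
    public static void EnsureCategory()
    {
        if (!PerformanceCounterCategory.Exists(Category))
        {
            var counters = new CounterCreationDataCollection
            {
                new CounterCreationData("Items Processed", "Total items processed",
                    PerformanceCounterType.NumberOfItems64),
                new CounterCreationData("Items / sec", "Processing rate",
                    PerformanceCounterType.RateOfCountsPerSecond32)
            };
            PerformanceCounterCategory.Create(Category, "Counters for MyApp",
                PerformanceCounterCategoryType.SingleInstance, counters);
        }

        _itemsProcessed = new PerformanceCounter(Category, "Items Processed", readOnly: false);
        _itemsPerSec = new PerformanceCounter(Category, "Items / sec", readOnly: false);
    }

    // Call this from the processing code; perfmon/logman/relog can then trend the values.
    public static void ItemProcessed()
    {
        _itemsProcessed.Increment();
        _itemsPerSec.Increment();
    }
}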
This is not a job for logging. It's a job for a profiler.
Try one of these:
JetBrains' dotTrace - http://www.jetbrains.com/profiler/index.html
Red-Gate ANTS - http://www.red-gate.com/products/ants_profiler/index.htm
Automated QA's AQTime - http://www.automatedqa.com/products/aqtime/index.asp
While I haven't (yet) tried it for myself, it may be worth looking at Gibraltar which can be used with PostSharp to put declarative performance logging into your code.
When dealing with problems like this I try not to add any extra headache by manually adding logging/tracing and timing to the application itself. If all you want is to tune the application, then I suggest getting a profiler, which will show you which areas of code are an issue. I recommend Red Gate's ANTS Profiler.
Now if you want to collect statistics for monitoring or trending purposes, then a profiler is not the right tool. I have had success using PerformanceCounters, which let many third-party tools pull the performance information out of the application.
So what are you trying to do: solve performance problems, or monitor so you catch a performance problem before it becomes severe?
EDIT
Based on your comment, I would look at using performance monitors around critical sections of code, timing how long it takes to complete an operation. Then you can use the built-in performance monitoring tools, or any number of third-party tools, to monitor and trend the stats.
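As a rough sketch, "performance monitors around critical sections" can be as simple as a Stopwatch whose result you feed into a counter or your existing logging (ProcessBatch is a placeholder for the real work):

using System.Diagnostics;

public class BatchProcessor
{
    // Sketch: time a critical operation and report the duration so perfmon
    // or a log can trend it over the day.
    public void ProcessBatchTimed()
    {
        var sw = Stopwatch.StartNew();
        try
        {
            ProcessBatch(); // placeholder for the real work
        }
        finally
        {
            sw.Stop();
            // Feed this into a performance counter or your existing logging.
            Trace.WriteLine("ProcessBatch took " + sw.ElapsedMilliseconds + " ms");
        }
    }

    private void ProcessBatch() { /* ... */ }
}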
SQL Server keeps track of some things for you, so try running some of these queries on your system:
Wayback Machine: Uncover Hidden Data to Optimize Application Performance
Here is an example from the link:
--Identifying Most Costly Queries by I/O
SELECT TOP 10
[Average IO] = (total_logical_reads + total_logical_writes) / qs.execution_count
,[Total IO] = (total_logical_reads + total_logical_writes)
,[Execution count] = qs.execution_count
,[Individual Query] = SUBSTRING (qt.text,qs.statement_start_offset/2,
(CASE WHEN qs.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)
,[Parent Query] = qt.text
,DatabaseName = DB_NAME(qt.dbid)
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) as qt
ORDER BY [Average IO] DESC;
The link contains many queries, including ones for costly missing indexes, logically fragmented indexes, identifying the queries that execute most often, etc.
I would start by diagnosing the real cause of the perf issue. Is it CPU, memory, disk or I/O? This can be identified with a few perfmon counters.
For example, LINQ to SQL uses synchronous I/O, which can be a big bottleneck for scalability. Because it uses sync I/O, Windows threads get blocked and requests end up waiting. This is a usual suspect and might not be your problem. There is an MSDN article on how sync I/O can affect scalability.
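For comparison, here is a minimal sketch of what the async alternative looks like with plain ADO.NET on .NET 4.5+ (the connection string and query are placeholders); while the query runs, no thread-pool thread is blocked waiting on the database:

using System.Data.SqlClient;
using System.Threading.Tasks;

public class CustomerRepository
{
    private const string ConnectionString = "..."; // placeholder

    // Sketch: async ADO.NET call; the request thread is released while
    // SQL Server does its work, instead of blocking as with sync LINQ to SQL.
    public async Task<int> CountCustomersAsync()
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand("SELECT COUNT(*) FROM Customers", connection))
        {
            await connection.OpenAsync();
            object result = await command.ExecuteScalarAsync();
            return (int)result;
        }
    }
}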
If CPU is the issue, then the next question is: is the application CPU-bound? If so, you could use one of the profilers mentioned above. Also look at the "% Time in GC" perfmon counter; that is another usual suspect.
I'm seeing consistently high CPU usage for my ASP.NET web application (on the live production box only, naturally....!) and I'm trying to narrow down the cause - it's basically maxing out a quad core Xeon box and there's no way it should be able to do that!
The CPU usage of the web process is generally higher than that of the DB process - which rings alarm bells to me on its own (?).
However, using the standard profiling tools (dotTrace, Red Gate etc) only show you the time spent in individual methods (rather than actual CPU usage) - and ultimately still highlight methods that are DB-bound. While this might indicate opportunities for caching or better indexes, I don't see how that in itself would result in high CPU usage of the web application process?
Any suggestions or tips as to how I can narrow this down?
Thanks!
Some suggestions to try first.
1. Deploy with Release Build
Check whether the deployed product is built in release mode. When running in debug mode, a lot of time is wasted loading the PDBs along with the assemblies.
2. Disable ViewState
Disable view state if it's not required. ViewState is nothing but data stored in hidden fields to be persisted between requests. It increases the total payload of the page, both when served and when posted back. There is also additional overhead incurred when serializing and deserializing view state data that is posted back to the server. Lastly, view state increases memory allocations on the server.
3. Disable Session State
If you are not going to use it, disable session state. By default it's on. You can turn it off for specific pages or for the whole application.
For basic ASP.NET application performance monitoring, check these two MSDN articles:
"Monitoring ASP.NET Application Performance" and "Performance Counters for ASP.NET"
Can you set up some unit tests to call various methods and see what their impact is on processor usage? Visual Studio has some testing tools built in if you're using Team System, but even if you're not, you could write a multithreaded tester to call particular functions hundreds of times.
If you'd like some pointers on how to do this, I can help you build some basic unit testing.
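For example, a very small multithreaded tester could be as simple as this (MethodUnderTest is a placeholder for whichever function you suspect is expensive):

using System;
using System.Diagnostics;
using System.Threading.Tasks;

class LoadTester
{
    static void Main()
    {
        const int iterations = 500;
        var sw = Stopwatch.StartNew();

        // Hammer the method from many threads at once and watch what the CPU does.
        Parallel.For(0, iterations, i => MethodUnderTest());

        sw.Stop();
        Console.WriteLine("{0} calls took {1} ms", iterations, sw.ElapsedMilliseconds);
    }

    static void MethodUnderTest()
    {
        // placeholder for the function you want to exercise
    }
}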
Are you recording/reporting unhandled exceptions? If not, do so and check whether any of them correspond with your high CPU spikes - you may have a stack overflow causing the spikes.
http://msdn.microsoft.com/en-us/library/ms998306.aspx
You could also look into recording the time of each request by using an HttpModule and checking which requests are taking the most time, which may indicate the pages that are causing the issue.
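A minimal sketch of such an HttpModule (the 1000 ms threshold and the Trace output are placeholders, and the module still has to be registered in web.config):

using System.Diagnostics;
using System.Web;

public class RequestTimingModule : IHttpModule
{
    public void Init(HttpApplication context)
    {
        // Start a stopwatch when the request begins...
        context.BeginRequest += (s, e) =>
        {
            ((HttpApplication)s).Context.Items["RequestStopwatch"] = Stopwatch.StartNew();
        };

        // ...and report the elapsed time when it ends.
        context.EndRequest += (s, e) =>
        {
            var app = (HttpApplication)s;
            var sw = app.Context.Items["RequestStopwatch"] as Stopwatch;
            if (sw == null) return;
            sw.Stop();
            if (sw.ElapsedMilliseconds > 1000) // arbitrary "slow request" threshold
            {
                Trace.WriteLine(string.Format("Slow request: {0} took {1} ms",
                    app.Request.Url.AbsolutePath, sw.ElapsedMilliseconds));
            }
        };
    }

    public void Dispose() { }
}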
As Pradeepno notes, the place to start is really performance counters - they can give you a very good idea of what is consuming which part of the CPU.
The web app usage being higher than DB usage isn't entirely surprising. If you have a decent DB design, most web apps are barely going to cause a decently powered DB server to break a sweat.