I am developing large data collecting ASP.Net/Windows service application-pair that uses Microsoft SQL Server 2005 through LINQ2Sql.
Performance is always the issue.
Currently the application is divided into multiple larger processing parts, each logging the duration of their work. This is not detailed and does not help us with anything. It would be nice to have some database tables that contain statistics that the application itself collected from its own behavior.
What logging tips and data structures do you recommend to spot the parts that cause performance problems?
Edit:
Mostly I am looking for parts of the application that can cripple the whole system when excessively used. There are peaks during the day when some parts of the application are under heavy load. Some advanced logging would help me isolate the parts that need more attention and optimizing.
Don't use logging for this, use Performance Counters instead. The runtime impact of performance counters is minor and you can simple have them always on. To collect and monitor the performance, you can rely on the existing performance counters infrastructure (perfmon.exe, logman.exe, relog.exe etc).
I personally use XML and XSLT to generate the counters. I can then decorate all my code with performance counters tracking functions being run, average call duration time, number of executions, rate of executions per second and so on and so forth. Good choice of counters will give an immediate, accurate, performance picture much faster than logging can. While logging can give more insight on certain event paths (ie. order of events that lead to certain state), logging can seldom be 'always on' as the impact on performance is significant, not only on performance but most importantly on concurrency as most existing logging infrastructures add contention.
This is not a job for logging. It's a job for a profiler.
Try one of these:
JetBrains' dotTrace - http://www.jetbrains.com/profiler/index.html
Red-Gate ANTS - http://www.red-gate.com/products/ants_profiler/index.htm
Automated QA's AQTime - http://www.automatedqa.com/products/aqtime/index.asp
While I haven't (yet) tried it for myself, it may be worth looking at Gibraltar which can be used with PostSharp to put declarative performance logging into your code.
When dealing with problems like this I try and not add any extra headache by manually adding logging / tracing & timing into the application itself. If all you want is to tune the application then I suggest getting a profiler which will show you what areas of code are an issue. I recommend Red-Gate's Ant's Profiler.
Now if you want to collect statistics for monitoring or trending purposes then a profiler is not the right tool. I have had success using PerformanceCounters which let's many third party tools pull the performance information out of the application.
So what are you trying to do solve performance problems or monitor to ensure you catch a performance problem before it becomes severe?
EDIT
Based on your comment, I would look at using performance monitors around critical sections of code, timing how long it took to complete an operation. Then you can use the built in performance monitoring tools, or any number of third party tools to monitor and trend the stats.
SQL Server keeps track of some things for you, so try running some of these queries on your system:
Wayback Machine: Uncover Hidden Data to Optimize Application Performance
here is an example from the link:
--Identifying Most Costly Queries by I/O
SELECT TOP 10
[Average IO] = (total_logical_reads + total_logical_writes) / qs.execution_count
,[Total IO] = (total_logical_reads + total_logical_writes)
,[Execution count] = qs.execution_count
,[Individual Query] = SUBSTRING (qt.text,qs.statement_start_offset/2,
(CASE WHEN qs.statement_end_offset = -1
THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
ELSE qs.statement_end_offset END - qs.statement_start_offset)/2)
,[Parent Query] = qt.text
,DatabaseName = DB_NAME(qt.dbid)
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) as qt
ORDER BY [Average IO] DESC;
the link contains many queries including ones for: Costly Missing Indexes, Logically Fragmented Indexes, Identifying Queries that Execute Most Often , etc...
I would start would diagnosing what is the real cause for the perf issue? Is it CPU, Memory, Disk or IO. This can be identified by few perfmon counters.
For example Linq2SQL uses Sync I/O which could be a big bottleneck for scalability. Because it uses Sync I/O windows threads get blocked and requests would end up waiting. This is usual suspect and might not be true. Here is an MSDN article how sync I/O could affect scalability.
If CPU is an issue then the next question is application CPU bound? Then you could use one of the profilers like mentioned above. Also look for time spent on GC perfmon counter that is another usual suspect?
Related
We have a very large three-tier .NET 4.0 web application (ASP.NET, WCF, SQL Server 2008 R2) with some performance issues I'm trying to diagnose. The ASP.NET tier uses the SQL session state provider, and there is reasonably extensive use of session state.
<sessionState mode="SQLServer" sqlConnectionString="..." cookieless="false" timeout="20"/>
I'm trying to figure out how much time it takes to load and save the session to the database. Note that this is different than using SQL profiler or other database means to measure time on that tier; I want to measure total time taken from the ASP.NET tier. I.e. that would include session object serialization/deserialization, wire time, and SQL save/load time.
I am pretty experienced developer (since the early 80's), and I have Googled the web for hours now -- there are tons of articles contrasting session state approaches and general performance concepts etc, but I can find nothing on how to actually measure this total time for the SQL session state provider. I initially thought of using an HttpModule to hook events and use a stopwatch, but there doesn't appear to be a clean path to this. (I would settle for an approach that's reasonably accurate to few percent...)
Any ideas/suggestions are much appreciated, thanks.
Could you not just alter your session timeout to seconds, and then to a lower and lower amount, until it does not load and save the session?
The managed stack can be profiled in the VS profiler. See:
Find Application Bottlenecks with Visual Studio Profiler
How to: Profile a Web Site or Web Application Using the Performance Wizard
Analyzing Application Performance by Using Profiling Tools
For low level stuff use Xperf, see Two Minute Drill: Introduction to XPerf
For the SQL Server side there are many methodologies for performance troubleshooting, like Waits and Queues. For specific statement duration/IO cost you can use profiler trace, see Scenarios for Using SQL Server Profiler.
Now for a very specific and invasive way in your case you can use Performance Counters. Not the stock kind, but create your own. See Using XSLT to generate Performance Counters for an ease way to geenrate the plumbing needed to augment your app with its own perf counters, then hook up your code with performance tracing. Use the standard perofrmance counters tool chain (logman.exe, perfmon.exe) to capture and analyze.
The easiest way to do this is to profile the app and look at where the time is spent deep in the bowels of ASP.NET. You will be able to quickly spot session-related activity by class and method name.
Perfview is fine for this. It has a sampling profiler which means that it will give you accurate, unskewed profiling information. But it can't tell you how many seconds were spent in what method. All it tells you are percentages.
I'd spin up a load test against an empty page and let perfview run for 30s.
dottrace is a simple profiler that also shows you the amount of CPU time spent per call, I believe.
The first thing that comes to mind is to create a page that doesn't really do anything other than perhaps load/store the session state variables. From there you can set up a coded UI test that accesses the page a number of times and builds a statistical estimate for the variability and distribution involved in load times for the page.
WCAT (Web Capacity Analysis Tool) could also be used to see how performance changes under load...
I got a dedicated server running both IIS 7.5 and SQL Server 2010. Server CPU load is often near 100%. The SQL server does not take too much but the w3wp process is taking a significant amount of CPU (often 70+%).
I'd like to find out, what is causing this pressure:
* Too many requests of static files (a CDN could be added)
* Too many ajax requests (I am thinking about comet/web sockets anyways)
* Single asp.net pages consuming too much processing power (should be easy to optimize)
Where would you start looking to find out where to start optimizing?
The easiest possible way is to profile the app in production. Not sure if that is possible in your case. Some options:
look into the logs and look at the duration of the requests. Long requests are likely to put load on the system
Remote debug w3wp with Visual Studio and pause the debugger 10 times to see where it stops most. That is the hot spot
Use XPerf or PerfView to capture (managed) stacks. This has almost no impact on production performance
A good starting point would be to fire up the development tools (F12 in IE / Chrome) and look at the timings under the network tab. That will show you a waterfall-style diagram for how the page has loaded and should help you identify any particularly slow-loading static files which might be sensibly moved off to a cdn, any unnecessary requests being made, how much time is being spent getting the actual page itself, etc.
After that, profile the application with a performance profiler. A good profiler like ANTS Performance Profiler will let you look at things like execution time / hit counts for different methods, as well as what database queries are being run and how long they’re taking. A new version of ANTS (currently in EAP) will also group that activity by http request so you can see if specific pages need optimisation or are being hit too many times.
You'd also do well to check that caching is working as you intend it so that users aren’t unnecessarily re-requesting pages.
There's also a nice article on ASP.NET performance which you might want to read at http://aspalliance.com/1533_ASPNET_Performance_Tips.7.
Disclaimer: I work for Red Gate which makes ANTS.
I found an easy way to see what's going on on the server.
Nevertheless, the professional way is probably to go and use a profiling tool.
What did I do?
In IIS Console you can get a list of all current worker threads and if you choose one you can see what this thread is working on. So I was able to see that the thread was handling 100 requests in parallel, 70 of those were tracing back to the same ajax call.
The immediate solution was to reduce the frequency of that call (from every 10 to every 30 seconds). The next step will be to further optimize the call on the server side since I do have other ajax calls with the same frequency (every 10 seconds) which nearly never showed up in the active requests list since they were so fast.
Probably the easiest way to figure it out would be to install New Relic on the server. The trial lasts 30 days I think so it should give you enough time to get to the bottom of this. It'll show you long-running SQL queries, .NET methods, as well as just about everything else you can think of. It makes it very easy to identify bottlenecks.
By the way, I suggested New Relic because it sounds like your problem is in a production environment. New Relic isn't an incredibly detailed profiler. It gathers enough information to be helpful, but not so much as to slow down the server. That makes it well suited to this purpose.
If, however, you could reproduce the problem in a development environment you might try something like the free Eqatec profiler.
Background:
Enterprise application - very will written for its time in 2004.
Stack:
.NET, Heavy use of Remoting, ASMX style web services, SQL Server
Problem:
The application allows user to go through various wizards for lack of a better term, all of their actions are stored in what we call "wiz state", which is essentially XML that is persisted to a SQL server database very frequently because we allow users to pause/resume their application. Often in these wizards, the XML that comprises the wizard state grows very large, I'm talking 5-8 MB of data, and we noticed that when we had a sudden influx of simultaneous users, we started receiving occasional timeouts against the database, because a lot of what the wizard state is comprised of, is keeping track of collections of "things". Sometimes these custom collections grow very large.
Question:
We were in a meeting today and we're expecting a flurry of activity in October that will test the system like never before, and possibly result in huge wizard states that go back and forth from the web server to the database. The crux of the situation is that there is only one database and one web server.
For arguments sake, because of the complexity of the application, lets say adding any kind of clustering/mirroring to increase database throughput is out of the question. I spoke up in the meeting and said the quickest way to address this in the shortest time period would be to add more servers to the front end web application so the load could be distributed amongst web servers. The development lead said I was completely wrong and it would have no effect because we only have one database, so adding more web power would do nothing. He is having one of the other developers reduce the xml bloat that we persist frequently to the database. Probably in the long run, reducing the size of the xml that we pass back and forth is the right idea, but will adding additional web servers truly have no effect, I just think in terms of simultaneous users, it should help.
Any responses thoughts are appreciated, proof that more web servers would help would be pure win.
Thanks.
EDIT: We use binary serialization to store the XML in the database in an image field.
I haven't heard anything about locating the "bottlenecks". Isn't that the first thing to do? Here's the method I use.
Otherwise you're just investing in guesses. That won't work.
I've been in meetings like that, where everybody gets excited throwing ideas around, and "management" wants to make "decisions", but it's the blind leading the blind. Knuckle down and find out what's going on. You can't do that in meetings.
Some time ago I looked at a performance problem with some similarity to yours. The biggest "bottleneck" was in writing and parsing XML, with attendant memory allocation, setup, and destruction. Then there were others as well. You might find the same thing, or something different.
P.S. I keep quoting "bottleneck" because all the performance problems I've found have been nothing at all like the necks of bottles. Rather they are like way over-bushy call trees that need radical pruning, such as making and reading mountains of XML for no good reason.
If the rate at which the data is written by SQL is the bottleneck, feeding data to SQL more quickly should have no effect.
I am not sure exactly what the data structure is, but perhaps compressing the XML data on the web server(s) before writing may have a positive effect.
If the bottleneck is the database, then more web services will not help you a lot.
The problem may be that the problem is not only the size of the data, but the number of concurrent request to the same table. The number of writes will be the big problem. If your XML write is in a transaction with other queries you may try to break out the XML write from that transaction to reduce locking time of the XML table.
As stated by vdeych you may try compression to reduce the data size. (That would increase the load on the web servers.)
You may also try caching the data. Only read from the SQL server if the data is not already in the cache. Make sure you don't update the SQL server if your data has not changed.
No one seems to have suggest this, what about replacing your XML serialization of your wizard with JsonSerialization.
Not only should this give you a minor boost in performance in the serialization itself since both the DataContractSerializer (faster) and Newtonsoft Json.NET (fastest) out perform the XML serializers in .NET. This should easily reduce the size of your object graph by upwards of 50% or more (depending on number of properties vs large strings in the XML).
This should dramatically lower the IO that is inflicted upon Sql server. This should also limit the amount of scope required to alter your application significantly (assuming it's well designed and works through common calls for serialization/deserialization).
If you choose to go this route also invest time comparing BSON vs JSON as I think it would be likely that the binary encoded one will offer even more space savings (and further IO reduction) due to the size of your object graphs.
I'm not a .NET expert but maybe using a binary serialization would increase throughput. Making sure that the XML isn't stored as text (fairly obvious but thought I'd mention it). Also relational databases are best for storing relational data, so perhaps substituting an ORM layer in place of the serialization (sounds feasible) could speed things up.
Mike is spot on, without understanding the resource constaint leading to the performance issues, no amount of discussion will resolve the problem. I'll add that socket timeouts that affect running statements are a symptom, and are never imposed by SQL Server, they're an artifact of your driver configuration or a firewall or similar device between app and db imposing them (unless you're talking about timeouts for new connections, then you have a host in serious distress under load).
Given your symptom is database timeouts, you need to start there. If they're indicative of long running statements that result in a socket timeout, use SQL Server profiler to capture the workload while simultaneously monitoring system resources. Given it's a mature application and the type of workload you mention, it's unlikely to be statement tuning related, it probably boils down to resource limitations CPU, memory or disk IO capacity
This Technet guide is a very good place to start:
http://technet.microsoft.com/en-us/library/cc966540.aspx
If it's resource contention, then it's a simple discussion about how the resource contention can be tuned, configured for or addressed by adding more of whatever is needed.
Edit: I should add that given a database performance issue, more applications servers is likely to worsen the problem as you increase the amount of concurrency, that might otherwise be kept in check by connection pool, request processing or other limits.
Last night I did a load test on a site. I found that one of my shared caches is a bottleneck. I'm using a ReaderWriterLockSlim to control the updates of the data. Unfortunately at one point there are ~200 requests trying to update the data at approximately the same time. This also coincided with CPU usage spikes.
The data being updated is in the ASP.NET Cache. What I'd like to do is if the CPU usage is around 75%, I'd like to just skip the cache and hit the database on another machine.
My problem is that I don't know how expensive it is to create a new performance counter to check the cpu usage. Also, if I would probably like the average cpu usage over the last 2 or 3 seconds. However, I can't sit there and calculate the cpu time as that would take longer than it's taking to update the cache currently.
Is there an easy way to get the average CPU usage? Are there any drawbacks to this?
I'm also considering totaling the wait count for the lock and then at a certain threshold switch over to the database. The concern I had with this approach would be that changing hardware might allow more locks with less of a strain on the system. And also finding the right balance for the threshold would be cumbersome and it doesn't take into account any other load on the machine. But it's a simple approach, and simple is 99% of the time better.
This article from Microsoft covers Tuning .Net Application Performance and highlights which counters to collect and compare to determine CPU and I/O bound applications.
You sound like you want to monitor this during execution and bypass your cache when things get intensive. Would this not just move the intensive processing from the cache calls to your database calls? Surely you have the cache to avoid expensive DB calls.
Are you trying to repopulate an invalidated cache? What is the affect of serving stale data from the cache? You could just lock on the re-populating function and serve stale data to other requests until the process completes.
Based on the above article, we collect the following counter objects during our tests and that gives us all the necessary counters to determine the bottlenecks.
.NET CLR Exceptions
.NET CLR Memory
ASP.NET Applications
ASP.NET
Memory
Paging File
Processor
Thread
The sections in the article for CLR Tuning and ASP.NET Tuning highlight the bottlenecks that can occur and suggest configuration changes to improve performance. We certainly made changes to the thread pool settings to get better performance.
Changing and Retrieving Performance Counter Values might help with accessing the existing Processor counter via code but this isn't something I've tried personally.
I'm seeing consistently high CPU usage for my ASP.NET web application (on the live production box only, naturally....!) and I'm trying to narrow down the cause - it's basically maxing out a quad core Xeon box and there's no way it should be able to do that!
The CPU usage of the web process is generally higher than that of the DB process - which rings alarm bells to me on its own (?).
However, using the standard profiling tools (dotTrace, Red Gate etc) only show you the time spent in individual methods (rather than actual CPU usage) - and ultimately still highlight methods that are DB-bound. While this might indicate opportunities for caching or better indexes, I don't see how that in itself would result in high CPU usage of the web application process?
Any suggestions or tips as to how I can narrow this down?
Thanks!
Some suggestions to try at the first place.
1.Deploy with Release Build
Check whether the deployed product is in release mode. By running in debug mode, lot of time is wasted loading the pdbs along with the assemblies.
2.Disable ViewState
Disable viewstate if its not required. ViewState is nothing but data stored in hidden fields to be persisted between requests. it increases the total payload of the page both when served and when requested. There is also an additional overhead incurred when serializing or deserializing view state data that is posted back to the server. Lastly, view state increases the memory allocations on the server.
3.Disable Session State:
If you are not going to use it disable Session State. By default it’s on. You can actually turn this off for specific pages or for the whole application.
There are some basic ASP.NET application performance monitoring, check these two MSDN articles
"Monitoring ASP.NET Application Performance" and Performance Counters for ASP.NET
Can you set up some unit tests to call various methods and see what their impact is on processor usage? Visual Studio has some testing tools built in if you're using Team System, but even if you're not, you could write a multithreaded tester to call particular functions hundreds of times.
If you'd like some pointers on how to do this, I can help you build some basic unit testing.
are you recording/reporting unhandled exception? If not do so and check if any of them correspond with your high CPD spikes you may have a stack overflow causing the spikes.
http://msdn.microsoft.com/en-us/library/ms998306.aspx
You could also look into recoding the time of each request by using a HttpModule and checking which requests are taking up the most time which may indicate the pages that are causing the issue.
As Pradeepno notes, the place to start with is really performance counters--they can give you a very good idea of what is consuming what part of the CPU.
The web app usage being higher than DB usage isn't entirely suprising. If you have decent db design, most web apps are barely going to cause a decently powered DB server to break a sweat.