How to profile an application only when it is stuttering? - qt

I have a Qt application that I'm trying to improve the performance of. I've gotten it to the point where it runs at nearly 60 fps, but there are still a lot of dropped frames that noticeably reduce smoothness. In a standard call-graph-based profile of CPU usage, whatever is causing those stutters is lost in normal application activity. I want to view a profile of the application covering only the times when it is dropping frames. What's the best way to do this?
One idea would be to add code to the application that detects when it is running smoothly and tells perf to drop the last frame's activity, but I don't know how to detect that within Qt, nor how the application would communicate it to perf.
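One way to wire this up, sketched below under some assumptions: perf's flight-recorder mode (`perf record --overwrite --switch-output=signal`, available in reasonably recent perf versions) keeps samples in an overwritable ring buffer and dumps a `perf.data.<timestamp>` snapshot each time it receives SIGUSR2, so the application only has to notice a long frame and signal perf. The `PERF_PID` environment variable and the 25 ms threshold are illustrative choices, and the snippet assumes a Qt Quick application with a `QQuickWindow`:

```cpp
// Hypothetical sketch: flag dropped frames in a Qt Quick app and poke a
// co-running perf so it dumps its ring buffer around the stutter.
// Assumes perf was started as:
//   perf record -g --overwrite --switch-output=signal -p <app-pid>
// and that the app was launched with PERF_PID set to perf's PID
// (an illustrative arrangement, not a Qt or perf convention).
#include <QQuickWindow>
#include <QElapsedTimer>
#include <signal.h>      // kill()
#include <sys/types.h>   // pid_t
#include <cstdlib>       // getenv, atoi

void watchFrames(QQuickWindow *window)
{
    static QElapsedTimer sinceLastSwap;
    static const pid_t perfPid = [] {
        const char *env = std::getenv("PERF_PID");
        return env ? static_cast<pid_t>(std::atoi(env)) : pid_t(0);
    }();

    // Direct connection: frameSwapped is emitted on the render thread, and we
    // want the timestamp taken there, not after a queued hop to the GUI thread.
    QObject::connect(window, &QQuickWindow::frameSwapped, window, [] {
        // At 60 Hz the frame budget is ~16.7 ms; treat anything over 25 ms
        // (an illustrative cutoff) as a stutter worth capturing.
        if (sinceLastSwap.isValid() && perfPid > 0 &&
            sinceLastSwap.nsecsElapsed() > 25 * 1000 * 1000)
        {
            kill(perfPid, SIGUSR2);  // perf writes perf.data.<timestamp>
        }
        sinceLastSwap.restart();
    }, Qt::DirectConnection);
}
```

Each stutter then leaves behind a `perf.data.<timestamp>` file whose samples cover only the window around the dropped frame. Getting perf's PID into the app is the clunky part of this sketch; if your perf is new enough, an alternative is `perf record --control=fifo:...`, where the app itself writes `enable`/`disable` into the control FIFO around suspect frames.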

Related

CaliEventHandlerDelegateProxy leaking

We have a problem on our website: seemingly at random (anywhere from every day or so to once every 7-10 days), the website becomes unresponsive.
We have two web servers on Azure, and we use Redis.
I've managed to run dotMemory and catch it when it crashes, and what I observe is that under "Event handlers leak" two items increase in count into the thousands before the website stops working. Those two items are CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy. Once the site crashes, we get lots of Redis exceptions saying it can't connect to the Redis server. According to the Azure Portal, our Redis server load never goes above 10% at peak times, and we're following all best practices.
I've spent a long time going through our website ensuring that there are no obvious memory leaks, and have patched a few cases that went under the radar. Anecdotally, these fixes seem to have improved the website's stability a little. Things we've checked:
* All IDisposable objects are now wrapped in using blocks (we did this strictly before, but we did find a few not disposed properly)
* Event handlers are unsubscribed - there are very few in our code base
* We use WebUserControls pretty heavily. Each one had the current master page passed in as a parameter. We've removed this dependency, as we thought it might prevent the GC from collecting the page
Our latest issue: the web server runs fine, but when we run dotMemory and attach it to the w3wp.exe process, it causes the CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy event leaks to increase rapidly until the site crashes! So the crash is reproducible just by running dotMemory.
I'm at a loss now. I believe I've exhausted all possibilities of memory leaks in our code base, and our "solution" is to have the app pools recycle every few hours to be on the safe side.
We've even tried upgrading Redis to the Premium tier, and upgraded all drives on the web servers to SSDs to see if it would help, which it doesn't appear to.
Can anyone shed any light on what might be causing these issues?
All IDisposable objects are now wrapped in using blocks (we did this strictly before but we did find a few not disposed properly)
We can't say a lot about the crash without more information about it, but I have some speculations.
I see 10,000 (!) undisposed objects being handled by the finalization queue. Let's start with them: find all of them and add Dispose calls in your app.
I would also recommend checking how many system handles your application is using. There is an OS limit on the number of handles, and once it is exceeded no more file handles, network sockets, etc. can be created. I recommend this especially given the number of undisposed objects.
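If you'd rather log this over time than eyeball Task Manager or Process Explorer, a minimal native sketch is below; `GetProcessHandleCount` is the Win32 API that reports the count, and pointing it at the w3wp PID on a 5-second poll is an illustrative choice:

```cpp
// Hypothetical watcher: poll the handle count of a process (e.g. w3wp.exe)
// and log it, so you can see whether handles climb without bound.
// Build as a small console tool; the target PID is passed on the command line.
#include <windows.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char **argv)
{
    if (argc < 2) { std::fprintf(stderr, "usage: %s <pid>\n", argv[0]); return 1; }
    DWORD pid = std::strtoul(argv[1], nullptr, 10);
    HANDLE proc = OpenProcess(PROCESS_QUERY_LIMITED_INFORMATION, FALSE, pid);
    if (!proc) { std::fprintf(stderr, "OpenProcess failed: %lu\n", GetLastError()); return 1; }

    for (;;) {
        DWORD handles = 0;
        if (GetProcessHandleCount(proc, &handles))
            std::printf("%lu handles\n", handles);
        Sleep(5000);  // sample every 5 seconds
    }
}
```

A count that only ever grows between app-pool recycles is a strong hint that the undisposed objects are pinning OS handles.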
Also, if you are hitting timeouts accessing Redis, get a performance profiler and find out why. I recommend JetBrains dotTrace in TIMELINE mode: it will show thread sleeping, thread contention, and much more information that will help you find the root of the problem. You can use the command-line tool to collect the profile data, so you don't have to install the GUI application on the server.
it causes the CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy event leaks to increase rapidly
dotMemory doesn't change your application code and doesn't allocate any managed objects in the profiled process. The Microsoft Profiling API injects a DLL (written in C++) into the profiled process; it's the part of dotMemory named the Profiling Core, which plays the role of the "server" (the standalone dotMemory application, written in C#, is the client). The Profiling Core does some work on the gathered data before sending it to the client side; this requires some memory, which is allocated, of course, in the address space of the profiled process, but it doesn't affect managed memory.
Memory profiling may affect the performance of your application. For example, the profiling API disables concurrent GC while the application is being profiled, and collecting memory allocation data can significantly slow your application down.
Why do you think that CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy are allocated only under dotMemory profiling? Could you please describe how you explored this?
Event handlers are unsubscribed - there are very few in our code base
When dotMemory reports an event handler as a leak, it means there is only one reference to it - from the event source - and there is no longer any way to unsubscribe from that event. Check all these leaks, find yours, and look at the code to see how it happened. In any case, only 110.3 KB is retained by these objects - why did you decide your site crashed because of them?
I'm at a loss now, I believe I've exhausted all possibilities of memory leaks in our code base
Take several snapshots over a period when memory consumption is growing, open a full comparison of some of these snapshots, look at all the survived objects that should not have survived, and find out why they survived. This is the only way to prove that your app doesn't have a memory leak; reading the code doesn't prove it, sorry.
I hope that if you perform all the activities I've recommended (performance profiling; investigating full snapshots and snapshot comparisons, not only the inspections view; checking why there is a huge number of undisposed objects), you will find and fix the root problem.

Determining what is putting pressure on IIS

I've got a dedicated server running both IIS 7.5 and SQL Server 2010. Server CPU load is often near 100%. The SQL Server process does not account for much of it, but the w3wp process is taking a significant amount of CPU (often 70%+).
I'd like to find out, what is causing this pressure:
* Too many requests for static files (a CDN could be added)
* Too many ajax requests (I am thinking about comet/web sockets anyway)
* Single asp.net pages consuming too much processing power (should be easy to optimize)
Where would you start looking to find out where to start optimizing?
The easiest possible way is to profile the app in production. Not sure if that is possible in your case. Some options:
* Look at the logs and check the duration of requests; long requests are likely to be putting load on the system (see the sketch after this list for pulling the slow ones out of an IIS log)
* Remote-debug w3wp with Visual Studio and pause the debugger 10 times to see where it stops most often. That is the hot spot
* Use XPerf or PerfView to capture (managed) stacks. This has almost no impact on production performance
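For the log-based option, here's a minimal sketch, assuming the site logs in W3C format with the optional `time-taken` field (milliseconds) enabled; the 2-second threshold and file name are illustrative. The `#Fields:` header line in the log names the columns, so the sketch locates `time-taken` from it rather than hard-coding a position:

```cpp
// Hypothetical sketch: scan an IIS W3C log and print every request whose
// time-taken exceeds a threshold.
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <cstdlib>

int main(int argc, char **argv)
{
    std::ifstream log(argc > 1 ? argv[1] : "u_ex.log");
    const long thresholdMs = 2000;          // flag requests slower than 2 s
    int timeTakenCol = -1;
    std::string line;

    while (std::getline(log, line)) {
        if (line.rfind("#Fields:", 0) == 0) {   // header line names the columns
            std::istringstream ss(line.substr(8));
            std::string field;
            for (int i = 0; ss >> field; ++i)
                if (field == "time-taken") timeTakenCol = i;
        } else if (line.empty() || line[0] == '#' || timeTakenCol < 0) {
            continue;                            // comments, or field not logged
        } else {
            std::istringstream ss(line);
            std::vector<std::string> cols;
            std::string col;
            while (ss >> col) cols.push_back(col);
            if ((int)cols.size() > timeTakenCol &&
                std::atol(cols[timeTakenCol].c_str()) > thresholdMs)
                std::cout << line << '\n';
        }
    }
}
```

Sorting that output by URL quickly shows whether a handful of pages or ajax endpoints account for most of the slow requests.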
A good starting point would be to fire up the development tools (F12 in IE / Chrome) and look at the timings under the network tab. That will show you a waterfall-style diagram of how the page loads and should help you identify any particularly slow-loading static files which might sensibly be moved off to a CDN, any unnecessary requests being made, how much time is being spent getting the actual page itself, etc.
After that, profile the application with a performance profiler. A good profiler like ANTS Performance Profiler will let you look at things like execution time / hit counts for different methods, as well as what database queries are being run and how long they're taking. A new version of ANTS (currently in EAP) will also group that activity by HTTP request, so you can see if specific pages need optimisation or are being hit too many times.
You'd also do well to check that caching is working as you intend, so that users aren't unnecessarily re-requesting pages.
There's also a nice article on ASP.NET performance which you might want to read at http://aspalliance.com/1533_ASPNET_Performance_Tips.7.
Disclaimer: I work for Red Gate which makes ANTS.
I found an easy way to see what's going on on the server.
Nevertheless, the professional way is probably to go and use a profiling tool.
What did I do?
In the IIS console you can get a list of all current worker processes, and if you choose one you can see which requests it is working on. So I was able to see that the worker process was handling 100 requests in parallel, 70 of which traced back to the same ajax call.
The immediate solution was to reduce the frequency of that call (from every 10 seconds to every 30 seconds). The next step will be to further optimize the call on the server side, since I have other ajax calls with the same frequency (every 10 seconds) which almost never showed up in the active requests list because they complete so quickly.
Probably the easiest way to figure it out would be to install New Relic on the server. The trial lasts 30 days I think so it should give you enough time to get to the bottom of this. It'll show you long-running SQL queries, .NET methods, as well as just about everything else you can think of. It makes it very easy to identify bottlenecks.
By the way, I suggested New Relic because it sounds like your problem is in a production environment. New Relic isn't an incredibly detailed profiler. It gathers enough information to be helpful, but not so much as to slow down the server. That makes it well suited to this purpose.
If, however, you could reproduce the problem in a development environment you might try something like the free Eqatec profiler.

ASP.Net application getting slow over time

Our users of a legacy web application written in ASP.NET 2.0 are experiencing slowdowns after using the application for some time (2 hours or so).
For example, clicking a button that performs some calculation and updates an ajax UpdatePanel takes several seconds, while it is almost instant when they have just logged in.
If they log off and log back on to the web application, it returns to normal speed - every time. The slowdown is per-user, not global: the app can be fast for one user and slow for the user sitting next to him.
I've looked at the IIS server and there is plenty of free memory in it and the CPU is almost always below 10%.
There are only a dozen users using this web application at a time, so scalability shouldn't be an issue.
The only thing I can think of that would be affected by logging off is session state.
I've already looked at ViewState too, and it is "only" 18 KB at its slowest, which shouldn't be the bottleneck, especially since the users are on the same gigabit LAN and not remote. It is not any smaller when the application runs fast.
Looking at session state in the code, it isn't used much and no large objects are stored.
According to my tests, it didn't grow bigger than 10 items and 500 bytes after a few minutes.
Even if there were a leak, the server should be able to cope with it considering the amount of memory it has (4 GB, of which only 300 MB is used).
I've cleared temporary files and it didn't seem to make any significant difference.
Do you think these slowdowns could come from something other than session state, keeping in mind that logging off and back on restores normal speed?
If not, is there any tool you would suggest for profiling session state, or a setting in IIS I should try changing?
What does the browser's memory use look like on the client machines when they experience the slowdown? It could be memory leaks in your JavaScript: if you are using lots of ajax, the browser's memory use can climb continuously.
Have you checked your database? (I assume you're consuming some data with an ajax app)
If you don't close the connection in the finally block, you can wind up with zombie database connections that slow the app down. I don't know if this is the root cause, but it's probably worth looking at.
If this is an AJAX-heavy page and the user is staying on the same page (i.e. lots of postbacks to itself), then it's possible that ViewState is simply growing too large, and the slowdown you're experiencing is due to page size.
I would try to recreate the experience locally and monitor the page size/traffic with something like Firebug/Fiddler. If it is a ViewState/page-size issue, you could look at moving ViewState to the server side.
See this for more info on it:
http://professionalaspnet.com/archive/2006/12/09/Move-the-ViewState-to-Session-and-eliminate-page-bloat.aspx

Network delays & computer freezing

Is anyone aware of an application I might be able to use to record or data-log network delays? Or failing that, would it be feasible to write such a program?
I work for a big corporation which has recently deployed a remote file management platform that is causing severe productivity issues for staff in our branch. We work directly off a server, and every time a file is saved now there is a significant delay (generally between 5 and 15 seconds, but sometimes timing out altogether). Everything is just extremely unresponsive and slow, and it makes people avoid saving files as often, so quite frequently, when crashes occur, a fair bit of work is lost.
And these delays don't only occur on save operations; they also occur when navigating the network file structure. A 2-3 second pause / outage between each folder hop is incredibly frustrating, and it amounts to a lot of time when you add it all up.
When these delays occur, they freeze out the rest of the system: clicking anywhere on screen or on another application does nothing until the delay has run its course.
What I am trying to do is have some sort of data logger running which records the cumulative duration of these outages. The idea is to run it for a while and then take the issue higher, with evidence of the percentage of time lost to this problem.
I suspect this percentage will surprise managers. They appear to be burying their heads in the sand and pretending it only takes away a couple of minutes a day. By my rough estimates, we should be talking hours lost per day (per employee), not minutes. :/
Any advice would be greatly appreciated.
You could check whether the process is responding using C#. Just modify the answer from the other question to check for the name of the application (or the process ID, if that's possible with the C# API) and sum up the waiting times.
This might be a bit imprecise, because Windows gives a process a grace period before declaring it "not responding", but depending on the waiting times it might be enough to make your point.
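For reference, a minimal native sketch of the same idea - `IsHungAppWindow()` is the Win32 check that .NET's `Process.Responding` wraps. The window title below is a made-up example, and as noted above this undercounts because of the grace period (Windows marks a window hung only after it has stopped pumping messages for about five seconds):

```cpp
// Hypothetical logger: poll the target application's main window and
// accumulate the time it spends hung ("not responding").
#include <windows.h>
#include <cstdio>

int main()
{
    // Assumed window title of the file-management client; adjust to taste.
    HWND target = FindWindowA(nullptr, "Remote File Manager");
    if (!target) { std::fprintf(stderr, "window not found\n"); return 1; }

    DWORD hungMs = 0;
    const DWORD pollMs = 250;
    for (;;) {
        if (IsHungAppWindow(target)) {
            hungMs += pollMs;
            std::printf("unresponsive so far: %lu.%03lu s\n",
                        hungMs / 1000, hungMs % 1000);
        }
        Sleep(pollMs);
    }
}
```

Left running for a working day, the final total gives exactly the "cumulative lost time" figure you want to put in front of management.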
I'm not sure how your remote file management platform works. In the easiest scenario, where you can access the files on the platform directly, you could just write a simple script that opens files, navigates the file system, and lists directories/files, recording how long each operation takes.
If it's something more intricate, I would suggest using a tool like Wireshark to capture the network traffic, filter out the relevant packets, and do some analysis of their delays. I'm not sure if this can be done directly in Wireshark; otherwise I'd suggest exporting it as a spreadsheet CSV and doing your analysis on that.
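If you go the spreadsheet route, the analysis itself is tiny. A toy sketch under these assumptions: you've filtered the capture to the file-server conversation and exported one inter-packet gap per line, in seconds (tshark can emit `frame.time_delta` via `-T fields`), and any gap over 2 seconds (an arbitrary cutoff) counts as a stall:

```cpp
// Hypothetical sketch: total up every inter-packet gap over a threshold
// from a one-column CSV of delays in seconds.
#include <fstream>
#include <iostream>
#include <string>
#include <cstdlib>

int main(int argc, char **argv)
{
    std::ifstream csv(argc > 1 ? argv[1] : "delays.csv");
    const double thresholdSec = 2.0;   // anything longer counts as a stall
    double lostSec = 0.0;
    int stalls = 0;
    std::string line;

    while (std::getline(csv, line)) {
        double delta = std::strtod(line.c_str(), nullptr);
        if (delta > thresholdSec) { lostSec += delta; ++stalls; }
    }
    std::cout << stalls << " stalls, " << lostSec << " seconds lost\n";
}
```

Divide the seconds lost by the capture duration and you have the "percentage of lost time" figure the original poster wants as evidence.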

Profiling CPU usage of ASP.NET web app - but not the database!

I'm seeing consistently high CPU usage for my ASP.NET web application (on the live production box only, naturally....!) and I'm trying to narrow down the cause - it's basically maxing out a quad core Xeon box and there's no way it should be able to do that!
The CPU usage of the web process is generally higher than that of the DB process - which rings alarm bells for me on its own (?).
However, the standard profiling tools (dotTrace, Red Gate etc.) only show the time spent in individual methods (rather than actual CPU usage) - and ultimately they still highlight methods that are DB-bound. While this might indicate opportunities for caching or better indexes, I don't see how that in itself would result in high CPU usage in the web application process.
Any suggestions or tips as to how I can narrow this down?
Thanks!
Some suggestions to try in the first place:
1. Deploy with a Release Build
Check whether the deployed product is in release mode. Running in debug mode wastes a lot of time loading the PDBs along with the assemblies.
2. Disable ViewState
Disable ViewState if it's not required. ViewState is nothing but data stored in hidden fields to be persisted between requests. It increases the total payload of the page, both when served and when requested. There is also additional overhead incurred when serializing or deserializing ViewState data that is posted back to the server. Lastly, ViewState increases the memory allocations on the server.
3. Disable Session State
If you are not going to use it, disable session state; it's on by default. You can turn it off for specific pages or for the whole application.
For basic ASP.NET application performance monitoring, check these two MSDN articles: "Monitoring ASP.NET Application Performance" and "Performance Counters for ASP.NET".
Can you set up some unit tests to call various methods and see what their impact is on processor usage? Visual Studio has some testing tools built in if you're using Team System, but even if you're not, you could write a multithreaded tester to call particular functions hundreds of times.
If you'd like some pointers on how to do this, I can help you build some basic unit testing.
Are you recording/reporting unhandled exceptions? If not, do so and check whether any of them correspond with your high CPU spikes - you may have a stack overflow causing them.
http://msdn.microsoft.com/en-us/library/ms998306.aspx
You could also look into recording the time taken by each request using an HttpModule and checking which requests take the most time, which may point to the pages causing the issue.
As Pradeepno notes, the place to start is really performance counters - they can give you a very good idea of what is consuming which part of the CPU.
The web app's usage being higher than the DB's isn't entirely surprising. If you have a decent DB design, most web apps are barely going to cause a decently powered DB server to break a sweat.
