I'm developing an app for a client and one of his devices (2nd gen iTouch on iOS4) is having issues starting the application. I've run a few allocation/leak tests and I concluded that there isn't anything wrong with my app's code. I noticed that there is an allocation spike at startup and I concluded that it's because of dyld which is dynamically linking the libraries on start up. How would I go about pre-binding the application in xcode4?
OS X forum seemed to be extremely non informative in that they assume you'd be able to find it. :/
Any help would be appreciated.
Thanks!
(I also wish I could make a new tag for "prebinding")
According to Apple, you shouldn't need to prebind your iOS applications. If you are getting big allocation spikes, I'm guessing it's due to your app's architecture rather than the underlying OS itself.
The memory allocated by dyld should pale in insignificance compared to even the most basic allocations made by the earliest stages of the runtime. The Objective-C runtime and other system frameworks/libraries allocate a bunch of internal structures that are required for things to work correctly.
For instance, a quick test of an app that does nothing in main but make one call to NSLog(#"FooBar"); and then sleep (i.e. never even spools up UIApplication) performed 373 allocations for a total of 52K living.
To take it a step further, if you actually start up UIKit, like this...
UIApplicationMain(argc, argv, nil, NSStringFromClass([MyAppDelegate class]));
... you'll see ~600K in ~7800 living allocations once the app reaches quiescent state. This is all unavoidable stuff. No amount of prebinding will save you this. I suggest not worrying about it.
If you're seeing orders of magnitude more memory being allocated then, as Nik Reiman said, it's your application. At the end of the day, the memory allocated by the dynamic linker is totally insignificant.
Related
We have a problem on our website, seemingly at random (every day or so, up to once every 7-10 days) the website will become unresponsive.
We have two web servers on Azure, and we use Redis.
I've managed to run DotNetMemory and caught it when it crashes, and what I observe is under Event handlers leak two items seem to increase in count into the thousands before the website stops working. Those two items are CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy. Once the site crashes, we get lots of Redis exceptions that it can't connect to the Redis server. According to Azure Portal, our Redis server load never goes above 10% in peak times and we're following all best practises.
I've spent a long time going through our website ensuring that there are no obvious memory leaks, and have patched a few cases that went under the radar. Anecdotally, these seem to of improved the website stability a little. Things we've checked:
All iDisposable objects are now wrapped in using blocks (we did this strictly before but we did find a few not disposed properly)
Event handlers are unsubscribed - there are very few in our code base
We use WebUserControls pretty heavily. Each one had the current master page passed in as a parameter. We've removed the dependency on this as we thought it could cause GC to not collect the page perhaps
Our latest issue is that when the web server runs fine, but then we run DotNetMemory and attach it to the w3wp.exe process it causes the CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy event leaks to increase rapidly until the site crashes! So the crash is reproducible just by running DotNetMemory. Here is a screenshot of what we saw:
I'm at a loss now, I believe I've exhausted all possibilities of memory leaks in our code base, and our "solution" is to have the app pools recycle every several hours to be on the safe side.
We've even tried upgrading Redis to the Premium tier, and even upgraded all drives on the webservers to SSDs to see if it helps things which it doesn't appear to.
Can anyone shed any light on what might be causing these issues?
All iDisposable objects are now wrapped in using blocks (we did this
strictly before but we did find a few not disposed properly)
We can't say a lot about crash without any information about it, but I have some speculations about it.
I see 10 000 (!) not disposed objects handled by finalization queue. Let start with them, find all of them and add Dispose call in your app.
Also I would recommend to check how many system handles utilized by your application. There is an OS limit on number of handles and if they are exceeded no more file handles, network sockets, etc can be created. I recommend it especially since the number of not disposed objects.
Also if you have a timeout on accessing Redis get performance profiler and look why so. I recommend to get JetBrains dotTrace and use TIMELINE mode to get a profile of your app, it will show thread sleeping, threads contention and many many more information what will help you to find a problem root. You can use command line tool to obtain profile data, in order not to install GUI application on the server side.
it causes the CaliEventHandlerDelegateProxy and
ArglessEventHandlerProxy event leaks to increase rapidly
dotMemory doesn't change your application code and doesn't allocate any managed objects in profiled process. Microsoft Profiling API injects a dll (written in c++) into the profiling process, it's a part of dotMemory, named Profilng Core, playing the role of the "server" (where standalone dotMemory written in C# is a client). Profiling Core doing some work with gathered data before sending it to the client side, it requires some memory, which allocated, of course, in the address space of the profiling process but it doesn't affect managed memory.
Memory profiling may affect performance of your application. For example, profiling API disables concurrent GC when application is under profiling or memory allocation data collecting can significantly slow down your application.
Why do you thing that CaliEventHandlerDelegateProxy and ArglessEventHandlerProxy are allocated only under dotMemory profiling? Could you please describe how you explored this?
Event handlers are unsubscribed - there are very few in our code base
dotMemory reports an event handler as a leak means there is only one reference to it - from event source at there is no possibility to unsubscribe from this event. Check all these leaks, find yours at look at the code how it is happened. Anyway, there are only 110.3 KB retained by these objects, why do you decide your site crashed because of them?
I'm at a loss now, I believe I've exhausted all possibilities of memory leaks in our code base
Take several snapshots in a period of time when memory consumption is growing, open full comparison of some of these snapshots and look at all survived objects which should not survive and find why they survived. This is the only way to prove that your app doesn't have memory leak, looking the code doesn't prove it, sorry.
Hope if you perform all the activities I recommend you to do (performance profiling, full snapshots and snapshots comparison investigation, not only inspections view, checking why there are huge amount of not disposed objects) you will find and fix the root problem.
I will start with TL;DR version as this may be enough for some of you:
We are trying to investigate an issue that we see in diagnostic data of our C++ product.
The issue was pinpointed to be caused by timeout on sqlite3_open_v2 which supposedly takes over 60s to complete (we only give it 60s).
We tried multiple different configurations, but never were able to reproduce even 5s delay on this call.
So the question is if maybe there are some known scenarios in which sqlite3_open_v2 can take that long (on windows)?
Now to the details:
We are using version 3.10.2 of SQLite. We went through changelogs from this version till now and nothing we've found in the bugfixes section seems to suggest that there was some issue that was addressed in consecutive SQLite releases and may have caused our problem.
The issue we see affects around 0.1% unique user across all supported versions of windows (Win 7, Win 8, Win 10). There are no manual user complaints/reports about that - this can suggest that a problem happens in the context where something serious enough is happening with the user machine/system that he doesn't expect anything to work. So something that indicates system-wide failure is a valid possibility as long as it can possibly happen for 0.1% of random windows users.
There are no data indicating that the same issue ever occurred on Mac which is also supported platform with large enough sample of diagnostic data.
We are using Poco (https://github.com/pocoproject/poco, version: 1.7.2) as a tool for accessing our SQLite database, but we've analyzed the Poco code and it seems that failure on this code level can only (possibly) explain ~1% of all collected samples. This is how we've determined that the problem lies in sqlite3_open_v2 taking a long time.
This happens on both DELETE journal mode as well as on WAL.
It seems like after this problem happens the first time for a particular user each consecutive call to sqlite3_open_v2 takes that long until the user restarts whole application (possibly machine, no way to tell from our data).
We are using following flags setup for sqlite3_open_v2 (as in Poco):
sqlite3_open_v2(..., ..., SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE | SQLITE_OPEN_URI, NULL);
This usually doesn't happen on startup of the application so it's not likely to be caused by something happening while our application is not running. This includes power cuts offs causing data destruction (which tends to return SQLITE_CORRUPT anyway, as mentioned in https://www.sqlite.org/howtocorrupt.html).
We were never able to reproduce this issue locally even though we tried different things:
Multiple threads writing and reading from DB with synchronization required by a particular journaling system.
Keeping SQLite connection open for a long time and working on DB normally in a meanwhile.
Trying to hit HDD hard with other data (dumping /dev/rand (WSL) to multiple files from different processes while accessing DB normally).
Trying to force antivirus software to scan DB on every file access (tested with Avast with basically everything enabled including "scan on open" and "scan on write").
Breaking our internal synchronization required by particular journaling systems.
Calling WinAPI CreateFile with all possible combinations of file sharing options on DB file - this caused issues but sqlite3_open_v2 always returned fast - just with an error.
Calling WinAPI LockFile on random parts of DB file which is btw. nice way of reproducing SQLITE_IOERR, but no luck with reproducing the discussed issue.
Some additional attempts to actually stretch the Poco layer and double-check if our static analysis of codes is right.
We've tried to look for similar issues online but anything somewhat relevant we've found was here sqlite3-open-v2-performance-degrades-as-number-of-opens-increase. This doesn't seem to explain our case though, as the numbers of parallel connections are way beyond what we have as well as what would typical windows users have (unless there is some somewhat popular app exploiting SQLite which we don't know about).
It's very unlikely that this issue is caused by db being accessed through network share as we are putting the DB file inside %appdata% unless there is some pretty standard windows configuration which sets %appdata% to be a remote share.
Do you have any ideas what can cause that issue?
Maybe some hints on what else should we check or what additional diagnostic data that we can collect from users would be useful to pinpoint the real reason why that happens?
Thanks in advance
Is anyone aware of an application I might be able to use to record or data-log network delays? Or failing that, would it be feasible to write such a program?
I work for a big corporation who have recently deployed a remote file management platform which is causing severe productivity issues for staff in our branch. We work direct to a server, and every time a file is saved now, there is a significant delay (generally between 5-15 seconds, but sometimes timing out all together). Everything is just extremely unresponsive & slow, and it makes people avoid saving files so often, so quite frequently, when crashes occur, quite a bit of work is lost.
And these delays don't only occur on save operations. They also occur on navigating the network file structure. 2-3 seconds pause / outage between each folder-hop is incredibly frustrating, and adds up to a lot of time when you add it all up.
So when these delays occurs, it freezes out the rest of the system. Clicking anywhere on screen or on another application does nothing until the delay has passed its course.
What I am trying to do is to have some sort of data-logger running which records the cumulative duration of these outages. The idea is to use it for a bit, and then take the issue higher with evidence of the percentage of lost time due to this problem.
I suspect this percentage to be a surprising one to managers. They appear to be holding their heads in the sand and pretending like it only takes away a couple of minutes a day. Going by my rough estimates, we should be talking hours lost per day (per employee), not minutes. :/
Any advice would be greatly appreciated.
You could check if the process is responding using C#. Just modify the answer from the other question to check for the name of the application (or the process id, if possible with that C# API) and sum up the waiting times.
This might be a bit imprecise, because Windows will give a process a grace period until it declares it "not responding", but depending on the waiting times might be enough to make your point.
I'm not sure how your remote file management platform works. In the easiest scenario where you can access files directly on the platform you could just create a simple script that opens files, navigates the file system and lists containing dirs/files.
In case it's something more intricate, I would suggest to use a tool like wireshark to capture the network traffice, filter out the relavant packets and do some analysis on their delay. I'm not sure if this can be done directly in wireshark otherwise I'd suggest to export it as a spreadsheet csv and do you analysis on that.
As i was working on my program, i noticed that, upon profiling, instruments was nice enough to point me to a Zombie object when it saw one. Does the fact that this message does not show up indicative of the fact that app contains no zombie processes?
Is there a way i can confirm that app contains no references to Zombie processes?
In my question, i am explicitly mentioning Xcode4, as i have not seen automatic Zombie behavior in 3 and suspect it may be a new feature.
No zombie messages showing up is a good sign. It means you weren't accessing any freed objects while Instruments was tracing. There's no way for Instruments to confirm that your application never accesses a freed object. All Instruments can do is tell you when you access a freed object.
Regarding automatic zombie behavior, detecting zombies is not new behavior in Xcode 4. Instruments has a Zombies template in both Xcode 3.2 and 4 that detects zombies. You can also configure the Allocations instrument to detect zombies by clicking the Info button next to the instrument, which the zombie message is blocking in your screenshot.
I'm seeing consistently high CPU usage for my ASP.NET web application (on the live production box only, naturally....!) and I'm trying to narrow down the cause - it's basically maxing out a quad core Xeon box and there's no way it should be able to do that!
The CPU usage of the web process is generally higher than that of the DB process - which rings alarm bells to me on its own (?).
However, using the standard profiling tools (dotTrace, Red Gate etc) only show you the time spent in individual methods (rather than actual CPU usage) - and ultimately still highlight methods that are DB-bound. While this might indicate opportunities for caching or better indexes, I don't see how that in itself would result in high CPU usage of the web application process?
Any suggestions or tips as to how I can narrow this down?
Thanks!
Some suggestions to try at the first place.
1.Deploy with Release Build
Check whether the deployed product is in release mode. By running in debug mode, lot of time is wasted loading the pdbs along with the assemblies.
2.Disable ViewState
Disable viewstate if its not required. ViewState is nothing but data stored in hidden fields to be persisted between requests. it increases the total payload of the page both when served and when requested. There is also an additional overhead incurred when serializing or deserializing view state data that is posted back to the server. Lastly, view state increases the memory allocations on the server.
3.Disable Session State:
If you are not going to use it disable Session State. By default it’s on. You can actually turn this off for specific pages or for the whole application.
There are some basic ASP.NET application performance monitoring, check these two MSDN articles
"Monitoring ASP.NET Application Performance" and Performance Counters for ASP.NET
Can you set up some unit tests to call various methods and see what their impact is on processor usage? Visual Studio has some testing tools built in if you're using Team System, but even if you're not, you could write a multithreaded tester to call particular functions hundreds of times.
If you'd like some pointers on how to do this, I can help you build some basic unit testing.
are you recording/reporting unhandled exception? If not do so and check if any of them correspond with your high CPD spikes you may have a stack overflow causing the spikes.
http://msdn.microsoft.com/en-us/library/ms998306.aspx
You could also look into recoding the time of each request by using a HttpModule and checking which requests are taking up the most time which may indicate the pages that are causing the issue.
As Pradeepno notes, the place to start with is really performance counters--they can give you a very good idea of what is consuming what part of the CPU.
The web app usage being higher than DB usage isn't entirely suprising. If you have decent db design, most web apps are barely going to cause a decently powered DB server to break a sweat.