WinDBG - Analyse dump file on local PC - asp.net

I have created a memory dump of an ASP.NET process on a server using the following command: .dump /ma mydump.dmp. I am trying to identify a memory leak.
I want to look at the dump file in more detail on my local development PC. I read somewhere that it is advisable to debug on the same machine as you create the dump file. However, I have also read that some developers do analyse the dump file on their local development PC's. What is the best approach?
I notice that when I create a dump file using the command above the W3WP process memory increases by about 1.5 times. Why this this? I suppose this should be avoided on a live server.

Analyzing on the same machine can save you from SOS loading issues thereafter. Unless you are familiar with WinDbg and SOS, you will find it confusing and frustrating then.
If you have to use another machine for analysis, make sure you read carefully this blog post, http://blogs.msdn.com/b/dougste/archive/2009/02/18/failed-to-load-data-access-dll-0x80004005-or-what-is-mscordacwks-dll.aspx as it shows you how to copy the necessary files from the source machine (where the dump is captured) to the target machine (the one you launch WinDbg).
For your second question, as you use WinDbg to attach to the process directly, and use .dump command to capture the dump, the target process unfortunately is modified. Not easy to explain in a few words. The recommended way is to use ADPlus.exe or Debug Diag. Even procdump from SysInternals is better. Those tools are designed for dump capture and they have minimal impact on the target processes.
For memory leak from unmanaged libraries, you should use memory leak rule of Debug Diag. for managed memory leak, you can simply capture hang dumps when memory usage is high.

I am no expert on WinDBG but I once had to analyse a dump file on my ASP.NET site to find a StackOverflowException.
While I got a dump file of my live site (I had no choice since that was what was failing), originally I tried to analyse that dump file on my local dev PC but ran into problems when trying to load the CLR data from it. The reason being that the exact version of the .NET framework differed between my dev PC and the server - both were .NET 4 but I imagine my dev PC had some cumulative updates installed that the server did not. The SOS module simply refused to load because of this discrepancy. I actually wrote a blog post about my findings.
So to answer part of your question it may be that you have no choice but to run WinDBG from your server, at least you can be sure that the dump file will match your environment.

It is not necessary to debug on the actual machine unless the problem is difficult to manifest on your development machine.
So long as you have the pdbs with the private symbols then the symbols should be resolved and call stacks correctly displayed and the correct version of .NET installed.
In terms of looking at memory leaks you should enable Gflags user stack trace and take memory dumps at 2 intervals so you can compare the memory usage before and after the action that provokes the memory leak, remember to disable gflags afterwards!
You could also run DebugDiag on the server which has automated memory pressure analysis scripts that will work with .Net leaks.

Related

I'm getting OutOfMemoryExceptions, but my trace file is much smaller than my available memory

As the title says, I've got tons of free memory, but I keep getting OutOfMemoryExceptions when processing traces and calling properties on Data Sources. Why is this happening?
The ETL file format is designed to be very space-efficient, and also supports optional compression. Due to these factors taking the data from an .etl file and transforming it into our more useful structures can often require significantly more memory than the original size of the file. However there are two steps that can be taken to make OutOfMemoryExceptions less likely:
Don't use data sources you don't need. Even if none of the properties on a data source are called by your code, simply turning it on by calling its Use method will result in the data source processing events and preparing data for consumption.
Ensure your program is running as a 64-bit process. The default Visual Studio C# project settings are to compile your program targeting AnyCPU, but to prefer running it as a 32-bit process. Unchecking the "Prefer 32-bit" option in your project's Build properties or switching your project's build configuration to x64 will cause your program to run as a 64-bit process.

Difference between Jstack and gcore when generating Core Dumps?

We all know that Core Dumps are an essential diagnostic tools for analysing various processes in Unix . I know both jstack and gcore are both used for generating Javacore files or Core Dumps but I have a doubt that Gcore is mainly used for Processes and Jstack is used for threads .
As from an Operating System perspective Process and Threads though interrelated (Process comprises of Threads only) they are relatively different from each other w.r.t memory/speed/execution . So is that gcore will diagnose the process and jstack will analyse the threads in that process ???
GCore act at OS level and you got a dump of native code that is currently running. From a java point of view, it is not really understandable.
JStack get you the stack trace at VM level (the java stack) of all thread your application have. You can find from that what is the real java code executed at a point.
Clearly, GCore is almost never used (too low level, native code...). Only really strange issue with native library or stuff like that will perhaps need this kind of tool.
There is also jmap that can generate a hprof file which is the heap data from you VM. A tool like 'Memory Analyser Tool' can open the hprof, and you can drill down on what was going on (on memory side).
If your VM crash because of a OutOfMemory, you can also set parameter to get the hprof when the event occurs. It helps to understand why (too many users, A DB query that fetch too much data...)
Last thing is the fact that you can add a debug option when your start your VM, so that you can connect to it, and put debug on running process. It can help if you have some strange issue that you are not able to reproduce in your local environment.

OutofMemory error with ibm jdk 1.7

I am using IBM jdk 1.7(to support TLS cyphers) for an struts based application deployed with embedded tomcat.
We are running with memory leaks(OOM) that generated almost 30 gigs of dumps.This has become a rotine event.
We have tried increasing the heap mem by including
wrapper.java.additional.1="-XX:MaxPermSize=256m -Xss2048k" in the wrapper.conf.
But this didnt help much.
Try using Memory Analyzer, you can follow the instructions here to download and install it:
https://www.ibm.com/developerworks/java/jdk/tools/memoryanalyzer/
It should provide an overview of your heap usage.
I'd recommend starting with the dominator tree view to see which objects are responsible for keeping data alive on the heap. You can also run various reports which analyse the heap for you.
You should have core files (.dmp) and heap dumps (.phd), the core files will be large but may be faster to access and will also contain all the values for basic types in objects and strings. The phd files just contain object sizes and the connections between them. It may be easier to relate what you are seeing back to your code if you start with the core file.

windbg cant load sos clr

I'm not sure that windbg is the right tool, but that's what I'm trying now
my asp.net app seems to have a memory leak, it keeps on growing by about 3 MB almost every time a page loads (then it goes back down...)
I want to read the entire process memory and see exactly whats being stored that is unnecessary.
So I run windbg, attach to the webserver40.exe process
then I try
.loadby sos clr
and I get
The call to LoadLibrary(C:\Windows\Microsoft.NET\Framework\v4.0.30319\sos) failed, Win32 error 0n193
"%1 is not a valid Win32 application."
Please check your debugger configuration and/or network access.
It seems that I have this sos.dll in Framework AND Framework64
I tried both using
.load C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos
but nothing loads
I don't understand why its looking for a vaild 32bit app. im on a 64bit pc with 64bit windows.
How can I get this sos thing to load?
Also when I start I get this warning
WARNING: Process 7240 is not attached as a debuggee
The process can be examined but debug events will not be received
I also tried loadby sos mscorwks it didn't work, but I understand that was discontinued. I'm in asp.net 4
I also read somewhere that the code should be stopped in debug before loading sos, that just hangs VS 2010.
Thank you very much.
Again, if there's another tool that could better help me, I'm all ears :-)
WebDev.WebServer40.exe is a 32 bit executable. To debug that you need to use 32 bit WinDbg. Visual Studio, as well as Callipso server are still executing in 32 bit mode.
For your other question. Yes, WinDbg is a great tool to investigate memory leaks in managed code. This blog will get you started. However in your case I would not be so sure you have a memory leak.
You are saying that memory goes down eventually. This means it is not a memory leak, because a leaked memory never gets released.
Do not waste your time investigate memory problems in Callipso. There are a lot of differences between IIS and Callipso that would make your findings not applicable in production environment. Even if you find that Callipso is in fact leaking does not mean that IIS would be.

How do I debug a memory dump of a spiking ASP.NET process?

Sorry, I couldn't figure out a good way to phrase my real question.
I run a high-traffic ASP.NET site on a 64-bit machine. I have IIS running in 32-bit mode, however, due to some legacy components of the app. I am running this particular web app inside an application pool that has the web garden option on (running 6 processes inside an 8 core machine).
Once or twice a week one of the processes will skyrocket into 100% CPU utilization, causing a giant slowdown for the site, so my plan was to wait until that happens, memory dump the offending process, then poke around WinDbg to zero in on the thread that's spiking to see where the code is spinning its wheels.
I've debugged using WinDbg before to figure out what was causing a deadlock on the site, but that was several months ago and I can't remember how I got it working. (As a side note, this is a lesson to document everything you do.)
I'm running WinDbg on the Windows 2003 server that's running the site, so as to prevent any DLL version problems. Here have been my steps so far, please let me know where I'm going wrong to get the error message that I'm getting.
I first memory dump the spiking process using UserDump, with the following command, where 3389 is the ID of the process:
userdump -k 3389
I load the dump into the x86 edition of WinDbg.
Since I'm running 32-bit on a 64-bit machine, I first load the memory dump and then:
.load wow64exts
.effmach x86
I make sure that my symbol path includes the directory that contains my apps PDB files:
.sympath+ c:\inetpub\myapp\bin
Running just `.load SOS' fails with an error of "The system cannot find the file specified", so I go the fully qualified route of the following, which works:
.load c:\windows\microsoft.net\framework\v2.0.50727\sos
From here, I'm lost. I try any of the SOS commands, like !threads, only to get this error:
Failed to load data access DLL, 0x80004005
That error is also accompanied by the numbered list of items that I should be verifying.
I have verified that I am running the most current version of the debugger, mscordacwks.dll is in fact in the same directory as the mscorwks.dll file, and I'm debugging on the same architecture as the dump file.
I've also run the magical ".cordll -ve -u -l" command, but that doesn't solve anything. I'm always greeted with "CLR DLL status: No load attempts" when I execute that. Then I try ".reload", which yields a handful of warnings like "WARNING: wldap32 overlaps dnsapi". I wish it said something like "CLRDLL: Loaded DLL C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscordacwks.dll". But it doesn't.
Try executing !sw before running the sos commands. See this blog post.
(MSDN) How To: Use CLR Profiler
(MSDN Magazine) Investigating Memory Issues
In my experience, spiking app pool can be due to it being recycled. Have you tried IIS Crash / Hang agent and IIS Dump ?
http://www.microsoft.com/downloads/details.aspx?FamilyID=01c4f89d-cc68-42ba-98d2-0c580437efcf&DisplayLang=en
Also included with them is a dumpfile analyzer which will tell you about memory leaks and even suggest areas of your code that need fixing (complete with links to the applicable MSKB articles!)
Dude - not sure if this helps, but maybe try this.
Copy c:\windows\microsoft.net\framework\v2.0.50727\sos.dll to the same directory where windbg is installed to (eg. c:\program files\Debugging Tools for Windows\ ). Why? make it easy to load the sos file
run windbg
load the memory dump file. for me, i use ctrl-D or File -> open crash dump
.load sos <-- take note of the fullstop BEFORE the load command
.symfix c:\temp\debug_symbols
.reload
Ok.. take note of the commandline. this tells me the current THREAD that the dump was in. That might be useless for a HIGH CPU scenario .. because we could be in any thread.
so from here i look at the threads that were running and check out the busiest thread
8 !threadpool <-- this is so i can see the cpu utilization to check we are in a crap (busy) state... eg 100% cpu or what not.
9 !runaway <-- list the threads that have ben around the longest...
eg.
0:027 !runaway
User Mode Time
Thread Time
18:704 0 days 0:00:17.843 <-- Thread #18
19:9f4 0 days 0:00:13.328 <-- Thread #19
16:1948 0 days 0:00:10.718
26:a7c 0 days 0:00:01.375
24:114 0 days 0:00:01.093
27:d54 0 days 0:00:00.390
28:1b70 0 days 0:00:00.328
0:b7c 0 days 0:00:00.171
25:3f8 0 days 0:00:00.000
23:1968 0 days 0:00:00.000
thread 18 and 19 have been hanging around awhile.. hmm.... are they stuck in a loop?
~18s <-- goto thread 18.
!clrstack <-- clr call stack .. which is just like debugging in windows.
.. and from here u can dump objects and stuff by giving the address references and stuff.
check out !help to list some commands to try and use .. i think !help.sos also works?
HTH .. if u still get stuck, ask away at what worked and what didn't.
I just had to deal with a similar problem. In my case, it turned out that WinDbg wasn't able to find the correct version of mscorwks.dll. In addition to the Framework version, there is also a revision of the DLL which can be different between the same framework version.
In theory, the Microsoft symbol servers should be able to supply the necessary DLL, but it wasn't happening for me. To solve it, I used !sym noisy to get additional information on symbol loading. When I did !dumpstack, I got the error message:
SYMSRV: http://msdl.microsoft.com/download/symbols/mscorwks.dll/492B82C1590000/mscorwks.dll not found
To fix this, I created the appropriate folders in my local symbol cache, and copied mscorwks.dll from the machine the dump came from. After a .reload, WinDbg found the necessary DLL in the local symbol cache, and continued on happily.
Alternatively, you can find the exact version of mscorwks being used with lm v m mscorwks. You can then find the update that contains the version you need from this list. You will need to extract the necessary DLLs from the particular update to the right location.

Resources