In our ASP.NET web app we're experiencing a quite extensive memory leak which I am investigating right now. Using WinDbg I got down to the largest memory eaters in our app which are (ran !dumpheap -stat in the WinDbg console to get these):
MethodTable Addr Count Overall size Type
...
000007fee8306e10 212928 25551360 System.Web.UI.LiteralControl
000007feebf44748 705231 96776168 System.Object[]
000007fee838fd18 4394539 140625248 System.Web.Caching.CacheDependency+DepFileInfo
000007fee838e678 4394614 210941472 System.Web.FileMonitorTarget
000007feebf567b0 18259 267524784 System.Collections.Hashtable+bucket[]
00000000024897c0 1863 315249528 Free
000007feebf56cd0 14315 735545880 System.Byte[]
000007feebf4ec90 1293939 1532855608 System.String
For all I know a large number of String objects can be quite normal; still there's definitely room for improvement. But what really makes me itch is the count of System.Web.FileMonitorTarget objects: we have over 4 million instances on the heap (à 48 bytes)! Using two memory dumps and comparing them I've found out that these objects are not being cleaned up by the GC.
What I'm trying to find out is: where are these objects coming from? I've already tried ANTS Memory Profiler to get to the root of the evil but it leads nowhere near any of our own classes. I see the connection with System.Web.Caching.CacheDependency+DepFileInfo and thus the System.Web.Cache but we do not use file dependencies to invalidate our cache entries.
Also, there are 14315 instances of System.Byte[] making up for over 700 MB on the heap which stuns me - the only place where we use Byte[] is our image uploading component but we have only around 30 image uploads per day.
What might be the source of these Byte arrays and FileMonitorTarget objects? Any hints are very welcome!
Oliver
P.S. Someone asked pretty much the same question here but the only 'answer' there was very general.
There are a couple of things I would look into. You're right the strings are often used in great number. Still you have approx. 1.4 GB worth of strings on the heap. Does that sound right? If not I would look into that. If that is withing the expected range, just ignore it.
If you suspect FileMonitorTarget and/or Byte[] to be leaking, dump the instances using !dumpheap -mt XXX where XXX is the listed MethodTable for the types. You may want to use PSSCOR2 instead of SOS, as it makes this task a bit easier (the output from !dumpheap shows a delta column and you can limit the number of instances dumped).
The next thing to do is to start looking into what is keeping specific instances alive. The !gcroot command will tell you what roots a specific instance. Pick an instance at random and inspect the roots. If everything is as expected move on to the next. If you application is leaking instances of these types chances are that you will get an instance that should have been freed. Once you get the roots you need to figure out what part of the code is holding on to these. A common source is unsubscribed events, but there are other possible reasons why objects are kept alive.
Objects of type System.Web.Caching.CacheDependency+DepFileInfo are created automatically by ASP.NET to monitor file changes to your website. So even if you are not specifically using a FileDependency cache expiration, ASP.NET itself does.
If I run a dump field against some of these objects, I get a path to my controls/pages.
0:000> !df -field _filename 0d3f24ec
Name: System.String
MethodTable: 79330b24
EEClass: 790ed65c
Size: 180(0xb4) bytes
GC Generation: 2
(C:\WINDOWS\assembly\GAC_32\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: C:\inetpub\wwwroot\Website\Application\Base\UserControl\Messages.ascx
Fields:
MT Field Offset Type VT Attr Value Name
79332d70 4000096 4 System.Int32 1 instance 82 m_arrayLength
79332d70 4000097 8 System.Int32 1 instance 81 m_stringLength
79331804 4000098 c System.Char 1 instance 44 m_firstChar
79330b24 4000099 10 System.String 0 shared static Empty
>> Domain:Value 000e0ba0:02581198 00109f28:02581198 <<
79331754 400009a 14 System.Char[] 0 shared static WhitespaceChars
>> Domain:Value 000e0ba0:025816f0 00109f28:02586410 <<
You can see this link describing a bit more detail: Understanding ASP.NET Dynamic Compilation
However, your case might still be different. Try running !GCRoot [obj_addr] and see what is holding onto those objects. In my case it is entirely IIS /.NET related objects.
That said, I still had a problem where millions of these cache objects were created, and I have no idea why. :| (this is the first time it happened to me, but I don't think it appeared or will disappear magically...)
Related
Problem Background
From the past few months we found issue in my online Asp.net application. Application working fine but 1 or 2 times in a day it suddenly crash on different modules on live server, but their is no as such issue in code at all and find not such types of issues on local server.
After some research i found that process that run my application on IIS on live server its memory increase continuously and when it reaches at certain level its begin to crash.
Temporary solution:
when ever we found such issue we restart application on IIS.
which can end this process and start new process,then after that application start working.
In a day some time we need to restart our application 2 or 3 times some times more.
Problem that i find: memory leak.
Find some solution after some research:
create a dump file from my application process from task manager when application is crashes.
Tool use: Windbg
open in Windbg tool for analysis.
write command
.load by clr
dumpheap -stat
shows a tons of references of datatypes.
now i stuck at thses point.
i share with you in image section.
Question:
1. I am on the right direction in finding memory leaks issue?
2. if my path is right where whats my next step?
3. Windbg tool is good for finding such kind of issue?
dump file link for detail review, i take this dump file when server stop to response
create a dump file from my application process from task manager when application is crashes
That's not a good choice, because
you don't have much time to do so. You can only do that as long as the crash dialog is displayed. If you're too late, the application is gone.
in that state, you'll have difficulties debugging it. Instead of the original exception, it will show a breakpoint, which is used by the OS to show the dialog and collect diagnostic data
Use WER local dumps to automatically create a crash dump when it crashes instead of doing it manually. It's much more reliable and gives you the original exception. See How do I take a good crash dump for .NET
I am on the right direction in finding memory leaks issue?
Sorry, you're on the wrong track already.
Starting with !dumpheap -stat is not a good idea. Usually one would start at the lowest level, which is !address -summary. It will give you an indicator whether it's a managed memory leak or a native memory leak. If it's a managed leak, you could continue with !dumpheap -stat
if my path is right where whats my next step?
Even if it's not the right path, it's a good idea that you learn how figure out that you're on the wrong path. So, how do I know?
Looking at your output of !dumpheap -stat, you can see
[...]
111716 12391360 System.String.
This tells you there are 110.000 different strings, using 12 MB of memory. It also tells you that everything else takes less than 12 MB. Look at the other sizes and you'll find out that .NET is not the reason for your OutOfMemoryException. They use less than 50 MB.
If there were a managed leak, you would look for paths where objects are connected to, so that the garbage collector thinks it cannot be freed. The command is !gcroot.
Windbg tool is good for finding such kind of issue?
It is possible, but WinDbg is not the best tool. Use a memory profiler instead. That's a dedicated tool for memory leaks. Typically it has a much better usability. Unfortunately you'll need to decide whether you need a managed memory profiler, native memory profiler or both.
I once wrote how to use WinDbg to track down .NET OutOfMemoryException. You'll find a chart there which gives you ideas on how to proceed in different situations.
In your dump I see 2 TB of <unknown> memory, which could be .NET, but needn't be. Still, these 2 TB are likely the cause of the OOM, because the rest is less than 350 MB in size.
Since clr is in the list of loaded modules, we can check !dumpheap -stat as you did. But there are not many objects using memory.
!eeheap -gc shows that there are 8 heaps, corresponding to the 8 processors of your machine, for parallel garbage collection. The largest individual heap is 45 MB, the total 249 MB. This roughly matches the sum of !dumpheap. conclusion: .NET is not the culprit.
Let's check the special cases:
Presence of MSXML
Bitmaps
Calls to HeapAlloc() which are so large that they are directly forwarded to VirtualAlloc().
Direct calls to VirtualAlloc()
MSXML is not present: lm m msxml* does not produce output.
There are no Bitmaps: !dumpheap -stat -type Bitmap
Heap allocations larger than 512 kB: !heap -stat. Here's a truncated part of the output:
0:000> !heap -stat
_HEAP 0000018720bd0000
Segments 00000006
Reserved bytes 0000000001fca000
Committed bytes 0000000001bb3000
VirtAllocBlocks 00000002
VirtAlloc bytes 00000312cdc4b110
_HEAP 0000018bb0fe0000
Segments 00000005
Reserved bytes 0000000000f0b000
Committed bytes 0000000000999000
VirtAllocBlocks 00000001
VirtAlloc bytes 0000018bb0fe0110
As you can see, there are 3 blocks that went to VirtualAlloc. The size is somewhat unrealistic:
0:000> ? 00000312cdc4b110
Evaluate expression: 3379296514320 = 00000312`cdc4b110
0:000> ? 0000018bb0fe0110
Evaluate expression: 1699481518352 = 0000018b`b0fe0110
That would be a total of 3.3TB + 1.7TB = 6TB and not 2TB. Now, it may happen that this is a bug of !address, but 4TB is not a common overflow point.
With !heap -a 0000018720bd0000 you can see the 2 virtual allocs:
Virtual Alloc List: 18720bd0110
0000018bac70c000: 00960000 [commited 961000, unused 1000] - busy (b), tail fill
0000018bad07b000: 00960000 [commited 961000, unused 1000] - busy (b), tail fill
And with !heap -a 0000018bb0fe0000 you can see the third one:
Virtual Alloc List: 18bb0fe0110
0000018bb1043000: 00400000 [commited 401000, unused 1000] - busy (b), tail fill
These are all relatively small blocks of 4.1MB and 9.8 MB.
For the last part, direct calls to VirtualAlloc(), you need to get back to the level of !address. With !address -f:VAR -c:".echo %1 %3" you can see the address and size of all <unknown> regions. You'll find a lot of entries there, many of small sizes, some which could be the .NET heaps, a few 2GB ones and one really large allocation
The 2GB ones:
0x18722070000 0x2d11000
0x18724d81000 0x7d2ef000
0x187a2070000 0x2ff4000
0x187a5064000 0x7d00c000
0x18822070000 0x2dfe000
0x18824e6e000 0x7d202000
0x188a2070000 0x2c81000
0x188a4cf1000 0x7d37f000
0x18922070000 0x2d13000
0x18924d83000 0x7d2ed000
0x189a2070000 0x2f5a000
0x189a4fca000 0x7d0a6000
0x18a22070000 0x2c97000
0x18a24d07000 0x7d369000
0x18aa2070000 0x2d0c000
0x18aa4d7c000 0x7d2f4000
It is likely that these are the .NET heaps (committed part + reserved part).
The large one:
0x7df600f57000 0x1ffec56a000
The information about it:
0:000> !address 0x7df600f57000
Usage: <unknown>
Base Address: 00007df6`00f57000
End Address: 00007ff5`ed4c1000
Region Size: 000001ff`ec56a000 ( 2.000 TB)
State: 00002000 MEM_RESERVE
Protect: <info not present at the target>
Type: 00040000 MEM_MAPPED
Allocation Base: 00007df5`ff340000
Allocation Protect: 00000001 PAGE_NOACCESS
It looks like a 2TB memory mapped file which is unused (and therefore reserved).
I don't know what your application is doing. This is really where I need to stop the analysis. I hope this was helpful and you can draw your conclusions and fix the issue.
I am having some kind of strange behavior using .NET MemoryCache in an ASP.NET application. The problem is, that objects will be evicted after a view minutes and there seems to be no reason for that. The memory limits are set in the web.config:
<system.runtime.caching>
<memoryCache>
<namedCaches>
<add name="Default"
cacheMemoryLimitMegabytes="1500"
physicalMemoryLimitPercentage="18"
pollingInterval="00:02:00" />
</namedCaches>
</memoryCache>
</system.runtime.caching>
My development machine has 8 GB of ram and the w3wp.exe process is using about 0,5 GB. 2 GB are still available on the machine when the application is running (beside visual studio, webbrowsers and so on)
A RemovedCallBack method has been added to every entry to generate log entries for every removal and expecially for evictions:
private static void CachedItemRemovedCallback(CacheEntryRemovedArguments arguments)
{
LogCurrentCacheDelta(arguments.CacheItem, true);
if (arguments.RemovedReason == CacheEntryRemovedReason.Evicted)
{
Sitecore.Diagnostics.Log.Warn(
string.Format(
"Cache Item Evicted (cacheMemoryLimitMegabytes: {0}) - Key: {1}, Value: {2}",
FlightServiceCache.CacheMemoryLimit,
arguments.CacheItem.Key,
arguments.CacheItem.Value),
FlightServiceCache);
}
}
A counter for calculating the size currently used has also been implemented. I am using a binary serialization to estimate the size of the objects in memory. At the moment, the first eviction occured, about 120 objects were in the cache and the memory used was about 6 magabytes. For my understanding, this is in no way a reason for evicting entries from cache. But it happens again and again and after to days of investigation, I am still not sure why this happens.
I also took a look at the internal implementation of the trim() function in the .NET framework source code used when objects are beeing evicted. The calculation made therefore is not easy to understand, maybe someone knows how it works and can point this out for me.
It would be great if anyone could shade some light on that.
Thank you very much in advance and sorry for the really long post ;)
(btw. this is my first post so any suggestions about how to improve my questions are highly appreciated)
Had the same exact problem. Even if I set the CacheMemoryLimit to big enough value (let's say 1GB) and the PhysicalMemoryLimit to 10% (which in my case with 32GB of physical installed memory comes to 3.2 GB), still many of my cache entries would be evicted to free cache memory. Note that I was caching 1MB items and 10 of those, so altogether 10 MB, whereas, I supposed to have at least the minimum of the two limitations mentioned above, which is 1GB.
Yes, #VMAtm, was correct in his comment above that one should use bigger %, and I tested with 10% it evicted, 50% it didn't, and with the divide and conquer method I proved that with my setup then at around 45% it no longer evicts. But note, that depending on the overall installed memory size the behaviour might be different for the % values I used for testing.
So for me the point was not to trust the PhysicalMemoryLimit % and not to set it, rather, use the CacheMemoryLimit config property only. And if you still need to prove an option with the PhysicalMemoryLimit % then instead of using the system.runtime.caching configuration settings, rather, introduce your own settings, read them, get the actual physical installed memory size, use your percentage setting and then calculate the minimum of the two: physical memory limit (now in bytes) and cache memory limit (in bytes from your own setting). Having that you can then create MemoryCache and pass only the cacheMemoryLimitMegabytes config through NameValueCollection to its constructor, and the value for the cacheMemoryLimitMegabytes config property will be that minimum of two calculated above.
BTW to get the total physical installed memory size one can use:
[DllImport("kernel32.dll")]
[return: MarshalAs(UnmanagedType.Bool)]
private static extern bool GetPhysicallyInstalledSystemMemory(out long totalMemoryInKilobytes);
Sometimes, on various Unix architectures, recompiling a program while it is running causes the program to crash with a "Bus error". Can anyone explain under which conditions this happens? First, how can updating the binary on disk do anything to the code in memory? The only thing I can imagine is that some systems mmap the code into memory and when the compiler rewrites the disk image, this causes the mmap to become invalid. What would the advantages be of such a method? It seems very suboptimal to be able to crash running codes by changing the executable.
On local filesystems, all major Unix-like systems support solving this problem by removing the file. The old vnode stays open and, even after the directory entry is gone and then reused for the new image, the old file is still there, unchanged, and now unnamed, until the last reference to it (in this case the kernel) goes away.
But if you just start rewriting it, then yes, it is mmap(3)'ed. When the block is rewritten one of two things can happen depending on which mmap(3) options the dynamic linker uses:
the kernel will invalidate the corresponding page, or
the disk image will change but existing memory pages will not
Either way, the running program is possibly in trouble. In the first case, it is essentially guaranteed to blow up, and in the second case it will get clobbered unless all of the pages have been referenced, paged in, and are never dropped.
There were two mmap flags intended to fix this. One was MAP_DENYWRITE (prevent writes) and the other was MAP_COPY, which kept a pure version of the original and prevented writers from changing the mapped image.
But DENYWRITE has been disabled for security reasons, and COPY is not implemented in any major Unix-like system.
Well this is a bit complex scenario that might be happening in your case. The reason of this error is normally the Memory Alignment issue. The Bus Error is more common to FreeBSD based system. Consider a scenario that you have a structure something like,
struct MyStruct {
char ch[29]; // 29 bytes
int32 i; // 4 bytes
}
So the total size of this structure would be 33 bytes. Now consider a system where you have 32 byte cache lines. This structure cannot be loaded in a single cache line. Now consider following statements
Struct MyStruct abc;
char *cptr = &abc; // char point points at start of structure
int32 *iptr = (cptr + 1) // iptr points at 2nd byte of structure.
Now total structure size is 33 bytes your int pointer points at 2nd byte so you can 32 byte read data from int pointer (because total size of allocated memory is 33 bytes). But when you try to read it and if the structure is allocated at the border of a cache line then it is not possible for OS to read 32 bytes in a single call. Because current cache line only contains 31 bytes data and remaining 1 bytes is on next cache line. This will result into an invalid address and will give "Buss Error". Most operating systems handle this scenario by generating two memory read calls internally but some Unix systems don't handle this scenario. To avoid this, it is recommended take care of Memory Alignment. Mostly this scenario happen when you try to type cast a structure into another datatype and try reading the memory of that structure.
The scenario is bit complex, so I am not sure if I can explain it in simpler way. I hope you understand the scenario.
We have a large asp.net application that is leaking memory. Perfmon shows that this leak is in managed memory as W3WP private bytes grows at the same rate as bytes in all heaps. I can also see that Gen 2 garbage collections are running but the Gen 2 heap size continues to grow.
I took a memory dump and analysed in WinDbg and can see a very large number of objects of lots of types. Strings are the biggest type and 20% of the size of the strings is made from 51 objects.
Dumping these large strings shows outputted html either from controls or entire pages. Running !gcroot on these shows the root objects being of type System.Text.RegularExpressions.Regex or System.Web.RegularExpressions.GTRegex.
Any ideas of what could be happening or how I can investigate further?
Thanks, Simon
How about using a memory profiler such as dotTrace Memory or ANTZ Memory Profiler? Both products are available as a time-limited trial version.
That strings are the most common type on your heap is not strange at all. If you for example have 10 HashSet's containing 1000 strings each, the dump will show that you have 10 HashSet's on your heap, but 100 000 strings. Many objects contain one or several strings. Thus, the number of strings shown in the dump is the sum of all strings from all objects on the heap, which tend to be a lot.
However, if you have alot of System.Text.RegularExpressions.Regex on your heap, that can very well be the root to your memory problemts. Regex in .NET tend to take a lot of resources. Hence, my advice is that you go through your code and try to find any excessive use of regex. Also, make sure that any references to Regex objects are being taken care of, that is to say, that the references to the Regex objects are not kept alive. That way, the Garbage Collector can make sure the Regex objects are deallocated properly.
Good luck!
In theory it should be quite dificult to cause a memory leak in asp.net without using unmanaged resources. If everything is single threaded then all references to managed resources should be free to be garbage collected when the page life cycle is complete. Are you firing off worker threads to do anything and are these threads continuing to live beyond the life of the page? Or do you have any long running processes exposed as web methods that can be fired off by asynchounsly and are just taking a long time to run and being called repeatedly until the memory is full?
Running into a prickly problem with our web app here. (Asp.net 2.0 Win server 2008)
Our memory usage for the website, grows and grows even though I would expect it to remain at a fairly static level. (We have a small amount of data that gets stored in state).
Wanting to find out what the problem is, I've run a System.GC.Collect(); a few times, taken a memory dump and then loaded this memory dump into WinDbg.
When I do a DumpHeap -Stat I get an inordinately large number on particular type hanging around in memory.
0000064280580b40 713471 79908752 PaymentOption
so, doing a DumpHeap -MT for this type, I get a stack of object references. Picking a random number of these, I do a !gcroot and the command comes back reporting that no references are held to it.
To me, this is exactly when the GC should collect these items, but for some reason they have been left outstanding.
Can anybody offer an explanation as to what might be happening?
You could try using sosex.dll in Windbg, which is an extension written to help with .NET debugging. There is a command named !refs which is similar to !gcroot, in that it will show you all the objects referencing an object, plus it will show all the objects that it too is referencing.
In the example on the author's website, !refs is used against an object and the output looks like this:
0:000> !refs 0000000080000db8
Objects referenced by 0000000080000db8 (System.Threading.Mutex):
0000000080000ef0 32 Microsoft.Win32.SafeHandles.SafeWaitHandle
Objects referencing 0000000080000db8 (System.Threading.Mutex):
0000000080000e08 72 System.Threading.Mutex+<>c__DisplayClass3
0000000080000e50 64 System.Runtime.CompilerServices.RuntimeHelpers+CleanupCode
Few things:
GC.Collect won't help you do any debugging. The garbage collector is already being called: if any objects were available for collection it would have happened already.
Idle memory on a server is wasted memory. Are you sure memory is being 'leaked', or is it just that the framework is deciding it can keep more things in memroy or keep more memory around for faster access? In this case I suspect you are leaking memory, but it's something to double check for.
It sounds like something you don't expect is keeping a reference to PaymentOption objects. Perhaps a static collection somewhere? Or separate thread?
Does PaymentObject implement a finalizer by any chance? Does it call a STA COM object?
I'd be curious to see the output of !finalizequeue to see if the count of objects that are showing up on the heap are roughly the amount of any that might waiting to be finalized. Output should probably look something like this:
generation 0 has 57 finalizable objects (0409b5cc->0409b6b0)
generation 1 has 55 finalizable objects (0409b4f0->0409b5cc)
generation 2 has 0 finalizable objects (0409b4f0->0409b4f0)
Ready for finalization 0 objects (0409b6b0->0409b6b0)
If the number of Ready for finalization objects continues to grow, and your certain garbage collections are occuring (confirm via perfmon counters), then it might be a blocked finalizer thread. You might need to take several snapshots over the lifetime of the process (before a recycle) to confirm. I usually rely on the magic number of three, as long as the site is under some sort of load.
A bug in a finalizer can block the finalizer thread and prevent the objects from ever being collected.
If the PaymentOption object calls a legacy STA COM object, then this article ASP.NET Hang and OutOfMemory exceptions caused by STA components might point in the right direction.
Not without more info on your application. But we ran into some nasty memory problems a long time ago. Do you use ASP.NET caching? As Raymond Chen likes to say, "poor caching strategy is indisitinguishable from a memory leak."
Check out another tool - CLRProfiler.exe - it will help you traverse object reference trees to see where your objects are rooted. This is also good: link text
You've heard this before - if you have to GC.Collect, something is wrong.
Is the PaymentOption object created in an asynchronous process, by any chance? I remember something about, if you don't call EndInvoke, you can get problems like this.
I've been investigating the same issue myself and was asking why objects that had no references were not being collected.
Objects larger than 85,000 bytes are stored on the Large Object Heap, from which memory is freed up less frequently.
http://msdn.microsoft.com/en-us/magazine/cc534993.aspx
A single PaymentOption may not be that big, but are they contained within collections, or are they based on something like a DataSet? You should pick on few instances of the PaymentOption / collection / DataSet and then use the sos !objsize command to see big they are.
Unfortunately this doesn't really answer the question. I like to think I can trust the .net framework to take care of releasing unused memory whenever it needs to. However I see a lot of memory being used by the worker process running the app I am looking at, even when memory looks quite tight on the server.
FYI, SOS in .NET 4 supports a few new commands that might be of assistance, namely !gcwhere (locate the generation of an objection; sosex's gcgen) and !findroots (does what it says on the tin; sosex's !refs)
Both are documented on the SOS documentation and mentioned on Tess Ferrandez's blog.