Thursday, 3 June 2010

Memory leak, Dispose() and Using statement (2)

Step 1: Diagnose your code. Find the evil piece.
You probably know some tools that help measure performance and memory usage. Two types of tools can help in this step: memory measurement tools and memory explorer tools.
WMI (Windows Management Instrumentation) does a good job of measuring the private working set of a given process or set of processes. It tracks memory usage at a configured time interval and saves the results to a spreadsheet or another file format. There is also a user interface where you can see a graph of memory usage.
There are also several memory explorer tools on the market. The one I am using is ANTS Memory Profiler. It gives a good hint about which objects are currently in memory (on the heap), along with object counts, values and how far they are from the GC root. It can even compare memory snapshots taken at two different times. In my experience it sometimes helps and sometimes does not. Memory usage is a complex issue, especially in a complex system, where you have tons of things in memory. Unused objects that stay in memory for a while are usually fine as long as they can be collected in the next garbage collection. But the GC, not you, decides when and how to collect the garbage. So it is really hard to tell whether the unused objects in memory are new or survived the last garbage collection. You have to run the tool for a longer time and compare several snapshots.
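The same kind of interval sampling can also be sketched in a few lines of C#. This is only an illustration, not the tool described above: the class name MemorySampler, the CSV layout and the parameters are my own assumptions. Note that Process.PrivateMemorySize64 reports private bytes, which is close to but not the same number as the private working set.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Threading;

// Hypothetical helper: samples a process's memory at a fixed interval
// and appends each sample to a CSV file that a spreadsheet can open.
public static class MemorySampler
{
    // Formats one CSV row: ISO 8601 timestamp, private bytes, working set bytes.
    public static string FormatRow(DateTime timestampUtc, long privateBytes, long workingSetBytes)
    {
        return string.Format(CultureInfo.InvariantCulture,
            "{0:o},{1},{2}", timestampUtc, privateBytes, workingSetBytes);
    }

    public static void Sample(int processId, string csvPath, TimeSpan interval, int sampleCount)
    {
        using (var writer = new StreamWriter(csvPath, false))
        {
            writer.WriteLine("timestamp_utc,private_bytes,working_set_bytes");
            for (int i = 0; i < sampleCount; i++)
            {
                // A fresh Process object gives a fresh snapshot of the counters.
                using (Process process = Process.GetProcessById(processId))
                {
                    writer.WriteLine(FormatRow(DateTime.UtcNow,
                        process.PrivateMemorySize64, process.WorkingSet64));
                }
                writer.Flush(); // keep the file usable if the run is interrupted

                if (i < sampleCount - 1)
                {
                    Thread.Sleep(interval);
                }
            }
        }
    }
}
```

Left running overnight with an interval of a minute or so, a file like this makes a slow upward trend easy to see once it is charted.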
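To make the "new or survivor" question a bit more concrete, here is a small sketch that asks the runtime which generation an object is in before and after forced collections. GenerationProbe is a made-up name, and forcing collections with GC.Collect() is something I would only do in a diagnostic experiment like this, never in production code.

```csharp
using System;

// Hypothetical probe: records the GC generation of one live object
// before any forced collection, after one, and after two.
public static class GenerationProbe
{
    public static int[] ObserveGenerations()
    {
        var probe = new object();
        var generations = new int[3];

        generations[0] = GC.GetGeneration(probe); // freshly allocated
        GC.Collect();
        generations[1] = GC.GetGeneration(probe); // survived one collection
        GC.Collect();
        generations[2] = GC.GetGeneration(probe); // survived two collections

        GC.KeepAlive(probe); // keep the object reachable until this point
        return generations;
    }
}
```

An object that is still referenced is typically promoted to an older generation each time it survives a collection, so a high generation number is a hint that the object has been around for a while. A profiler snapshot alone does not always make that obvious.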

Let me go back to the memory leak problem I fixed. First, some context. The project is big and I am not familiar with every module. There is no way to be familiar with every module of a project this size.
I was a bit lucky this time because I had been paying attention to the memory usage of the system for a long time. Sometimes I left the system running overnight and saw no obvious memory leak. So I compared my development environment with the production environment to get some ideas about what was different:
  • Database size. I believe production environment has a much bigger database size.
  • LDAP. Production environment uses LDAP for authentication. I do not.
  • Test duration. I had never run a really long test.
It did not take me very long to narrow it down to one suspicious module. Before that, I spent almost a day on some suspicious UI code because ANTS Memory Profiler told me that a lot of objects related to UI refreshing were staying in memory. But after some painful code checking and more tests, I found that the number of those objects went down once it reached a limit, and it did so consistently. So I forgot about it and moved on. I also spent some time increasing my database size and running more tests. It turned out that database size had nothing to do with the memory leak this time.

After several tests verified that this module was the evil one, I focused on several of its public methods. It became easy from there. I tested one method at a time while mocking out the other suspicious methods. After several tests I was almost there. I could tell which lines were leaking memory, but I could not tell whether they were all of the evil pieces.
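The "one method at a time" idea can be sketched as a tiny harness like the one below. It is not the code I actually used: LeakHarness and MeasureGrowth are hypothetical names, and the suspect method is passed in as a delegate so that everything else can be mocked out.

```csharp
using System;

// Hypothetical harness: calls one suspect method many times and reports
// how much the managed heap grew once garbage has been collected.
public static class LeakHarness
{
    public static long MeasureGrowth(Action suspect, int iterations)
    {
        suspect(); // warm up so one-time initialisation is not counted

        long before = SettledHeapBytes();
        for (int i = 0; i < iterations; i++)
        {
            suspect();
        }
        long after = SettledHeapBytes();

        return after - before;
    }

    // Forces a full collection (including finalizers) and then reads the
    // managed heap size, so that only reachable objects are counted.
    private static long SettledHeapBytes()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        return GC.GetTotalMemory(true);
    }
}
```

If the growth keeps scaling with the number of iterations, the method under test is probably holding on to something. One caveat: GC.GetTotalMemory only sees the managed heap, so a leak of unmanaged resources (for example from objects that were never disposed) may not show up here, and the private bytes of the process have to be watched as well.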

To be honest, ANTS Memory Profiler did not help a lot in this step and sometimes led me in the wrong direction.

