Debugging Memory Leaks

First of all, we need unit tests to verify different classes and functions (esp. extension functions) don't have memory leaks by running under valgrind like this: GLIBCXX_FORCE_NEW=1 \ valgrind --suppressions=../bin/valgrind.suppression --tool=memcheck \ --leak-check=full --num-callers=30 --max-stackframe=3000000 \ test/test TestExtFoo::test_ext_bar When it comes to server running, it becomes impossible to run valgrind or heap profiler that slows down request handling very much. Here's the procedure to run built-in memory leak detection against a live server: 1. Turn on heap profiler Build the server (both HPHP and www) with modification of rules.mk: DEBUG=1 #GOOGLE_CPU_PROFILER = 1 GOOGLE_HEAP_PROFILER = 1 This turns off CPU profiler and turns on heap profiler that gives us malloc() hooks for our own sampling based leak detection. We also need to turn on DEBUG to generate readable stacktraces. 2. Turn off mt_allocator Run server with GLIBCXX_FORCE_NEW=1. This environment variable turns off STL's mt_allocator, which doesn't call free() when some STL objects are destructed. 3. Initialize long-living objects Let the server run for a few minutes, until APC is mostly updated. Otherwise, APC objects may be reported as leaked items. 4. Turn on leak detection Hit the server to turn on leak detection: GET http://[server]:8088/leak-on?sampling=500 The higher the sampling rate, the least impact leak detection has on server running, but it will take longer to collect leaked items. 500 is a good rate in our debugging process. 5. Report leaks Wait for minutes long, or even hours long, depending on how rare the leak happens. Then hit the server to turn off leak detection and to report leaks: GET http://[server]:8088/leak-off > leak_report 6. Examine output The output should have all leaked items. Sometimes some stacks are not fully translated, and a manual translation needs to be done like this: ./www --mode translate