How To Track Down Circular References
Circular references (typically two structures, each containing a pointer to the other) are by far the most frequent cause of a MemoryLeak in large Perl applications. When TWiki is used in the traditional way, as a CGI program without a persistent interpreter (like ModPerl or PersistentPerl), they are irrelevant: at the end of every request the interpreter terminates and releases all its data, including the memory occupied by circular references.
So if you are just using TWiki with traditional CGI, you don't need to read on.
With a persistent interpreter, however, things are different. One Perl process responds to many HTTP requests, so the code that handles each request must release all of its data when it is done. If it doesn't, the processes claim more and more RAM, eventually hit some limit imposed by the operating system, and fail, often with symptoms which are difficult to interpret.
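To make the problem concrete, here is a minimal sketch of how a circular reference keeps objects alive. The class Node and the variable names are made up for this illustration and have nothing to do with TWiki's code. The counter shows that an ordinary object is freed when its scope ends, while two objects pointing at each other are not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical class, used only for this illustration.
package Node;
my $destroyed = 0;    # counts how often DESTROY has run
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { $destroyed++ }

package main;

{
    my $lonely = Node->new('lonely');    # no cycle
}
my $after_plain = $destroyed;            # the object was freed at scope exit

{
    my $left  = Node->new('left');
    my $right = Node->new('right');
    $left->{peer}  = $right;             # left points to right ...
    $right->{peer} = $left;              # ... and right points back to left
}
my $after_cycle = $destroyed;            # unchanged: neither object was freed

print "destroyed after plain scope:  $after_plain\n";
print "destroyed after cyclic scope: $after_cycle\n";
```

In a persistent interpreter the two leaked objects would stay in memory until the process itself exits.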
How Do I Detect Whether I Have A Memory Leak?
It is a good idea to check occasionally how the processes of your web server are doing. On Unix/Linux there's the top command; on Windows you use the Task Manager.
If the processes keep accumulating more and more RAM, then you are likely to have a memory leak. In my rather small installation (based on TWikiVMDebianStable, with mod_perl and a couple of plugins added) the processes start at about 25 MB and do not exceed 40 MB. With a leak they can easily exceed 100 MB. Almost like Java.
I have used a pretty simple program, which works similarly to what mod_perl does, to check whether there could be a memory leak. Place the following in your bin directory as view_loop and start it from the command line with ./view_loop 50 >/dev/null (change the number of loops if you want; it defaults to 100 views). Watch the process with top.
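#!/usr/bin/perl -w
my $view    = '';
my $n_loops = $ARGV[0] || 100;
my $i_loops = 0;
# Slurp the original view script into a string.
{
    open my $fh, '<', './view' or die "Couldn't open view script: '$!'";
    local $/ = undef;
    $view = <$fh>;
    close $fh;
}
# Wrap the script body in a subroutine, so it can run many times in one process.
eval "sub handler { $view }";
die "Couldn't compile handler: '$@'" if $@;
# Run the view repeatedly without restarting the interpreter.
while ($i_loops++ < $n_loops) {
    handler();
}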
How Do I Find The Cause?
Our Helpers
Given the complex data structures in TWiki, finding the structures which actually point to each other at the end of the program can be a nightmare, especially if you don't know TWiki's code by heart. But fortunately there's help available with Perl's
warn and
DESTROY functions, and class inheritance.
The
warn function simply writes its argument to STDERR. Its existence is justified by the fact that it automatically adds the source line where it has been called (which isn't extremely helpful), and that it tells us
whether Perl is in the phase of "global destruction".
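The following sketch shows what warn adds to a message. It captures the warnings through a $SIG{__WARN__} handler instead of letting them go to STDERR; the handler and the variable @messages exist only for this demonstration.

```perl
use strict;
use warnings;

my @messages;
# Capture warnings in an array instead of printing them to STDERR.
$SIG{__WARN__} = sub { push @messages, $_[0] };

warn "no trailing newline";      # Perl appends " at FILE line N."
warn "trailing newline\n";       # passed through unchanged

print for @messages;
```

Only the first message gets the location appended. The note about global destruction is attached in the same place, so a message ending in a newline would not carry it either.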
If a DESTROY function (documented in the section "Destructors" in the perlobj manpage) is defined in a package, then it is called whenever an object of that class is destroyed. Fortunately, references in TWiki are all stored in objects.
So by combining both, printing a
warn in a
DESTROY subroutine, we can detect whether an object is destroyed during global destruction. There are several possibilities for this:
- We have a lexical object which is defined at the top level (i.e. outside any sub) in the main program.
- We have a global object (e.g. one defined with
use vars).
- We have a circular reference which prevents freeing a lexical object when it goes out of scope.
TWiki's sources (in
DakarRelease) are structured in a way that the main programs (those in the
bin directory) are pretty simple and don't store any objects, and global variables are very seldom
objects. So if an object is destroyed during global destruction, then probably
we have a circular reference.
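As a rough illustration of the second and third case, the sketch below writes a small child script to a temporary file, runs it, and collects what it prints to STDERR. The class Probe and all object names are invented for the demo. It also assumes a Unix-like shell, since it relies on the 2>&1 redirection.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# The child script. Probe is a hypothetical class for this sketch.
my $child = <<'END_CHILD';
package Probe;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { warn "Destroying $_[0]{name}" }   # no trailing newline!
package main;
our $global = Probe->new('global');             # a global object
{
    my $scoped = Probe->new('scoped');          # freed at the end of this block
}
{
    my $first  = Probe->new('cycle-1');         # a circular reference
    my $second = Probe->new('cycle-2');
    $first->{peer}  = $second;
    $second->{peer} = $first;
}
END_CHILD

# Write the child script to a temporary file that is removed on exit.
my ($fh, $file) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} $child;
close $fh or die "Cannot write $file: $!";

# Run the child with the current perl binary and capture STDOUT and STDERR.
my @report = `"$^X" "$file" 2>&1`;
print @report;
```

Only the line for the object named scoped should lack the note about global destruction. The global object and both members of the cycle survive until the interpreter shuts down.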
Instrumenting The Code
So how do we get a
DESTROY routine in every package of TWiki? This is where Perl's class inheritance comes in handy: All Perl classes inherit from a class called
UNIVERSAL. So all we need to do is to write a
DESTROY function in the
UNIVERSAL namespace. Note that we don't have to create a file
UNIVERSAL.pm for that. Personally, I prefer to simply add a couple of lines to the main program in question. For example, I've created a file called
view_circ in my
bin directory which looks almost identical to the
view program:
#!/usr/bin/perl -wT
#
# TWiki Enterprise Collaboration Platform, http://TWiki.org/
#
# Copyright (C) 1999-2006 Peter Thoeny, peter@thoeny.org
# and TWiki Contributors.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version. For
# more details read LICENSE in the root of this distribution.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# As per the GPL, removal of this notice is prohibited.
BEGIN {
    # Set default current working directory (needed for mod_perl)
    if( $ENV{"SCRIPT_FILENAME"} && $ENV{"SCRIPT_FILENAME"} =~ /^(.+)\/[^\/]+$/ ) {
        chdir $1;
    }
    # Set library paths in @INC, at compile time
    unshift @INC, '.';
    require 'setlib.cfg';
}
use TWiki::UI::View;
TWiki::UI::run( \&TWiki::UI::View::view );
package UNIVERSAL;
sub DESTROY {
    my $self = shift;
    warn "Destroying object $self";
}
Note the final five lines? We define a
DESTROY method in class
UNIVERSAL. And we use
warn to print the actual class of the object. Note that you must
not end the string of the warning in a newline
"\n" since this would tell
warn to suppress the information about global destruction!
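The inheritance trick can be tried in isolation. In this sketch the classes Apple and Pear are hypothetical and deliberately define no DESTROY of their own; the warnings are captured in an array so they can be inspected.

```perl
use strict;
use warnings;

my @log;
# Capture warnings in an array instead of printing them to STDERR.
$SIG{__WARN__} = sub { push @log, $_[0] };

# Two hypothetical classes, neither of which defines DESTROY itself.
package Apple;
sub new { return bless {}, shift }

package Pear;
sub new { return bless [], shift }

# Fallback destructor, found through inheritance by every class.
package UNIVERSAL;
sub DESTROY {
    my $self = shift;
    warn "Destroying object $self";    # no trailing newline
}

package main;
{
    my $apple = Apple->new;
    my $pear  = Pear->new;
}    # both objects go out of scope here

print @log;
```

Because $self is interpolated into a string, each message contains the class name, the underlying data type and the address, in the same format as the lines produced by view_circ.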
Harvesting
Now just change to the bin directory and run the modified program like this (we're discarding the HTML output of the page since it isn't interesting for our purposes):
./view_circ >/dev/null 2>/tmp/tales_of_destruction
Open /tmp/tales_of_destruction with your favourite editor. It may run to several hundred lines, but it is not unreasonably big.
Look for the string
during global destruction.
There you are - almost. Every line like Destroying object CGI::Cookie=HASH(0x105ff1f8) above the first line featuring during global destruction is just fine, because it shows that the object was destroyed when the variable holding it went out of scope.
But not everything marked with during global destruction is a memory leak.
Let's try to simply enumerate the
known harmless objects which are destroyed during global destruction:
- A
TWiki::Sandbox object is held by the variable $sharedSandbox in TWiki.pm, and as the name indicates, is intended to stay shared.
- A
TWiki::If object is held by the variable $ifFactory in TWiki.pm, initialized only once, and re-used thereafter. No problem.
- If you have a plugin using LDAP installed: The CPAN module
Net::LDAP creates a couple of Convert::ASN1::parser objects which (hopefully) are harmless.
Well, that should be it (additions are welcome). If you have more, then you might have a circular reference.
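Checking the harvest against the list of harmless classes is easy to script. The following is only a sketch: the subroutine name suspects is made up, and the sample lines (addresses and line numbers) are invented to mimic the output format of view_circ.

```perl
use strict;
use warnings;

# Classes which the list above names as known harmless.
my %harmless = map { $_ => 1 }
    qw(TWiki::Sandbox TWiki::If Convert::ASN1::parser);

# Return the classes of all objects which were destroyed during
# global destruction and are not on the harmless list.
sub suspects {
    my @lines = @_;
    my %seen;
    for my $line (@lines) {
        next unless $line =~ /during global destruction/;
        next unless $line =~ /^Destroying object ([\w:]+)=/;
        $seen{$1}++ unless $harmless{$1};
    }
    return sort keys %seen;
}

# Invented sample lines in the format written by view_circ.
my @sample = (
    "Destroying object CGI::Cookie=HASH(0x105ff1f8) at ./view_circ line 37.\n",
    "Destroying object TWiki::Sandbox=HASH(0x10000010) at ./view_circ line 37 during global destruction.\n",
    "Destroying object Convert::ASN1::parser=ARRAY(0x10000020) at ./view_circ line 37 during global destruction.\n",
    "Destroying object TWiki=HASH(0x00affe00) at ./view_circ line 37 during global destruction.\n",
);

# Use the files named on the command line if there are any, else the sample.
my @lines = @ARGV ? <> : @sample;
print "$_\n" for suspects(@lines);
```

Saved under a name of your choice, it could be run with /tmp/tales_of_destruction as its argument. Anything it prints deserves a closer look.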
Finding The Culprits
The problem with our harvest above is that it tells us the
class of the object, but not where it has been defined, and what the references are which point to each other.
To that end, I usually use Data::Dumper (a module bundled with the Perl core) in the simplest possible way. Say you have a message like Destroying object TWiki=HASH(0x00affe00) during global destruction. Tough luck. But at least we know where to capture it.
All normal TWiki
CGI programs, as their last action, run through the subroutine
TWiki::finish. So what I usually do is add a couple of lines into that routine:
sub finish {
my $this = shift;
$this->{client}->finish();
+
+ use Data::Dumper;
+ $Data::Dumper::Indent = 1;
+ warn "prepared to finish";
+ warn Dumper($this);
}
=pod
Then I run the harvesting program again and inspect the output.
Data::Dumper, when used in this way, starts its output with a line like
$VAR1 =
...and all you need to do is look through the structure for places where $VAR1 appears as a value on the right-hand side of a =>.
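Here is a small sketch of how Data::Dumper reports such a back-reference. The classes Parent and Child are hypothetical stand-ins, and the grep at the end does the same search you would otherwise do by eye.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Indent = 1;

# Hypothetical stand-ins for an object and one of its components.
my $parent = bless {}, 'Parent';
$parent->{child} = bless { owner => $parent }, 'Child';

my $dump = Dumper($parent);
print $dump;

# List every line where $VAR1 shows up as a value.
my @hits = grep { /=>\s*\$VAR1\b/ } split /\n/, $dump;
print "back-reference: $_\n" for @hits;

# Break the cycle so that this demo does not leak itself.
%$parent = ();
```

References to deeper parts of the structure are printed as paths starting with $VAR1 as well, for example $VAR1->{'child'}, so the same search finds those too.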
Breaking The Circle
Say that your dump contains the following:
$VAR1 = bless( {
'plugins' => bless( {
'session' => $VAR1
}, 'TWiki::Plugins' ),
You've detected a circular reference between the
TWiki and the
TWiki::Plugins object!
All you need to do, preferably in
TWiki::finish again, is break that circle. Note that there are two possibilities: Remove the "upward" pointer from the plugins object to the TWiki object, or remove the reference from the TWiki object to the plugins object.
Since practically all substructures of the TWiki object contain the upward pointer, it is easier to kill the references
from the TWiki object with one single statement at the very end of the routine:
%$this = ();
Rerun your harvester, and voilà! The line saying
Destroying object TWiki=HASH(0x00affe00) should no longer show the part
during global destruction.
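The effect of emptying the hash can be checked with a toy model. Session and Component below are hypothetical stand-ins for the TWiki object and one of its substructures; only the statement in finish is taken from the text above.

```perl
use strict;
use warnings;

my @destroyed;    # classes of destroyed objects, in order

package Session;      # hypothetical stand-in for the TWiki object
sub new {
    my $class = shift;
    my $this  = bless {}, $class;
    $this->{plugins} = Component->new($this);    # downward reference
    return $this;
}
sub finish {
    my $this = shift;
    %$this = ();      # drop every reference held by the session
}
sub DESTROY { push @destroyed, ref $_[0] }

package Component;    # hypothetical stand-in for TWiki::Plugins
sub new {
    my ($class, $session) = @_;
    return bless { session => $session }, $class;    # upward pointer
}
sub DESTROY { push @destroyed, ref $_[0] }

package main;

{
    my $leaky = Session->new;     # finish is never called
}
my $without_finish = @destroyed;  # nothing has been destroyed yet

{
    my $clean = Session->new;
    $clean->finish;               # frees the component immediately
}
my $with_finish = @destroyed;     # component and session are both gone

print "destroyed without finish: $without_finish\n";
print "destroyed with finish:    $with_finish\n";
```

Emptying the hash releases the component first, which in turn drops its upward pointer, so the session itself can be freed as soon as its variable goes out of scope.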
--
Contributors: HaraldJoerg
Discussion

Note that at the time of this writing, there is an open bugs item
Bugs:Item2158
reporting circular references in TWiki. So if you start running this recipe right now, be aware that you might be duplicating work which has already been done. Please check the status of
Bugs:Item2158
now.
--
HaraldJoerg - 26 Apr 2006
An excellent read, thanks Harald! Clear and succinct. It occurs to me that we can improve the unit tests to detect leakage, using your methodology. That should help to nail down leaky classes.
--
CrawfordCurrie - 27 Apr 2006
I'll try to write an automated version which runs
view_circ and checks the result against a list of "known harmless" persistent data. This would be helpful because people can easily run the test against arbitrary topics, which is needed if one wants to test plugin data.
--
HaraldJoerg - 28 Apr 2006