Refactoring Proposal: Don't use require for mandatory Perl modules
Motivation
For TWiki 4.2, a lot of
use statements have been replaced by
require, mainly motivated by performance reasons. This change,
however, makes code inspection, debugging and benchmarking more
difficult, and even fails to meet its performance goal in persistent
environments.
Description
All modules which are needed anyway for a particular TWiki request should be compiled with
use and not with
require.
--
HaraldJoerg - 16 Dec 2007
Impact and Available Solutions
Documentation
For TWiki 4.2, many occurrences of
use Module; have been replaced by
require Module; throughout the code. This has been motivated by
performance, but it has some drawbacks - even in the area of
performance.
The benefits are:
- According to CC's measurements,
require is faster than use. This can be rationalized by the fact that require does not even attempt to import any symbols into the caller's namespace.
The drawbacks are:
- In a persistent environment,
require is actually slower than use. And again, there's a simple rationale: use is done at compile time, i.e. once for a persistent process, and does not leave any traces in the opcode for every request. On the other hand, require is a runtime operation, so even if require finds out that the module in question has already been compiled, it has to do so over and over again, once for each invocation of the script.
- Module dependencies are no longer visible by looking at the module's head area. The
require statements are spread throughout the files.
- Performance measurements are skewed if
require takes place in different routines, because the time needed for require is added to the runtime of routines where it occurs.
- Debugging is more difficult because it is no longer possible to set breakpoints immediately after starting the program. Yes, there is
b postpone, but this doesn't offer any checking of possible typos, whereas a plain b will complain if the breakpoint symbol is not defined.
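The compile-time versus run-time distinction behind the first drawback can be illustrated with a minimal sketch. The two core modules used here (File::Basename and Tie::RefHash) are arbitrary choices for illustration, not modules TWiki is known to load this way. Both statements sit in a branch that never executes; the use is nevertheless processed while the file is compiled, while the require is never reached.

```perl
use strict;
use warnings;

# Illustration only: both modules are arbitrary core modules.
if (0) {
    use File::Basename ();    # processed at compile time, even in a dead branch
}
if (0) {
    require Tie::RefHash;     # run-time statement; this branch never executes
}

# %INC records every module file that has been loaded so far.
my $use_loaded     = exists $INC{'File/Basename.pm'} ? 1 : 0;
my $require_loaded = exists $INC{'Tie/RefHash.pm'}   ? 1 : 0;

print "File::Basename loaded: $use_loaded\n";
print "Tie::RefHash loaded:   $require_loaded\n";
```

On a standard Perl installation this should report 1 for File::Basename and 0 for Tie::RefHash. The same mechanism explains why a use leaves nothing to execute per request in a persistent process, whereas each require is executed again whenever control reaches it.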
The remaining reasonable use cases of
require are:
- If the module is needed only when certain conditions are met (see the practice for
require locale in TWiki code)
- If the compilation is wrapped in a string eval, as an additional hint to the reader that this is a runtime compilation
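The two remaining use cases listed above might look roughly like the following sketch. The flag $need_sum and the choice of List::Util and File::Basename are hypothetical stand-ins, not taken from TWiki code.

```perl
use strict;
use warnings;

# Case 1: the module is needed only on one code path.
# $need_sum is a hypothetical flag standing in for a real configuration check.
my $need_sum = 1;
my $total    = 0;
if ($need_sum) {
    require List::Util;                   # loaded only if this branch runs
    $total = List::Util::sum(1, 2, 3);
}

# Case 2: string eval, here with a module name that is known only at run time.
my $module = 'File::Basename';            # arbitrary core module for illustration
my $loaded = eval("require $module; 1") ? 1 : 0;
warn "Could not load $module: $@" unless $loaded;

print "total=$total loaded=$loaded\n";
```

In both cases the reader can see from the surrounding code that compilation is deliberately deferred, which is the point of keeping require for these situations.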
Examples
For the performance aspect in persistent environments, compare the
runtime of the following two snippets:
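#!perl -wT
use strict;
# use is resolved once at compile time; the loop body is empty at run time
for my $i (1..1000000) {
    use Benchmark;
}

#!perl -wT
use strict;
# require is executed on every iteration and has to check %INC each time
for my $i (1..1000000) {
    require Benchmark;
}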
On my machine (3GHz Pentium 4)
use takes about 0.12 seconds, and
require about 0.32 seconds.
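The same comparison can also be sketched inside a single process with the Benchmark module itself. This is an assumption-laden variant, not the measurement quoted above: the labels after_use and with_require are made up for this sketch, and File::Spec is an arbitrary core module. Absolute numbers will vary by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

use File::Spec ();    # compile-time load; nothing is left to run per call

# Time an empty call against a call that executes require every time.
# The module is already loaded, so require only performs its %INC check.
my $results = timethese(1_000_000, {
    after_use    => sub { return },
    with_require => sub { require File::Spec; return },
});
```

timethese prints one timing line per label and returns a hash reference of Benchmark objects keyed by label. The empty sub may trigger Benchmark's "too few iterations" warning, which is harmless here.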
For the debugger annoyance, start
perl -dT with either of the
scripts and then try any of the following commands:
- b Benchmark::new
- f Benchmark
Implementation
Just write use, as in previous versions of TWiki, unless there is a
real chance that, in "plain CGI" environments, a module might escape
compilation altogether by being required only on certain code paths.
Discussion
I am tempted to agree with your findings. However, you are measuring 1000000 loops and still only get a difference of 200 ms? That's not significant. Are there any better benchmarks to base your proposal on (legacy hardware)?
--
MichaelDaum - 17 Dec 2007
Not significant indeed. You can easily improve the benchmark by using ten modules instead of one. This will slow down the hash lookup in
%INC for each iteration of
require, but not for
use. TWiki has more than ten modules.
But anyway: The replacement of
use by
require has been introduced as a performance
improvement, which it obviously fails to be with persistent interpreters. By using
use we could make the code more readable, easier to debug and profile,
and faster by whatever tiny amount with mod_perl, at the expense of a small slowdown for plain CGI (I don't recall the actual figures and can't find the Codev topic).
--
HaraldJoerg - 18 Dec 2007
I do agree with your conclusions, as I use persistent interpreters all the time and don't want to see them slower than they must be.
CDot has made his own statistics, on the basis of which he reworked the code from
use to
require. Could you please compare?
--
MichaelDaum - 18 Dec 2007
If you review the code, you may see that I adopted the following strategy:
- Where a module is always required, then
require it at the top level of the package. I understand that a top-level require behaves the same as a use, and is evaluated at compile time (leaves no opcodes). (Later: see below)
- Where a module is often, but not always, required, make a judgement (just that, no benchmarks) as to whether it should be
used or required.
- Where a module is obviously conditional, then embed a
require as early as possible in the module code.
I'm sure I didn't execute on this strategy perfectly, and there are cases where I have used an embedded
require where a top level
require would be more appropriate.
As far as I know, the only argument for using a
use rather than a
require is where the import of symbols is essential. There are very few modules like this - the only example I am aware of is
Storable.
--
CrawfordCurrie - 19 Dec 2007
Later: I just did this:
package Module1;
require Benchmark;
#use Benchmark;
sub wibble {
    return shift() + 1;
}
1;
and a caller
Module2.pm
package Module2;
require Module1;
#use Module1;
my $j = 0;
for my $i (1..1000000) {
    $j = Module1::wibble($j);
}
then ran it using
time perl -I . Module2.pm
Within the limits of measurement available to me, there is no performance difference in this example whether you use
use or
require at the top level (Module1 exports no symbols). Moving the require in Module2 into the inner loop obviously affects performance, but it's a case-by-case tradeoff whether the cost of the require check is higher or lower than the cost of unconditional compilation. For obvious reasons you should always avoid embedding
require statements in tight, CPU intensive loops.
The bottom line is that there is no single "best way" that applies to every situation. Because of the cost of importing name tables, I can be fairly certain that using
use rather than
require in the same place in code is almost always a bad idea, however.
I really don't think this is worth worrying about too much. The performance gains to be found here are orders of magnitude less than the performance gains from algorithmic improvements (such as template precompilation).
--
CrawfordCurrie - 19 Dec 2007
Agreed: when
require is at the top level, then there's no measurable
performance effect. But on the top level, there's no conditional
compilation either.
I admit that I haven't reviewed every occurrence of
require in the
code, but I started investigating with
TWiki::new and found four
unconditional
require statements within that routine. I've found
four
require TWiki::Attrs in
TWiki.pm as well, and though all
of them can be considered sort of conditional, the occurrence in
sub _expandTagOnTopicRendering is telling a lot: this routine is called
373 times for a simple view of TWiki.WebHome in
SVN.
You write: "Because of the cost of importing name tables, I can be fairly certain that using use rather than require in the same place in code is almost always a bad idea, however." Importing can be suppressed by explicitly specifying an empty list of symbols to be imported:
use Module ();
use Module (); is the exact equivalent of
BEGIN { require Module }
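A small sketch can check that neither form imports anything. List::Util and Scalar::Util are arbitrary core modules chosen for illustration; both would export sum and blessed respectively only on request.

```perl
use strict;
use warnings;

use List::Util ();                # loads the module, calls no import
BEGIN { require Scalar::Util; }   # the equivalent form, spelled out

# can() reports whether a sub of that name exists in the current package.
my $sum_imported     = __PACKAGE__->can('sum')     ? 1 : 0;
my $blessed_imported = __PACKAGE__->can('blessed') ? 1 : 0;

# The functions remain reachable through their fully qualified names.
my $total = List::Util::sum(1, 2, 3);

print "sum imported: $sum_imported, blessed imported: $blessed_imported\n";
print "List::Util::sum(1, 2, 3) = $total\n";
```

Both flags should come out as 0, so the caller's namespace stays untouched while the module is still compiled at compile time.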
Writing
use in most of these cases would bring TWiki closer towards
what the rest of the Perl world is using. That alone should, in my opinion,
make it worth considering.
--
HaraldJoerg - 19 Dec 2007
Sure, I would support and encourage any refactoring that makes code easier to read, as long as there is no (or only small) performance penalty. The
require calls in
TWiki::new are there because there is an ordering dependency in the
BEGIN process that I never fully worked out (you have to make sure
@INC is fully set up before using or requiring certain modules). I just never moved some requires that are known to be called once-only during the
new process - it just didn't seem worth worrying about. But if
use is easier for most people to read - go for it!
--
CrawfordCurrie - 21 Dec 2007
I guess we'll know more soon - I just wrote a
Blog post
on the observation (using the DTrace probes I'm adding to perl) that there are definite benefits to using the
my $var = shift method of getting function parameters - and when I work out where to put the probes for use and require...
--
SvenDowideit - 29 Dec 2007