Bug: regex "once" option fails for mod_perl
A lot of the regexs in TWiki use the "o" optimization flag to
restrict regex compiles to one time only for speed purposes.
This does not work well with mod_perl (&apache) because the patterns
are reused beyond their original intended scope.
We noticed this problem because of registration failures:
sub htpasswdExistUser
{
my ( $user ) = @_;
if( ! $user ) {
return "";
}
my $text = &TWiki::Store::readFile( $TWiki::htpasswdFilename );
if( $text =~ /$user\:/go ) {
return "1";
}
return "";
}
# it should be:
if( $text =~ /^${user}:/gm ) {
(minor regex bug in addition to once flag issue)
Take a look at the "mod_perl traps" page for more info.
I'm sure there are other instances where the regexs shouldn't be reused- and there are many cases where it doesn't make any sense
for the once flag to be specified (like a subroutine that's only
called once during a program invocation or where the regex will
resolve to a string compare, etc.).
My workaround solution was to just strip the once flag from all
the regexs I could find. It appears to have fixed the registration
issues and did not noticeably impact the execution speed.
Test case
Environment
| TWiki version: |
20011201 |
| TWiki plugins: |
none |
| Server OS: |
solaris 2.7 |
| Web server: |
apache 1.3.12 |
| Perl version: |
5.61 |
| Client OS: |
varied |
| Web Browser: |
varied |
--
EdwardJajko - 29 Apr 2002
Additional Info
(apologies for the delay)
===installpassword====
sub htpasswdExistUser
if( $text =~ /$user\:/go ) {
====passwd====
sub htpasswdExistUser
if( $text =~ /$user\:/go ) {
===register===
Side note: sub htpasswdAddUser is out of sync with other versions
-bad naming in installpasswd & passwd
===Attach.pm===
sub extractArgsForFile
if ( $theText =~ /%FILEATTACHMENT{[\s]*("$theFile" [^}]*)}%/o ) {
(this sub apparently not used??)
I looked at most of the regexs in the TWiki code base; there
are other once-regex occurances, but they will only mess up when substational system reconfiguration (changing of central Twiki configuration pm file) and other actions not normally occuring during production use.
Thanks!
--
EdwardJajko - 6 May 2002
Follow up
Thanks for spotting this. Generally, the use of 'o' on regexes is reasonable even with
ModPerl - it is just an issue where the left-hand side of a substitute (or a simple pattern match) includes a variable, as here.
If you can point out any other changes where you found a variable within the pattern, I'll make the fixes in
TWikiAlphaRelease. The other uses of 'o' are important for performance, e.g. when looping over a number of lines, and don't cause problems with
ModPerl.
Perhaps the reason this wasn't found until now is that the fairly heavy
ModPerl usage at
DrKW doesn't require
register (they use client side certificates and LDAP, I think). Other uses of TWiki on
ModPerl should be OK... But I'll let
JohnTalintyre comment on this
--
RichardDonkin - 29 Apr 2002
There has certainly been much work to remove uses of the o flag where there is variable in the pattern. Richar is right that we don't use
register at
DrKW. Users are auto registered based on their LDAP identity (from certificate or username/password) when they first do an edit. It's a shame there isn't a perl flag to warn on use of this sort of dangerous optimisation.
--
JohnTalintyre - 30 Apr 2002
My understanding of the o flag is that if there is NOT a variable in the pattern, then the o flag makes no difference at all. Perl caches the result anyway. So while it's not necessary, removing the o flag from all regexes with no variable in the regex expression will have zero impact on performance; adding it to all regexes with no variable would also have zero performance impact. It only matters when there's a variable in the pattern, and it sounds like it needs to go away in that case, so you might as well just get rid of it everywhere.
--
WesleySheldahl - 29 Aug 2002
Good point - TWiki code has 'o' flags everywhere, when in reality it only matters with variables in the regex. I checked the docs and your interpretation does seem to be right
--
RichardDonkin - 04 Sep 2002
Fix record
This is now fixed in
TWikiAlphaRelease (see
CVS:bin/register
), and tested under
CGI Perl (no
ModPerl locally) - it does seem that the '^' is needed to handle users such as
BobSmith and
JimBobSmith, where searching for former could match latter. Hence the 'm' flag is needed.
Please post any other places where you noticed a variable within a pattern, and I'll fix those as well.
--
RichardDonkin - 30 Apr 2002
I'm happy to help, but a little hammered with higher priority work right now. I will check through the code specifically for additional regex
"once" issues when I have a chance and append them here or in a new report if this bug is closed.
--
EdwardJajko - 30 Apr 2002
You can just re-open this bug, changing the classification to
BugReport again, since any other occurrences will be similar to fix.
--
RichardDonkin - 01 May 2002
Category:
TWikiPatches