Tags:
create new tag
view all tags

Bug: regex "once" option fails for mod_perl

A lot of the regexs in TWiki use the "o" optimization flag to restrict regex compiles to one time only for speed purposes.

This does not work well with mod_perl (&apache) because the patterns are reused beyond their original intended scope.

We noticed this problem because of registration failures:

sub htpasswdExistUser 
 { 
     my ( $user ) = @_; 

     if( ! $user ) { 
         return ""; 
     } 

     my $text = &TWiki::Store::readFile( $TWiki::htpasswdFilename ); 
     if( $text =~ /$user\:/go ) { 
         return "1"; 
     } 
     return ""; 
 } 

# it should be: 
 if( $text =~ /^${user}:/gm ) {

(minor regex bug in addition to once flag issue)

Take a look at the "mod_perl traps" page for more info.

I'm sure there are other instances where the regexs shouldn't be reused- and there are many cases where it doesn't make any sense for the once flag to be specified (like a subroutine that's only called once during a program invocation or where the regex will resolve to a string compare, etc.).

My workaround solution was to just strip the once flag from all the regexs I could find. It appears to have fixed the registration issues and did not noticeably impact the execution speed.

Test case

Environment

TWiki version: 20011201
TWiki plugins: none
Server OS: solaris 2.7
Web server: apache 1.3.12
Perl version: 5.61
Client OS: varied
Web Browser: varied

-- EdwardJajko - 29 Apr 2002

Additional Info
(apologies for the delay)

===installpassword====
sub htpasswdExistUser
    if( $text =~ /$user\:/go ) {


====passwd====
sub htpasswdExistUser
    if( $text =~ /$user\:/go ) {


===register===
Side note: sub htpasswdAddUser is out of sync with other versions
-bad naming in installpasswd & passwd


===Attach.pm===
sub extractArgsForFile
   if ( $theText =~ /%FILEATTACHMENT{[\s]*("$theFile" [^}]*)}%/o ) {
(this sub apparently not used??)

I looked at most of the regexs in the TWiki code base; there are other once-regex occurances, but they will only mess up when substational system reconfiguration (changing of central Twiki configuration pm file) and other actions not normally occuring during production use.

Thanks!

-- EdwardJajko - 6 May 2002

Follow up

Thanks for spotting this. Generally, the use of 'o' on regexes is reasonable even with ModPerl - it is just an issue where the left-hand side of a substitute (or a simple pattern match) includes a variable, as here.

If you can point out any other changes where you found a variable within the pattern, I'll make the fixes in TWikiAlphaRelease. The other uses of 'o' are important for performance, e.g. when looping over a number of lines, and don't cause problems with ModPerl.

Perhaps the reason this wasn't found until now is that the fairly heavy ModPerl usage at DrKW doesn't require register (they use client side certificates and LDAP, I think). Other uses of TWiki on ModPerl should be OK... But I'll let JohnTalintyre comment on this smile

-- RichardDonkin - 29 Apr 2002

There has certainly been much work to remove uses of the o flag where there is variable in the pattern. Richar is right that we don't use register at DrKW. Users are auto registered based on their LDAP identity (from certificate or username/password) when they first do an edit. It's a shame there isn't a perl flag to warn on use of this sort of dangerous optimisation.

-- JohnTalintyre - 30 Apr 2002

My understanding of the o flag is that if there is NOT a variable in the pattern, then the o flag makes no difference at all. Perl caches the result anyway. So while it's not necessary, removing the o flag from all regexes with no variable in the regex expression will have zero impact on performance; adding it to all regexes with no variable would also have zero performance impact. It only matters when there's a variable in the pattern, and it sounds like it needs to go away in that case, so you might as well just get rid of it everywhere.

-- WesleySheldahl - 29 Aug 2002

Good point - TWiki code has 'o' flags everywhere, when in reality it only matters with variables in the regex. I checked the docs and your interpretation does seem to be right smile

-- RichardDonkin - 04 Sep 2002

Fix record

This is now fixed in TWikiAlphaRelease (see CVS:bin/register), and tested under CGI Perl (no ModPerl locally) - it does seem that the '^' is needed to handle users such as BobSmith and JimBobSmith, where searching for former could match latter. Hence the 'm' flag is needed.

Please post any other places where you noticed a variable within a pattern, and I'll fix those as well.

-- RichardDonkin - 30 Apr 2002

I'm happy to help, but a little hammered with higher priority work right now. I will check through the code specifically for additional regex "once" issues when I have a chance and append them here or in a new report if this bug is closed.

-- EdwardJajko - 30 Apr 2002

You can just re-open this bug, changing the classification to BugReport again, since any other occurrences will be similar to fix.

-- RichardDonkin - 01 May 2002


Category: TWikiPatches

Edit | Attach | Watch | Print version | History: r10 < r9 < r8 < r7 < r6 | Backlinks | Raw View | Raw edit | More topic actions
Topic revision: r10 - 2002-10-17 - RichardDonkin
 
  • Learn about TWiki  
  • Download TWiki
This site is powered by the TWiki collaboration platform Powered by Perl Hosted by OICcam.com Ideas, requests, problems regarding TWiki? Send feedback. Ask community in the support forum.
Copyright © 1999-2026 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.