Tags:
create new tag
, view all tags

Feature Proposal: Finer Granularity Of Enabling Registration Based On Pattern Matching Email

Motivation

There are good people and bad people on the Internet. Current systems for online cooperation aimed at corporate users often fall down because of underlying assumptions made about registration. Corporate internet based systems are typically read-only with a few users publishing. Corporate Intranet based systems are typically open to all users of the company and are protected behind a firewall. My own personal experience has involved working on large projects where there may be 10 or so companies required to cooperate on certain items, but who compete on others. An example is a multi-sourcing contract, where competing suppliers need access to customer information, but no-one has the place to host such a site.

So the key features of this case for using a TWiki are

  1. you need to be able to host the site on the Internet where it is accessible to all users.
  2. there are large numbers of interested parties, possibly a few hundred.
    • some of us deal with thousands, quite happily
  3. you do not know exactly who these people are in advance, so you cannot add them manually,
  4. but you do know who they work for and hence their registration email..
  5. the project is pretty valuable but the community small.
  6. You do not want to spend a lot of time fighting spammers.

So more granularity is required for who can register, rather than just the current "on" or "off" behaviour.

Description and Documentation

Increase the granularity of registration based on pattern matching the validated email address from the registration request against a static pattern. The pattern match directive can be stored within the TWiki as a 'set' command in the registration topic.

Examples

Enter patterns on the Registration page

  • # Set ALLOWREGISTRATIONFROMEMAIL = @*.com, @*.edu, @*.org
  • # Set DENYREGISTRATIONFROMEMAIL = @nastypeople.com

Impact

WhatDoesItAffect: API, Install, UI, Usability

Implementation

add a method to the register.pm library to search for pattern preferences, check text against pattern match, and return 0,1 call the method from register.cgi and throw an oops if registration is not allowed from this email address. Do NOT call from bulk register, as we presume the admin knows what he is doing by allowing these guys in.

The diffs below are a fairly complete and tested code sketch to demo the concept: it can be updated over time I am not sure what other files would need to be altered to match this behaviour e.g. examples and docs

I would be happy to take this on, but I am not a TWiki developer, so my style may not suit you.


/var/www/twiki/lib/TWiki/UI$ diff Register.pm.dist Register.pm
16a17
> #           (c) 2007 Ray Hunter globis.net
1134a1136,1145
>     # validate email address is allowed to register
>     # against registration preferences in TWikiRegistration
>     if ( _validateEmailPattern($session,$data->{Email}) != 1) {
>         throw TWiki::OopsException( 'attention',
>             web => $data->{webName},
>             topic => $session->{topicName},
>             def => 'registration_forbidden',
>             params => [ $data->{Email} ] );
>     }
>
1319a1331,1479
>
> #### new code
>
> sub MONITOR { 0 }
>
> sub _validateEmailPattern {
>
>   # my ($session,$email) = @_;
>   my $session = shift;
>   my $email = shift;
>
>   # make sure we only get the preferences used in a registration topic
>   # blessed by admin if set in the SiteLocal configuration.
>   # otherwise current topic is used.
>   # Avoids people taking copies of the registration form and editing patterns
>   # $TWiki::cfg{Register}{TWikiRegistrationWeb} = 'TWiki';
>   my $web =  $TWiki::cfg{Register}{TWikiRegistrationWeb};
>
>   # $TWiki::cfg{Register}{TWikiRegistrationTopic} = 'TWikiRegistration';
>   my $topic = $TWiki::cfg{Register}{TWikiRegistrationTopic} ;
>
>   return _checkEmailAgainstList($session,$email,'REGISTRATIONFROMEMAIL',$topic,$web,1);
> }
>
> sub _checkEmailAgainstList {
>   my $session = shift; # session object
>   my $email = shift; # email to match (e.g. a validated email)
>   my $pn = shift; # optional preference name containing list of allowed and denied domains
>   my $topic = shift; # optional topic holding the preference if not this topic
>   my $web= shift; # optional web holding the preference if not this web
>
>   my $noMatchValue=shift; # optional value to return on no explicit rules match.
>                           # suggest undef=unknown (default), 0=deny or 1=allow
>
>   # set sensible defaults
>   $pn = "REGISTRATIONFROMEMAIL" unless $pn =~ /\S$/;
>   $topic = $session->{topicName} unless $topic =~ /\S$/;
>   $web = $session->{webName} unless $web =~ /\S$/;
>
>   my $apn = "ALLOW".$pn; # allow preference name
>   my $dpn = "DENY".$pn; # deny preference name
>
>   # Extract Email list Preferences from the TWiki
>   my $prefs = $session->{prefs};
>
>   my $allowList = $prefs->getTopicPreferencesValue( $apn, $web, $topic );
>   my $denyList = $prefs->getTopicPreferencesValue( $dpn, $web, $topic );
>
>   # convert @company.net, @*.org to (@company\.net$)|(@.*\.org$)
>   my $drp = _convertDomainSuffixListToPattern($session,$denyList);
>   my $arp = _convertDomainSuffixListToPattern($session,$allowList);
>
>   # change to lower case but preseve any undef values
>   $drp = lc $drp if defined($drp);
>   $arp = lc $arp if defined($arp);
>   $email = lc $email if defined($email);
>
>   print STDERR "_checkEmailAgainstList " if MONITOR;
>   print STDERR "email $email web $web topic $topic\n" if MONITOR;
>   print STDERR "dpn $dpn apn $apn\n" if MONITOR;
>   print STDERR "denylist :$denyList: allowlist :$allowList:\n" if MONITOR;
>
>   return _checkTextAgainstPattern($session,$email,$drp,$arp,$noMatchValue);
>
> }
>
> sub _convertDomainSuffixListToPattern {
>   # convert @company.net, @*.org to (@company\.net$)|(@.*\.org$)
>
>   my $session = shift;
>   my $text = shift; # a comma separated list of suffixes
>                     # e.g. @company.net, @*.org
>
>   my $rp=undef # regexp pattern to be returned
>                # e.g. (@company\.net$)|(@.*\.org$)
>
>   # split domain list on comma separator and step through
>
>   my $item;
>   if (defined($text)) {
>    foreach $item (split(/,/,$text)) {
>     # avoid warnings of concat to undef, but allow undef if there are no items
>     $rp = '' unless defined($rp);
>     # strip leading and trailing spaces from the item caused by some editors
>     $item =~ s/\s+$//;
>     $item =~ s/^\s+//;
>
>     # convert email domain @*.company.net to pattern match (@.*\.company\.net$)
>     # escape dots to \.  convert * to regex .* match
>     # anchor suffix and concat it onto the end of the regex pattern
>     $item =~ s/\./\\\./g; # escape dots
>     $item =~ s/\*/.*/g; # convert stars
>     $rp .= '('.$item.'$)'; # anchor, package to avoid contamination and add
>     # add a | ready for the next item
>     $rp .= '|';
>    }
>   }
>
>   # strip any last trailing |
>   $rp =~ s/\|$// if defined($rp);
>
>   print STDERR "List ($text) converted to regular expression ($rp).\n" if MONITOR;
>   return $rp;
>
> }
>
> sub _checkTextAgainstPattern {
>   my $session = shift; # session object
>   my $text = shift; # text to match (e.g. a validated email)
>   my $drp = shift; # deny regex pattern
>   my $arp = shift; # allow regex pattern
>   my $noMatchValue = shift; # value to return on no explicit rules match.
>                           # suggest undef=unknown 0=deny or 1=allow
>
>   # Check DENYPATTERN, snarfed from checkAccessPermission to keep the logic exactly the same
>
>   if( defined( $drp )) {
>     if( $drp =~ /\S$/ ) {
>       if( $text =~ $drp ) {
>         $session->{failure} = $session->i18n->maketext("denied for $text");
>         print STDERR $session->{failure}." matching ($drp)\n" if MONITOR;
>         return 0;
>       }
>     } else {
>     # If deny pattern is empty, don't deny _anyone_
>       print STDERR "DENY PATTERN is empty\n" if MONITOR;
>       return 1;
>     }
>
>   }
> # Check ALLOW PATTERN. If this is defined the text _must_ match it
>   if( defined( $arp ) && $arp =~ /\S/ ) {
>     if( $text =~ $arp ) {
>       print STDERR "($text) in ALLOW PATTERN ($arp)\n" if MONITOR;
>       return 1;
>     }
>     $session->{failure} = $session->i18n->maketext("not allowed for $text");
>     print STDERR $session->{failure}." not matching ($arp)\n" if MONITOR;
>     return 0;
>   }
>
>   # default action is to return noMatchvalue. Caller can check take
>   # own appropriate default action (deny or allow) on no explicit match
>   print STDERR "Default action return ($noMatchValue) for ($text) due to no preferences matched\n" if MONITOR;
>   return $noMatchValue;
>
> }
> ### end new code
>

var/www/twiki/lib$ diff TWiki.spec.dist TWiki.spec
406a407,416
> # **STRING 40 EXPERT**
> # The name of the Registration web used to store registration preferences.
> # Leave blank to allow users to set preferences in their own copy of the form.
> $TWiki::cfg{Register}{TWikiRegistrationWeb} = 'TWiki';
>
> # **STRING 40 EXPERT**
> # The name of the Registration topic used to store registration preferences.
> # Leave blank to allow users to set preferences in their own copy of the form.
> $TWiki::cfg{Register}{TWikiRegistrationTopic} = 'TWikiRegistration';
>


diff messages.tmpl.dist messages.tmpl
152a153,163
> %TMPL:DEF{"registration_forbidden"}%
> ---+++ %MAKETEXT{"User Registration disabled for your email"}%
>
> %MAKETEXT{"The Administrator has disabled new user registration for your email address."}%
>
> %PARAM1%
>
> %MAKETEXT{"Please contact [_1]." args="%WIKIWEBMASTER%"}%
>
> %MAKETEXT{"You have *not* been registered."}%
> %TMPL:END%

-- Contributors: RayHunter - 16 Apr 2008

Discussion

Interesting idea. What's to stop someone registering with an "innocent" email address and then subsequently changing to an evil one?

  • Just set $TWiki::cfg{Register}{NeedVerification} to a true value. If you register with an innocent email address then the verification code will be sent to that innocent address. -- HaraldJoerg - 16 Apr 2008

Another (perhaps complementary) approach would be to filter on the IP address of the registrant.

IMHO this is yet another argument for a pluggable registration module.

-- CrawfordCurrie - 16 Apr 2008

This idea is not new for me. I have written my own webAAA package for registering PERL applications and deployed this on the Internet.

> what's to stop people changing addresses

Nothing at all. People can edit their own contact email. I read that TWiki was aimed at the corporate world, which made it more attractive for me. The assumption is that if someone has access to an email account at a trusted company then they are a trusted individual. The key here is that the user topic is only created when someone actually registers (so we assume that NeedVerification= true and they have received a registration email received in a real account.) You should always be able to go back in the log and work out who registered with what address. And then you can find their boss with a little bit of effort. Most, but not all, people in the corporate world have a boss who can discipline them.

> filter on IP address
yes. We do that too in our own implementation of our registration system. e.g. to only allow registration from trusted networks e.g. only someone located behind the corporate firewall can register, but once they have a user account they can log in from anywhere.

Which is why I wanted to write a generic pattern match routine so that could be called from registration by adding more preferences later e.g. ALLOWREGISTRATIONFROMIP = 131.176.0.0/16

We've run that system on the Internet and had few problems. Most spammers tend to move on if you put a few speed bumps in the way. Of course really determined people would try other attack vectors in the libraries, such as directly inserting a user topic or editing .htmlaccess.

We limit which applications people can register for e.g. users@company1PLEASENOSPAM.com can access all the deployment dates but not the dynamic project reporting docs. users@company2 can access internal project progress metrics that are generated on the fly by a PERL script. Project members can run a script to update target dates.

This could be done in TWiki using the standard allow/deny topic access mechanism with multiple automatic groups each with its own pattern (I've posted about this elsewhere on the add group membership at registration topic AddUserToGroupsOnRegistration.)

> pluggable registrion module
I agree that different people have different requirements here. But one step at a time?

Question for you: Is it better to go for power and have a regular expression in the preference, or is it better to have a comma separated list of email domains and generate the regular expression on the fly automatically?

Idea 1:

  • # Set ALLOWREGISTRATIONFROMEMAIL = @nicecompany\.com$|@morenicepeople\.com$
  • # Set DENYREGISTRATIONFROMEMAIL = @nastypeople\.baddomain$

Idea 2:

BTW Bugzilla already implements this idea to limit who can register bugs and for which product, some presumably the open source people at Mozilla also found this useful.

-- RayHunter - 16 Apr 2008

From the user and less experienced admin point of view i would definitely prefer idea 2, regex is too geeky.

-- CarloSchulz - 16 Apr 2008

If you can generate the RE on the fly, so much the better. A regex is more flexible, but I suspect in this instance you don't need all that power, and simplicity is good.

I like the idea. The remark about pluggable registration wasn't really aimed at you; it was aimed at Sven, who has been muttering about it for years.

-- CrawfordCurrie - 16 Apr 2008

A while back I had a client requested this feature and we ended up implementing it using javascript within the registration form. (I doubt this was completely secure but adequate for the circumstance.) We discussed using a regex list versus simple list of domain names and they said "regex what?" I would vote for simplicity in this case.

-- LynnwoodBrown - 16 Apr 2008

OK. Understood. I bow to your knowledge of the user base. I have altered the draft code and the spec to reflect the change to be a simple domain suffix list interface, with an on-the-fly pattern generation. More code but easier for the users.

I've basically got this code running now on my own TWiki system, and it seems fine. What else needs to be done (e.g. in terms of documentation) to fold this back into a release? And would you guys help me out in pointing me in the right direction so I could become the commited developer?

BTW Lynnwood. Whilst Javascript is a very useful first pass at preventing garbage in forms, and may even prevent casual spammers, you should really perform the same checks in the back end PERL, because Javascript checks won't stop people manually creating their own post request and pasting this direct into the browser address bar.

e.g. http://192.168.0.99/twiki/bin/register/Main/WebHome/register?Twk1FirstName=Ray&Twk1LastName=Hunter2&Twk1WikiName=RayHunter&Twk1Email=Ray.Spam@dontspamme.com&Twk0Password=aaaa&Twk0Confirm=aaaa&Twk0OrganisationName=&Twk0OrganisationURL=&Twk1Country=Albania

-- RayHunter - 17 Apr 2008

I wonder what good this feature will really do in a corporate world. In my Corporate World we all are known as .mot.com. Or .motorola.com. In my wife's company it is all novonordisk.com.

If a feature that places people automatically in a specific group upon registration is going to have value to corporations it has to work on any of the fields in the registration form. And the groups you would add people to should never be used for anything related to access to confidential information. But auto generating groups based on Site, Country or department could be made to good use.

-- KennethLavrsen - 17 Apr 2008

If all the TWiki users are all mot.com then it is probably of little or no value to you. In that case you'll probably host your server on your intranet, and people will connect via a secure corporate VPN. You probably don't even need any email verification in that case. Completely open registration for all users may be perfect.

The value comes when you are cooperating beyond the boundaries of your company and need to host on some neutral terrein outside the security blanket of the corporate VPN.

My experience is that this situation (especially in staff functions like IT) is becoming more common, especially in the corporates who are looking to outsource.

The problem with combining more fields from the registration form to determine whether registration is granted or not (whilst possibly desirable from an end user perspective) is that they are not (yet) authenticated. Only the email address is authenticated (assuming you set $TWiki::cfg{Register}{NeedVerification} = true;)

Otherwise you are then in the usage case of trusting the users to fill in the correct items on the web registration form (self registration) which is already covered elsewhere in another change proposal.

May I humbly suggest that Mototola does co-operate with other companies on many levels e.g. a quick google search shows JVs with Siemens, WMNetServ, Toshiba, Nortel, Wipro, HuaWei ? And you've probably got a few hundred large customers. You're probably not going to put sensitive chip design info out on an Internet TWiki, but there is plenty of other information that Motorola can and already does share and cooperate on e.g. social contacts, how-to's, FAQ's, open technical standards development, wifi alliance forum, feedback from designers and users, and application notes. I've seen similar things elsewhere in the electronics and semiconductor industry. The Wifi alliance might be a good example to highlight. This is a community that extends beyond corporate boundaries. Companies that are normally fierce competitors are cooperating to increase the overall market for their joint technology solution. Several companies "sponsor" this alliance. If the alliance ever wanted to use a TWiki to generate a discussion or answer FAQ's you might want to let all employees of sponsor companies register to a TWiki. Registered users could then run e.g. an application to approve, answer, and mark FAQs for publication. Whilst TWiki guests would be able to enter FAQs without registering, but these would not be moved and linked to the publication section.

But that's just my 2 (euro)cents worth.

-- RayHunter - 18 Apr 2008

I am not arguing against the feature proposal. I think it is great. But it would have much greater value if the pattern matching would work on any of the registration fields and not just the email address

-- KennethLavrsen - 18 Apr 2008

I seriously think this code should go into a Plugin, and we should make the effort to work out the most flexible way to add a pluginHandler to make it possible. That way we're not limited to the small set of options that we think of now.

What I noticed when I was working on that code last, was that the existing registrationHandler (good if you need to hook in to do an action after the rest of registration has succeeded) was almost good enough. The only 'bug' in it, is that it passes through too little information.

        $session->{plugins}->dispatch(
            'registrationHandler', $data->{WebName},
            $data->{WikiName}, $data->{LoginName} );

I propose we add an extra parameter ,that passes the entire $data hash, to make it backwards compatible, but thus giving access to all the information. (perhaps remove the password - dunno)

Ray, would this then allow you to implement your extension?

As Crawford said, architecurally speaking TWiki::UI::Register is a mess. It mixes in many assumptions that are specific to the twiki topic based user system, crippling other usermapping implementations a bit - and so I'd like to see a move to more pluggability, rather than adding more specifics inline.

-- SvenDowideit - 19 Apr 2008

I fully agree with Sven in this.

As a plugin there is much more flexibility to do advanced stuff.

-- KennethLavrsen - 19 Apr 2008

I hear your comments. I came across a "gotcha" myself yesterday. I was testing my registration code and it was failing partially. I had no idea why. It was creating the user log in, but not the user topic. Finally I worked it out. I had restricted change access to our Main web to be OurCompanyGroup only.

My mentality is apparently the other way on to most other TWiki users: nail it down first and then open up what I need, especially on the Main web which controls user access and groups. But hey I'm learning.

Of course TWikiRegistrationAgent could not create new user topics, as it had no web change access rights. So I agree that the cases of granting 'suid' rights during registration are not yet consistent and or clear. register_cgi can bypass settings to write .htpassword password files, but apparently does check web permissions and does not bypass these when creating new topics.

I think that there have indeed been a number of assumptions made about the cgi handler register_cgi.

I cannot comment directly if the plugin would make this cleaner. I don't have the depth of experience to tell you whether it will help or not. I suspect it would at least help to move register.pm to a more method based solution.

In a direct answer to your question, the existing plug in registrationHandler is actually called in the wrong place for my use (see discussion below) It is only called during '_complete' which is too late for me. Providing sufficient flexibility would require more radical surgery than your proposal of adding the extra data access. It would require restructuring register.pm to me more in-line with the original registration flow: Register -> Verify -> Approve -> Finish.

Registration by definition requires suid type rights. You have to pretend to be another user during registration, because the user does not exist yet. So there is little point in trying to restrict what the process can read IMHO. And yet I can't see any identifiable and re-usable call anywhere that deals specifically with this issue.

You do need to have the change restrictions embedded in the fundamental registration algorithm, and defined user rights for each step of the process i.e. finishing registration requires very different rights from initial registration and verification.

Maybe there actually needs to be four plug in handlers (or at least four parts/calls to a single plug-in handler) to match the registration flow: one call each for Register -> Verify -> Approve -> Finish. That way you could place your own verification handler in the verify step (possibly not based on email, but something else like a manual phone call from an admin, or an SQL call to another system), without affecting another handler that automatically adds user to groups (in the Finish step). A custom registration handler may have to augment (do additional actions over and above the default action) or perform a a completely different action (overload a pre-defined method). A generic registrationHandler plug in should support this flexibility.

My particular proposal contained in this change would have to be called in the Register portion of the flow. Other people's ideas such as AddUserToGroupsOnRegistration would be called after the default Finishing actions in _complete. Other people's needs would be called instead of the default email verification.

So that actually now comes out potentially to 8 different handler calls. The default method for each step in registration (that can be overloaded) to perform alternative functionality, and an augmentation post-processor (that is called consecutively after the default step) to perform additional actions for additional functionality.

One thing that does concern me in this whole thing is that the only place that appears to be a 'protected area' to store admin settings that are web related but not site related is LocalSite.cfg.

If a registration plug in is user editable for the specification of any filtering rules, what is the point of creating the plug-in if unprivileged users can anyway create their own registration topics or edit the main registration topic? [and they can today by simply taking a copy of TWiki Register using 'raw view' and then modifying it and storing it as a new topic..... which is why I was only reading settings from one named topic that I'd protected from change by unprivileged users]

The solution I used in the code above was to put the settings in a named topic defined in LocalSite.cfg as $topic = $TWiki::cfg{Register}{TWikiRegistrationTopic} and then only allow change access on this topic to TWikiAdminGroup in the TWiki using the normal access control solution.

The way I handled these issues in my own registration libraries was to completely separate cgi application code, webAAA library, data, and HTML templates, which is not intuitively TWiki-like, and not immediately obvously like a plug-in either at first glance. Users where registered using a minimal CGI script that generated all the forms on the fly, based on methods from a PERL package webAAA, and ran in multiple form steps. Clever people can write their own cgi registration script (it is only a few lines long, as it uses so many method calls), and can overload a specific package method for handling a particular step in the registration flow with their own method, whilst leaving the other methods in place. User data was stored in a SQL DB. There was a special sysAdmin user with change rights on the user DB, fields, and filtering rules. HTML templates were picked up from a specific location, editable by companyAdmin, who did not require heavy access rights to edit the HTML, as long as they left the CGI hidden variables in place it was OK, and they couldn't fake or remove these hidden variables because they were encrypted.

I have no clear idea if these concepts could be translated directly to TWiki: I guess sysAdmin would translate to the admin user; companyAdmin to TWikiAdminGroup; registration form to a normal TWiki topic; my library package webAAARegister.pm would map to a registration plug in; but where to store that protected registration data, a specific TWiki topic to store registration settings?

So IMHO for a TWiki registration plug-in to be successful it would need:

  1. a protected place to store settings that is TWikiAdminGroup changeable, but protected from normal users
  2. this is preferably not LocalSite.cfg to avoid potentially granting access to the crown jewels and avoid geekiness. The admin from TWikiAdminGroup who adds users and groups etc. may be a different person to the original TWiki installer.
  3. prevent unprivileged users from creating their own registration topics settings that bypass the settings created by the TWikiAdminGroup. They may be able to edit the format etc. but not settings.

The proposed solution would be to create a new dedicated topic TWikiRegistrationSettings (referred to by config variable $TWiki::cfg{Register}{TWikiRegistrationSettings}) that is default changeable by TWikiAdminGroup only, and by default allows everything as is. There should ovbiously be a method to writeRegistrationConfiguration writeRegistrationObject readRegistrationConfiguration and readRegistrationObject to abstract this storage requirement from the fact that the users are actually stored in the Main TWiki at the moment, in case someone wants to store them elsewhere e.g. an SQL db.

In addition, if you really want flexibility.

  1. the registration plug in should have unrestricted access to all relevant data values used in registration.
  2. the registration plug in should be called at each step during the registration flow to allow flexibility.
  3. you need to provide methods in register.pm that handle the suid requirements of registration (such as isSetTWikiUIDAllowed, and setTWikiUID) These of course should be "overload-able" to support other methods of user authentication, such as a system call.
  4. allow methods in the registration plug in that can override the default action (method overload) to perform alternative functionality to the default action,
  5. allow methods in the registration plug in that augment the original default action (additional method called consecutively) to perform additional functionality after the default action.

The proposal would require restructuring the register.pm and registrationHandler to be method based following the 4 step flow Register -> Verify -> Approve -> Finish, with a default and post-processor method for each step. In the extreme case, you may even want an array of methods for each step, in case more than one person wants to intercept the registration flow for their own needs.

This is going to cause some discussion I am sure. But I think it goes a fair way towards the clean architectural solution you've been looking for. My additional few lines of code could then simply be located in the plug in handler for the initial registration, and indeed I could steer clear of register_cgi completely. Other people could do the same with their amendments, such as automatic group membership, and register_cgi may suddenly become a whole lot simpler and more modular.

-- RayHunter - 21 Apr 2008

gads, is there any chance that you can reduce the length of this? or perhaps structure it to help me take it in?

I think I will agree with you, and can help you make it happen, but it is hard to be sure.

A few observations of my own:

TWiki Registration really should not be part of the 'Plugins' API, as that opens up a free-for-all, where security becomes compromised by some randome Plugin that is able to use the API.

At the API level we are heading towards the same results, but I was thinking of moving these into the UserMapping system, and making them chain-ably pluggable, so that the Admin must intentionally select them via the restricted configure interface. Another reason to use 'UserMapping' Chains, is that many of them are specific to a UserMapping implementation - mine for example often use a database to store the entire dataset, and then only presents a slice of that data to the TWiki session (preventing exposing confidential information right at the source).

I have implemented this kind of thing on an adhoc basis for some clients needs, and it has thus far worked in those specific cases, but for a generic implementation, more work is needed.

-- SvenDowideit - 22 Apr 2008

I'll try and simplify a few things then and keep it short(er)!

I'm new to this so I'll just say it how I see it to generate some sensible debate. Suspect I've opened a can of worms.

I think the core of the problem boils down to: what is User Registration and where does it belong in the TWiki system?

Is Registration considered a special item part of the UI, a plugin, or is it part of user mapping, or do parts of it belong in all of the above, or elsewhere as it's own package? I think registration merits its own package, as it is fundamental to security and smooth operations. Especially as LoginManager has its own identity. Why not RegistrationManager? I could also understand it being part of user mapping.

Today, it seems to be a special case of the UI as far as I can see, but it seems to make a whole load of assumptions about other elements in the TWiki system. I can't understand why it is UI.

IMVHO user registration touches a whole load of other elements in the system.

That is why there seems to be differences of opinion, depending on your priorities (e.g. lock it down for security or open it up for ease of use?) I accept that there are different priorities for different systems, depending on how they are deployed.

So flexibility and a clear API is definitely required.

In summary I think that registration can de split up further:

There is a portion that needs to gather the initial registration request data. At the moment this is done by displaying a user registration form: that could be static form stored as part of the TWiki text, or a stand alone cgi script that generates a registration form on the fly as part of UI, or another file based interface such as plain old HTML text. Today it is a static TWiki topic, which is fine. There should also be a displayRegistrationForm method in case people want to generate custom forms on the fly. There should also be a file-based /DB interface in the future to generate initial registration requests in bulk, fed from other systems.

There is a portion required to accept input from CGI based forms. That probably belongs in UI. It needs a clear API (and pretty much has one today based on the action variable and TW1Email etc.) Today register_cgi is part of UI. That seems about right, but it should be stripped right down to just purely cope with the specifics of the CGI interface, in case people want to run registration without going through CGI. Talking to CGI is not the core activity in registering users.

There is a registration specific portion required to control the basic registration flow: Register -> Verify -> Approve -> Finish. That is probably registration specific, and is certainly the core activity. It should certainly be independent of the UI in case you want to bulk register offline based e.g. on a file or a DB link. I think it would be sensible to think about a registration object passing along the flow, and at each point in the flow it is subjected to methods. e.g. $registrationRequest->verify(); Today this is hidden in the UI. I think it should probably be its own package with its own inheritance and methods based on a registrationObject. I could live with it being in user mapping. In any case it must be possible to overload and augment methods.

There is a portion that needs to be able to store and read registration objects as they pass through the flow. That is probably generic for all flow based apps, and could be part of UI? Today they are stored as special TWiki topics, but the methods to do this are hidden functions derived from the CGI. It should have an explicit method call like writeObjectStatus and readObjectStatus that handles a hash, rather than being hidden as an internal function in Register.pm. These belong elsewhere than in the registration library. I do not know where yet. Possibly a flow based application package base class, so that a Registration package could inherit them. It's a question of style.

There is a portion that needs to manipulate users (and groups) rights, passwords, and login, which probably belongs in user mapping. Today it is hidden in the CGI finish. It should be called as explicit methods like addUserToGroup addUser addGroup removeGroup removeUser isMemberOfGroup removeUserFromGroup changePassword checkPassword. I can only find a subset of these today.

There is a portion that needs to detect when and where suid type operations are permitted, and how to suid (is this part of user mapping? or is this a registration specific item, or is it part of configuration?) I cannot find this explicitly today. I think it relies on the $agent variable, and then people settings up change rights to topics correctly. It should call to explicit methods such as isSetuidAllowed setUid as part of user mapping.

There is a portion that needs to be able to create the users home topic in the TWiki. Today this is hidden in the CGI finish. It should be an explicit method call like createUserHomeTopic. This is not really user mapping, nor registration, but more closely bound with handling topics.

There is a portion that needs to be able to read it's own configuration, for example which filters to set. This is generic for all apps, but I think you need to be careful about access rights considering that registration needs suid rights at certain moments in the flow, but you do not want to expose the rest of the config to non-registered users. It should call things like readRegistrationConfig writeRegistrationConfig. This could be registration specific, but again it would be handy if this were made generic for all flow based applications.

So you can see that at the moment all of this is stored in internal functions around register_cgi and bulk_register, when in fact register_cgi should just be something really simply like.

use Registration.pm

sub register_cgi {

    # ignore all input unless registration is enabled
    die "Registration disabled." unless ($session->inContext('registration_supported'));

    my $registrationObject= Registration->new();

    # create the new registration request based on the contents of the form
    $registrationObject->register(@cgi_vars) if $action=register;

    # otherwise process existing requests
    $registrationObject->verify(@cgi_vars) if $action=verify;
    $registrationObject->approve(@cgi_vars) if $action=approve;
    $registrationObject->finish(@cgi_vars) if $action=finish;

    # last resort is to display a registration form
    $registrationObject->displayRegistrationForm(@cgi_vars);
}
and the rest should have proper APIs.

The code is there today, but the interfaces just aren't at all clean, so you cannot overload or augment the methods with your own without delving deep into the Register.pm itself. And opening them up as API's will facilitate other flow based applications in the future.

-- RayHunter - 22 Apr 2008

IMHO user registration belongs with the login manager/user mapper/password manager triad. It has no place in the core, and should not be part of the plugins API. UI generation can use services from the core, but should not dictate a standard.

'suid type operations' is a bit of a misnomer. You can perform 'suid type' operations from anywhere; just don't tell the TWiki function you call who the user is (pass undef) and it won't check.

My personal preference is not to execute flow based apps through top level script invocations, but to use REST handlers instead. By using REST handlers you could do everything from a view template, and not require a register script at all.

-- CrawfordCurrie - 22 Apr 2008

one thing you don't seem to be addressing - in TWiki registration is a tiny, almost irrelevant component - its used once by most users, and after that, isn't used at all.

Thats one of the main reasons the code hasn't been touched, I left it to last in the usermapping changes I did, and ran out of time.

The UserMapping and permissions system is quite seperate, specifically because alot of TWiki's don't use registration at all. They run using users defined from outside.

The rest of what you wrote, I'll read some other time.

-- SvenDowideit - 22 Apr 2008

I can't agree that how you identify (new) users is "almost irrelevant," even if it only executes once under normal operations, and even if not all TWiki's use this particular registration method. So I guess we'll agree to differ on that point.

The bottom line is all I'd like is a clear place where I can hook in ~140 lines of additional code in the registration flow during initial registration without breaking anything: it's a simple call that either allows normal flow to continue or throws an oops exception. Nothing more. And that is apparently tough at the minute because no one seems to be able to agree on even how user registration should be invoked. I've now seen 4 options: External user mapping handles registration, registration as a plug in handler in UI, as a top level CGI script called via UI, and a REST handler. So I think I'll just leave my patch as-is on my own personal TWiki and publish the code here. It works fine for me, and no-one else seems to see the need for it as they have other solutions, which is also fine. That's the joy of open source.

As for a TWiki equivalent of suid on the TWiki user: of course I'm aware that you are really running as something like UNIX user www-data and you can do all you want on the operating system level that the Apache user can do. But I'm not sure all TWiki developers/users appreciate that fact, as evidenced by the fact that the .htpasswd writing did not break when I set Set ALLOWWEBCHANGE = MyCompanyGroup on my Main web, whereas creating the home user topic did break because the existing registration code assumed it should run as TWikiRegistrationAgent and it could not write a new topic. I think it would be a handy set of generic methods to have, even if all it did at this time was do a text search for change rights for a specific TwikiUid on a named TWiki topic. Because if you want to change this permissions functionality in the future, you're probably gonna have to look in a lot of places to look for people who have made assumptions about which user they are running as, and how to check for proper permissions e.g. because they are currently bypassing the access module altogether by using undef as the username. If it's easy to check and there is an existing method then developers will use it. Otherwise they could bypass access checking altogether to get their code working. We're a lazy bunch.

I'm not so comfortable with CGI scripts that can be accessed by non-logged in application users that have world write access on the application data and that do not perform too many stringent checks. Maybe you guys are. Just check your apache logs for the wierd garbage that bad people send as requests to popular open source packages like horde and bugzilla, even if they are not even installed on your machine. I mean the check for registration being enabled in Register.pm: if (!$session->inContext('registration_supported')) is only checked if $action eq 'register', but not for any other URLs or values of $action passed to register_cgi. So the registration package is not disabled by the setting config variable 'registration_supported' to false, which I would expect as a dumb user: only the inital registration functionality is disabled, the rest is still accessible. TWiki is not the only app I'm running under CGI. So maybe I'm now thinking I have to run it on a separate virtual server under a different Unix user to protect those other apps.

-- RayHunter - 22 Apr 2008

I thought you were opening a discussion, and thus having the different options discussed so we could determine a best way forward is the only course.

Also, the 4 options that you list are all valid use cases that we support today, and all have importance to some part of our user base. There is work needed to improve the architecture, to improve flexibility and reuse, but they are in no way mutually exclusive.

Thats the other joy of open source - finding the best technical solution, and then implementing it, despite not needing the flexibility yourself!

interesting point wrt (!$session->inContext('registration_supported')), though you should find that the other actions cannot go far without the data from step 1. I added that context very late in the piece, more to disable the registration UI than anything else, and so didn't make it as intrusive as it could be - as disabling registration may mean you would like users that have completed phase 1 of registration, and only need to verify themselves to be able to. Basically, the Dumb user would get what they expect - where TWiki tells you theough the UI 'registration disabled' it is, and where is doesn't (like the verification email) it isn't.

-- SvenDowideit - 23 Apr 2008

Sven and Crawford. Instead of scaring Ray away - how do you suggest Ray implements this feature as an extension? And does Func support what is needed?

-- KennethLavrsen - 23 Apr 2008

I just had a look through the code in Register.pm. Some observations:

  1. Register.pm is self-contained. The only calls to it are redirects from Manage.pm that could be eliminated.
  2. The code is highly specific to registering a user using the TWikiUserMapping.
So, I'd debate how re-useable this code really is. My inclination at this point is to:
  1. Move Register.pm (and associated scripts, topics and messages) out of the core and into TWikiUserMapping
  2. Remove the redirects from manage (Manage.pm) to register, which should be part of TWikiUserMapping.
As far as I can see there is no requirement to extend the user mapping API to achieve this.

-- CrawfordCurrie - 23 Apr 2008

Given a choice between springing the complexities after the fact, and complaining about what odd undocumented use cases got broken, OR trying to educate and point out all the intricacies, I don't think Kenneth's Sven and Crawford. Instead of scaring Ray away is in any way productive.

Crawford, I used to agree, but I've now re-used that same Register.pm for 2 seperate non-TWikiUsers based mappings - essentially shoe-horning the same higher-level process that Ray enumerated.

Moving the topic based implementation details to the TWikiUserMapping is definatly a (somewhat easy) task that needs doing - and adding over-rideable methods to hook into the process at different points, basically comes out of it as a side effect.

-- SvenDowideit - 23 Apr 2008

I think Ray has probably realised he stepped into a small minefield by now smile

Sven, I'm looking for a middle road that's a bit less scary for Ray.

Ray, you can certainly post your 140 lines and leave it at that. What we are debating here is the topic you opened up above viz. how do we make the core more amenable to your extension, and other future extensions. There's actually no argument in the discussion; we are just zeroing in on the best long-term approach to support future requirements. Unfortunately in the past, many contributors have tacked pieces of code on here and there to address immediate needs, and as an experienced dev I'm sure you appreciate the effect that has had on the overall structure of the code.

So the choice is between:

  1. gluing a bit more onto the code, and leaving it for someone else to do something with it, (a.k.a chucking it over the wall)
  2. taking a few simple steps to improve the internal abstraction before gluing in the new code, to help out other people in the longer term,
  3. really taking the bull by the horns and rearchitecting the registration to:
    • decouple it from being specific to users-in-files
    • support plugins
As you observe, at the end of the day the choice is yours. As you observe, that's the joy of open source.

-- CrawfordCurrie - 23 Apr 2008

I'm now running my TWiki as user www-data2, with my other cgi apps run as www-data. Took a couple hours of work, but will make me sleep better at night. If you want a tip on how to do this see http://www-128.ibm.com/developerworks/library/wa-lampsec/

I think you're only real architectrual improvement is going to be achieved by taking path3: Although you do not have to do it all at once. I would suggest the following path. Starting point would be of course to take the existing code (which of course works.) First step, completely separate the method of invocation (via cgi or REST or whatever) from the top-level registration methods. I think it is already a mistake to assume that registration (or indeed any process) should only run under cgi. This becomes your API to registration, so if people want to build a REST handler or a batch file handler instead of cgi, they can bolt that on the front end, as long as it provides the correct parameters to the relevant registration methods. Second step, allow those methods to be overloaded somehow (may be as simple as using package inheritance and changing the name of the 'use' at the top level library. Then at least it is a conscious decision for someone to "break" the lib. Third step convert internal functions calls into either calls to additional methods that are stored in the correct place in your core code tree, or better still use existing methods if they are already there. An obvious example is addUser. That obviously belongs in the user mapping code. That then decouples your back end from the users-in-files assumption. Fourth step, add plugin extension calls to be invoked at the four appropriate points during the registration flow, to allow capturing the flow to allow additional functionality. As has been observed by others, Register.pm seems to be just a bit too "stand alone" at this time. And since everyone seems to want all interfaces and methods to be open.... then that's the way it should be.

-- RayHunter - 23 Apr 2008

  • First step, completely separate the method of invocation - agreed.
  • allow those methods to be overloaded - agreed. True inheritance is hard for some people to get their heads round, but we have had success with the "module selection" mechanisms used for the mapping and login manager (and several other modules)
  • convert internal functions - I was under the impression that we had had done most of this already, but I'm sure there's more to do.
  • add plugin extension calls - here we diverge slightly. My personal opinion is not to support plugin handlers for registration - I see them as a high security risk - but I'm only one voice, and I think Sven may well agree with you on this point.

-- CrawfordCurrie - 23 Apr 2008

mm, I'm about to hop on a plane - but one thing cought my eye.

I think it is already a mistake to assume that registration (or indeed any process) should only run under cgi

you are totally correct. It is totally incorrect to think that TWiki scripts are only able to be invoked from cgi. They are all invokable form the command line, and due to the UI abstraction, able to be invoked using the API directly.

I think you've not yet read enough of the code?

-- SvenDowideit - 23 Apr 2008

Apologies. I see it now. UI.pm checks for a CGI handler, and if it doesn't find one it stuffs things from the ARGV into a fake CGI query to make the underlying register.cgi function think it has the correct information from CGI, which register.cgi then references from $query = $session->{cgiQuery} and then copies into $data = _getDataFromQuery( $query, $query->param() );

I think I just need to leave you guys alone. Your API's are so completely opposite to my style.

Effectively the API to registration is a CGI interface, and then everything else is spoofed into a fake CGI query that calls a function register_cgi and sets parameters like action='register' to determine what it does.

I would do it precisely the other way on. I'd create my registration package objects and methods first. There'd be a method called 'register' that took an array of items that it needed as input from @_, and then I'd build a separate front end CGI script that called those native package methods with e.g. my $q=new CGI; $r=Register.pm->new(-appID=>12); $r->readVariables($q); $r->register($r->{user},$r->{email}) if $->{action} eq 'register'; readVariables handles the untainting.

Ah well. There's a million and one ways to do stuff in PERL. And your way seems to work too.

-- RayHunter -23 Apr 2008

Your API's are so completely opposite to my style - it's a question of chicken and egg. The registration code was written at a time when separate scripts was the way everything was done, and there were no internal APIs (i.e. it predated UI.pm). It was then moved under the common gateway without any major refactoring. What you describe is one way to do things, but bear in mind you are operating in the context of a lot more code, some of which has been there for many years.

-- CrawfordCurrie - 23 Apr 2008

Edit | Attach | Watch | Print version | History: r40 < r39 < r38 < r37 < r36 | Backlinks | Raw View | Raw edit | More topic actions
Topic revision: r40 - 2008-04-23 - CrawfordCurrie
 
  • Learn about TWiki  
  • Download TWiki
This site is powered by the TWiki collaboration platform Powered by Perl Hosted by OICcam.com Ideas, requests, problems regarding TWiki? Send feedback. Ask community in the support forum.
Copyright © 1999-2017 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.