Question

From looking at the source I see that conversion from HTTP Authentication name to TWiki User Name is done by reading a text file into a hash each time the script is run.

What do people do in the situation where many thousands of users are predicted? This doesn't seem to scale.

-- TWikiGuest - 11 May 2001

Related point. The code in userToWikiListInit is poor. The %userToWikiList hash is re-built each time around the loop instead of just creating a new element.

%userToWikiList = ( %userToWikiList, $lUser, $wUser );

-- TWikiGuest - 12 May 2001
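
To make the point about the assignment above concrete, here is a minimal sketch showing that both forms end up with the same hash. The sample login names and WikiNames are invented, and %rebuilt and %assigned are hypothetical names used only here.

```perl
use strict;
use warnings;

# Invented sample pairs: login name => WikiName.
my @pairs = ( [ 'jsmith', 'JohnSmith' ], [ 'mjones', 'MaryJones' ] );

my ( %rebuilt, %assigned );
foreach my $pair (@pairs) {
    my ( $lUser, $wUser ) = @$pair;

    # Form quoted above: flattens the whole hash into a list and rebuilds it.
    %rebuilt = ( %rebuilt, $lUser, $wUser );

    # Same end result: stores one new element and leaves the rest untouched.
    $assigned{$lUser} = $wUser;
}

print "$_ => $assigned{$_}\n" for sort keys %assigned;
```

The first form copies every existing entry on each pass through the loop, so its cost grows with the size of the hash; the second does not.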

Answer

You won't always use the REMOTE_USER authentication technique, because this variable is not always available - run the cgi-bin/testenv script on your installation. On my setup (Linux + Apache, with IE 5.0) the variable is not set so it will just use HTTP authentication, which can be scaled up in various ways depending on your web server, e.g. using a hashed DB file or LDAP directory for user authentication. Perhaps this code needs to be configurable so it avoids reading in the user list every time if the installation is not using REMOTE_USER.

-- RichardDonkin - 14 May 2001
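
One way the configurable behaviour suggested above might look is sketched below: the user list is loaded only when a login name is actually present. This is an illustration, not TWiki code. The functions loadUserList and userToWikiName, the $loadCount counter, and the choice of TWikiGuest as the fallback name are all assumptions made for the example.

```perl
use strict;
use warnings;

my %userToWikiList;     # login name => WikiName
my $listLoaded = 0;
my $loadCount  = 0;     # only here to show how often the list is loaded

# Hypothetical loader standing in for the real file read.
sub loadUserList {
    $loadCount++;
    %userToWikiList = ( jsmith => 'JohnSmith' );    # invented sample data
    $listLoaded = 1;
}

# Map a login name (e.g. the value of $ENV{REMOTE_USER}) to a WikiName.
# The list is loaded on first real use and skipped when no name is given.
sub userToWikiName {
    my ($loginName) = @_;
    return 'TWikiGuest' unless defined $loginName && length $loginName;
    loadUserList() unless $listLoaded;
    return $userToWikiList{$loginName} || $loginName;
}

for my $login ( undef, 'jsmith', 'jsmith' ) {
    printf "%s => %s\n", defined $login ? $login : '(none)',
        userToWikiName($login);
}
```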

Thanks for the reply, but I don't understand. I thought REMOTE_USER was set by the web server, in this case Apache, to the HTTP Authentication name. So how can it fall back onto HTTP Auth when REMOTE_USER is unavailable?

Apache documentation refers to http://hoohoo.ncsa.uiuc.edu/cgi/env.html for a list of env. variables. It includes REMOTE_USER.

-- TWikiGuest - 15 May 2001

See IntranetDoubleAuthentication for more information on this - basically, with a Windows client and Apache server, the server has no knowledge of your Windows login name. What I use is the TWikiRegistrationPub topic, renamed to TWikiRegistration, which then authenticates people using their WikiName.

-- No sig

But this doesn't apply in my case. I don't care what OS the clients run, and we're not all on a LAN. I want to use HTTP Authentication, but the HTTP username won't be a WikiName, so it needs converting, and that conversion doesn't scale in the current code. Hope that makes it clearer.

-- TWikiGuest - 17 May 2001
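
For the conversion itself, the hashed DB file idea mentioned earlier for authentication could in principle be applied to the login-name mapping too, so that each request fetches a single key instead of parsing the whole list. The following is a rough sketch using Perl's bundled SDBM_File module. It is not how TWiki works; the file location, the lookupWikiName function and the sample data are assumptions for illustration.

```perl
use strict;
use warnings;
use Fcntl;                      # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;
use File::Temp qw(tempdir);

# Hypothetical location; a real installation would use a fixed data directory.
my $dir    = tempdir( CLEANUP => 1 );
my $dbPath = "$dir/usermap";

# Build step: would run once whenever the user list changes.
{
    my %db;
    tie( %db, 'SDBM_File', $dbPath, O_RDWR | O_CREAT, 0666 )
        or die "Cannot create $dbPath: $!";
    $db{jsmith} = 'JohnSmith';  # invented sample data
    $db{mjones} = 'MaryJones';
    untie %db;
}

# Per-request step: open read-only and fetch a single key.
sub lookupWikiName {
    my ($loginName) = @_;
    my %db;
    tie( %db, 'SDBM_File', $dbPath, O_RDONLY, 0666 )
        or die "Cannot open $dbPath: $!";
    my $wikiName = $db{$loginName};
    untie %db;
    return $wikiName;
}

print lookupWikiName('mjones'), "\n";
```

The trade-off is that the DBM file has to be regenerated whenever the user list topic is edited.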

At work we now have almost 500 registered users, and I don't see a big impact from reading the TWikiUsers file each time the script is run to convert from the HTTP Authentication name to the TWiki User Name.

OK, let's do some testing ...

===== user: 0.470 (0.470), system: 0.090 (0.090),  initialize( pthoeny ) start
===== user: 0.500 (0.030), system: 0.090 (0.000),  userToWikiListInit start
===== user: 0.580 (0.080), system: 0.090 (0.000),  userToWikiListInit end

... so the text file processing takes around 80 ms on reasonable server hardware.

This will improve if you run TWiki under mod_perl.

-- PeterThoeny - 18 May 2001
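
Under a persistent interpreter such as mod_perl, one possible way to benefit is to keep the parsed hash in memory and re-read the file only when its modification time changes. The sketch below illustrates that idea in plain Perl. It is not TWiki's implementation; the function userToWikiListCached, the $parseCount counter and the simplified "WikiName - loginname" line format are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my %userToWikiList;     # survives between requests in a persistent interpreter
my $cachedMtime = -1;
my $parseCount  = 0;    # only here to show how often the file is parsed

sub userToWikiListCached {
    my ($file) = @_;
    my $mtime = ( stat $file )[9];
    die "Cannot stat $file: $!" unless defined $mtime;

    # Re-parse only if the file changed since the last request.
    if ( $mtime != $cachedMtime ) {
        open( my $in, '<', $file ) or die "Cannot read $file: $!";
        %userToWikiList = ();
        while ( my $line = <$in> ) {
            # Hypothetical simplified format: "WikiName - loginname"
            $userToWikiList{$2} = $1 if $line =~ /^\s*(\w+)\s+-\s+(\S+)/;
        }
        close $in;
        $cachedMtime = $mtime;
        $parseCount++;
    }
    return \%userToWikiList;
}

# Simulate three requests against an unchanged file.
my ( $out, $file ) = tempfile( UNLINK => 1 );
print $out "JohnSmith - jsmith\n";
close $out;

userToWikiListCached($file) for 1 .. 3;
print "parsed $parseCount time(s)\n";
```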

Peter and others - could you post more details of the scale of large installations, including number of users, number of topics, Kbytes total of topics, and typical number of views and changes per month?

  • [ PeterThoeny ] I will post those numbers, stay tuned.

Are there any plans to improve the scalability of searching as TWiki sites scale up? This seems the biggest area where an improvement may be needed for larger sites.

-- RichardDonkin - 19 May 2001

Thanks for the numbers, PeterThoeny. Here are some of my own showing the lack of scaling. With respect to RichardDonkin, unless initial user validation is improved they won't get to the search page.

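use Benchmark;

# Compare two ways of filling a hash from a flat key/value list.
for $len (qw(1000 2000 5000 10000 15000)) {
    # Alternating keys and values: k1, d1, k2, d2, ...
    @line = map { ("k$_", "d$_") } 1..$len;

    print "number of lines: $len\n";
    timethese(1, {
        # "list": rebuild the whole hash for every pair.
        list => sub {
            for ($i = 0; $i < @line; $i += 2) {
                %h = (%h, $line[$i], $line[$i + 1]);
            }
        },
        # "hash": assign one element per pair.
        hash => sub {
            for ($i = 0; $i < @line; $i += 2) {
                $h{$line[$i]} = $line[$i + 1];
            }
        },
    });
}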

It hasn't run to completion yet :-) I reckon it's at least O(n**2).

                   CPU Seconds
Number of Lines    Hash       List
           1000    0.02      30.83
           2000    0.04     130.09
           5000    0.11     829.09
          10000    0.21    3372.26
          15000    0.27    7691.56

-- TWikiGuest - 20 May 2001
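
As a rough check of the O(n**2) estimate, the growth exponent can be read off the "List" column of the table above: if the time behaves like c * n**k, then k = log(t2/t1) / log(n2/n1) between two runs. The sketch below only does that arithmetic on the posted figures; it does not re-run the benchmark.

```perl
use strict;
use warnings;

# CPU seconds for the "list" variant, copied from the table above.
my @runs = (
    [ 1000,  30.83 ],
    [ 2000,  130.09 ],
    [ 5000,  829.09 ],
    [ 10000, 3372.26 ],
    [ 15000, 7691.56 ],
);

# Estimate the exponent k between each pair of consecutive runs.
my @exponents;
for my $i ( 1 .. $#runs ) {
    my ( $n1, $t1 ) = @{ $runs[ $i - 1 ] };
    my ( $n2, $t2 ) = @{ $runs[$i] };
    my $k = log( $t2 / $t1 ) / log( $n2 / $n1 );
    push @exponents, $k;
    printf "%5d -> %5d lines: exponent %.2f\n", $n1, $n2, $k;
}
```

The exponents come out close to 2 for every step, which is consistent with quadratic growth for the list-rebuilding form.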

Dear Guest: Sorry, it took me a while to realize that you are talking about the 01 Dec 2000 release, which had this performance bottleneck. The latest Beta, 15 Mar 2001, improves on that. From TWiki.pm:

# =========================
sub userToWikiListInit
{
    my $text = &TWiki::Store::readFile( $userListFilename );
    my @list = split( /\n/, $text );
    @list = grep { /^\s*\* [A-Za-z0-9]*\s*\-\s*[^\-]*\-/ } @list;
    %userToWikiList = ();
    my $wUser;
    my $lUser;
    foreach( @list ) {
        if(  ( /^\s*\* ([A-Za-z0-9]*)\s*\-\s*([^\s]*).*/ )
          && ( isWikiName( $1 ) ) && ( $2 ) ) {
            $wUser = $1;
            $lUser = $2;
            $lUser =~ s/$securityFilter//go;
            $userToWikiList{ $lUser } = $wUser;
        }
    }
}

-- PeterThoeny - 21 May 2001
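
To see what the loop above accepts, here is a standalone sketch that runs the same matching expression against a few invented lines. The isWikiName stub and the $securityFilter pattern are stand-ins written for this example and may differ from the real TWiki definitions; the sample lines are assumed to follow a "* WikiName - loginname - date" shape.

```perl
use strict;
use warnings;

# Stand-ins for TWiki internals; both are assumptions for this sketch.
my $securityFilter = qr/[^\w.\-]/;
sub isWikiName { return $_[0] =~ /^[A-Z]+[a-z]+[A-Z]+[A-Za-z0-9]*$/ ? 1 : 0 }

# Invented sample lines.
my @list = (
    "\t* JohnSmith - jsmith - 01 Jan 2001",
    "\t* notawikiname - nobody - 01 Jan 2001",
    "Some unrelated text",
);

my %userToWikiList;
foreach my $line (@list) {
    next unless $line =~ /^\s*\* ([A-Za-z0-9]*)\s*\-\s*([^\s]*).*/;
    my ( $wUser, $lUser ) = ( $1, $2 );    # copy captures before other matches
    next unless isWikiName($wUser) && $lUser;
    $lUser =~ s/$securityFilter//g;
    $userToWikiList{$lUser} = $wUser;      # one element per line, no rebuild
}

print "$_ => $userToWikiList{$_}\n" for sort keys %userToWikiList;
```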

SupportStatus:
AnsweredQuestions
Topic revision: r11 - 2001-05-21 - PeterThoeny
Copyright © 1999-2026 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.