InterfaceThread - FeaturesThread
It seems, given the recent announcement on Freshmeat.net, that we got slashdotted. More details can be found at http://www.geocrawler.com/lists/3/SourceForge/3504/0/3682790/.
I'd assume that under a load of 100+ users, the process pipes in the code (backtick system calls and the like) would fill up the process table pretty quickly.
The question arises: what improvement will we get from:
- using mod_perl
- reducing the amount of system calls
Some web-profiling is certainly in order.
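One cheap way to start such profiling (a sketch only; the view script name and path are assumptions about the local installation) is to wrap a single CGI run in Perl's built-in time() and times(), which separates wall-clock time from the CPU time burned in child processes:

    #!/usr/bin/perl
    # rough profiling sketch: wall-clock vs. child CPU time for one view
    my $t0   = time();
    my @cpu0 = times();
    system( "./view > /dev/null" );    # one plain CGI view request
    my @cpu1 = times();
    printf "wall: %d s   child cpu: %.2f s\n",
        time() - $t0,
        ( $cpu1[2] - $cpu0[2] ) + ( $cpu1[3] - $cpu0[3] );

A high child-CPU share relative to wall-clock time would point at the forked RCS processes rather than at the Perl code itself.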
--
NicholasLee - 03 May 2000
As it turns out, we did not really get slashdotted. SourceForge just changed the file permissions, by coincidence, on the same day as the Freshmeat announcement. Other SourceForge projects were affected as well. The last two days we had lots of traffic again because of the Linux press release. No problem this time.
Nevertheless, performance is an issue. I would add one more item to Nicholas' list:
Regarding mod_perl, I heard of one person who installed TWiki successfully under mod_perl. Let me know if anybody else has done that. We should document that in TWikiDocumentation as an option.
Regarding system calls, those are probably the major culprits. Does anybody have experience with how to test this? There are several RCS calls per topic view. This could be optimized. Also, as a hack, we could read the RCS ,v files directly to avoid system calls for certain RCS commands where it is safe to do so. This could be optionally enabled with a flag ($doRcsDirectRead); the flag should be off by default to be on the safe side (i.e. do the normal RCS system calls).
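A minimal sketch of how that flag could be wired in (the subroutine and helper names here are hypothetical, not actual TWiki code):

    # off by default, to be on the safe side
    $doRcsDirectRead = 0;

    sub getRevisionInfo
    {
        my( $topic ) = @_;
        if( $doRcsDirectRead ) {
            # parse the ,v file with regexps, no shell involved
            return readRcsFileDirect( "$topic.txt,v" );
        }
        # normal RCS system call through the shell
        my $rlog = `rlog $topic.txt`;
        return parseRlogOutput( $rlog );
    }

The point being that the safe code path stays exactly what it is today; the direct read is purely an opt-in shortcut.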
--
PeterThoeny - 05 May 2000
Reading an RCS file is probably about as hard as writing one. I'm considering the requirements for going to a native (Perl) diff/RCS implementation at the moment. There is the Algorithm-Diff CPAN module (http://search.cpan.org/search?dist=Algorithm-Diff and http://www.plover.com/~mjd/perl/diff/), and it might be possible to build on top of that. It doesn't seem that hard; the extra work is creating a Perl version of patch.
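For what it's worth, a minimal sketch of using Algorithm::Diff as the building block (the data here is made up, and a Perl patch/merge would still have to be written on top of it):

    use Algorithm::Diff qw( diff );

    # two revisions of a topic, as arrays of lines
    my @old = ( "line one", "line two", "line three" );
    my @new = ( "line one", "line 2",   "line three", "line four" );

    # each hunk is a list of [ '+' or '-', position, text ] changes
    my @hunks = diff( \@old, \@new );
    for my $hunk ( @hunks ) {
        for my $change ( @$hunk ) {
            my( $sign, $pos, $text ) = @$change;
            print "$sign $pos $text\n";
        }
    }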
Once we did that there would be almost no need for system calls, and we could probably build much better file locking in. I could have something usable within a month; it depends a lot on my free time.
At the moment I'm trying to find the spec of the RCS file format.
--
NicholasLee - 05 May 2000
NicholasLee wrote on twiki-dev@lists.sourceforge.net on 24 May 2000:
Here is a quick patch to remove at least two system calls per view. Basically it gets the author, date and revision fields directly from the RCS files. I'm not sure how RCS files differ from Unix to Unix.
The patch is pretty straightforward. It's just a matter of getting the right regexp.
The trickier part, grabbing the HEAD revision, requires more work. I probably won't worry about it too much.
Since I'm not 100% sure about the portability yet, I won't put it up on TWiki just yet.
I'd be interested to hear whether people with loaded systems see any improvement with this patch.
Nicholas
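A sketch of the regexp approach the patch describes (an illustration, not the actual patch; the same portability caveat about ,v layout applies):

    # pull the head revision, then its date and author, straight out
    # of the ,v file - no rlog system call needed
    sub readRcsHeader
    {
        my( $rcsFile ) = @_;
        open( RCS, $rcsFile ) or return ();
        local $/ = undef;                   # slurp the whole file
        my $text = <RCS>;
        close( RCS );
        my( $head ) = $text =~ /^head\s+([0-9.]+)\s*;/;
        return () unless $head;
        my( $date, $author ) = $text =~
            /\n\Q$head\E\s*\ndate\s+([0-9.]+);\s+author\s+([^;\s]+);/;
        return( $head, $date, $author );
    }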
I did some timing. The alpha test installation now has a wikix.pm, which is wiki.pm with Nicholas' patch. I also created a viewx script that basically runs the view code 1000 times without printing the output to the browser. Only the last run is printed to the browser. The run time (in seconds) is shown in the browser title.
- Run the original wiki.pm that uses RCS system calls:
- Run the patched wikix.pm that reads the RCS files directly:
You can see that the original approach takes about 26 ms on average for rendering one view, vs. 18 ms for the patched one. Looking at the performance, it makes sense to read the RCS files directly. I would prefer to hold off on the implementation until we modularize TWiki, so that this part can be done in one maintainable spot. As previously stated, reading the RCS files should be optional, enabled by a $doRcsDirectRead flag.
--
PeterThoeny - 26 May 2000
I'm most of the way there on that (unless Nicholas beats me to it!)
Just out of curiosity, were you able to get a process count and/or load graph for CPU & memory? The speed isn't what kills my system (my system is so darned slow anyway); it's the resource hogging.
--
KevinKinnell - 26 May 2000
No, just timing.
--
PeterThoeny - 26 May 2000
Actually, I could probably get the last RCS direct call sorted out in an hour or two if I had the time. Cleaning it up and making it switchable would take a little longer.
However, I'm not sure that replacing the last RCS call with a direct read is going to gain much. Basically the patch (wiki.diff below) removes two system calls, i.e. two shell start-ups, etc. I'd say that accounts for a large part of the 8 s of extra time (over the 1000 runs). Doing the last read directly will probably gain another 10% in speed.
We might gain more time running it under mod_perl and Apache::Registry instead. (I've been reading up.) That will give us, I bet, at least another 8 secs.
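For reference, the usual way to run unmodified CGI scripts under mod_perl 1.x is Apache::Registry; roughly this in httpd.conf (the paths are placeholders for the local installation):

    Alias /twiki/bin/ /home/httpd/twiki/bin/
    <Location /twiki/bin>
        SetHandler perl-script
        PerlHandler Apache::Registry
        PerlSendHeader On
        Options +ExecCGI
    </Location>

Apache::Registry compiles each script once and keeps the Perl interpreter resident, so the per-request fork/compile cost disappears; the RCS child processes would of course still be forked.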
Peter: Did you run it with the Perl Benchmark module?
Answer: No, I just used the Perl time() function.
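For next time, the standard Benchmark module that ships with Perl gives user/system CPU breakdowns with one line (renderView() here just stands in for whatever the view code path is actually called):

    use Benchmark;

    # run the code 1000 times and report CPU times, not just wall clock
    timethis( 1000, sub { renderView( "WebHome" ) } );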
--
NicholasLee - 26 May 2000
Hmmm, what about grabbing the file read/write routines from the RCS source and linking them into a Perl module and shared lib?
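That would mean an XS wrapper around a shared library built from the RCS sources. A bare-bones sketch of the glue (rcs_head() is a purely hypothetical C function one would first have to carve out of the RCS code, and the usual h2xs/MakeMaker scaffolding is omitted):

    /* RcsLib.xs - hypothetical XS glue */
    #include "EXTERN.h"
    #include "perl.h"
    #include "XSUB.h"

    extern char *rcs_head( char *path );   /* from the RCS sources */

    MODULE = RcsLib    PACKAGE = RcsLib

    char *
    head_revision( path )
            char *path
        CODE:
            RETVAL = rcs_head( path );
        OUTPUT:
            RETVAL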
--
DavidGould - 28 May 2000
I've thought about that for both the rcs and cvs code. Unfortunately I don't have the time to sit down and figure out the best way to extract the required feature set and present a Perl interface to it. It also depends on the modularity of the rcs/cvs code. Of course, if someone was willing to do it, I'd take their work and use it.
--
NicholasLee - 28 May 2000
http://perl.pattern.net/bench/ has some interesting figures on the difference between mod_perl and Perl CGI.