See also:
Feedback and discussion on the server side
CacheAddOn.
Discussion
Speed vs outdated content
Incredible speed improvement at the price of outdated content.
--
PeterThoeny - 10 Nov 2002
Ad performance:
Note that the benchmark figures are actually optimistic for TWiki,
as the TWiki documentation web has only around 130 topics, i.e. it is small to medium in size.
For a web of 400+ pages, I can't get the index faster than 11 seconds.
A formatted search returning 40 of 400 pages takes no less than 5 seconds.
Also note that an occasional post on twiki.org does not require
the same responsiveness as more dedicated work -- which is what I try to do.
Ad outdated pages:
You must consider different types of pages,
what can be out-of-date, how often, impact versus potential gain.
The first two cases should be tolerable for many environments.
The last case can be easily fixed by changing the templates
to refer to ../fresh/WebChanges.
The case for generated status overview pages is more difficult.
You need them to be accurate.
You want to use them frequently,
as they are an important entry point for navigation.
I propose to work around that problem
by adding a prominent refresh link/button to such pages.
--
PeterKlausner 13 Nov 2002
Detect dirty entries
By coding it in Perl, a strategy could be to do some "push":
keep track of the pages which a topic depends on (by hooks on all the
functions loading files during rendering): preferences, included topics, etc.
Then use this info in reverse: when you save a page, remove
(or just mark "dirty") all cached pages which depend on the saved page, recursively.
Note that a dumb solution which would invalidate
all caches on each save anywhere would
probably work, too, as saves are rarer than views.
--
ColasNahaboo - 13 Nov 2002
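The "push" idea above could be sketched roughly as follows. This is only an illustration under assumptions: the dependency file (one line per pair, "cached page, topic it depends on") and the function name invalidate are hypothetical and not part of the add-on, and the hooks that would record the dependencies during rendering are not shown.

```shell
# Hypothetical sketch: remove the cached copy of a saved topic and of
# every cached page that depends on it, directly or indirectly.
#   $1 = cache root
#   $2 = dependency file with lines "Web/Page Web/Dependency"
#   $3 = saved topic, e.g. Main/WebPreferences
# Names must not contain whitespace.
invalidate() (
    cache=$1 deps=$2 saved=$3
    # awk builds the reverse map (dependency -> dependents) and walks
    # it breadth-first; "seen" prevents endless loops on cycles.
    awk -v start="$saved" '
        { rev[$2] = rev[$2] " " $1 }
        END {
            n = 1; queue[1] = start; seen[start] = 1
            for (i = 1; i <= n; i++) {
                print queue[i]
                m = split(rev[queue[i]], kids, " ")
                for (k = 1; k <= m; k++)
                    if (!(kids[k] in seen)) {
                        seen[kids[k]] = 1
                        queue[++n] = kids[k]
                    }
            }
        }' "$deps" |
    while read -r page
    do
        rm -f "$cache/$page"
    done
)
```

The "dumb" variant mentioned in the same post needs no dependency file at all: it would simply remove every file below the cache root on each save.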
Calculating pages that require cache invalidation (or refresh) is the really hard part, and of course you could end up invalidating a huge set of pages after a single change... I agree the plugin should probably be in Perl, for portability reasons and access to TWiki.cfg for location of data and cache directories - however, it would be interesting to see how the benchmarks change.
--
RichardDonkin - 13 Nov 2002
Ad "push":
I was thinking about reusing the link list built by the TouchGraphAddOn.
This easily shows all potentially affected reverse links.
Clearing their entries would fix outdated EditThis links.
But the most critical issue is not addressed:
pages with inline search results.
That would be rather hard to fix, I guess.
Clearing all the cache on save works in principle.
But it depends on the usage scenarios.
On our TWiki documentation, save/upload accounts for less than 1% of accesses.
Caching could kick in pretty well.
On our "real work" web, this ratio is 10%.
Less than 10 views before flush is probably not worth anything.
--
PeterKlausner 13 Nov 2002
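To put numbers on the two ratios above: if every save flushes the whole cache, the average number of views between two flushes follows directly from the share of saves. A small sketch, assuming that accesses are either views or saves and using a hypothetical helper name:

```shell
# Average number of views between two saves, given the share of saves
# among all accesses in percent (integer arithmetic, $1 must be > 0).
views_per_save() {
    echo $(( (100 - $1) / $1 ))
}
```

With a 1% save share this gives 99 views per flush, with 10% only 9, which matches the "less than 10 views before flush" estimate.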
Note that version 2.0 offers the -maxage=0 option to the fresh (or cache.pl) script.
It will refresh the page if the data directory changed.
This should catch something like 99% of all changes,
the 1% being changes from subsequent saves,
which will be caught when the lock expires
-- or any other activity starts.
--
PeterKlausner - 04 Mar 2003
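The directory check described above might look like this in shell. This is a sketch of the idea only, not the code of release 2.0; it assumes that a save creates, renames or deletes a file in the web's data directory (which updates the directory's modification time), and the function name is made up for illustration.

```shell
# Sketch of the -maxage=0 idea: the cached entry counts as fresh only
# if it is newer than the data directory of its web.
#   $1 = cached entry, $2 = data directory of the web
# Uses the -nt operator (ksh, bash), as the add-on's script does.
fresh_for_web() (
    entry=$1 webdir=$2
    [ -f "$entry" ] && [ "$entry" -nt "$webdir" ]
)
```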
Why not Perl?
I agree the plugin should probably be in Perl, for portability reasons and access to TWiki.cfg for location of data and cache directories - however, it would be interesting to see how the benchmarks change.
--
RichardDonkin - 13 Nov 2002
Using Perl adds the 0.65 secs shown in the benchmark.
A workaround could be to just require TWiki.cfg.
This should give us all variables without much compile overhead.
... yes, looks good:
| Test case | Min[s] | Avg[s] | Max[s] | URL path |
| Start perl +req vars | 0.10 | 0.13 | 0.39 | /twiki/benchmark?-s |
| Hello-world +load twiki | 0.63 | 0.65 | 0.74 | /twiki/benchmark?-h |
--
PeterKlausner 13 Nov 2002
Ok, here it is...
Release 2.0 offers a Perl version, specifically tested on Windows (95, yep...)
--
PeterKlausner - 04 Mar 2003
Why ksh not sh?
- I notice the shell script uses ksh. It would be better to change this to #!/bin/sh. I haven't checked if there's any korn-isms in the script. Some sites don't have ksh (like sourceforge.net). -- JonathanCline - 05 Apr 2003
- Regular Bourne shell does not support the -nt (newer than) test operator -- PeterKlausner 6 Apr 2003
- I just double-checked my project at sourceforge (where I'm running this addon successfully): they don't have ksh, but they do have /bin/bash, which is symlinked as /bin/sh -- that explains why it worked when I used #!/bin/sh. -- JonathanCline - 06 Apr 2003
- Note, that plain old Bourne Shell is not available on Linux.
There you "only" have bash - kind of Korn Shell on steroids. -- PK
Plugin vs Add-On
Considering the nature of this Plugin (reinforced by examining its current implementation), wouldn't it make more sense to consider recategorizing this as an Addon instead of a Plugin?
--
TomKagan - 26 Nov 2002
Good point. Renamed from CachePlugin to CacheAddOn.
PeterKlausner: I recommend to repackage the zip file.
--
PeterThoeny - 27 Nov 2002
The current implementation does not interfere with the core code or rendering,
so it is an add-on.
Still, I named it plug-in because I figured
that we will need per-topic cache control.
Can we do that without parsing something like %CACHE{expire="1hr"}% ?
--
PeterKlausner - 28 Nov 2002
Where is fresh.pl?
BTW: the Perl version doesn't need an extra fresh script,
just pass maxage=-1 to view.pl;
I clarified the install instructions.
--
PeterKlausner - 21 May 2003
On the page CacheAddOn, the Perl version and Usage sections still
refer to "fresh". BTW, great speed gain on my old PentiumPro server.
--
ArnoldAtMallos - 22 Dec 2007
Feature requests
Cache clean up frequency
Possibly set a daily cron job to remove the cache files?
--
PeterThoeny - 10 Nov 2002
Ad periodic flushing:
Currently, I'm using 14 days [sic] for all pages.
I'm thinking of a special variable like %EXPIRES,
which could set a shorter cache retention period.
Then manually add that to pages like WebChanges
or pages with similar, known dynamics.
Without such a variable,
different cronjobs with given lists of topics to flush/refresh would do.
E.g. solve the CPU load problem for WebRss.
A different idea:
flush/pre-cache pages with inline %SEARCH faster than regular ones.
--
PeterKlausner - 13 Nov 2002
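A periodic flush as discussed above could be as simple as the following sketch. The function name and its two arguments are assumptions for illustration; the 14 days are the retention period mentioned in the post.

```shell
# Sketch: delete cached files that were last written more than a given
# number of days ago.
#   $1 = cache root, $2 = retention in days
# Note: find's "-mtime +14" matches files whose age, counted in whole
# days, is greater than 14.
flush_older_than() (
    cache=$1 days=$2
    find "$cache" -type f -mtime +"$days" -exec rm -f {} +
)
```

Called from a daily cron job, e.g. as flush_older_than /var/twiki/cache 14, this would implement the uniform retention. Shorter periods for pages like WebChanges would need either a per-topic setting such as the proposed %EXPIRES variable or separate calls on an explicit list of files.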
Pre-caching
Or better, a cron job to refresh the cache files so that the user does not experience cache refresh on rarely updated topics.
--
PeterThoeny - 10 Nov 2002
Yes, that should be fairly easy.
Currently, it's a cheap-o-cheap 80:20 solution.
--
PeterKlausner 13 Nov 2002
A fairly simplistic approach could be this cronjob:
wget
Port to windows
I got this working on Windows using CygWin, but had to make some changes to use bash (also relevant to TWikiOnLinux) - also, I can't work out how the original script would work since it seemed to always suffix ? to every 'entry' value. Here is a patch to work with bash, tested on bash 2.05b:
*** cache.old Wed Nov 13 08:49:32 2002
--- cache Wed Nov 13 09:05:18 2002
***************
*** 1,4 ****
! #!/bin/ksh
#
# @(#)$Id: CacheAddOnDev.txt,v 1.20 2003/04/28 19:13:00 RichardDonkin Exp nobody $ (c) Peter Klausner
#
--- 1,4 ----
! #!/bin/bash
#
# @(#)$Id: CacheAddOnDev.txt,v 1.20 2003/04/28 19:13:00 RichardDonkin Exp nobody $ (c) Peter Klausner
#
***************
*** 25,33 ****
#exec 2> /tmp/qik.log
#set -x
! entry="$cache$PATH_INFO?$QUERY_STRING"
! if [ "$entry" -nt "$data/$PATH_INFO.txt" \) ]
then
exec cat "$entry"
else
--- 25,37 ----
#exec 2> /tmp/qik.log
#set -x
! entry="$cache$PATH_INFO"
! if [ "$QUERY_STRING" != '' ]
! then
! entry="$cache$PATH_INFO?$QUERY_STRING"
! fi
! if [ "$entry" -nt "$data/$PATH_INFO.txt" ]
then
exec cat "$entry"
else
(BTW the RCS ID keywords above are bogus - the TWiki page's RCS string is shown, rather than the one in the patch - I logged this bug at ExpandsRcsKeywordsInText a while back, can be fixed quite easily using RCS options in TWiki.cfg.)
The speedup was great, but shortly after running the benchmark program I ran into Apache on Cygwin problems that weren't cured by a reboot (probably these are Cygwin socket issues) - not sure if these are related, but should test this script thoroughly before deploying on Cygwin at any rate!
There are some CPAN modules that do similar things, which we should investigate, e.g. CGI::Cache - this would require some small code changes I think, though it's possible an external script could also work. CGI::Cache is compatible with ModPerl and SpeedyCGI.
--
RichardDonkin - 13 Nov 2002
Having problems with this addon, and I'll struggle (or keep asking) till I get it fixed - waiting 10 seconds for a page to load is not going to help my company adopt the TWiki I've set up here. Anyway, I have TWiki installed and running fine (apart from file attachment) on Win2K, with Apache and Cygwin. I am getting these errors:
[Wed Jan 15 14:26:06 2003] [error] [client 10.17.0.247] Premature end of script headers: c:/twiki/bin/cache
[Wed Jan 15 14:26:06 2003] [error] [client 10.17.0.247] c:\twiki\bin\cache: line 40: tee: command not found
Any suggestions would be greatly appreciated. From within Cygwin, tee is a valid command, and seems to work fine. My TWiki is installed in c:\twiki and Cygwin at c:\cygwin if that's any help (all correctly mounted in Cygwin, with writable permissions). After all the fixes mentioned here, the cache script looks like this:
#!c:/cygwin/bin/bash
#
#
# NAME:
# cache - quick'n dirty page caching for TWiki
#
# SYNOPSIS:
# Identical to TWiki's view
#
# DESCRIPTION:
# Rename original view to render
# Link this to 'view'
# See CachePlugin page for more.
#
# SEE ALSO:
# view fresh
#
# customize...
data=/twiki/data
cache=/twiki/cache
# debug...
#exec 2> /tmp/qik.log
#set -x
entry="$cache$PATH_INFO"
if [ "$QUERY_STRING" != '' ]
then
    entry="$cache$PATH_INFO.$QUERY_STRING"
fi
if [[ -f "$entry" && "$entry" -nt "$data/$PATH_INFO.txt" ]]
then
    exec cat "$entry"
elif [ -d "$entry" ]
then
    exec ./render "$@"
else
    exec ./render "$@" | tee "$entry"
fi
--
ClaudeSchneider - 15 Jan 2003
Ad cygwin:
I don't really use it;
always have problems with missing stuff and bad interoperability...
That might well be the problem with your missing tee.
Try to insert the absolute path c:/cygwin/bin/tee (or such),
test it interactively and then try it via web server.
HTH -
PeterKlausner - 16 Jan 2003
Awful performance on Windows
Thanks for those tips - writing the full path to tee (and cat) worked a treat - the page loads now, and does create an HTML file in the cache folder. The benchmark script fails completely (probably too many Unix dependent commands to work with Cygwin), but a quick estimate shows that loading Twiki WelcomeGuest used to take 8 seconds without the cache, and now takes 4 seconds. This still isn't as quick as it should be (I guess, from all the 0.5 seconds benchmarks mentioned here), whereas the cached HTML page requested directly (http://twiki/cache/Twiki/WelcomeGuest.html) loads and Shift-F5s instantly.
--
ClaudeSchneider - 17 Jan 2003
Use cached pages from Go box
P.S. Something else I've just noticed - if you navigate to a page using the Go input box at the top, the resulting page is something like ?topic=Test.TestTopic5, which gets cached as .topic=Test.TestTopic5 in whichever Web I was in. This results in LOTS of duplicates of the same page being cached everywhere. I'll try to figure out a way of removing this redundancy, but my perl/bash is non-existent, so any help would be appreciated.
--
ClaudeSchneider - 17 Jan 2003
The GoBox implementation from GoIsSearch avoids this problem.
--
PeterKlausner - xx Feb 2003
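For setups that keep the standard Go box, the duplicates described above could in principle be avoided by normalizing the cache key before looking up the entry. The following is a rough sketch under assumptions: cache_key is a hypothetical helper, it uses '.' as separator like the script quoted earlier, and it only recognizes the exact form topic=Web.Topic.

```shell
# Sketch: map a Go-box request onto the cache key of its target topic,
# so that it is cached once instead of once per originating web.
#   $1 = PATH_INFO, $2 = QUERY_STRING
cache_key() (
    path=$1 query=$2
    # Default key: path plus separator plus query, as before.
    if [ -n "$query" ]
    then key="$path.$query"
    else key=$path
    fi
    case $query in
        topic=?*.?*)
            target=${query#topic=}
            case $target in
                # Further parameters or more than one dot: keep default.
                *[!A-Za-z0-9_.]*|*.*.*) ;;
                *) key="/${target%%.*}/${target#*.}" ;;
            esac ;;
    esac
    echo "$key"
)
```

For example, cache_key /Main/WebHome topic=Test.TestTopic5 prints /Test/TestTopic5. The freshness test would then have to compare against the data file of the target topic rather than the one named by PATH_INFO.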
Avoid misleading edit links
When you follow the edit link from a cached page
and create the missing topic,
the cached page won't reflect this.
When you follow this link a second time,
you will be dumped into the edit window.
Annoying!
Patch the edit script to invalidate the parent page like so:
diff -c -r1.1 edit
*** edit 2002/06/19 13:53:37 1.1
--- edit 2003/02/13 08:06:30
***************
*** 150,155 ****
--- 150,160 ----
$meta->put( "TOPICPARENT", ( "name" => $theParent ) );
}
$tmpl =~ s/%TOPICPARENT%/$theParent/;
+
+ # touch parent file to update links in cache; breaks in different web!
+ my $now = time();
+ my $parent = "$TWiki::dataDir/$webName/$theParent.txt";
+ $parent =~ /^([A-Za-z0-9_-]+)/ and utime $now, $now, $1;
# Processing of formtemplate - comes directly from query parameter formtemplate ,
# or indirectly from webtopictemplate parameter.
--
PeterKlausner - xx Feb 2003
Use with mod_perl, etc.
See
TwikiFreeBsdPerformance for a case where this addon, probably the Perl version, may be useful alongside
ModPerl on a slower machine.
Also, see
RenderOnceReadMostly for some discussion in this area.
--
RichardDonkin - 28 Apr 2003
Detect corrupt entries
A problem not yet solved:
if an error creates a corrupt page,
this page will be served from the cache
until it expires just like a regular page.
Unfortunately, you do not even have a refresh button.
The only way to fix it is to know & enter the refresh URL manually.
Rendering errors should be detected
and never go into the cache.
--
PeterKlausner - 15 May 2003
I got this problem somehow: cached files with null size, probably caused by some interrupted rendering process. I solved this by adding a find -size 0 command to the cache script, which removes those files.
+ `/usr/bin/find $cache$path$sep -size 0 -type f -exec rm {} \\;`;
# handle max age parm...
if ( $maxage == 0 ) { # re-render on _any_ change in web, i.e.
Functional for me but not cross-platform, as it depends on the normal find command. A better way is probably to use the File::Find library in Perl.
--
MartenMartensson - 26 Oct 2003
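In the shell version, empty entries can also be skipped at lookup time without calling find. A minimal sketch, with a made-up function name; it only catches zero-length files, not pages that were cut off partway through.

```shell
# Sketch: an entry is usable only if it is a non-empty regular file
# that is newer than the topic source.
#   $1 = cached entry, $2 = topic source (the .txt file)
usable_entry() (
    entry=$1 topic=$2
    [ -f "$entry" ] && [ -s "$entry" ] && [ "$entry" -nt "$topic" ]
)
```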
Compress cache contents
- I've also just run a test for storing the cached pages using gzip -1 (my personal closet-web-server machine is an old PC with limited disk space -- i386 at 233 MHz and 1 gig HD). The speed improvement is still dramatic and the compressed version is 30% of the size of the rendered one. A great improvement. -- JonathanCline - 05 Apr 2003
- The small patch to cache.sh:
if [ "$entry.gz" -nt "$data/$PATH_INFO.txt" ]
then
    exec gzcat -d "$entry"
else
    exec ./render "$@" | tee "$entry" 2>/dev/null
    gzip -f -1 "$entry" &
fi
Second Level Cache for Topics with Access Restriction
The installation instructions for the CacheAddOn say
- If you are using bin/viewauth link it to bin/render
Only after inspecting the cached page I understood why this could work:
Status: 302 Moved
location: http://host.domain/twiki/bin/viewauth/Restricted/WebHome
That is, the view (render) script responds by sending a redirect to the client.
So the access control is not invalidated by the cache, as I feared it would be.
However, access restricted pages are not really cached; only the redirection is.
To get the caching right we would need a secondary cache script, which implements
access control as the ordinary view script does, but produces the cached page if
it exists instead of rerendering it.
As far as I can judge, this entails incorporating the access control checks from view in cache.pl. I'm not proficient in Perl and the cache.pl script doesn't
work on my Linux box anyway. So before I might try this:
- Has anyone considered this already?
- Would the overhead of the access control make caching still worthwhile?
Since a considerable portion of the view script seems to be needed, it becomes
worth considering building caching into the view script itself.
--
EelcoVisser - 01 Jul 2003
Yes, viewauth is not cached; will clarify the doc.
No, I see no practical way to implement authentication outside of view.
No, putting it into the core/view will not be accepted easily,
as TWikiMission seems to be more of an application server,
which requires real-time display.
2¢ by
PeterKlausner - 02 Jul 2003
Add a SHORTDESCRIPTION to the Add-On Info Section
I added a SHORTDESCRIPTION to the Add-On Info section so that this add-on is represented properly in the AddOnPackage topic and query topics. Please take this into the release.
--
PeterThoeny - 06 Oct 2006
Bugs
Syntax error with \) in bash
I can't work out how the original -nt test worked with the \) included, as I got a syntax error from bash - is this valid ksh syntax? The script could of course work on any Bourne shell derivative by rewriting the -nt test using ls -t - however, this should be a config option as it would probably be somewhat slower.
--
RichardDonkin - 13 Nov 2002
egg on my face
It's not valid syntax, but tolerated by ksh.
(Leftover from an overly complicated if construct to avoid serving deleted pages,
which doesn't buy enough to be worth it.)
--
PeterKlausner 13 Nov 2002
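The ls -t rewrite suggested above might look like the following sketch. It is an illustration only, with a made-up function name. It forks ls and head on every request, which is the slowdown mentioned, and unlike some shells' -nt it returns false when the second file is missing.

```shell
# Sketch of a "newer than" test without the -nt operator.
# True if file $1 is newer than file $2. ls -t lists the newest name
# first; -d keeps ls from listing directory contents.
# Assumes names without newlines; with identical timestamps the
# result depends on how ls breaks the tie.
newer_than() {
    [ -e "$1" ] && [ -e "$2" ] || return 1
    [ "$(ls -td -- "$1" "$2" | head -n 1)" = "$1" ]
}
```

In the cache script the test would then read: if newer_than "$entry" "$data/$PATH_INFO.txt".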
? separator in path name
I got this working on Windows using CygWin, but had to make some changes to use bash (also relevant to TWikiOnLinux) - also, I can't work out how the original script would work since it seemed to always suffix ? to every 'entry' value. Here is a patch to work with bash, tested on bash 2.05b:
--
RichardDonkin - 13 Nov 2002
? separator
is not a bug, but a feature which seems not to work on Windows.
The idea is to cache skinned pages as well.
For symmetry with the URL syntax,
I chose '?' to separate PATH_INFO from the optional QUERY_STRING.
'.' should work as well, is actually more convenient on Unix,
and does not collide with the TWiki word namespace.
I will change this.
--
PeterKlausner 13 Nov 2002
A separator for PATH_INFO and QUERY_STRING part of the cache filename
must fulfill these criteria:
- Not valid in a wiki word
- Not valid within PATH_INFO nor QUERY_STRING
- Valid as filename in Unix and Windows
- Not a directory name in Unix nor Windows
Actually, I don't know of one character meeting this;
probably best to go for
| `?' | on Unix |
| `__' | on Windows; conflicts with 'TopicNamesLikeThis__' |
HTH -
PeterKlausner - 16 Jan 2003
Caching per web
This is an outstanding enhancement to the standard TWiki install. Poor performance was probably our most frequent complaint with the standard TWiki install.
One question: I want to cache some, but not all webs. (We have some webs where a lot of pages are built dynamically from content of other pages, so these should not be cached.) How could I handle this? For now I am doing this by creating directories for only the webs I want to cache under the "cache" directory, but that results in lots of warnings in the apache error log when it tries to do the tee command to write to a directory that does not exist. I don't know ksh so do not know how to extract the cache/web path and test that it exists before attempting a write.
--
MartinWatt - 22 Nov 2002
Sorry for the late response.
Yes. This is a bug. Fix by redirecting stderr to /dev/null like so:
#!/bin/ksh
# customize...
data=/var/twiki/data
cache=/var/twiki/cache
entry="$cache$PATH_INFO.$QUERY_STRING"
if [ "$entry" -nt "$data/$PATH_INFO.txt" ]
then
    exec cat "$entry"
else
    exec ./render "$@" | tee "$entry" 2>/dev/null
fi
Warning: I don't know yet, whether this affects Perl's error handling.
Parsing the directory etc. gets expensive very soon because *sh forks a lot.
Then we rather bite the bullet and go the bare-bones Perl route.
--
PeterKlausner - 28 Nov 2002
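As an alternative to silencing the warning, the directory test asked for above can be done with a parameter expansion, which does not fork an extra process. A sketch, assuming a hypothetical wrapper function that receives the entry name followed by the render command:

```shell
# Sketch: write to the cache only if the directory for this web exists
# below the cache root; otherwise pass the rendered page through.
#   $1 = cache entry, remaining arguments = render command
render_maybe_cache() (
    entry=$1; shift
    # ${entry%/*} strips the last path component, i.e. the topic part.
    if [ -d "${entry%/*}" ]
    then
        "$@" | tee "$entry"
    else
        "$@"
    fi
)
```

Usage in the script would be along the lines of render_maybe_cache "$entry" ./render "$@". Webs without a cache directory are then rendered on every request, which is the behaviour wanted for the dynamic webs.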
Occasionally corrupted page
One thing I see
very occasionally, like once every 1000 page saves, is a half-written page - it just cuts off partway through. Looks like the write operation got interrupted. I have not figured out exactly what circumstances cause this to happen - maybe hitting save again when a save is already in progress?
--
MartinWatt - 09 Jan 2003
Here is an update on this problem. We get this fairly regularly, I'd say once every week or two, which is roughly once per 1000 page saves. It causes considerable alarm for users whose pages suddenly disappear or truncate. I am almost certain that it is caused by impatient users hitting a browser button when the addon is in the middle of saving the cached page.
As for a solution, well the obvious one is to switch the company to decaf and have our users mellow out a little.
Alternatively, is there a way to make a script being executed by the browser non-interruptible so the browser cannot just kill it partway through? Probably not, I suppose. My final suggestion is to have the addon attach an identifier to the very end of the cache file and have the caching addon only return the cached page if the identifier exists (as it then knows the write completed successfully), otherwise delete the cache file and regenerate it.
--
MartinWatt - 28 Apr 2003
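The identifier idea from the last suggestion could be sketched like this. The marker text and both function names are hypothetical; the marker is an HTML comment so that browsers ignore it.

```shell
# Hypothetical end-of-file marker, written only after a complete render.
MARKER='<!-- cache-complete -->'

# Render into the entry, then append the marker on a line of its own.
#   $1 = cache entry, remaining arguments = render command
write_entry() (
    entry=$1; shift
    "$@" > "$entry" && printf '\n%s\n' "$MARKER" >> "$entry"
)

# True if the last line of the entry is the marker. A truncated or
# missing entry is removed so that the next request regenerates it.
complete_entry() {
    if [ "$(tail -n 1 "$1" 2>/dev/null)" = "$MARKER" ]
    then
        return 0
    fi
    rm -f "$1"
    return 1
}
```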
Sorry, but I cannot reproduce this behaviour.
My combinations of apache & fs settings seem to prevent this from happening.
Maybe this patch of cache.sh works for you:
< exec ./render "$@" | tee "$entry" 2>/dev/null
---
> tmp="$entry.tmp$$" # use same filesystem!
> exec ./render "$@" | tee "$tmp" 2>/dev/null &&
> mv "$tmp" "$entry"
This should update the cache only after the rendering completed successfully.
If you see abandoned .tmp files lying around,
check if these have corrupt content as you suspect.
Once we have figured out what is wrong,
insert an exit handler before the exec:
trap "/bin/rm -f $tmp 2>/dev/null" 0
--
PeterKlausner - 04 May 2003
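One caveat with the pipeline in the patch above: the && sees the exit status of tee, not of render, so a failed render can still be moved into place. A variant that checks the render command itself is sketched below, under the assumption that it is acceptable to send the page only after rendering has finished. The function name is made up.

```shell
# Sketch: render into a temporary file in the same directory, publish
# it with mv only if the render command succeeded and produced output,
# then send the page.
#   $1 = cache entry, remaining arguments = render command
render_atomic() (
    entry=$1; shift
    tmp="$entry.tmp$$"      # same filesystem, so mv is a plain rename
    if "$@" > "$tmp" && [ -s "$tmp" ]
    then
        mv "$tmp" "$entry" && cat "$entry"
    else
        rm -f "$tmp"        # never leave a half-written page behind
        false
    fi
)
```

On failure this sketch sends nothing at all; a real script would have to fall back to an error page or to an uncached render.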
Failure on missing topic name, i.e. WebHome
In the bash version of this add-on, I've modified it to properly handle the case of a user typing in only the web name in a URL, omitting the topic name (http://server/twiki/Web1 or http://server/twiki/Web1/), which would otherwise generate internal server errors (at least for our TWiki).
See newer version below...
The bottom if block is now:
if [[ -f "$entry" && "$entry" -nt "$data/$PATH_INFO.txt" ]]
then
    exec cat "$entry"
elif [ -d "$entry" ]
then
    exec ./render "$@"
else
    exec ./render "$@" | tee "$entry"
fi
Note that I added the elif clause to not pipe the Directory name to tee, I believe this is a better option than sending the output of tee to /dev/null, as it's (remotely) possible that some fixable error is getting swallowed.
Note that this is only for bash, but I assume the same issue exists for ksh.
--
MikeMaurer - 14 Jan '03
Just recently, I ran across this as well.
If I understand correctly, the fix works only if you use '.' as separator,
because this makes $entry refer to the same directory '.../Web/.' -- kewl.
If you want to re-use the cached WebHome,
put this in front of the first if:
test -d "$entry" && entry="$entry/WebHome."
HTH -
PeterKlausner - 16 Jan 2003
Directory bug one more time:
Duuuh... I missed the point:
'.' is not a kewl feature, it is the bug.
With the original '?' it worked right away.
(Small annoyance: any directory touch invalidates
the odd .../Web? and .../Web/? cache files
but not .../Web/WebHome? )
HTH -
PeterKlausner - 16 Jan 2003
I've tweaked my hack a little more to make it cache entries that don't include the specific topic (they probably want WebHome). This should go below the debug block in bash versions of this script. This is not tested on ksh and almost certainly won't work on it.
fullpath=$PATH_INFO$QUERY_STRING
fullpath=${fullpath%/}
entry="$cache$fullpath"
if [[ -f "$entry" && "$entry" -nt "$data/$PATH_INFO.txt" ]]
then
    exec cat "$entry"
elif [ -d "$entry" ]
then
    if [[ -f "$entry/WebHome" && "$entry/WebHome" -nt "$data/WebHome" ]]
    then
        exec cat "$entry/WebHome"
    else
        exec ./render "$entry/WebHome" | tee "$entry/WebHome"
    fi
else
    exec ./render "$@" | tee "$entry"
fi
--
MikeMaurer - 2 Feb '03
Refresh doesn't work on Windows
I've got the cache script saved in bin/view, and the original view script saved in bin/render, as well as having bin/fresh and bin/benchmark as they came from the zip file. I have a feeling that the cache script is doing something (it's writing the cached HTML file), but once the topic has been cached, shouldn't it load the HTML instantly? Also, the refresh link I've added (which loads the page using the fresh script), does load in a second, but doesn't force the page to be recached? It doesn't seem to actually remove the cached page and re-execute view...
Any assistance or insight would be greatly appreciated.
--
ClaudeSchneider - 17 Jan 2003
See Perl version of release 2.0
--
PeterKlausner - 04 Mar 2003
Installation/configuration troubles with mod_perl
I'm having a heck of a time getting CacheAddOn to work. The page keeps sending back nothing. I've created the .../cache/myweb dir, I've changed the paths in the cache script to point to the proper Perl binary and location of the render script, as well as the proper directories for cache and data, and nothing gets returned (literally, nothing. I used a Java program to just look at the exact text sent over the socket from the server, and it was receiving nothing) when I try to replace "view" with "cache" in the url. This is for the Perl version of the script. Any idea of things I can look at? Looking at my httpd-error.log, I see only this:
get s:/usr/local/www/twiki/data/Javatips/WebHome.txt c:1052788701 s:1052768185 m:336
In the .../cache/myweb directory, I only have one file created after an attempted view, and it's WebHome__. It's empty.
--
SeanLeBlanc - 14 May 2003
Does it work, after you rename the cache directory tree?
If yes, then the 0-length file was created before the config was ok,
but pollutes the cache. Delete the empty WebHome__.
Noted feature request above: don't cache trash!
If not, then your call to render.pl is not yet correct.
--
PeterKlausner - 15 May 2003
Thanks. I was able to get a different error when I renamed the cache dir. The error is:
Software error:
Can't locate object method "request" via package "Apache" at /usr/libdata/perl/5.00503/CGI.pm line 234.
For help, please send mail to the webmaster (you@yourPLEASENOSPAM.address), giving this error message and the time and date of the error.
I'm set up to use mod_perl. Is there something I need to change to make this work with mod_perl? Or a perl module that is missing?
--
SeanLeBlanc
Ignore this bit, leaving it in as general info only:
You may be using a version of CPAN:CGI (CGI.pm) that doesn't support mod_perl 2.0. See the first hit on Google:Can%27t+locate+%22method+request%22+via+package+%22Apache%22++cgi.pm, which is actually a TWiki.org page. It would be best to get the latest CGI.pm version in any case, but do run the latest testenv from CVSget:bin/testenv to see the actual version. See also IssuesWithPerl5dot8 where CGI.pm versions caused problems on Perl 5.8.0.
End of ignore
The most likely possibility is the 2nd hit on this search, here - if the Perl script needs to run outside mod_perl, but you are running it as a process from underneath a mod_perl Apache server, you'll get this error, as mod_perl sets the %ENV hash to indicate it should be used. This can be fixed by a tweak to the script to never use mod_perl, or perhaps using SelectiveModPerl. However, it is clearly better for this script to work under mod_perl if possible to maximise performance by not forking a Perl interpreter process.
Having now looked at the cache.pl script, it forks the render script without resetting the environment as mentioned in this message from Lincoln Stein, author of CGI.pm - hence the TWiki render (really view) script loads CGI.pm, which tries to use mod_perl since the %ENV (environment) says it can do so. In cache.pl, try changing the following line as highlighted in bold (this assumes you are using CygWin for bash and perl):
$render = "GATEWAY_INTERFACE=CGI/1.1 perl c:/opt/twiki/bin/render.pl"; # you might need full path!
This forces the forked Perl process to think that mod_perl is not available, which is actually the case. It would be more efficient to do
$ENV{GATEWAY_INTERFACE} = 'CGI/1.1'
perhaps, to avoid forking a shell, but one of these should work (not tested.)
Please attach the testenv output to TwikiFreeBsdPerformance, where this originated, as I don't have mod_perl access. This will help in checking your environment and in updating testenv (see ImproveTestenv) to recommend a CGI.pm upgrade to 2.87 or higher when using mod_perl 2.0 (aka 1.99.x for some reason). Even if you are not on mod_perl 2.0, this output would be useful.
I think this problem will also apply to those using the shell version of the script, if running under a mod_perl enabled web server, since the environment is also inherited in the same way.
The most efficient option for mod_perl (future project) would be to somehow run the render script without forking a Perl interpreter, sending its output into a string rather than to STDOUT. This would get the most performance out of mod_perl with this add-on, and is almost certainly possible with a small change to the Perl code in view/render, but I'm not sure how at the moment.
--
RichardDonkin - 17 May 2003
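For the shell version, which inherits the environment in the same way, the reset could be sketched as below. The wrapper name is made up for illustration, and like the Perl suggestions above it is untested against a real mod_perl server.

```shell
# Sketch: run a command with GATEWAY_INTERFACE set to plain CGI for
# that command only; the caller's own environment stays untouched.
run_as_plain_cgi() {
    env GATEWAY_INTERFACE=CGI/1.1 "$@"
}
```

In the cache script the render call would then become run_as_plain_cgi ./render "$@". Using env avoids the question of whether a plain VAR=value prefix survives the call when the command happens to be a shell function.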
As discussed in the rationale section,
this add on was intended as alternative to mod_perl.
The shell version is totally incompatible with it,
as far as I understand it.
To get something working for Windows,
I implemented the same crude thing in Perl;
I never expected the exec logic to work with mod_perl.
To squeeze out more performance for saves and reloads,
I guess you need a complete rewrite.
--
PeterKlausner - 18 May 2003
I think that cache.pl can just be tweaked to work better with ModPerl - it may seem a bit odd to use the two together, but many Linux boxes these days come with mod_perl enabled so it would be good if it doesn't break. SeanLeBlanc is using this add-on as a workaround to the SiteMapIsSlow issue, which isn't helped by mod_perl, so until that issue is fixed there's some rationale for using this with mod_perl.
--
RichardDonkin - 19 May 2003
Putting in the mentioned line (with a semicolon) like so:
$render = "GATEWAY_INTERFACE=CGI/1.1; perl c:/opt/twiki/bin/render.pl"; # you might need full path!
did work for me. However, my ksh didn't seem to work out so well with the "fresh" script, and I'm trying to rewrite that in perl, so if anyone has already done that, please let me know. Also, and this is a weird one, sometimes when I edit topics, I cannot save them, or even preview them. I have to go back and hit my half-working refresh link, and then go at the editing again. The symptoms are that it either a) acts like I'm not logged in, even though I was already permitted in to do editing or b) errors when doing the preview, complaining that it cannot find oops.tmpl. Both messages are clearly bogus. Here's what httpd-error.log has in it for the times it fails.
get s:/usr/local/www/twiki/data/Generalsoftware/MsSqlServer.txt c:1053381064 s:1053381063 m:24
[Mon May 19 16:35:06 2003] [warn] Apache::Registry: T switch ignored, enable with 'PerlTaintCheck On'
[Mon May 19 16:35:10 2003] [warn] Apache::Registry: T switch ignored, enable with 'PerlTaintCheck On'
[Mon May 19 16:35:23 2003] [warn] Apache::Registry: T switch ignored, enable with 'PerlTaintCheck On'
[Mon May 19 16:35:38 2003] [warn] Apache::Registry: T switch ignored, enable with 'PerlTaintCheck On'
put s:/usr/local/www/twiki/data/Generalsoftware/MsSqlServer.txt c:1053381064 s:1053383738 m:24
[Mon May 19 16:35:50 2003] [warn] Apache::Registry: T switch ignored, enable with 'PerlTaintCheck On'
--
SeanLeBlanc 19 May 2003
One thing to do is to use PerlTaintCheck On in httpd.conf. Another option is to use SelectiveModPerl to ensure that the new fresh.pl script, or original ksh fresh script, are run outside mod_perl (it could be that Perl is trying to run the ksh script because it's in the same bin directory that is assigned to mod_perl). You may also need to unset the GATEWAY_INTERFACE variable as above.
--
RichardDonkin - 20 May 2003
I made the classic mistake of combining two issues into one entry. I think I have the fresh script hacked up to do what I want. The more pressing matter in any case is that I can't seem to do more than one edit per "session"...I get an error upon preview saying that I'm not logged in. Will PerlTaintCheck On help with this, and if so, what do I have to do to avoid the insecure path entry error? Also, in what script would I unset the GATEWAY_INTERFACE variable?
--
SeanLeBlanc - 20 May 2003
I'm working on several performance optimizations right now with the main goal to keep response times below one second for most pages.
Regarding cache.pl, I've changed the following:
- make cache.pl run under mod_perl in an optimized way, that is, without forking an external perl interpreter on cache miss or refresh
- additional cleanup to get rid of warnings or errors from tainted paths
- check for cache directories (also to get rid of error messages)
- implement stderr redirection (doesn't work for me at all under mod_perl)
- cache per user (session plugin) to not mix up user-specific settings /access rights and things like user names in skins
I'll clean up the script (lots of old/commented code in there right now) and attach it here. It still needs further polishing and some better security checking, but to give you some figures:
| time | mod_perl | cache | refresh |
| 446ms | yes | no | n/a |
| 1385ms | no | no | n/a |
| 120ms | yes | yes | no |
| 168ms | no | yes | no |
| 721ms | yes | yes | yes |
| 1574ms | no | yes | yes |
The first two lines are from calling render.pl and can be used as a reference.
The performance gain from running cache.pl under mod_perl with cached content is 48ms or 28% (3rd and 4th line), which in absolute figures is too small to notice.
The biggest gain comes when the page is refreshed, here we gain 853ms or 54% from mod_perl. And what is most important, we stay under the one-second-barrier.
Finally, I commented out the info messages which gave another, incredible boost from 120ms to 35ms. TWiki on steroids!!!
I've attached my working version; it needs further modification on render.pl (the old view.pl), so I'll write some instructions on how to install later...
--
MichaelRausch - 04 Jun 2003
I've just seen Michael's impressive script - interesting that there's a benefit to both mod_perl and caching. In fact, various templating environments such as TemplateToolkit prefer mod_perl and implement caching, so there is a precedent for combining the two.
Which 'info messages' were you talking about? Are these where the cache script writes to a log file? If so, it's probably worth leaving these in by default; a high-volume site could take them out if needed.
--
RichardDonkin - 03 Jul 2003
In one of the first comments on this page, PeterKlausner talks about WebChanges, and how you can simply revise all your URLs to be /fresh/WebChanges. That seemed dirty to me, and a lot of work. Instead, I modified the cache.pl script to add another conditional to the checking for cache, namely to ignore the cache for WebChanges, WebIndex, and WebTopicList:
if ( ( $t_cache > $t_change ) # cached copy is newer
and ( $t_cache + $maxage * 3600 > time() ) # and expires in the future
and ( $path !~ /Web(Changes|Index|Topic)/ ) ) # and is not an index lookup.
I'm getting a few Perl warnings though:
[Fri Nov 28 15:06:18 2003] view: Use of uninitialized value in numeric gt (>) at view line 40.
[Fri Nov 28 15:08:07 2003] edit: Use of uninitialized value in substitution (s///) at edit line 273.
Not sure where the second one is coming from (I've yet to reliably reproduce it), but the first can be solved relatively easily. The warning is caused when a cache file does not yet exist for the particular topic. Since that file doesn't exist, the time check will fail:
$entry = "$cache$path$sep$query";
my $t_cache = (stat "$entry")[$mtime];
my $t_change = (stat "$source")[$mtime];
and thus the conditional ends up comparing an undefined value against a timestamp:
if ( ( $t_cache > $t_change ) # cached copy is newer
Simple solution:
my $t_cache = (stat "$entry")[$mtime] || 0;
my $t_change = (stat "$source")[$mtime] || 0;
--
MorbusIff - 28 Nov 2003
Here is a patch for the fresh script. In my own setup, the cache location is not the standard one, which is how I noticed the bug when trying to refresh a topic in the cache.
I decided to write the variable names in capitals and protect them with "{" and "}".
Tested it with pdksh, and it worked.
24,25c24,25
< cache=/var/twiki/cache
< data=/var/twiki/data
---
> TWIKI_CACHE=/var/twiki/cache
> TWIKI_DATA=/var/twiki/data
29c29
< if [ "$cache$PATH_INFO?$QUERY_STRING" -nt `dirname "$data$PATH_INFO"` ]
---
> if [ "${TWIKI_CACHE}${PATH_INFO}?${QUERY_STRING}" -nt `dirname "${TWIKI_DATA}${PATH_INFO}"` ]
34,35c34,35
< /bin/rm -f "/var/twiki/cache$PATH_INFO?"* \
< "/var/twiki/cache$PATH_INFO?$QUERY_STRING" 2>/dev/null
---
> /bin/rm -f "${TWIKI_CACHE}${PATH_INFO}?"* \
> "${TWIKI_CACHE}${PATH_INFO}?${QUERY_STRING}" 2>/dev/null
--
LaurentGautrot - 13 Jan 2005
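For readers following the freshness logic discussed above, here is a minimal shell sketch of the decision "serve from cache or re-render". It is not the actual fresh or cache.pl code: the function name cache_is_fresh and its three arguments are assumptions made for illustration, it compares against the topic file itself rather than the web directory, and the maximum-age check from cache.pl is left out. It combines the -nt test used in the patch with the idea of never serving index-style pages from the cache.

```shell
#!/bin/sh
# Hypothetical helper (not part of CacheAddOn): exit status 0 means
# "the cached copy may be served", non-zero means "re-render the page".
#   $1 = cache entry file, $2 = topic source file, $3 = request path
cache_is_fresh() {
    entry=$1
    src=$2
    req_path=$3
    # Index-style pages are never served from the cache.
    case $req_path in
        *WebChanges*|*WebIndex*|*WebTopic*) return 1 ;;
    esac
    # A cache entry that does not exist yet counts as stale.
    [ -f "$entry" ] || return 1
    # -nt is understood by ksh, bash and dash, but is not required by POSIX.
    [ "$entry" -nt "$src" ]
}
```

A caller would use it as `if cache_is_fresh "$entry" "$src" "$PATH_INFO"; then cat "$entry"; else ...; fi`. Treating a missing entry as stale up front plays the same role as the `|| 0` default in the Perl fix further up.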
Unknown features
Workaround to force that certain pages such as WebChanges are always refreshed
- cd .../Cache/Main
- touch -m -t 200108150000 WebChanges
- chmod ugo-w WebChanges
This forces the Cache/Main/WebChanges page to look older than the data/Main/WebChanges.txt and makes it non-writable, thus ensuring that the page will always be refreshed.
--
WolfgangSlany - 01 Oct 2005
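The three commands above can be wrapped in a small function when several topics need the same treatment. This is only a sketch: pin_stale is a made-up name, and the cache directory is passed in as an argument because its real location depends on the installation. Whether a non-writable file is enough to stop the cache from being rewritten depends on how the cache script writes its entries, which is not shown on this page.

```shell
#!/bin/sh
# Hypothetical helper: make the cached copies of the given topics look
# permanently out of date.
#   $1 = cache directory of one web (installation-specific, an assumption)
#   remaining arguments = topic names, e.g. WebChanges WebIndex
pin_stale() {
    cache_web_dir=$1
    shift
    for topic in "$@"; do
        f=$cache_web_dir/$topic
        # Backdate the modification time so any topic file looks newer.
        touch -m -t 200108150000 "$f" || return 1
        # Remove write permission, as in the workaround above.
        chmod ugo-w "$f" || return 1
    done
}
```

For example, `pin_stale /path/to/Cache/Main WebChanges WebIndex` applies the workaround to both pages in one go.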
Does not work with logged-in users
Just tried CacheAddOn and it is great, but I quickly noticed that all users become me! Whenever a user edited a page, the changes were recorded under my wiki user. And in the top left corner it says Welcome and my username instead of theirs. It is not possible to log out either.
I guess the wiki session got cached in some way - how do I resolve this so I can start using CacheAddOn again?
--
FredrikLarsson - 02 Oct 2007
The whole page is cached, so everybody looking at a cached page will see the side bars of the person that was logged in at the time. I use TWiki with registered users only, so I changed my apache2 configuration to require authentication when the view script is called. On my Debian installation, I added 'view' to the FilesMatch line, so it reads
<FilesMatch "(attach|edit|manage|rename|save|upload|mail|logon|view|.*auth).*">
Now the script cache.pl knows who the user is. I changed it so that it creates and keeps a separate cache per user.
twiki/bin$ diff -au cache.pl view
--- cache.pl 2007-12-20 22:19:52.000000000 +0100
+++ view 2007-12-22 23:20:15.916829117 +0100
@@ -23,6 +23,16 @@
#$render =~ /(.*)/;
#$render = $1;
+# if a user is logged in, use the cache directory for that user
+my $user = $ENV{'REMOTE_USER'};
+if (defined $user) {
+ $cache .= "/$user";
+ # create the directory if it does not exist
+ if (! -e $cache) {
+ mkdir $cache;
+ }
+} # end of setting up user specific $cache
+
my $webhome = "WebHome";
my $maxage = 24 * 14; # default expiration after 14 hours (days?)
This seems to work fine.
--
ArnoldAtMallos - 22 Dec 2007
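The per-user cache idea in the patch above can also be sketched in shell, for anyone adapting the ksh fresh script instead of cache.pl. This is an illustration, not the patch itself: user_cache_dir is a hypothetical function, and the check on the user name is an addition that the patch does not have. It is there because the name ends up in a file system path, so something like "../x" should not be accepted.

```shell
#!/bin/sh
# Hypothetical helper: print the cache directory to use for a request.
#   $1 = base cache directory, $2 = login name (may be empty for guests)
user_cache_dir() {
    base=$1
    login=$2
    # Nobody logged in: use the shared cache.
    if [ -z "$login" ]; then
        printf '%s\n' "$base"
        return 0
    fi
    # Refuse names that could escape the cache directory ("../x", "a/b")
    # or that start with a dot. This check is an addition to the patch above.
    case $login in
        *[!A-Za-z0-9_.-]*|.*) return 1 ;;
    esac
    # Create the per-user directory if it does not exist yet.
    mkdir -p "$base/$login" || return 1
    printf '%s\n' "$base/$login"
}
```

In a CGI script it would be called as `cache=$(user_cache_dir "$cache" "$REMOTE_USER") || exit 1`. Note that a per-user cache multiplies disk usage by the number of active users and lowers the hit rate accordingly.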
You may want to have a go at my different implementation of the same idea, at PublicCacheAddOn (5 years after my comment on this topic... time flies...). The difference:
- it has a C front end for even faster operation, with only two disk accesses - it could even be optimized to one.
- it gets the page as TWikiGuest, so you see "neutral" pages, not the ones of the last user. But it will not cache read-protected pages.
- it detects errors in getting pages, retries the original view script for them, and remembers not to cache them
- it locks, so that even if you issue 100 requests for the same, yet-uncached page, [1] only one process will build the page while the others wait, and [2] there is no possible corruption due to race conditions. This is really important, as it means you will still be able to edit your public site, even under heavy use.
- it has a fully automated install and uninstall
- and, most importantly, it solves the problem of freshness in a way that is, to my knowledge, original: instead of trying to determine the freshness of a cache entry by comparing it with the source, it compares with the ... reader! It works this way: when you edit a page, you are noted as a "changer" by your IP address, and the cache will let all requests from this IP fetch uncached content, while the rest of the world still sees the cache, so both views are consistent. After you are done editing (it waits 15 minutes after your last save), it actually "publishes" your changes by clearing the cache. This concept really solves a lot of problems that were very hard to solve when trying to determine cache freshness by comparing it to the TML source.
The only drawback is that it is Unix-only (uses bash, sed, grep, wget, crontab jobs...). It could probably be made to work on Windows with Cygwin fairly easily, but I am not sure about the performance, although the C front end might help. Rewriting it in Perl may be possible, however.
--
ColasNahaboo - 03 Feb 2008
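To make the "changer" idea more concrete, here is a rough shell sketch of the bookkeeping it implies. It is not taken from PublicCacheAddOn: the directory CHANGERS_DIR, the function names, and the one-file-per-IP layout are all assumptions chosen for illustration. Only the 15-minute quiet period comes from the description above.

```shell
#!/bin/sh
# Hypothetical bookkeeping for "changers" (IP addresses that saved recently).
CHANGERS_DIR=${CHANGERS_DIR:-/tmp/changers}   # assumed location, one file per IP
QUIET_SECS=900                                # 15 minutes after the last save

# Record a save. $1 = IP address, $2 = timestamp in seconds (defaults to now)
mark_changer() {
    mkdir -p "$CHANGERS_DIR" || return 1
    printf '%s\n' "${2:-$(date +%s)}" > "$CHANGERS_DIR/$1"
}

# Succeeds if this IP saved within the quiet period and should therefore
# bypass the cache. $1 = IP address, $2 = current time (defaults to now)
is_changer() {
    [ -f "$CHANGERS_DIR/$1" ] || return 1
    last=$(cat "$CHANGERS_DIR/$1")
    now=${2:-$(date +%s)}
    [ $((now - last)) -lt "$QUIET_SECS" ]
}
```

A front end would call `is_changer "$REMOTE_ADDR"` before looking at the cache, and a periodic job would clear the cache (the "publish" step) once no changer is active any more. The optional timestamp arguments exist only to make the functions easy to try out by hand.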