Bug: Apache snips a chunk out of the oopsmore output and prevents topic reparenting
Reported as: reparenting topics fails because the select list contains only a portion of the topics in the web. Upon further inspection it appears that the output of the oops script is being altered by Apache.
Test case
Click on "more topic actions" in a web with many topics. Observe to see if the "select topic parent" input contains all options and if the page renders correctly.
Environment
| TWiki version: | TWikiRelease02Sep2004 |
| TWiki plugins: | DefaultPlugin, TopicVarsPlugin, CalendarPlugin, SpreadSheetPlugin, CommentPlugin, EditTablePlugin, HeadlinesPlugin, InterwikiPlugin, RenderListPlugin, SlideShowPlugin, SmiliesPlugin, TablePlugin |
| Server OS: | Production: Fedora 1r3 2.4.23; Test: Fedora 2r2 2.6.9 |
| Web server: | Production: Apache 2.0.50; Test: Apache 2.0.51 |
| Perl version: | Production: 5.8.3; Test: 5.8.3 |
| Client OS: | Fedora 2r2 2.6.9 |
| Web Browser: | wget 1.9 |
-- BradDixon - 13 Jan 2005
Follow up
Based on the user's comments I started checking into this. The web that exhibits the problem has 1583 topics. Loading the oopsmore template via the oops script results in the contents of the parent select input being incompletely populated and the page rendering incorrectly. The HTML from the test case ends with:
<option value="blah">blah</option></select></li></ul></li></ul></form></div></div></div></body></html>
Data
- Trying the same operation in a web that has far fewer topics (a few hundred) does not reproduce the error.
- Trying the same operation under TWikiRelease01Dec2001 with the same test case does not reproduce the error.
- This is not a browser rendering issue: wget and Firefox fetch the same HTML and in both cases it is flawed.
Next I started trying to triage this as a browser, Apache, or TWiki issue. I switched to wget with identical results. I then created this modification to oops:
--- lib/TWiki/UI/Oops.pm.orig 2005-01-13 15:36:40.624310520 -0500
+++ lib/TWiki/UI/Oops.pm 2005-01-13 15:36:48.524109568 -0500
@@ -68,6 +68,11 @@
$tmplData = &TWiki::Render::getRenderedVersion( $tmplData );
$tmplData =~ s/( ?) *<\/?(nop|noautolink)\/?>\n?/$1/gois; # remove <nop> and <noautolink> tags
+ TWiki::writeDebug("oopsrename writing " . length($tmplData) . " bytes");
+ open TEMPDEBUG, ">/tmp/oops.out";
+ print TEMPDEBUG $tmplData;
+ close TEMPDEBUG;
+
TWiki::writeHeader( $query );
print $tmplData;
}
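With that in place, the dump on disk can be compared byte-for-byte against whatever wget or the browser actually received (a small sketch, using the file names from the wget run below; the script itself is hypothetical):
#!/usr/bin/perl
# Hypothetical helper: report the sizes of the dumped template and the fetched
# copy, and the offset at which the two first differ.
use strict;
use warnings;

my ($dump_file, $fetched_file) = ('/tmp/oops.out', 'oops_apache.out');
open my $dump,    '<', $dump_file    or die "$dump_file: $!";
open my $fetched, '<', $fetched_file or die "$fetched_file: $!";
my $x = do { local $/; <$dump> };       # slurp whole files
my $y = do { local $/; <$fetched> };
print "$dump_file: ",    length($x), " bytes\n";
print "$fetched_file: ", length($y), " bytes\n";
my $i = 0;
$i++ while $i < length($x) && $i < length($y)
        && substr($x, $i, 1) eq substr($y, $i, 1);
print "first difference at byte offset $i\n";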
Now I could isolate Apache from TWiki. Here's what happens:
[bdixon@bd-pc tmp]$ wget -O oops_apache.out 'http://localhost/thebazaar2/bin/oops/Web/TestTopic?template=oopsmore&param1=1.8&param2=1.8' ;wc oops*
--16:05:03-- http://localhost/thebazaar2/bin/oops/Web/TestTopic?template=oopsmore&param1=1.8&param2=1.8
=> `oops_apache.out'
Resolving localhost... 127.0.0.1
Connecting to localhost[127.0.0.1]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
[ <=> ] 40,021 --.--K/s
16:05:04 (102.88 MB/s) - `oops_apache.out' saved [40,021]
149 1535 40021 oops_apache.out
151 3544 105557 oops.out
300 5079 145578 total
As you can see, oops_apache.out is much shorter than oops.out. Diffing the two files, it appears that these lines of the HTML output are deleted:
- <select name="topicparent" size="10">
- <option value="none">none (remove topic parent)</option>
plus the huge long line listing all the option values for the web's topic list.
Everything else on the page is fine.
As a further test I deleted the %TOPICLIST{}% line from oopsmore.tmpl. This "fixes" the issue but of course breaks the function.
I'm scratching my head now, since this appears to be an issue that depends on the length of the output returned by %TOPICLIST{}%, yet the actual flaw is in Apache. Any help would be appreciated.
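One more way to split the blame would be a bare CGI that emits a body about as long as the oopsmore page, with no TWiki code involved at all (a sketch only; the script and its size are made up):
#!/usr/bin/perl
# Hypothetical isolation test, not part of TWiki: print a long run of <option>
# lines so Apache has to deliver a CGI response of comparable size. Put it next
# to the other scripts in bin/, fetch it with wget a few times, and see whether
# the byte count ever varies.
use strict;
use warnings;

print "Content-Type: text/html\r\n\r\n";
print qq{<option value="Topic$_">Topic$_</option>\n} for 1 .. 5000;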
-- BradDixon - 13 Jan 2005
A couple of days ago a long-standing though seldom-noticed bug with the content_length header was fixed in the DEVELOP branch. What you're seeing might be a facet of this now-fixed bug.
http://koala.ilog.fr/twikiirc/bin/irclogger_log_search/twiki?search=content%3Blength&action=search&error=0
-- MattWilkie - 14 Jan 2005
What Matt is alluding to is the fact that sending a content-length header of 0 is not the same as not sending a content-length header at all. Try modifying the code (TWiki.pm; function writeHeaderFull) to never send a content-length, but always terminate text/html content with a \n, and let us know if that helps. Oh, and you might try adding the line
$| = 1;
in the same function to switch off output buffering as well.
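Roughly what that change amounts to, as a sketch (this is not the real writeHeaderFull, which assembles its headers differently):
# Sketch of the suggestion: autoflush on, no Content-Length header at all,
# and text/html output always terminated with a newline.
use strict;
use warnings;
use CGI;

my $query = CGI->new;
$| = 1;                                   # switch off Perl's output buffering
print $query->header( -type => 'text/html; charset=ISO-8859-1' );  # no -content_length
my $page = "<html><body>page body here</body></html>";
print $page;
print "\n";                               # always end text/html with a newline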
-- CrawfordCurrie - 14 Jan 2005
Hmmm... based on CrawfordCurrie's advice I hacked this up:
Index: thebazaar2/bin/oops
===================================================================
--- thebazaar2.orig/bin/oops
+++ thebazaar2/bin/oops
@@ -45,3 +45,4 @@
TWiki::UI::Oops::oops_cgi( $web, $topic, $user, $query );
+print "\n";
Index: thebazaar2/lib/TWiki.pm
===================================================================
--- thebazaar2.orig/lib/TWiki.pm
+++ thebazaar2/lib/TWiki.pm
@@ -823,6 +823,8 @@
# of other types of page, with expiry time driven by page type.
my( $pluginHeaders, $coreHeaders );
+ # disable output buffering
+ #$| = 1;
$contentType .= "; charset=$siteCharset";
@@ -844,7 +846,6 @@
# is cached until required expiry time.
$coreHeaders = $query->header(
-content_type => $contentType,
- -content_length => $contentLength,
-last_modified => $lastModifiedString,
-expires => "+${expireHours}h",
-cache_control => "max-age=$expireSeconds",
@@ -893,7 +894,7 @@
}
$finalHeaders .= "\r\n" if ( $finalHeaders);
- ##writeDebug( "===== Final Headers are:\n$finalHeaders" );
+ writeDebug( "===== Final Headers are:\n$finalHeaders" );
print $finalHeaders;
}
I hope that was what you meant. It didn't really seem to have an impact upon the symptoms. What is really vexing is that the results seem so random. Running the same URL 20 times produces 20 different results:
# loopit "wget -O oops_apache_LOOP.out http://localhost/thebazaar2/bin/oops/Web/TestTopic?template=oopsmore¶m1=1.8¶m2=1.8" 1
<snip out 20 wget sessions>
# wc oops*
94 2543 77777 oops_apache_10.out
141 1372 35719 oops_apache_11.out
141 1376 35719 oops_apache_12.out
141 1359 35719 oops_apache_13.out
94 449 8145 oops_apache_14.out
141 1517 39815 oops_apache_15.out
94 881 24529 oops_apache_16.out
141 1376 35719 oops_apache_17.out
141 1369 35719 oops_apache_18.out
141 1508 39815 oops_apache_19.out
141 1370 35719 oops_apache_1.out
141 1196 31623 oops_apache_20.out
94 1062 28625 oops_apache_2.out
94 753 20433 oops_apache_3.out
141 1204 31623 oops_apache_4.out
141 1296 31623 oops_apache_5.out
141 1372 35719 oops_apache_6.out
141 1372 35719 oops_apache_7.out
94 1051 28625 oops_apache_8.out
141 1505 39815 oops_apache_9.out
2538 25931 688200 total
The headers look like this (from debug.log):
14 Jan 2005 - 08:49 ===== Final Headers are:
Content-Type: text/html; charset=ISO-8859-1
TWiki seems to be producing consistent data but it gets bungled between TWiki and Apache.
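For what it's worth, the repeat-fetch step can also be scripted (a sketch, assuming LWP::Simple; same placeholder URL as above):
#!/usr/bin/perl
# Hypothetical helper: fetch the same oops URL repeatedly and tally the distinct
# response sizes, to show how nondeterministic the truncation is.
use strict;
use warnings;
use LWP::Simple qw(get);

my $url = 'http://localhost/thebazaar2/bin/oops/Web/TestTopic?template=oopsmore&param1=1.8&param2=1.8';
my %sizes;
for (1 .. 20) {
    my $html = get($url);
    $sizes{ defined $html ? length($html) : 0 }++;
}
print "$sizes{$_} response(s) of $_ bytes\n" for sort { $a <=> $b } keys %sizes;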
-- BradDixon - 14 Jan 2005
Yes, that's what I meant. Shame. The only other thing I can suggest is to try the reverse, and force buffering on, i.e. $| = 0;
Sounds to me very much like a problem with buffering.
-- CrawfordCurrie - 14 Jan 2005
Perhaps this is an Apache 2.x specific issue... Maybe you could try the latest Apache version (2.0.52), or 1.3.33? If you do the former, you could report this as an Apache bug if you get a simple test case.
-- RichardDonkin - 14 Jan 2005
I think I fired off at Apache too soon. Take a look at this:
# rm /tmp/oops.out; ./oops '/Engineering/TestBug9774?template=oopsmore&param1=1.1&param2=1.1'| tee stdout.out | wc; wc /tmp/oops.out
96 453 8192
140 3517 105225 /tmp/oops.out
So Apache clearly is vindicated since it isn't even involved in this test case. Guess I'll have to rename this topic.
I tried $| = 0 to no avail. Right now I'm going to try backing out local changes to see if I broke something.
It looks like this local patch I applied breaks everything:
Index: thebazaar2/bin/oops
===================================================================
--- thebazaar2.orig/bin/oops
+++ thebazaar2/bin/oops
@@ -31,6 +31,9 @@
use TWiki;
use TWiki::UI::Oops;
+# From http://twiki.org/cgi-bin/view/Codev/HangsSavingLargePages
+use IO::Handle; STDOUT->blocking(0);
+
my $query = new CGI;
my $thePathInfo = $query->path_info();
my $theRemoteUser = $query->remote_user();
The IT staff patched Apache so I don't think I need this anymore. Let's back that off!
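For the record, the mechanism fits the symptoms: with STDOUT switched to non-blocking, a write that cannot complete immediately (for instance when the pipe to Apache is full) fails with EAGAIN instead of waiting, and whatever was not written is simply lost, so long pages come back truncated at varying points. A standalone sketch of the effect, outside TWiki (chunk size and sleep are arbitrary):
#!/usr/bin/perl
# Demo of the failure mode: with a non-blocking STDOUT and a slow reader, the
# pipe fills up and further writes fail with EAGAIN, silently losing data.
# Try:  ./nonblock_demo.pl | ( sleep 2; wc -c )
# with and without the blocking(0) line, and compare the byte counts.
use strict;
use warnings;
use IO::Handle;

STDOUT->blocking(0);    # the same call as the HangsSavingLargePages patch

my $chunk = 'x' x 8192;
my $sent  = 0;
for my $i (1 .. 100) {
    my $n = syswrite(STDOUT, $chunk);    # unbuffered write straight to the pipe
    if (defined $n) {
        $sent += $n;
    } else {
        warn "chunk $i: write failed: $!\n";    # typically EAGAIN
    }
}
warn "wrote $sent of ", 100 * length($chunk), " bytes\n";
With the blocking(0) line removed the writes simply wait for the reader and the full 819,200 bytes arrive.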
-- BradDixon - 14 Jan 2005
That was the problem. It is now fixed. I guess I'll close this as a (l)user bug.
Fix record
Brad - Thanks for the details. Could you confirm which Apache version fixed this, presumably 2.0.52? Or was it an Apache patch, in which case which version was the base and what was the patch?
Assuming you got the above patch from TWikiOnApache2dot0Hangs (via HangsSavingLargePages), the fix recommended there is to upgrade to at least 2.0.50, which fixes the hang issue. Interesting if the initially recommended patch there causes a problem though.
I'm keeping this as BugAssigned for the moment.
-- RichardDonkin - 17 Jan 2005
I'm having this problem, and I'm running Apache 2.0.52, Perl 5.8.5, Cairo 02Sep2004. I'm looking into whether the fix posted above (setting STDOUT non-blocking) is working, but it doesn't happen predictably. Everything else about the problem (from it being the oops script, to strace showing blocking on STDOUT) matches.
-- AndyBakun - 16 Feb 2005
I have a web with 811 topics (and growing fast).
I get the "oopsmore" problem when, like BradDixon, I have the line
use IO::Handle; STDOUT->blocking(0);
as a fix for large pages that take a very long time to save.
Using Apache 2.0.40-21.16 legacy, on Red Hat 9. (There doesn't seem to be a newer version, unless I go to Apache directly. The Red Hat version is still being patched by the Fedora Legacy project.)
Addendum: I am using the 02Sep2004 version.
-- RobertEchlin - 15 Jun 2005