Scripts hang in TWiki::UI::run at drain STDIN logic
Platform
- TWiki 4.0.4 on Apache 2.0.55 on HP-UX 11i V2 (11.23) Itanium 2
Symptom
- /twiki/bin/configure works fine
- All attempts at /twiki/bin/register hang (as do most other functions).
- This problem did not occur for me on TWiki 4.0.4 on Apache 1.3.27 on RedHat Linux on IA-32.
Possible Bug Location
- register calls TWiki::UI::run, and that routine hangs here:
sub run {
snip...
# drain STDIN. This may be necessary if the script is called
# due to a redirect and the original query was a POST. In this
# case the web server is waiting to write the POST data to
# this script's STDIN, but CGI.pm won't drain STDIN as it is
# seeing a GET because of the redirect, not a POST. This script
# tries to write to STDOUT, which goes back to the web server,
# but the server isn't paying attention to that (as its waiting for
# the script to _read_, not _write_), and everything blocks.
# Some versions of apache seem to be more susceptible than others to
# this.
my $content_length =
defined($ENV{'CONTENT_LENGTH'}) ? $ENV{'CONTENT_LENGTH'} : 0;
read(STDIN, my $buf, $content_length, 0 ) if $content_length;
snip...
Workaround
- I comment out the "drain stdin" logic and everything works fine.
Offer to Help
I'm willing to help debug and test any necessary changes if someone who knows this piece of code is willing to help; otherwise, the workaround seems to work fine.
--
RickGilligan - 10 Aug 2006
I have seen this sporadically on my system (TWiki SVN, Apache 2.0.55, Debian/Testing) too. It started to occur after the latest security update to Apache, not before.
Thanks to your investigation, Rick, I commented out the "drain logic" as well.
I am still testing, but since then I have never come across such a hang.
I am raising this bug to urgent, as it needs to be investigated in more depth.
The comments above the "drain logic" code refer to GETs and POSTs in
the middle of a redirect. This can't be solved this way anyway, I think,
as this is more of a fundamental problem (...prove me wrong and I am happy).
So I'd opt to remove the "drain logic" as it seems to hurt more
than it helps.
MD
This was not introduced in connection with a security update. It was introduced in TWiki-4.0.3 as a bugfix.
The change was proposed by Diab Jerius on TWiki:Codev.TWikiOnApache2dot0Hangs, and after people had been kissing his feet and I had found no problems with it on my installations, I decided to integrate it.
Obviously this fix is not complete or at least has to be made conditional based on what system it runs on.
See Item2327. Instead of just reverting the fix, we should make an improved fix that properly works around this Apache problem.
See also TWiki:Support.NewHPitaniumInstallHangsInRegis for more background.
This is a Hotfix candidate which means when fixed it will be in my next hotfix.
--
KJL
A last resort is to make the workaround code something you enable or disable in configure with an EXPERT setting. Unless someone else comes up with a better solution, this is what I will try to implement.
It is pretty important that TWiki can run and work in any environment, and quite a few people have expressed happiness about the workaround that Diab wrote. So I will not recommend just reverting it.
--
KJL
Reverting it would certainly make me unhappy; I think making it a configure option is a good compromise. Perhaps the patch (for which kudos really belong to DougClaar) isn't the best way to fix things, but without debugging apache (which I was tempted to try before sanity regained control) I can only poke at things from the outside, and the behavior seemed to fit my model. Having since learned that I am far more ignorant than knowledgeable about the HTTP protocol, I certainly won't be betting the house on my explanation.
--
DiabJerius
A thought: could this be directly related to whether the Apache 2.0 version is earlier than 2.0.50, or 2.0.50 and later? From what I've read, it appears that not having this fix will always be a problem with earlier versions of Apache 2.0, and never a problem with versions starting with 2.0.50. In those cases (if any) where the original fix does not cause a problem with later versions of Apache 2.0, perhaps O_NDELAY or O_NONBLOCK is set on STDIN.
--
JamesParker - 15 Aug 2006
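One way to test the O_NONBLOCK part of that guess on an affected system is to ask the kernel for the status flags of STDIN from inside a CGI script. This is only a minimal diagnostic sketch: the helper name is_nonblocking is made up for illustration, it is not part of TWiki, and it checks O_NONBLOCK only (O_NDELAY is not available on every platform).

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL O_NONBLOCK);

# Hypothetical helper (not part of TWiki): returns 1 if O_NONBLOCK is set
# on the given filehandle, 0 if it is not, and undef if the flags cannot
# be read (for example because the handle is closed).
sub is_nonblocking {
    my ($fh) = @_;
    my $flags = fcntl($fh, F_GETFL, 0);
    return unless defined $flags;
    return ($flags & O_NONBLOCK) ? 1 : 0;
}

# Report the state of STDIN on STDERR, which normally ends up in the
# web server's error log when run as a CGI script.
my $state = is_nonblocking(\*STDIN);
print STDERR "STDIN non-blocking: ",
    (defined $state ? $state : 'unknown'), "\n";
```

Comparing the logged value on a system that hangs with one that does not would show whether the flag is relevant at all.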
Could very well be. But it could also depend on exactly which CGI module Apache has loaded, e.g. mod_cgi versus mod_cgid.
To make the code conditional, we need to be able to get that information from the environment variables Apache gives us.
--
KJL
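As a rough illustration of what a version check from the environment could look like: the CGI variable SERVER_SOFTWARE usually carries a string such as "Apache/2.0.55 (Unix)", which can be parsed. The helper names below are hypothetical, the 2.0.50 threshold is only the unconfirmed guess from this thread, and the variable does not say whether mod_cgi or mod_cgid is in use. With "ServerTokens Prod" Apache reports just "Apache", so the version may not be available at all.

```perl
use strict;
use warnings;

# Hypothetical helper: extract [major, minor, patch] from a
# SERVER_SOFTWARE string, or return undef if no version is present.
sub apache_version {
    my ($server_software) = @_;
    return unless defined $server_software;
    return [ $1, $2, $3 ]
        if $server_software =~ m{Apache/(\d+)\.(\d+)\.(\d+)};
    return;
}

# Hypothetical helper: true only for Apache 2.0.x with x < 50, the range
# suspected (not confirmed) in this thread to need the drain.
sub predates_2_0_50 {
    my ($ver) = @_;
    return 0 unless $ver;
    return ( $ver->[0] == 2 && $ver->[1] == 0 && $ver->[2] < 50 ) ? 1 : 0;
}

my $ver        = apache_version( $ENV{SERVER_SOFTWARE} );
my $want_drain = predates_2_0_50($ver);
```

When the version cannot be determined, the sketch falls back to not draining, which matches the behaviour reported as safe on the HP-UX installation.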
It's hardly a trial to make this a configure option. The hard part is documenting it clearly, concisely, and in the right places so that people can find it (and recognise when the problem exists).
CC
If this is now reduced to a documentation issue, it is hardly a ReleaseBlocker?
--
SP
It is more than a documentation issue.
Right now TWiki contains code that makes it work well with Linux. But with some combinations of Apache version, Unix flavour and Apache modules, the fix we have causes more trouble than it solves. And if we remove the fix, many more people will have trouble.
The current suggestion is to HACK THE CODE when you have the problem.
Not what I would call a solution. The code should be able to find out how to act based on the environment. The people that currently have the problem have completely different hardware, OS and Apache modules than most of us run with which is why this one is hard to close. We simply cannot reproduce it. So it is waiting for someone to either be very bright, or someone that has the right combo and can implement a better fix and test it.
But documenting how to hack the code is a bit silly. As an interim solution, the code could be controlled through a configure setting where the default is how the code works as of today, and you would have to actively deactivate it.
KJL
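One possible shape for a drain that cannot block forever, offered only as an untested idea and not something proposed or verified in this thread: wait for STDIN with a timeout and use unbuffered reads, so that a missing POST body costs at most the timeout instead of hanging the script. The function name drain_with_timeout and the 2 second timeout are arbitrary choices for illustration.

```perl
use strict;
use warnings;
use IO::Select;

# Hypothetical replacement for the blocking read(): consume up to
# $remaining bytes from $fh, waiting at most $timeout seconds for each
# chunk. Returns the number of bytes actually drained.
sub drain_with_timeout {
    my ( $fh, $remaining, $timeout ) = @_;
    my $select  = IO::Select->new($fh);
    my $drained = 0;
    while ( $remaining > 0 ) {
        my @ready = $select->can_read($timeout);
        last unless @ready;    # nothing arrived in time: give up
        my $want = $remaining > 8192 ? 8192 : $remaining;
        # sysread bypasses Perl's buffering, so it is safe next to select
        my $n = sysread( $fh, my $buf, $want );
        last unless $n;        # undef on error, 0 on end of file
        $remaining -= $n;
        $drained   += $n;
    }
    return $drained;
}

my $content_length = $ENV{CONTENT_LENGTH} || 0;
drain_with_timeout( \*STDIN, $content_length, 2 ) if $content_length;
```

Whether this actually avoids the hang on HP-UX or with mod_cgid would still need testing by someone who can reproduce the problem.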
For what it's worth, I'm running with the patch on Debian Etch (testing) with Apache 2.2 and have seen no problems. I haven't tried it without the patch.
--
DiabJerius
I am not letting this block a 4.1 release. Lowering to normal. If anyone disagrees - flip it back to urgent - or better - fix it
KJL
See also Item3354.
I suggest fixing this with at least a configure option for 4.1.
KJL
Since it causes trouble in other environments (Item3354) as well, I suggest adding a configure flag for 4.1, with the default being not to drain STDIN.
--
PTh
I added a {DrainStdin} configure flag to the Security/Misc section; the default state is FALSE.
--
PTh
In case someone wants to apply a fix to an existing TWiki 4:
Index: TWiki.spec
===================================================================
--- TWiki.spec (revision 12349)
+++ TWiki.spec (working copy)
@@ -485,6 +485,15 @@
$TWiki::cfg{GetScriptUrlFromCgi} = $FALSE;
# **BOOLEAN EXPERT**
+# Draining STDIN may be necessary if the script is called due to a
+# redirect and the original query was a POST. In this case the web
+# server is waiting to write the POST data to this script's STDIN,
+# but CGI.pm won't drain STDIN as it is seeing a GET because of the
+# redirect, not a POST. Enable this <b>only</b> in case a TWiki script
+# hangs.
+$TWiki::cfg{DrainStdin} = $FALSE;
+
+# **BOOLEAN EXPERT**
# Remove port number from URL. If set, and a URL is given with a port
# number e.g. http://my.server.com:8080/twiki/bin/view, this will strip
# off the port number before using the url in links.
Index: TWiki/UI.pm
===================================================================
--- TWiki/UI.pm (revision 12349)
+++ TWiki/UI.pm (working copy)
@@ -87,19 +87,22 @@
if( $ENV{'GATEWAY_INTERFACE'} ) {
# script is called by browser
$query = new CGI;
- # drain STDIN. This may be necessary if the script is called
- # due to a redirect and the original query was a POST. In this
- # case the web server is waiting to write the POST data to
- # this script's STDIN, but CGI.pm won't drain STDIN as it is
- # seeing a GET because of the redirect, not a POST. This script
- # tries to write to STDOUT, which goes back to the web server,
- # but the server isn't paying attention to that (as its waiting for
- # the script to _read_, not _write_), and everything blocks.
- # Some versions of apache seem to be more susceptible than others to
- # this.
- my $content_length =
- defined($ENV{'CONTENT_LENGTH'}) ? $ENV{'CONTENT_LENGTH'} : 0;
- read(STDIN, my $buf, $content_length, 0 ) if $content_length;
+
+ if( $TWiki::cfg{DrainStdin} ) {
+ # drain STDIN. This may be necessary if the script is called
+ # due to a redirect and the original query was a POST. In this
+ # case the web server is waiting to write the POST data to
+ # this script's STDIN, but CGI.pm won't drain STDIN as it is
+ # seeing a GET because of the redirect, not a POST. This script
+ # tries to write to STDOUT, which goes back to the web server,
+ # but the server isn't paying attention to that (as its waiting for
+ # the script to _read_, not _write_), and everything blocks.
+ # Some versions of apache seem to be more susceptible than others to
+ # this.
+ my $content_length =
+ defined($ENV{'CONTENT_LENGTH'}) ? $ENV{'CONTENT_LENGTH'} : 0;
+ read(STDIN, my $buf, $content_length, 0 ) if $content_length;
+ }
my $cache = $query->param('twiki_redirect_cache');
if ($cache) {
$cache = TWiki::Sandbox::untaintUnchecked($cache);
--
PTh
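For sites that need the old behaviour back after applying the patch, the flag can be turned on in configure or, presumably, by hand in the local configuration file. The fragment below is a sketch that assumes the patched TWiki.spec above and the usual lib/LocalSite.cfg location.

```perl
# lib/LocalSite.cfg (fragment); assumes the {DrainStdin} patch above is applied.
# Leave this out unless TWiki scripts hang after a redirected POST.
$TWiki::cfg{DrainStdin} = 1;
```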
4.1.0 released
KJL