Support, The browser hangs for 300sec.waiting for httpd to respond and ends up timing out. When I decrease the amount of content I dont see this problem. I originally had this with 2.0.40. Download and build the latest and get the same hang. Heres the error_log: -------------------- Start Error Log ---------------- Thu Sep 25 15:28:31 2003] [notice] Apache/2.0.47 (Unix) configured -- resuming normal operations Thu Sep 25 15:28:31 2003] [info] Server built: Sep 25 2003 14:17:17 [Thu Sep 25 15:28:31 2003] [debug] prefork.c(1037): AcceptMutex: sysvsem (default: sysvsem) [Thu Sep 25 15:37:41 2003] [error] [client 192.168.8.158] (70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed, referer: http://192.168.8.158/cgi-bin/editBuildTags-branch3.cgi -------------------- End -----------------------------
let me guess... your cgi script is writing a bunch of stuff to stderr? change that and see if it still occurs
Im not writing anything to stderr, only to stdout with print statements and writing/reading files.
Jeff, Any other ideas? Thanks.
If it wasn't a CGI request, I dunno :( Can you reproduce it with a certain type of request? Have you looked at the access log to see what type of request it is?
Heres the access.log entry... localhost.localdomain - - [03/Oct/2003:15:16:12 -0400] "GET /cgi- bin/editBuildTags-branch4.cgi HTTP/1.1" 200 6265 "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003" localhost.localdomain - - [03/Oct/2003:15:16:42 -0400] "POST /cgi- bin/editBuildTags-branch4.cgi HTTP/1.1" 200 I can execute the Perl script outside of httpd without any errors, if thats what your asking me.
Reproduce the problem, and during the 300 second hang, attach to the CGI process with strace and see what syscall it is in. If you have a simple testcase I can use to reproduce it on my system, please post it.
Unfortunately I cant get a simple testcase cuase everything I try works. The application itself will work 10-15% of the time. Is there a debugger I need to attatch to the process when it hangs? Im not familar how to do this?
This bug is probably the same as bug #20866.
Has this ever been resolved?
need more information to resolve it... we don't know what code is hanging... info requested on 10/10 is expected to help...
I have several apache 2.0.48 servers exhibiting the "ap_content_length_filter: apr_bucket_read() failed" problem with little bit of poking around I found that this happens avery time apache is waiting for a large amount of data to be returned to it from the cgi script (eg. large and time consuming database read/write or extensive writing to stderr). This happens on both 32 and 64 bit (Linux and Solaris) platform but only in the preforked mpm model, the worker (threaded) mpm seems to handle this problem better. I did couple of straces to my cgi script and it always gets stuck in the write system call and at the same time the corresponding http server is stuck in poll system call. this lasts for (5 min) 'Timeout 300' and the error is displayed inside the error_log file (70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed and the cgi script either dies or becomes a zombie process. If I am correct error is coming from httpd-2.0.48/server/protocol.c file, (snippet included below) and has to do with BLOCK_READ vs. NONBLOCK_READ. I might be totally wrong about this guess but.... Hope you have a fix for this bug or maybe an idea how I could work around it. thanks fil /* Loop through this set of buckets to compute their length */ e = APR_BRIGADE_FIRST(b); while (e != APR_BRIGADE_SENTINEL(b)) { if (APR_BUCKET_IS_EOS(e)) { eos = 1; break; } if (e->length == (apr_size_t)-1) { apr_size_t len; const char *ignored; apr_status_t rv; /* This is probably a pipe bucket. Send everything * prior to this, and then read the data for this bucket. */ rv = apr_bucket_read(e, &ignored, &len, eblock); if (rv == APR_SUCCESS) { /* Attempt a nonblocking read next time through */ eblock = APR_NONBLOCK_READ; r->bytes_sent += len; } else if (APR_STATUS_IS_EAGAIN(rv)) { /* Output everything prior to this bucket, and then * do a blocking read on the next batch. */ if (e != APR_BRIGADE_FIRST(b)) { apr_bucket_brigade *split = apr_brigade_split(b, e); apr_bucket *flush = apr_bucket_flush_create(r->connection->bucket_alloc); APR_BRIGADE_INSERT_TAIL(b, flush); rv = ap_pass_brigade(f->next, b); if (rv != APR_SUCCESS || f->c->aborted) { apr_brigade_destroy(split); return rv; } b = split; e = APR_BRIGADE_FIRST(b); ctx->data_sent = 1; } eblock = APR_BLOCK_READ; continue; } else { ap_log_rerror(APLOG_MARK, APLOG_ERR, rv, r, "ap_content_length_filter: " "apr_bucket_read() failed"); return rv; } } else { r->bytes_sent += e->length; } e = APR_BUCKET_NEXT(e); }
Regarding the 2003-10-10 question from Jeff Trawick, I attached to the process using strace and it shows write(2, "Use of uninitialized value in co"..., 126 That would be a warning printed by a Perl script that has a lot of undefined variables embeded in strings and runs with -w. (Use of uninitialized value in concatenation (.) or string at xxx.pl line YYY) Development server, duh :) After the 300 seconds I got from strace this: --- SIGPIPE (Broken pipe) @ 0 (0) --- write(2, "Use of uninitialized value in co"..., 126) = -1 EPIPE (Broken pipe) --- SIGPIPE (Broken pipe) @ 0 (0) --- --- SIGTERM (Terminated) @ 0 (0) --- Regards Sorin Mocanu
Sorin, >Regarding the 2003-10-10 question from Jeff Trawick, I attached >to the process using strace and it shows > >write(2, "Use of uninitialized value in co"..., 126 The script is hung writing to stderr. You're suffering from PR 22030.
Looks like a possibly related bug #20866 may have been fixed in version 2.0.50. Folks should retry with that version.
To confirm whether this is related to stderr handling in mod_cgi please try to reproduce the problem using 2.0.50.
Let's presume this is bug 22030 and hence fixed in 2.0.50. lacking evidence to the contrary. *** This bug has been marked as a duplicate of 22030 ***
I'm running a script on our intranet (apache_2.0.52-win32-x86-no_ssl, ActivePerl-5.8.6.811-MSWin32-x86-122208) querying our MSSQL 2000 server, all running on a Win2K server, and I'm getting the same error: [Tue Aug 09 12:56:32 2005] [error] [client 10.0.0.110] (70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed, referer: http://intranet/reports/order_me.pl It normally takes about 5 seconds to run, if the results are small (less than 50 rows as far as I've experienced.) In that case, there is no error, and the page loads and completes just fine. However, when I modify the script such that 1000 rows are returned, it always hangs in the browser for five minutes (as the original poster described) after outputting about 41 kb of HTML. It must be some kind of buffer problem because it'll stop in mid-print-statement, like after the '&nb' of an ' ' in an empty cell. Worse than the original poster described, however, is the fact that the results get cut short at that point and the rest are never displayed. Thanks, Kev
We're now on the latest, ActivePerl-5.8.7.813-MSWin32 and apache_2.0.54-win32, and still the exact same thing happens. I ran a test using a simple query that's fast and returns a lot of data. It ran fine, returning 365 kb of data. However, I tested the other query (the one that's causing problems) using a query analyzer, and it took 9 seconds (expected, it's complex) but not any 300 seconds. I would guess it's timing out waiting for the query results and something strange is happening that prevents it from getting the rest, because it starts returning results after only 2 seconds, gives one burst of data, and that's it. Is nobody else still having things happen like this? I see David Trusty reopened one of the ones this was supposedly a duplicate of as well. If you need any more info from me just let me know. Thanks, Kev
For now, anyway, Eric Pearce's suggestion works very well for me: 'I changed the begining of the script from "perl -w" to "perl -X" and all problems ceased...' Thanks, Kev
The only remaining bug here is that Win32 can't poll() across non-socket fds so mod_cgi's CGI bucket stuff can't be used on that platform. This needs to be fixed at APR or somewhere below mod_cgi, marking WONTFIX until it is. (perhaps "CANTFIX" is more appropriate)