Bug 45605 - [Mon Aug 04 16:30:39 2008] [crit] [Mon Aug 04 16:30:39 2008] file fdqueue.c, line 293, assertion "!((queue)->nelts == (queue)->bounds)" failed
Summary: [Mon Aug 04 16:30:39 2008] [crit] [Mon Aug 04 16:30:39 2008] file fdqueue.c, ...
Status: RESOLVED FIXED
Alias: None
Product: Apache httpd-2
Classification: Unclassified
Component: mpm_worker
Version: 2.2.9
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Apache HTTPD Bugs Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-08-09 00:40 UTC by Roger Mattison
Modified: 2008-10-18 03:20 UTC
CC List: 3 users



Attachments
worker_queue wait for not full patch for 2.2.6 (4.40 KB, patch)
2008-09-19 16:14 UTC, Chris Luby
worker_queue wait for not full patch for 2.2.6 - version 2 (added warning) (5.11 KB, patch)
2008-09-19 16:53 UTC, Chris Luby

Description Roger Mattison 2008-08-09 00:40:34 UTC
I have a server running mod_jk, and the error log shows this message before the server dies:
 [Mon Aug 04 16:30:39 2008] [crit] [Mon Aug 04 16:30:39 2008] file fdqueue.c, line 293, assertion "!((queue)->nelts == (queue)->bounds)" failed

I think when some (not all) of the backend servers are down (and the rest are therefore slower) it happens.
Comment 1 Jeff Lawson 2008-09-18 13:11:09 UTC
I was able to reproduce this bug under heavy load on Linux (CentOS 4). Without maintainer mode configured, ap_queue_push() would simply write past the end of the worker_queue->data[] array and into the worker_queue_info structure that happened to be allocated directly after it in memory. The end result was that the condition variable pointer at worker_queue_info->wait_for_idler (along with the other fields of worker_queue_info before it) was overwritten, and the child process would hang when it attempted to shut down: queue_info_cleanup() -> apr_thread_cond_destroy() -> apr_pool_cleanup_run() -> thread_cond_cleanup() -> pthread_cond_destroy(), which hung attempting to lock the mutex embedded in the condition variable.
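To picture the failure mode, here is a small, deliberately broken model of it (not the actual fdqueue.c code; the structures and names are simplified stand-ins): an unchecked push walks past data[] into the neighbouring structure.

#include <stdio.h>

/* Simplified stand-ins, not the real httpd structures. */
struct toy_queue {
    int nelts;               /* current number of queued elements */
    int bounds;              /* capacity of data[]                */
    int data[4];
};

struct toy_queue_info {
    int idlers;              /* the field that ends up clobbered  */
};

/* Push with no bounds check, mimicking a non-maintainer-mode build where
 * the assert "!((queue)->nelts == (queue)->bounds)" compiles away.       */
static void toy_push(struct toy_queue *q, int v)
{
    q->data[q->nelts++] = v;
}

int main(void)
{
    /* Lay the two structures out back to back, as observed in the child. */
    struct { struct toy_queue q; struct toy_queue_info qi; } mem =
        { { 0, 4, { 0 } }, { 1 } };

    for (int i = 0; i < 5; i++)          /* one push too many */
        toy_push(&mem.q, 0xDEAD);

    /* On a typical build this prints 57005 (0xDEAD) instead of 1, because
     * the fifth push wrote past data[] into the neighbouring structure.   */
    printf("idlers is now %d\n", mem.qi.idlers);
    return 0;
}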

I also observed other problems (segfaults in two other places), which isn't surprising since we are looking at a buffer overrun into who knows what memory.

Reproduced with 2.2.9 as well. With maintainer mode enabled I got the same assert as reported in this bug.

I would be willing to work on a fix for this, but I haven't had time to dig deep enough to know whether there is already a mechanism to prevent this array overflow that is simply broken, or whether one needs to be built. Any pointers would be appreciated.
Comment 2 Chris Luby 2008-09-19 16:14:17 UTC
Created attachment 22614 [details]
worker_queue wait for not full patch for 2.2.6

This is the patch I'm currently testing for this against 2.2.6.  I just finished it about an hour ago and I would love any feedback.  What I've done is add a second condition variable to the queue structure; the listener thread waits on it when the queue is full, until one of the worker threads signals it after popping a socket off of the queue.  It's essentially the opposite of the not_empty condition.
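To make the idea concrete, here is a rough, generic sketch of the approach, using plain pthreads and made-up names rather than the APR primitives the actual patch uses (see the attachment for the real change):

#include <pthread.h>

/* Illustrative bounded queue, not the real fd_queue_t layout.
 * Initialisation (mutex/cond init, data allocation) is omitted. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;   /* existing idea: workers wait here when empty */
    pthread_cond_t  not_full;    /* new: the listener waits here when full      */
    void          **data;
    int             nelts, bounds, head;
} toy_queue;

/* Listener side: block while the queue is full instead of writing past data[]. */
void toy_push(toy_queue *q, void *elem)
{
    pthread_mutex_lock(&q->lock);
    while (q->nelts == q->bounds)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->data[(q->head + q->nelts) % q->bounds] = elem;
    q->nelts++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Worker side: after taking an element, wake a listener that may be waiting
 * on not_full (the mirror image of the existing not_empty signalling).      */
void *toy_pop(toy_queue *q)
{
    void *elem;
    pthread_mutex_lock(&q->lock);
    while (q->nelts == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    elem = q->data[q->head];
    q->head = (q->head + 1) % q->bounds;
    q->nelts--;
    pthread_cond_signal(&q->not_full);
    pthread_mutex_unlock(&q->lock);
    return elem;
}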
Comment 3 Chris Luby 2008-09-19 16:53:46 UTC
Created attachment 22615 [details]
worker_queue wait for not full patch for 2.2.6 - version 2 (added warning)

Slight update to the patch I'm testing, adding an error-log warning to help with validation.
Comment 4 Denis Ustimenko 2008-10-07 07:08:42 UTC
The same problem was reproduced several times under heavy load with a 1-process / 2-thread configuration. The worker_queue overflow corrupts data and the process dumps core. Inspecting the core files shows an underflow of worker_queue_info.idlers.

In the end, this looks like a race between the condition signal and the atomic update of the idlers variable.

The following scheduling scenario leads to the idlers underflow:

0. one listener thread + one worker thread

1. listener gets a connection, decreases idlers to 0, then a context switch
2. worker finishes its job and sets idlers from 0 to 1,
   then a context switch before the condition signal
3. listener gets a connection, sees that idlers is 1,
   so decreases it to 0, gets another connection,
   waits on the condition variable
4. worker, remembering that idlers was 0,
   does the cond_signal, then a context switch
5. listener wakes up and sets idlers to -1

The patch for 2.2.9 follows. A similar patch for 2.2.3 is currently under test.

--- server/mpm/worker/fdqueue.c.fdqueue-overflow    2006-07-12 07:38:44.000000000 +0400
+++ server/mpm/worker/fdqueue.c     2008-10-07 13:53:28.000000000 +0400
@@ -166,7 +166,7 @@
          *     now nonzero, it's safe for this function to
          *     return immediately.
          */
-        if (queue_info->idlers == 0) {
+        while (queue_info->idlers == 0) {
             rv = apr_thread_cond_wait(queue_info->wait_for_idler,
                                   queue_info->idlers_mutex);
             if (rv != APR_SUCCESS) {
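In other words, this is the classic "wait for the predicate in a loop" fix. Paraphrased from the surrounding code (simplified, not the literal fdqueue.c source; error handling and the unlocked fast-path check are omitted), the listener's wait now behaves like:

/* Paraphrase of ap_queue_info_wait_for_idler() after the patch. */
static apr_status_t wait_for_idler_sketch(fd_queue_info_t *queue_info)
{
    apr_thread_mutex_lock(queue_info->idlers_mutex);
    while (queue_info->idlers == 0) {
        /* Re-checked after every wakeup: if the newly idle worker was already
         * claimed (step 5 in the scenario above), go back to waiting instead
         * of falling through and driving idlers negative.                    */
        apr_thread_cond_wait(queue_info->wait_for_idler,
                             queue_info->idlers_mutex);
    }
    apr_thread_mutex_unlock(queue_info->idlers_mutex);
    apr_atomic_dec32(&queue_info->idlers);   /* claim the idle worker */
    return APR_SUCCESS;
}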
Comment 5 Ruediger Pluem 2008-10-07 08:52:54 UTC
(In reply to comment #4)
> The same problem was reproduced several times under heavy load with a
> 1-process / 2-thread configuration. The worker_queue overflow corrupts data
> and the process dumps core. Inspecting the core files shows an underflow of
> worker_queue_info.idlers.
> 
> In the end, this looks like a race between the condition signal and the
> atomic update of the idlers variable.
> 
> The following scheduling scenario leads to the idlers underflow:
> 
> 0. one listener thread + one worker thread
> 
> 1. listener gets a connection, decreases idlers to 0, then a context switch
> 2. worker finishes its job and sets idlers from 0 to 1,
>    then a context switch before the condition signal
> 3. listener gets a connection, sees that idlers is 1,
>    so decreases it to 0, gets another connection,
>    waits on the condition variable
> 4. worker, remembering that idlers was 0,
>    does the cond_signal, then a context switch
> 5. listener wakes up and sets idlers to -1
> 
> The patch for 2.2.9 follows. A similar patch for 2.2.3 is currently under
> test.

Very nice analysis. Just one question for clarification: after applying the patch you submitted, was the issue gone and no longer reproducible?

Comment 6 Denis Ustimenko 2008-10-08 01:39:39 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > 
> > The patch for 2.2.9 follows. A similar patch for 2.2.3 is currently under
> > test.
> 
> Very nice analysis. Just one question for clarification: after applying the
> patch you submitted, was the issue gone and no longer reproducible?
 
Right! No segfault anymore.
Comment 7 Ruediger Pluem 2008-10-08 06:33:05 UTC
Committed to trunk as r702867 (http://svn.apache.org/viewvc?rev=702867&view=rev).
Comment 8 Jeff Lawson 2008-10-08 08:41:36 UTC
This latest report is a legitimate bug, as can be seen by following the events through the given scenario. However, it is not exactly the same bug as originally reported. As can be seen from the assert the original reporter was hitting, the original condition was an overflow, not an underflow. Here is the scenario in which the overflow can happen.

1. The listener thread is waiting for workers.
2. The worker threads are all busy except one that has just finished.
3. That worker thread atomically increments idlers from 0 to 1 and awakens the listener.
4. That worker thread is context-switched out (before getting into ap_queue_pop()).
5. The listener awakens, finds that there is an idle worker, and begins to fill the queue (ap_queue_push()), repeatedly, until the queue is overfilled.

Unfortunately the fix provided by Denis does not address this problem. 

The root of the problem is that there is no way for the listener to indicate that it is idle, then execute code outside of a critical section, and then pick up the work to be done without this timing window being present. 

The possible solutions are:
- Have the idler not indicate it is ready to process a request before it is actually in the critical section where it will pick up the work to be processed. This is not trivial with the current architecture, where the queue and queue_info structures are separate.
- Have the listener wait if the queue is full, as Chris' patch does. This introduces an extra condition variable and an extra mutex lock for each request. (The mutex cost could probably be reduced to almost zero by only locking when the queue is full.)
- To minimize code changes, we could simply exit the child gracefully when the queue is full, allowing the queued requests to finish but no further requests to be processed by this child. I don't like this solution because it creates more work (child startup/shutdown) right when the system is already overloaded.

Comments? Other possibilities?
Comment 9 Jeff Lawson 2008-10-08 09:07:29 UTC
There is a mistake in the last comment. The paragraph:
The root of the problem is that there is no way for the listener to indicate
that it is idle, then execute code outside of a critical section, and then pick
up the work to be done without this timing window being present. 

should read:
The root of the problem is that there is no way for the worker (not listener) to indicate that it is idle, then execute code outside of a critical section, and then pick up the work to be done without this timing window being present. 
Comment 10 Ruediger Pluem 2008-10-08 09:29:17 UTC
(In reply to comment #8)
> This latest report is a legitimate bug, as can be seen by following the
> events through the given scenario. However, it is not exactly the same bug
> as originally reported. As can be seen from the assert the original reporter
> was hitting, the original condition was an overflow, not an underflow. Here
> is the scenario in which the overflow can happen.
> 
> 1. The listener thread is waiting for workers.
> 2. The worker threads are all busy except one that has just finished.
> 3. That worker thread atomically increments idlers from 0 to 1 and awakens
> the listener.
> 4. That worker thread is context-switched out (before getting into
> ap_queue_pop()).
> 5. The listener awakens, finds that there is an idle worker, and begins to
> fill the queue (ap_queue_push()), repeatedly, until the queue is overfilled.

I cannot follow this last point. Once the listener awakes again from apr_thread_cond_wait in ap_queue_info_wait_for_idler, it knows for sure that there is at least one idle thread (after the patch from Denis is applied). But if there is only one idle thread, queue_info->idlers is decreased to zero again by apr_atomic_dec32(&(queue_info->idlers)) in ap_queue_info_wait_for_idler. After returning from ap_queue_info_wait_for_idler, the listener thread tries to accept *one* connection and pushes it onto the queue. Afterwards it has to wait again for an idle thread in ap_queue_info_wait_for_idler (specifically in the call to apr_thread_cond_wait). So the queue is not filled repeatedly by the listener thread until it overflows.
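For readers without the source handy, the per-connection flow described here is essentially the following. This is a simplified sketch, not the literal worker.c listener loop, and accept_one_connection() is a made-up stand-in for the real accept/pollset handling:

for (;;) {
    /* Blocks until at least one worker is idle, then claims it by
     * decrementing queue_info->idlers.                              */
    ap_queue_info_wait_for_idler(worker_queue_info);

    csd = accept_one_connection();          /* hypothetical helper   */

    /* Exactly one push per claimed idler, so the listener alone
     * cannot overfill the queue.                                    */
    ap_queue_push(worker_queue, csd, ptrans);
}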
Have you applied the patch and checked whether you still experience the same kind of segfaults as without it?
Do you still see the assertion error message found by the original reporter?
Comment 11 Jeff Lawson 2008-10-08 14:36:12 UTC
Yes, I see that; thanks for the explanation.

I will test with the new patch. 

I see how the underflow of queue_info->idlers can cause an overflow of queue->data[]:
1. The underflow explained by Denis happens.
2. The listener fills the queue, decrementing queue_info->idlers on each insert, making it more and more negative, until queue->data[] overflows and bad things happen.
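A small standalone illustration of why the listener never waits again after that first bad decrement (idlers is maintained with apr_atomic_inc32/dec32, i.e. as an unsigned 32-bit counter, so "going negative" is really an unsigned wraparound):

#include <stdio.h>

int main(void)
{
    unsigned int idlers = 0;
    idlers--;                                   /* step 5: 0 - 1 wraps to 0xFFFFFFFF */
    printf("idlers after underflow: %u\n", idlers);

    /* The listener's check in ap_queue_info_wait_for_idler,
     *     if (queue_info->idlers == 0) { ... wait ... }
     * never fires again, so each iteration just decrements the huge value
     * and pushes another connection until nelts hits bounds.             */
    return 0;
}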
Comment 12 Jeff Lawson 2008-10-10 12:48:08 UTC
This patch also fixes my scenario. No more assert.
Comment 13 Ruediger Pluem 2008-10-11 11:43:46 UTC
Proposed for backport as r703707 (http://svn.apache.org/viewvc?rev=703707&view=rev).
Comment 14 Ruediger Pluem 2008-10-18 03:20:19 UTC
Backported to 2.2.x as r705872 (http://svn.apache.org/viewvc?rev=705872&view=rev).