Lucene - Core
  1. Lucene - Core
  2. LUCENE-4544

possible bug in ConcurrentMergeScheduler.merge(IndexWriter)

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 4.1, 6.0
    • Component/s: core/other
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      from dev list:

      ¨i suspect that this code is broken. Lines 331 - 343 in org.apache.lucene.index.ConcurrentMergeScheduler.merge(IndexWriter)

      mergeThreadCount() are currently active merges, they can be at most maxThreadCount, maxMergeCount is number of queued merges defaulted with maxThreadCount+2 and it can never be lower then maxThreadCount, which means that condition in while can never become true.

      synchronized(this) {
      long startStallTime = 0;
      while (mergeThreadCount() >= 1+maxMergeCount) {
      startStallTime = System.currentTimeMillis();
      if (verbose())

      { message(" too many merges; stalling..."); }

      try

      { wait(); }

      catch (InterruptedException ie)

      { throw new ThreadInterruptedException(ie); }

      }

      While confusing, I think the code is actually nearly correct... but I
      would love to find some simplifications of CMS's logic (it's really
      hairy).

      It turns out mergeThreadCount() is allowed to go higher than
      maxThreadCount; when this happens, Lucene pauses
      mergeThreadCount()-maxThreadCount of those merge threads, and resumes
      them once threads finish (see updateMergeThreads). Ie, CMS will
      accept up to maxMergeCount merges (and launch threads for them), but
      will only allow maxThreadCount of those threads to be running at once.

      So what that while loop is doing is preventing more than
      maxMergeCount+1 threads from starting, and then pausing the incoming
      thread to slow down the rate of segment creation (since merging cannot
      keep up).

      But ... I think the 1+ is wrong ... it seems like it should just be
      mergeThreadCount() >= maxMergeCount().

      1. LUCENE-4544.patch
        12 kB
        Michael McCandless
      2. LUCENE-4544.patch
        8 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        Patch.

        It was somewhat tricky to fix the off-by-one, because we only want to stall the thread(s) producing segments if the number of running merges is >= maxMergeCount AND another merge wants to kick off ... I made CMS.merged sync'd, and removed the synchronous IW.mergeInit (I think deterministic segment name assignment isn't important).

        Show
        Michael McCandless added a comment - Patch. It was somewhat tricky to fix the off-by-one, because we only want to stall the thread(s) producing segments if the number of running merges is >= maxMergeCount AND another merge wants to kick off ... I made CMS.merged sync'd, and removed the synchronous IW.mergeInit (I think deterministic segment name assignment isn't important).
        Hide
        Radim Kolar added a comment -

        it can not be made simple by not creating thread for every new segment merge but to use standard threadpool + queue scheme? Lot of libraries can do it easily.

        Show
        Radim Kolar added a comment - it can not be made simple by not creating thread for every new segment merge but to use standard threadpool + queue scheme? Lot of libraries can do it easily.
        Hide
        Michael McCandless added a comment -

        I think it needs more than cutting over to thread pool to clean it up

        We've actually looked at using a thread pool (see LUCENE-2063) but it apparently wasn't straightforward ... if you can see a way that'd be nice

        But I think we should do that under a separate issue ... leave this one focused on the off-by-one on maxMergeCount.

        Show
        Michael McCandless added a comment - I think it needs more than cutting over to thread pool to clean it up We've actually looked at using a thread pool (see LUCENE-2063 ) but it apparently wasn't straightforward ... if you can see a way that'd be nice But I think we should do that under a separate issue ... leave this one focused on the off-by-one on maxMergeCount.
        Hide
        Michael McCandless added a comment -

        Added test case ... I think it's ready.

        Show
        Michael McCandless added a comment - Added test case ... I think it's ready.
        Hide
        Commit Tag Bot added a comment -

        [branch_4x commit] Michael McCandless
        http://svn.apache.org/viewvc?view=revision&revision=1408092

        LUCENE-4544: fix off-by-one in max number of CMS merge threads

        Show
        Commit Tag Bot added a comment - [branch_4x commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1408092 LUCENE-4544 : fix off-by-one in max number of CMS merge threads

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Radim Kolar
          • Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development