Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-4544

possible bug in ConcurrentMergeScheduler.merge(IndexWriter)

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 6.0
    • 4.1, 6.0
    • core/other
    • None
    • New

    Description

      from dev list:

      ¨i suspect that this code is broken. Lines 331 - 343 in org.apache.lucene.index.ConcurrentMergeScheduler.merge(IndexWriter)

      mergeThreadCount() are currently active merges, they can be at most maxThreadCount, maxMergeCount is number of queued merges defaulted with maxThreadCount+2 and it can never be lower then maxThreadCount, which means that condition in while can never become true.

      synchronized(this) {
      long startStallTime = 0;
      while (mergeThreadCount() >= 1+maxMergeCount) {
      startStallTime = System.currentTimeMillis();
      if (verbose())

      { message(" too many merges; stalling..."); }

      try

      { wait(); }

      catch (InterruptedException ie)

      { throw new ThreadInterruptedException(ie); }

      }

      While confusing, I think the code is actually nearly correct... but I
      would love to find some simplifications of CMS's logic (it's really
      hairy).

      It turns out mergeThreadCount() is allowed to go higher than
      maxThreadCount; when this happens, Lucene pauses
      mergeThreadCount()-maxThreadCount of those merge threads, and resumes
      them once threads finish (see updateMergeThreads). Ie, CMS will
      accept up to maxMergeCount merges (and launch threads for them), but
      will only allow maxThreadCount of those threads to be running at once.

      So what that while loop is doing is preventing more than
      maxMergeCount+1 threads from starting, and then pausing the incoming
      thread to slow down the rate of segment creation (since merging cannot
      keep up).

      But ... I think the 1+ is wrong ... it seems like it should just be
      mergeThreadCount() >= maxMergeCount().

      Attachments

        1. LUCENE-4544.patch
          8 kB
          Michael McCandless
        2. LUCENE-4544.patch
          12 kB
          Michael McCandless

        Activity

          Patch.

          It was somewhat tricky to fix the off-by-one, because we only want to stall the thread(s) producing segments if the number of running merges is >= maxMergeCount AND another merge wants to kick off ... I made CMS.merged sync'd, and removed the synchronous IW.mergeInit (I think deterministic segment name assignment isn't important).

          mikemccand Michael McCandless added a comment - Patch. It was somewhat tricky to fix the off-by-one, because we only want to stall the thread(s) producing segments if the number of running merges is >= maxMergeCount AND another merge wants to kick off ... I made CMS.merged sync'd, and removed the synchronous IW.mergeInit (I think deterministic segment name assignment isn't important).
          hsn Radim Kolar added a comment -

          it can not be made simple by not creating thread for every new segment merge but to use standard threadpool + queue scheme? Lot of libraries can do it easily.

          hsn Radim Kolar added a comment - it can not be made simple by not creating thread for every new segment merge but to use standard threadpool + queue scheme? Lot of libraries can do it easily.

          I think it needs more than cutting over to thread pool to clean it up

          We've actually looked at using a thread pool (see LUCENE-2063) but it apparently wasn't straightforward ... if you can see a way that'd be nice

          But I think we should do that under a separate issue ... leave this one focused on the off-by-one on maxMergeCount.

          mikemccand Michael McCandless added a comment - I think it needs more than cutting over to thread pool to clean it up We've actually looked at using a thread pool (see LUCENE-2063 ) but it apparently wasn't straightforward ... if you can see a way that'd be nice But I think we should do that under a separate issue ... leave this one focused on the off-by-one on maxMergeCount.

          Added test case ... I think it's ready.

          mikemccand Michael McCandless added a comment - Added test case ... I think it's ready.
          commit-tag-bot Commit Tag Bot added a comment -

          [branch_4x commit] Michael McCandless
          http://svn.apache.org/viewvc?view=revision&revision=1408092

          LUCENE-4544: fix off-by-one in max number of CMS merge threads

          commit-tag-bot Commit Tag Bot added a comment - [branch_4x commit] Michael McCandless http://svn.apache.org/viewvc?view=revision&revision=1408092 LUCENE-4544 : fix off-by-one in max number of CMS merge threads
          tomoko Tomoko Uchida added a comment -

          This issue was moved to GitHub issue: #5610.

          tomoko Tomoko Uchida added a comment - This issue was moved to GitHub issue: #5610 .

          People

            mikemccand Michael McCandless
            hsn Radim Kolar
            Votes:
            0 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: