Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.3, 3.0.2, 3.1, 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Some concurrency improvements for NRT

      I found & fixed some silly thread bottlenecks that affect NRT:

      • Multi/DirectoryReader.numDocs is synchronized, I think so only 1
        thread computes numDocs if it's -1. I removed this sync, and made
        numDocs volatile, instead. Yes, multiple threads may compute the
        numDocs for the first time, but I think that's harmless?
      • Fixed BitVector's ctor to set count to 0 on creating a new BV, and
        clone to copy the count over; this saves CPU computing the count
        unecessarily.
      • Also strengthened assertions done in SR, testing the delete docs
        count.

      I also found an annoying thread bottleneck that happens, due to CMS.
      Whenever CMS hits the max running merges (default changed from 3 to 1
      recently), and the merge policy now wants to launch another merge, it
      forces the incoming thread to wait until one of the BG threads
      finishes.

      This is a basic crude throttling mechanism – you force the mutators
      (whoever is causing new segments to appear) to stop, so that merging
      can catch up.

      Unfortunately, when stressing NRT, that thread is the one that's
      opening a new NRT reader.

      So, the first serious problem happens when you call .reopen() on your
      NRT reader – this call simply forwards to IW.getReader if the reader
      was an NRT reader. But, because DirectoryReader.doReopen is
      synchronized, this had the horrible effect of holding the monitor lock
      on your main IR. In my test, this blocked all searches (since each
      search uses incRef/decRef, still sync'd until LUCENE-2156, at least).
      I fixed this by making doReopen only sync'd on this if it's not simply
      forwarding to getWriter. So that's a good step forward.

      This prevents searches from being blocked while trying to reopen to a
      new NRT.

      However... it doesn't fix the problem that when an immense merge is
      off and running, opening an NRT reader could hit a tremendous delay
      because CMS blocks it. The BalancedSegmentMergePolicy should help
      here... by avoiding such immense merges.

      But, I think we should also pursue an improvement to CMS. EG, if it
      has 2 merges running, where one is huge and one is tiny, it ought to
      increase thread priority of the tiny one. I think with such a change
      we could increase the max thread count again, to prevent this
      starvation. I'll open a separate issue....

      1. LUCENE-2161.patch
        11 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        Attached patch. Also adds an expert public String IndexReader.segString for debug purposes.

        Show
        Michael McCandless added a comment - Attached patch. Also adds an expert public String IndexReader.segString for debug purposes.
        Hide
        Earwin Burrfoot added a comment -

        Remove volatile from numDocs? All threads will hit some other sync sooner or later and see the value computed by the first (?few?).

        Show
        Earwin Burrfoot added a comment - Remove volatile from numDocs? All threads will hit some other sync sooner or later and see the value computed by the first (?few?).
        Hide
        Michael McCandless added a comment -

        Remove volatile from numDocs? All threads will hit some other sync sooner or later and see the value computed by the first (?few?).

        Yeah I guess this would be fine. Even if they don't see the value (ie they still see the stale -1), it's harmless if they recompute it and re-overwrite it. So I'll remove volatile.

        Show
        Michael McCandless added a comment - Remove volatile from numDocs? All threads will hit some other sync sooner or later and see the value computed by the first (?few?). Yeah I guess this would be fine. Even if they don't see the value (ie they still see the stale -1), it's harmless if they recompute it and re-overwrite it. So I'll remove volatile.
        Hide
        Michael McCandless added a comment -

        Backport

        Show
        Michael McCandless added a comment - Backport
        Hide
        Shay Banon added a comment -

        Mike, is there a reason why this is not backported to 3.0.2?

        Show
        Shay Banon added a comment - Mike, is there a reason why this is not backported to 3.0.2?
        Hide
        Michael McCandless added a comment -

        Shay it will be backported to 3.0.2.

        Show
        Michael McCandless added a comment - Shay it will be backported to 3.0.2.
        Hide
        Shay Banon added a comment -

        Thanks!

        Show
        Shay Banon added a comment - Thanks!

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development