Lucene - Core
  1. Lucene - Core
  2. LUCENE-4163

Improve concurrency in MMapIndexInput.clone()

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.6, 4.0-ALPHA, 6.0
    • Fix Version/s: 4.0-ALPHA, 3.6.1, 6.0
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Followup issue from SOLR-3566:

      Whenever you clone the TermIndex, it also creates a clone of the underlying IndexInputs. In high cocurrent environments, the clone method of MMapIndexInput is a bottleneck (it has heavy work to do to manage the weak references in a synchronized block).

      Everywhere else in Lucene we use my new WeakIdentityMap for managing concurrent weak maps. For this case I did not do this, as the WeakIdentityMap has no iterators (it doe snot implement Map interface). This issue will add a key and values iterator (the key iterator will not return GC'ed keys), so MMapIndexInput can use WeakIdentityMap backed by ConcurrentHashMap and needs no synchronization. ConcurrentHashMap has better concurrency because it distributes the hash keys in different buckets per thread.

      1. LUCENE-4163.patch
        11 kB
        Uwe Schindler
      2. LUCENE-4163.patch
        10 kB
        Uwe Schindler
      3. LUCENE-4163.patch
        9 kB
        Uwe Schindler
      4. LUCENE-4163-3.x
        12 kB
        Uwe Schindler

        Issue Links

          Activity

          Hide
          Uwe Schindler added a comment -

          Patch.

          This patch also refactors the close() method of MMapIndexInput a little bit to work around the issue that we have no synchronization anymore. It will mark the IndexInput as closed (buffers = null) as first step, so later clone() or other access fails with AlreadyClosedException. After unsetting the buffers it will unset all clone buffers and finally unmap them.

          Show
          Uwe Schindler added a comment - Patch. This patch also refactors the close() method of MMapIndexInput a little bit to work around the issue that we have no synchronization anymore. It will mark the IndexInput as closed (buffers = null) as first step, so later clone() or other access fails with AlreadyClosedException. After unsetting the buffers it will unset all clone buffers and finally unmap them.
          Hide
          Uwe Schindler added a comment -

          Slightly improved test (null keys and iterator conformance).

          I think that's ready to commit and brings a big improvement in concurrency. We should backport this to 3.6.1!

          Show
          Uwe Schindler added a comment - Slightly improved test (null keys and iterator conformance). I think that's ready to commit and brings a big improvement in concurrency. We should backport this to 3.6.1!
          Hide
          Adrien Grand added a comment -

          +1

          Maybe we should also update WeakIdentityMap documentation now that it has key and value iterators:

          This implementation was forked from <a href="http://cxf.apache.org/">Apache CXF</a> but modified to <b>not</b> implement the {@link java.util.Map} interface and without any set/iterator views on it, as those are error-prone and inefficient

          Show
          Adrien Grand added a comment - +1 Maybe we should also update WeakIdentityMap documentation now that it has key and value iterators: This implementation was forked from <a href="http://cxf.apache.org/">Apache CXF</a> but modified to <b>not</b> implement the {@link java.util.Map} interface and without any set/iterator views on it, as those are error-prone and inefficient
          Hide
          Uwe Schindler added a comment -

          Adrien: right, will do!

          Show
          Uwe Schindler added a comment - Adrien: right, will do!
          Hide
          Uwe Schindler added a comment -

          Patch with updated Javadocs.

          Show
          Uwe Schindler added a comment - Patch with updated Javadocs.
          Hide
          Uwe Schindler added a comment -

          For now I will not commit this to 3.6.1, it may be too risky (especially as it is Java 5 and ConcurrentHashMap may deadlock sometimes under heavy load).

          If this should be backported reopen before release.

          Show
          Uwe Schindler added a comment - For now I will not commit this to 3.6.1, it may be too risky (especially as it is Java 5 and ConcurrentHashMap may deadlock sometimes under heavy load). If this should be backported reopen before release.
          Hide
          Uwe Schindler added a comment -

          Committed trunk revision: 1353101
          Committed 4.x branch revision: 1353102

          Show
          Uwe Schindler added a comment - Committed trunk revision: 1353101 Committed 4.x branch revision: 1353102
          Hide
          Uwe Schindler added a comment -

          Reopen for Backporting to 3.6.1

          Show
          Uwe Schindler added a comment - Reopen for Backporting to 3.6.1
          Hide
          Uwe Schindler added a comment -

          Here the backport patch.

          There should not be any problem with Java 5, I looked through the codebase, we use ConcurrentHashMap already everywhere.

          Show
          Uwe Schindler added a comment - Here the backport patch. There should not be any problem with Java 5, I looked through the codebase, we use ConcurrentHashMap already everywhere.
          Hide
          Uwe Schindler added a comment -

          Committed 3.6 revision: 1362239

          Show
          Uwe Schindler added a comment - Committed 3.6 revision: 1362239
          Hide
          Uwe Schindler added a comment -

          Bulk close for 3.6.1

          Show
          Uwe Schindler added a comment - Bulk close for 3.6.1

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Uwe Schindler
            • Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development