Lucene - Core
  LUCENE-3588

Try harder to prevent SIGSEGV on cloned MMapIndexInputs

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.4, 3.5
    • Fix Version/s: 3.6, 4.0-ALPHA
    • Component/s: core/store
    • Labels:
      None
    • Lucene Fields:
      New, Patch Available

      Description

      We are unmapping mmapped byte buffers, which is disallowed by the JDK because it risks a SIGSEGV when you access the mapped byte buffer after unmapping.

      We currently prevent this for the main IndexInput by setting its buffer to null, so we get an NPE if somebody tries to access the underlying buffer. I also recently fixed the stupid curBuf (LUCENE-3200) by setting it to null.

      The big problem is cloned IndexInputs, which are generally not closed. Those still contain references to the unmapped ByteBuffer, which easily leads to SIGSEGV. The patch from Mike in LUCENE-3439 prevents most of this in Lucene 3.5, but it's still not 100% safe (as it uses non-volatiles).

      This patch will fix the remaining issues by also setting the buffers of clones to null when the original is closed. The trick is to record weak references to all clones created and close them together with the original. This uses a ConcurrentHashMap<WeakReference<MMapIndexInput>,?> as the store, with logic borrowed from WeakHashMap to clean up the GCed references (using ReferenceQueue).
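
The weak-reference bookkeeping described above might look roughly like the following. This is only a sketch under stated assumptions, not the attached patch: `TrackedInput` and `WeakCloneRegistry` are hypothetical names, and a plain byte array stands in for the mapped ByteBuffer.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for MMapIndexInput; "buffer" plays the role of the mapped ByteBuffer.
class TrackedInput {
  volatile byte[] buffer;

  TrackedInput(byte[] buffer) {
    this.buffer = buffer;
  }
}

// Hypothetical registry: weak references to clones, purged through a ReferenceQueue
// in the same spirit as WeakHashMap's internal cleanup.
class WeakCloneRegistry {
  private final ConcurrentHashMap<WeakReference<TrackedInput>, Boolean> clones =
      new ConcurrentHashMap<>();
  private final ReferenceQueue<TrackedInput> queue = new ReferenceQueue<>();

  void register(TrackedInput clone) {
    purge();
    // WeakReference does not override equals/hashCode, so each reference is its own key.
    clones.put(new WeakReference<>(clone, queue), Boolean.TRUE);
  }

  // Drop entries whose referent has already been garbage collected.
  private void purge() {
    Reference<? extends TrackedInput> ref;
    while ((ref = queue.poll()) != null) {
      clones.remove(ref);
    }
  }

  // Called when the original is closed: null out the buffer of every clone still alive.
  void closeAll() {
    for (WeakReference<TrackedInput> ref : clones.keySet()) {
      TrackedInput clone = ref.get();
      if (clone != null) {
        clone.buffer = null;
      }
    }
    clones.clear();
  }

  int size() {
    purge();
    return clones.size();
  }
}
```

Because the registry only holds weak references, it does not keep unclosed clones alive; entries for collected clones are removed the next time `register` or `size` runs.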

      If we respin 3.5, we should maybe also get this in.

      1. LUCENE-3588.patch
        7 kB
        Uwe Schindler
      2. LUCENE-3588.patch
        5 kB
        Robert Muir
      3. LUCENE-3588.patch
        4 kB
        Uwe Schindler
      4. LUCENE-3588-simpler.patch
        0.0 kB
        Uwe Schindler
      5. LUCENE-3588-simpler.patch
        5 kB
        Uwe Schindler
      6. LUCENE-3588-simpler.patch
        4 kB
        Uwe Schindler

        Activity

        Uwe Schindler added a comment -

        I missed one more possible NPE -> AlreadyClosedException transformation in getFilePointer. Committed revs 1205954 (trunk), 1205956 (3x)

        Uwe Schindler added a comment -

        Committed trunk revision: 1205430
        Committed 3.x revision: 1205434

        Doron Cohen added a comment -

        Patch (last one) works well for me - the new test fails without the fix and passes with the fix.

        It relies on shallow cloning of 'clones' - and so would break if WHM starts to implement Cloneable for some reason, but then the 'assert clone.clones == this.clones' in clone() guarantees early detection of this in the tests, cool.

        Uwe Schindler added a comment -

        New patch that no longer throws NPE: all NPEs are converted to AlreadyClosedExceptions in MMapIndexInput. This does not add overhead, as the try/catch blocks are already there.

        LUCENE-3588.patch is now the authoritative patch file.
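
The NPE-to-AlreadyClosedException conversion could be sketched as below. This is an illustration only: `SketchInput` is a hypothetical, heavily simplified class over a single buffer, the `AlreadyClosedException` here is a local stand-in for Lucene's class, and the pre-existing try/catch blocks mentioned above are not reproduced.

```java
import java.nio.ByteBuffer;

// Hypothetical local stand-in for Lucene's AlreadyClosedException.
class AlreadyClosedException extends IllegalStateException {
  AlreadyClosedException(String message) {
    super(message);
  }
}

// Hypothetical, simplified input over a single buffer.
class SketchInput {
  private ByteBuffer curBuf;

  SketchInput(ByteBuffer buf) {
    this.curBuf = buf;
  }

  byte readByte() {
    try {
      return curBuf.get();
    } catch (NullPointerException npe) {
      // curBuf was set to null by close(): report a meaningful error instead of an NPE.
      throw new AlreadyClosedException("SketchInput already closed");
    }
  }

  long getFilePointer() {
    try {
      return curBuf.position();
    } catch (NullPointerException npe) {
      throw new AlreadyClosedException("SketchInput already closed");
    }
  }

  void close() {
    curBuf = null;
  }
}
```

Catching the NPE rather than testing for null on every read keeps the normal read path free of extra checks.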

        Uwe Schindler added a comment -

        Improved patch:

        • The clones (even clones of clones) all share the same WeakHashMap with the original. Only the original MMapIndexInput will unset the buffers in all clones/cloned-clones.
        • This reduces the cost of creating clones (no HashMap instantiation, no ReferenceQueues,...)

        Added test with clone of clone.
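
A minimal sketch of the shared-map idea, assuming a hypothetical `SharedCloneInput` class with a byte array in place of the mapped buffers. That closing a clone only drops the clone's own buffer is an assumption of this sketch, not something confirmed by the patch.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for MMapIndexInput; "buffer" plays the role of the mapped ByteBuffer.
class SharedCloneInput implements Cloneable {
  private byte[] buffer;
  private boolean isClone = false;
  // Weak keys: a clone that is garbage collected drops out of the map on its own.
  // super.clone() copies this reference, so the original and all of its
  // (transitive) clones share one map instance.
  private final Map<SharedCloneInput, Boolean> clones = new WeakHashMap<>();

  SharedCloneInput(byte[] buffer) {
    this.buffer = buffer;
  }

  @Override
  public SharedCloneInput clone() {
    final SharedCloneInput clone;
    try {
      clone = (SharedCloneInput) super.clone(); // shallow copy
    } catch (CloneNotSupportedException e) {
      throw new AssertionError(e);
    }
    assert clone.clones == this.clones; // the map must be shared, not copied
    clone.isClone = true;
    synchronized (clones) {
      clones.put(clone, Boolean.TRUE);
    }
    return clone;
  }

  boolean isOpen() {
    synchronized (clones) {
      return buffer != null;
    }
  }

  void close() {
    synchronized (clones) {
      if (!isClone) {
        // Only the original unsets the buffers of all clones and clones of clones.
        for (SharedCloneInput c : clones.keySet()) {
          c.buffer = null;
        }
        clones.clear();
      }
      buffer = null; // assumption: a clone only drops its own buffer
    }
  }
}
```

Creating a clone costs one synchronized `put()` on the existing map; no per-clone map or reference queue is allocated.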

        Uwe Schindler added a comment -

        WeakHashMap silently discards GCed references during iteration. And the close() method synchronizes on the map, too.

        Dawid Weiss added a comment -

        I was thinking about this when looking at the code and I thought the intention of using CHM was to get an iterator that won't throw CME while iterating. If this isn't possible then you're right – same thing to use a decorated WhateverMap.

        Uwe Schindler added a comment -

        Here is a much simpler patch than the one yesterday (including Robert's test):
        The complexity added by ConcurrentHashMap with WeakReference and ReferenceQueue is nonsense, as CHM is optimized for many clients getting entries from the map. In our use case the only one who gets entries from the map is our close() method. When cloning, we only call put(), so it's always synchronized by CHM and no different from a standard synchronized WhateverMap.
        This patch uses the simple approach: use a native WeakHashMap where we synchronize on the put()/close() cleanups. This removes all Reference handling and simplifies the code a lot.

        I think this is ready to commit.

        Dawid Weiss added a comment -

        Looks good to me. Interesting solution.

        Robert Muir added a comment -

        +1, I added a simple test: it SIGSEGVs without the patch and passes with it.


          People

          • Assignee:
            Uwe Schindler
            Reporter:
            Uwe Schindler