Lucene - Core / LUCENE-3800

Readers wrapping other readers don't prevent usage if any of their subreaders was closed

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 4.0-ALPHA
    • Fix Version/s: 4.0-ALPHA
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      A recent trunk test run hit this problem:
      org.apache.lucene.index.TestReaderClosed.test
      fails because the inner reader is closed but the outer readers wrapping it are still open.

      I partially fixed the issue for SlowCompositeReaderWrapper and ParallelAtomicReader, but the test failed again. The interesting thing about this test is the following:

      The test opens a DirectoryReader, creates a searcher, closes the reader and then executes a search. This is not an issue if the closed reader is the one the search runs on directly. This test, however, uses LTC.newSearcher(wrap=true), which randomly wraps the passed reader with SlowCompositeReaderWrapper or ParallelAtomicReader, or with both. If you then close the original inner reader, the close is not detected when executing the search. This can cause a SIGSEGV when MMAP is used.

      The problem (in Slow* and Parallel*) is that both have their own Fields instances that are kept alive until the reader itself is closed. If the child reader is closed, the wrapping reader does not know and still uses its own Fields instance, which delegates to the inner readers. At this point no more ensureOpen checks are done, causing the failures.

      The first fix done in Slow and Parallel was to also call ensureOpen() on the subreader when requesting fields(). This works fine until you wrap twice: ParallelAtomicReader(SlowCompositeReaderWrapper(StandardDirectoryReader(segments_1:3:nrt _0(4.0):C42)))
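
      The double-wrap failure can be reproduced with a small self-contained model. This is only a sketch under stated assumptions: ToyReader, ToyLeaf and ToyWrapper are hypothetical stand-ins rather than Lucene classes, a Supplier<String> plays the role of the cached Fields instance, and IllegalStateException stands in for AlreadyClosedException.

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for a reader; not Lucene's IndexReader. */
abstract class ToyReader {
    private volatile boolean closed;

    final void ensureOpen() {
        if (closed) throw new IllegalStateException("this reader is closed");
    }

    final void close() { closed = true; }

    final boolean isClosed() { return closed; }

    /** Checked entry point that hands out a view doing no further checks. */
    abstract Supplier<String> fields();
}

final class ToyLeaf extends ToyReader {
    @Override
    Supplier<String> fields() {
        ensureOpen();
        // "GARBAGE" models reading from resources that were already released.
        return () -> isClosed() ? "GARBAGE" : "terms";
    }
}

final class ToyWrapper extends ToyReader {
    private final ToyReader in;
    // Created once and kept alive until this wrapper itself is closed.
    private final Supplier<String> cachedFields;

    ToyWrapper(ToyReader in) {
        this.in = in;
        Supplier<String> inner = in.fields();
        this.cachedFields = () -> inner.get();
    }

    @Override
    Supplier<String> fields() {
        ensureOpen();
        in.ensureOpen(); // the "first fix": also check the direct child
        return cachedFields;
    }
}

class DoubleWrapDemo {
    public static void main(String[] args) {
        ToyLeaf leaf = new ToyLeaf();
        ToyWrapper twice = new ToyWrapper(new ToyWrapper(leaf));
        leaf.close();
        // Neither wrapper notices: both are open, and so is the middle one.
        System.out.println(twice.fields().get()); // prints GARBAGE
    }
}
```

      With a single wrapper the child check throws as intended; with two, the outer wrapper only sees the still-open middle wrapper, so the close of the innermost reader goes undetected.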

      One solution would be to make ensureOpen also check all subreaders, but that would do the volatile checks far too often (with n the total number of subreaders and m the number of hierarchy levels, this is n^m), so we cannot do this. Currently we only have n*m, which is fine.

      The proposal for solving this (subreaders being closed under the hood of parent readers) is to use the readerClosedListeners. Whenever a composite or slow reader wraps another reader, it registers itself as interested in readerClosed events. When a subreader is then forcefully closed (e.g. by a programming error or this crazy test), we automatically close the parents, too.
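
      A minimal sketch of the listener idea, assuming hypothetical names (ListeningReader, addClosedListener) and a plain Runnable as the listener type; it models the proposal and does not use Lucene's actual readerClosedListener API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical reader that closes itself when any wrapped reader closes. */
class ListeningReader {
    private final List<Runnable> closedListeners = new CopyOnWriteArrayList<>();
    private final AtomicBoolean closed = new AtomicBoolean();

    /** A reader with no children is a leaf; a wrapper passes its children. */
    ListeningReader(ListeningReader... children) {
        for (ListeningReader child : children) {
            // Note: the child now holds a strong reference to this parent.
            child.addClosedListener(this::close);
        }
    }

    void addClosedListener(Runnable listener) {
        closedListeners.add(listener);
    }

    void ensureOpen() {
        if (closed.get()) throw new IllegalStateException("this reader is closed");
    }

    void close() {
        // Only the first close() notifies; the event then cascades upward.
        if (closed.compareAndSet(false, true)) {
            for (Runnable listener : closedListeners) {
                listener.run();
            }
        }
    }
}
```

      Closing the innermost reader of outer(mid(leaf)) closes mid and then outer, so each ensureOpen() still only checks its own flag.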

      We should also fix this in 3.x, if we have similar problems there (needs investigation).

      1. LUCENE-3800.patch
        13 kB
        Uwe Schindler
      2. LUCENE-3800.patch
        15 kB
        Uwe Schindler
      3. LUCENE-3800.patch
        13 kB
        Uwe Schindler

        Activity

        Robert Muir added a comment -

        The proposal for solving this (subreaders being closed under the hood of parent readers) is to use the readerClosedListeners. Whenever a composite or slow reader wraps another reader, it registers itself as interested in readerClosed events. When a subreader is then forcefully closed (e.g. by a programming error or this crazy test), we automatically close the parents, too.

        +1. free safety.

        Uwe Schindler added a comment -

        I am still playing around with different solutions; it's not so easy. I have a patch, but I have to think more about it.

        For now (to prevent test failures), I will commit a temporary fix to TestReaderClosed.test, so it does not wrap on newSearcher().

        Uwe Schindler added a comment -

        Until this is fixed, I added a workaround for the failing test in revision 1291071.

        Uwe Schindler added a comment -

        Patch for trunk that enforces a "soft-close" (makes the reader unusable) once a child reader (subreader, delegate, parallel reader) is closed.

        The implementation works like this:

        • A parent reader that delegates to one or more child readers (currently BaseCompositeReader, SlowCompositeReaderWrapper, FilterAtomicReader, ParallelAtomicReader) registers itself on every child/delegate reader. The registration is done using a weak reference, so there are no circular reference problems. To be able to use WeakHashMap, this patch explicitly enforces identity equals/hashCode on IndexReader (which we always assumed in the past for FieldCache and so on; now it is enforced, as in MMapIndexInput).
        • When a reader closes, it simply iterates over all registered parents and marks them as "unusable" (they are not really closed).
        • When somebody calls any method of a reader of which any child was closed, ensureOpen() will throw an AlreadyClosedException with a corresponding message.
        • You are still able to decRef/close readers that are marked as invalid (otherwise tests would fail). Disabling the parent reader when a child reader is closed is done purely for safety.
        • To make the ensureOpen() checks no more expensive than before, the extra boolean was not made volatile; instead, the happens-before and volatile behaviour explained in Michael Busch's talks is used to guard the "closedByChild" boolean.
        • The patch does not use readerClosedListener for two reasons: (1) RCL has no weak refs, and we want to avoid circular references between parents and children; (2) we have to recursively go up the parent chain, which may confuse conventional readerClosedListeners. Also, the parent readers are not actually closed, so calling the listeners would not be appropriate.
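
        The scheme in the list above can be sketched roughly as follows. This is not the committed patch: SoftCloseReader and markParentsUnusable are hypothetical names, IllegalStateException stands in for AlreadyClosedException, and the no-op addAndGet(0) is just one assumed way to piggyback the plain boolean on an existing volatile access.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical reader that becomes unusable once any child is closed. */
class SoftCloseReader {
    // Weak keys: a child never keeps its parents alive.
    private final Set<SoftCloseReader> parents = Collections.synchronizedSet(
            Collections.newSetFromMap(new WeakHashMap<SoftCloseReader, Boolean>()));
    private final AtomicInteger refCount = new AtomicInteger(1);
    // Deliberately not volatile; visibility rides on refCount (see below).
    private boolean closedByChild;

    SoftCloseReader(SoftCloseReader... children) {
        for (SoftCloseReader child : children) {
            child.parents.add(this);
        }
    }

    // WeakHashMap relies on equals/hashCode, so identity semantics are pinned.
    @Override
    public final boolean equals(Object other) { return this == other; }

    @Override
    public final int hashCode() { return System.identityHashCode(this); }

    final void ensureOpen() {
        // Volatile read first, then the plain read of closedByChild.
        if (refCount.get() <= 0) {
            throw new IllegalStateException("this reader is closed");
        }
        if (closedByChild) {
            throw new IllegalStateException(
                    "this reader cannot be used anymore: a child reader was closed");
        }
    }

    /** Still permitted on readers that were marked unusable. */
    final void close() {
        if (refCount.compareAndSet(1, 0)) {
            markParentsUnusable();
        }
    }

    private void markParentsUnusable() {
        synchronized (parents) { // required when iterating a synchronizedSet
            for (SoftCloseReader parent : parents) {
                parent.closedByChild = true;
                parent.refCount.addAndGet(0); // volatile write publishes the flag
                parent.markParentsUnusable(); // recurse up the parent chain
            }
        }
    }
}
```

        For outer(mid(leaf)), calling leaf.close() makes ensureOpen() throw on both mid and outer, while mid.close() and outer.close() can still be called afterwards.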
        Uwe Schindler added a comment -

        Improved patch with some minor things fixed:

        • Better test for the close child case
        • Fix Solr's TestDocSet, hopefully for the last time, by implementing the fake reader correctly as an AtomicReader subclass without passing null to FilterAtomicReader

        All tests pass.

        Uwe Schindler added a comment -

        New patch, as parts of it were already committed (LTC changes).

        Uwe Schindler added a comment -

        Committed trunk revision: 1292293

        Uwe Schindler added a comment -

        If this would be good for 3.x too, reopen. 3.x is safer because, if a child reader is closed, the parent readers mostly have no chance to do anything either.

        Yonik Seeley added a comment -

        Ouch - more weak references. I was hoping we could reduce the number of those (I've seen them cause worse GC problems for a number of people).
        But if I understand the description correctly, then without this patch things could core dump when using mmap directory?

        Uwe Schindler added a comment - - edited

        Yes! By the way, the same trick is used in MMapDirectory to keep track of its cloned IndexInputs.


          People

          • Assignee: Uwe Schindler
          • Reporter: Uwe Schindler