Lucene - Core
  1. Lucene - Core
  2. LUCENE-6500

ParallelCompositeReader does not always call closed listeners

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      CompositeParallelReader misses to call closed listeners when the reader which is provided at construction time does not wrap leaf readers directly, such as a multi reader over directory readers.

        Issue Links

          Activity

          Hide
          Adrien Grand added a comment -

          Here is a patch: it does not prevent closing sub readers anymore in the composite case, so that closed listeners will be called on the underlying parallel leaf readers which are exposed through leaves() and may be used to register closed listeners.

          Uwe Schindler svn blame indicates that you should be quite familiar with this class, would you mind having a look? Thanks!

          Show
          Adrien Grand added a comment - Here is a patch: it does not prevent closing sub readers anymore in the composite case, so that closed listeners will be called on the underlying parallel leaf readers which are exposed through leaves() and may be used to register closed listeners. Uwe Schindler svn blame indicates that you should be quite familiar with this class, would you mind having a look? Thanks!
          Hide
          Uwe Schindler added a comment -

          Oh oh, yes I completely rewrote this class in 4.0... I'll take a look.

          Show
          Uwe Schindler added a comment - Oh oh, yes I completely rewrote this class in 4.0... I'll take a look.
          Hide
          Uwe Schindler added a comment -

          The anonymous subclassing was added to explicitely don't call closelisteners on the "synthetic" readers. The same code is there 2 times... (for composites and for atomics).

          Closing all subreaders is done in ParallelCompositeReader's doClose() where it iterates over all "real childs" which are passed in ctor.

          I think the new problem is that (as you say) the close listeners may be registered on the synthetic leaves... I have to think about how to better handle this, with your patch it may now happen that a reader is closed multiple times. Which is not a problem in your test, but could be in reality.

          In addition there is now (after your patch) an assymetry between atomic leaves and composite sub readers... Does't the same problem apply to atomic readers?

          Unfrotuentaly my Eclipse strikes like Deutsche Bahn and says "GC overhead limit". I am unable to dig...

          Show
          Uwe Schindler added a comment - The anonymous subclassing was added to explicitely don't call closelisteners on the "synthetic" readers. The same code is there 2 times... (for composites and for atomics). Closing all subreaders is done in ParallelCompositeReader's doClose() where it iterates over all "real childs" which are passed in ctor. I think the new problem is that (as you say) the close listeners may be registered on the synthetic leaves... I have to think about how to better handle this, with your patch it may now happen that a reader is closed multiple times. Which is not a problem in your test, but could be in reality. In addition there is now (after your patch) an assymetry between atomic leaves and composite sub readers... Does't the same problem apply to atomic readers? Unfrotuentaly my Eclipse strikes like Deutsche Bahn and says "GC overhead limit". I am unable to dig...
          Hide
          Ryan Ernst added a comment -

          Which is not a problem in your test, but could be in reality

          Isn't the java contract for Closeable that close() is idempotent?

          Show
          Ryan Ernst added a comment - Which is not a problem in your test, but could be in reality Isn't the java contract for Closeable that close() is idempotent?
          Hide
          Uwe Schindler added a comment -

          It should be, but indexReaders call decRef() on close() and that complains if it gets < 0.

          Show
          Uwe Schindler added a comment - It should be, but indexReaders call decRef() on close() and that complains if it gets < 0.
          Hide
          Uwe Schindler added a comment -

          Sorry, you are right. It's idempoent! But there is still some problem. I have to debug through the code to understand why it was written like that. I just remember, it was tricky...

          Show
          Uwe Schindler added a comment - Sorry, you are right. It's idempoent! But there is still some problem. I have to debug through the code to understand why it was written like that. I just remember, it was tricky...
          Hide
          Adrien Grand added a comment -

          In addition there is now (after your patch) an assymetry between atomic leaves and composite sub readers... Does't the same problem apply to atomic readers?

          Since prepareSubReaders wraps leaves recursively with ParallelCompositeReader in the composite case, I was thinking it should be fine, since at least the lower-level which is wrapping leaf readers would wrap with a ParallelLeafReader that prevents closing of sub readers?

          For the record, no existing tests failed with the change I made.

          Show
          Adrien Grand added a comment - In addition there is now (after your patch) an assymetry between atomic leaves and composite sub readers... Does't the same problem apply to atomic readers? Since prepareSubReaders wraps leaves recursively with ParallelCompositeReader in the composite case, I was thinking it should be fine, since at least the lower-level which is wrapping leaf readers would wrap with a ParallelLeafReader that prevents closing of sub readers? For the record, no existing tests failed with the change I made.
          Hide
          Uwe Schindler added a comment - - edited

          Here is the problem, why the patch is problematic:
          The current logic is fine, but fails - as Adrien says - when you have a ParallelMultiReader on a MultiReader of another CompositeReaders.
          His fix works fine in the test because the test's MultiReader has closeSubReaders=true - so it does no refcounting. If you have closeSubReader=false on the parallel on and also on the child MultiReader. in that case it will fail with too many decRefs (haven't tried).
          I am not sure how the best fix looks like. It this issue urgent? I have to think a night about it I would prefer to have a better solution alltgether, this refcounting is horrible...

          Show
          Uwe Schindler added a comment - - edited Here is the problem, why the patch is problematic: The current logic is fine, but fails - as Adrien says - when you have a ParallelMultiReader on a MultiReader of another CompositeReaders. His fix works fine in the test because the test's MultiReader has closeSubReaders=true - so it does no refcounting. If you have closeSubReader=false on the parallel on and also on the child MultiReader. in that case it will fail with too many decRefs (haven't tried). I am not sure how the best fix looks like. It this issue urgent? I have to think a night about it I would prefer to have a better solution alltgether, this refcounting is horrible...
          Hide
          Uwe Schindler added a comment -

          Another solution to fix this would be to also add all those deeper nested synthetic subreaders to the completeReaderSet (see last line of ctor). In that case they can stay with docClose() empty (to not affect refcount). I will try this out. I will also add a test for the case I mentioned before.

          Show
          Uwe Schindler added a comment - Another solution to fix this would be to also add all those deeper nested synthetic subreaders to the completeReaderSet (see last line of ctor). In that case they can stay with docClose() empty (to not affect refcount). I will try this out. I will also add a test for the case I mentioned before.
          Hide
          Uwe Schindler added a comment -

          A second solution would be to give up on completely mirroring the whole nested structure. We could simply create a ParallelCompositeReader with a ParallelLeafReader for each leave(), leaving out all inner composites. This would simplify the whole thing completely. There is (in my opinion) no reason to not flatten the structure. The alignment of readers would still match.

          Show
          Uwe Schindler added a comment - A second solution would be to give up on completely mirroring the whole nested structure. We could simply create a ParallelCompositeReader with a ParallelLeafReader for each leave(), leaving out all inner composites. This would simplify the whole thing completely. There is (in my opinion) no reason to not flatten the structure. The alignment of readers would still match.
          Hide
          Adrien Grand added a comment -

          It this issue urgent?

          No, I don't think it's urgent. The reason I opened it is just that we hit a test failure in Elasticsearch which was due to core closed listeners not being called, and the root cause was this issue. But we don't use ParallelCompositeReader, it was introduced by LuceneTestCase.newSearcher.

          You removed this test:

          I think it's still here, I just refactored the test to make it easier to test all combinations of `closeSubReaders` and whether sub readers are leaf or composite readers. You can try to revert changes in src/test and TestParallelCompositeReader will still pass.

          Another solution to fix this would be to also add all those deeper nested synthetic subreaders to the completeReaderSet (see last line of ctor). In that case they can stay with docClose() empty (to not affect refcount). I will try this out.

          Oh I see, so the bug is that we are not adding all created synthetic readers to this set currently. This sounds like a good fix to me.

          I liked the fact that you mentioned the second option would simplify the whole thing but I'm also afraid this would be a more significant change. So maybe we can first apply your first idea to fix the bug and later think about whether it would make sense to flatten the whole IndexReader structure to simplify this class.

          Show
          Adrien Grand added a comment - It this issue urgent? No, I don't think it's urgent. The reason I opened it is just that we hit a test failure in Elasticsearch which was due to core closed listeners not being called, and the root cause was this issue. But we don't use ParallelCompositeReader, it was introduced by LuceneTestCase.newSearcher. You removed this test: I think it's still here, I just refactored the test to make it easier to test all combinations of `closeSubReaders` and whether sub readers are leaf or composite readers. You can try to revert changes in src/test and TestParallelCompositeReader will still pass. Another solution to fix this would be to also add all those deeper nested synthetic subreaders to the completeReaderSet (see last line of ctor). In that case they can stay with docClose() empty (to not affect refcount). I will try this out. Oh I see, so the bug is that we are not adding all created synthetic readers to this set currently. This sounds like a good fix to me. I liked the fact that you mentioned the second option would simplify the whole thing but I'm also afraid this would be a more significant change. So maybe we can first apply your first idea to fix the bug and later think about whether it would make sense to flatten the whole IndexReader structure to simplify this class.
          Hide
          Uwe Schindler added a comment -

          Hi I implemented the last approach. Makes code much simpler. The ParallelComposite Reader now has the same leaves, but the structure was flattened. By that, the inner composite readers are never wrapped. We just build a new reader with the same leaves as the original, but flattened structure. I will now just add a test that it also works with MultiReaders that are marked as non-closed (which are incRefed and decRefed on close).

          Show
          Uwe Schindler added a comment - Hi I implemented the last approach. Makes code much simpler. The ParallelComposite Reader now has the same leaves, but the structure was flattened. By that, the inner composite readers are never wrapped. We just build a new reader with the same leaves as the original, but flattened structure. I will now just add a test that it also works with MultiReaders that are marked as non-closed (which are incRefed and decRefed on close).
          Hide
          Uwe Schindler added a comment -

          Here is my patch. The code that builds the reader got much easier (as nice side effect). This is also much easier to understand.

          Adrien Grand: what do you think?

          We can also open a new issue that takes care of flattening and this one just says "broken by the new one".

          Show
          Uwe Schindler added a comment - Here is my patch. The code that builds the reader got much easier (as nice side effect). This is also much easier to understand. Adrien Grand : what do you think? We can also open a new issue that takes care of flattening and this one just says "broken by the new one".
          Hide
          Adrien Grand added a comment -

          Thank you Uwe for the patch, I see how it fixes the issue. I'm just wondering if returning the list of leaves instead of sub readers now in getSequentialSubReaders could break anything?

          Show
          Adrien Grand added a comment - Thank you Uwe for the patch, I see how it fixes the issue. I'm just wondering if returning the list of leaves instead of sub readers now in getSequentialSubReaders could break anything?
          Hide
          Uwe Schindler added a comment -

          We would need to announce that change in the CHANGES.txt (that we flatten the reader structure). This would only be a problem for those people that rely on the fact that the reader structure is preserved...

          This is why I said we should open new issue and delay this one until the other one is fixed. The fix for this one would be a no-op.

          Show
          Uwe Schindler added a comment - We would need to announce that change in the CHANGES.txt (that we flatten the reader structure). This would only be a problem for those people that rely on the fact that the reader structure is preserved... This is why I said we should open new issue and delay this one until the other one is fixed. The fix for this one would be a no-op.
          Hide
          Uwe Schindler added a comment - - edited

          getSequentialSubReaders is protected... It cannot be called from outside! I did this change for 4.0, because I wanted to hide the "real" reader structure from search APIs. It makes no difference to IndexSearchers how the readers are built. It is an implementation detail.

          You can of course get the reader structure from the context, but I don't think anybody would rely on its real structure after warpping.

          Show
          Uwe Schindler added a comment - - edited getSequentialSubReaders is protected... It cannot be called from outside! I did this change for 4.0, because I wanted to hide the "real" reader structure from search APIs. It makes no difference to IndexSearchers how the readers are built. It is an implementation detail. You can of course get the reader structure from the context, but I don't think anybody would rely on its real structure after warpping.
          Hide
          Adrien Grand added a comment -

          OK, then +1 on your plan to open another issue to flatten the structure and make it close this one. I suppose the only point in keeping the tree structure like MultiReader does is to still be able to call closed listeners on intermediate readers too, which ParallelCompositeReader also supports but by creating this completeReaderSet.

          Show
          Adrien Grand added a comment - OK, then +1 on your plan to open another issue to flatten the structure and make it close this one. I suppose the only point in keeping the tree structure like MultiReader does is to still be able to call closed listeners on intermediate readers too, which ParallelCompositeReader also supports but by creating this completeReaderSet .
          Hide
          Adrien Grand added a comment -

          You can of course get the reader structure from the context, but I don't think anybody would rely on its real structure after warpping.

          Then maybe we should stop exposing it in IndexReaderContext?

          Show
          Adrien Grand added a comment - You can of course get the reader structure from the context, but I don't think anybody would rely on its real structure after warpping. Then maybe we should stop exposing it in IndexReaderContext?
          Hide
          Uwe Schindler added a comment -

          I think the complexity to preserve reader structure was done in ParallelCompositeReader before the change with getSequentialSubs. This was a large refactoring (I also made the whole thing unmodifiable lists, to make sure MultiReaders cannot be modified afterwards).

          The main reason for MultiReaders is to make it possible for users to combine readers as they like. But that is only during "construction" time. At "usage" time, the structure is no longer interesting.

          The problem with ParallelCompositeReader was just because it used this API, because Java allows to call protected methods from same package... (which is a bug in my opinion). Protected methods are just there for subclasses and should be hidden to the outside! (my personal opinion).

          Show
          Uwe Schindler added a comment - I think the complexity to preserve reader structure was done in ParallelCompositeReader before the change with getSequentialSubs. This was a large refactoring (I also made the whole thing unmodifiable lists, to make sure MultiReaders cannot be modified afterwards). The main reason for MultiReaders is to make it possible for users to combine readers as they like. But that is only during "construction" time. At "usage" time, the structure is no longer interesting. The problem with ParallelCompositeReader was just because it used this API, because Java allows to call protected methods from same package... (which is a bug in my opinion). Protected methods are just there for subclasses and should be hidden to the outside! (my personal opinion).
          Hide
          Uwe Schindler added a comment - - edited

          My personal opinion: We should hide ParallelLeafReader completely as impl detail. ParallelCompositeReader should just take any IndexReader in its ctor... (just an idea). Maybe it could be ParallelReader like it was back in early times...

          Show
          Uwe Schindler added a comment - - edited My personal opinion: We should hide ParallelLeafReader completely as impl detail. ParallelCompositeReader should just take any IndexReader in its ctor... (just an idea). Maybe it could be ParallelReader like it was back in early times...
          Hide
          Uwe Schindler added a comment - - edited

          Then maybe we should stop exposing it in IndexReaderContext?

          I just checked it: Its called nowhere! So +1, just needs to be maybe protected. I think this goes in line with the separate issue to nuke the IndexReaderContext at all...

          Show
          Uwe Schindler added a comment - - edited Then maybe we should stop exposing it in IndexReaderContext? I just checked it: Its called nowhere! So +1, just needs to be maybe protected. I think this goes in line with the separate issue to nuke the IndexReaderContext at all...
          Hide
          Uwe Schindler added a comment -

          I opened LUCENE-6501.

          Show
          Uwe Schindler added a comment - I opened LUCENE-6501 .
          Hide
          ASF subversion and git services added a comment -

          Commit 1681998 from Uwe Schindler in branch 'dev/trunk'
          [ https://svn.apache.org/r1681998 ]

          LUCENE-6501: Flatten subreader structure in ParallelCompositeReader (fixes close listener bug LUCENE-6500)

          Show
          ASF subversion and git services added a comment - Commit 1681998 from Uwe Schindler in branch 'dev/trunk' [ https://svn.apache.org/r1681998 ] LUCENE-6501 : Flatten subreader structure in ParallelCompositeReader (fixes close listener bug LUCENE-6500 )
          Hide
          ASF subversion and git services added a comment -

          Commit 1682000 from Uwe Schindler in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1682000 ]

          Merged revision(s) 1681998 from lucene/dev/trunk:
          LUCENE-6501: Flatten subreader structure in ParallelCompositeReader (fixes close listener bug LUCENE-6500)

          Show
          ASF subversion and git services added a comment - Commit 1682000 from Uwe Schindler in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1682000 ] Merged revision(s) 1681998 from lucene/dev/trunk: LUCENE-6501 : Flatten subreader structure in ParallelCompositeReader (fixes close listener bug LUCENE-6500 )
          Hide
          Uwe Schindler added a comment -

          Fixed through subreader flattening in LUCENE-6501. Thanks Adrien for reporting!

          Show
          Uwe Schindler added a comment - Fixed through subreader flattening in LUCENE-6501 . Thanks Adrien for reporting!
          Hide
          Shalin Shekhar Mangar added a comment -

          Bulk close for 5.3.0 release

          Show
          Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

            People

            • Assignee:
              Uwe Schindler
              Reporter:
              Adrien Grand
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development