Lucene - Core
  1. Lucene - Core
  2. LUCENE-6533

SlowCompositeReaderWrapper is caching an AssertingBits instance

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.3, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      I hit this curious failure in the new TestGeoPointQuery:

         [junit4]   2> มิ.ย. 09, 2015 12:49:55 ก่อนเที่ยง com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException
         [junit4]   2> WARNING: Uncaught exception in thread: Thread[T1,5,TGRP-TestGeoPointQuery]
         [junit4]   2> java.lang.AssertionError: Bits are only supposed to be consumed in the thread in which they have been acquired. But was acquired in Thread[TEST-TestGeoPointQuery.testRandom-seed#[173A293B58C5F8A1],5,TGRP-TestGeoPointQuery] and consumed in Thread[T1,5,TGRP-TestGeoPointQuery].
         [junit4]   2> 	at __randomizedtesting.SeedInfo.seed([173A293B58C5F8A1]:0)
         [junit4]   2> 	at org.apache.lucene.index.AssertingLeafReader.assertThread(AssertingLeafReader.java:39)
         [junit4]   2> 	at org.apache.lucene.index.AssertingLeafReader.access$000(AssertingLeafReader.java:33)
         [junit4]   2> 	at org.apache.lucene.index.AssertingLeafReader$AssertingBits.get(AssertingLeafReader.java:769)
         [junit4]   2> 	at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight$1.matches(LRUQueryCache.java:606)
         [junit4]   2> 	at org.apache.lucene.search.TwoPhaseIterator$1.doNext(TwoPhaseIterator.java:69)
         [junit4]   2> 	at org.apache.lucene.search.TwoPhaseIterator$1.nextDoc(TwoPhaseIterator.java:57)
         [junit4]   2> 	at org.apache.lucene.search.ConstantScoreScorer.nextDoc(ConstantScoreScorer.java:78)
         [junit4]   2> 	at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:204)
         [junit4]   2> 	at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:176)
         [junit4]   2> 	at org.apache.lucene.search.AssertingBulkScorer.score(AssertingBulkScorer.java:79)
         [junit4]   2> 	at org.apache.lucene.search.AssertingBulkScorer.score(AssertingBulkScorer.java:63)
         [junit4]   2> 	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:563)
         [junit4]   2> 	at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:92)
         [junit4]   2> 	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:367)
         [junit4]   2> 	at org.apache.lucene.search.TestGeoPointQuery$1._run(TestGeoPointQuery.java:382)
         [junit4]   2> 	at org.apache.lucene.search.TestGeoPointQuery$1.run(TestGeoPointQuery.java:309)
         [junit4]   2> 
         [junit4]   2> NOTE: reproduce with: ant test  -Dtestcase=TestGeoPointQuery -Dtests.method=testRandom -Dtests.seed=173A293B58C5F8A1 -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=th -Dtests.timezone=Africa/Djibouti -Dtests.asserts=true -Dtests.file.encoding=UTF-8
      
      1. LUCENE-6533.patch
        4 kB
        Michael McCandless
      2. LUCENE-6533.patch
        3 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        I think what's happening is newSearcher is using an ExecutorService, so I have multiple threads calling acceptedDocs.get, and AssertingLeafReader wrapped acceptedDocs as an AssertingBits.

        But I'm confused ... isn't it OK if more than one thread calls .get on the acceptedDocs?

        Yet why are no other tests failing ... seems more likely to be a test bug in this new test!

        Show
        Michael McCandless added a comment - I think what's happening is newSearcher is using an ExecutorService, so I have multiple threads calling acceptedDocs.get, and AssertingLeafReader wrapped acceptedDocs as an AssertingBits. But I'm confused ... isn't it OK if more than one thread calls .get on the acceptedDocs? Yet why are no other tests failing ... seems more likely to be a test bug in this new test!
        Hide
        Michael McCandless added a comment -

        Oh I see, we wrap as AssertingDocs private to that thread ... so there is somehow something wrong w/ the test. I'll dig.

        Show
        Michael McCandless added a comment - Oh I see, we wrap as AssertingDocs private to that thread ... so there is somehow something wrong w/ the test. I'll dig.
        Hide
        Michael McCandless added a comment -

        OK this is not a GeoPoint issue ... it's a problem with how the readers are wrapped in LuceneTestCase.wrapReader.

        I isolated to the attached test case.

        It happens when we first wrap the reader in AssertLeafReader, and then in SlowCompositeReaderWrapper.

        The issue is that SlowCompositeReaderWrapper calls MultiFields.getBits to get the live docs up front in the ctor, which returns an AssertingBits, and then this AssertingBits is easily shared across threads.

        I'm not sure how to fix it ... I suppose we could "fix" SlowCompositeReaderWrapper to not cache its Bits like that, but that seems overly cruel: it seems like AssertingBits is overstepping its purpose somehow. It should only be used to wrap a getLiveDocs returned e.g. to a search thread ...

        Show
        Michael McCandless added a comment - OK this is not a GeoPoint issue ... it's a problem with how the readers are wrapped in LuceneTestCase.wrapReader. I isolated to the attached test case. It happens when we first wrap the reader in AssertLeafReader, and then in SlowCompositeReaderWrapper. The issue is that SlowCompositeReaderWrapper calls MultiFields.getBits to get the live docs up front in the ctor, which returns an AssertingBits, and then this AssertingBits is easily shared across threads. I'm not sure how to fix it ... I suppose we could "fix" SlowCompositeReaderWrapper to not cache its Bits like that, but that seems overly cruel: it seems like AssertingBits is overstepping its purpose somehow. It should only be used to wrap a getLiveDocs returned e.g. to a search thread ...
        Hide
        Adrien Grand added a comment -

        Thanks Mike for creating an isolated test case.

        The reason why I added this assertion in the first place was to make live docs consistent with terms enums and dvs, which are cached per thread and should not leak. While it's not really needed today given how live docs are implemented, it could be if we ever want to have disk-based live docs.

        We could fix the issue by not caching live docs anymore in SlowCompositeReaderWrapper and moving the MultiFields.getLiveDocs(in) call from SlowCompositeReaderWrapper constructor to SlowCompositeReaderWrapper.getLiveDocs (just like we arleady do for our other random-access data-structures like norms or doc values).

        Show
        Adrien Grand added a comment - Thanks Mike for creating an isolated test case. The reason why I added this assertion in the first place was to make live docs consistent with terms enums and dvs, which are cached per thread and should not leak. While it's not really needed today given how live docs are implemented, it could be if we ever want to have disk-based live docs. We could fix the issue by not caching live docs anymore in SlowCompositeReaderWrapper and moving the MultiFields.getLiveDocs(in) call from SlowCompositeReaderWrapper constructor to SlowCompositeReaderWrapper.getLiveDocs (just like we arleady do for our other random-access data-structures like norms or doc values).
        Hide
        Adrien Grand added a comment -

        I said "could" because another option would be to consider that this assertion is abusive and that we will never need per-thread state for live docs. But I personally think it is sane, especially given that we tend to think about live docs and doc values as the same thing: random-access, updateable, ...

        Show
        Adrien Grand added a comment - I said "could" because another option would be to consider that this assertion is abusive and that we will never need per-thread state for live docs. But I personally think it is sane, especially given that we tend to think about live docs and doc values as the same thing: random-access, updateable, ...
        Hide
        Michael McCandless added a comment -

        Thanks Adrien Grand, that makes sense ... here's a new patch that removes the live docs caching in SCRW ... test now passes, and the original seed in the geo test also passes! I'll commit soon.

        Show
        Michael McCandless added a comment - Thanks Adrien Grand , that makes sense ... here's a new patch that removes the live docs caching in SCRW ... test now passes, and the original seed in the geo test also passes! I'll commit soon.
        Hide
        Adrien Grand added a comment -

        +1

        Show
        Adrien Grand added a comment - +1
        Hide
        ASF subversion and git services added a comment -

        Commit 1684615 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1684615 ]

        LUCENE-6533: don't cache live docs in SlowCompositeReaderWrapper

        Show
        ASF subversion and git services added a comment - Commit 1684615 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1684615 ] LUCENE-6533 : don't cache live docs in SlowCompositeReaderWrapper
        Hide
        ASF subversion and git services added a comment -

        Commit 1684625 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1684625 ]

        LUCENE-6533: don't cache live docs in SlowCompositeReaderWrapper

        Show
        ASF subversion and git services added a comment - Commit 1684625 from Michael McCandless in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1684625 ] LUCENE-6533 : don't cache live docs in SlowCompositeReaderWrapper
        Hide
        Michael McCandless added a comment -

        Thanks Adrien Grand!

        Show
        Michael McCandless added a comment - Thanks Adrien Grand !
        Hide
        Shalin Shekhar Mangar added a comment -

        Bulk close for 5.3.0 release

        Show
        Shalin Shekhar Mangar added a comment - Bulk close for 5.3.0 release

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Michael McCandless
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development