Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-5796

Scorer.getChildren() can throw or hide a subscorer for some boolean queries

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Minor
    • Resolution: Fixed
    • 4.9
    • 4.10, 6.0
    • core/search
    • None
    • New, Patch Available

    Description

      I've isolated two example boolean queries that don't behave with release 4.9 of Lucene.

      1. A BooleanQuery with three SHOULD clauses and a minimumNumberShouldMatch of 2 will throw an ArrayIndexOutOfBoundsException.
        java.lang.ArrayIndexOutOfBoundsException: 2
        	at __randomizedtesting.SeedInfo.seed([2F79B3DF917D071B:2539E6DBC4DF793C]:0)
        	at org.apache.lucene.search.MinShouldMatchSumScorer.getChildren(MinShouldMatchSumScorer.java:119)
        	at org.apache.lucene.search.TestBooleanQueryVisitSubscorers$ScorerSummarizingCollector.summarizeScorer(TestBooleanQueryVisitSubscorers.java:261)
        	at org.apache.lucene.search.TestBooleanQueryVisitSubscorers$ScorerSummarizingCollector.setScorer(TestBooleanQueryVisitSubscorers.java:238)
        	at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:161)
        	at org.apache.lucene.search.AssertingBulkScorer.score(AssertingBulkScorer.java:64)
        	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:621)
        	at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:94)
        	at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:309)
        	at org.apache.lucene.search.TestBooleanQueryVisitSubscorers.testGetChildrenMinShouldMatchSumScorer(TestBooleanQueryVisitSubscorers.java:196)
        
      2. A BooleanQuery with two should clauses, one of which is a miss for all documents in the current segment will accidentally mask the scorer that was a hit.

      Unit tests and patch based on branch_4x are available and will be attached as soon as this ticket has a number.

      They are immediately available on GitHub on branch shebiki/bqgetchildren as commit c64bb6f.

      I took the liberty of naming the relationship in BoostingScorer.getChildren() BOOSTING. Suspect someone will offer a better name for this. Here is a summary of the various relationships in play for all Scorer.getChildren() implementations on branch_4x to help choose.

      class relationships
      org.apache.lucene.search.AssertingScorer SHOULD
      org.apache.lucene.search.join.ToParentBlockJoinQuery.BlockJoinScorer BLOCK_JOIN
      org.apache.lucene.search.ConjunctionScorer MUST
      org.apache.lucene.search.ConstantScoreQuery.ConstantScorer constant
      org.apache.lucene.queries.function.BoostedQuery.CustomScorer CUSTOM
      org.apache.lucene.queries.CustomScoreQuery.CustomScorer CUSTOM
      org.apache.lucene.search.DisjunctionScorer SHOULD
      org.apache.lucene.facet.DrillSidewaysScorer.FakeScorer MUST
      org.apache.lucene.search.FilterScorer calls in.getChildren()
      org.apache.lucene.search.ScoreCachingWrappingScorer CACHED
      org.apache.lucene.search.FilteredQuery.LeapFrogScorer FILTERED
      org.apache.lucene.search.MinShouldMatchSumScorer SHOULD
      org.apache.lucene.search.FilteredQuery FILTERED
      org.apache.lucene.search.ReqExclScorer MUST
      org.apache.lucene.search.ReqOptSumScorer MUST, SHOULD
      org.apache.lucene.search.join.ToChildBlockJoinQuery BLOCK_JOIN

      I also removed FilterScorer.getChildren() to prevent mistakes and force subclasses to provide a correct implementation.

      Attachments

        1. LUCENE-5796.patch
          7 kB
          Terry Smith

        Activity

          People

            Unassigned Unassigned
            shebiki Terry Smith
            Votes:
            1 Vote for this issue
            Watchers:
            2 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: