Lucene - Core
  1. Lucene - Core
  2. LUCENE-5222

TestExpressionSorts fails sometimes when using expression returning score

    Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 4.6, 6.0
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Jenkins picked this up. Repeat with:

      ant test  -Dtestcase=TestExpressionSorts -Dtests.method=testQueries -Dtests.seed=115AD00ED89D9F7B -Dtests.multiplier=3 -Dtests.slow=true -Dtests.locale=no_NO -Dtests.timezone=America/Nassau -Dtests.file.encoding=US-ASCII
      

      It appears to have to do with scoring, as removing the score sort from the original sorts causes the tests to pass. If you remove the possible discrepancy between doDocScores and docMaxScore params to searcher.search, then the test gets farther before failing.

        Activity

        Hide
        Robert Muir added a comment -

        Note this only happens with the paging collector... searchAfter.
        If u change the test to page the old fashioned way its fine.

        I suspect a bug in topfieldcollector.pagingfieldcollector. eg when there are multiple sorts and tiebreaks or something. I don't even think testsearchafter tests the case of multiple sorts this well.

        Show
        Robert Muir added a comment - Note this only happens with the paging collector... searchAfter. If u change the test to page the old fashioned way its fine. I suspect a bug in topfieldcollector.pagingfieldcollector. eg when there are multiple sorts and tiebreaks or something. I don't even think testsearchafter tests the case of multiple sorts this well.
        Hide
        Ryan Ernst added a comment -

        A couple things I found:

        • scorer.score() in ScoreFunctionValues is sometimes returning NaN
        • If I add the last two boolean values explicitly as true (doDocScores and doMaxScore) to both search() and searchAfter() calls, the test succeeds
        Show
        Ryan Ernst added a comment - A couple things I found: scorer.score() in ScoreFunctionValues is sometimes returning NaN If I add the last two boolean values explicitly as true (doDocScores and doMaxScore) to both search() and searchAfter() calls, the test succeeds
        Hide
        Robert Muir added a comment -

        Its not returning NaN, when you say 'doDocScores', indexsearcher confusingly fills a NaN.

        I think we have way too many booleans, optimizations, and specializations here...

        I dont think anyone tests this paging with multiple sorts... ill commit something to TestSearchAfter to try to break these collectors.

        Show
        Robert Muir added a comment - Its not returning NaN, when you say 'doDocScores', indexsearcher confusingly fills a NaN. I think we have way too many booleans, optimizations, and specializations here... I dont think anyone tests this paging with multiple sorts... ill commit something to TestSearchAfter to try to break these collectors.
        Hide
        Robert Muir added a comment -

        If I add the last two boolean values explicitly as true (doDocScores and doMaxScore) to both search() and searchAfter() calls, the test succeeds

        Indeed: I just tested this and it works (see patch below)

        Additionally i noticed TestSearchAfter in core somehow "hides the bug" with this stuff in its test:

        if (sort == Sort.RELEVANCE) {
                 paged = searcher.searchAfter(lastBottom, query, filter, pageSize, sort, true, doMaxScore);
        

        So there is a bug here in searchAfter/TopFieldCollector, and i'm also unhappy TestSearchAfter is not testing multiple sorts!

        Index: src/test/org/apache/lucene/expressions/TestExpressionSorts.java
        ===================================================================
        --- src/test/org/apache/lucene/expressions/TestExpressionSorts.java	(revision 1524082)
        +++ src/test/org/apache/lucene/expressions/TestExpressionSorts.java	(working copy)
        @@ -131,7 +131,10 @@
         
           void assertQuery(Query query, Filter filter, Sort sort) throws Exception {
             int size = _TestUtil.nextInt(random(), 1, searcher.getIndexReader().maxDoc()/5);
        -    TopDocs expected = searcher.search(query, filter, size, sort, random().nextBoolean(), random().nextBoolean());
        +    // consume randomness
        +    random().nextBoolean();
        +    random().nextBoolean();
        +    TopDocs expected = searcher.search(query, filter, size, sort, true, false);
             
             // make our actual sort, mutating original by replacing some of the 
             // sortfields with equivalent expressions
        @@ -152,12 +155,15 @@
             }
             
             Sort mutatedSort = new Sort(mutated);
        -    TopDocs actual = searcher.search(query, filter, size, mutatedSort, random().nextBoolean(), random().nextBoolean());
        +    // consume randomness
        +    random().nextBoolean();
        +    random().nextBoolean();
        +    TopDocs actual = searcher.search(query, filter, size, mutatedSort, true, false);
             CheckHits.checkEqual(query, expected.scoreDocs, actual.scoreDocs);
             
             if (size < actual.totalHits) {
        -      expected = searcher.searchAfter(expected.scoreDocs[size-1], query, filter, size, sort);
        -      actual = searcher.searchAfter(actual.scoreDocs[size-1], query, filter, size, mutatedSort);
        +      expected = searcher.searchAfter(expected.scoreDocs[size-1], query, filter, size, sort, true, false);
        +      actual = searcher.searchAfter(actual.scoreDocs[size-1], query, filter, size, mutatedSort, true, false);
               CheckHits.checkEqual(query, expected.scoreDocs, actual.scoreDocs);
             }
           }
        
        Show
        Robert Muir added a comment - If I add the last two boolean values explicitly as true (doDocScores and doMaxScore) to both search() and searchAfter() calls, the test succeeds Indeed: I just tested this and it works (see patch below) Additionally i noticed TestSearchAfter in core somehow "hides the bug" with this stuff in its test: if (sort == Sort.RELEVANCE) { paged = searcher.searchAfter(lastBottom, query, filter, pageSize, sort, true, doMaxScore); So there is a bug here in searchAfter/TopFieldCollector, and i'm also unhappy TestSearchAfter is not testing multiple sorts! Index: src/test/org/apache/lucene/expressions/TestExpressionSorts.java =================================================================== --- src/test/org/apache/lucene/expressions/TestExpressionSorts.java (revision 1524082) +++ src/test/org/apache/lucene/expressions/TestExpressionSorts.java (working copy) @@ -131,7 +131,10 @@ void assertQuery(Query query, Filter filter, Sort sort) throws Exception { int size = _TestUtil.nextInt(random(), 1, searcher.getIndexReader().maxDoc()/5); - TopDocs expected = searcher.search(query, filter, size, sort, random().nextBoolean(), random().nextBoolean()); + // consume randomness + random().nextBoolean(); + random().nextBoolean(); + TopDocs expected = searcher.search(query, filter, size, sort, true, false); // make our actual sort, mutating original by replacing some of the // sortfields with equivalent expressions @@ -152,12 +155,15 @@ } Sort mutatedSort = new Sort(mutated); - TopDocs actual = searcher.search(query, filter, size, mutatedSort, random().nextBoolean(), random().nextBoolean()); + // consume randomness + random().nextBoolean(); + random().nextBoolean(); + TopDocs actual = searcher.search(query, filter, size, mutatedSort, true, false); CheckHits.checkEqual(query, expected.scoreDocs, actual.scoreDocs); if (size < actual.totalHits) { - expected = searcher.searchAfter(expected.scoreDocs[size-1], query, filter, size, sort); - actual = searcher.searchAfter(actual.scoreDocs[size-1], query, filter, size, mutatedSort); + expected = searcher.searchAfter(expected.scoreDocs[size-1], query, filter, size, sort, true, false); + actual = searcher.searchAfter(actual.scoreDocs[size-1], query, filter, size, mutatedSort, true, false); CheckHits.checkEqual(query, expected.scoreDocs, actual.scoreDocs); } }
        Hide
        Michael McCandless added a comment -

        One thing I just discovered: the seed causes newSearcher to pass an ExecutorService to IndexSearcher ... if I turn that off (keeping randomness consumption constant) then the test passes ...

        Show
        Michael McCandless added a comment - One thing I just discovered: the seed causes newSearcher to pass an ExecutorService to IndexSearcher ... if I turn that off (keeping randomness consumption constant) then the test passes ...
        Hide
        Robert Muir added a comment -

        Ok, thats definitely the issue: the problem is that when using an Executor, IndexSearcher searches each segment and then merges the results with a FakeScorer.

        it has a special hack for this case where it actually ignores the boolean options you set, if the sort itself contains a relevance comparator:

        doDocScores || sort.needsScores()
        

        The current implementation is

          /** Whether the relevance score is needed to sort documents. */
          boolean needsScores() {
            for (SortField sortField : fields) {
              if (sortField.getType() == SortField.Type.SCORE) {
                return true;
              }
            }
            return false;
          }
        

        So in the case of the expression sortfield (or any other similar sortfield), the hack does not work of course, because its a CUSTOM sortfield.

        So I think we should change this hack to call SortField.needsScores() ? This way if someone has a custom one, they can just return true here and all this works.

        I will make a stab at a patch.

        Show
        Robert Muir added a comment - Ok, thats definitely the issue: the problem is that when using an Executor, IndexSearcher searches each segment and then merges the results with a FakeScorer. it has a special hack for this case where it actually ignores the boolean options you set, if the sort itself contains a relevance comparator: doDocScores || sort.needsScores() The current implementation is /** Whether the relevance score is needed to sort documents. */ boolean needsScores() { for (SortField sortField : fields) { if (sortField.getType() == SortField.Type.SCORE) { return true ; } } return false ; } So in the case of the expression sortfield (or any other similar sortfield), the hack does not work of course, because its a CUSTOM sortfield. So I think we should change this hack to call SortField.needsScores() ? This way if someone has a custom one, they can just return true here and all this works. I will make a stab at a patch.
        Hide
        Robert Muir added a comment -

        Here's a patch. Its a simple solution for now, I think we could optimize it more later?

        Show
        Robert Muir added a comment - Here's a patch. Its a simple solution for now, I think we could optimize it more later?
        Hide
        ASF subversion and git services added a comment -

        Commit 1524575 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1524575 ]

        LUCENE-5222: fix expression sort with executorService

        Show
        ASF subversion and git services added a comment - Commit 1524575 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1524575 ] LUCENE-5222 : fix expression sort with executorService
        Hide
        ASF subversion and git services added a comment -

        Commit 1524580 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1524580 ]

        LUCENE-5222: fix expression sort with executorService

        Show
        ASF subversion and git services added a comment - Commit 1524580 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1524580 ] LUCENE-5222 : fix expression sort with executorService
        Hide
        Robert Muir added a comment -

        Thanks Ryan and Mike, I opened LUCENE-5226 to make this not always return true.

        I think we can do it relatively cleanly, I will try. (I just wanted to fix the bug first).

        Show
        Robert Muir added a comment - Thanks Ryan and Mike, I opened LUCENE-5226 to make this not always return true. I think we can do it relatively cleanly, I will try. (I just wanted to fix the bug first).
        Hide
        ASF subversion and git services added a comment -

        Commit 1524779 from Robert Muir in branch 'dev/trunk'
        [ https://svn.apache.org/r1524779 ]

        LUCENE-5222: add CHANGES entry

        Show
        ASF subversion and git services added a comment - Commit 1524779 from Robert Muir in branch 'dev/trunk' [ https://svn.apache.org/r1524779 ] LUCENE-5222 : add CHANGES entry
        Hide
        ASF subversion and git services added a comment -

        Commit 1524780 from Robert Muir in branch 'dev/branches/branch_4x'
        [ https://svn.apache.org/r1524780 ]

        LUCENE-5222: add CHANGES entry

        Show
        ASF subversion and git services added a comment - Commit 1524780 from Robert Muir in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1524780 ] LUCENE-5222 : add CHANGES entry

          People

          • Assignee:
            Unassigned
            Reporter:
            Ryan Ernst
          • Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development