Lucene - Core
  1. Lucene - Core
  2. LUCENE-6001

DrillSideways throws NullPointerException for some searches

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 4.10.1
    • Fix Version/s: 4.10.4, 5.1, 6.0
    • Component/s: modules/facet
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      For some DrillSideways searches I get NullPointerException.
      I have tracked the problem to DrillSidewaysScorer class, on line 126 in DrillSidewaysScorer.java:

      long baseQueryCost = baseScorer.cost();

      On some of my index segments, this call throws NullPoinerException.
      "baseScorer" is instance of ReqExclScorer.
      In ReqExclScorer.java:
      public long cost()

      { return reqScorer.cost(); }

      throws NullPointerException because reqScorer is null.

        Activity

        Hide
        Shai Erera added a comment -

        Can you post a test case which fails with this exception?

        Show
        Shai Erera added a comment - Can you post a test case which fails with this exception?
        Hide
        Dragan Jotanovic added a comment -

        I'm afraid that would be very hard as it is not easily reproducible, and I am not able to provide you with my production index.
        Basically what I do is I'm calling DrillSideways search method with boolean query like (+field:term1 + field:term2 +field:term3 +field:term4 -field:term5) and drilldown on one facet dimension.
        Of 26 segments that my index have, only one is coming with NullPointerException. If I send 4 instead of 5 terms, it works properly. Also if I leave out the NOT term, it works with any number of terms.
        I'm sorry I couldn't be of more help. If you tell me where to look and how to debug I can do it and post the results.
        I think we just need to find out why is that scorer not initialized, or we could default the cost to 0 (in DrillSidewaysScorer) if there is null pointer exception . In any way, this unchecked exception should probably be caught somewhere.

        Show
        Dragan Jotanovic added a comment - I'm afraid that would be very hard as it is not easily reproducible, and I am not able to provide you with my production index. Basically what I do is I'm calling DrillSideways search method with boolean query like (+field:term1 + field:term2 +field:term3 +field:term4 -field:term5) and drilldown on one facet dimension. Of 26 segments that my index have, only one is coming with NullPointerException. If I send 4 instead of 5 terms, it works properly. Also if I leave out the NOT term, it works with any number of terms. I'm sorry I couldn't be of more help. If you tell me where to look and how to debug I can do it and post the results. I think we just need to find out why is that scorer not initialized, or we could default the cost to 0 (in DrillSidewaysScorer) if there is null pointer exception . In any way, this unchecked exception should probably be caught somewhere.
        Hide
        Robert Muir added a comment -

        If ReqExclScorer has a null 'req' part, then something in the logic of BooleanWeight.

        Show
        Robert Muir added a comment - If ReqExclScorer has a null 'req' part, then something in the logic of BooleanWeight.
        Hide
        Robert Muir added a comment -

        ok I see the bug. ReqExclScorer can indeed be buggy here, because it marks 'req' as null for 'exhausted'. So if you call cost() after that, you will get NPE.

        Show
        Robert Muir added a comment - ok I see the bug. ReqExclScorer can indeed be buggy here, because it marks 'req' as null for 'exhausted'. So if you call cost() after that, you will get NPE.
        Hide
        Robert Muir added a comment -

        The issue could be more problematic really. In general most of these methods (cost(), getChildren(), etc) are currently only being called before consuming the iterator.

        But DrillSideWays first nextDoc()'s all of its subs, then calls cost().

        First we should clarify/define, when it is ok to call these methods, and what the semantics are supposed to be if you can call it "after consuming" (is it ok to call it at any time? what about after it returns NO_MORE_DOCS?).

        Then, depending on that, we just have to fix and add direct tests against our scorers for this, because otherwise problems will continue to happen, since core scorers are generally just checking this up-front and wont find bugs in the current test suite. Maybe improving AssertingScorer to do this can help.

        Show
        Robert Muir added a comment - The issue could be more problematic really. In general most of these methods (cost(), getChildren(), etc) are currently only being called before consuming the iterator. But DrillSideWays first nextDoc()'s all of its subs, then calls cost(). First we should clarify/define, when it is ok to call these methods, and what the semantics are supposed to be if you can call it "after consuming" (is it ok to call it at any time? what about after it returns NO_MORE_DOCS?). Then, depending on that, we just have to fix and add direct tests against our scorers for this, because otherwise problems will continue to happen, since core scorers are generally just checking this up-front and wont find bugs in the current test suite. Maybe improving AssertingScorer to do this can help.
        Hide
        Robert Muir added a comment -

        In this specific case, DrillsideWays calls cost() after nextDoc() already returned NO_MORE_DOCS. This is why ReqExclScorer sets req to null.

        I don't know if DrillSideways is doing this because its "one ahead" on its scorers all the time and looks at docID() [maybe] but I am a little afraid to just add the NO_MORE_DOCS check to it... on the other hand it would suck a little to have if (exhausted) return 0 in every single scorers cost impl, but maybe we should do that instead?

        Show
        Robert Muir added a comment - In this specific case, DrillsideWays calls cost() after nextDoc() already returned NO_MORE_DOCS. This is why ReqExclScorer sets req to null. I don't know if DrillSideways is doing this because its "one ahead" on its scorers all the time and looks at docID() [maybe] but I am a little afraid to just add the NO_MORE_DOCS check to it... on the other hand it would suck a little to have if (exhausted) return 0 in every single scorers cost impl, but maybe we should do that instead?
        Hide
        Stuart added a comment -

        We've been seeing this issue semi-regularly in our app. We've been working around it so far by simplifying our queries to remove clauses (probably a good idea anyway) but we can never be sure our users won't find a way to break it!

        Show
        Stuart added a comment - We've been seeing this issue semi-regularly in our app. We've been working around it so far by simplifying our queries to remove clauses (probably a good idea anyway) but we can never be sure our users won't find a way to break it!
        Hide
        jane chang added a comment -

        this bug was fixed on trunk when reqScorer was modified to no longer be set to null

        this patch is against 5.0, where ReqScorer can hit NPE if cost is called after nextDoc

        patch includes test and fix to call cost before nextDoc

        Show
        jane chang added a comment - this bug was fixed on trunk when reqScorer was modified to no longer be set to null this patch is against 5.0, where ReqScorer can hit NPE if cost is called after nextDoc patch includes test and fix to call cost before nextDoc
        Hide
        Michael McCandless added a comment -

        Thanks jane, patch looks great! You just moved the .cost() calls up before any .nextDoc()s. I'll commit shortly.

        It's nice that we "accidentally" fixed this in trunk ... I think we should also fix this for 4.10.4.

        Show
        Michael McCandless added a comment - Thanks jane, patch looks great! You just moved the .cost() calls up before any .nextDoc()s. I'll commit shortly. It's nice that we "accidentally" fixed this in trunk ... I think we should also fix this for 4.10.4.
        Hide
        ASF subversion and git services added a comment -

        Commit 1662673 from Michael McCandless in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1662673 ]

        LUCENE-6001: DrillSideways hits NullPointerException for some BooleanQuery searches

        Show
        ASF subversion and git services added a comment - Commit 1662673 from Michael McCandless in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662673 ] LUCENE-6001 : DrillSideways hits NullPointerException for some BooleanQuery searches
        Hide
        ASF subversion and git services added a comment -

        Commit 1662674 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1662674 ]

        LUCENE-6001: DrillSideways hits NullPointerException for some BooleanQuery searches

        Show
        ASF subversion and git services added a comment - Commit 1662674 from Michael McCandless in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662674 ] LUCENE-6001 : DrillSideways hits NullPointerException for some BooleanQuery searches
        Hide
        Michael McCandless added a comment -

        Thank you Dragan and jane!

        Show
        Michael McCandless added a comment - Thank you Dragan and jane!
        Hide
        ASF subversion and git services added a comment -

        Commit 1662681 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1662681 ]

        LUCENE-6001: DrillSideways hits NullPointerException for some BooleanQuery searches

        Show
        ASF subversion and git services added a comment - Commit 1662681 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1662681 ] LUCENE-6001 : DrillSideways hits NullPointerException for some BooleanQuery searches
        Hide
        ASF subversion and git services added a comment -

        Commit 1662728 from Michael McCandless in branch 'dev/branches/lucene_solr_4_10'
        [ https://svn.apache.org/r1662728 ]

        LUCENE-6001: null check was backwards

        Show
        ASF subversion and git services added a comment - Commit 1662728 from Michael McCandless in branch 'dev/branches/lucene_solr_4_10' [ https://svn.apache.org/r1662728 ] LUCENE-6001 : null check was backwards
        Hide
        ASF subversion and git services added a comment -

        Commit 1662732 from Michael McCandless in branch 'dev/branches/branch_5x'
        [ https://svn.apache.org/r1662732 ]

        LUCENE-6001: null check was backwards

        Show
        ASF subversion and git services added a comment - Commit 1662732 from Michael McCandless in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1662732 ] LUCENE-6001 : null check was backwards
        Hide
        ASF subversion and git services added a comment -

        Commit 1662733 from Michael McCandless in branch 'dev/trunk'
        [ https://svn.apache.org/r1662733 ]

        LUCENE-6001: null check was backwards

        Show
        ASF subversion and git services added a comment - Commit 1662733 from Michael McCandless in branch 'dev/trunk' [ https://svn.apache.org/r1662733 ] LUCENE-6001 : null check was backwards
        Hide
        Michael McCandless added a comment -

        Bulk close for 4.10.4 release

        Show
        Michael McCandless added a comment - Bulk close for 4.10.4 release

          People

          • Assignee:
            Unassigned
            Reporter:
            Dragan Jotanovic
          • Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development