Details

    • Type: Bug Bug
    • Status: Resolved
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 2.2
    • Fix Version/s: None
    • Component/s: core/search
    • Labels:
      None
    • Lucene Fields:
      Patch Available

      Description

      There are a couple of minor bugs in BoostingTermQuery.explain().

      1. The computation of average payload score produces NaN if no payloads were found. It should probably be:
      float avgPayloadScore = super.score() * (payloadsSeen > 0 ? (payloadScore / payloadsSeen) : 1);

      2. If the average payload score is zero, the value of the explanation is 0:
      result.setValue(nonPayloadExpl.getValue() * avgPayloadScore);
      If the query is part of a BooleanClause, this results in:
      "no match on required clause..."
      "failure to meet condition(s) of required/prohibited clause(s)"

      The average payload score can be zero if the field boost = 0.

      I've attached a patch to 'TestBoostingTermQuery.java', however, the test 'testNoPayload' fails in 'SpanScorer.score()' because the doc = -1. It looks like 'setFreqCurrentDoc() should have been called before 'score()'. Maybe someone more knowledgable of spans could investigate this.

      1. TestBoostingTermQuery.patch
        2 kB
        Peter Keegan
      2. TestBoostingTermQuery2.patch
        5 kB
        Grant Ingersoll
      3. TestBoostingTermQuery3.patch
        5 kB
        Grant Ingersoll
      4. TestBoostingTermQuery4.patch
        5 kB
        Grant Ingersoll

        Activity

        Hide
        Peter Keegan added a comment -

        Added 'testNoPayload'

        Show
        Peter Keegan added a comment - Added 'testNoPayload'
        Hide
        Grant Ingersoll added a comment -

        Hi Peter,

        Couple comments. #1 makes sense, except the super.score() part, the score from the other part of the matching is handled by the nonPayloadExpl part. I do agree it should check for zero on payloadsSeen, though, and have added that.

        I don't think I am understanding the issue with #2 above. I am not sure the test is correct. The results[0] being passed into the checkHitCollector say you expect Document 0 to be a match, but this can't be since the boost is 0, therefore there are no results. This can be seen by running the query against the search without the explain, as in:
        TopDocs hits = searcher.search(query, null, 100);
        assertTrue("hits Size: " + hits.totalHits + " is not: " + 0, hits.totalHits == 0);

        Or, perhaps I am missing something? I guess I don't see why the boost part needs to be in there? Can't you have a test that has no payloads?

        Show
        Grant Ingersoll added a comment - Hi Peter, Couple comments. #1 makes sense, except the super.score() part, the score from the other part of the matching is handled by the nonPayloadExpl part. I do agree it should check for zero on payloadsSeen, though, and have added that. I don't think I am understanding the issue with #2 above. I am not sure the test is correct. The results [0] being passed into the checkHitCollector say you expect Document 0 to be a match, but this can't be since the boost is 0, therefore there are no results. This can be seen by running the query against the search without the explain, as in: TopDocs hits = searcher.search(query, null, 100); assertTrue("hits Size: " + hits.totalHits + " is not: " + 0, hits.totalHits == 0); Or, perhaps I am missing something? I guess I don't see why the boost part needs to be in there? Can't you have a test that has no payloads?
        Hide
        Grant Ingersoll added a comment -

        but, I agree, there is something wrong here. Attached is an update of the Test, plus a fix for #1.

        Show
        Grant Ingersoll added a comment - but, I agree, there is something wrong here. Attached is an update of the Test, plus a fix for #1.
        Hide
        Grant Ingersoll added a comment -

        OK, I think I see the problem,

        The issue lies in the fact that the Similarity override for this test sets the tf() to 1, regardless of the frequency coming in. Thus, for the "foo" clause, it

        Let me know what you think of this patch.

        Show
        Grant Ingersoll added a comment - OK, I think I see the problem, The issue lies in the fact that the Similarity override for this test sets the tf() to 1, regardless of the frequency coming in. Thus, for the "foo" clause, it Let me know what you think of this patch.
        Hide
        Peter Keegan added a comment -

        Hi Grant,

        > TopDocs hits = searcher.search(query, null, 100);
        > assertTrue("hits Size: " + hits.totalHits + " is not: " + 0, hits.totalHits == 0);

        TopDocCollector discards hits with score = 0, so that's not a fair comparison. If you do a similar test with TermQuery (with a field boost = 0) instead of BoostingTermQuery, you'll see the difference. Even terms with 0 weight are included in the explanation. Make sense?

        Peter

        Show
        Peter Keegan added a comment - Hi Grant, > TopDocs hits = searcher.search(query, null, 100); > assertTrue("hits Size: " + hits.totalHits + " is not: " + 0, hits.totalHits == 0); TopDocCollector discards hits with score = 0, so that's not a fair comparison. If you do a similar test with TermQuery (with a field boost = 0) instead of BoostingTermQuery, you'll see the difference. Even terms with 0 weight are included in the explanation. Make sense? Peter
        Hide
        Grant Ingersoll added a comment -

        OK, I added the setBoost(0) back in, but kept the similarity change and the test passes

        Show
        Grant Ingersoll added a comment - OK, I added the setBoost(0) back in, but kept the similarity change and the test passes
        Hide
        Peter Keegan added a comment -

        Confirmed - thanks.

        Show
        Peter Keegan added a comment - Confirmed - thanks.
        Hide
        Grant Ingersoll added a comment -

        Committed.

        Show
        Grant Ingersoll added a comment - Committed.
        Hide
        Peter Keegan added a comment -

        Hi Grant,

        TopDocCollector discards hits with score = 0, so that's not a fair
        comparison. If you do a similar test with TermQuery (with a field boost = 0)
        instead of BoostingTermQuery, you'll see the difference. Even terms with 0
        weight are included in the explanation. Make sense?

        Peter

        Show
        Peter Keegan added a comment - Hi Grant, TopDocCollector discards hits with score = 0, so that's not a fair comparison. If you do a similar test with TermQuery (with a field boost = 0) instead of BoostingTermQuery, you'll see the difference. Even terms with 0 weight are included in the explanation. Make sense? Peter

          People

          • Assignee:
            Grant Ingersoll
            Reporter:
            Peter Keegan
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development