Lucene - Core
  1. Lucene - Core
  2. LUCENE-4077

ToParentBlockJoinCollector provides no way to access computed scores and the maxScore

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 3.4, 3.5, 3.6
    • Fix Version/s: 4.0-ALPHA, 6.0
    • Component/s: modules/join
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      The constructor of ToParentBlockJoinCollector allows to turn on the tracking of parent scores and the maximum parent score, however there is no way to access those scores because:

      • maxScore is a private field, and there is no getter
      • TopGroups / GroupDocs does not provide access to the scores for the parent documents, only the children
      1. LUCENE-4077.patch
        25 kB
        Michael McCandless
      2. LUCENE-4077.patch
        25 kB
        Michael McCandless
      3. LUCENE-4077.patch
        16 kB
        Michael McCandless
      4. LUCENE-4077.patch
        15 kB
        Michael McCandless

        Activity

        Hide
        Michael McCandless added a comment -

        Patch.

        I added getMaxScore to ToParentBlockJoinCollector, and GroupDocs.score to access the aggregated score for the group. I also added a ScoreMergeMode enum to TopGroups.merge to control how the scores from the same group across multiple shards should be merged.

        Show
        Michael McCandless added a comment - Patch. I added getMaxScore to ToParentBlockJoinCollector, and GroupDocs.score to access the aggregated score for the group. I also added a ScoreMergeMode enum to TopGroups.merge to control how the scores from the same group across multiple shards should be merged.
        Hide
        Christoph Kaser added a comment -

        Hello Mike,

        thank you for the patch. There is one small problem: ToParentBlockJoinCollector.getMaxScore() always returns NaN. This happens because maxScore is initialized as

        private float maxScore = Float.NaN;
        

        and then updated as

        maxScore = Math.max(score, maxScore);
        

        which is always NaN.

        I hope I applied the patch to the correct revision and this is not caused by a version conflict.

        Show
        Christoph Kaser added a comment - Hello Mike, thank you for the patch. There is one small problem: ToParentBlockJoinCollector.getMaxScore() always returns NaN . This happens because maxScore is initialized as private float maxScore = Float .NaN; and then updated as maxScore = Math .max(score, maxScore); which is always NaN . I hope I applied the patch to the correct revision and this is not caused by a version conflict.
        Hide
        Michael McCandless added a comment -

        Ugh, sorry. Sneaky NaN!

        I added an assert in TestBJQ that shows the failure, then fixed it...

        Show
        Michael McCandless added a comment - Ugh, sorry. Sneaky NaN! I added an assert in TestBJQ that shows the failure, then fixed it...
        Hide
        Christoph Kaser added a comment -

        This patch works perfectly for my application. Thank you!

        Show
        Christoph Kaser added a comment - This patch works perfectly for my application. Thank you!
        Hide
        Michael McCandless added a comment -

        Super, thanks for testing Christoph. I'll commit shortly...

        Show
        Michael McCandless added a comment - Super, thanks for testing Christoph. I'll commit shortly...
        Hide
        Michael McCandless added a comment -

        I decided to add the maxScore to TopGroups so it's consistent w/ TopDocs; this way you don't have to ask the collector for the maxScore....

        Show
        Michael McCandless added a comment - I decided to add the maxScore to TopGroups so it's consistent w/ TopDocs; this way you don't have to ask the collector for the maxScore....
        Hide
        Christoph Kaser added a comment -

        Hi Mike,

        shouldn't TopGroups.maxScore contain the maximum parent score? If I am not mistaken, the way it is built now, it contains the maximum child score over all children.

        This is due to this line in ToParentBlockJoinCollector.getTopGroups():

        maxScore = Math.max(maxScore, topDocs.getMaxScore());
        

        I think it should read:

        totalMaxScore = Math.max(totalMaxScore, og.score);
        

        Otherwise, topGroups.maxScore is different to ToParentBlockJoinCollector.getMaxScore()

        Show
        Christoph Kaser added a comment - Hi Mike, shouldn't TopGroups.maxScore contain the maximum parent score? If I am not mistaken, the way it is built now, it contains the maximum child score over all children. This is due to this line in ToParentBlockJoinCollector.getTopGroups(): maxScore = Math .max(maxScore, topDocs.getMaxScore()); I think it should read: totalMaxScore = Math .max(totalMaxScore, og.score); Otherwise, topGroups.maxScore is different to ToParentBlockJoinCollector.getMaxScore()
        Hide
        Michael McCandless added a comment -

        Otherwise, topGroups.maxScore is different to ToParentBlockJoinCollector.getMaxScore()

        Woops, you're right, thanks. In fact I should be passing the maxScore that the collector already computed, not recomputing it in ToParentBJC.getTopGroups...

        Show
        Michael McCandless added a comment - Otherwise, topGroups.maxScore is different to ToParentBlockJoinCollector.getMaxScore() Woops, you're right, thanks. In fact I should be passing the maxScore that the collector already computed, not recomputing it in ToParentBJC.getTopGroups...
        Hide
        Michael McCandless added a comment -

        New patch, using the collector's maxScore in TPBJC.getTopGroups.

        Show
        Michael McCandless added a comment - New patch, using the collector's maxScore in TPBJC.getTopGroups.
        Hide
        Christoph Kaser added a comment -

        Thank you, now it works perfectly!

        Show
        Christoph Kaser added a comment - Thank you, now it works perfectly!
        Hide
        Michael McCandless added a comment -

        Super, thanks Christoph, I'll commit shortly...

        Show
        Michael McCandless added a comment - Super, thanks Christoph, I'll commit shortly...

          People

          • Assignee:
            Michael McCandless
            Reporter:
            Christoph Kaser
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development