Lucene - Core
  1. Lucene - Core
  2. LUCENE-6900

Grouping sortWithinGroup should use Sort.RELEVANCE to indicate that, not null

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 5.5
    • Component/s: modules/grouping
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      In AbstractSecondPassGroupingCollector, withinGroupSort uses a value of null to indicate a relevance sort. I think it's nicer to use Sort.RELEVANCE for this – after all it's how the groupSort variable is handled. This choice is also seen in GroupingSearch; likely some other collaborators too.

      Martijn van Groningen is there some wisdom in the current choice that escapes me? If not I'll post a patch.

      1. LUCENE_6900.patch
        32 kB
        David Smiley
      2. LUCENE_6900.patch
        33 kB
        David Smiley

        Issue Links

          Activity

          Hide
          Martijn van Groningen added a comment -

          No there isn't, what I remember is that when creating this class it was common to use 'null' as an indication to sort by relevancy.

          Show
          Martijn van Groningen added a comment - No there isn't, what I remember is that when creating this class it was common to use 'null' as an indication to sort by relevancy.
          Hide
          David Smiley added a comment -

          Here is a patch. There are some edits I did just for code clarity.

          I expanded the scope to include having the AbstractSecondPassGroupingCollector.needScores be determined based on the constructor args instead of always returning true (this is an optimization and addresses a TODO comment). I did not do this likewise for other collectors. Do you think this is okay or would you prefer a separate issue and increasing the scope there?

          The Solr side was tricky to debug & fix. I ended up doing a refactoring in TopGroupsResultTransformer to extract out near duplicated code. I strengthened the test in TestDistributedGrouping so it actually tests score ordered groups, which it didn't before because the commenter (you?) thought distributed IDF was necessary when in fact just returning some field value as a score works fine. I ran into SOLR-6612 but avoided it by adding "maxScore" to the "handle" map as a "SKIP". I spent a little time trying to fix it but I stopped myself as it was becoming a distraction. Some little improvements in the issue might reflect that.

          Show
          David Smiley added a comment - Here is a patch. There are some edits I did just for code clarity. I expanded the scope to include having the AbstractSecondPassGroupingCollector.needScores be determined based on the constructor args instead of always returning true (this is an optimization and addresses a TODO comment). I did not do this likewise for other collectors. Do you think this is okay or would you prefer a separate issue and increasing the scope there? The Solr side was tricky to debug & fix. I ended up doing a refactoring in TopGroupsResultTransformer to extract out near duplicated code. I strengthened the test in TestDistributedGrouping so it actually tests score ordered groups, which it didn't before because the commenter (you?) thought distributed IDF was necessary when in fact just returning some field value as a score works fine. I ran into SOLR-6612 but avoided it by adding "maxScore" to the "handle" map as a "SKIP". I spent a little time trying to fix it but I stopped myself as it was becoming a distraction. Some little improvements in the issue might reflect that.
          Hide
          Christine Poerschke added a comment -

          Patch looks good to me. Two minor comments:

          • in the AbstractSecondPassGroupingCollector constructor you replaced a size() == 0 with isEmpty() and perhaps the wording of the exception on the following line might be changed also to mention empty rather than size==0
          • in BlockGroupingCollector around line 235 there's an existing TODO comment mentioning null as meaning "by relevance"
          Show
          Christine Poerschke added a comment - Patch looks good to me. Two minor comments: in the AbstractSecondPassGroupingCollector constructor you replaced a size() == 0 with isEmpty() and perhaps the wording of the exception on the following line might be changed also to mention empty rather than size==0 in BlockGroupingCollector around line 235 there's an existing TODO comment mentioning null as meaning "by relevance"
          Hide
          David Smiley added a comment -

          Thanks for the review Christine! I updated the patch with those simple changes. Also, I backed-out unrelated improvements to needScore; I'll file a separate issue.

          Absent of further feedback, I'll commit this tomorrow around now.

          Show
          David Smiley added a comment - Thanks for the review Christine! I updated the patch with those simple changes. Also, I backed-out unrelated improvements to needScore; I'll file a separate issue . Absent of further feedback, I'll commit this tomorrow around now.
          Hide
          ASF subversion and git services added a comment -

          Commit 1716569 from David Smiley in branch 'dev/trunk'
          [ https://svn.apache.org/r1716569 ]

          LUCENE-6900: Grouping sortWithinGroup shouldn't be null; use Sort.RELEVANCE.
          Enhanced related Solr side a bit.

          Show
          ASF subversion and git services added a comment - Commit 1716569 from David Smiley in branch 'dev/trunk' [ https://svn.apache.org/r1716569 ] LUCENE-6900 : Grouping sortWithinGroup shouldn't be null; use Sort.RELEVANCE. Enhanced related Solr side a bit.
          Hide
          ASF subversion and git services added a comment -

          Commit 1716570 from David Smiley in branch 'dev/branches/branch_5x'
          [ https://svn.apache.org/r1716570 ]

          LUCENE-6900: Grouping sortWithinGroup shouldn't be null; use Sort.RELEVANCE.
          Enhanced related Solr side a bit.

          Show
          ASF subversion and git services added a comment - Commit 1716570 from David Smiley in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1716570 ] LUCENE-6900 : Grouping sortWithinGroup shouldn't be null; use Sort.RELEVANCE. Enhanced related Solr side a bit.
          Hide
          Alan Woodward added a comment -

          Just noticed that this has an entry in the Solr CHANGES.txt file - was that intentional?

          Show
          Alan Woodward added a comment - Just noticed that this has an entry in the Solr CHANGES.txt file - was that intentional?
          Hide
          David Smiley added a comment -

          Yes; most of my time on this issue was debugging the Solr side, after which I ended up enhancing a test and doing a little refactoring.

          Show
          David Smiley added a comment - Yes; most of my time on this issue was debugging the Solr side, after which I ended up enhancing a test and doing a little refactoring.

            People

            • Assignee:
              David Smiley
              Reporter:
              David Smiley
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development