Solr
  1. Solr
  2. SOLR-5230

Call DelegatingCollector.finish() during grouping

    Details

    • Type: New Feature New Feature
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 4.4
    • Fix Version/s: 4.7, 6.0
    • Component/s: search
    • Labels:
      None

      Description

      This is an add-on to SOLR-5020 to call the new DelegatingCollector.finish() method from inside the grouping flow.

      1. SOLR-5230.patch
        2 kB
        Erik Hatcher
      2. SOLR-5230.patch
        0.9 kB
        Joel Bernstein

        Issue Links

          Activity

          Hide
          Joel Bernstein added a comment -

          Patch

          Show
          Joel Bernstein added a comment - Patch
          Hide
          Joel Bernstein added a comment - - edited

          Still needs testing. Not sure if the finish() method needs to be called during first and second phase of grouping. Initial patch calls it in both places.

          Show
          Joel Bernstein added a comment - - edited Still needs testing. Not sure if the finish() method needs to be called during first and second phase of grouping. Initial patch calls it in both places.
          Hide
          Simon Endele added a comment - - edited

          Applied the patch and it seems to work at a first glance.
          Thank you very much for your quick reaction on SOLR-5020, Joel!

          But for some scenarios (e.g. expensive post-filters) it might be a drawback that the phases cannot be distinguished in the finish() method.
          What do you think about introducing a second method DelegatingCollector.finishAfterGrouping() or similar that is called in the second phase instead of finish()?

          Show
          Simon Endele added a comment - - edited Applied the patch and it seems to work at a first glance. Thank you very much for your quick reaction on SOLR-5020 , Joel! But for some scenarios (e.g. expensive post-filters) it might be a drawback that the phases cannot be distinguished in the finish() method. What do you think about introducing a second method DelegatingCollector.finishAfterGrouping() or similar that is called in the second phase instead of finish()?
          Hide
          Joel Bernstein added a comment - - edited

          After reviewing the code, I believe the finish() method needs to be called in both phases. It looks like in Phase1 the group heads are collected and in phase2 the sub-documents are collected. In both situations the postfilter needs to be applied and finish called.

          Show
          Joel Bernstein added a comment - - edited After reviewing the code, I believe the finish() method needs to be called in both phases. It looks like in Phase1 the group heads are collected and in phase2 the sub-documents are collected. In both situations the postfilter needs to be applied and finish called.
          Hide
          Simon Endele added a comment -

          Hm, I'm quite sure that collect() is called for all docs in both phases.

          Excerpt from my final result:
          <lst name="grouped">
          <lst name="group_id">
          <int name="matches">61</int>
          <int name="ngroups">35</int>
          <arr name="groups">
          [...]

          And collect() is called twice 61 times, followed by a call of finish() each.

          Show
          Simon Endele added a comment - Hm, I'm quite sure that collect() is called for all docs in both phases. Excerpt from my final result: <lst name="grouped"> <lst name="group_id"> <int name="matches">61</int> <int name="ngroups">35</int> <arr name="groups"> [...] And collect() is called twice 61 times, followed by a call of finish() each.
          Hide
          Joel Bernstein added a comment -

          This is the way it likely has to be. Both of the grouping phases need to call the postFilter to return the proper results. And if the postFilter is called, the finish() needs to be called.

          For expensive postFilters, this is not ideal. But, that's the implementation.

          Take a look at SOLR-5027 which was the original reason why the finish() method was added to DelegatingCollector. This an alternative to field collapsing and grouping which performs much better then ngroups and group.truncate combined.

          SOLR-5027 will be getting a lot of my attention this month so any feedback you have on this would be appreciated.

          Show
          Joel Bernstein added a comment - This is the way it likely has to be. Both of the grouping phases need to call the postFilter to return the proper results. And if the postFilter is called, the finish() needs to be called. For expensive postFilters, this is not ideal. But, that's the implementation. Take a look at SOLR-5027 which was the original reason why the finish() method was added to DelegatingCollector. This an alternative to field collapsing and grouping which performs much better then ngroups and group.truncate combined. SOLR-5027 will be getting a lot of my attention this month so any feedback you have on this would be appreciated.
          Hide
          Steve Rowe added a comment -

          Joel Bernstein, is this committable?

          Show
          Steve Rowe added a comment - Joel Bernstein , is this committable?
          Hide
          Joel Bernstein added a comment -

          Steve,

          I think this is done properly, but I haven't added any test cases for this yet.

          We'd need to mock up a simple PostFilter that uses finish() for the tests because the CollapsingQParserPlugin is the only PostFilter right now that relies on finish(). It would be better to mockup a simple test PostFilter for the tests.

          Joel

          Show
          Joel Bernstein added a comment - Steve, I think this is done properly, but I haven't added any test cases for this yet. We'd need to mock up a simple PostFilter that uses finish() for the tests because the CollapsingQParserPlugin is the only PostFilter right now that relies on finish(). It would be better to mockup a simple test PostFilter for the tests. Joel
          Hide
          Steve Rowe added a comment -

          Thanks Joel, I'll see if I can whip up a test. - Steve

          Show
          Steve Rowe added a comment - Thanks Joel, I'll see if I can whip up a test. - Steve
          Hide
          Erik Hatcher added a comment -

          Here's a 4x branch patch that includes a test case. The test case fails without the Grouping.java changes and passes after.

          Ok to commit? I'll do this in the next day or so if no objections.

          A more general purpose mocking test would be useful too, but this test at least covers the case of the collapse query parser breaking grouping.

          Show
          Erik Hatcher added a comment - Here's a 4x branch patch that includes a test case. The test case fails without the Grouping.java changes and passes after. Ok to commit? I'll do this in the next day or so if no objections. A more general purpose mocking test would be useful too, but this test at least covers the case of the collapse query parser breaking grouping.
          Hide
          Steve Rowe added a comment -

          +1 to commit

          Show
          Steve Rowe added a comment - +1 to commit
          Hide
          ASF subversion and git services added a comment -

          Commit 1562308 from Erik Hatcher in branch 'dev/branches/branch_4x'
          [ https://svn.apache.org/r1562308 ]

          SOLR-5230: Call DelegatingCollector.finish() during grouping

          Show
          ASF subversion and git services added a comment - Commit 1562308 from Erik Hatcher in branch 'dev/branches/branch_4x' [ https://svn.apache.org/r1562308 ] SOLR-5230 : Call DelegatingCollector.finish() during grouping
          Hide
          Erik Hatcher added a comment -

          I don't see an automatic comment for my trunk commit (JIRA was down briefly, so maybe that was it?), but it's r1562305 on trunk.

          Show
          Erik Hatcher added a comment - I don't see an automatic comment for my trunk commit (JIRA was down briefly, so maybe that was it?), but it's r1562305 on trunk.
          Hide
          Erik Hatcher added a comment -

          Thanks Joel for the original patch, and thanks Steve for the review.

          Show
          Erik Hatcher added a comment - Thanks Joel for the original patch, and thanks Steve for the review.
          Hide
          Joel Bernstein added a comment -

          Your welcome, and thanks for following this up and finishing it off.

          Show
          Joel Bernstein added a comment - Your welcome, and thanks for following this up and finishing it off.

            People

            • Assignee:
              Erik Hatcher
              Reporter:
              Joel Bernstein
            • Votes:
              0 Vote for this issue
              Watchers:
              5 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development