Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-5409

ToParentBlockJoinCollector.getTopGroups returns empty Groups

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 4.6
    • Fix Version/s: 4.7, 6.0
    • Component/s: core/search
    • Labels:
      None
    • Environment:

      Ubuntu 12.04

    • Lucene Fields:
      New

      Description

      A bug is observed to cause unstable results returned by the getTopGroups function of class ToParentBlockJoinCollector.

      In the scorer generation stage, the ToParentBlockJoinCollector will automatically rewrite all the associated ToParentBlockJoinQuery (and their subqueries), and save them into its in-memory Look-up table, namely joinQueryID (see enroll() method for detail). Unfortunately, in the getTopGroups method, the new ToParentBlockJoinQuery parameter is not rewritten (at least users are not expected to do so). When the new one is searched in the old lookup table (considering the impact of rewrite() on hashCode()), the lookup will largely fail and eventually end up with a topGroup collection consisting of only empty groups (their hitCounts are guaranteed to be zero).

      An easy fix would be to rewrite the original BlockJoinQuery before invoking getTopGroups method. However, the computational cost of this is not optimal. A better but slightly more complex solution would be to save unrewrited Queries into the lookup table.

        Attachments

          Activity

            People

            • Assignee:
              mikemccand Michael McCandless
              Reporter:
              peng Peng Cheng
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Time Tracking

                Estimated:
                Original Estimate - 168h
                168h
                Remaining:
                Remaining Estimate - 168h
                168h
                Logged:
                Time Spent - Not Specified
                Not Specified