Lucene - Core / LUCENE-7341

EarlyTerminatingSortingCollector support for grouped searches

Details

    • Type: New Feature
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved

    Description

      Currently, grouped searches cannot use EarlyTerminatingSortingCollector because it would return wrong results. This ticket proposes changing the EarlyTerminatingSortingCollector class, and probably the LeafCollector interface, to support early termination for grouped searches.

      Illustration (aaa is an as-yet-unnamed boolean flag):

      # fictitious (sorted by X) segment
      | doc key | D0 D1 D2 D3 D4 D5 ... D10 D11 D12 D13 D14 D15 ... D20 D21 D22 D23 D24 D25 ...
      | doc grp | G0 G0 G0 G0 G0 G0 ... G10 G10 G10 G10 G10 G10 ... G20 G20 G20 G20 G20 G20 ...
      
      # query with rows=3 sort=X group=false
      | query result | D0 D1 D2
      
      # existing code:
      #   use an EarlyTerminatingSortingCollector with numDocsToCollect=3
      #   EarlyTerminatingSortingCollector.getLeafCollector returns a LeafCollector
      #   whose collect method uses (++numCollected >= numDocsToCollect) as the terminating condition
      
      # query with rows=3 sort=X group=true group.field=grp group.sort=X group.limit=1
      | query result | G0(D0) G10(D10) G20(D20)
      
      # existing code:
      #   cannot use EarlyTerminatingSortingCollector (query result would wrongly be just 'G0(D0)')
      # proposed new code:
      #   use an EarlyTerminatingSortingCollector(... numDocsToCollect=3 aaa=true)
      
      # query with rows=3 sort=X group=true group.field=grp group.sort=X group.limit=5
      | query result | G0(D0,D1,D2,D3,D4) G10(D10,D11,D12,D13,D14) G20(D20,D21,D22,D23,D24)
      
      # existing code:
      #   cannot use EarlyTerminatingSortingCollector (query result would wrongly be just 'G0(D0,D1,D2)')
      # proposed new code:
      #   use an EarlyTerminatingSortingCollector(... numDocsToCollect=3 aaa=true)
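
      The illustration above can be simulated without any Lucene dependency. The sketch below is not the proposed patch: it models the segment as a plain list and contrasts the existing document-counting condition with one possible reading of aaa=true, namely "stop at the first document that belongs to a group beyond the first numDocsToCollect distinct groups". That reading, the class and method names, and the fourth group G30 are all assumptions made for illustration; the ticket itself does not define the flag's semantics.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class EarlyTerminationSketch {

    /** A document in the fictitious segment: its key and the group it belongs to. */
    static final class Doc {
        final String key;
        final String group;

        Doc(String key, String group) {
            this.key = key;
            this.group = group;
        }
    }

    /** Builds a segment that is already sorted by X, with six documents per group. */
    static List<Doc> segment() {
        List<Doc> docs = new ArrayList<>();
        // G0, G10 and G20 come from the illustration; G30 is a hypothetical extra group.
        for (int g = 0; g <= 30; g += 10) {
            for (int i = 0; i <= 5; i++) {
                docs.add(new Doc("D" + (g + i), "G" + g));
            }
        }
        return docs;
    }

    /** Existing condition: terminate once numDocsToCollect documents have been collected. */
    static List<String> collectByDocCount(List<Doc> segment, int numDocsToCollect) {
        List<String> collected = new ArrayList<>();
        int numCollected = 0;
        for (Doc doc : segment) {
            collected.add(doc.key);
            if (++numCollected >= numDocsToCollect) {
                break; // stands in for terminating collection of this segment
            }
        }
        return collected;
    }

    /**
     * Hypothetical group-aware condition (an assumed meaning of aaa=true):
     * terminate at the first document whose group is new after
     * numDocsToCollect distinct groups have already been seen.
     */
    static List<String> collectByGroupCount(List<Doc> segment, int numDocsToCollect) {
        List<String> collected = new ArrayList<>();
        Set<String> groupsSeen = new HashSet<>();
        for (Doc doc : segment) {
            boolean newGroup = !groupsSeen.contains(doc.group);
            if (newGroup && groupsSeen.size() >= numDocsToCollect) {
                break; // every later document is in a group that cannot rank in the top groups
            }
            groupsSeen.add(doc.group);
            collected.add(doc.key);
        }
        return collected;
    }

    public static void main(String[] args) {
        List<Doc> segment = segment();
        System.out.println(collectByDocCount(segment, 3));   // [D0, D1, D2], all in G0
        System.out.println(collectByGroupCount(segment, 3)); // D0..D5, D10..D15, D20..D25
    }
}
```

      With the document-counting condition only G0 is ever seen, which matches the wrong grouped results noted above. Under the assumed group-aware condition the collector passes on every document of the first three groups, so a downstream grouping collector could apply group.limit=1 or group.limit=5 itself, and collection still stops before any G30 document is visited.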
      

          People

            Assignee: Christine Poerschke (cpoerschke)
            Reporter: Christine Poerschke (cpoerschke)
            Votes: 1
            Watchers: 2