Lucene - Core: LUCENE-7141

OfflineSorter shouldn't always forceMerge in the end

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Today OfflineSorter always does a final merge, collapsing all segments into a single segment.

      But the caller typically re-iterates all values anyway, to build an FST, a BKD tree, or something similar. That final forceMerge is therefore often unnecessary: the merging can instead be done on the fly as the caller consumes the result.
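
      To make the "merge on the fly" idea concrete, here is a minimal sketch of a lazy k-way merge. It is not OfflineSorter code: the class LazyMergeSketch, its Cursor helper, and the use of in-memory lists of Strings in place of on-disk partitions of byte sequences are all assumptions made for illustration. The point is only that a priority queue over the heads of the sorted runs lets the consumer see one globally sorted stream without a merged copy ever being written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

/** Hypothetical sketch: merges already-sorted runs lazily, as the caller iterates. */
final class LazyMergeSketch {

  /** One sorted run plus the value currently at its head. */
  private static final class Cursor {
    final Iterator<String> run;
    String head;

    Cursor(Iterator<String> run) {
      this.run = run;
      this.head = run.next(); // only constructed for non-empty runs
    }

    /** Moves to the next value; returns false once the run is exhausted. */
    boolean advance() {
      if (run.hasNext()) {
        head = run.next();
        return true;
      }
      return false;
    }
  }

  /** Returns the values of all runs in sorted order without materializing a merged copy. */
  static Iterator<String> merge(List<List<String>> sortedRuns) {
    PriorityQueue<Cursor> queue = new PriorityQueue<>((a, b) -> a.head.compareTo(b.head));
    for (List<String> run : sortedRuns) {
      if (!run.isEmpty()) {
        queue.add(new Cursor(run.iterator()));
      }
    }
    return new Iterator<String>() {
      @Override
      public boolean hasNext() {
        return !queue.isEmpty();
      }

      @Override
      public String next() {
        Cursor top = queue.poll(); // run whose head is currently smallest
        if (top == null) {
          throw new NoSuchElementException();
        }
        String value = top.head;
        if (top.advance()) {
          queue.add(top); // re-insert with its new head
        }
        return value;
      }
    };
  }

  /** Consumes an iterator into a list; stands in for the caller building an FST or BKD tree. */
  static List<String> drain(Iterator<String> it) {
    List<String> out = new ArrayList<>();
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    List<List<String>> runs = Arrays.asList(
        Arrays.asList("a", "d"),
        Arrays.asList("b", "c", "e"));
    System.out.println(drain(merge(runs)));
  }
}
```

      Each call to next() costs O(log k) for k runs, so the total work matches a final merge pass, but the extra write and re-read of a fully merged file is avoided.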

      This is somewhat tricky to do, so I'd like to break it into steps. The first step is to fix the ByteSequencesReader API so that it implements BytesRefIterator instead of its own read(BytesRefBuilder) method.
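
      As a rough illustration of the iterator-style contract, the sketch below reads byte sequences through a next() method that returns null once the input is exhausted, which is how BytesRefIterator signals the end. Everything else here is an assumption and not Lucene code: the ByteSeqIterator interface is a stand-in so the example runs without Lucene on the classpath, and the short-length-prefixed framing, the LengthPrefixedReader class, and the encode/roundTrip helpers are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a reader exposing an iterator-style next() method. */
final class IteratorReaderSketch {

  /** Stand-in for an iterator contract in which next() returns null at the end. */
  interface ByteSeqIterator {
    byte[] next() throws IOException;
  }

  /** Reads entries framed as an unsigned 16-bit length followed by that many bytes (assumed format). */
  static final class LengthPrefixedReader implements ByteSeqIterator {
    private final DataInputStream in;

    LengthPrefixedReader(byte[] data) {
      this.in = new DataInputStream(new ByteArrayInputStream(data));
    }

    @Override
    public byte[] next() throws IOException {
      if (in.available() == 0) {
        return null; // exhausted: no more entries
      }
      int length = in.readUnsignedShort();
      byte[] bytes = new byte[length];
      in.readFully(bytes);
      return bytes;
    }
  }

  /** Writes the values in the assumed length-prefixed format. */
  static byte[] encode(String... values) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buffer);
    for (String value : values) {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      out.writeShort(bytes.length);
      out.write(bytes);
    }
    out.flush();
    return buffer.toByteArray();
  }

  /** Encodes the values, then reads them back with the usual "loop until null" idiom. */
  static List<String> roundTrip(String... values) {
    try {
      ByteSeqIterator it = new LengthPrefixedReader(encode(values));
      List<String> result = new ArrayList<>();
      for (byte[] seq = it.next(); seq != null; seq = it.next()) {
        result.add(new String(seq, StandardCharsets.UTF_8));
      }
      return result;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip("fst", "bkd", "sorter"));
  }
}
```

      A reader shaped like this can be handed to any consumer that already accepts an iterator, which is what makes a later switch to lazy merging less invasive for callers.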

      1. LUCENE-7141.patch
        16 kB
        Michael McCandless

        Activity

        mikemccand Michael McCandless added a comment -

        First phase ... just a rote cutover to BytesRefIterator.

        dweiss Dawid Weiss added a comment -

        +1. This is something I was going to suggest.

        jira-bot ASF subversion and git services added a comment -

        Commit 78d5cfefe2453345c498984bf0e405d254a9d5bc in lucene-solr's branch refs/heads/master from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=78d5cfe ]

        LUCENE-7141: switch OfflineSorter's ByteSequencesReader to BytesRefIterator

        jira-bot ASF subversion and git services added a comment -

        Commit c46d7686643e7503304cb35dfe546bce9c6684e7 in lucene-solr's branch refs/heads/branch_6x from Mike McCandless
        [ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=c46d768 ]

        LUCENE-7141: switch OfflineSorter's ByteSequencesReader to BytesRefIterator


          People

          • Assignee:
            mikemccand Michael McCandless
            Reporter:
            mikemccand Michael McCandless
          • Votes:
            0
            Watchers:
            3