Lucene - Core / LUCENE-7477

ExternalRefSorter should use OfflineSorter's actual writer for writing the input file

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Lucene Fields: New

    Description

      Consider this constructor in ExternalRefSorter:

        public ExternalRefSorter(OfflineSorter sorter) throws IOException {
          this.sorter = sorter;
          this.input = sorter.getDirectory().createTempOutput(sorter.getTempFileNamePrefix(), "RefSorterRaw", IOContext.DEFAULT);
          this.writer = new OfflineSorter.ByteSequencesWriter(this.input);
        }
      

      The problem is that the initial input file is always written with the default OfflineSorter.ByteSequencesWriter, but the OfflineSorter instance may be unable to read that file if it overrides getReader to use something other than the default.
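
To make the mismatch concrete, here is a minimal, self-contained sketch that uses only the JDK and no Lucene classes. The two on-disk formats are assumptions chosen for illustration (a 2-byte length prefix as the "default" and fixed-width records as the "override"); they are not claimed to match any real OfflineSorter subclass.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class WriterReaderMismatch {

  // Hypothetical "default" format: 2-byte length prefix, then the payload.
  static byte[] writeLengthPrefixed(List<String> entries) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (String entry : entries) {
        byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
        out.writeShort(payload.length);
        out.write(payload);
      }
    }
    return bytes.toByteArray();
  }

  // Reader that matches the hypothetical default format above.
  static List<String> readLengthPrefixed(byte[] file) throws IOException {
    List<String> result = new ArrayList<>();
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      while (in.available() > 0) {
        byte[] payload = new byte[in.readUnsignedShort()];
        in.readFully(payload);
        result.add(new String(payload, StandardCharsets.UTF_8));
      }
    }
    return result;
  }

  // Hypothetical overridden reader: expects fixed-width records with no prefix.
  static List<String> readFixedWidth(byte[] file, int width) {
    List<String> result = new ArrayList<>();
    for (int offset = 0; offset + width <= file.length; offset += width) {
      result.add(new String(file, offset, width, StandardCharsets.UTF_8));
    }
    return result;
  }

  public static void main(String[] args) throws IOException {
    List<String> entries = Arrays.asList("ab", "cd");
    byte[] file = writeLengthPrefixed(entries);
    // Matching reader recovers the entries; the mismatched one does not.
    System.out.println(readLengthPrefixed(file).equals(entries));
    System.out.println(readFixedWidth(file, 2).equals(entries));
  }
}
```

The file is only meaningful to a reader that agrees with the writer on the format, which is why the writer should come from the same sorter instance that will later read the file.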

      While this works now, I think it should be cleaned up. Ideally, OfflineSorter would generate its own temporary file and simply return the ByteSequencesWriter it chooses to use, so the above snippet would read:

        public ExternalRefSorter(OfflineSorter sorter) throws IOException {
          this.sorter = sorter;
          this.writer = sorter.newUnsortedPartition();
        }
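
A rough model of this ownership change, again using only the JDK: the sorter creates its own storage and hands back a writer that uses whatever encoding the sorter itself will later decode. All names here (ModelSorter, WideLengthSorter, ModelRefSorter, RecordWriter, encode, decode) are hypothetical stand-ins, and newUnsortedPartition is the method proposed in this issue, not an existing Lucene API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SorterOwnsPartitions {

  interface RecordWriter extends Closeable {
    void write(String entry) throws IOException;
  }

  // Hypothetical stand-in for OfflineSorter: it owns the storage and the format.
  static class ModelSorter {
    private final ByteArrayOutputStream partition = new ByteArrayOutputStream();

    // The sorter creates its own "temporary file" and returns the writer it chose.
    RecordWriter newUnsortedPartition() {
      DataOutputStream out = new DataOutputStream(partition);
      return new RecordWriter() {
        @Override public void write(String entry) throws IOException { encode(out, entry); }
        @Override public void close() throws IOException { out.close(); }
      };
    }

    protected void encode(DataOutputStream out, String entry) throws IOException {
      byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
      out.writeShort(payload.length);
      out.write(payload);
    }

    protected String decode(DataInputStream in) throws IOException {
      byte[] payload = new byte[in.readUnsignedShort()];
      in.readFully(payload);
      return new String(payload, StandardCharsets.UTF_8);
    }

    // Reads the partition back with the sorter's own decoder, then sorts in memory.
    List<String> sort() throws IOException {
      List<String> entries = new ArrayList<>();
      byte[] file = partition.toByteArray();
      try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
        while (in.available() > 0) {
          entries.add(decode(in));
        }
      }
      Collections.sort(entries);
      return entries;
    }
  }

  // A subclass with a different encoding (4-byte length prefix).
  static class WideLengthSorter extends ModelSorter {
    @Override protected void encode(DataOutputStream out, String entry) throws IOException {
      byte[] payload = entry.getBytes(StandardCharsets.UTF_8);
      out.writeInt(payload.length);
      out.write(payload);
    }

    @Override protected String decode(DataInputStream in) throws IOException {
      byte[] payload = new byte[in.readInt()];
      in.readFully(payload);
      return new String(payload, StandardCharsets.UTF_8);
    }
  }

  // Caller modeled on the proposed constructor: it never picks a writer itself.
  static class ModelRefSorter {
    private final ModelSorter sorter;
    private final RecordWriter writer;

    ModelRefSorter(ModelSorter sorter) {
      this.sorter = sorter;
      this.writer = sorter.newUnsortedPartition();
    }

    void add(String entry) throws IOException { writer.write(entry); }

    List<String> sorted() throws IOException {
      writer.close();
      return sorter.sort();
    }
  }

  public static void main(String[] args) throws IOException {
    for (ModelSorter sorter : new ModelSorter[] {new ModelSorter(), new WideLengthSorter()}) {
      ModelRefSorter refs = new ModelRefSorter(sorter);
      refs.add("b");
      refs.add("a");
      refs.add("c");
      System.out.println(refs.sorted());
    }
  }
}
```

Because the caller only ever sees the writer returned by the sorter, a subclass can change the encoding without the caller having to know about it.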
      

      This could also be extended so that OfflineSorter manages its own partitions, both sorted and unsorted. Then sort(String file) would simply become ByteSequenceIterator sort() (or even Stream<BytesRef> sort(), since Stream is conveniently AutoCloseable). If OfflineSorter implemented Closeable, it could also clean up any resources it opens in the directory passed to it. An additional bonus would be the ability to skip the final internal merge(1): if the sorter manages sorted and unsorted partitions, it could return an iterator that merges dynamically from multiple partitions.
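
The merging-iterator idea can be sketched as follows. This is an in-memory illustration under stated assumptions, not the attached patch: partitions are plain lists rather than files in a Directory, String stands in for BytesRef, and the class name and maxPartitionSize parameter are hypothetical.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.PriorityQueue;

public class PartitionedSorterSketch implements Closeable {

  // One read position inside a sorted partition.
  private static final class Cursor {
    final Iterator<String> rest;
    String head;

    Cursor(Iterator<String> entries) {
      this.rest = entries;
      this.head = entries.next(); // partitions are never empty, see flush()
    }
  }

  private final int maxPartitionSize;
  private final List<List<String>> sortedPartitions = new ArrayList<>();
  private List<String> unsorted = new ArrayList<>();

  public PartitionedSorterSketch(int maxPartitionSize) {
    this.maxPartitionSize = maxPartitionSize;
  }

  public void add(String entry) {
    unsorted.add(entry);
    if (unsorted.size() >= maxPartitionSize) {
      flush();
    }
  }

  // Sorts the current unsorted partition and keeps it as a sorted one.
  private void flush() {
    if (unsorted.isEmpty()) {
      return;
    }
    Collections.sort(unsorted);
    sortedPartitions.add(unsorted);
    unsorted = new ArrayList<>();
  }

  // Returns entries in order by merging lazily, without writing a final merged partition.
  public Iterator<String> sort() {
    flush();
    PriorityQueue<Cursor> heap =
        new PriorityQueue<>(Comparator.comparing((Cursor c) -> c.head));
    for (List<String> partition : sortedPartitions) {
      heap.add(new Cursor(partition.iterator()));
    }
    return new Iterator<String>() {
      @Override public boolean hasNext() {
        return !heap.isEmpty();
      }

      @Override public String next() {
        Cursor smallest = heap.poll();
        if (smallest == null) {
          throw new NoSuchElementException();
        }
        String result = smallest.head;
        if (smallest.rest.hasNext()) {
          smallest.head = smallest.rest.next();
          heap.add(smallest); // re-insert with its new head
        }
        return result;
      }
    };
  }

  // Stands in for deleting temporary files owned by the sorter.
  @Override public void close() {
    sortedPartitions.clear();
    unsorted.clear();
  }

  public static void main(String[] args) {
    try (PartitionedSorterSketch sorter = new PartitionedSorterSketch(2)) {
      for (String entry : new String[] {"d", "a", "c", "b", "e"}) {
        sorter.add(entry);
      }
      Iterator<String> sorted = sorter.sort();
      while (sorted.hasNext()) {
        System.out.println(sorted.next());
      }
    }
  }
}
```

With try-with-resources, the caller gets cleanup for free, and the heap-based iterator only ever holds one pending entry per partition.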

      Attachments

        1. LUCENE-7477.patch
          20 kB
          Dawid Weiss

          People

            dweiss Dawid Weiss
            dweiss Dawid Weiss
            Votes: 0
            Watchers: 2
