Lucene - Core / LUCENE-8619

Decrease I/O pressure of OfflineSorter

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Lucene Fields: New

      Description

      OfflineSorter is likely I/O bound, yet it doesn't really try to reduce I/O. For instance, it always writes each entry's length on 2 bytes, which is wasteful when used by BKDWriter since all byte[] arrays have exactly the same length. For LatLonPoint, this is a 25% space overhead that we could remove.
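
      To make the overhead concrete, here is a minimal, self-contained sketch that compares the two layouts. It does not use OfflineSorter's or BKDWriter's actual classes; the class and method names are hypothetical, and the 8-byte record size is an assumption chosen to match the 25% figure above (2 extra bytes per 8-byte value).

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: not Lucene's OfflineSorter file format code.
public class LengthPrefixOverhead {

  /** Writes every record preceded by its length stored on 2 bytes. */
  static byte[] writeWithLengthPrefix(byte[][] records) {
    int total = 0;
    for (byte[] record : records) {
      total += Short.BYTES + record.length;
    }
    ByteBuffer out = ByteBuffer.allocate(total);
    for (byte[] record : records) {
      out.putShort((short) record.length); // 2-byte length prefix
      out.put(record);
    }
    return out.array();
  }

  /** Writes records back to back; valid only if all have the same known length. */
  static byte[] writeFixedLength(byte[][] records, int recordLength) {
    ByteBuffer out = ByteBuffer.allocate(records.length * recordLength);
    for (byte[] record : records) {
      if (record.length != recordLength) {
        throw new IllegalArgumentException("expected " + recordLength + " bytes, got " + record.length);
      }
      out.put(record);
    }
    return out.array();
  }

  public static void main(String[] args) {
    // Assumption: 8-byte values, 1000 of them.
    byte[][] records = new byte[1000][8];
    long prefixed = writeWithLengthPrefix(records).length;
    long fixed = writeFixedLength(records, 8).length;
    long overheadPercent = (prefixed - fixed) * 100 / fixed;
    // Prints: prefixed=10000 fixed=8000 overhead=25%
    System.out.println("prefixed=" + prefixed + " fixed=" + fixed + " overhead=" + overheadPercent + "%");
  }
}
```

      The fixed-length layout only works when the reader knows the record length up front, which is the case described here since all byte[] arrays have the same length.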

      Doing lightweight compression on the fly might also help.

      As a data point, Ignacio told me that after indexing 60M shapes with LatLonShape (1.65B triangles), the index directory was about 265GB and dropped to 57GB when merging was over.

            People

            • Assignee: Unassigned
            • Reporter: Adrien Grand (jpountz)
            • Votes: 0
            • Watchers: 2
