  1. Lucene - Core
  2. LUCENE-8619

Decrease I/O pressure of OfflineSorter

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Not A Problem
    • Lucene Fields: New

    Description

      OfflineSorter is likely I/O bound, yet it does little to reduce the amount of I/O it performs. For instance, it always encodes the length of each entry on 2 bytes, which is wasteful when used by BKDWriter since all byte[] arrays have exactly the same length. For LatLonPoint, this is a 25% space overhead that we could remove.
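
      To make the overhead concrete, here is a minimal sketch that encodes the same records with and without a 2-byte length prefix. It does not use OfflineSorter or BKDWriter; the class and method names are hypothetical, and the 8-byte record size is an assumption chosen to match the 25% figure (real BKDWriter records may carry additional bytes).

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: not Lucene code.
class LengthPrefixOverhead {

    /** Encodes each record as a 2-byte length followed by the record bytes. */
    static byte[] encodeLengthPrefixed(byte[][] records) {
        int size = 0;
        for (byte[] record : records) {
            size += Short.BYTES + record.length;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        for (byte[] record : records) {
            if (record.length > Short.MAX_VALUE) {
                throw new IllegalArgumentException("record too long for a 2-byte length");
            }
            buffer.putShort((short) record.length);
            buffer.put(record);
        }
        return buffer.array();
    }

    /** Encodes records back to back; the shared length is known to the reader up front. */
    static byte[] encodeFixedLength(byte[][] records, int recordLength) {
        ByteBuffer buffer = ByteBuffer.allocate(records.length * recordLength);
        for (byte[] record : records) {
            if (record.length != recordLength) {
                throw new IllegalArgumentException("all records must have length " + recordLength);
            }
            buffer.put(record);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        int recordLength = 8; // assumed payload size per record
        byte[][] records = new byte[1000][recordLength];
        int prefixed = encodeLengthPrefixed(records).length;
        int fixed = encodeFixedLength(records, recordLength).length;
        int overheadPercent = 100 * (prefixed - fixed) / fixed;
        System.out.println("prefixed=" + prefixed + " fixed=" + fixed
            + " overhead=" + overheadPercent + "%");
        // prints "prefixed=10000 fixed=8000 overhead=25%"
    }
}
```

      With fixed-length records, the reader only needs the record length once (for example in a file header) instead of once per entry.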

      Doing lightweight compression on the fly might also help.
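
      As one possible reading of "lightweight compression", the sketch below compresses a block of sorted fixed-length records with the JDK's Deflater at its fastest level before it would be written to disk. This is an assumption for illustration: the issue does not name a codec, and the class and method names here are hypothetical rather than part of Lucene.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical illustration only: not Lucene code.
class BlockCompressionSketch {

    /** Compresses one block, favoring speed over compression ratio. */
    static byte[] compress(byte[] block) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(block);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(chunk);
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    /** Restores a block whose uncompressed length was recorded by the writer. */
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            byte[] block = new byte[originalLength];
            int offset = 0;
            while (offset < originalLength && !inflater.finished()) {
                int n = inflater.inflate(block, offset, originalLength - offset);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalStateException("truncated or unsupported input");
                }
                offset += n;
            }
            if (offset != originalLength) {
                throw new IllegalStateException("unexpected uncompressed length");
            }
            return block;
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inflater.end();
        }
    }

    /** Builds a block of sorted 8-byte big-endian values, standing in for sorted records. */
    static byte[] sortedBlock(int count) {
        ByteBuffer buffer = ByteBuffer.allocate(count * Long.BYTES);
        for (long value = 0; value < count; value++) {
            buffer.putLong(value);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        byte[] block = sortedBlock(1000);
        byte[] compressed = compress(block);
        System.out.println("raw=" + block.length + " compressed=" + compressed.length);
    }
}
```

      Sorted records tend to share long prefixes, which is why even a fast, low-ratio codec can shrink them; whether the CPU cost pays off depends on how I/O bound the sort actually is.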

      As a data point, Ignacio told me that after indexing 60M shapes with LatLonShape (1.65B triangles), the index directory was about 265GB and dropped to 57GB when merging was over.

          People

            Assignee: Unassigned
            Reporter: Adrien Grand (jpountz)