Lucene - Core / LUCENE-8623

Decrease I/O pressure when merging high dimensional points

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.7, 8.0, master (9.0)
    • Component/s: None
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      Related to LUCENE-8619: after indexing 60 million shapes (~1.65 billion triangles) using LatLonShape, the index directory grew to 265 GB while segments were being merged. Once the merges had finished, the index size was 57 GB.

      As an example, imagine we are merging several segments into a new segment of size 10 GB (4 dimensions). The BKD tree merging logic will create the following files:

      1) Level 0: 4 copies of the data, each one sorted by one dimension : 40 GB

      2) Level 1: 6 copies of half of the data, left and right : 30 GB

      3) Level 2: 6 copies of one quarter of the data, left and right : 15 GB

      4) Level 3: 6 more copies halving the previous level, left and right : 7.5 GB

      5) Level 4: 6 more copies halving the previous level, left and right : 3.75 GB

       

      and so on. In total, merging that segment requires around 100 GB of disk space.
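
      The arithmetic behind that estimate can be checked with a short sketch. It only models the numbers listed above (4 full copies at level 0, then 6 copies per level, each level holding half as much data as the previous one). The function name, parameters and defaults are hypothetical and are not part of Lucene's BKD code.

```python
def merge_temp_sizes_gb(segment_gb, dims=4, copies_per_level=6, levels=4):
    """Per-level on-disk sizes in GB for the example in this issue.

    All names and defaults are illustrative assumptions taken from the
    numbers in the description, not from Lucene's BKD writer.
    """
    # Level 0: one full copy of the data per dimension, each sorted by it.
    sizes = [dims * segment_gb]
    # Level k >= 1: copies_per_level copies, each holding 1/2^k of the data.
    for level in range(1, levels + 1):
        sizes.append(copies_per_level * segment_gb / 2 ** level)
    return sizes


sizes = merge_temp_sizes_gb(10.0)
listed_total = sum(sizes)

# The halving levels form a geometric series (1/2 + 1/4 + ... -> 1), so with
# unbounded depth the total approaches dims * size + copies_per_level * size.
limit_total = 4 * 10.0 + 6 * 10.0

print(sizes)                      # [40.0, 30.0, 15.0, 7.5, 3.75]
print(listed_total, limit_total)  # 96.25 100.0
```

      The five levels listed sum to 96.25 GB, and the series converges to 100 GB as more levels are added, which matches the "around 100 GB" figure.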

      This issue proposes delaying the creation of sorted copies until they are needed. This reduces the total size required to half of what is needed now.

        Attachments

        1. LUCENE-8623.patch
          12 kB
          Ignacio Vera
        2. LUCENE-8623.patch
          13 kB
          Ignacio Vera
        3. LUCENE-8623.patch
          14 kB
          Ignacio Vera
        4. Geo3D.png
          8 kB
          Ignacio Vera
        5. LatLonShape.png
          9 kB
          Ignacio Vera
        6. LatLonPoint.png
          8 kB
          Ignacio Vera
        7. LUCENE-8623.patch
          16 kB
          Ignacio Vera

            People

            • Assignee: Ignacio Vera (ivera)
            • Reporter: Ignacio Vera (ivera)