LUCENE-8635 (Lucene - Core)

Lazy loading Lucene FST offheap using mmap

    Details

    • Type: New Feature
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 8.0, 8.x, master (9.0)
    • Component/s: core/FSTs
    • Labels:
      None
    • Environment:

      I used the setup below for the es_rally tests:

      a single i3.xlarge node running ES 6.5

      es_rally ran on a separate i3.xlarge instance

    • Lucene Fields:
      New, Patch Available
    • Review Patch?:
      Yes

      Description

      Currently, the FST loads all of its terms into heap memory when the index is opened. This causes frequent JVM OOM errors when the terms data grows large. A better approach would be to load the FST lazily using mmap, which ensures that only the required terms get loaded into memory.
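
      To illustrate the difference between the two loading strategies, here is a minimal sketch using only the JDK's java.nio API. It is not Lucene code: the class and method names (MmapLazyReadSketch, loadOnHeap, mapOffHeap) are hypothetical, and the file is a plain byte array standing in for serialized FST data.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration (not Lucene code): heap loading versus mmap-backed access. */
class MmapLazyReadSketch {

    /** Eager approach: copies the whole file onto the JVM heap up front. */
    static byte[] loadOnHeap(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    /** Lazy approach: maps the file; the OS pages bytes in only when they are touched. */
    static MappedByteBuffer mapOffHeap(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for serialized FST bytes: 1 MiB with one marker byte.
        Path file = Files.createTempFile("fst-sketch", ".bin");
        file.toFile().deleteOnExit();
        byte[] data = new byte[1 << 20];
        data[123_456] = 42;
        Files.write(file, data);

        MappedByteBuffer mapped = mapOffHeap(file);
        // An absolute get touches only the page that contains this offset.
        System.out.println(mapped.get(123_456));
    }
}
```

      With the mapped variant, the JVM heap holds only a small buffer object; the file contents live in the OS page cache and can be evicted under memory pressure.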

       
      Lucene can expose an API for specifying the list of fields whose terms should be loaded offheap. I'm planning to take the following approach:

      1. Add a boolean property fstOffHeap to FieldInfo
      2. Pass the list of offheap fields to Lucene when the index is opened (ALL can be a special keyword for loading all fields offheap)
      3. Initialize the fstOffHeap property when the Lucene index is opened
      4. FieldReader invokes either the default FST constructor or the OffHeap constructor, depending on the fstOffHeap property
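
      The four steps above can be sketched roughly as follows. This is only an illustration of the control flow, not the patch itself: FieldInfoSketch, initField, chooseLoader, and the Loader enum are hypothetical stand-ins, and the real FieldInfo, FieldReader, and FST constructors in Lucene have different signatures.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of the proposed per-field offheap selection (not Lucene's API). */
class OffHeapFieldSelectionSketch {

    /** Special keyword from step 2: load every field offheap. */
    static final String ALL = "ALL";

    /** Which constructor path a reader would take (step 4). */
    enum Loader { ON_HEAP, OFF_HEAP }

    /** Step 1: a stand-in for FieldInfo carrying the boolean fstOffHeap property. */
    static final class FieldInfoSketch {
        final String name;
        final boolean fstOffHeap;

        FieldInfoSketch(String name, boolean fstOffHeap) {
            this.name = name;
            this.fstOffHeap = fstOffHeap;
        }
    }

    /** Steps 2 and 3: derive the flag from the field list supplied at index open. */
    static FieldInfoSketch initField(String name, Set<String> offHeapFields) {
        boolean offHeap = offHeapFields.contains(ALL) || offHeapFields.contains(name);
        return new FieldInfoSketch(name, offHeap);
    }

    /** Step 4: the reader picks the loading strategy based on the flag. */
    static Loader chooseLoader(FieldInfoSketch field) {
        return field.fstOffHeap ? Loader.OFF_HEAP : Loader.ON_HEAP;
    }

    public static void main(String[] args) {
        Set<String> offHeapFields = new HashSet<>(Arrays.asList("body"));
        for (String name : Arrays.asList("id", "body")) {
            FieldInfoSketch field = initField(name, offHeapFields);
            System.out.println(name + " -> " + chooseLoader(field));
        }
    }
}
```

      Keeping the decision in a single per-field flag means the reader does not need to know how the field list was configured; it only checks the flag when constructing the FST.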

       
      I created a patch (that loads all fields offheap) and ran some benchmarks using es_rally; the results look good.

        Attachments

        1. fst-offheap-rev.patch
          35 kB
          Mike Sokolov
        2. optional_offheap_ra.patch
          18 kB
          Ankit Jain
        3. fst-offheap-ra-rev.patch
          34 kB
          Mike Sokolov
        4. ra.patch
          17 kB
          Mike Sokolov
        5. rally_benchmark.xlsx
          44 kB
          Ankit Jain
        6. offheap.patch
          17 kB
          Ankit Jain

              People

              • Assignee:
                Unassigned
                Reporter:
                akjain Ankit Jain
              • Votes:
                0
                Watchers:
                12
