Lucene - Core / LUCENE-8551

Purge unused FieldInfo on segment merge

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: core/index
    • Labels:
      None
    • Lucene Fields:
      New

      Description

      If a field is effectively unused (no norms, terms index, term vectors, docValues, stored value, or points index), its FieldInfo nonetheless hangs around in FieldInfos indefinitely.  It would be nice to recognize an unused FieldInfo and let it disappear after a merge (or two).

      SegmentMerger merges the FieldInfo from each segment as nearly the first thing it does.  Only after that does it merge the different index parts, so the merged FieldInfos are built before it's known what's "used" or not.  After writing, we theoretically know which fields are used, though we don't do any bookkeeping to track it.  Maybe we should track the fields used during writing, so that we write a filtered merged FieldInfos at the end instead of an unfiltered one up front?  Or perhaps, upon reading a segment, we make it cheap and easy for each index type (e.g. terms index, stored fields, ...) to know which fields have data for that type.  Then, on a subsequent merge, we know up front how to filter the FieldInfos.
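
      To make the first idea concrete, here is a rough sketch of the bookkeeping it would need. It deliberately avoids Lucene's real classes: UsedFieldTracker, Part, markUsed, and filter are hypothetical names invented for illustration, and fields are represented by plain name strings rather than FieldInfo objects. It assumes each writer could report the fields it actually wrote data for.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for merge-time bookkeeping; not a Lucene class.
final class UsedFieldTracker {
    /** The kinds of per-field index data named in the description. */
    enum Part { NORMS, TERMS, TERM_VECTORS, DOC_VALUES, STORED, POINTS }

    // Field name -> parts for which some writer actually wrote data.
    private final Map<String, EnumSet<Part>> used = new LinkedHashMap<>();

    /** Assumed to be called by a writer whenever it writes data for a field. */
    void markUsed(String field, Part part) {
        used.computeIfAbsent(field, f -> EnumSet.noneOf(Part.class)).add(part);
    }

    /** A field counts as used if at least one part was recorded for it. */
    boolean isUsed(String field) {
        return used.containsKey(field);
    }

    /** Returns the merged field names minus those no writer touched, in order. */
    List<String> filter(List<String> mergedFieldNames) {
        List<String> kept = new ArrayList<>();
        for (String name : mergedFieldNames) {
            if (isUsed(name)) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

      In this sketch the filtered list would be computed only after all parts have been written, which matches the "filtered at the end" ordering suggested above. Whether the real merge code could defer writing FieldInfos that late is exactly the open question of this issue.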

            People

            • Assignee:
              Unassigned
            • Reporter:
              David Smiley (dsmiley)
            • Votes:
              1
            • Watchers:
              7
