Lucene - Core / LUCENE-454

lazily create SegmentMergeInfo.docMap

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.9
    • Component/s: None
    • Labels: None

    Description

      Since creating the docMap is expensive, and it's only used during segment merging, not searching, defer creation until it is requested.
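
The deferral described above can be illustrated with a minimal sketch. This is not the attached patch: the class name LazyDocMapSketch, its constructor, and the isDocMapBuilt() helper are hypothetical, and a BitSet of deleted document numbers stands in for the segment reader. It assumes the docMap maps each old document number to its number after deletions are squeezed out, with -1 marking a deleted document.

```java
import java.util.BitSet;

/** Hypothetical stand-in for SegmentMergeInfo; not the actual Lucene class. */
class LazyDocMapSketch {
    private final BitSet deleted; // deleted document numbers (assumed input)
    private final int maxDoc;     // number of document slots, deleted ones included
    private int[] docMap;         // stays null until a merge asks for it

    LazyDocMapSketch(BitSet deleted, int maxDoc) {
        this.deleted = deleted;
        this.maxDoc = maxDoc;
        // Deliberately no docMap work here: search-only callers never pay for it.
    }

    /** Builds the map on first request and reuses it afterwards. */
    int[] getDocMap() {
        if (docMap == null) {
            int[] map = new int[maxDoc];
            int next = 0;
            for (int i = 0; i < maxDoc; i++) {
                // Deleted documents get -1; live ones get consecutive new numbers.
                map[i] = deleted.get(i) ? -1 : next++;
            }
            docMap = map;
        }
        return docMap;
    }

    /** Illustration-only helper to observe whether the map has been built. */
    boolean isDocMapBuilt() {
        return docMap != null;
    }
}
```

With this shape, a term enumeration that only constructs the object never runs the O(maxDoc) loop; only a merge that calls getDocMap() does. The sketch is not thread-safe, which is assumed to be acceptable for a per-merge, single-threaded helper.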

      SegmentMergeInfo is also used in MultiTermEnum, the term enumerator for a MultiReader. TermEnum is used by queries such as PrefixQuery, RangeQuery, and WildcardQuery, as well as by RangeFilter, DateFilter, and the first sort on a field (which fills the FieldCache), so all of these currently pay the docMap creation cost at search time.

      Performance Results:
      A simple single-field index with 555,555 documents and 1,000 random deletions was queried 1,000 times with a PrefixQuery matching a single document.

      Performance Before Patch:
      indexing time = 121,656 ms
      querying time = 58,812 ms

      Performance After Patch:
      indexing time = 121,000 ms
      querying time = 598 ms

      Roughly a 100-fold increase in query performance (58,812 ms / 598 ms ≈ 98x)!

      All Lucene unit tests pass.

      Attachments

        1. docMap.txt
          2 kB
          Yonik Seeley
        2. docMap.txt
          2 kB
          Yonik Seeley

          People

            Assignee: Yonik Seeley (yseeley@gmail.com)
            Reporter: Yonik Seeley (yseeley@gmail.com)
            Votes: 6
            Watchers: 1
