  1. Lucene - Core
  2. LUCENE-7391

MemoryIndexReader.fields() performance regression

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 6.2
    • Component/s: None
    • Labels: None
    • Lucene Fields: New, Patch Available

      Description

      While upgrading our codebase from Lucene 4 to Lucene 6, we found a significant performance regression: a 5x slowdown.

      Profiling the code shows the method MemoryIndexReader.fields() as one of the hottest methods.

      The method just creates a copy of the inner fields Map before passing it to MemoryFields. It makes the copy so that it can filter out fields with numTokens <= 0.
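
      To make the cost concrete, here is a minimal sketch of the copy-and-filter pattern described above. It is not the Lucene source: Info and CopyingFieldsSketch are hypothetical stand-ins, and the only detail taken from this report is the numTokens <= 0 filter.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for the per-field state; only numTokens matters here.
final class Info {
    final int numTokens;

    Info(int numTokens) {
        this.numTokens = numTokens;
    }
}

final class CopyingFieldsSketch {
    // Allocates and populates a new sorted map on every call,
    // keeping only the fields that have at least one token.
    static SortedMap<String, Info> copyAndFilter(SortedMap<String, Info> fields) {
        SortedMap<String, Info> filtered = new TreeMap<>();
        for (Map.Entry<String, Info> e : fields.entrySet()) {
            if (e.getValue().numTokens > 0) {
                filtered.put(e.getKey(), e.getValue());
            }
        }
        return filtered;
    }
}
```

      If a method like this runs on every fields() call in a hot path, the per-call allocation and re-insertion work adds up, which would be consistent with what the profiler shows.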

      The simplest "fix" would be to remove the copying of the map completely and pass fields directly to MemoryFields. That is simple and removes any slowdown caused by this method. It does potentially change behaviour, but none of the unit tests seem to cover that behaviour, so I wonder whether the filtering is necessary (I looked at the original ticket, LUCENE-7091, that introduced this code, and couldn't find much in the way of an explanation). I'm going to attach a patch to this effect anyway and we can take things from there.
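
      For discussion, the behaviour change mentioned above could in principle be avoided by filtering on access instead of copying. The sketch below is only an illustration of that idea and is not the attached patch; FieldInfoStub and LazyFilterSketch are hypothetical names, and the real MemoryFields API is not modelled.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;

// Hypothetical stand-in for the per-field state; only numTokens matters here.
final class FieldInfoStub {
    final int numTokens;

    FieldInfoStub(int numTokens) {
        this.numTokens = numTokens;
    }
}

final class LazyFilterSketch {
    // Iterates over the original map without copying it,
    // skipping fields with no tokens as the iterator advances.
    static Iterator<String> nonEmptyFieldNames(SortedMap<String, FieldInfoStub> fields) {
        return fields.entrySet().stream()
            .filter(e -> e.getValue().numTokens > 0)
            .map(Map.Entry::getKey)
            .iterator();
    }

    // Single-field lookup: a field with numTokens <= 0 is treated as absent.
    static FieldInfoStub lookup(SortedMap<String, FieldInfoStub> fields, String name) {
        FieldInfoStub info = fields.get(name);
        return (info == null || info.numTokens <= 0) ? null : info;
    }
}
```

      Passing the map straight through without any such check is simpler still, but callers would then see fields with numTokens <= 0, which is the potential behaviour change in question.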

        Attachments

        1. LUCENE-7391.patch
          3 kB
          Steve Mason
        2. LUCENE-7391-test.patch
          2 kB
          Steve Mason
        3. LUCENE-7391.patch
          1 kB
          Steve Mason

            People

            • Assignee: David Smiley (dsmiley)
            • Reporter: Steve Mason (spmason)
            • Votes: 0
            • Watchers: 6