  1. Lucene - Core
  2. LUCENE-7391

MemoryIndexReader.fields() performance regression


    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 6.0
    • Fix Version/s: 6.2
    • Component/s: None
    • Labels: None
    • Lucene Fields: New, Patch Available


      While upgrading our codebase from Lucene 4 to Lucene 6, we found a significant performance regression: a 5x slowdown.

      Profiling the code shows MemoryIndexReader.fields() as one of the hottest methods.

      The method creates a copy of the inner fields Map before passing it to MemoryFields. It makes the copy so that it can filter out fields with numTokens <= 0.
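      To make the cost concrete, here is a minimal sketch of the copy-and-filter pattern described above. It is not the Lucene source: the class name CopyingFieldsSketch, the Info stand-in, and the filteredCopy method are all hypothetical, and the use of a TreeMap is an assumption made only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

final class CopyingFieldsSketch {
    /** Hypothetical stand-in for the per-field state; only numTokens matters here. */
    static final class Info {
        final int numTokens;
        Info(int numTokens) { this.numTokens = numTokens; }
    }

    /**
     * Builds a brand-new filtered map on every call. With n fields this
     * allocates a map and performs up to n sorted insertions each time.
     */
    static Map<String, Info> filteredCopy(Map<String, Info> fields) {
        Map<String, Info> copy = new TreeMap<>();
        for (Map.Entry<String, Info> e : fields.entrySet()) {
            if (e.getValue().numTokens > 0) { // drop fields with numTokens <= 0
                copy.put(e.getKey(), e.getValue());
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        Map<String, Info> fields = new TreeMap<>();
        fields.put("body", new Info(12));
        fields.put("empty", new Info(0));
        System.out.println(filteredCopy(fields).keySet()); // prints [body]
    }
}
```

      If a caller invokes such a method once per term or per document, the copying work is repeated every time, which would be consistent with the method showing up as hot in a profile.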

      The simplest "fix" would be to remove the copying of the map completely and pass fields directly to MemoryFields. This is simple and removes any slowdown caused by this method. It does potentially change behaviour, but none of the unit tests seem to cover that behaviour, so I wonder whether the filtering is necessary. (I looked at the original ticket, LUCENE-7091, which introduced this code, but couldn't find much in the way of an explanation.) I'm going to attach a patch to this effect anyway and we can take things from there.
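      As a rough illustration of the behaviour change in question, the sketch below hands the live map over without copying. It is not the attached patch: DirectFieldsSketch, Info, and direct are hypothetical names, and whether exposing zero-token fields actually matters to MemoryFields callers is exactly the open question raised above.

```java
import java.util.Map;
import java.util.TreeMap;

final class DirectFieldsSketch {
    /** Hypothetical stand-in for the per-field state; only numTokens matters here. */
    static final class Info {
        final int numTokens;
        Info(int numTokens) { this.numTokens = numTokens; }
    }

    /** No copy and no filtering: returns the same map instance in constant time. */
    static Map<String, Info> direct(Map<String, Info> fields) {
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Info> fields = new TreeMap<>();
        fields.put("body", new Info(12));
        fields.put("empty", new Info(0));
        // The zero-token field is now visible to whoever consumes the map.
        System.out.println(direct(fields).keySet()); // prints [body, empty]
    }
}
```

      The call is constant-time, but any consumer that relied on empty fields being hidden would need to skip them itself.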


        1. LUCENE-7391-test.patch (2 kB, Steve Mason)
        2. LUCENE-7391.patch (1 kB, Steve Mason)
        3. LUCENE-7391.patch (3 kB, Steve Mason)



            Assignee: David Smiley (dsmiley)
            Reporter: Steve Mason (spmason)