Uploaded image for project: 'Lucene - Core'
  1. Lucene - Core
  2. LUCENE-7963

Useless getAttribute() in DefaultIndexingChain causes performance drop

Details

    • Bug
    • Status: Resolved
    • Blocker
    • Resolution: Fixed
    • 7.0
    • 7.0
    • core/index
    • None
    • Hardware

      i3.16xlarge (Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz, 64 cores, two sockets)

      Software

      • Ubuntu 16.04 (kernel 4.4.0-1022-aws)
      • Java 1.8.0_144
    • New

    Description

      Some background: We have spotted a significant drop in indexing throughput on machines with a large number of cores (see https://github.com/elastic/elasticsearch/issues/26339 for details).

      We finally managed to isolate the behavior in a JMH "microbenchmark". Here is the output of running that benchmark with 32 threads with JMH's perfasm profiler (also Oracle Development Studio finds the same hotspot):

      ....[Hottest Region 1]..............................................................................
      C2, level 4, org.apache.lucene.index.DefaultIndexingChain$PerField::invert, version 1231 (792 bytes)
      
                                0x00007f52d56b026a: xor    %r10d,%r10d        ;*invokevirtual isAssignableFrom
                                                                              ; - org.apache.lucene.util.AttributeSource::addAttribute@28 (line 207)
                                                                              ; - org.apache.lucene.document.Field$BinaryTokenStream::<init>@8 (line 512)
                                                                              ; - org.apache.lucene.document.Field::tokenStream@82 (line 491)
                                                                              ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99 (line 729)
                                0x00007f52d56b026d: mov    $0x8,%r11d         ;*invokeinterface iterator
                                                                              ; - org.apache.lucene.util.AttributeSource::getCurrentState@46 (line 254)
                                                                              ; - org.apache.lucene.util.AttributeSource::clearAttributes@1 (line 269)
                                                                              ; - org.apache.lucene.document.Field$BinaryTokenStream::incrementToken@10 (line 532)
                                                                              ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@153 (line 736)
        0.00%    0.02%       ↗  0x00007f52d56b0273: test   %r10,%r10
                        ╭    │  0x00007f52d56b0276: je     0x00007f52d56b0292  ;*getfield fieldsData
                        │    │                                                ; - org.apache.lucene.document.Field::binaryValue@1 (line 441)
                        │    │                                                ; - org.apache.lucene.document.Field::tokenStream@65 (line 487)
                        │    │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99 (line 729)
        0.00%    0.00%  │    │  0x00007f52d56b0278: mov    (%r11),%rsi        ;*getfield next
                        │    │                                                ; - java.util.HashMap::getNode@137 (line 580)
                        │    │                                                ; - java.util.LinkedHashMap::get@6 (line 440)
                        │    │                                                ; - org.apache.lucene.util.AttributeSource::getAttribute@6 (line 245)
                        │    │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@143 (line 734)
        0.09%    0.51%  │    │  0x00007f52d56b027b: mov    0x18(%rsi),%r8
       23.70%    3.54%  │    │  0x00007f52d56b027f: mov    $0x7f4926a81d88,%rcx  ;   {metadata('org/apache/lucene/analysis/tokenattributes/CharTermAttribute')}
        0.00%    0.00%  │    │  0x00007f52d56b0289: cmp    %rcx,%r8
                        │    │  0x00007f52d56b028c: jne    0x00007f52d56b0949  ;*instanceof
                        │    │                                                ; - org.apache.lucene.document.Field::binaryValue@4 (line 441)
                        │    │                                                ; - org.apache.lucene.document.Field::tokenStream@65 (line 487)
                        │    │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99 (line 729)
        0.01%    0.00%  ↘    │  0x00007f52d56b0292: test   %r10,%r10
                             │  0x00007f52d56b0295: je     0x00007f52d56b1e0d  ;*invokevirtual addAttributeImpl
                             │                                                ; - org.apache.lucene.util.AttributeSource::addAttribute@80 (line 213)
                             │                                                ; - org.apache.lucene.document.Field$StringTokenStream::<init>@8 (line 550)
                             │                                                ; - org.apache.lucene.document.Field::tokenStream@47 (line 483)
                             │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@99 (line 729)
        0.01%    0.00%       │  0x00007f52d56b029b: mov    (%r11),%rsi        ;*invokevirtual hashCode
                             │                                                ; - java.util.HashMap::hash@9 (line 338)
                             │                                                ; - java.util.LinkedHashMap::get@2 (line 440)
                             │                                                ; - org.apache.lucene.util.AttributeSource::getAttribute@6 (line 245)
                             │                                                ; - org.apache.lucene.index.DefaultIndexingChain$PerField::invert@143 (line 734)
      

      You can see that a significant amount of time is spent in mov $0x7f4926a81d88,%rcx. It turns out that this maps to the following line in Java (DefaultIndexingChain$PerField::invert):

      CharTermAttribute termAtt = tokenStream.getAttribute(CharTermAttribute.class);
      

      Which is - luckily - unused (and removed by the attached patch).

      We have verified the impact of the change with an Elasticsearch macrobenchmark which indexes 165,346,692 documents (~ 74GB) with 24 clients concurrently. The reported numbers are median indexing throughput during that benchmark against a single Elasticsearch node:

      • Without the patch: 125418 documents/s
      • With the patch: 221237 documents/s

      Details about the benchmark setup etc. are mentioned in the Elasticsearch issue #26339 if you're interested.

      Unfortunately, it is beyond me why writing this register to main memory takes such a long time and why C2 did not eliminate this line as dead code to begin with.

      Attachments

        Issue Links

          Activity

            People

              uschindler Uwe Schindler
              dm Daniel Mitterdorfer
              Votes:
              0 Vote for this issue
              Watchers:
              6 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Slack

                  Issue deployment