Lucene - Core / LUCENE-10519

Improvement for CloseableThreadLocal

Details

    • Type: Bug
    • Status: Open
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 8.9, 8.10.1, 8.11.1
    • Fix Version/s: None
    • Component/s: core/other
    • Environment: Elasticsearch v7.16.0, OpenJDK v11
    • Lucene Fields: New, Patch Available
    • Yes

    Description

      Problem


      org.apache.lucene.util.CloseableThreadLocal (which internally uses a ThreadLocal<WeakReference<T>>) may still have a flaw under G1GC. Each thread holds a single ThreadLocalMap, shared by all of that thread's ThreadLocals, and that master map only purges stale entries lazily. When we close a CloseableThreadLocal, we only clean up the current thread's entry; entries belonging to other threads are left to be reclaimed via their WeakReferences. Under G1GC, those WeakReferences may not be collected even after several rounds of mixed GC, so the ThreadLocalMap can grow very large, and it can take an arbitrarily long amount of CPU and time to walk the stale entries stored in it.
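      For reference, a simplified sketch of the structure described above (not the exact Lucene source; the class name is illustrative and details are condensed):

      import java.lang.ref.WeakReference;
      import java.util.Map;
      import java.util.WeakHashMap;

      // Pre-fix design: each value is stored twice, once behind a
      // ThreadLocal<WeakReference<T>> and once in a hard-reference map keyed by thread.
      public class CloseableThreadLocalSketch<T> implements AutoCloseable {

        private ThreadLocal<WeakReference<T>> t = new ThreadLocal<>();

        // Hard references keep values alive while their owning threads are alive.
        private Map<Thread, T> hardRefs = new WeakHashMap<>();

        public synchronized void set(T object) {
          t.set(new WeakReference<>(object));
          hardRefs.put(Thread.currentThread(), object);
        }

        public T get() {
          WeakReference<T> weakRef = t.get();
          return weakRef == null ? null : weakRef.get();
        }

        @Override
        public synchronized void close() {
          // Dropping hardRefs makes the values weakly reachable, but t.remove() only
          // cleans the *calling* thread's ThreadLocalMap. Stale WeakReference entries
          // in other threads' maps wait for GC plus ThreadLocalMap's lazy expunging,
          // which G1 may delay for a long time.
          hardRefs = null;
          if (t != null) {
            t.remove();
          }
          t = null;
        }
      }

      The hot-thread stack below shows exactly this cost: expungeStaleEntry dominating a ThreadLocal.remove() call on an unrelated lock-release path.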

      Hot thread output from Elasticsearch:

      ::: {xxxxxxxxx}{lCj7LcVnT328KHcJRd57yg}{WPiNCbk0R0SIKxg4-w3wew}{xxxxxxxx}{xxxxxxxx}
         Hot threads at 2020-04-12T05:25:10.224Z, interval=500ms, busiestThreads=3, ignoreIdleThreads=true:
         
         105.3% (526.5ms out of 500ms) cpu usage by thread 'elasticsearch[xxxxxxxx][bulk][T#31]'
           10/10 snapshots sharing following 34 elements
             java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:627)
             java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:509)
             java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:308)
             java.lang.ThreadLocal.remove(ThreadLocal.java:224)
             java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
             java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1349)
             java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
             org.elasticsearch.common.util.concurrent.ReleasableLock.close(ReleasableLock.java:49)
             org.elasticsearch.index.engine.InternalEngine.$closeResource(InternalEngine.java:356)
             org.elasticsearch.index.engine.InternalEngine.delete(InternalEngine.java:1272)
             org.elasticsearch.index.shard.IndexShard.delete(IndexShard.java:812)
             org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:779)
             org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnReplica(IndexShard.java:750)
             org.elasticsearch.action.bulk.TransportShardBulkAction.performOpOnReplica(TransportShardBulkAction.java:623)
             org.elasticsearch.action.bulk.TransportShardBulkAction.performOnReplica(TransportShardBulkAction.java:577) 

      Solution


      This bug does not reproduce under CMS, but it reproduces consistently under G1GC.

      In fact, CloseableThreadLocal does not need to store each value twice, in both hardRefs and the ThreadLocal. Removing the ThreadLocal from CloseableThreadLocal would make it immune to this serious flaw in Java's built-in ThreadLocal.
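      One possible shape of that change, shown only as a minimal sketch (the class name is illustrative and the actual patch may handle purging of dead threads differently):

      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      // Proposed direction: keep per-thread values only in a map keyed by Thread and
      // drop java.lang.ThreadLocal entirely, so no stale ThreadLocalMap entries are
      // ever created in any thread.
      public class MapBackedCloseableThreadLocal<T> implements AutoCloseable {

        // Single source of truth: one entry per thread that called set().
        private volatile Map<Thread, T> hardRefs = new ConcurrentHashMap<>();

        public void set(T object) {
          Map<Thread, T> refs = hardRefs;
          if (refs != null) {
            refs.put(Thread.currentThread(), object);
          }
        }

        public T get() {
          Map<Thread, T> refs = hardRefs;
          return refs == null ? null : refs.get(Thread.currentThread());
        }

        @Override
        public void close() {
          // Releases every thread's value at once; no per-thread cleanup is needed.
          hardRefs = null;
        }
      }

      The trade-off is that entries for threads that have died are only released on close(), so a real implementation would still need some purging step (for example, periodically removing entries whose key thread is no longer alive).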

      See also


      https://github.com/elastic/elasticsearch/issues/56766

      https://bugs.openjdk.java.net/browse/JDK-8182982

      https://discuss.elastic.co/t/indexing-performance-degrading-over-time/40229/44

       

      Attachments

        1. image-2022-04-27-16-40-34-796.png
          222 kB
          Lucifer Boice
        2. image-2022-04-27-16-40-58-056.png
          222 kB
          Lucifer Boice
        3. image-2022-04-27-16-41-55-264.png
          161 kB
          Lucifer Boice


            People

              Assignee: Unassigned
              Reporter: Lucifer Boice (boice)
              Votes: 0
              Watchers: 6


                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 5.5h