Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-7696

Deadlock at GridDhtAtomicCache.lockEntries called through GridDhtAtomicCache.updateAllAsyncInternal

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Open
    • Major
    • Resolution: Unresolved
    • 2.3
    • None
    • cache
    • None
      • Ignite 2.3
      • OpenJDK version "1.8.0_151"
      • Linux 4.4.0

    Description

      We observed that all nodes in a cluster completely stalls and put/get/remove operations to a cache block for ever. When it happens, we can see following log in thread dump:

      2018-02-14_04:21:33.84410 Found one Java-level deadlock:
      2018-02-14_04:21:33.84410 =============================
      2018-02-14_04:21:33.84411 "sys-#41%IgniteManager%":
      2018-02-14_04:21:33.84411   waiting to lock monitor 0x00007f6d5e41a558 (object 0x0000000781083ef0, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry),
      2018-02-14_04:21:33.84411   which is held by "sys-stripe-5-#6%IgniteManager%"
      2018-02-14_04:21:33.84412 "sys-stripe-5-#6%IgniteManager%":
      2018-02-14_04:21:33.84412   waiting to lock monitor 0x00007f6d5e41de68 (object 0x0000000781083e70, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
      2018-02-14_04:21:33.84412   in JNI, which is held by "sys-stripe-2-#3%IgniteManager%"
      2018-02-14_04:21:33.84412 "sys-stripe-2-#3%IgniteManager%":
      2018-02-14_04:21:33.84413   waiting to lock monitor 0x00007f6d5e41a558 (object 0x0000000781083ef0, a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
      2018-02-14_04:21:33.84413   in JNI, which is held by "sys-stripe-5-#6%IgniteManager%"
      2018-02-14_04:21:33.84413
      2018-02-14_04:21:33.84414 Java stack information for the threads listed above:
      2018-02-14_04:21:33.84414 ===================================================
      2018-02-14_04:21:33.84416 "sys-#41%IgniteManager%":
      2018-02-14_04:21:33.84416       at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.markObsoleteVersion(GridCacheMapEntry.java:2153)
      2018-02-14_04:21:33.84417       - waiting to lock <0x0000000781083ef0> (a org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheEntry)
      2018-02-14_04:21:33.84417       at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.removeVersionedEntry(GridDhtLocalPartition.java:368)
      2018-02-14_04:21:33.84418       at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtLocalPartition.cleanupRemoveQueue(GridDhtLocalPartition.java:392)
      2018-02-14_04:21:33.84418       at org.apache.ignite.internal.processors.cache.GridCacheProcessor$RemovedItemsCleanupTask$1.run(GridCacheProcessor.java:4051)
      2018-02-14_04:21:33.84418       at org.apache.ignite.internal.util.IgniteUtils.wrapThreadLoader(IgniteUtils.java:6687)
      2018-02-14_04:21:33.84419       at org.apache.ignite.internal.processors.closure.GridClosureProcessor$1.body(GridClosureProcessor.java:827)
      2018-02-14_04:21:33.84419       at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
      2018-02-14_04:21:33.84419       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      2018-02-14_04:21:33.84420       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      2018-02-14_04:21:33.84421       at java.lang.Thread.run(Thread.java:748)
      2018-02-14_04:21:33.84421 "sys-stripe-5-#6%IgniteManager%":
      2018-02-14_04:21:33.84421       at sun.misc.Unsafe.monitorEnter(Native Method)
      2018-02-14_04:21:33.84421       at org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
      2018-02-14_04:21:33.84422       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
      2018-02-14_04:21:33.84422       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
      2018-02-14_04:21:33.84423       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
      2018-02-14_04:21:33.84423       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
      2018-02-14_04:21:33.84424       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
      2018-02-14_04:21:33.84424       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
      2018-02-14_04:21:33.84425       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
      2018-02-14_04:21:33.84425       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
      2018-02-14_04:21:33.84425       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
      2018-02-14_04:21:33.84426       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
      2018-02-14_04:21:33.84426       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
      2018-02-14_04:21:33.84426       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
      2018-02-14_04:21:33.84427       at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
      2018-02-14_04:21:33.84427       at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
      2018-02-14_04:21:33.84427       at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
      2018-02-14_04:21:33.84428       at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
      2018-02-14_04:21:33.84429       at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
      2018-02-14_04:21:33.84429       at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
      2018-02-14_04:21:33.84429       at java.lang.Thread.run(Thread.java:748)
      2018-02-14_04:21:33.84429 "sys-stripe-2-#3%IgniteManager%":
      2018-02-14_04:21:33.84430       at sun.misc.Unsafe.monitorEnter(Native Method)
      2018-02-14_04:21:33.84430       at org.apache.ignite.internal.util.GridUnsafe.monitorEnter(GridUnsafe.java:1207)
      2018-02-14_04:21:33.84430       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2848)
      2018-02-14_04:21:33.84431       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1707)
      2018-02-14_04:21:33.84431       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1629)
      2018-02-14_04:21:33.84431       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3056)
      2018-02-14_04:21:33.84433       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:131)
      2018-02-14_04:21:33.84433       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:267)
      2018-02-14_04:21:33.84433       at org.apache.ignite.internal.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$5.apply(GridDhtAtomicCache.java:262)
      2018-02-14_04:21:33.84434       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
      2018-02-14_04:21:33.84434       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
      2018-02-14_04:21:33.84434       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
      2018-02-14_04:21:33.84435       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
      2018-02-14_04:21:33.84435       at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
      2018-02-14_04:21:33.84436       at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
      2018-02-14_04:21:33.84437       at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1555)
      2018-02-14_04:21:33.84437       at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1183)
      2018-02-14_04:21:33.84437       at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
      2018-02-14_04:21:33.84438       at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1090)
      2018-02-14_04:21:33.84438       at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
      2018-02-14_04:21:33.84438       at java.lang.Thread.run(Thread.java:748)
      2018-02-14_04:21:33.84439
      2018-02-14_04:21:33.84439 Found 1 deadlock.
      

      This indicates that GridDhtAtomicCache.lockEntries (threads sys-stripe-5-... and sys-stripe-7-...) is causing deadlock, and GridCacheMapEntry.markObsoleteVersion (sys-#41...) is involved in it:

      • Thread "sys-stripe-5-..." locked 0x...3ef0, waits for 0x...3e70
      • Thread "sys-stripe-7-..." locked 0x...3e70, waits for 0x...3ef0
      • Thread "sys-#41..." waits for 0x...3ef0

       

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              frsyuki Sadayuki Furuhashi
              Votes:
              1 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: