Uploaded image for project: 'Ignite'
  1. Ignite
  2. IGNITE-19563

Deadlock on indexes creation with multiple tables creation

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • None
    • 3.0
    • None

    Description

      • Disruptor stripe took CP read lock and waits for index future (PK) to be completed. This future should be completed in configuration registry thread (see IndexManager#createIndexLocally);
      • Checkpoint thread is trying to take CP write lock;
      • Configuration registry thread is stuck on waiting for CP read lock as well, so it never will complete index future.

        

      "%sqllogic0%JRaft-FSMCaller-Disruptor-_stripe_6-0" #34 daemon prio=5 os_prio=0 cpu=46.88ms elapsed=26.72s tid=0x000002befd805000 nid=0x42e8 waiting on condition  [0x000000c8b92fe000]    java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base@13.0.1/Native Method) - parking to wait for  <0x000000060f6d08a0> (a java.util.concurrent.CompletableFuture$Signaller) at java.util.concurrent.locks.LockSupport.park(java.base@13.0.1/LockSupport.java:194) at java.util.concurrent.CompletableFuture$Signaller.block(java.base@13.0.1/CompletableFuture.java:1867) at java.util.concurrent.ForkJoinPool.managedBlock(java.base@13.0.1/ForkJoinPool.java:3137) at java.util.concurrent.CompletableFuture.waitingGet(java.base@13.0.1/CompletableFuture.java:1894) at java.util.concurrent.CompletableFuture.join(java.base@13.0.1/CompletableFuture.java:2114) at org.apache.ignite.internal.table.TableImpl.awaitIndexes(TableImpl.java:374) at org.apache.ignite.internal.table.TableImpl$1.get(TableImpl.java:232) at org.apache.ignite.internal.table.distributed.index.IndexUpdateHandler.buildIndex(IndexUpdateHandler.java:136) at org.apache.ignite.internal.table.distributed.raft.PartitionListener.lambda$handleBuildIndexCommand$14(PartitionListener.java:481) at org.apache.ignite.internal.table.distributed.raft.PartitionListener$$Lambda$2041/0x000000080151ac40.execute(Unknown Source) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.lambda$runConsistently$0(PersistentPageMemoryMvPartitionStorage.java:157) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage$$Lambda$1875/0x00000008014cf840.get(Unknown Source) at org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.busy(AbstractPageMemoryMvPartitionStorage.java:784) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.runConsistently(PersistentPageMemoryMvPartitionStorage.java:147) at org.apache.ignite.internal.table.distributed.raft.snapshot.outgoing.SnapshotAwarePartitionDataStorage.runConsistently(SnapshotAwarePartitionDataStorage.java:66) at org.apache.ignite.internal.table.distributed.raft.PartitionListener.handleBuildIndexCommand(PartitionListener.java:471) at org.apache.ignite.internal.table.distributed.raft.PartitionListener.lambda$onWrite$2(PartitionListener.java:187) at org.apache.ignite.internal.table.distributed.raft.PartitionListener$$Lambda$2037/0x0000000801518440.accept(Unknown Source) at java.util.Iterator.forEachRemaining(java.base@13.0.1/Iterator.java:133) at org.apache.ignite.internal.table.distributed.raft.PartitionListener.onWrite(PartitionListener.java:149) at org.apache.ignite.internal.raft.server.impl.JraftServerImpl$DelegatingStateMachine.onApply(JraftServerImpl.java:592) at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doApplyTasks(FSMCallerImpl.java:561) at org.apache.ignite.raft.jraft.core.FSMCallerImpl.doCommitted(FSMCallerImpl.java:529) at org.apache.ignite.raft.jraft.core.FSMCallerImpl.runApplyTask(FSMCallerImpl.java:448) at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:136) at org.apache.ignite.raft.jraft.core.FSMCallerImpl$ApplyTaskHandler.onEvent(FSMCallerImpl.java:130) at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:217) at org.apache.ignite.raft.jraft.disruptor.StripedDisruptor$StripeEntryHandler.onEvent(StripedDisruptor.java:181) at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:137) at java.lang.Thread.run(java.base@13.0.1/Thread.java:830)   
      
      "%sqllogic0%vault-3" #107 prio=5 os_prio=0 cpu=171.88ms elapsed=26.34s tid=0x000002befec1f000 nid=0x6f74 waiting on condition  [0x000000c8be8fd000]    java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base@13.0.1/Native Method) - parking to wait for  <0x000000060f4689b0> (a java.util.concurrent.CompletableFuture$Signaller) at java.util.concurrent.locks.LockSupport.park(java.base@13.0.1/LockSupport.java:194) at java.util.concurrent.CompletableFuture$Signaller.block(java.base@13.0.1/CompletableFuture.java:1867) at java.util.concurrent.ForkJoinPool.managedBlock(java.base@13.0.1/ForkJoinPool.java:3137) at java.util.concurrent.CompletableFuture.waitingGet(java.base@13.0.1/CompletableFuture.java:1894) at java.util.concurrent.CompletableFuture.get(java.base@13.0.1/CompletableFuture.java:2069) at org.apache.ignite.internal.util.IgniteUtils.getUninterruptibly(IgniteUtils.java:824) at org.apache.ignite.internal.pagememory.persistence.checkpoint.CheckpointTimeoutLock.checkpointReadLock(CheckpointTimeoutLock.java:163) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.lambda$runConsistently$0(PersistentPageMemoryMvPartitionStorage.java:152) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage$$Lambda$1875/0x00000008014cf840.get(Unknown Source) at org.apache.ignite.internal.storage.pagememory.mv.AbstractPageMemoryMvPartitionStorage.busy(AbstractPageMemoryMvPartitionStorage.java:784) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.runConsistently(PersistentPageMemoryMvPartitionStorage.java:147) at org.apache.ignite.internal.storage.pagememory.mv.PersistentPageMemoryMvPartitionStorage.getOrCreateHashIndex(PersistentPageMemoryMvPartitionStorage.java:303) at org.apache.ignite.internal.storage.pagememory.AbstractPageMemoryTableStorage.lambda$getOrCreateHashIndex$9(AbstractPageMemoryTableStorage.java:214) at org.apache.ignite.internal.storage.pagememory.AbstractPageMemoryTableStorage$$Lambda$2007/0x0000000801508840.get(Unknown Source) at org.apache.ignite.internal.util.IgniteUtils.inBusyLock(IgniteUtils.java:872) at org.apache.ignite.internal.storage.pagememory.AbstractPageMemoryTableStorage.busy(AbstractPageMemoryTableStorage.java:229) at org.apache.ignite.internal.storage.pagememory.AbstractPageMemoryTableStorage.getOrCreateHashIndex(AbstractPageMemoryTableStorage.java:207) at org.apache.ignite.internal.storage.engine.MvTableStorage.getOrCreateIndex(MvTableStorage.java:91) at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.startBuildIndex(PartitionReplicaListener.java:2476) at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener$1.lambda$onCreate$0(PartitionReplicaListener.java:2436) at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener$1$$Lambda$2006/0x0000000801508440.run(Unknown Source) at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener.inBusyLock(PartitionReplicaListener.java:2506) at org.apache.ignite.internal.table.distributed.replicator.PartitionReplicaListener$1.onCreate(PartitionReplicaListener.java:2432) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1$$Lambda$1755/0x000000080147bc40.apply(Unknown Source) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyPublicListeners(ConfigurationNotifier.java:488) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitNamedListNode(ConfigurationNotifier.java:206) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier$1.visitNamedListNode(ConfigurationNotifier.java:129) at org.apache.ignite.internal.schema.configuration.TablesNode.traverseChildren(Unknown Source) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyListeners(ConfigurationNotifier.java:129) at org.apache.ignite.internal.configuration.notifications.ConfigurationNotifier.notifyListeners(ConfigurationNotifier.java:91) at org.apache.ignite.internal.configuration.ConfigurationRegistry$3$1.visitInnerNode(ConfigurationRegistry.java:271) at org.apache.ignite.internal.configuration.ConfigurationRegistry$3$1.visitInnerNode(ConfigurationRegistry.java:254) at org.apache.ignite.internal.configuration.SuperRoot.traverseChildren(SuperRoot.java:105) at org.apache.ignite.internal.configuration.ConfigurationRegistry$3.onConfigurationUpdated(ConfigurationRegistry.java:254) at org.apache.ignite.internal.configuration.ConfigurationChanger$3.lambda$onEntriesChanged$1(ConfigurationChanger.java:658) at org.apache.ignite.internal.configuration.ConfigurationChanger$3$$Lambda$1747/0x0000000801479040.apply(Unknown Source) at java.util.concurrent.CompletableFuture$UniCompose.tryFire(java.base@13.0.1/CompletableFuture.java:1146) at java.util.concurrent.CompletableFuture.postComplete(java.base@13.0.1/CompletableFuture.java:506) at java.util.concurrent.CompletableFuture$AsyncRun.run(java.base@13.0.1/CompletableFuture.java:1813) at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@13.0.1/ThreadPoolExecutor.java:1128) at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@13.0.1/ThreadPoolExecutor.java:628) at java.lang.Thread.run(java.base@13.0.1/Thread.java:830)   
      
      "%sqllogic0%checkpoint-thread-1" #133 prio=5 os_prio=0 cpu=31.25ms elapsed=26.26s tid=0x000002befec3a800 nid=0x214c waiting on condition  [0x000000c8c03fe000]    java.lang.Thread.State: WAITING (parking) at jdk.internal.misc.Unsafe.park(java.base@13.0.1/Native Method) - parking to wait for  <0x00000006050084b0> (a java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync) at java.util.concurrent.locks.LockSupport.park(java.base@13.0.1/LockSupport.java:194) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(java.base@13.0.1/AbstractQueuedSynchronizer.java:885) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(java.base@13.0.1/AbstractQueuedSynchronizer.java:917) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(java.base@13.0.1/AbstractQueuedSynchronizer.java:1240) at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(java.base@13.0.1/ReentrantReadWriteLock.java:959) at org.apache.ignite.internal.pagememory.persistence.checkpoint.CheckpointReadWriteLock.writeLock(CheckpointReadWriteLock.java:129) at org.apache.ignite.internal.pagememory.persistence.checkpoint.CheckpointWorkflow.markCheckpointBegin(CheckpointWorkflow.java:209) at org.apache.ignite.internal.pagememory.persistence.checkpoint.Checkpointer.doCheckpoint(Checkpointer.java:280) at org.apache.ignite.internal.pagememory.persistence.checkpoint.Checkpointer.body(Checkpointer.java:205) at org.apache.ignite.internal.util.worker.IgniteWorker.run(IgniteWorker.java:108) at java.lang.Thread.run(java.base@13.0.1/Thread.java:830)

       

      Could be reproduced very fast on creation of multiple tables like it is described in IGNITE-19275 and with raft fsync turned off.

      Attachments

        Issue Links

          Activity

            People

              ibessonov Ivan Bessonov
              Denis Chudov Denis Chudov
              Kirill Tkalenko Kirill Tkalenko
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

              Dates

                Created:
                Updated:
                Resolved:

                Time Tracking

                  Estimated:
                  Original Estimate - Not Specified
                  Not Specified
                  Remaining:
                  Remaining Estimate - 0h
                  0h
                  Logged:
                  Time Spent - 20m
                  20m