Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
-
None
-
None
-
None
-
None
Description
During cancel of a test run, I observed this deadlock in the DatasetLifeCycleManager. It looks like the checkpoint thread is holding the optracker but needs the monitor on the DatasetLifeCycleManager, and the DatasetLifecycleManager needs the converse. This in turn, prevents clean shutdown.
"org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:2:0:0@5996" daemon prio=5 tid=0x74 nid=NA waiting for monitor entry
java.lang.Thread.State: BLOCKED
blocks org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995
waiting for org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995 to release lock on <0x17dc> (a org.apache.asterix.common.context.DatasetLifecycleManager)
at org.apache.asterix.common.context.DatasetLifecycleManager.allocateDatasetMemory(DatasetLifecycleManager.java:639)
at org.apache.asterix.common.context.PrimaryIndexOperationTracker.beforeOperation(PrimaryIndexOperationTracker.java:64)
at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.enterComponents(LSMHarness.java:180)
at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.getAndEnterComponents(LSMHarness.java:115)
- locked <0x17dd> (a org.apache.asterix.common.context.PrimaryIndexOperationTracker)
at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:333)
at org.apache.hyracks.storage.am.lsm.common.impls.LSMHarness.modify(LSMHarness.java:327)
at org.apache.hyracks.storage.am.lsm.common.impls.LSMTreeIndexAccessor.insert(LSMTreeIndexAccessor.java:50)
at org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.nextFrame(AsterixLSMInsertDeleteOperatorNodePushable.java:102)
at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:342)
at org.apache.hyracks.control.nc.Task.run(Task.java:290)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995" daemon prio=5 tid=0x77 nid=NA waiting for monitor entry
java.lang.Thread.State: BLOCKED
blocks Thread-55@5983
blocks org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:2:0:0@5996
waiting for org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:2:0:0@5996 to release lock on <0x17dd> (a org.apache.asterix.common.context.PrimaryIndexOperationTracker)
at org.apache.asterix.common.context.DatasetLifecycleManager.open(DatasetLifecycleManager.java:205)
- locked <0x17dc> (a org.apache.asterix.common.context.DatasetLifecycleManager)
at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.open(IndexDataflowHelper.java:116)
at org.apache.asterix.common.dataflow.AsterixLSMInsertDeleteOperatorNodePushable.open(AsterixLSMInsertDeleteOperatorNodePushable.java:61)
at org.apache.hyracks.control.nc.Task.pushFrames(Task.java:334)
at org.apache.hyracks.control.nc.Task.run(Task.java:290)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
"Thread-55@5983" prio=5 tid=0x5a nid=NA waiting for monitor entry
java.lang.Thread.State: BLOCKED
waiting for org.apache.hyracks.api.rewriter.runtime.SuperActivity:TAID:TID:ANID:ODID:0:0:3:0:0@5995 to release lock on <0x17dc> (a org.apache.asterix.common.context.DatasetLifecycleManager)
at org.apache.asterix.common.context.DatasetLifecycleManager.flushAllDatasets(DatasetLifecycleManager.java:474)
at org.apache.asterix.transaction.management.service.recovery.RecoveryManager.checkpoint(RecoveryManager.java:406)
- locked <0x17f5> (a org.apache.asterix.transaction.management.service.recovery.RecoveryManager)
at org.apache.asterix.hyracks.bootstrap.NCApplicationEntryPoint.stop(NCApplicationEntryPoint.java:132)
at org.apache.hyracks.control.nc.NodeControllerService.stop(NodeControllerService.java:347) - locked <0x17f7> (a org.apache.hyracks.control.nc.NodeControllerService)
at org.apache.hyracks.control.nc.NodeControllerService$JVMShutdownHook.run(NodeControllerService.java:588)
Attachments
Attachments
Issue Links
- incorporates
-
ASTERIXDB-1171 Deadlock between DatasetLifeCycleManager and PrimaryIndexOperationTracker
- Resolved