Details
-
Bug
-
Status: Resolved
-
Critical
-
Resolution: Fixed
-
None
Description
Surefire fork Intermittently timeouts in TestDecommissionAndMaintenance.
Container DB added to cache:
2023-05-03 08:18:26,909 [EndpointStateMachine task thread for /0.0.0.0:43723 - 0 ] INFO utils.DatanodeStoreCache (DatanodeStoreCache.java:addDB(58)) - Added db /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db to cache
but then not found and tried to open again:
2023-05-03 08:18:57,086 [Command processor thread] ERROR utils.DatanodeStoreCache (DatanodeStoreCache.java:getDB(74)) - Failed to get DB store /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db java.io.IOException: Failed init RocksDB, db path : /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db, exception :org.rocksdb.RocksDBException lock hold by current process, acquire time 1683101936 acquiring thread 139985634854656: /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db/LOCK: No locks available at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:182) at org.apache.hadoop.hdds.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:212) at org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.start(AbstractDatanodeStore.java:147) at org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.<init>(AbstractDatanodeStore.java:99) at org.apache.hadoop.ozone.container.metadata.DatanodeStoreSchemaThreeImpl.<init>(DatanodeStoreSchemaThreeImpl.java:66) at org.apache.hadoop.ozone.container.common.utils.DatanodeStoreCache.getDB(DatanodeStoreCache.java:69) at org.apache.hadoop.ozone.container.keyvalue.helpers.BlockUtils.getDB(BlockUtils.java:132) at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.flushAndSyncDB(KeyValueContainer.java:444) at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.closeAndFlushIfNeeded(KeyValueContainer.java:385) at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.quasiClose(KeyValueContainer.java:355) at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.quasiCloseContainer(KeyValueHandler.java:1121) at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.quasiCloseContainer(ContainerController.java:142) at org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.notifyGroupRemove(ContainerStateMachine.java:1052) at org.apache.ratis.server.impl.RaftServerImpl.groupRemove(RaftServerImpl.java:423) at org.apache.ratis.server.impl.RaftServerProxy.lambda$groupRemoveAsync$12(RaftServerProxy.java:530) at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) at java.util.concurrent.CompletableFuture.uniApplyStage(CompletableFuture.java:628) at java.util.concurrent.CompletableFuture.thenApply(CompletableFuture.java:1996) at org.apache.ratis.server.impl.RaftServerProxy.groupRemoveAsync(RaftServerProxy.java:529) at org.apache.ratis.server.impl.RaftServerProxy.groupManagementAsync(RaftServerProxy.java:479) at org.apache.ratis.server.impl.RaftServerProxy.groupManagement(RaftServerProxy.java:459) at org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.removeGroup(XceiverServerRatis.java:822) at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.ClosePipelineCommandHandler.handle(ClosePipelineCommandHandler.java:77) at org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:99) at org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$3(DatanodeStateMachine.java:644) at java.lang.Thread.run(Thread.java:750)
This continues until the fork is killed:
2023-05-03 08:33:24,505 [Command processor thread] ERROR utils.DatanodeStoreCache (DatanodeStoreCache.java:getDB(74)) - Failed to get DB store /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db java.io.IOException: Failed init RocksDB, db path : /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db, exception :org.rocksdb.RocksDBException lock hold by current process, acquire time 1683101936 acquiring thread 139985634854656: /home/runner/work/ozone/ozone/hadoop-ozone/integration-test/target/test-dir/MiniOzoneClusterImpl-ff176d5b-bea5-4cbe-a997-8236a6853a89/datanode-0/data-0/containers/hdds/ff176d5b-bea5-4cbe-a997-8236a6853a89/DS-4328e108-8c1a-4a6f-8bff-6f686dd50b24/container.db/LOCK: No locks available at org.apache.hadoop.hdds.utils.db.RDBStore.<init>(RDBStore.java:182) at org.apache.hadoop.hdds.utils.db.DBStoreBuilder.build(DBStoreBuilder.java:212) at org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.start(AbstractDatanodeStore.java:147) at org.apache.hadoop.ozone.container.metadata.AbstractDatanodeStore.<init>(AbstractDatanodeStore.java:99) at org.apache.hadoop.ozone.container.metadata.DatanodeStoreSchemaThreeImpl.<init>(DatanodeStoreSchemaThreeImpl.java:66) at org.apache.hadoop.ozone.container.common.utils.DatanodeStoreCache.getDB(DatanodeStoreCache.java:69) at org.apache.hadoop.ozone.container.keyvalue.helpers.BlockUtils.getDB(BlockUtils.java:132) at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.flushAndSyncDB(KeyValueContainer.java:444) at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.closeAndFlushIfNeeded(KeyValueContainer.java:385) at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.quasiClose(KeyValueContainer.java:355)
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/04/21/21757/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/04/24/21800/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/04/24/21805/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/04/27/21885/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/04/27/21895/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/04/28/21927/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/05/03/21994/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
- https://github.com/adoroszlai/ozone-build-results/blob/master/2023/05/03/21995/it-flaky/hadoop-ozone/integration-test/org.apache.hadoop.ozone.scm.node.TestDecommissionAndMaintenance-output.txt
Attachments
Issue Links
- links to