Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
The test TestPendingInvalidateBlock fails intermittently. The stack trace:
org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
testPendingDeletion(org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock)  Time elapsed: 7.703 sec  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<1>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeletion(TestPendingInvalidateBlock.java:92)
It looks like the invalidateBlock entry had already been removed before the test performed its check:
// restart NN
cluster.restartNameNode(true);
dfs.delete(foo, true);
Assert.assertEquals(0, cluster.getNamesystem().getBlocksTotal());
Assert.assertEquals(REPLICATION, cluster.getNamesystem()
    .getPendingDeletionBlocks());
Assert.assertEquals(REPLICATION, dfs.getPendingDeletionBlocksCount());
I looked into the related configuration and found that dfs.namenode.replication.interval is set to just 1 second in this test. If the delete operation is slow, the delay configured by dfs.namenode.startup.delay.block.deletion.sec can expire before the assertions run, and the pending deletion is then processed on the next replication interval, so the check sees fewer pending blocks than expected. The stack trace above is consistent with this: the failed test took 7.7 seconds, which is longer than 5 + 1 seconds.
One way to improve this:
- Increase the value of dfs.namenode.replication.interval
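The timing argument can be sketched as a small model. This is only an illustration, not code from the test: the class and method names are hypothetical, the 5-second startup delay is inferred from the "5 + 1" figure in the description, the 10-second interval is an arbitrary example value, and the 7.703 s elapsed time covers the whole test rather than only the time since the NameNode restart.

```java
public class PendingDeletionWindow {

    // Simplified model: pending deletions can first be processed once the
    // startup deletion delay has passed plus one replication interval.
    public static double earliestRemovalSec(long startupDelaySec,
                                            long replicationIntervalSec) {
        return startupDelaySec + replicationIntervalSec;
    }

    // True when the assertions may run after the blocks were already removed.
    public static boolean checkCanBeTooLate(double elapsedSec,
                                            long startupDelaySec,
                                            long replicationIntervalSec) {
        return elapsedSec > earliestRemovalSec(startupDelaySec,
                                               replicationIntervalSec);
    }

    public static void main(String[] args) {
        double elapsedSec = 7.703; // "Time elapsed" from the failing run

        // Interval of 1 second, as set in the test: 7.703 > 5 + 1
        System.out.println(checkCanBeTooLate(elapsedSec, 5, 1));  // true

        // Hypothetical larger interval of 10 seconds: 7.703 < 5 + 10
        System.out.println(checkCanBeTooLate(elapsedSec, 5, 10)); // false
    }
}
```

Under this model, a larger interval widens the window in which the assertions can still observe the pending deletions, at the cost of a slower test.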
Attachments
Issue Links
- is related to HDFS-10990 TestPendingInvalidateBlock should wait for IBRs (Resolved)