Hadoop HDFS / HDFS-10816

TestComputeInvalidateWork#testDatanodeReRegistration fails due to race between test and replication monitor

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-alpha4, 2.8.2
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      java.lang.AssertionError: Expected invalidate blocks to be the number of DNs expected:<3> but was:<2>
      	at org.junit.Assert.fail(Assert.java:88)
      	at org.junit.Assert.failNotEquals(Assert.java:743)
      	at org.junit.Assert.assertEquals(Assert.java:118)
      	at org.junit.Assert.assertEquals(Assert.java:555)
      	at org.apache.hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork.testDatanodeReRegistration(TestComputeInvalidateWork.java:160)
      

      The test fails because of a race between the test and the replication monitor. The default replication monitor interval is 3 seconds, which is roughly how long the test normally takes to run. The test deletes a file and then acquires the namesystem write lock. If the replication monitor fires between those two steps, it invalidates one of the blocks itself, so the test sees 2 blocks pending invalidation instead of the expected 3 and fails. This is easy to reproduce by removing the sleep() in the ReplicationMonitor's run() method in BlockManager.java, so that the replication monitor runs as often as possible and widens the race window.

      The fix is to turn off the replication monitor for this test.
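
      The race can be illustrated outside Hadoop with a small, self-contained sketch. Nothing below is Hadoop code: MonitorRaceSketch, PendingInvalidations, and pendingAfterDelete are hypothetical names, the 1 ms tick and 50 ms pause are arbitrary, and the background task only stands in for the replication monitor. With the monitor disabled the pending count is always 3; with it enabled the count depends on timing.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MonitorRaceSketch {

    /** Hypothetical stand-in for the set of replicas awaiting invalidation. */
    static final class PendingInvalidations {
        private final Queue<String> replicas = new ConcurrentLinkedQueue<>();

        void add(String replica) { replicas.add(replica); }

        /** Processes at most one pending replica, as one monitor tick might. */
        boolean takeOne() { return replicas.poll() != null; }

        int size() { return replicas.size(); }
    }

    /**
     * Queues one replica per simulated datanode, waits briefly, and returns
     * how many replicas are still pending when the "test" looks at them.
     */
    public static int pendingAfterDelete(boolean monitorEnabled) {
        PendingInvalidations pending = new PendingInvalidations();
        ScheduledExecutorService monitor = null;
        if (monitorEnabled) {
            // Assumption: a very short interval, to exaggerate the race.
            monitor = Executors.newSingleThreadScheduledExecutor();
            monitor.scheduleWithFixedDelay(
                () -> pending.takeOne(), 0, 1, TimeUnit.MILLISECONDS);
        }
        try {
            // "Delete the file": one replica per datanode becomes pending.
            for (int dn = 0; dn < 3; dn++) {
                pending.add("blk_1@dn" + dn);
            }
            // Window between the delete and taking the lock in the real test.
            Thread.sleep(50);
            return pending.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } finally {
            if (monitor != null) {
                monitor.shutdownNow();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("monitor off: " + pendingAfterDelete(false));
        System.out.println("monitor on:  " + pendingAfterDelete(true));
    }
}
```

      Only the monitor-off case is deterministic, which is why disabling the background thread gives a more robust test than adjusting the timing.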

      Attachments

        1. HDFS-10816.001.patch
          0.9 kB
          Eric Badger
        2. HDFS-10816.002.patch
          0.9 kB
          Eric Badger
        3. HDFS-10816-branch-2.002.patch
          0.9 kB
          Eric Badger
        4. HDFS-10816.002.patch
          0.9 kB
          Kihwal Lee

            People

              Assignee: Eric Badger (ebadger)
              Reporter: Eric Badger (ebadger)