Hadoop HDFS / HDFS-9865

TestBlockReplacement fails intermittently in trunk

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.7.1
    • Fix Version/s: 2.8.0, 2.7.3, 3.0.0-alpha1
    • Component/s: test
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Reviewed

      Description

      I found that the test case TestBlockReplacement sometimes fails during testing. When I looked at the unit test log, I always found the following:

      org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
      testDeletedBlockWhenAddBlockIsInEdit(org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement)  Time elapsed: 8.764 sec  <<< FAILURE!
      java.lang.AssertionError: The block should be only on 1 datanode  expected:<1> but was:<2>
      	at org.junit.Assert.fail(Assert.java:88)
      	at org.junit.Assert.failNotEquals(Assert.java:743)
      	at org.junit.Assert.assertEquals(Assert.java:118)
      	at org.junit.Assert.assertEquals(Assert.java:555)
      	at org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement.testDeletedBlockWhenAddBlockIsInEdit(TestBlockReplacement.java:436)
      

      Eventually I found the cause: the block is not completely deleted in testDeletedBlockWhenAddBlockIsInEdit, so the number of datanodes holding the block is incorrect. The time the test waits for FsDatasetAsyncDiskService to delete the block is not a reliable value.

      LOG.info("replaceBlock:  " + replaceBlock(block,
                (DatanodeInfo)sourceDnDesc, (DatanodeInfo)sourceDnDesc,
                (DatanodeInfo)destDnDesc));
      // Waiting for the FsDatasetAsyncDsikService to delete the block
      Thread.sleep(3000);
      

      When I reduce this wait to 1 second, the test always fails. The 3 seconds used in the test is not a reliable value either. We should change this logic to something more robust, such as waiting for the block to be replicated, as testDecommision does.
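
A minimal sketch of what such a dynamic wait could look like. This is not the code from the attached patches: the class PollingWait, its waitFor method, and the replica counter in main are hypothetical stand-ins, assuming only that the condition (here, "the block is on exactly 1 datanode") can be checked repeatedly until a deadline.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop: polls a condition instead of
// sleeping for a fixed time.
final class PollingWait {

  /**
   * Polls check every intervalMillis until it returns true.
   * Throws TimeoutException if timeoutMillis elapses first.
   */
  static void waitFor(BooleanSupplier check, long intervalMillis,
      long timeoutMillis) throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!check.getAsBoolean()) {
      // Compare by subtraction so the check is safe against nanoTime overflow.
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "Condition not met within " + timeoutMillis + " ms");
      }
      Thread.sleep(intervalMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for the number of datanodes holding the block.
    AtomicInteger replicas = new AtomicInteger(2);

    // Simulates an asynchronous deletion that finishes after some delay.
    Thread deleter = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      replicas.set(1);
    });
    deleter.start();

    // Returns as soon as the deletion is visible, or fails after 5 seconds.
    waitFor(() -> replicas.get() == 1, 50, 5000);
    deleter.join();
    System.out.println("replicas = " + replicas.get());
  }
}
```

With this shape the test waits only as long as the deletion actually takes, and a slow machine fails with an explicit timeout rather than a misleading assertion on the replica count.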

      1. HDFS-9865.001.patch
        2 kB
        Yiqun Lin
      2. HDFS-9865.002.patch
        2 kB
        Yiqun Lin

        Activity

        linyiqun Yiqun Lin added a comment -

        Attached an initial patch that replaces the fixed sleep with a dynamic wait. Kindly review, thanks.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 0m 10s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        +1 mvninstall 6m 51s trunk passed
        +1 compile 0m 37s trunk passed with JDK v1.8.0_72
        +1 compile 0m 39s trunk passed with JDK v1.7.0_95
        +1 checkstyle 0m 20s trunk passed
        +1 mvnsite 0m 50s trunk passed
        +1 mvneclipse 0m 13s trunk passed
        +1 findbugs 1m 53s trunk passed
        +1 javadoc 1m 3s trunk passed with JDK v1.8.0_72
        +1 javadoc 1m 45s trunk passed with JDK v1.7.0_95
        +1 mvninstall 0m 44s the patch passed
        +1 compile 0m 35s the patch passed with JDK v1.8.0_72
        +1 javac 0m 35s the patch passed
        +1 compile 0m 38s the patch passed with JDK v1.7.0_95
        +1 javac 0m 38s the patch passed
        +1 checkstyle 0m 18s the patch passed
        +1 mvnsite 0m 48s the patch passed
        +1 mvneclipse 0m 11s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 findbugs 2m 5s the patch passed
        +1 javadoc 0m 59s the patch passed with JDK v1.8.0_72
        +1 javadoc 1m 43s the patch passed with JDK v1.7.0_95
        -1 unit 52m 32s hadoop-hdfs in the patch failed with JDK v1.8.0_72.
        +1 unit 51m 12s hadoop-hdfs in the patch passed with JDK v1.7.0_95.
        +1 asflicense 0m 21s Patch does not generate ASF License warnings.
        128m 20s



        Reason Tests
        JDK v1.8.0_72 Failed junit tests hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12790114/HDFS-9865.001.patch
        JIRA Issue HDFS-9865
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux dc563ed9f8a2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / d7fdec1
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/14628/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_72.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/14628/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_72.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/14628/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/14628/console
        Powered by Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        linyiqun Yiqun Lin added a comment -

        The failed test is unrelated to this patch. Thanks for the review.

        iwasakims Masatake Iwasaki added a comment -

        Thanks for working on this, Yiqun Lin.

        int tries = 0;
        while (tries++ < 20) {
        

        for (int tries = 0; tries < 20; tries++) would be better, since it limits the scope of tries.

          // Triggering the incremental block report to report the deleted block
          // to namenode
          cluster.getDataNodes().get(0).triggerBlockReport(
              new BlockReportOptions.Factory().setIncremental(true).build());
        

        Can you replace triggerBlockReport with DataNodeTestUtils#triggerDeletionReport? Though triggerBlockReport was in the original code, sending just an incremental block report (IBR) is appropriate and faster.
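
A minimal sketch of the loop shape suggested above, with the counter scoped to the for statement. This is not the patch itself: BoundedRetry and retryUntil are hypothetical names, and the Runnable passed as trigger only stands in for a call such as DataNodeTestUtils#triggerDeletionReport.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop.
final class BoundedRetry {

  /**
   * Runs trigger, then check, up to maxTries times.
   * Returns true as soon as check passes, false if every try fails.
   */
  static boolean retryUntil(Runnable trigger, BooleanSupplier check,
      int maxTries, long sleepMillis) throws InterruptedException {
    // The counter is declared in the for statement, so it is not visible
    // after the loop.
    for (int tries = 0; tries < maxTries; tries++) {
      trigger.run();
      if (check.getAsBoolean()) {
        return true;
      }
      Thread.sleep(sleepMillis);
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    // Stand-in for "reports sent"; the check passes after the third report.
    int[] reportsSent = {0};
    boolean ok = retryUntil(
        () -> reportsSent[0]++,
        () -> reportsSent[0] >= 3,
        20, 10);
    System.out.println(ok + " after " + reportsSent[0] + " reports");
  }
}
```

Returning a boolean lets the caller assert on the outcome directly, so a block that is never deleted within the retry budget produces a clear failure.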

        linyiqun Yiqun Lin added a comment -

        Thanks for the review, Masatake Iwasaki. I have updated the patch to address the comments.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote Subsystem Runtime Comment
        0 reexec 11m 55s Docker mode activated.
        +1 @author 0m 0s The patch does not contain any @author tags.
        +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
        +1 mvninstall 6m 51s trunk passed
        +1 compile 0m 40s trunk passed with JDK v1.8.0_74
        +1 compile 0m 39s trunk passed with JDK v1.7.0_95
        +1 checkstyle 0m 20s trunk passed
        +1 mvnsite 0m 50s trunk passed
        +1 mvneclipse 0m 13s trunk passed
        +1 findbugs 1m 59s trunk passed
        +1 javadoc 1m 4s trunk passed with JDK v1.8.0_74
        +1 javadoc 1m 45s trunk passed with JDK v1.7.0_95
        +1 mvninstall 0m 44s the patch passed
        +1 compile 0m 35s the patch passed with JDK v1.8.0_74
        +1 javac 0m 35s the patch passed
        +1 compile 0m 39s the patch passed with JDK v1.7.0_95
        +1 javac 0m 39s the patch passed
        +1 checkstyle 0m 18s the patch passed
        +1 mvnsite 0m 50s the patch passed
        +1 mvneclipse 0m 11s the patch passed
        +1 whitespace 0m 0s Patch has no whitespace issues.
        +1 findbugs 2m 10s the patch passed
        +1 javadoc 1m 6s the patch passed with JDK v1.8.0_74
        +1 javadoc 1m 43s the patch passed with JDK v1.7.0_95
        -1 unit 64m 56s hadoop-hdfs in the patch failed with JDK v1.8.0_74.
        -1 unit 57m 27s hadoop-hdfs in the patch failed with JDK v1.7.0_95.
        +1 asflicense 0m 26s Patch does not generate ASF License warnings.
        159m 32s



        Reason Tests
        JDK v1.8.0_74 Failed junit tests hadoop.hdfs.TestFileAppend
          hadoop.hdfs.server.namenode.TestDecommissioningStatus
        JDK v1.8.0_74 Timed out junit tests org.apache.hadoop.hdfs.TestLeaseRecovery2
        JDK v1.7.0_95 Failed junit tests hadoop.hdfs.shortcircuit.TestShortCircuitCache
          hadoop.hdfs.server.datanode.TestDataNodeLifeline



        Subsystem Report/Notes
        Docker Image:yetus/hadoop:0ca8df7
        JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12791747/HDFS-9865.002.patch
        JIRA Issue HDFS-9865
        Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname Linux eb083ee11e24 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool maven
        Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision trunk / fd1c09b
        Default Java 1.7.0_95
        Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_74 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs v3.0.0
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/14733/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt
        unit https://builds.apache.org/job/PreCommit-HDFS-Build/14733/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        unit test logs https://builds.apache.org/job/PreCommit-HDFS-Build/14733/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_74.txt https://builds.apache.org/job/PreCommit-HDFS-Build/14733/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_95.txt
        JDK v1.7.0_95 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/14733/testReport/
        modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output https://builds.apache.org/job/PreCommit-HDFS-Build/14733/console
        Powered by Apache Yetus 0.3.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        iwasakims Masatake Iwasaki added a comment -

        +1. Committed to branch-2.7 and above. Thanks, Yiqun Lin.

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #9436 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9436/)
        HDFS-9865. TestBlockReplacement fails intermittently in trunk (Lin Yiqun (iwasakims: rev d718fc1ee5aee3628e105339ee3ea183b6242409)

        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java
        linyiqun Yiqun Lin added a comment -

        Thanks for committing, Masatake Iwasaki!

        vinodkv Vinod Kumar Vavilapalli added a comment -

        Closing the JIRA as part of 2.7.3 release.


          People

          • Assignee:
            linyiqun Yiqun Lin
            Reporter:
            linyiqun Yiqun Lin
          • Votes:
            0
            Watchers:
            6
