Hadoop HDFS / HDFS-9766

TestDataNodeMetrics#testDataNodeTimeSpend fails intermittently

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.8.0, 2.7.3, 3.0.0-alpha1
    • Component/s: test
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Stacktrace

      java.lang.AssertionError: null
      	at org.junit.Assert.fail(Assert.java:86)
      	at org.junit.Assert.assertTrue(Assert.java:41)
      	at org.junit.Assert.assertTrue(Assert.java:52)
      	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics.testDataNodeTimeSpend(TestDataNodeMetrics.java:289)
      

      See recent builds:

        Activity

        liuml07 Mingliang Liu added a comment -

        There is a 0.1-second sleep, which is misplaced, as discussed in HDFS-9139.

        I'm not sure 0.1 seconds is enough, as the test may still fail on a slow test machine.

        xiaochen Xiao Chen added a comment -

        Thanks Mingliang Liu for creating this and analyzing the cause. I hit the same failure in a precommit run for HDFS-9804.

        I think we can use a waitFor to fix the flakiness caused by the sleep. I also think we may need a better way than {{for (int x =0; x < 50; x++) }} to make sure the metrics have in fact increased.
        Attached patch 1 in this direction. Please see if it makes sense to you.
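
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of replacing a fixed sleep with a polling wait. It is not the attached patch: the `WaitForSketch` class, its `waitFor` helper, and the `AtomicLong` standing in for a DataNode metric are all hypothetical names made up for illustration, and the 100 ms poll interval is an assumption. The idea is to record the metric's value before the work starts, then poll until it has increased or a deadline passes.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

public class WaitForSketch {

  /**
   * Hypothetical helper (not Hadoop's implementation): polls the check every
   * checkEveryMillis and returns as soon as it is true, or throws
   * TimeoutException once waitForMillis has elapsed.
   */
  public static void waitFor(BooleanSupplier check, long checkEveryMillis,
      long waitForMillis) throws TimeoutException, InterruptedException {
    long deadline = System.nanoTime() + waitForMillis * 1_000_000L;
    while (true) {
      if (check.getAsBoolean()) {
        return;
      }
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException(
            "Condition not met within " + waitForMillis + " ms");
      }
      Thread.sleep(checkEveryMillis);
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a metric that a background component updates.
    AtomicLong metric = new AtomicLong(0);
    long before = metric.get();

    // Simulates slow work that eventually bumps the metric.
    Thread worker = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      metric.addAndGet(42);
    });
    worker.start();

    // Instead of sleeping a fixed 100 ms and asserting once, poll until the
    // metric has actually increased, giving up after 30 seconds.
    waitFor(() -> metric.get() > before, 100, 30_000);
    worker.join();
    System.out.println("metric increased from " + before + " to " + metric.get());
  }
}
```

On a fast machine the wait returns almost immediately after the metric changes; on a slow one it simply polls longer, and only a genuinely stuck metric produces a timeout.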

        hadoopqa Hadoop QA added a comment -
        -1 overall

        Vote  Subsystem   Runtime   Comment
        0     reexec      0m 11s    Docker mode activated.
        +1    @author     0m 0s     The patch does not contain any @author tags.
        +1    test4tests  0m 0s     The patch appears to include 1 new or modified test files.
        +1    mvninstall  7m 29s    trunk passed
        +1    compile     0m 42s    trunk passed with JDK v1.8.0_72
        +1    compile     0m 43s    trunk passed with JDK v1.7.0_95
        +1    checkstyle  0m 20s    trunk passed
        +1    mvnsite     0m 54s    trunk passed
        +1    mvneclipse  0m 14s    trunk passed
        +1    findbugs    2m 3s     trunk passed
        +1    javadoc     1m 7s     trunk passed with JDK v1.8.0_72
        +1    javadoc     1m 46s    trunk passed with JDK v1.7.0_95
        +1    mvninstall  0m 46s    the patch passed
        +1    compile     0m 38s    the patch passed with JDK v1.8.0_72
        +1    javac       0m 38s    the patch passed
        +1    compile     0m 41s    the patch passed with JDK v1.7.0_95
        +1    javac       0m 41s    the patch passed
        +1    checkstyle  0m 19s    the patch passed
        +1    mvnsite     0m 53s    the patch passed
        +1    mvneclipse  0m 12s    the patch passed
        +1    whitespace  0m 0s     Patch has no whitespace issues.
        +1    findbugs    2m 15s    the patch passed
        +1    javadoc     1m 6s     the patch passed with JDK v1.8.0_72
        +1    javadoc     1m 47s    the patch passed with JDK v1.7.0_95
        -1    unit        56m 28s   hadoop-hdfs in the patch failed with JDK v1.8.0_72.
        +1    unit        52m 55s   hadoop-hdfs in the patch passed with JDK v1.7.0_95.
        +1    asflicense  0m 26s    Patch does not generate ASF License warnings.
                          135m 57s



        Reason                            Tests
        JDK v1.8.0_72 Failed junit tests  hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
                                          hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness



        Subsystem                   Report/Notes
        Docker                      Image:yetus/hadoop:0ca8df7
        JIRA Patch URL              https://issues.apache.org/jira/secure/attachment/12789883/HDFS-9766.01.patch
        JIRA Issue                  HDFS-9766
        Optional Tests              asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname                       Linux 16af9b5e0d0d 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
        Build tool                  maven
        Personality                 /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision                trunk / 8808779
        Default Java                1.7.0_95
        Multi-JDK versions          /usr/lib/jvm/java-8-oracle:1.8.0_72 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_95
        findbugs                    v3.0.0
        unit                        https://builds.apache.org/job/PreCommit-HDFS-Build/14615/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_72.txt
        unit test logs              https://builds.apache.org/job/PreCommit-HDFS-Build/14615/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_72.txt
        JDK v1.7.0_95 Test Results  https://builds.apache.org/job/PreCommit-HDFS-Build/14615/testReport/
        modules                     C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output              https://builds.apache.org/job/PreCommit-HDFS-Build/14615/console
        Powered by                  Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        xiaochen Xiao Chen added a comment -

        The failed tests look unrelated. Please review. Thanks!

        liuml07 Mingliang Liu added a comment -

        Using waitFor is much better than the fixed-time sleep, which is fragile and error-prone. We expect the metrics to be updated within 30s; otherwise the test fails with a timeout exception.

        +1 (non-binding) to the current fix.

        xiaochen Xiao Chen added a comment -

        Thanks Mingliang Liu for the review and further explanation.
        FYI - Locally the test passes in around 5s on my box, so I think using a 30s timeout is reasonable.

        ajisakaa Akira Ajisaka added a comment -

        LGTM, +1.

        ajisakaa Akira Ajisaka added a comment -

        Committed this to branch-2.7 and above. Thanks Xiao Chen for the contribution and thanks Mingliang Liu for the review!

        hudson Hudson added a comment -

        FAILURE: Integrated in Hadoop-trunk-Commit #9409 (See https://builds.apache.org/job/Hadoop-trunk-Commit/9409/)
        HDFS-9766. TestDataNodeMetrics#testDataNodeTimeSpend fails (aajisaka: rev e2ddf824694eb4605f3bb04a9c26e4b98529f5bc)

        • hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
        • hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestDataNodeMetrics.java
        xiaochen Xiao Chen added a comment -

        Thanks Akira Ajisaka and Mingliang Liu.
        vinodkv Vinod Kumar Vavilapalli added a comment -

        Closing the JIRA as part of 2.7.3 release.


          People

          • Assignee: xiaochen Xiao Chen
          • Reporter: liuml07 Mingliang Liu
          • Votes: 0
          • Watchers: 7
