Hadoop HDFS / HDFS-2186

DN volume failures on startup are not counted

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.23.0
    • Component/s: datanode
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Volume failures detected on startup are not currently counted or reported as such. For example, if you have configured 4 volumes and 2 tolerated failures, and you start a DN with two failed volumes, it will come up and report no failed volumes to the NN. The DN will then still tolerate 2 additional volume failures (i.e., it is OK with no valid volumes remaining). The intent of the volume failure toleration config value is that the DN should shut down if more than this number of the configured volumes have failed; therefore, volume failures detected on startup should count against this quota.
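
      The intended accounting can be sketched in a few lines of Java. This is only an illustration of the quota arithmetic described above, not the actual FSDataset code: the class name VolumeFailureQuota and all of its methods are hypothetical, and the sketch assumes the failure count is simply the number of configured volumes minus the number of volumes found valid at startup.

```java
/**
 * Hypothetical sketch of the volume failure quota (not Hadoop code).
 * Volumes that are already bad at startup count against the same
 * quota as volumes that fail while the DN is running.
 */
final class VolumeFailureQuota {
    private final int toleratedFailures;
    private int failedVolumes;

    VolumeFailureQuota(int configuredVolumes, int validVolumesAtStartup,
                       int toleratedFailures) {
        if (configuredVolumes < 0 || toleratedFailures < 0
                || validVolumesAtStartup < 0
                || validVolumesAtStartup > configuredVolumes) {
            throw new IllegalArgumentException("inconsistent volume counts");
        }
        this.toleratedFailures = toleratedFailures;
        // Startup failures are counted immediately instead of being ignored.
        this.failedVolumes = configuredVolumes - validVolumesAtStartup;
    }

    /** Records one more volume failing after startup. */
    void recordRuntimeFailure() {
        failedVolumes++;
    }

    /** Number of failed volumes that would be reported. */
    int failedVolumes() {
        return failedVolumes;
    }

    /** True once more volumes have failed than the configured tolerance. */
    boolean shouldShutDown() {
        return failedVolumes > toleratedFailures;
    }

    public static void main(String[] args) {
        // The scenario from the description: 4 configured volumes,
        // 2 tolerated failures, 2 volumes already failed at startup.
        VolumeFailureQuota quota = new VolumeFailureQuota(4, 2, 2);
        System.out.println("failed at startup: " + quota.failedVolumes());
        System.out.println("shut down now? " + quota.shouldShutDown());
        quota.recordRuntimeFailure();
        System.out.println("shut down after one more failure? " + quota.shouldShutDown());
    }
}
```

      With startup failures counted this way, the DN in the example still comes up (2 failures do not exceed a tolerance of 2), but a single additional failure exceeds the quota rather than being absorbed.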

      1. hdfs-2186-1.patch
        5 kB
        Eli Collins
      2. hdfs-2186-2.patch
        6 kB
        Eli Collins

        Activity

        Eli Collins added a comment -

        Patch attached. It considers any configured volume that is not a valid storage directory to be a failed volume. The new test asserts that a failed volume on startup is seen as such by the NN. This is now visible in the web UI "failed volumes" field, i.e., a DN that starts with 1 failed volume will show 1 failed volume in the web UI.
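
        As a rough illustration of the rule stated in this comment (every configured volume that is not a valid storage directory counts as failed), the following Java sketch counts failures over a list of configured directories. It is not the code from the patch: the class StartupVolumeScan, the method countFailedVolumes, and the validity predicate are hypothetical, and the check in main (directory exists and is readable and writable) is only an assumed stand-in for whatever the DN actually treats as a valid storage directory.

```java
import java.io.File;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch (not Hadoop code): count configured volumes that are unusable. */
final class StartupVolumeScan {

    /**
     * Counts the configured directories rejected by the validity check.
     * The predicate is passed in so the rule for "valid" stays pluggable.
     */
    static int countFailedVolumes(List<String> configuredDirs,
                                  Predicate<String> isValidStorageDir) {
        int failed = 0;
        for (String dir : configuredDirs) {
            if (!isValidStorageDir.test(dir)) {
                failed++;
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Assumed stand-in for a real storage directory check.
        Predicate<String> usable = dir -> {
            File f = new File(dir);
            return f.isDirectory() && f.canRead() && f.canWrite();
        };
        // Treat the command-line arguments as the configured volume paths.
        List<String> configured = Arrays.asList(args);
        System.out.println("failed volumes: " + countFailedVolumes(configured, usable));
    }
}
```

        Passing the validity check in as a predicate keeps the counting rule separate from the definition of a valid storage directory, which this sketch does not attempt to reproduce.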

        Eli Collins added a comment -
        +1 overall.  
        
            +1 @author.  The patch does not contain any @author tags.
        
            +1 tests included.  The patch appears to include 3 new or modified tests.
        
            +1 javadoc.  The javadoc tool did not generate any warning messages.
        
            +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
        
            +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.
        
            +1 release audit.  The applied patch does not increase the total number of release audit warnings.
        
            +1 system test framework.  The patch passed system test framework compile.
        
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12487404/hdfs-2186-1.patch
        against trunk revision 1156860.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 javac. The patch appears to cause tar ant target to fail.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:

        -1 contrib tests. The patch failed contrib unit tests.

        -1 system test framework. The patch failed system test framework compile.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1093//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/1093//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1093//console

        This message is automatically generated.

        Eli Collins added a comment -

        Patch rebased on trunk.

        Todd Lipcon added a comment -

        +1 once you get a clean test-patch

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12490213/hdfs-2186-2.patch
        against trunk revision 1156860.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these core unit tests:
        org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy
        org.apache.hadoop.hdfs.server.datanode.TestBlockRecovery
        org.apache.hadoop.hdfs.server.datanode.TestDataDirs
        org.apache.hadoop.hdfs.server.namenode.TestCheckpoint
        org.apache.hadoop.hdfs.server.namenode.TestGetImageServlet
        org.apache.hadoop.hdfs.server.namenode.TestINodeFile
        org.apache.hadoop.hdfs.server.namenode.TestNNLeaseRecovery
        org.apache.hadoop.hdfs.server.namenode.TestNNThroughputBenchmark
        org.apache.hadoop.hdfs.server.namenode.TestValidateConfigurationSettings
        org.apache.hadoop.hdfs.TestDFSRollback
        org.apache.hadoop.hdfs.TestHDFSServerPorts

        +1 contrib tests. The patch passed contrib unit tests.

        +1 system test framework. The patch passed system test framework compile.

        Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/1094//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/1094//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/1094//console

        This message is automatically generated.

        Eli Collins added a comment -

        I ran the tests locally; the only test failures are those from HDFS-2242. Thanks, Todd!

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #831 (See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/831/)
        HDFS-2186. DN volume failures on startup are not counted. Contributed by Eli Collins

        eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1156974
        Files :

        • /hadoop/common/trunk/hdfs/CHANGES.txt
        • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java
        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk-Commit #739 (See https://builds.apache.org/job/Hadoop-Common-trunk-Commit/739/)
        HDFS-2186. DN volume failures on startup are not counted. Contributed by Eli Collins

        eli : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1156974
        Files :

        • /hadoop/common/trunk/hdfs/CHANGES.txt
        • /hadoop/common/trunk/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/datanode/TestDataNodeVolumeFailureToleration.java
        • /hadoop/common/trunk/hdfs/src/java/org/apache/hadoop/hdfs/server/datanode/FSDataset.java

          People

          • Assignee: Eli Collins
          • Reporter: Eli Collins
          • Votes: 0
          • Watchers: 5
