  1. Hadoop HDFS
  2. HDFS-11378

Verify multiple DataNodes can be decommissioned/maintenance at the same time

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.9.0, 3.0.0-alpha4
    • Component/s: hdfs
    • Labels:
      None

      Description

      DecommissionManager is capable of transitioning multiple DataNodes to Decommission/Maintenance states. The current tests under TestDecommission and TestMaintenanceState request Decommission/Maintenance for only one DataNode. It would be better to simulate real-world cases in which multiple DataNodes are taken out of service, and to verify the resulting block replication factor for the files.
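
      As a rough illustration of the property such a test checks, the sketch below models block placement in memory and counts the replicas that remain on in-service nodes when several DataNodes are taken out at once. It is a toy model under stated assumptions, not HDFS code: the class OutOfServiceSketch, the method liveReplicas, and the node and block names are all hypothetical and do not come from the patch.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical toy model; it does not use any HDFS or MiniDFSCluster API.
class OutOfServiceSketch {

  // For each block, count the replicas hosted on nodes that stay in service.
  static Map<String, Integer> liveReplicas(
      Map<String, Set<String>> blockToNodes, Set<String> outOfService) {
    Map<String, Integer> result = new TreeMap<>();
    for (Map.Entry<String, Set<String>> entry : blockToNodes.entrySet()) {
      int live = 0;
      for (String node : entry.getValue()) {
        if (!outOfService.contains(node)) {
          live++;
        }
      }
      result.put(entry.getKey(), live);
    }
    return result;
  }

  public static void main(String[] args) {
    // Two blocks, each placed on three of four made-up DataNodes.
    Map<String, Set<String>> blockToNodes = new LinkedHashMap<>();
    blockToNodes.put("blk_1", new HashSet<>(Arrays.asList("dn1", "dn2", "dn3")));
    blockToNodes.put("blk_2", new HashSet<>(Arrays.asList("dn2", "dn3", "dn4")));

    // Take two nodes out of service in a single request.
    Set<String> outOfService = new HashSet<>(Arrays.asList("dn1", "dn2"));

    System.out.println(liveReplicas(blockToNodes, outOfService));
  }
}
```

      A real test would compare counts like these against the expected replication factor once the cluster has finished the state transition, rather than computing them from a static map.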

      1. HDFS-11378.01.patch
        14 kB
        Manoj Govindassamy
      2. HDFS-11378-branch-2.fix.01.patch
        1 kB
        Manoj Govindassamy

        Activity

        kasha Karthik Kambatla added a comment -

        +1

        Just committed the addendum to branch-2. Thanks again Manoj Govindassamy for the quick turnaround.

        kasha Karthik Kambatla added a comment -

        Thanks for the quick turnaround, Manoj. I can take care of the addendum shortly.

        manojg Manoj Govindassamy added a comment - - edited

        Sorry for the inconvenience, Karthik Kambatla. Local variables accessed from within an inner class need to be declared final for java7, and my bad, I didn't attach a separate patch for branch-2 here. Attaching a fix for the test code compilation failure in branch-2.

        Xiao Chen, Ming Ma, can you please verify and commit the addendum fix patch for branch-2? If this needs more time, we can also revert the commit 34f9ceab4a53007bba485b51fbd909dae5198148 in branch-2.

        kasha Karthik Kambatla added a comment -

        This seems to break branch-2 compilation as follows (tried java8 and java7):

        [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:testCompile (default-testCompile) on project hadoop-hdfs: Compilation failure: Compilation failure:
        [ERROR] /Users/kasha/code/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java:[1135,52] local variable repl is accessed from within inner class; needs to be declared final
        [ERROR] /Users/kasha/code/hadoop/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java:[1136,15] local variable decomDataNodes is accessed from within inner class; needs to be declared final
        

        Manoj Govindassamy, Ming Ma - can we fix/revert it at the earliest please?
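
        For readers unfamiliar with this error: under Java 7 source rules, a local variable captured by an anonymous inner class must be declared final. The sketch below shows the general shape of such a fix. It is an illustration only, not the contents of the addendum patch: the class FinalLocalsSketch, the Check interface, and buildCheck are hypothetical, and only the variable names repl and decomDataNodes are taken from the compiler output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example; not the actual TestDecommission code.
class FinalLocalsSketch {

  // Small stand-in for whatever callback type a test might implement.
  interface Check {
    String describe();
  }

  static Check buildCheck(int replicas, List<String> nodes) {
    // Declaring the captured locals final satisfies the Java 7 rule that
    // produced "needs to be declared final" in the build log.
    final int repl = replicas;
    final List<String> decomDataNodes = new ArrayList<>(nodes);

    return new Check() {
      @Override
      public String describe() {
        // Both locals are read from inside the anonymous inner class.
        return "repl=" + repl + ", nodes=" + decomDataNodes.size();
      }
    };
  }

  public static void main(String[] args) {
    System.out.println(buildCheck(3, Arrays.asList("dn1", "dn2")).describe());
  }
}
```

        Java 8 source level accepts effectively final locals without the keyword, so the failure under a java8 JDK suggests that branch-2 compiled at Java 7 source level; that is an inference from the log, not something stated in the thread.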

        hudson Hudson added a comment -

        SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11187 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11187/)
        HDFS-11378. Verify multiple DataNodes can be decommissioned/maintenance (mingma: rev 312b36d113d83640b92c62fdd91ede74bd04c00f)

        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestMaintenanceState.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/AdminStatesBaseTest.java
        • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDecommission.java
        mingma Ming Ma added a comment -

        +1. Thanks Manoj Govindassamy for the contribution. I have committed it to trunk and branch-2.

        manojg Manoj Govindassamy added a comment -

        Thanks for the review, Ming Ma. Sure, putting multiple DNs into maintenance and decommission in parallel is also an interesting case. I am also planning a few more corner cases, such as min replication not being met when N-1 or N-2 of the nodes are asked for Maintenance/Decom and DNs need to be recommissioned for the other ones to go IN_MAINTENANCE. I will add these tests in the next batch. Thanks.

        mingma Ming Ma added a comment -

        The patch LGTM. Thanks Manoj Govindassamy for the useful test cases! We might want to add test cases that put some nodes into decommission and other nodes into maintenance at the same time. But that can be done in a separate jira unless it is your intention to do it here.

        hadoopqa Hadoop QA added a comment -
        -1 overall



        Vote  Subsystem    Runtime    Comment
        0     reexec       0m 14s     Docker mode activated.
        +1    @author      0m 0s      The patch does not contain any @author tags.
        +1    test4tests   0m 0s      The patch appears to include 3 new or modified test files.
        +1    mvninstall   16m 32s    trunk passed
        +1    compile      1m 3s      trunk passed
        +1    checkstyle   0m 38s     trunk passed
        +1    mvnsite      1m 12s     trunk passed
        +1    mvneclipse   0m 17s     trunk passed
        +1    findbugs     2m 17s     trunk passed
        +1    javadoc      0m 50s     trunk passed
        +1    mvninstall   0m 59s     the patch passed
        +1    compile      0m 56s     the patch passed
        +1    javac        0m 56s     the patch passed
        +1    checkstyle   0m 28s     the patch passed
        +1    mvnsite      1m 0s      the patch passed
        +1    mvneclipse   0m 13s     the patch passed
        +1    whitespace   0m 0s      The patch has no whitespace issues.
        +1    findbugs     2m 21s     the patch passed
        +1    javadoc      0m 45s     the patch passed
        -1    unit         80m 6s     hadoop-hdfs in the patch failed.
        +1    asflicense   0m 23s     The patch does not generate ASF License warnings.
                           111m 56s



        Reason               Tests
        Failed junit tests   hadoop.hdfs.TestAclsEndToEnd



        Subsystem        Report/Notes
        Docker           Image:yetus/hadoop:a9ad5d6
        JIRA Issue       HDFS-11378
        JIRA Patch URL   https://issues.apache.org/jira/secure/attachment/12849639/HDFS-11378.01.patch
        Optional Tests   asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
        uname            Linux 4f4554786fa9 3.13.0-105-generic #152-Ubuntu SMP Fri Dec 2 15:37:11 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
        Build tool       maven
        Personality      /testptch/hadoop/patchprocess/precommit/personality/provided.sh
        git revision     trunk / 7bc333a
        Default Java     1.8.0_121
        findbugs         v3.0.0
        unit             https://builds.apache.org/job/PreCommit-HDFS-Build/18281/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
        Test Results     https://builds.apache.org/job/PreCommit-HDFS-Build/18281/testReport/
        modules          C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
        Console output   https://builds.apache.org/job/PreCommit-HDFS-Build/18281/console
        Powered by       Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

        This message was automatically generated.

        manojg Manoj Govindassamy added a comment -

        Attached the v01 patch, which adds the following:

        • A new method takeNodeOutofService in AdminStatesBaseTest that takes a list of DataNodeUUIDs for decommission/maintenance operations
        • A new test in TestDecommission that invokes takeNodeOutofService with multiple DNs at the same time and verifies the replication factor after decommission
        • A new test in TestMaintenanceState that invokes takeNodeOutofService with multiple DNs at the same time and verifies the replication factor before and after the maintenance operation

        Lei (Eddy) Xu, Ming Ma, can you please take a look at the patch? Thanks.


          People

          • Assignee:
            manojg Manoj Govindassamy
            Reporter:
            manojg Manoj Govindassamy
          • Votes:
            0
            Watchers:
            5

            Dates

            • Created:
              Updated:
              Resolved:
