Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 2.9.0, 3.0.0-alpha2
    • Component/s: None
    • Labels: None
    1. HDFS-9391.01.patch
      23 kB
      Manoj Govindassamy
    2. HDFS-9391.02.patch
      23 kB
      Manoj Govindassamy
    3. HDFS-9391.03.patch
      23 kB
      Manoj Govindassamy
    4. HDFS-9391.04.patch
      21 kB
      Manoj Govindassamy
    5. HDFS-9391-branch-2.01.patch
      21 kB
      Manoj Govindassamy
    6. HDFS-9391-branch-2.02.patch
      20 kB
      Manoj Govindassamy
    7. HDFS-9391-branch-2-MaintenanceMode-WebUI.pdf
      648 kB
      Manoj Govindassamy
    8. HDFS-9391-MaintenanceMode-WebUI.pdf
      557 kB
      Manoj Govindassamy
    9. Maintenance webUI.png
      49 kB
      Ming Ma

      Activity

      manojg Manoj Govindassamy added a comment -

      Taking up this task after discussing with Lei (Eddy) Xu

      manojg Manoj Govindassamy added a comment -

      Ming Ma,

      I am trying to extend NameNodeMXBean to include maintenance nodes and have a few questions about it.
      After HDFS-9390, there is a new concept of OutOfService nodes, which includes all of the nodes in any of the following states:

      1. DECOMMISSIONING
      2. DECOMMISSIONED
      3. MAINTENANCE_NOT_FOR_READ
      4. MAINTENANCE_FOR_READ
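      For illustration, the grouping can be sketched with a simplified stand-in; the enum and helper below are assumptions for this sketch, not the actual HDFS AdminStates or StoredReplicaState types:

        // Simplified stand-in for the states listed above; not the real HDFS enum.
        enum NodeStateSketch {
          IN_SERVICE, DECOMMISSIONING, DECOMMISSIONED,
          MAINTENANCE_NOT_FOR_READ, MAINTENANCE_FOR_READ
        }

        class OutOfServiceSketch {
          // "OutOfService" groups every decommission or maintenance state.
          static boolean isOutOfService(NodeStateSketch s) {
            return s != NodeStateSketch.IN_SERVICE;
          }

          public static void main(String[] args) {
            System.out.println(isOutOfService(NodeStateSketch.MAINTENANCE_FOR_READ)); // true
            System.out.println(isOutOfService(NodeStateSketch.IN_SERVICE));           // false
          }
        }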

      1. FSNameSystem#getDecomNodes constructs a map of all currently Decommissioning nodes, but includes all OutOfServiceReplicas for each of them in their value attributes map. Shouldn't the DecomNodes include only replicas for DECOMMISSION_INPROGRESS nodes?

      2. Just like FSNameSystem#getDecomNodes, should we also have FSNameSystem#getMaintenanceNodes, which returns info about EnteringMaintenanceNodes?

      3. Anything else you have in mind for NameNodeMXBean and tests w.r.t. showing Maintenance node details?

      Any help is much appreciated.

      manojg Manoj Govindassamy added a comment -

      Ming Ma, Lei (Eddy) Xu,

      4. In the NameNode UI, under the Summary page, the Live Nodes and Dead Nodes counts currently also show Decommissioned node details. I am assuming the "In Maintenance" Live/Dead node counts also need to be shown along with the Decommission counts? Please confirm.

      5. Is there a plan to expose the concept of 'OutOfService' (which includes both Decommissioned and Maintenance nodes) in JMX and the UI?

      mingma Ming Ma added a comment -

      Thanks Manoj Govindassamy!

      Shouldn't the DecomNodes include only replicas for DECOMMISSION_INPROGRESS nodes?

      Good point. The question is what value the "decommissionOnlyReplicas" property should have in the context of maintenance mode. A specific example: if a block has 3 replicas, with one node entering maintenance and the other two being decommissioned, should it be included in "decommissionOnlyReplicas"? Given we normally use the property as a risk indicator, e.g. what if all decommissioning or entering-maintenance nodes fail, it seems ok to include both. There is a backward-compatibility question about the semantics here; you can argue it is ok given the behavior is the same without maintenance. If we really want to separately account for all-3-replicas-being-decommissioned, we can keep the strict semantics and add a new property "outOfServiceOnlyReplicas" to account for both types. To enable that, we will need to track each type separately in LeavingServiceStatus.

      should we also have FSNameSystem#getMaintenanceNodes?

      Yes, something like NameNodeMXBean#getEnteringMaintenanceNodes will be useful.

      w.r.t showing Maintenance nodes details ?

      getDeadNodes only returns the decommissioned case. You can add ".put("adminState", node.getAdminState().toString())" to the JSON to cover maintenance. You can also add counters to FSNamesystemMBean such as getNumMaintenanceLiveDataNodes, similar to getNumDecomLiveDataNodes and getNumDecomDeadDataNodes.
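      As a rough, self-contained sketch of those two additions: the map, the helper, and the state strings below are placeholders chosen for this illustration, not the actual FSNamesystem or FSNamesystemMBean code.

        import java.util.Arrays;
        import java.util.LinkedHashMap;
        import java.util.List;
        import java.util.Map;

        // Illustration only: a dead-node JSON entry carrying adminState, plus a
        // counter in the spirit of getNumMaintenanceLiveDataNodes.
        public class MaintenanceJmxSketch {
          static long numMaintenanceLiveDataNodes(List<String> liveNodeAdminStates) {
            return liveNodeAdminStates.stream()
                .filter(s -> s.equals("IN_MAINTENANCE") || s.equals("ENTERING_MAINTENANCE"))
                .count();
          }

          public static void main(String[] args) {
            Map<String, Object> deadNodeInfo = new LinkedHashMap<>();
            deadNodeInfo.put("adminState", "IN_MAINTENANCE"); // covers the maintenance case
            System.out.println(deadNodeInfo);

            System.out.println(numMaintenanceLiveDataNodes(
                Arrays.asList("NORMAL", "IN_MAINTENANCE", "ENTERING_MAINTENANCE"))); // 2
          }
        }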

      "In Maintenance" Live/Dead nodes count also need to be shown along with Decommission nodes ?

      That is right. The attached screenshot could be useful. After someone clicks the "Entering Maintenance Nodes", it should redirect to another page about its progress, similar to the "Decommissioning Nodes".

      Is there a plan to expose the concept of 'OutOfService'?

      Based on how we use them, decommissioned nodes are tracked separately from maintenance nodes, other than the first point you brought up.

      manojg Manoj Govindassamy added a comment -

      Thanks for the detailed comments Ming Ma. Much appreciated.

      FSNameSystem#getDecomNodes

      If we really want to separately account for all-3-replicas-being-decommissioned, we can keep the strict semantics and add a new property "outOfServiceOnlyReplicas" to account for both types. To enable that, we will need to track each type separately in LeavingServiceStatus.

      // TODO use another property name for outOfServiceOnlyReplicas.
      .put("decommissionOnlyReplicas",
          node.getLeavingServiceStatus().getOutOfServiceOnlyReplicas())

      A risk indicator that includes all probable failures does sound good. Except that the other page, which shows more details on the Decommissioning Nodes, will not tally with this number. So, I would like the decommissionOnlyReplicas property to capture decommission-related replicas only, and to additionally have another property, outOfServiceOnlyReplicas, capturing all of them. Yes, we will have to extend LeavingServiceStatus to include all these numbers, but that looks straightforward and useful to me. Will go with this approach.
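      A minimal sketch of what that extended tracking could look like; this is a simplified stand-in class under the assumptions above, not the actual DatanodeDescriptor.LeavingServiceStatus:

        // Simplified stand-in, not the actual HDFS class: track decommission-only,
        // maintenance-only, and combined out-of-service block counts separately.
        public class LeavingServiceStatusSketch {
          private int decommissionOnlyReplicas;
          private int maintenanceOnlyReplicas;
          private int outOfServiceOnlyReplicas;

          void set(int decommissionOnly, int maintenanceOnly, int outOfServiceOnly) {
            this.decommissionOnlyReplicas = decommissionOnly;
            this.maintenanceOnlyReplicas = maintenanceOnly;
            this.outOfServiceOnlyReplicas = outOfServiceOnly;
          }

          int getDecommissionOnlyReplicas() { return decommissionOnlyReplicas; }
          int getMaintenanceOnlyReplicas()  { return maintenanceOnlyReplicas; }
          int getOutOfServiceOnlyReplicas() { return outOfServiceOnlyReplicas; }

          public static void main(String[] args) {
            LeavingServiceStatusSketch status = new LeavingServiceStatusSketch();
            status.set(1, 2, 3); // placeholder counts
            System.out.println(status.getOutOfServiceOnlyReplicas()); // 3
          }
        }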

      All other items totally make sense. Thanks.

      manojg Manoj Govindassamy added a comment -

      Attaching the WebUI proposal for Maintenance Mode details.

      manojg Manoj Govindassamy added a comment - edited

      Attaching v01 patch to address the following:
      1. Introduced NameNodeMXBean#getEnteringMaintenanceNodes and implemented it in FSNameSystem
      2. Updated LeavingServiceStatus to include details on Decommissioning, Maintenance and OutOfService replicas
      3. Updated DecommissionManager to update LeavingServiceStatus
      4. Updated dfshealth.html to have details on Maintenance nodes in Summary and DataNode Information pages. (Entering Maintenance, Live Maintenance, Dead Maintenance)
      5. Unit test for the Maintenance mode JMX.

      Lei (Eddy) Xu, Ming Ma, can you please take a look at the patch and comment on what can be improved?

      mingma Ming Ma added a comment -

      Thanks Manoj Govindassamy. Some minor questions:

      • .put("inMaintenance", node.isInMaintenance()) might not be necessary given it also outputs .put("adminState", node.getAdminState().toString()).
      • Should liveDecommissioningReplicas be OnlyDecommissioningReplicas, which is the old behavior before maintenance? There are two differences: one is "Only", the other is "live".
      hadoopqa Hadoop QA added a comment -
      +1 overall



      Vote Subsystem Runtime Comment
      0 reexec 0m 22s Docker mode activated.
      +1 @author 0m 0s The patch does not contain any @author tags.
      +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
      +1 mvninstall 13m 58s trunk passed
      +1 compile 0m 55s trunk passed
      +1 checkstyle 0m 30s trunk passed
      +1 mvnsite 1m 1s trunk passed
      +1 mvneclipse 0m 13s trunk passed
      +1 findbugs 1m 55s trunk passed
      +1 javadoc 0m 43s trunk passed
      +1 mvninstall 0m 56s the patch passed
      +1 compile 0m 45s the patch passed
      +1 javac 0m 45s the patch passed
      +1 checkstyle 0m 28s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 276 unchanged - 1 fixed = 276 total (was 277)
      +1 mvnsite 0m 55s the patch passed
      +1 mvneclipse 0m 11s the patch passed
      +1 whitespace 0m 0s The patch has no whitespace issues.
      +1 findbugs 2m 4s the patch passed
      +1 javadoc 0m 42s the patch passed
      +1 unit 101m 21s hadoop-hdfs in the patch passed.
      +1 asflicense 0m 38s The patch does not generate ASF License warnings.
      129m 5s



      Subsystem Report/Notes
      Docker Image:yetus/hadoop:a9ad5d6
      JIRA Issue HDFS-9391
      JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12843698/HDFS-9391.01.patch
      Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname Linux c552f1d8f8d4 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool maven
      Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision trunk / fcbe152
      Default Java 1.8.0_111
      findbugs v3.0.0
      Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17881/testReport/
      modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
      Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17881/console
      Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      manojg Manoj Govindassamy added a comment - edited

      Thanks for the quick review. Much appreciated.

      .put("inMaintenance", node.isInMaintenance()) might not be necessary given adminState

      True. Now that the getDeadNodes JSON also has adminState, neither decommissioned nor inMaintenance is needed. I have changed the usage in dfshealth.js to check the node's adminState instead of the decommissioned and inMaintenance attributes.

      Should liveDecommissioningReplicas be OnlyDecommissioningReplicas which is the old behavior ?

      Sounds good.

      • Before the HDFS-9390 fix, when Maintenance Mode did not exist, FSNamesystem#getDecomNodes used to return getDecommissionOnlyReplicas() for decommissionOnlyReplicas, which covers both decommissioning and decommissioned replicas.
      • After the HDFS-9390 fix, we started returning getLeavingServiceStatus().getOutOfServiceOnlyReplicas(), which covers both decommission and maintenance replicas. Based on our earlier discussion in the comments, I assume we can go back to the old behavior of returning getDecommissionOnlyReplicas() here. I incorporated this change in patch v02.
      • Now, in FSNameSystem#getEnteringMaintenanceNodes(), for maintenanceOnlyReplicas, I will follow the same model and return maintenance replicas only – getLeavingServiceStatus().getMaintenanceOnlyReplicas() (see the sketch after this list).
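      A self-contained sketch of that split, using a plain map and placeholder counts rather than the actual FSNamesystem JSON builder; the class and counts below are assumptions for illustration only:

        import java.util.LinkedHashMap;
        import java.util.Map;

        // Illustration only: the Decommissioning page entry carries decommission-only
        // counts, the Entering Maintenance page entry carries maintenance-only counts.
        public class PageReplicaCountsSketch {
          public static void main(String[] args) {
            Map<String, Object> decomNode = new LinkedHashMap<>();
            decomNode.put("decommissionOnlyReplicas", 1);       // from getDecommissionOnlyReplicas()

            Map<String, Object> maintenanceNode = new LinkedHashMap<>();
            maintenanceNode.put("maintenanceOnlyReplicas", 2);  // from getMaintenanceOnlyReplicas()

            System.out.println(decomNode);
            System.out.println(maintenanceNode);
          }
        }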
      manojg Manoj Govindassamy added a comment -

      Attached v02 patch addressing previous review comments.

      • DatanodeDescriptor#LeavingServiceStatus fixed to have maintenance-only and decommission-only replica counts.
      • DecommissionManager#Monitor#processBlocksInternal updated to track maintenance-only and decommission-only replicas.
      • FSNameSystem#getDecomNodes and FSNameSystem#getEnteringMaintenanceNodes updated to use the right APIs.
        Ming Ma, Lei (Eddy) Xu, please take a look at the patch.
      hadoopqa Hadoop QA added a comment -
      -1 overall



      Vote Subsystem Runtime Comment
      0 reexec 0m 10s Docker mode activated.
      +1 @author 0m 0s The patch does not contain any @author tags.
      +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
      +1 mvninstall 13m 9s trunk passed
      +1 compile 0m 45s trunk passed
      +1 checkstyle 0m 30s trunk passed
      +1 mvnsite 0m 50s trunk passed
      +1 mvneclipse 0m 12s trunk passed
      +1 findbugs 1m 41s trunk passed
      +1 javadoc 0m 39s trunk passed
      +1 mvninstall 0m 45s the patch passed
      +1 compile 0m 42s the patch passed
      +1 javac 0m 42s the patch passed
      +1 checkstyle 0m 26s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 276 unchanged - 1 fixed = 276 total (was 277)
      +1 mvnsite 0m 48s the patch passed
      +1 mvneclipse 0m 9s the patch passed
      +1 whitespace 0m 1s The patch has no whitespace issues.
      +1 findbugs 1m 46s the patch passed
      +1 javadoc 0m 37s the patch passed
      -1 unit 63m 15s hadoop-hdfs in the patch failed.
      +1 asflicense 0m 20s The patch does not generate ASF License warnings.
      87m 55s



      Reason Tests
      Failed junit tests hadoop.hdfs.TestMaintenanceState
        hadoop.hdfs.TestHDFSServerPorts



      Subsystem Report/Notes
      Docker Image:yetus/hadoop:a9ad5d6
      JIRA Issue HDFS-9391
      JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12843969/HDFS-9391.02.patch
      Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname Linux 2d20247efa36 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool maven
      Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision trunk / ef2dd7b
      Default Java 1.8.0_111
      findbugs v3.0.0
      unit https://builds.apache.org/job/PreCommit-HDFS-Build/17902/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
      Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/17902/testReport/
      modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
      Console output https://builds.apache.org/job/PreCommit-HDFS-Build/17902/console
      Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      manojg Manoj Govindassamy added a comment -

      >> hadoop.hdfs.TestHDFSServerPorts
      >> hadoop.hdfs.TestMaintenanceState
      The above test failures are not related to the patch. I ran them locally and don't see the same failures.

      manojg Manoj Govindassamy added a comment -

      Discussed the Maintenance Mode UI proposal with Dilaver, and he has the following comments:

      1. In the NameNode UI, under DataNode Information there are a few legends like "In Service", "Down", "Decommissioned & Dead", etc. (refer to page 1, item 2 in the attached UI proposal).
        • What is the difference between Down and Dead nodes? Better to be consistent in naming and terminology.
        • Hover help text for these icons would be very useful.
      2. Icon visualization should be extended to cover Maintenance Mode states.
      manojg Manoj Govindassamy added a comment -

      HDFS-11265 has been filed to track item 2. Item 1 will be tracked in a separate jira outside of Maintenance Mode.

      eddyxu Lei (Eddy) Xu added a comment -

      Hi, Manoj Govindassamy.

      It LGTM overall.

      One small question: is this going to be an incompatible change? This patch removes the decommissioned attr and adds the adminState attr. It might cause compatibility issues with monitoring systems or scripts.

      - .put("decommissioned", node.isDecommissioned())
      + .put("adminState", node.getAdminState().toString())
      
      manojg Manoj Govindassamy added a comment -

      Thanks for the review Lei (Eddy) Xu.
      With the addition of the "adminState" attribute, an explicit state attr like "decommissioned" becomes redundant, so I removed it. But you are right, removing an existing attr could be incompatible. Maybe I should retain the existing "decommissioned" attr as is and also add the new "adminState". In a way, we want to deprecate the "decommissioned" attr since "adminState" covers all states. Any advice on the right way of doing this?
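      A minimal sketch of that compatibility option, mirroring the quoted diff but keeping both attributes; this uses a plain map and placeholder values rather than the actual getDeadNodes code:

        import java.util.LinkedHashMap;
        import java.util.Map;

        // Illustration only: keep the legacy boolean for compatibility while adding
        // the richer admin-state string; values are placeholders, not real node data.
        public class DeadNodeJsonCompatSketch {
          public static void main(String[] args) {
            Map<String, Object> deadNodeInfo = new LinkedHashMap<>();
            deadNodeInfo.put("decommissioned", true);          // legacy attr, retained
            deadNodeInfo.put("adminState", "DECOMMISSIONED");  // new attr covering all states
            System.out.println(deadNodeInfo);
          }
        }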

      manojg Manoj Govindassamy added a comment -

      Lei (Eddy) Xu, Ming Ma,
      Are you ok with leaving the "decommissioned" attr in FSNameSystem#getDeadNodes() even after the addition of the new "adminState", so as to preserve compatibility? Any other review comments? Please let me know.

      eddyxu Lei (Eddy) Xu added a comment -

      Hi, Manoj

      I would prefer to keep "decommissioned" for Hadoop 3 for a while.

      Thanks.

      mingma Ming Ma added a comment -

      Thanks Manoj. Yep let us keep the existing property as Eddy mentioned.

      • In getMaintenanceOnlyReplicas, the check if (!isDecommissionInProgress() && !isEnteringMaintenance()) only needs to check the maintenance part.
      • It seems you will need to add <li class="dfshealth-node-icon dfshealth-node-down-maintenance">In Maintenance & dead</li> to match the addition of nodes[i].state = "down-maintenance";.
      • For the EnteringMaintenanceNodes page, it uses maintenanceOnlyReplicas to describe Blocks with no live replicas. Should we use OutOfServiceOnlyReplicas?
      manojg Manoj Govindassamy added a comment -

      Sure, makes sense. Will do this.

      manojg Manoj Govindassamy added a comment - edited

      Sure, will do 1 & 2.

      3.
      >> For the EnteringMaintenanceNodes page, it uses maintenanceOnlyReplicas to describe Blocks with no live replicas. Should we use OutOfServiceOnlyReplicas?

      Thanks for bringing this up, Ming Ma. There are some inconsistencies even with the "Decommissioning" page, and I would like to get that number clarified as well.

      • HDFS-9390 updated the Decommissioning page to use getOutOfServiceOnlyReplicas() instead of getDecommissionOnlyReplicas().
      • But getOutOfServiceOnlyReplicas(), which was introduced as part of HDFS-9390, includes all Maintenance and Decommission replicas. Effectively, the page has been showing all "out of service" replicas, even though the page name is "Decommissioning".

      Excerpts from Patch v02:

              if ((liveReplicas == 0) &&
                  (num.decommissionedAndDecommissioning() > 0)) {
                decommissionOnlyReplicas++;
              }
              if ((liveReplicas == 0) && (num.maintenanceReplicas() > 0)) {
                maintenanceOnlyReplicas++;
              }
              if ((liveReplicas == 0) && (num.outOfServiceReplicas() > 0)) {
                outOfServiceOnlyReplicas++;
              }
      
      • So, what should the "Decommissioning" page actually show? In patch v02 uploaded here, I made this page include decommission-related replicas only, and not all out-of-service replicas.
      • Now coming to the "Entering Maintenance" page, which replicas exactly should be included here? If we show "OutOfServiceOnlyReplicas", then it will include all decommission-related replicas as well. So I am using "maintenanceOnlyReplicas" for this page. Do you still believe showing all "OutOfServiceOnlyReplicas" would be better here? Please let me know.
      mingma Ming Ma added a comment -

      Good point. Actually it seems maintenanceOnlyReplicas is the same as outOfServiceOnlyReplicas in such a case. For example, say one replica is decommissioning and two are entering maintenance; both maintenanceOnlyReplicas and outOfServiceOnlyReplicas are incremented. In other words, maintenanceOnlyReplicas isn't strictly "all 3 replicas are maintenance". Maybe this new definition is more desirable. What do you think?

      manojg Manoj Govindassamy added a comment -

      Yes, in the example you gave, both maintenanceOnlyReplicas and outOfServiceOnlyReplicas are incremented. I see outOfServiceOnlyReplicas as more of a cumulative number covering both decommission and maintenance. The final counts would be 2 for maintenanceOnlyReplicas and 3 for outOfServiceOnlyReplicas.

      >> In other words, maintenanceOnlyReplicas isn't strictly "all 3 replicas are maintenance". Maybe this new definition is more desirable.

      Yes, in the above example, all 3 replicas are in some sort of maintenance and it is ok to have EnteringMaintenance page display "OutOfServiceOnlyReplicas".

      But in another example, where only one node is decommissioning and no other nodes are in maintenance,
      – the Decommissioning page will rightly show 1 node in decommission. There is no problem with this page.
      – the EnteringMaintenance page, if we start using the new cumulative "OutOfServiceOnlyReplicas", will also show 1 node in maintenance, the same one that is decommissioning.

      Hopefully you have thought about this case as well. Does the EnteringMaintenance page behavior for the second example sound ok to you? Please let me know.

                .put("maintenanceOnlyReplicas",
                    node.getLeavingServiceStatus().getOutOfServiceOnlyReplicas())
      

      Once this open question is resolved, I will attach the new patch incorporating all pending changes. Thanks Ming Ma.

      mingma Ming Ma added a comment -

      Sure, let us keep what you have in patch 02. Just to make sure, can you confirm the following?

      • For the case of "one replica is decommissioning and two replicas of the same block are entering maintenance", the code will still increment maintenanceOnlyReplicas when processing the decommissioning node, because NumberReplicas includes all replicas stats. Thus decommissionOnlyReplicas == maintenanceOnlyReplicas == outOfServiceReplicas.
      • For the case of "all replicas are decommissioning", then EnteringMaintenance page will have nothing to show to begin with given no nodes are entering maintenance.
      manojg Manoj Govindassamy added a comment -

      Case 1: One replica is decommissioning and two replicas of the same block are entering maintenance

      the code will still increment maintenanceOnlyReplicas when processing the decommissioning node, because NumberReplicas includes all replicas stats.

      • I don't think so. In the current upstream trunk code, a node can only be in one state, and the NumberReplicas accounting for Decommission and Maintenance is exclusive. If a replica is Decommissioning, it cannot be in any of the Maintenance states, and vice versa.
      • In the current upstream trunk code, outOfServiceReplicas is defined as a sum of both decommission and maintenance
      • Patch v02 in this jira just makes use of the numbers already accounted for in NumberReplicas, in separate variables, namely decommissionOnlyReplicas and maintenanceOnlyReplicas.

      So in the above example, with patch v02 we will get the following numbers:
      – decommissionOnlyReplicas = 1
      – maintenanceOnlyReplicas = 2
      – outOfServiceReplicas = 3

      Hence,
      – Entering Maintenance page will only show 2 maintenance replica nodes
      – Decommissioning page will only show 1 decommissioning replica node

      Case 2: All replicas are decommissioning

      EnteringMaintenance page will have nothing to show to begin with given no nodes are entering maintenance.

      • That's right. When all replicas are decommissioning, NumberReplicas will only have decommissionedAndDecommissioning nodes.
      • Entering Maintenance page will be empty
      • Decommissioning page will show all these decommissioning nodes

      Ming Ma,

      • Do the above cases and results match your expectations?
      • Also, let's make sure we are targeting the same goals w.r.t. the Decommissioning and EnteringMaintenance pages. Based on the discussions we had earlier (refer to comments 1 - 4), I assumed we want the following:
        • Entering Maintenance Page to show the nodes that are Entering Maintenance / In Maintenance Only.
        • Decommissioning Page to show the nodes that are Decommissioning / Decommissioned Only.
      • Please correct me if my understanding from the current upstream trunk code is wrong or if we want different goals for the pages.
      manojg Manoj Govindassamy added a comment -

      Code References:

      BlockManager#checkReplicaOnStorage:

        } else if (node.isDecommissionInProgress()) {
                s = StoredReplicaState.DECOMMISSIONING;
              } else if (node.isDecommissioned()) {
                s = StoredReplicaState.DECOMMISSIONED;
              } else if (node.isMaintenance()) {
                if (node.isInMaintenance() || !node.isAlive()) {
                  s = StoredReplicaState.MAINTENANCE_NOT_FOR_READ;
                } else {
                  s = StoredReplicaState.MAINTENANCE_FOR_READ;
                }
              } else if (isExcess(node, b)) {
                s = StoredReplicaState.EXCESS;
              } else {
                s = StoredReplicaState.LIVE;
              }
              counters.add(s, 1);
        

      DecommissionManager#Monitor#processBlocksInternal:

          if ((liveReplicas == 0) &&
                    (num.decommissionedAndDecommissioning() > 0)) {
                  decommissionOnlyReplicas++;
                }
                if ((liveReplicas == 0) && (num.maintenanceReplicas() > 0)) {
                  maintenanceOnlyReplicas++;
                }
                if ((liveReplicas == 0) && (num.outOfServiceReplicas() > 0)) {
                  outOfServiceOnlyReplicas++;
                }
        

      NumberReplicas:

          public int decommissionedAndDecommissioning() {
            return decommissioned() + decommissioning();
          }
        
          public int maintenanceReplicas() {
            return (int) (get(MAINTENANCE_NOT_FOR_READ) + get(MAINTENANCE_FOR_READ));
          }
        
          public int outOfServiceReplicas() {
            return maintenanceReplicas() + decommissionedAndDecommissioning();
          }
        
        
      mingma Ming Ma added a comment - edited

      A given replica is only in one admin state: normal, decommission, or maintenance. But NumberReplicas represents the state of all replicas. Thus, for the case "One replica is decommissioning and two replicas of the same block are entering maintenance", NumberReplicas#decommissionedAndDecommissioning == 1 and NumberReplicas#maintenanceReplicas() == 2. No?

      manojg Manoj Govindassamy added a comment -

      >> But NumberReplicas represents the state of all replicas.

      That's right.

      >> Thus for the case "One replica is decommissioning and two replicas of the same block are entering maintenance", NumberReplicas#decommissionedAndDecommissioning == 1, NumberReplicas#maintenanceReplicas() == 2.

      Yes, exactly.

      mingma Ming Ma added a comment -

      Then, for that specific case, when DecommissionManager#Monitor#processBlocksInternal is processing the decommissioning node, both NumberReplicas#decommissionedAndDecommissioning() > 0 and NumberReplicas#maintenanceReplicas() > 0 are satisfied. Thus both decommissionOnlyReplicas and maintenanceOnlyReplicas will be incremented. The same applies to the other two entering-maintenance nodes.
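      A small self-contained illustration of that counting for the 1-decommissioning / 2-entering-maintenance case; this is simplified stand-in arithmetic mirroring the processBlocksInternal checks quoted earlier, not the actual NumberReplicas class:

        // Illustration only: per-block checks as seen from the node being processed.
        public class ReplicaCountingSketch {
          public static void main(String[] args) {
            int liveReplicas = 0;
            int decommissionedAndDecommissioning = 1; // one decommissioning replica
            int maintenanceReplicas = 2;              // two entering-maintenance replicas
            int outOfServiceReplicas =
                decommissionedAndDecommissioning + maintenanceReplicas; // 3

            int decommissionOnlyReplicas = 0;
            int maintenanceOnlyReplicas = 0;
            int outOfServiceOnlyReplicas = 0;

            // The same block bumps all three counters on whichever node is processed.
            if (liveReplicas == 0 && decommissionedAndDecommissioning > 0) {
              decommissionOnlyReplicas++;
            }
            if (liveReplicas == 0 && maintenanceReplicas > 0) {
              maintenanceOnlyReplicas++;
            }
            if (liveReplicas == 0 && outOfServiceReplicas > 0) {
              outOfServiceOnlyReplicas++;
            }

            System.out.println(decommissionOnlyReplicas + " " + maintenanceOnlyReplicas
                + " " + outOfServiceOnlyReplicas); // 1 1 1
          }
        }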

      manojg Manoj Govindassamy added a comment -

      Got your point. That's right. processBlocksInternal goes block by block and gets NumberReplicas for each block. The NumberReplicas thus returned will have both the decommissioning and maintenance checks passing for case 1 we discussed. Thanks a lot for the detailed discussion with example cases on this.

      So, are we good with patch v02 w.r.t. the Decommissioning and EnteringMaintenance page stats?
      If you are ok with this, I will incorporate the other open review comments, such as the following, and upload patch v03:

      • leaving the "decommissioned" attr in FSNameSystem#getDeadNodes() as is
      • removal of the decommission check in getMaintenanceOnlyReplicas
      • addition of dfshealth-node-down-maintenance to the DN page
      mingma Ming Ma added a comment -

      Thanks. Sounds good.

      manojg Manoj Govindassamy added a comment -

      Attached patch v03 to address the open review items as discussed in the previous jira comments. Ming Ma, Lei (Eddy) Xu, can you please take a look? Thanks for the review.

      hadoopqa Hadoop QA added a comment -
      -1 overall



      Vote Subsystem Runtime Comment
      0 reexec 0m 14s Docker mode activated.
      +1 @author 0m 0s The patch does not contain any @author tags.
      +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
      +1 mvninstall 14m 53s trunk passed
      +1 compile 0m 50s trunk passed
      +1 checkstyle 0m 31s trunk passed
      +1 mvnsite 0m 56s trunk passed
      +1 mvneclipse 0m 14s trunk passed
      +1 findbugs 1m 58s trunk passed
      +1 javadoc 0m 42s trunk passed
      +1 mvninstall 1m 1s the patch passed
      +1 compile 0m 58s the patch passed
      +1 javac 0m 58s the patch passed
      +1 checkstyle 0m 30s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 276 unchanged - 1 fixed = 276 total (was 277)
      +1 mvnsite 1m 2s the patch passed
      +1 mvneclipse 0m 12s the patch passed
      +1 whitespace 0m 0s The patch has no whitespace issues.
      +1 findbugs 2m 13s the patch passed
      +1 javadoc 0m 42s the patch passed
      -1 unit 78m 48s hadoop-hdfs in the patch failed.
      +1 asflicense 0m 26s The patch does not generate ASF License warnings.
      107m 41s



      Reason Tests
      Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting



      Subsystem Report/Notes
      Docker Image:yetus/hadoop:a9ad5d6
      JIRA Issue HDFS-9391
      JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12846084/HDFS-9391.03.patch
      Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname Linux a930eade1737 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool maven
      Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision trunk / 2977bc6
      Default Java 1.8.0_111
      findbugs v3.0.0
      unit https://builds.apache.org/job/PreCommit-HDFS-Build/18052/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
      Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18052/testReport/
      modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
      Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18052/console
      Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      manojg Manoj Govindassamy added a comment -

      TestDataNodeVolumeFailureReporting passes locally for me. The Jenkins test failure doesn't look related to this patch.

      mingma Ming Ma added a comment -

      Thanks Manoj. I just found something related to our discussion. For any decommissioning node, given getDecommissionOnlyReplicas is the same as getOutOfServiceOnlyReplicas, can we just use the getOutOfServiceOnlyReplicas value for the JSON decommissionOnlyReplicas property? The same applies to any entering-maintenance node. In other words, we might not need to add the extra decommissionOnlyReplicas and maintenanceOnlyReplicas fields to LeavingServiceStatus.

      manojg Manoj Govindassamy added a comment - - edited

      Sure.

      It's about whether we should include all blocks in Maintenance + Decommission states under "Blocks with No Live Replicas" for each DN on the "Entering Maintenance" and "Decommissioning" pages. Previously I was trying to have them include only one of these states (as per the initial discussion in this jira). But, thinking more about it and after our discussion, I feel including both states makes sense, as sketched below. Will upload the new patch soon. Thanks a lot for the review.
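
      For illustration, a minimal Java sketch of that shape follows. It is only a sketch, not the actual patch: the names outOfServiceOnlyReplicas, decommissionOnlyReplicas and maintenanceOnlyReplicas come from this discussion, while the class layout, the method names and the map-building helper are hypothetical and may differ from the committed code.

      import java.util.HashMap;
      import java.util.Map;

      public class LeavingServiceStatusSketch {

        /** Tracks replicas that would lose their last live copy once this DN leaves service. */
        static class LeavingServiceStatus {
          private int outOfServiceOnlyReplicas;

          synchronized void setOutOfServiceOnlyReplicas(int count) {
            this.outOfServiceOnlyReplicas = count;
          }

          synchronized int getOutOfServiceOnlyReplicas() {
            return outOfServiceOnlyReplicas;
          }
        }

        /** Builds the per-datanode attribute map exposed over JMX and read by the web UI. */
        static Map<String, Object> buildNodeInfo(LeavingServiceStatus status) {
          Map<String, Object> info = new HashMap<>();
          // Both the "Decommissioning" and "Entering Maintenance" pages read the same
          // counter, so no separate decommission-only / maintenance-only fields are kept.
          info.put("decommissionOnlyReplicas", status.getOutOfServiceOnlyReplicas());
          info.put("maintenanceOnlyReplicas", status.getOutOfServiceOnlyReplicas());
          return info;
        }

        public static void main(String[] args) {
          LeavingServiceStatus status = new LeavingServiceStatus();
          status.setOutOfServiceOnlyReplicas(42);
          System.out.println(buildNodeInfo(status)); // both properties report 42
        }
      }

      Either way, the "Blocks with No Live Replicas" column is backed by a single counter, regardless of whether the node is decommissioning or entering maintenance.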

      manojg Manoj Govindassamy added a comment -

      Attached v04 patch to fix the LeavingServiceStatus and use outOfServiceReplica block counts in the DN UI pages. Ming Ma, Lei (Eddy) Xu, can you please take a look at this patch revision?

      hadoopqa Hadoop QA added a comment -
      -1 overall



      Vote Subsystem Runtime Comment
      0 reexec 0m 16s Docker mode activated.
      +1 @author 0m 0s The patch does not contain any @author tags.
      +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
      +1 mvninstall 12m 55s trunk passed
      +1 compile 0m 50s trunk passed
      +1 checkstyle 0m 30s trunk passed
      +1 mvnsite 0m 51s trunk passed
      +1 mvneclipse 0m 13s trunk passed
      +1 findbugs 1m 42s trunk passed
      +1 javadoc 0m 39s trunk passed
      +1 mvninstall 0m 46s the patch passed
      +1 compile 0m 44s the patch passed
      +1 javac 0m 44s the patch passed
      +1 checkstyle 0m 27s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 276 unchanged - 1 fixed = 276 total (was 277)
      +1 mvnsite 0m 49s the patch passed
      +1 mvneclipse 0m 10s the patch passed
      +1 whitespace 0m 0s The patch has no whitespace issues.
      +1 findbugs 1m 48s the patch passed
      +1 javadoc 0m 36s the patch passed
      -1 unit 79m 43s hadoop-hdfs in the patch failed.
      +1 asflicense 0m 18s The patch does not generate ASF License warnings.
      104m 30s



      Reason Tests
      Failed junit tests hadoop.hdfs.server.namenode.TestFileTruncate
        hadoop.hdfs.server.namenode.TestStartup
      Timed out junit tests org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure



      Subsystem Report/Notes
      Docker Image:yetus/hadoop:a9ad5d6
      JIRA Issue HDFS-9391
      JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12846427/HDFS-9391.04.patch
      Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname Linux 1c07ed9ae4f4 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool maven
      Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision trunk / 91bf504
      Default Java 1.8.0_111
      findbugs v3.0.0
      unit https://builds.apache.org/job/PreCommit-HDFS-Build/18119/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
      Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18119/testReport/
      modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
      Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18119/console
      Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      manojg Manoj Govindassamy added a comment -

      Test failures are not related to the patch. They are passing locally for me.

      mingma Ming Ma added a comment -

      +1. Manoj, given the patch doesn't apply directly to branch-2, can you please provide another patch? Thanks.

      manojg Manoj Govindassamy added a comment -

      Thanks Ming Ma. Attached branch-2.01.patch. Thanks for the review.

      hadoopqa Hadoop QA added a comment -
      -1 overall



      Vote Subsystem Runtime Comment
      0 reexec 0m 19s Docker mode activated.
      +1 @author 0m 0s The patch does not contain any @author tags.
      +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
      +1 mvninstall 6m 54s branch-2 passed
      +1 compile 0m 47s branch-2 passed with JDK v1.8.0_111
      +1 compile 0m 45s branch-2 passed with JDK v1.7.0_121
      +1 checkstyle 0m 29s branch-2 passed
      +1 mvnsite 0m 50s branch-2 passed
      +1 mvneclipse 0m 15s branch-2 passed
      +1 findbugs 1m 57s branch-2 passed
      +1 javadoc 0m 55s branch-2 passed with JDK v1.8.0_111
      +1 javadoc 1m 37s branch-2 passed with JDK v1.7.0_121
      +1 mvninstall 0m 46s the patch passed
      +1 compile 0m 46s the patch passed with JDK v1.8.0_111
      +1 javac 0m 46s the patch passed
      +1 compile 0m 43s the patch passed with JDK v1.7.0_121
      +1 javac 0m 43s the patch passed
      +1 checkstyle 0m 28s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 294 unchanged - 1 fixed = 294 total (was 295)
      +1 mvnsite 0m 51s the patch passed
      +1 mvneclipse 0m 12s the patch passed
      +1 whitespace 0m 0s The patch has no whitespace issues.
      +1 findbugs 2m 11s the patch passed
      +1 javadoc 0m 59s the patch passed with JDK v1.8.0_111
      +1 javadoc 1m 38s the patch passed with JDK v1.7.0_121
      -1 unit 69m 15s hadoop-hdfs in the patch failed with JDK v1.7.0_121.
      +1 asflicense 0m 21s The patch does not generate ASF License warnings.
      162m 54s



      Reason Tests
      JDK v1.8.0_111 Failed junit tests hadoop.hdfs.server.namenode.snapshot.TestSnapshotDeletion
        hadoop.hdfs.server.namenode.TestDecommissioningStatus
      JDK v1.7.0_121 Failed junit tests hadoop.hdfs.server.namenode.TestDecommissioningStatus
        hadoop.hdfs.TestEncryptionZones



      Subsystem Report/Notes
      Docker Image:yetus/hadoop:b59b8b7
      JIRA Issue HDFS-9391
      JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12846485/HDFS-9391-branch-2.01.patch
      Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname Linux 67c8a669255d 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool maven
      Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision branch-2 / b600577
      Default Java 1.7.0_121
      Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_111 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121
      findbugs v3.0.0
      unit https://builds.apache.org/job/PreCommit-HDFS-Build/18122/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_121.txt
      JDK v1.7.0_121 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18122/testReport/
      modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
      Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18122/console
      Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      mingma Ming Ma added a comment -

      Thanks Manoj Govindassamy. It seems there is a typo in the branch-2 patch: getLeavingServiceStatus().set passes the wrong variable, which caused TestDecommissioningStatus to fail (roughly the kind of slip sketched below). In addition, it would be useful to verify the UI for the branch-2 patch.
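
      Purely as a hypothetical illustration of that class of bug (the real branch-2 setter and its parameter list may differ), swapping two same-typed arguments compiles cleanly but reports the wrong counts:

      class LeavingServiceStatus {
        private int underReplicatedBlocks;
        private int outOfServiceOnlyReplicas;

        void set(int underReplicated, int outOfServiceOnly) {
          this.underReplicatedBlocks = underReplicated;
          this.outOfServiceOnlyReplicas = outOfServiceOnly;
        }
      }

      class AdminStateUpdater {
        void update(LeavingServiceStatus status, int underReplicated, int outOfServiceOnly) {
          // Bug: arguments swapped; both are ints, so the compiler cannot catch it,
          // and the JMX/web UI counts come out wrong (what TestDecommissioningStatus noticed).
          status.set(outOfServiceOnly, underReplicated);

          // Fix: pass the variables in the declared parameter order.
          status.set(underReplicated, outOfServiceOnly);
        }
      }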

      manojg Manoj Govindassamy added a comment -

      Yes, Ming Ma. After looking at the Jenkins test results, I found the same problem and have fixed it. Will upload the branch-2 patch after verifying the unit tests and the UI. Thanks for looking at this.

      manojg Manoj Govindassamy added a comment -

      Thanks Ming Ma for the review. Attached the branch-2.02 patch fixing the LeavingServiceStatus function arguments, along with WebUI samples for branch-2.

      hadoopqa Hadoop QA added a comment -
      +1 overall



      Vote Subsystem Runtime Comment
      0 reexec 0m 19s Docker mode activated.
      +1 @author 0m 0s The patch does not contain any @author tags.
      +1 test4tests 0m 0s The patch appears to include 1 new or modified test files.
      +1 mvninstall 7m 36s branch-2 passed
      +1 compile 0m 54s branch-2 passed with JDK v1.8.0_111
      +1 compile 0m 50s branch-2 passed with JDK v1.7.0_121
      +1 checkstyle 0m 33s branch-2 passed
      +1 mvnsite 0m 59s branch-2 passed
      +1 mvneclipse 0m 17s branch-2 passed
      +1 findbugs 2m 14s branch-2 passed
      +1 javadoc 1m 5s branch-2 passed with JDK v1.8.0_111
      +1 javadoc 1m 49s branch-2 passed with JDK v1.7.0_121
      +1 mvninstall 0m 51s the patch passed
      +1 compile 0m 50s the patch passed with JDK v1.8.0_111
      +1 javac 0m 50s the patch passed
      +1 compile 0m 47s the patch passed with JDK v1.7.0_121
      +1 javac 0m 47s the patch passed
      +1 checkstyle 0m 31s hadoop-hdfs-project/hadoop-hdfs: The patch generated 0 new + 294 unchanged - 1 fixed = 294 total (was 295)
      +1 mvnsite 0m 57s the patch passed
      +1 mvneclipse 0m 15s the patch passed
      +1 whitespace 0m 0s The patch has no whitespace issues.
      +1 findbugs 2m 26s the patch passed
      +1 javadoc 1m 2s the patch passed with JDK v1.8.0_111
      +1 javadoc 1m 32s the patch passed with JDK v1.7.0_121
      +1 unit 51m 12s hadoop-hdfs in the patch passed with JDK v1.7.0_121.
      +1 asflicense 0m 20s The patch does not generate ASF License warnings.
      134m 48s



      Reason Tests
      JDK v1.8.0_111 Failed junit tests hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA



      Subsystem Report/Notes
      Docker Image:yetus/hadoop:b59b8b7
      JIRA Issue HDFS-9391
      JIRA Patch URL https://issues.apache.org/jira/secure/attachment/12846716/HDFS-9391-branch-2.02.patch
      Optional Tests asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle
      uname Linux 56a007f022c0 3.13.0-95-generic #142-Ubuntu SMP Fri Aug 12 17:00:09 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
      Build tool maven
      Personality /testptch/hadoop/patchprocess/precommit/personality/provided.sh
      git revision branch-2 / ce5ad0e
      Default Java 1.7.0_121
      Multi-JDK versions /usr/lib/jvm/java-8-oracle:1.8.0_111 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_121
      findbugs v3.0.0
      JDK v1.7.0_121 Test Results https://builds.apache.org/job/PreCommit-HDFS-Build/18136/testReport/
      modules C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs
      Console output https://builds.apache.org/job/PreCommit-HDFS-Build/18136/console
      Powered by Apache Yetus 0.5.0-SNAPSHOT http://yetus.apache.org

      This message was automatically generated.

      mingma Ming Ma added a comment - - edited

      Thanks Manoj Govindassamy for the contribution. Thanks Dilaver and Lei (Eddy) Xu for the review. I have committed the patch to trunk and branch-2.

      hudson Hudson added a comment -

      FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #11104 (See https://builds.apache.org/job/Hadoop-trunk-Commit/11104/)
      HDFS-9391. Update webUI/JMX to display maintenance state info. (Manoj (mingma: rev 467f5f1735494c5ef74e6591069884d3771c17e4)

      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.js
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DecommissionManager.java
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/webapps/hdfs/dfshealth.html
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/NumberReplicas.java
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeMXBean.java
      • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
      manojg Manoj Govindassamy added a comment -

      Thanks Ming Ma for the review and commit. Much appreciated.

      manojg Manoj Govindassamy added a comment -
      • A clean build with the latest pull of both trunk and branch-2 passes for me on my local system.
      • Even in this Jenkins failure log, the build itself succeeded; the later deploy step failed because a tool was missing.
      4591 [INFO] Apache Hadoop Client Modules ....................... SUCCESS [  0.335 s]
      4592 [INFO] ------------------------------------------------------------------------
      4593 [INFO] BUILD SUCCESS
      4594 [INFO] ------------------------------------------------------------------------
      4595 [INFO] Total time: 07:26 min (Wall Clock)
      4596 [INFO] Finished at: 2017-01-11T04:38:09+00:00
      4597 [INFO] Final Memory: 315M/4892M
      4598 [INFO] ------------------------------------------------------------------------
      4599 + /home/jenkins/tools/maven/apache-maven-3.3.3/bin/mvn deploy -DdeployAtEnd=true -DretryFailedDeploymentCount=10 -DskipTests -Dmaven.repo.local=/home/jenkins/jenkins-slave/workspace/Hadoop-trunk-Commit/maven-repo
      4600 [INFO] Scanning for projects...
      ..
      ..
      5217 Build step 'Execute shell' marked build as failure
      5218 Updating HDFS-9391
      5219 ERROR: No tool found matching LATEST1_8_HOME
      5220 Setting MAVEN_3_3_3_HOME=/home/jenkins/tools/maven/apache-maven-3.3.3
      5221 Finished: FAILURE
      
      

        People

        • Assignee: manojg Manoj Govindassamy
        • Reporter: mingma Ming Ma
        • Votes: 0
        • Watchers: 8
