Hadoop HDFS
HDFS-758

Improve reporting of progress of decommissioning

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      New name node web UI page displays details of decommissioning progress. (dfsnodelist.jsp?whatNodes=DECOMMISSIONING)
    1. HDFS-758.5.0-20.patch
      22 kB
      Jitendra Nath Pandey
    2. HDFS-758.4.patch
      22 kB
      Jitendra Nath Pandey
    3. HDFS-758.3.patch
      22 kB
      Jitendra Nath Pandey
    4. HDFS-758.2.patch
      20 kB
      Jitendra Nath Pandey
    5. HDFS-758.1.patch
      20 kB
      Jitendra Nath Pandey

      Issue Links

        Activity

        Jitendra Nath Pandey created issue -
        Brian Bockelman added a comment -

        This would be incredibly useful for our site. We find that there are sometimes very "long tails" in decommissioning - the last 4 hours of a node decommissioning might be due to a handful of stuck or slow blocks. It drives our admins nuts that they don't know if things are 50% done or 99.999% done (and should just pull the plug).

        Jitendra Nath Pandey made changes -
        Field Original Value New Value
        Attachment HDFS-758.1.patch [ 12424915 ]
        Jitendra Nath Pandey made changes -
        Assignee Jitendra Nath Pandey [ jnp ]
        Jitendra Nath Pandey made changes -
        Attachment HDFS-758.2.patch [ 12424929 ]
        Suresh Srinivas added a comment -
        1. Check for the 80 character column limit in the code.
        2. In tests, use assertEquals instead of assertTrue to check equality.
        3. BlockManager.java
          • Remove the import of org.mortbay.log.Log and use FSNamesystem.LOG for logging.
          • isReplicationInProgress() - the logging of block information can be moved to a separate method for better readability. Also move setting status = true into the previous if (!status) block.
          • isReplicationInProgress() - should blocksWithOnlyDecommissioningReplicasCount be incremented only if (curReplicas == 0 && num.decommissioningReplicas() > 0)? The second condition is the new addition.
          • isReplicationInProgress() - use consistent and shorter names: underReplicatedCount to underReplicatedBlocks, blocksOnlyWithDecommissionReplicasCount to decommissionOnlyReplicas, underRepBlocksInFilesUnderConstruction to underReplicatedInOpenFiles.
        4. DatanodeDescriptor.java
          • Make DecommissioningStatus and the member decommissioningStatus package private and move all the related methods into the class. Methods can then be called directly on decommissioningStatus to set and get the data.
          • Renaming - setBlockCountsInDecommissioning to set, getUnderRepBlockCountInDecommission to getUnderReplicatedBlocks, getBlocksWithOnlyDecommissionReplicas to getDecommissionOnlyReplicas(), getDecommissioningUnderRepBlocksInFilesUnderConstruction to getUnderReplicatedInOpenFiles, setDecommissionStartTime to setStartTime, getDecommissioningStartTime to getStartTime, DecommissionStatus.decommissionStartTime to DecommissionStatus.startTime.
          • Should block and replica counts be int instead of long?
        5. FSNamesystem
          • startDecommission() - setting the decommission start time should move to DatanodeDescriptor.startDecommission(). Also, the check for isDecommissionInProgress() && isDecommissioned should happen in startDecommission().
          • startDecommission() - checkDecommissionStateInternal() replaces some code, but the two do not seem to be equivalent.
          • getDecommissioningNodes - instead of getDataNodeListForReport(ALL), getDataNodeListForReport(LIVE) should suffice.
        6. NamenodeJspHelper
          • LOG should use NamenodeJspHelper.class.
          • Should NamenodeJspHelper.generateDecommissioningNodeData() return immediately if the node is not decommissioning or decommissioned?
          • generateHealthReport() - is it a good idea to have a method in FSNamesystem, getDecommissioningNodeCount(), instead of relying on getDecommissioningNodes, which returns an ArrayList, just to print the count?
          • generateNodesList() - please check whatNodes.equals(DECOMMISSIONING) in the else condition. There is a typo: Decommissioing. Also, the list of decommissioning nodes should be built only when whatNodes == DECOMMISSIONING.

        I have not reviewed the test changes yet.
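
To make the renaming in items 3 and 4 easier to picture, here is a rough Java sketch of what a package-private status holder with the shorter names might look like. It is not the code from any HDFS-758 patch. The outer class DecommissioningStatusSketch, the ReplicaCounts holder and the summarize() helper are hypothetical names introduced only for this illustration. The use of int counters is also an assumption, since the review leaves the int versus long question open.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; not taken from the HDFS-758 patch. */
public class DecommissioningStatusSketch {

  /** Package-private holder using the shorter names proposed in the review. */
  static class DecommissioningStatus {
    private int underReplicatedBlocks;       // assumption: int rather than long
    private int decommissionOnlyReplicas;
    private int underReplicatedInOpenFiles;
    private long startTime;

    synchronized void set(int underRep, int onlyRep, int underConstruction) {
      underReplicatedBlocks = underRep;
      decommissionOnlyReplicas = onlyRep;
      underReplicatedInOpenFiles = underConstruction;
    }

    synchronized int getUnderReplicatedBlocks() { return underReplicatedBlocks; }
    synchronized int getDecommissionOnlyReplicas() { return decommissionOnlyReplicas; }
    synchronized int getUnderReplicatedInOpenFiles() { return underReplicatedInOpenFiles; }
    synchronized void setStartTime(long time) { startTime = time; }
    synchronized long getStartTime() { return startTime; }
  }

  /** Hypothetical per-block replica counts, standing in for the real block data. */
  static class ReplicaCounts {
    final int curReplicas;
    final int decommissioningReplicas;
    final int expectedReplicas;
    final boolean inOpenFile;

    ReplicaCounts(int cur, int decommissioning, int expected, boolean inOpenFile) {
      this.curReplicas = cur;
      this.decommissioningReplicas = decommissioning;
      this.expectedReplicas = expected;
      this.inOpenFile = inOpenFile;
    }
  }

  /** Counts blocks the way the review describes, including the extra condition. */
  static DecommissioningStatus summarize(List<ReplicaCounts> blocks, long startTime) {
    int underReplicated = 0;
    int decommissionOnly = 0;
    int inOpenFiles = 0;
    for (ReplicaCounts b : blocks) {
      if (b.curReplicas >= b.expectedReplicas) {
        continue; // fully replicated, nothing to report
      }
      underReplicated++;
      // Count only when no live replica exists AND a decommissioning one does.
      if (b.curReplicas == 0 && b.decommissioningReplicas > 0) {
        decommissionOnly++;
      }
      if (b.inOpenFile) {
        inOpenFiles++;
      }
    }
    DecommissioningStatus status = new DecommissioningStatus();
    status.set(underReplicated, decommissionOnly, inOpenFiles);
    status.setStartTime(startTime);
    return status;
  }

  public static void main(String[] args) {
    DecommissioningStatus s = summarize(Arrays.asList(
        new ReplicaCounts(3, 0, 3, false),
        new ReplicaCounts(0, 1, 3, false),
        new ReplicaCounts(1, 1, 3, true)), System.currentTimeMillis());
    System.out.println("underReplicated=" + s.getUnderReplicatedBlocks()
        + " decommissionOnly=" + s.getDecommissionOnlyReplicas()
        + " inOpenFiles=" + s.getUnderReplicatedInOpenFiles());
  }
}
```

In a JUnit test, item 2 would mean writing assertEquals(2, s.getUnderReplicatedBlocks()) rather than assertTrue(s.getUnderReplicatedBlocks() == 2), so that a failure reports both the expected and the actual value.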

        Jitendra Nath Pandey added a comment -

        New patch uploaded.

        Jitendra Nath Pandey made changes -
        Attachment HDFS-758.3.patch [ 12425553 ]
        Jitendra Nath Pandey made changes -
        Attachment HDFS-758.4.patch [ 12425676 ]
        Jitendra Nath Pandey added a comment -

        test-patch results for HDFS-758.4.patch :

        [exec] +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 4 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.

        Jitendra Nath Pandey added a comment -

        Patch HDFS-758.4.patch submitted for hudson tests.

        Jitendra Nath Pandey made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hadoop QA added a comment -

        +1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12425676/HDFS-758.4.patch
        against trunk revision 882733.

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 4 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        +1 findbugs. The patch does not introduce any new Findbugs warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        +1 core tests. The patch passed core unit tests.

        +1 contrib tests. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/121/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/121/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/121/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/121/console

        This message is automatically generated.

        Suresh Srinivas added a comment -

        +1 for the patch.

        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #120 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/120/).
        Add decommissioning status page to Namenode Web UI. Contributed by Jitendra Nath Pandey.

        Jitendra Nath Pandey added a comment -

        HDFS-758.5.0-20.patch is the patch for hadoop-0.20.

        Jitendra Nath Pandey made changes -
        Attachment HDFS-758.5.0-20.patch [ 12426000 ]
        Hudson added a comment -

        Integrated in Hdfs-Patch-h5.grid.sp2.yahoo.net #123 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/123/).
        Add decommissioning status page to Namenode Web UI. Contributed by Jitendra Nath Pandey.

        Robert Chansler made changes -
        Link This issue relates to HDFS-283 [ HDFS-283 ]
        Hudson added a comment -

        Integrated in Hdfs-Patch-h2.grid.sp2.yahoo.net #81 (See http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/81/)

        Ravi Phulari made changes -
        Link This issue relates to HDFS-810 [ HDFS-810 ]
        Robert Chansler made changes -
        Release Note New name node web UI page displays details of decommissioning progress. (dfsnodelist.jsp?whatNodes=DECOMMISSIONING)
        Todd Lipcon added a comment -

        This was committed 11/24/09 by Suresh. Marking resolved.

        Todd Lipcon made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Fix Version/s 0.22.0 [ 12314241 ]
        Resolution Fixed [ 1 ]
        Tom White made changes -
        Fix Version/s 0.21.0 [ 12314046 ]
        Fix Version/s 0.22.0 [ 12314241 ]
        Tom White made changes -
        Status Resolved [ 5 ] Closed [ 6 ]

          People

          • Assignee:
            Jitendra Nath Pandey
            Reporter:
            Jitendra Nath Pandey
          • Votes:
            0
            Watchers:
            8
