Hadoop Common / HADOOP-4430

Namenode Web UI capacity report is inconsistent with Balancer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Changed reporting in the NameNode Web UI to more closely reflect the behavior of the re-balancer. Removed the no-longer-used configuration parameter dfs.datanode.du.pct from hadoop-default.xml.

      Description

      The solution to 2816 made the following changes:

      • The definition of Total Capacity changed from (the disk space of all data directories) to (the disk space of all data directories - the reserved space).
      • A new element, Present Capacity, was added to the report. It is set to (Used Capacity + Remaining Capacity).
      • The reported Used Percentage changed from (Used Capacity)/(Total Capacity) to (Used Capacity)/(Present Capacity).
      • All of these changes are displayed on the Namenode Web UI.
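
The definitions introduced by 2816 can be sketched as follows. This is an illustrative sketch only, not the NameNode code: the class name, method names, and sample figures are hypothetical, and returning 0 when Present Capacity is 0 is an assumption made to avoid a division by zero.

```java
// Illustrative sketch of the capacity figures as defined after 2816.
// All names are hypothetical; this is not the actual NameNode code.
final class Capacity2816 {
    /** Total Capacity: disk space of all data directories minus the reserved space. */
    static long totalCapacity(long rawDiskSpace, long reserved) {
        return rawDiskSpace - reserved;
    }

    /** Present Capacity: Used Capacity + Remaining Capacity. */
    static long presentCapacity(long used, long remaining) {
        return used + remaining;
    }

    /** Used Percentage: (Used Capacity) / (Present Capacity), expressed as a percentage. */
    static double usedPercent(long used, long remaining) {
        long present = presentCapacity(used, remaining);
        // Assumption: report 0% when there is no present capacity at all.
        return present == 0 ? 0.0 : 100.0 * used / present;
    }

    public static void main(String[] args) {
        // Hypothetical sample figures, in arbitrary units.
        System.out.println("Total Capacity   = " + totalCapacity(1000L, 100L));
        System.out.println("Present Capacity = " + presentCapacity(300L, 300L));
        System.out.println("Used %           = " + usedPercent(300L, 300L));
    }
}
```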

      Balancer functionality
      The balancer script is started with a threshold parameter. It tries to move blocks from nodes whose Used % is more than (Cluster average + threshold) to nodes whose Used % is less than (Cluster average - threshold). Essentially, the balancer brings the Used % of every datanode to within (Cluster average +/- threshold).
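
The threshold rule described above can be sketched as a simple classification. This is a minimal sketch, not the Balancer's actual code: the class, enum, and method names are hypothetical, and the cluster average is passed in as a parameter because how it is computed is not spelled out here.

```java
// Hypothetical sketch of the threshold rule; not the real Balancer implementation.
final class BalancerThresholdSketch {
    enum Kind { OVER_UTILIZED, UNDER_UTILIZED, WITHIN_THRESHOLD }

    /**
     * Classifies a datanode by comparing its Used % with the cluster average.
     * Nodes above (average + threshold) are candidates to give up blocks;
     * nodes below (average - threshold) are candidates to receive them.
     */
    static Kind classify(double nodeUsedPercent, double clusterAverage, double threshold) {
        if (nodeUsedPercent > clusterAverage + threshold) {
            return Kind.OVER_UTILIZED;
        }
        if (nodeUsedPercent < clusterAverage - threshold) {
            return Kind.UNDER_UTILIZED;
        }
        return Kind.WITHIN_THRESHOLD;
    }

    public static void main(String[] args) {
        double average = 50.0;   // hypothetical cluster average Used %
        double threshold = 10.0; // hypothetical threshold parameter
        for (double usedPercent : new double[] {65.0, 55.0, 35.0}) {
            System.out.println(usedPercent + "% -> " + classify(usedPercent, average, threshold));
        }
    }
}
```

In this sketch, a node sitting exactly on a boundary (for example 60% with an average of 50% and a threshold of 10) counts as within the threshold, matching the "more than" and "less than" wording above.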

      Inconsistencies due to the change in 2816
      When MapReduce jobs run, they generate temporary files, which can take a lot of space away from Present Capacity, so the difference between Total Capacity and Present Capacity can be huge. Currently, the balancer computes Used Percentage as (Used Capacity)/(Total Capacity). The Used % that the balancer uses can therefore differ significantly from the Used % displayed on the Namenode Web UI. When the balancer has finished balancing, the Used % shown on the Namenode Web UI might still appear unbalanced.
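
A small worked example makes the mismatch concrete. The figures below are hypothetical, chosen only for illustration, and the class and method names are not taken from the Hadoop source.

```java
// Hypothetical figures illustrating how the two Used % values can diverge.
final class UsedPercentMismatch {
    /** Used % as shown on the Web UI after 2816: used / (used + remaining). */
    static double webUiUsedPercent(long used, long remaining) {
        return 100.0 * used / (used + remaining);
    }

    /** Used % as computed by the balancer: used / total capacity. */
    static double balancerUsedPercent(long used, long totalCapacity) {
        return 100.0 * used / totalCapacity;
    }

    public static void main(String[] args) {
        long totalCapacity = 1000L; // arbitrary units
        long used = 400L;           // space used by DFS
        long remaining = 100L;      // the other 500 units are taken by temporary files
        System.out.println("Web UI Used %   = " + webUiUsedPercent(used, remaining));        // 80.0
        System.out.println("Balancer Used % = " + balancerUsedPercent(used, totalCapacity)); // 40.0
    }
}
```

With these numbers the same node looks 80% full on the Web UI but only 40% full to the balancer, so the balancer may consider it balanced while the Web UI suggests otherwise.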

      1. HADOOP-4430.patch
        22 kB
        Suresh Srinivas
      2. HADOOP-4430.patch
        22 kB
        Suresh Srinivas
      3. HADOOP-4430.patch
        22 kB
        Suresh Srinivas

          Activity

          Suresh Srinivas added a comment -

          Proposed solution:

          • The definition of "Configured Capacity" from 2816 will be retained.
          • "DFS Used %" will be changed from (DFS Used)/(Present Capacity) to (DFS Used)/(Configured Capacity).
          • The "Present Capacity" introduced in 2816 should be the same as "Configured Capacity" as long as the temporary files generated by MapReduce do not take more than the reserved space. When the temporary files use more than the reserved space, "Present Capacity" shrinks accordingly. With this change, the "Present Capacity" figure is removed. Instead, the space used by temporary files in excess of the reserved space is reported as "Non DFS Used" space.
          • A new "DFS Remaining %" will be added to explicitly indicate the percentage of space remaining for DFS use.
          • Currently a percentage factor, defined by "dfs.datanode.du.pct", is used to reduce the actual remaining space when calculating DFS Remaining. This does not serve any purpose (see the comments in 2816) and will be removed.

          Here are the definitions of the data reported on the Web UI:
          Configured Capacity: (Disk space of all the data directories) - (Reserved space, as defined by dfs.datanode.du.reserved)
          DFS Used: Space used by DFS
          Non DFS Used: 0 if the temporary files do not exceed the reserved space. Otherwise, the size by which the temporary files exceed the reserved space and encroach into the DFS configured space.
          DFS Remaining: (Configured Capacity - DFS Used - Non DFS Used)
          DFS Used %: (DFS Used / Configured Capacity) * 100
          DFS Remaining %: (DFS Remaining / Configured Capacity) * 100
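
The proposed definitions can be sketched in code as follows. This is a minimal sketch under stated assumptions, not the patch itself: the class name, method names, and sample figures are hypothetical, and returning 0% when Configured Capacity is 0 is an assumption made to avoid a division by zero.

```java
// Hypothetical sketch of the proposed Web UI figures; not the code in the patch.
final class ProposedCapacityReport {
    /** Configured Capacity: disk space of all data directories minus dfs.datanode.du.reserved. */
    static long configuredCapacity(long rawDiskSpace, long reserved) {
        return rawDiskSpace - reserved;
    }

    /** Non DFS Used: 0 unless temporary files exceed the reserved space, else the excess. */
    static long nonDfsUsed(long temporaryFiles, long reserved) {
        return Math.max(0L, temporaryFiles - reserved);
    }

    /** DFS Remaining: Configured Capacity - DFS Used - Non DFS Used. */
    static long dfsRemaining(long configured, long dfsUsed, long nonDfsUsed) {
        return configured - dfsUsed - nonDfsUsed;
    }

    /** Shared helper: (part / Configured Capacity) * 100. Assumption: 0% if capacity is 0. */
    static double percentOfConfigured(long part, long configured) {
        return configured == 0 ? 0.0 : 100.0 * part / configured;
    }

    public static void main(String[] args) {
        // Hypothetical sample figures, in arbitrary units.
        long reserved = 100L;
        long configured = configuredCapacity(1000L, reserved);       // 900
        long dfsUsed = 300L;
        long nonDfs = nonDfsUsed(250L, reserved);                    // 150
        long remaining = dfsRemaining(configured, dfsUsed, nonDfs);  // 450
        System.out.println("Configured Capacity = " + configured);
        System.out.println("Non DFS Used        = " + nonDfs);
        System.out.println("DFS Remaining       = " + remaining);
        System.out.println("DFS Used %          = " + percentOfConfigured(dfsUsed, configured));
        System.out.println("DFS Remaining %     = " + percentOfConfigured(remaining, configured));
    }
}
```

Because both percentages share Configured Capacity as the denominator, DFS Used % here is the same ratio the balancer computes, which is the point of the proposal.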

          Robert Chansler added a comment -

          We really want this in 0.19 to avoid operational confusion. A successful execution of the rebalancer should result in the appearance of balance on the home page!

          Suresh Srinivas added a comment -

          The changes are based on the solution presented in an earlier comment.

          Here is the test-patch result:
          [exec] +1 overall.

          [exec] +1 @author. The patch does not contain any @author tags.

          [exec] +1 tests included. The patch appears to include 3 new or modified tests.

          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.

          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.

          Hairong Kuang added a comment -

          1. DatanodeInfo.append() line 190: u should be nonDFSUsed.
          2. FSNamesystem.getCapacityUsedNonDFS() line 3306: the return should be outside the synchronized block.
          3. FSNamesystem.getCapacityRemainingPercent() line 3324: the calculation is not consistent with the one in DatanodeInfo.getRemainingPercent().

          Suresh Srinivas added a comment -

          Thanks Hairong for the review. I have attached a new patch with the suggested changes.

          Suresh Srinivas added a comment -

          Previous patch does not build. Attaching a new one.

          Hairong Kuang added a comment -

          +1. The patch looks good to me.

          Hadoop QA added a comment -

          +1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12392389/HADOOP-4430.patch
          against trunk revision 705831.

          +1 @author. The patch does not contain any @author tags.

          +1 tests included. The patch appears to include 3 new or modified tests.

          +1 javadoc. The javadoc tool did not generate any warning messages.

          +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          +1 findbugs. The patch does not introduce any new Findbugs warnings.

          +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          +1 core tests. The patch passed core unit tests.

          +1 contrib tests. The patch passed contrib unit tests.

          Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3492/testReport/
          Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3492/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
          Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3492/artifact/trunk/build/test/checkstyle-errors.html
          Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3492/console

          This message is automatically generated.

          Hairong Kuang added a comment -

          I just committed this. Thank you, Suresh!

          Raghu Angadi added a comment -

          What is the worst case possible if someone upgrades without noting the changes?

          Suresh Srinivas added a comment -

          This change is mainly related to the Web UI. It makes the way file system capacity is represented on the Web UI clearer. It should not affect any functionality after the upgrade.

          Hudson added a comment -

          Integrated in Hadoop-trunk #640 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/640/ )

            People

            • Assignee:
              Suresh Srinivas
              Reporter:
              Suresh Srinivas
            • Votes:
              0
              Watchers:
              1
