Hadoop Common / HADOOP-4430

Namenode Web UI capacity report is inconsistent with Balancer

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.19.0
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Incompatible change, Reviewed
    • Release Note:
      Changed reporting in the NameNode Web UI to more closely reflect the behavior of the re-balancer. Removed no longer used config parameter dfs.datanode.du.pct from hadoop-default.xml.

      Description

      The solution to HADOOP-2816 made the following changes:

      • It changed the Total Capacity definition from (the disk space of all the data directories) to (the disk space of all the data directories - the reserved space).
      • It added a new element, Present Capacity, to the report. It is set to (Used Capacity + Remaining Capacity).
      • It changed the reported Used Percentage from (Used Capacity)/(Total Capacity) to (Used Capacity)/(Present Capacity).
      • All these changes are displayed on the Namenode Web UI.
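
The definitions listed above can be sketched as a few lines of arithmetic. This is only an illustration, not the NameNode's code: the class name `CapacityReport2816`, its field names, and the sample byte counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CapacityReport2816:
    """Hypothetical container for one capacity report; all sizes in bytes."""
    disk_space: int   # disk space of all the data directories
    reserved: int     # reserved space
    used: int         # Used Capacity
    remaining: int    # Remaining Capacity

    @property
    def total_capacity(self) -> int:
        # Total Capacity = disk space of all data directories - reserved space
        return self.disk_space - self.reserved

    @property
    def present_capacity(self) -> int:
        # Present Capacity = Used Capacity + Remaining Capacity
        return self.used + self.remaining

    @property
    def used_percent(self) -> float:
        # Used Percentage = Used Capacity / Present Capacity
        present = self.present_capacity
        return 100.0 * self.used / present if present else 0.0


# Made-up numbers, chosen only to keep the arithmetic easy to follow.
report = CapacityReport2816(disk_space=1000, reserved=100, used=300, remaining=200)
print(report.total_capacity, report.present_capacity, report.used_percent)
```

With these sample values, Total Capacity is 900 while Present Capacity is only 500, so the reported Used Percentage is 60% rather than the 33% that dividing by Total Capacity would give.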

      Balancer functionality
      The Balancer script is started with a threshold parameter. It tries to move blocks from nodes whose Used % is more than (Cluster average + threshold) to nodes whose Used % is less than (Cluster average - threshold). Essentially, the balancer brings the Used % of every datanode to within (Cluster average +/- threshold).
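
The threshold rule described above might be sketched like this. It is a simplified illustration, not the Balancer's implementation: the function name `classify_datanodes`, the node names, and the sample numbers are hypothetical, and the cluster average is assumed to be total used space divided by total capacity.

```python
def classify_datanodes(nodes, threshold):
    """Split datanodes by Used % relative to the cluster average.

    nodes: dict mapping a node name to (used_bytes, capacity_bytes);
           every capacity is assumed to be non-zero.
    threshold: allowed deviation from the cluster average, in percentage points.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    # Assumption: cluster average = total used / total capacity.
    cluster_avg = 100.0 * total_used / total_capacity

    over, under, within = [], [], []
    for name, (used, capacity) in sorted(nodes.items()):
        pct = 100.0 * used / capacity
        if pct > cluster_avg + threshold:
            over.append(name)      # candidate source of blocks
        elif pct < cluster_avg - threshold:
            under.append(name)     # candidate destination for blocks
        else:
            within.append(name)    # already inside average +/- threshold
    return cluster_avg, over, under, within


nodes = {"dn1": (90, 100), "dn2": (50, 100), "dn3": (10, 100)}
cluster_avg, over, under, within = classify_datanodes(nodes, threshold=10.0)
print(cluster_avg, over, under, within)
```

Here the cluster average is 50%, so with a threshold of 10 the node at 90% is a source, the node at 10% is a destination, and the node at 50% is left alone.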

      Inconsistencies due to the change in 2816
      When MapReduce jobs run, they generate temporary files, which take a lot of space away from Present Capacity. The difference between Total Capacity and Present Capacity can therefore be huge. Currently the balancer computes Used Percentage as (Used Capacity)/(Total Capacity), so the Used % the balancer uses can differ significantly from the Used % displayed on the Namenode Web UI. When the balancer finishes balancing, the Used % shown on the Namenode Web UI might still appear unbalanced.
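
A small worked example makes the mismatch concrete. It is a sketch under stated assumptions: the function name `used_percentages` and the byte counts are hypothetical, and Remaining Capacity is assumed to shrink by exactly the space the temporary files occupy.

```python
def used_percentages(total_capacity, used, temp_files):
    """Return (balancer Used %, Web UI Used %) for one datanode.

    Assumption: temporary files reduce Remaining Capacity one for one.
    """
    remaining = total_capacity - used - temp_files
    present_capacity = used + remaining
    balancer_pct = 100.0 * used / total_capacity     # balancer's view
    web_ui_pct = 100.0 * used / present_capacity     # Web UI's view
    return balancer_pct, web_ui_pct


# Two nodes holding the same amount of DFS data; only one has temporary files.
idle_node = used_percentages(total_capacity=1000, used=300, temp_files=0)
busy_node = used_percentages(total_capacity=1000, used=300, temp_files=600)
print(idle_node, busy_node)
```

Both nodes are at 30% from the balancer's point of view, so it sees nothing to move, while the Web UI would show 30% for one node and 75% for the other.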

      1. HADOOP-4430.patch
        22 kB
        Suresh Srinivas
      2. HADOOP-4430.patch
        22 kB
        Suresh Srinivas
      3. HADOOP-4430.patch
        22 kB
        Suresh Srinivas

          Activity

          Suresh Srinivas created issue -
          Robert Chansler made changes -
          Field Original Value New Value
          Priority Major [ 3 ] Blocker [ 1 ]
          Suresh Srinivas made changes -
          Link This issue blocks HADOOP-2816 [ HADOOP-2816 ]
          Devaraj Das made changes -
          Component/s dfs [ 12310710 ]
          Suresh Srinivas made changes -
          Attachment HADOOP-4430.patch [ 12392371 ]
          Suresh Srinivas made changes -
          Release Note Incompatible changes:
          1) Config parameter dfs.datanode.du.pct is no longer used and is removed from hadoop-default.xml.

          2) Namenode Web UI has the following changes:
             The following parameters are removed:
             * Total Capacity

             The following parameters are added to both Cluster Summary and Datanode information:
             * Configured Capacity - This is the total disk space of all the data directories minus the reserved capacity defined by
             * Non DFS Used - This indicates the disk space taken by non-DFS files
             * DFS remaining % - This is the remaining % of Configured Capacity available for DFS use

             The following parameters are modified:
             * DFS Used % - This is changed from % of Total Capacity to % of Configured Capacity
          Hadoop Flags [Incompatible change]
          Status Open [ 1 ] Patch Available [ 10002 ]
          Suresh Srinivas made changes -
          Attachment HADOOP-4430.patch [ 12392383 ]
          Suresh Srinivas made changes -
          Attachment HADOOP-4430.patch [ 12392389 ]
          Suresh Srinivas made changes -
          Release Note Incompatible changes:
          This change modifies/retains the changes made in 2816 as follows:
          1) Present Capacity added in 2816 is removed from the Web UI
          2) Change of Total Capacity to Configured Capacity and its definition from 2816 is retained in the Web UI
          3) Data node protocol change to report Configured Capacity instead of Total Capacity is retained.
          4) DFS Used% was calculated as a percentage of Present Capacity. It is changed to percentage of Configured Capacity.

          Other incompatible changes:
          1) Config parameter dfs.datanode.du.pct is no longer used and is removed from the hadoop-default.xml.

          2) Namenode Web UI has the following additional changes:
             The following parameters are added to both Cluster Summary and Datanode information:
             * Non DFS Used - This indicates the disk space taken by non-DFS files
             * DFS remaining % - This is the remaining % of Configured Capacity available for DFS use
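
The Web UI figures described in this release note could be derived roughly as follows. This is a sketch only: the function name `cluster_summary` and the sample values are hypothetical, and it assumes Non DFS Used is whatever part of Configured Capacity is neither DFS Used nor remaining.

```python
def cluster_summary(configured_capacity, dfs_used, remaining):
    """Derive the reported figures from three byte counts.

    Assumption: Non DFS Used = Configured Capacity - DFS Used - remaining.
    configured_capacity is assumed to be non-zero.
    """
    non_dfs_used = configured_capacity - dfs_used - remaining
    return {
        "Configured Capacity": configured_capacity,
        "Non DFS Used": non_dfs_used,
        # Both percentages are taken against Configured Capacity.
        "DFS Used %": 100.0 * dfs_used / configured_capacity,
        "DFS remaining %": 100.0 * remaining / configured_capacity,
    }


summary = cluster_summary(configured_capacity=1000, dfs_used=300, remaining=100)
print(summary)
```

Because DFS Used % is now a percentage of Configured Capacity, space taken by non-DFS files shows up under Non DFS Used instead of inflating the used percentage.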
          Hairong Kuang made changes -
          Hadoop Flags [Incompatible change] [Incompatible change, Reviewed]
          Resolution Fixed [ 1 ]
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Robert Chansler made changes -
          Hadoop Flags [Reviewed, Incompatible change] [Incompatible change, Reviewed]
          Release Note Changed reporting in the NameNode Web UI to more closely reflect the behavior of the re-balancer. Removed no longer used config parameter dfs.datanode.du.pct from hadoop-default.xml.
          Nigel Daley made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Owen O'Malley made changes -
          Component/s dfs [ 12310710 ]
          Jeff Hammerbacher made changes -
          Link This issue relates to HDFS-1564 [ HDFS-1564 ]

            People

            • Assignee:
              Suresh Srinivas
              Reporter:
              Suresh Srinivas
            • Votes:
              0
              Watchers:
              1
