Details
- Bug
- Status: Closed
- Blocker
- Resolution: Fixed
- 0.19.0
- None
- None
- Incompatible change, Reviewed
Description
The fix for HADOOP-2816 made the following changes:
- The definition of Total Capacity changed from (the disk space of all data directories) to (the disk space of all data directories - the reserved space).
- A new element, Present Capacity, was added to the report. It is set to (Used Capacity + Remaining Capacity).
- The reported Used Percentage changed from (Used Capacity)/(Total Capacity) to (Used Capacity)/(Present Capacity).
- All of these changes are displayed on the Namenode Web UI.
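The definitions above can be written out as a minimal sketch. The class and function names here (`DatanodeSpace`, `total_capacity`, and so on) are hypothetical and chosen for illustration only; they are not Hadoop's actual API.

```python
from dataclasses import dataclass


@dataclass
class DatanodeSpace:
    """Space figures in bytes. Field names are illustrative, not Hadoop's."""
    raw_disk: int   # disk space of all data directories
    reserved: int   # reserved space
    dfs_used: int   # Used Capacity
    remaining: int  # Remaining Capacity


def total_capacity(node: DatanodeSpace) -> int:
    # Total Capacity after HADOOP-2816: raw disk space minus reserved space.
    return node.raw_disk - node.reserved


def present_capacity(node: DatanodeSpace) -> int:
    # Present Capacity: Used Capacity + Remaining Capacity.
    return node.dfs_used + node.remaining


def used_percent(node: DatanodeSpace) -> float:
    # Used Percentage after HADOOP-2816: Used Capacity / Present Capacity.
    present = present_capacity(node)
    return 100.0 * node.dfs_used / present if present else 0.0


node = DatanodeSpace(raw_disk=1000, reserved=100, dfs_used=300, remaining=300)
print(total_capacity(node), present_capacity(node), used_percent(node))
```

With these made-up figures, Total Capacity is 900 while Present Capacity is only 600, so the reported Used Percentage is computed against the smaller number.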
Balancer functionality
The Balancer script is started with a threshold parameter. It tries to move blocks from nodes whose Used % is more than (Cluster average + threshold) to nodes whose Used % is less than (Cluster average - threshold). In effect, the balancer brings the Used % of every datanode to within (Cluster average +/- threshold).
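The threshold rule can be sketched as follows. This is only an illustration of the comparison described above, not the balancer's actual code; the function name `classify_nodes` is hypothetical, and the cluster average is passed in as a parameter rather than computed, since this description does not say how it is derived.

```python
from typing import Dict, List, Tuple


def classify_nodes(
    used_percent: Dict[str, float],
    cluster_average: float,
    threshold: float,
) -> Tuple[List[str], List[str]]:
    """Return (sources, targets) for block movement.

    sources: nodes with Used % above cluster_average + threshold
    targets: nodes with Used % below cluster_average - threshold
    """
    sources = sorted(
        name for name, pct in used_percent.items()
        if pct > cluster_average + threshold
    )
    targets = sorted(
        name for name, pct in used_percent.items()
        if pct < cluster_average - threshold
    )
    return sources, targets


# Made-up cluster: average 50%, threshold 10 percentage points.
usage = {"dn1": 80.0, "dn2": 50.0, "dn3": 20.0}
sources, targets = classify_nodes(usage, cluster_average=50.0, threshold=10.0)
print(sources, targets)
```

Nodes inside the band, such as `dn2` here, are neither sources nor targets.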
Inconsistencies due to the change in 2816
When MapReduce jobs run, they generate temporary files, which take a lot of space away from Present Capacity. The difference between Total Capacity and Present Capacity can therefore be huge. Currently the balancer computes Used Percentage as (Used Capacity)/(Total Capacity), so the Used % the balancer uses can differ significantly from the Used % displayed on the Namenode Web UI. When the balancer has finished balancing, the Used % shown on the Namenode Web UI might still appear unbalanced.
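A small worked example makes the mismatch concrete. All figures below are invented, and the helper names are hypothetical; the sketch assumes the temporary files sit on the same disks as the data directories and therefore reduce Remaining Capacity.

```python
def balancer_used_percent(dfs_used: int, total_capacity: int) -> float:
    # What the balancer currently uses: Used Capacity / Total Capacity.
    return 100.0 * dfs_used / total_capacity


def web_ui_used_percent(dfs_used: int, remaining: int) -> float:
    # What the Namenode Web UI shows: Used Capacity / Present Capacity,
    # where Present Capacity = Used Capacity + Remaining Capacity.
    return 100.0 * dfs_used / (dfs_used + remaining)


TOTAL = 1000  # Total Capacity of each node (made-up units)

# Node A holds 400 units of MapReduce temporary files; node B holds none.
node_a = {"dfs_used": 300, "remaining": TOTAL - 300 - 400}
node_b = {"dfs_used": 300, "remaining": TOTAL - 300}

for name, node in (("A", node_a), ("B", node_b)):
    print(
        name,
        balancer_used_percent(node["dfs_used"], TOTAL),
        web_ui_used_percent(node["dfs_used"], node["remaining"]),
    )
```

Under these assumptions the balancer sees both nodes at 30% and considers them balanced, while the Web UI reports node A at 50% and node B at 30%.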
Attachments
Issue Links
- blocks: HADOOP-2816 Cluster summary at name node web has confusing report for space utilization (Closed)
- relates to: HDFS-1564 Make dfs.datanode.du.reserved configurable per volume (Open)