[HDFS-3070] HDFS balancer doesn't ensure that hdfs-site.xml is loaded - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Major
Resolution: Fixed
Affects Version/s: 2.0.0-alpha
Fix Version/s: 2.0.0-alpha
Component/s: balancer & mover
Labels: None
Target Version/s: 2.0.0-alpha
Hadoop Flags: Reviewed

Description

I used TeraGen to generate data onto DataNodes styx01 and styx02. According to the web UI, both now show over 3% disk usage.
Attached is a screenshot of the Live Nodes web UI.

On styx01, I ran the hdfs balancer command with a threshold of 1%, but the blocks were not balanced across all 4 DataNodes (all blocks on styx01 and styx02 stayed put).

HA is currently enabled.

[schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
active
[schu@styx01 ~]$ hdfs balancer -threshold 1
12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
12/03/08 10:10:32 INFO balancer.Balancer: p = Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
Balancing took 95.0 milliseconds
[schu@styx01 ~]$

I believe that with a threshold of 1% the balancer should move blocks across the DataNodes, right? I am also curious about the "namenodes = []" line in the output above.
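Given the issue title, one plausible reading of the empty "namenodes = []" line is that the HA NameNode settings from hdfs-site.xml were never loaded, so there was nothing to balance against. The sketch below is a toy model of that effect, not Hadoop's actual code: it derives NameNode RPC addresses from the standard HA keys (dfs.nameservices, dfs.ha.namenodes.<nameservice>, dfs.namenode.rpc-address.<nameservice>.<namenode>). The nameservice "ns1", the host names, and the helper functions are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; "ns1" and the host names are made up.
HDFS_SITE = """
<configuration>
  <property><name>dfs.nameservices</name><value>ns1</value></property>
  <property><name>dfs.ha.namenodes.ns1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.ns1.nn1</name><value>nn-host-1.example:8020</value></property>
  <property><name>dfs.namenode.rpc-address.ns1.nn2</name><value>nn-host-2.example:8020</value></property>
</configuration>
"""

def load_properties(xml_text):
    """Parse Hadoop-style <property> elements into a plain dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

def namenode_addresses(conf):
    """Collect the RPC address of every NameNode in every nameservice."""
    addresses = []
    for ns in filter(None, conf.get("dfs.nameservices", "").split(",")):
        for nn in filter(None, conf.get(f"dfs.ha.namenodes.{ns}", "").split(",")):
            addr = conf.get(f"dfs.namenode.rpc-address.{ns}.{nn}")
            if addr:
                addresses.append(addr)
    return addresses

# With hdfs-site.xml loaded, both NameNodes are found.
with_site = namenode_addresses(load_properties(HDFS_SITE))

# Without it, the HA keys are absent and the list is empty.
without_site = namenode_addresses({})
```

In this model, a tool that builds its configuration without the HDFS-specific file ends up with an empty NameNode list and has no work to do, which would be consistent with a run that finishes in milliseconds without moving any bytes.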

[schu@styx01 ~]$ hadoop version
Hadoop 0.24.0-SNAPSHOT
Subversion git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common -r f6a577d697bbcd04ffbc568167c97b79479ff319
Compiled by schu on Thu Mar 8 15:32:50 PST 2012
From source with checksum ec971a6e7316f7fbf471b617905856b8

From http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
The threshold parameter is a fraction in the range of (0%, 100%) with a default value of 10%. The threshold sets a target for whether the cluster is balanced. A cluster is balanced if for each datanode, the utilization of the node (ratio of used space at the node to total capacity of the node) differs from the utilization of the cluster (ratio of used space in the cluster to total capacity of the cluster) by no more than the threshold value. The smaller the threshold, the more balanced a cluster will become. It takes more time to run the balancer for small threshold values. Also for a very small threshold the cluster may not be able to reach the balanced state when applications write and delete files concurrently.
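The balance criterion quoted above can be written out as a short check. This is a minimal sketch of the definition only, not the Balancer's implementation; the function names and the four-node usage figures are hypothetical (the report only states that styx01 and styx02 are over 3% used).

```python
def cluster_utilization(nodes):
    """Cluster utilization in percent; nodes is a list of (used, capacity) pairs."""
    total_used = sum(used for used, _ in nodes)
    total_capacity = sum(capacity for _, capacity in nodes)
    return 100.0 * total_used / total_capacity

def is_balanced(nodes, threshold_pct=10.0):
    """True if every node's utilization is within threshold_pct of the cluster's."""
    average = cluster_utilization(nodes)
    return all(
        abs(100.0 * used / capacity - average) <= threshold_pct
        for used, capacity in nodes
    )

# Hypothetical cluster: two nodes at 3.5% usage, two at 0.1% usage.
nodes = [(35, 1000), (35, 1000), (1, 1000), (1, 1000)]

balanced_at_default = is_balanced(nodes)        # threshold 10%
balanced_at_one_pct = is_balanced(nodes, 1.0)   # threshold 1%
```

With these assumed numbers the cluster utilization is 1.8%, so every node deviates by 1.7 percentage points: balanced at the default 10% threshold, but not at 1%, which is why a 1% run would be expected to move blocks.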

Attachments

HDFS-3070.patch (1 kB), 31/Mar/12 02:11, Aaron Myers
unbalanced_nodes_inservice.png (59 kB), 09/Mar/12 00:57, Stephen Chu
unbalanced_nodes.png (59 kB), 09/Mar/12 00:55, Stephen Chu

People

Assignee: Aaron Myers
Reporter: Stephen Chu
Votes: 0
Watchers: 6

Dates

Created: 09/Mar/12 00:54
Updated: 28/Sep/15 20:58
Resolved: 31/Mar/12 16:23