Details
Task
Status: Resolved
Major
Resolution: Fixed
3.0.0
None
Description
From http://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/HDFSErasureCoding.html:
For rack fault-tolerance, it is also important to have at least as many racks as the configured EC stripe width. For EC policy RS (6,3), this means minimally 9 racks, and ideally 10 or 11 to handle planned and unplanned outages. For clusters with fewer racks than the stripe width, HDFS cannot maintain rack fault-tolerance, but will still attempt to spread a striped file across multiple nodes to preserve node-level fault-tolerance.
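For RS (6,3), the theoretical minimum is 3 racks, not 9: if the 9 blocks of a block group are spread evenly over 3 racks, each rack holds 3 blocks, and RS (6,3) can tolerate the loss of any 3 blocks. Ideally there are 9 or more racks, so the document should be updated.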
(I didn't check timestamps, but this is probably because BlockPlacementPolicyRackFaultTolerant wasn't fully implemented when HDFS-9088 introduced this doc. Later, examples were also added in TestErasureCodingMultipleRacks to test this explicitly.)
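The 3-rack minimum can be checked with a little arithmetic. The sketch below is not Hadoop code: the class `EcRackMath` and its method names are hypothetical, and it assumes the blocks of one block group are spread as evenly as possible across racks. Under that assumption, a single rack failure is survivable when no rack holds more blocks than the policy has parity units.

```java
public class EcRackMath {

    // Hypothetical helper: smallest rack count such that an even spread puts
    // at most parityUnits blocks of one block group on any single rack.
    static int minRacksForRackFaultTolerance(int dataUnits, int parityUnits) {
        if (dataUnits <= 0 || parityUnits <= 0) {
            throw new IllegalArgumentException("dataUnits and parityUnits must be positive");
        }
        int stripeWidth = dataUnits + parityUnits;
        // Integer ceiling of stripeWidth / parityUnits.
        return (stripeWidth + parityUnits - 1) / parityUnits;
    }

    // Largest number of blocks of one group on a single rack, assuming an even spread.
    static int maxBlocksPerRack(int dataUnits, int parityUnits, int racks) {
        if (racks <= 0) {
            throw new IllegalArgumentException("racks must be positive");
        }
        int stripeWidth = dataUnits + parityUnits;
        // Integer ceiling of stripeWidth / racks.
        return (stripeWidth + racks - 1) / racks;
    }

    // Losing one rack is recoverable if it costs no more blocks than there are parity units.
    static boolean survivesOneRackLoss(int dataUnits, int parityUnits, int racks) {
        return maxBlocksPerRack(dataUnits, parityUnits, racks) <= parityUnits;
    }

    public static void main(String[] args) {
        int data = 6;
        int parity = 3; // RS (6,3)
        System.out.println("Minimum racks: " + minRacksForRackFaultTolerance(data, parity));
        for (int racks = 1; racks <= 9; racks++) {
            System.out.println(racks + " rack(s): at most "
                + maxBlocksPerRack(data, parity, racks) + " block(s) per rack, survives one rack loss = "
                + survivesOneRackLoss(data, parity, racks));
        }
    }
}
```

With 9 or more racks each rack holds at most one block of a group, which is why that remains the ideal layout even though 3 racks is enough in theory.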