A proposed update to https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsDesign.md, in the section "Replica Placement: The First Baby Steps", 4th paragraph, second-to-last line.
The sentence is ambiguous to the reader.
Consider the statement as three parts separated by the commas:
- the first part talks about "one third of replicas";
- the second part talks about "two thirds of replicas";
- the third part, which talks about "the other third", is ambiguous, since one third and two thirds already account for the whole.
The proposal is to either drop the third part or rephrase the entire sentence so that it captures the overall meaning.
In other words, replace
One third of replicas are on one node, two thirds of replicas are on one rack, and the other third are evenly distributed across the remaining racks.
with either
One third of replicas are on one node, two thirds of replicas are on one rack.
or
Two replicas are on different nodes of one rack and the remaining replica is on a node of one of the other racks.
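The layout described by the sentence "Two replicas are on different nodes of one rack and the remaining replica is on a node of one of the other racks" can be illustrated with a minimal Python sketch. It assumes a replication factor of 3 and uses made-up rack and node names; it only tallies where the replicas of a single block sit and is not HDFS's placement code.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical layout for ONE block, assuming a replication factor of 3.
# Each entry is (rack, node); the names are invented for illustration.
replicas = [
    ("rack1", "node-a"),
    ("rack1", "node-b"),
    ("rack2", "node-c"),
]


def share_by(replicas, index):
    """Return the fraction of replicas per rack (index 0) or per node (index 1)."""
    counts = Counter(replica[index] for replica in replicas)
    total = len(replicas)
    return {key: Fraction(count, total) for key, count in counts.items()}


rack_share = share_by(replicas, 0)  # two thirds on rack1, one third on rack2
node_share = share_by(replicas, 1)  # one third on each of the three nodes

print("per rack:", rack_share)
print("per node:", node_share)
```

In this sketch the per-rack shares (two thirds and one third) already sum to the whole block, which is why a further "other third" in the original sentence is confusing.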
In addition, two more sentences in the same paragraph will be corrected.
However, it does reduce the aggregate network bandwidth used when reading data since a block is placed in only two unique racks rather than three.
will replace: does reduce -> does not reduce
With this policy, the replicas of a file do not evenly distribute across the racks.
will replace: replicas of a file -> replicas of a block
Please point out if any meaning is lost with this replacement.