Hadoop Common / HADOOP-5734

HDFS architecture documentation describes outdated placement policy

Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.20.0
    • Fix Version/s: 0.21.0
    • Component/s: documentation
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      The "Replica Placement: The First Baby Steps" section of the HDFS architecture document states:

      "...
      For the common case, when the replication factor is three, HDFS's placement policy is to put one replica on one node in the local rack, another on a different node in the local rack, and the last on a different node in a different rack. This policy cuts the inter-rack write traffic which generally improves write performance.
      ..."

      However, according to the code of ReplicationTargetChooser.chooseTarget(), the actual logic places the second replica on a different rack and the third replica on that same remote rack. The result is one replica (the initial one) on a node in the local rack and two replicas on different nodes of a single remote rack. The sentence should therefore read something like this:

      "For the common case, when the replication factor is three, HDFS's placement policy is to put one replica on one node in the local rack, another on a node in a different (remote) rack, and the last on a different node in the same remote rack. This policy cuts the inter-rack write traffic which generally improves write performance."

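      The corrected policy can be illustrated with a small sketch. This is not the ReplicationTargetChooser code: the class PlacementSketch, its methods, and the rack and node names are hypothetical, the first replica is assumed to land on the writer's own node, and nodes are picked deterministically, with no random selection or fallback handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the corrected placement for replication factor 3.
class PlacementSketch {

    /** Returns the rack that contains the given node, or null if the node is unknown. */
    public static String rackOf(String node, Map<String, List<String>> racks) {
        for (Map.Entry<String, List<String>> e : racks.entrySet()) {
            if (e.getValue().contains(node)) {
                return e.getKey();
            }
        }
        return null;
    }

    /** Picks three targets: one local, two on different nodes of one remote rack. */
    public static List<String> chooseTargets(String writer, Map<String, List<String>> racks) {
        String localRack = rackOf(writer, racks);
        if (localRack == null) {
            throw new IllegalArgumentException("unknown writer node: " + writer);
        }

        List<String> targets = new ArrayList<>();
        // Replica 1: a node in the local rack (assumed here to be the writer itself).
        targets.add(writer);

        // Find the first rack other than the local one that has at least two nodes.
        String remoteRack = null;
        for (Map.Entry<String, List<String>> e : racks.entrySet()) {
            if (!e.getKey().equals(localRack) && e.getValue().size() >= 2) {
                remoteRack = e.getKey();
                break;
            }
        }
        if (remoteRack == null) {
            throw new IllegalStateException("need a remote rack with at least two nodes");
        }

        List<String> remoteNodes = racks.get(remoteRack);
        // Replica 2: a node in a different (remote) rack.
        targets.add(remoteNodes.get(0));
        // Replica 3: a different node in the same remote rack as replica 2.
        targets.add(remoteNodes.get(1));
        return targets;
    }

    public static void main(String[] args) {
        Map<String, List<String>> racks = new LinkedHashMap<>();
        racks.put("/rack1", Arrays.asList("n1", "n2"));
        racks.put("/rack2", Arrays.asList("n3", "n4"));
        System.out.println(chooseTargets("n1", racks)); // prints [n1, n3, n4]
    }
}
```

      In this sketch the block crosses the rack boundary once on its way to the second replica, and the third replica stays inside the remote rack, which is the layout the proposed sentence describes.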
      Attachments

        1. HADOOP-5734.patch
          4 kB
          Konstantin I Boudnik

          People

            Assignee: Konstantin I Boudnik (cos)
            Reporter: Konstantin I Boudnik (cos)
            Votes: 0
            Watchers: 1

              Time Tracking

                Original Estimate: Not Specified
                Remaining Estimate: 0h
                Time Spent: 2h