Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-692

Rack-aware Replica Placement

VotersWatch issueWatchersCreate sub-taskLinkCloneUpdate Comment AuthorReplace String in CommentUpdate Comment VisibilityDelete Comments
    XMLWordPrintableJSON

Details

    • Improvement
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.10.1
    • 0.11.0
    • None
    • None

    Description

      This issue assumes that HDFS runs on a cluster of computers that spread across many racks. Communication between two nodes on different racks needs to go through switches. Bandwidth in/out of a rack may be less than the total bandwidth of machines in the rack. The purpose of rack-aware replica placement is to improve data reliability, availability, and network bandwidth utilization. The basic idea is that each data node determines to which rack it belongs at the startup time and notifies the name node of the rack id upon registration. The name node maintains a rackid-to-datanode map and tries to place replicas across racks.

      Attachments

        1. Rack_aware_HDFS_proposal.pdf
          16 kB
          Hairong Kuang
        2. rack.patch
          101 kB
          Hairong Kuang

        Issue Links

        Activity

          This comment will be Viewable by All Users Viewable by All Users
          Cancel

          People

            hairong Hairong Kuang
            hairong Hairong Kuang
            Votes:
            0 Vote for this issue
            Watchers:
            6 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved:

              Slack

                Issue deployment