Hadoop HDFS / HDFS-12270

Allow more spreading of replicas during block placement


Details

    • Type: Improvement
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: block placement
    • Labels: None

    Description

      The default block placement policy places the first replica locally if possible, the second on a node in a remote rack, and the third on another node in that same remote rack. If more than 3 replicas are requested, the rest are spread across the available racks. This strategy was chosen to minimize inter-rack traffic while still tolerating a rack-level failure such as a switch outage.

      This tolerates a single rack failure, but if a node outage occurs at the same time (a double failure), missing blocks are highly likely. Although network bandwidth is still a limited resource, it is less so than in the past. Some users might want increased data availability at the price of increased inter-rack traffic.
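To make the double-failure argument concrete, here is a minimal sketch that models only the rack choice described above and counts the replicas left after the worst-case rack failure plus one more node outage. It is a simplified illustration, not the HDFS placement code: the class and method names are hypothetical, the remote rack is picked deterministically, and extra replicas are spread round-robin, whereas the real policy chooses randomly.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of rack selection; hypothetical, not the actual HDFS code. */
class DefaultPlacementSketch {

    /**
     * Returns the rack chosen for each replica, in placement order.
     * Assumes the writer runs on a datanode in localRack and that every
     * rack has enough free nodes.
     */
    static List<String> chooseRacks(String localRack, List<String> racks, int replication) {
        List<String> remotes = new ArrayList<>(racks);
        remotes.remove(localRack);
        // With a single rack there is no remote choice; fall back to the local rack.
        String remote = remotes.isEmpty() ? localRack : remotes.get(0);

        List<String> chosen = new ArrayList<>();
        for (int i = 0; i < replication; i++) {
            if (i == 0) {
                chosen.add(localRack);               // first replica: local
            } else if (i <= 2) {
                chosen.add(remote);                  // second and third: same remote rack
            } else {
                chosen.add(racks.get(i % racks.size())); // extras: round-robin (assumption)
            }
        }
        return chosen;
    }

    /** Replicas left after losing the fullest rack and then one more node. */
    static int worstCaseSurvivors(List<String> chosen) {
        Map<String, Integer> perRack = new HashMap<>();
        for (String r : chosen) {
            perRack.merge(r, 1, Integer::sum);
        }
        int max = 0;
        for (int c : perRack.values()) {
            max = Math.max(max, c);
        }
        return Math.max(0, chosen.size() - max - 1);
    }
}
```

Under this model, three replicas on two racks leave no survivor after a rack failure plus a node outage, while the same three replicas on three racks leave one.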

      This can be achieved by using the upgrade domain feature, but a simple tweak to the default policy can also enable it for those who do not want to adopt upgrade domains.

      I propose introducing a new config to control this.
      Rack placement level 0: default. Current behavior.
      Rack placement level 1: Use minimum 3 racks, if available. Allow existing blocks to remain as is.
      Rack placement level 2: Use minimum 3 racks, if available. Apply this policy to all replication verification (e.g. replication queue initialization).
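The three levels could be checked roughly as follows. This is a sketch under stated assumptions, not a proposed patch: the enum, its constant names, and both methods are hypothetical, and the current behavior (level 0) is assumed to require two racks whenever two are available.

```java
import java.util.HashSet;
import java.util.List;

/** Hypothetical sketch of the proposed rack placement levels. */
class RackPlacementLevelSketch {

    /** Levels 0, 1 and 2 from the proposal; the names are illustrative only. */
    enum Level { DEFAULT, SPREAD_NEW_BLOCKS, SPREAD_ALL_BLOCKS }

    /** Maps the numeric config value (0, 1 or 2) to a level. */
    static Level fromInt(int value) {
        Level[] all = Level.values();
        if (value < 0 || value >= all.length) {
            throw new IllegalArgumentException("Unknown rack placement level: " + value);
        }
        return all[value];
    }

    /** Minimum number of distinct racks wanted for a block. */
    static int requiredRacks(Level level, int replication, int availableRacks,
                             boolean existingBlock) {
        // "If available": never ask for more racks than exist or than there are replicas.
        int cap = Math.min(replication, availableRacks);
        // Assumption: the current behavior asks for two racks.
        int wanted = 2;
        // Level 1 only affects new placements; level 2 also covers existing blocks.
        if (level == Level.SPREAD_ALL_BLOCKS
                || (level == Level.SPREAD_NEW_BLOCKS && !existingBlock)) {
            wanted = 3;
        }
        return Math.max(1, Math.min(wanted, cap));
    }

    /** True if the racks holding a block's replicas satisfy the given level. */
    static boolean isSatisfied(List<String> replicaRacks, Level level,
                               int availableRacks, boolean existingBlock) {
        int distinct = new HashSet<>(replicaRacks).size();
        return distinct >= requiredRacks(level, replicaRacks.size(),
                                         availableRacks, existingBlock);
    }
}
```

With this shape, an existing block on two racks passes verification at levels 0 and 1 but is flagged at level 2, which is the distinction the proposal draws between leaving existing blocks as is and applying the policy everywhere.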

          People

            kihwal Kihwal Lee
