  1. Hadoop Common
  2. HADOOP-659

Boost the priority of re-replicating blocks that are far from their replication target


Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels: None

    Description

      I see two types of replication that should be accelerated relative to all others:
      1. Blocks that have only one remaining copy (but are required to have higher replication).
      2. Blocks that have fewer than 1/3 of their required replicas in place.
      The latter occurs when map/reduce sets the replication of certain files to 10, and we want
      that to happen quickly so the tasks perform better.

      So I think we should distinguish two major groups of under-replicated blocks:
      first-priority (having only 1 copy or fewer than 1/3 of the required replicas), and the rest.
      The name-node places first-priority blocks at the beginning of the neededReplication
      list and the rest at the end. That way the first-priority blocks are replicated
      first, followed by the others.
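
      The grouping proposed above can be sketched roughly as follows. This is only an
      illustration of the rule as stated in the description, not the code in the attached
      patch: the class name, the method names, and the use of a Deque as a stand-in for the
      neededReplication list are all assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch; none of these names come from the Hadoop source or the patch. */
class ReplicationPrioritySketch {

    /** True if a block falls into the first-priority group from the description. */
    static boolean isFirstPriority(int liveReplicas, int targetReplicas) {
        // Case 1: only one copy remains, but more than one is required.
        boolean lastCopy = liveReplicas == 1 && targetReplicas > 1;
        // Case 2: fewer than 1/3 of the required replicas exist.
        // Integer form of liveReplicas / targetReplicas < 1/3.
        // Note: a block with zero live replicas also matches, although it has
        // no source to copy from; this sketch does not treat that case specially.
        boolean farFromTarget = liveReplicas * 3 < targetReplicas;
        return lastCopy || farFromTarget;
    }

    /** Stand-in for the neededReplication list: urgent blocks go to the front. */
    static void enqueue(Deque<String> needed, String blockId,
                        int liveReplicas, int targetReplicas) {
        if (liveReplicas >= targetReplicas) {
            return; // not under-replicated, nothing to schedule
        }
        if (isFirstPriority(liveReplicas, targetReplicas)) {
            needed.addFirst(blockId);
        } else {
            needed.addLast(blockId);
        }
    }

    public static void main(String[] args) {
        Deque<String> needed = new ArrayDeque<>();
        enqueue(needed, "b1", 2, 3);   // ordinary under-replication, goes to the end
        enqueue(needed, "b2", 1, 3);   // last remaining copy, goes to the front
        enqueue(needed, "b3", 4, 10);  // 4/10 is not below 1/3, goes to the end
        enqueue(needed, "b4", 3, 10);  // 3/10 is below 1/3, goes to the front
        System.out.println(needed);
    }
}
```

      One consequence of inserting at the head of a single list is that first-priority blocks
      are served newest-first among themselves. Keeping one queue per priority level would
      preserve arrival order within each group; which approach the patch takes is not stated here.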

      Attachments

        1. priBlockRep2.patch
          17 kB
          Hairong Kuang
        2. remove.patch
          0.5 kB
          Hairong Kuang

          People

            Assignee: Hairong Kuang (hairong)
            Reporter: Konstantin Shvachko (shv)