Hadoop Common / HADOOP-659

Boost the priority of re-replicating blocks that are far from their replication target


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.10.1
    • Fix Version/s: 0.11.0
    • Component/s: None
    • Labels:
      None

      Description

      I see two types of replication that should be accelerated relative to all others:
      1. Blocks that have only one remaining copy (but are required to have higher replication).
      2. Blocks that have fewer than 1/3 of their replicas in place.
      The latter occurs when map/reduce sets the replication of certain files to 10, and we want
      it to happen fast to achieve better performance on the tasks.

      So I think we should distinguish two major groups of under-replicated blocks:
      first-priority (having only 1 copy or fewer than 1/3 of the required replicas), and the rest.
      The name-node places first-priority blocks at the beginning of the neededReplication
      list, and the rest at the end. That way the first-priority blocks are replicated
      first, and then the others.
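
      The two-group scheme described above could be sketched roughly as follows. This is an
      illustration only, not the code in priBlockRep2.patch: the class name, method names, and
      the use of a plain deque of block ids are all assumptions, as is the choice to skip blocks
      that have no live replica to copy from.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a two-level neededReplication list (not the actual patch). */
class ReplicationQueueSketch {
    // Front of the deque is replicated first.
    private final Deque<String> neededReplication = new ArrayDeque<>();

    /**
     * First priority: only one copy remains while more are required,
     * or fewer than 1/3 of the required replicas are in place.
     */
    static boolean isFirstPriority(int current, int expected) {
        if (current <= 0 || current >= expected) {
            return false; // assumption: nothing to copy from, or not under-replicated
        }
        boolean singleCopy = current == 1;          // expected > 1 is implied here
        boolean belowThird = current * 3 < expected; // integer form of current < expected/3
        return singleCopy || belowThird;
    }

    /** Queues an under-replicated block; other blocks are ignored. */
    void add(String blockId, int current, int expected) {
        if (current <= 0 || current >= expected) {
            return;
        }
        if (isFirstPriority(current, expected)) {
            neededReplication.addFirst(blockId); // beginning of the list
        } else {
            neededReplication.addLast(blockId);  // end of the list
        }
    }

    /** Returns the next block to replicate, or null if the list is empty. */
    String next() {
        return neededReplication.pollFirst();
    }

    public static void main(String[] args) {
        ReplicationQueueSketch q = new ReplicationQueueSketch();
        q.add("a", 2, 3);   // 2 of 3 replicas: the rest
        q.add("b", 1, 3);   // single copy: first priority
        q.add("c", 4, 10);  // 4 of 10 replicas: the rest
        q.add("d", 3, 10);  // fewer than 1/3 of 10: first priority
        for (String id = q.next(); id != null; id = q.next()) {
            System.out.println(id);
        }
    }
}
```

      Inserting at the head means that, in this sketch, the most recently added first-priority
      block is served before older first-priority blocks; keeping two separate FIFO lists would
      be one way to preserve arrival order within each group.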

        Attachments

        1. priBlockRep2.patch
          17 kB
          Hairong Kuang
        2. remove.patch
          0.5 kB
          Hairong Kuang


            People

            • Assignee: Hairong Kuang (hairong)
            • Reporter: Konstantin Shvachko (shv)

