Hadoop HDFS / HDFS-1172

Blocks in newly completed files are considered under-replicated too quickly

    Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 0.21.0
    • Fix Version/s: None
    • Component/s: namenode
    • Labels:
      None

      Description

      I've seen this for a long time and imagine it's a known issue, but I couldn't find an existing JIRA. The NN often schedules replication of the last block of a file very soon after the file is completed, before the other DNs in the pipeline have had a chance to report the new block. This creates a lot of extra replication work on the cluster: the block is re-replicated, and the resulting excess replicas are then deleted almost immediately.
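
      The race described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not NameNode code and not the approach taken by any of the attached patches: the class `ReplicationRaceSketch`, the type `BlockState`, and all method names are hypothetical. It contrasts a check that counts only replicas already reported with a more lenient check that also counts pipeline DNs whose report is still in flight.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Toy model of the race; none of these names come from the HDFS code base. */
class ReplicationRaceSketch {

    /** Hypothetical per-block state, as a NameNode-like tracker might see it. */
    static final class BlockState {
        final int expectedReplicas;                              // target replication factor
        final Set<String> reported = new HashSet<>();            // DNs that have reported the block
        final Set<String> pipelineUnreported = new HashSet<>();  // pipeline DNs yet to report

        BlockState(int expectedReplicas, Collection<String> pipeline) {
            this.expectedReplicas = expectedReplicas;
            this.pipelineUnreported.addAll(pipeline);
        }

        /** A DataNode reports that it holds the finalized block. */
        void blockReceived(String dn) {
            reported.add(dn);
            pipelineUnreported.remove(dn);
        }

        /** Naive check: counts only replicas that have already been reported. */
        boolean naiveNeedsReplication() {
            return reported.size() < expectedReplicas;
        }

        /** Lenient check: also counts pipeline replicas whose report is still in flight. */
        boolean lenientNeedsReplication() {
            return reported.size() + pipelineUnreported.size() < expectedReplicas;
        }
    }

    public static void main(String[] args) {
        // Last block of a file written through a three-DN pipeline.
        BlockState last = new BlockState(3, Arrays.asList("dn1", "dn2", "dn3"));
        // The file is completed after only the first DN's report has arrived.
        last.blockReceived("dn1");
        System.out.println("naive:   " + last.naiveNeedsReplication());
        System.out.println("lenient: " + last.lenientNeedsReplication());
    }
}
```

      In this model the naive check flags the block as under-replicated while two reports are still outstanding, whereas the lenient check does not. A real implementation would also need a timeout, so that a pipeline DN that never reports is eventually treated as a missing replica.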

      1. HDFS-1172.patch
        3 kB
        Boris Shkolnik
      2. replicateBlocksFUC.patch
        4 kB
        Hairong Kuang
      3. replicateBlocksFUC1.patch
        8 kB
        Hairong Kuang
      4. replicateBlocksFUC1.patch
        8 kB
        Hairong Kuang
      5. hdfs-1172.txt
        20 kB
        Todd Lipcon
      6. hdfs-1172.txt
        21 kB
        Eli Collins


          People

          • Assignee:
            Unassigned
            Reporter:
            Todd Lipcon
          • Votes:
            1
            Watchers:
            27
