  1. Hadoop HDFS
  2. HDFS-12043

Add counters for block re-replication

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.9.0, 3.0.0-beta1
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We occasionally see that the under-replicated block count does not go down quickly enough. We've made at least one fix to speed up block replication (HDFS-9205), but we need better insight into the current state and activity of the block re-replication logic. For example, we need to understand whether it is because re-replication is not making forward progress at all, or because new under-replicated blocks are being added faster than they are repaired.

      We should include additional metrics:

      1. Cumulative number of blocks that were successfully replicated.
      2. Cumulative number of re-replications that timed out.
      3. Cumulative number of blocks that were dequeued for re-replication but not scheduled, e.g. because they were invalid or under construction, or because replication was postponed.

      The growth rates of the above metrics will make it clear whether block replication is making forward progress and, if not, provide potential clues about why it is stalled.
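
      As a rough illustration of how the three cumulative counters could be read together, here is a minimal Java sketch. It is not the code in the attached patches: the class name ReReplicationCounters, its methods, and the diagnosis strings are all hypothetical, and the real change would wire such counters into the NameNode's existing metrics rather than a standalone class. The sketch only shows the idea of comparing two snapshots and looking at which counter grew.

```java
import java.util.concurrent.atomic.LongAdder;

/**
 * Hypothetical sketch of the three cumulative counters proposed above.
 * Names are illustrative only and are NOT taken from the HDFS-12043 patch.
 */
final class ReReplicationCounters {
    private final LongAdder successful = new LongAdder();
    private final LongAdder timedOut = new LongAdder();
    private final LongAdder dequeuedNotScheduled = new LongAdder();

    // Called when a block has been successfully re-replicated.
    void onReplicationSucceeded() { successful.increment(); }

    // Called when a pending re-replication times out.
    void onReplicationTimedOut() { timedOut.increment(); }

    // Called when a block is dequeued but not scheduled
    // (invalid, under construction, or replication postponed).
    void onDequeuedButNotScheduled() { dequeuedNotScheduled.increment(); }

    // Returns {successful, timedOut, dequeuedNotScheduled} at this instant.
    long[] snapshot() {
        return new long[] {
            successful.sum(), timedOut.sum(), dequeuedNotScheduled.sum()
        };
    }

    // Compares two snapshots and reports which counter grew in between.
    // The ordering of the checks is an assumption made for this sketch.
    static String diagnose(long[] earlier, long[] later) {
        long succeeded = later[0] - earlier[0];
        long timeouts = later[1] - earlier[1];
        long skipped = later[2] - earlier[2];
        if (succeeded > 0) {
            // Work is completing; a backlog that still grows suggests new
            // under-replicated blocks are arriving faster than repairs.
            return "progressing";
        }
        if (timeouts > 0) {
            return "stalled: timeouts";
        }
        if (skipped > 0) {
            return "stalled: not scheduled";
        }
        return "idle";
    }
}
```

      An operator-side check would take a snapshot, wait for an interval, take another, and pass both to diagnose. A result of "progressing" alongside a flat or rising under-replicated count would point to new blocks being added faster, while the two "stalled" results would point at timeouts or at blocks being dequeued without being scheduled.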

      Thanks Arpit Agarwal for the offline discussions.

        Attachments

        1. HDFS-12043.001.patch
          6 kB
          Chen Liang
        2. HDFS-12043.002.patch
          12 kB
          Chen Liang
        3. HDFS-12043.003.patch
          12 kB
          Chen Liang
        4. HDFS-12043.004.patch
          13 kB
          Chen Liang
        5. HDFS-12043-branch-2.005.patch
          11 kB
          Chen Liang

              People

              • Assignee:
                vagarychen Chen Liang
                Reporter:
                vagarychen Chen Liang