Uploaded image for project: 'Hadoop HDFS'
  1. Hadoop HDFS
  2. HDFS-14053

Provide ability for NN to re-replicate based on topology changes.

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 3.2.0
    • Component/s: None
    • Labels:
      None
    • Environment:

      VMWare w/ local storage

    • Hadoop Flags:
      Reviewed
    • Release Note:
      A new option (-replicate) is added to fsck command to re-trigger the replication for mis-replicated blocks. This option should be used instead of previous workaround of increasing and then decreasing replication factor (using hadoop fs -setrep command).

      Description

      Enabling HVE/node awareness does not rebalance replicas on data that existed prior to topology changes.

      [root@vmw-d10-001 jenkins]# more /opt/cloudera/topology.data
      10.20.xxx.161 /rack1/nodegroup1
      10.20.xxx.162 /rack1/nodegroup1
      10.20.xxx.163 /rack3/nodegroup1
      10.20.xxx.164 /rack3/nodegroup1
      172.17.xxx.71 /rack2/nodegroup1
      172.17.xxx.72 /rack2/nodegroup1

      before HVE:
      /user/impalauser/tpcds/store_sales <dir>
      /user/impalauser/tpcds/store_sales/store_sales.dat 1180463121 bytes, 9 block(s): OK
      0. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742xxx_1382 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 10.20.xxx.163:20002]
      1. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742213_1389 len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002, 10.20.xxx.161:20002]
      2. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742214_1390 len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002, 10.20.xxx.163:20002]
      3. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742215_1391 len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002, 10.20.xxx.163:20002]
      4. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742216_1392 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]
      5. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742217_1393 len=134217728 repl=3 [10.20.xxx.164:20002, 172.17.xxx.72:20002, 10.20.xxx.163:20002]
      6. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742220_1396 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002, 10.20.xxx.163:20002]
      7. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742222_1398 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002, 10.20.xxx.161:20002]
      8. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742224_1400 len=106721297 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002, 172.17.xxx.72:20002]
      ---------

      Before enabling HVE:
      Status: HEALTHY
      Total size: 1648156454 B (Total open files size: 498 B)
      Total dirs: 138
      Total files: 384
      Total symlinks: 0 (Files currently being written: 6)
      Total blocks (validated): 390 (avg. block size 4226042 B) (Total open file blocks (not validated): 6)
      Minimally replicated blocks: 390 (100.0 %)
      Over-replicated blocks: 0 (0.0 %)
      Under-replicated blocks: 1 (0.25641027 %)
      Mis-replicated blocks: 0 (0.0 %)
      Default replication factor: 3
      Average block replication: 2.8564103
      Corrupt blocks: 0
      Missing replicas: 5 (0.44682753 %)
      Number of data-nodes: 5
      Number of racks: 1
      FSCK ended at Wed Dec 10 14:04:35 EST 2014 in 50 milliseconds

      The filesystem under path '/' is HEALTHY

      ------
      after HVE (and NN restart):

      /user/impalauser/tpcds/store_sales <dir>
      /user/impalauser/tpcds/store_sales/store_sales.dat 1180463121 bytes, 9 block(s): OK
      0. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742xxx_1382 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002, 10.20.xxx.161:20002]
      1. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742213_1389 len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.161:20002]
      2. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742214_1390 len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.163:20002]
      3. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742215_1391 len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.163:20002]
      4. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742216_1392 len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.161:20002]
      5. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742217_1393 len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.163:20002]
      6. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742220_1396 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002, 10.20.xxx.162:20002]
      7. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742222_1398 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.163:20002, 10.20.xxx.161:20002]
      8. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073742224_1400 len=106721297 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.162:20002]

      Status: HEALTHY
      Total size: 1659427036 B (Total open files size: 498 B)
      Total dirs: 176
      Total files: 529
      Total symlinks: 0 (Files currently being written: 6)
      Total blocks (validated): 532 (avg. block size 3119223 B) (Total open file blocks (not validated): 6)
      Minimally replicated blocks: 532 (100.0 %)
      Over-replicated blocks: 0 (0.0 %)
      Under-replicated blocks: 1 (0.18796992 %)
      Mis-replicated blocks: 0 (0.0 %)
      Default replication factor: 3
      Average block replication: 2.8383458
      Corrupt blocks: 0
      Missing replicas: 7 (0.46143705 %)
      Number of data-nodes: 5
      Number of racks: 3
      FSCK ended at Wed Dec 10 14:29:23 EST 2014 in 115 milliseconds
      The filesystem under path '/' is HEALTHY

      ------------

      store sales pushed to hdfs after HVE was configured:

      /user/impalauser/tpcds_after_hve <dir>
      /user/impalauser/tpcds_after_hve/store_sales.dat 1180463121 bytes, 9 block(s): OK
      0. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743406_2582 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]
      1. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743412_2588 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002, 172.17.xxx.72:20002]
      2. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743415_2591 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]
      3. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743416_2592 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.162:20002, 172.17.xxx.72:20002]
      4. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743417_2593 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]
      5. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743418_2594 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]
      6. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743419_2595 len=134217728 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]
      7. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743422_2598 len=134217728 repl=3 [172.17.xxx.72:20002, 10.20.xxx.164:20002, 10.20.xxx.162:20002]
      8. BP-1184748135-172.17.xxx.71-1418235396548:blk_1073743423_2599 len=106721297 repl=3 [10.20.xxx.164:20002, 10.20.xxx.161:20002, 172.17.xxx.72:20002]

        Attachments

        1. HADOOP-11391-001.patch
          13 kB
          Hrishikesh Gadre
        2. HADOOP-11391-002.patch
          13 kB
          Hrishikesh Gadre
        3. HADOOP-11391-003.patch
          14 kB
          Hrishikesh Gadre
        4. HADOOP-11391-004.patch
          14 kB
          Hrishikesh Gadre
        5. HADOOP-11391-005.patch
          15 kB
          Hrishikesh Gadre
        6. HADOOP-11391-006.patch
          15 kB
          Hrishikesh Gadre

          Activity

            People

            • Assignee:
              hgadre Hrishikesh Gadre
              Reporter:
              ejohansen ellen johansen
            • Votes:
              0 Vote for this issue
              Watchers:
              8 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: