  1. Hadoop HDFS
  2. HDFS-14699

Erasure Coding: Storage not considered in live replica when replication streams hard limit reached to threshold



    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 3.2.0, 3.1.1, 3.3.0
    • Fix Version/s: 3.3.0, 3.1.4, 3.2.2
    • Component/s: ec


      We tried the EC function on an 80-node cluster running Hadoop 3.1.1 and hit the same scenario described in https://issues.apache.org/jira/browse/HDFS-8881. Our testing steps follow; we hope they are helpful. (The DNs mentioned below hold the internal blocks under test.)

      1. We defined a custom 10-2-1024k policy and applied it to a path. At this point there are 12 internal blocks (12 live blocks).
      2. Decommission one DN. After the decommission completes, there are 13 internal blocks (12 live blocks and 1 decommissioned block).
      3. Shut down one DN that does not hold the same block ID as the decommissioned block. Now there are 12 internal blocks (11 live blocks and 1 decommissioned block).
      4. After waiting about 600s (before the heartbeat arrives), recommission the decommissioned DN. Now there are 12 internal blocks (11 live blocks and 1 duplicate block).
      5. EC then does not reconstruct the missing block.
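
      The block counts in the steps above can be tallied with a small model. This is only a sketch, not HDFS code: the `summarize` helper, the state names, and the choice of block indices 0 and 5 are hypothetical and chosen for illustration. It shows why 12 storages after step 4 still leave one block index uncovered.

```python
# Hypothetical model of the reproduction steps; this is not HDFS code.
DATA_UNITS, PARITY_UNITS = 10, 2          # taken from the 10-2-1024k policy
TOTAL_UNITS = DATA_UNITS + PARITY_UNITS   # 12 internal blocks per block group


def summarize(storages):
    """storages: list of (block_index, state) pairs, one per DN storage."""
    live = [idx for idx, state in storages if state == "live"]
    distinct_live = set(live)
    missing = sorted(set(range(TOTAL_UNITS)) - distinct_live)
    return {
        "internal_blocks": len(storages),
        "live_storages": len(live),
        "distinct_live_indices": len(distinct_live),
        "missing_indices": missing,
    }


# Step 1: one live storage per block index.
step1 = [(i, "live") for i in range(TOTAL_UNITS)]

# Step 2: the DN holding index 0 (assumed) is decommissioned and its
# block is copied to another DN, so index 0 has two storages.
step2 = [(0, "decommissioned")] + step1

# Step 3: the DN holding index 5 (assumed, a different index) is shut down.
step3 = [s for s in step2 if s != (5, "live")]

# Step 4: the decommissioned DN is recommissioned, so index 0 now has
# two live copies while index 5 has none.
step4 = [(idx, "live") for idx, _ in step3]

for name, step in [("step1", step1), ("step2", step2),
                   ("step3", step3), ("step4", step4)]:
    print(name, summarize(step))
```

      In this model, step 4 has 12 live storages but only 11 distinct block indices, so a check that counts storages rather than distinct indices would see nothing to reconstruct.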

      We think this is a critical issue for using the EC function in a production environment. Could you help? Thanks a lot!


        Attachments

        1. HDFS-14699.00.patch
          2 kB
          Zhao Yi Ming
        2. HDFS-14699.01.patch
          7 kB
          Zhao Yi Ming
        3. HDFS-14699.02.patch
          7 kB
          Zhao Yi Ming
        4. HDFS-14699.03.patch
          7 kB
          Zhao Yi Ming
        5. HDFS-14699.04.patch
          7 kB
          Zhao Yi Ming
        6. HDFS-14699.05.patch
          7 kB
          Zhao Yi Ming
        7. image-2019-08-20-19-58-51-872.png
          136 kB
          Zhao Yi Ming
        8. image-2019-09-02-17-51-46-742.png
          74 kB
          Zhao Yi Ming

              Assignee: zhaoyim Zhao Yi Ming
              Reporter: zhaoyim Zhao Yi Ming