HDFS-16182: numOfReplicas is given the wrong value in BlockPlacementPolicyDefault$chooseTarget, which can cause DataStreamer to fail with Heterogeneous Storage

    Description

      In our HDFS cluster, we use heterogeneous storage to keep data on SSD for better performance. Sometimes, while the HDFS client is transferring data in a pipeline, it throws an IOException and exits. The exception log is below:

      ```
      java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[dn01_ip:5004,DS-ef7882e0-427d-4c1e-b9ba-a929fac44fb4,DISK], DatanodeInfoWithStorage[dn02_ip:5004,DS-3871282a-ad45-4332-866a-f000f9361ecb,DISK], DatanodeInfoWithStorage[dn03_ip:5004,DS-a388c067-76a4-4014-a16c-ccc49c8da77b,SSD], DatanodeInfoWithStorage[dn04_ip:5004,DS-b81da262-0dd9-4567-a498-c516fab84fe0,SSD], DatanodeInfoWithStorage[dn05_ip:5004,DS-34e3af2e-da80-46ac-938c-6a3218a646b9,SSD]], original=[DatanodeInfoWithStorage[dn01_ip:5004,DS-ef7882e0-427d-4c1e-b9ba-a929fac44fb4,DISK], DatanodeInfoWithStorage[dn02_ip:5004,DS-3871282a-ad45-4332-866a-f000f9361ecb,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
      ```
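
      The message points at the client-side replace-datanode-on-failure settings. For completeness, the sketch below shows the related client configuration keys (the helper class name is made up for illustration; tuning these keys only works around the symptom and is not the fix proposed in this issue):

      ```
      // Sketch of the client-side knobs referenced by the exception message.
      // Changing them is only a workaround; the root cause is on the namenode side.
      import org.apache.hadoop.conf.Configuration;

      public class ReplaceDatanodeOnFailureConf { // hypothetical helper class
        public static Configuration clientConf() {
          Configuration conf = new Configuration();
          // Replacement policy when a pipeline datanode fails: NEVER, DEFAULT or ALWAYS.
          conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "DEFAULT");
          // If true, keep writing with the remaining datanodes when no
          // replacement can be found, instead of failing the stream.
          conf.setBoolean(
              "dfs.client.block.write.replace-datanode-on-failure.best-effort", true);
          return conf;
        }
      }
      ```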

      After looking into it, I found that when the existing pipeline needs to replace a bad datanode in order to keep transferring data, the client asks the namenode for one additional datanode and then checks that the number of datanodes equals the original number + 1.

      ```
      // DataStreamer$findNewDatanode
      if (nodes.length != original.length + 1) {
        throw new IOException(
            "Failed to replace a bad datanode on the existing pipeline "
            + "due to no more good datanodes being available to try. "
            + "(Nodes: current=" + Arrays.asList(nodes)
            + ", original=" + Arrays.asList(original) + "). "
            + "The current failed datanode replacement policy is "
            + dfsClient.dtpReplaceDatanodeOnFailure
            + ", and a client may configure this via '"
            + BlockWrite.ReplaceDatanodeOnFailure.POLICY_KEY
            + "' in its configuration.");
      }
      ```
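
      For context, the caller side looks roughly like the sketch below (paraphrased and abbreviated from DataStreamer$addDatanode2ExistingPipeline; argument passing is simplified and the block-transfer and retry handling are omitted). The client explicitly asks the namenode for a single additional datanode, which is why findNewDatanode insists on exactly original.length + 1 nodes:

      ```
      // Paraphrased sketch of DataStreamer$addDatanode2ExistingPipeline
      // (abbreviated; block transfer and retry logic omitted).
      final DatanodeInfo[] original = nodes;

      // Ask the namenode for exactly ONE additional datanode for this pipeline
      // (arguments simplified in this sketch).
      final LocatedBlock lb = dfsClient.namenode.getAdditionalDatanode(
          src, stat.getFileId(), block, nodes, storageIDs,
          exclude.toArray(new DatanodeInfo[exclude.size()]),
          1 /* numAdditionalNodes */, dfsClient.clientName);
      setPipeline(lb);

      // The updated pipeline must contain exactly one more node than before;
      // otherwise findNewDatanode throws the IOException shown above.
      final int d = findNewDatanode(original);
      ```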

      The root cause is that NameNode$getAdditionalDatanode returns multiple datanodes, not just the one that DataStreamer.addDatanode2ExistingPipeline asked for.

      Maybe we can fix it in BlockPlacementPolicyDefault$chooseTarget. I think numOfReplicas should not be overwritten with requiredStorageTypes.size().
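
      For reference, the assignment in question looks roughly like the sketch below (paraphrased and abbreviated from BlockPlacementPolicyDefault$chooseTarget, not an exact copy). When the client asks for one additional datanode but the storage types of the existing replicas do not match the policy, requiredStorageTypes can contain more than one entry, so numOfReplicas is silently bumped above 1:

      ```
      // Paraphrased sketch of the relevant part of
      // BlockPlacementPolicyDefault$chooseTarget (abbreviated).
      final List<StorageType> requiredStorageTypes = storagePolicy
          .chooseStorageTypes((short) totalReplicasExpected,
              DatanodeStorageInfo.toStorageTypes(results),
              unavailableStorages, newBlock);

      // numOfReplicas (1 when the client requests a single additional datanode)
      // is overwritten with requiredStorageTypes.size(). With heterogeneous
      // storage this can be greater than 1, e.g. when the existing DISK
      // replicas do not satisfy an SSD storage policy, so the namenode hands
      // back several datanodes instead of one.
      if ((numOfReplicas = requiredStorageTypes.size()) == 0) {
        throw new NotEnoughReplicasException(
            "All required storage types are unavailable: "
            + " unavailableStorages=" + unavailableStorages
            + ", storagePolicy=" + storagePolicy);
      }
      ```

      Keeping the caller's numOfReplicas here, and using requiredStorageTypes only for the emptiness check and the per-type bookkeeping, would let getAdditionalDatanode return exactly the one datanode the client asked for.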

      Attachments

        HDFS-16182.patch (6 kB, Max Xie)


            People

              Assignee: Max Xie (max2049)
              Reporter: Max Xie (max2049)
              Votes: 0
              Watchers: 3


                Time Tracking

                  Original Estimate: Not Specified
                  Remaining Estimate: 0h
                  Time Spent: 3.5h