  1. Hadoop Common
  2. HADOOP-4071

FSNameSystem.isReplicationInProgress should add an underReplicated block to the neededReplication queue using method "add" not "update"

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.19.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      We have a datanode whose decommissioning did not complete for days. It turned out that an under-replicated block was never placed in the neededReplication queue and therefore was never replicated. The following debug line shows the problem:

      " DEBUG org.apache.hadoop.dfs.StateChange: UnderReplicationBlocks.update blk_-7437651423871278837_0 curReplicas 8
      curExpectedReplicas 10 oldReplicas 9 oldExpectedReplicas 10 curPri 2 oldPri 2"

      The block was not in the neededReplication queue. The update method determined that the block was under-replicated, but because its priority level had not changed (curPri and oldPri are both 2), it did not add the block to the neededReplication queue.

      The solution is for the name node to use the add method instead of the update method to put the block in the neededReplication queue. The add method guarantees success if the block is indeed under-replicated.
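
      The difference between the two methods can be illustrated with a minimal sketch. This is a simplified, hypothetical model, not the Hadoop source: the class name UnderReplicatedQueueSketch, the string block IDs, the method signatures, and the priority rule are all assumptions chosen so that 8 of 10 and 9 of 10 replicas both map to priority 2, as in the debug line above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model of a prioritized under-replication queue.
// It is NOT the Hadoop implementation; names and rules are assumptions.
class UnderReplicatedQueueSketch {
    // Number of priority levels; also used to mean "not under-replicated".
    static final int LEVEL = 3;
    private final List<Set<String>> queues = new ArrayList<>();

    UnderReplicatedQueueSketch() {
        for (int i = 0; i < LEVEL; i++) {
            queues.add(new HashSet<>());
        }
    }

    // Assumed priority rule: a lower number is more urgent.
    static int priority(int replicas, int expected) {
        if (replicas < 0 || replicas >= expected) return LEVEL;
        if (replicas == 0) return 0;
        if (replicas * 3 < expected) return 1;
        return 2;
    }

    boolean contains(String block) {
        for (Set<String> q : queues) {
            if (q.contains(block)) return true;
        }
        return false;
    }

    // add: enqueues the block whenever it is under-replicated.
    boolean add(String block, int replicas, int expected) {
        int pri = priority(replicas, expected);
        return pri != LEVEL && queues.get(pri).add(block);
    }

    // update: only moves the block when its priority level has changed,
    // so a block that was never queued stays unqueued if oldPri == curPri.
    void update(String block, int curReplicas, int curExpected,
                int replicasDelta, int expectedDelta) {
        int oldPri = priority(curReplicas - replicasDelta,
                              curExpected - expectedDelta);
        int curPri = priority(curReplicas, curExpected);
        if (oldPri != LEVEL && oldPri != curPri) {
            queues.get(oldPri).remove(block);
        }
        if (curPri != LEVEL && oldPri != curPri) {
            queues.get(curPri).add(block);
        }
    }

    public static void main(String[] args) {
        UnderReplicatedQueueSketch q = new UnderReplicatedQueueSketch();
        // Replicas dropped from 9 to 8 of 10; priority is 2 before and after.
        q.update("blk_example", 8, 10, -1, 0);
        System.out.println("after update: " + q.contains("blk_example"));
        q.add("blk_example", 8, 10);
        System.out.println("after add: " + q.contains("blk_example"));
    }
}
```

      In this model, update leaves the block out of the queue because the old and current priorities are equal, while add enqueues it unconditionally once it is found to be under-replicated.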

        Attachments

        1. decommission.patch
          1 kB
          Hairong Kuang

            People

            • Assignee:
              Hairong Kuang
            • Reporter:
              Hairong Kuang
            • Votes:
              0
            • Watchers:
              2