Hadoop Common
  1. Hadoop Common
  2. HADOOP-4702

Failed block replication leaves an incomplete block in receiver's tmp data directory

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.17.2
    • Fix Version/s: 0.18.3
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      When a failure occurs while replicating a block from a source DataNode to a target DataNode, the target node keeps an incomplete on-disk copy of the block in its temp data directory and an in-memory copy of the block in ongoingCreates queue. This causes two problems:
      1. Since this block is not (should not) be finalized, NameNode is not aware of the existence of this incomplete block. It may schedule replicating the same block to this node again, which will fail with a message: "Block XX has already been started (though not completed), and thus cannot be created."
      2. Restarting the datanode moves the blocks under the temp data directory to be valid blocks, thus introduces corrupted blocks into HDFS. Sometimes those corrupted blocks stay in the system undetected if it happens that the partial block and its checksums match.

      A failed block replication should clean up both the in-memory & on-disk copies of the incomplete block.

      1. tmpBlockRemoval2.patch
        8 kB
        Hairong Kuang
      2. tmpBlockRemoval1.patch
        7 kB
        Hairong Kuang
      3. tmpBlockRemoval.patch
        7 kB
        Hairong Kuang

        Issue Links

          Activity

          Hide
          Tsz Wo Nicholas Sze added a comment -

          Block replication and block creation should be different: block creation allows partial block but block replication should be atomic, either replicate the entire block or do nothing.

          Show
          Tsz Wo Nicholas Sze added a comment - Block replication and block creation should be different: block creation allows partial block but block replication should be atomic, either replicate the entire block or do nothing.
          Hide
          Hairong Kuang added a comment -

          This patch removes all traces (in-memory & on disk) of a temporary block if block replication fails.

          Show
          Hairong Kuang added a comment - This patch removes all traces (in-memory & on disk) of a temporary block if block replication fails.
          Hide
          Konstantin Shvachko added a comment -
          1. file.delete() in unfinalizeBlock() should check the return value and log if unsuccessful.
          2. It would be good to designate a method for block and meta files removal. It could be reused in invalidate() and unfinalizeBlock(), may ne other places too.
          3. Reverse the condition for checking whether the transfer is replication-related in BlockReceiver.removeBlock().

          I verified that temporary block is removed when failed transfer is initiated by block replication or replacement (the balancer).
          And is not removed if this a client write, which is what the patch is intended to do.

          Show
          Konstantin Shvachko added a comment - file.delete() in unfinalizeBlock() should check the return value and log if unsuccessful. It would be good to designate a method for block and meta files removal. It could be reused in invalidate() and unfinalizeBlock() , may ne other places too. Reverse the condition for checking whether the transfer is replication-related in BlockReceiver.removeBlock() . I verified that temporary block is removed when failed transfer is initiated by block replication or replacement (the balancer). And is not removed if this a client write, which is what the patch is intended to do.
          Hide
          Hairong Kuang added a comment -

          This patch incorporates all Konstantin's review comments.

          Show
          Hairong Kuang added a comment - This patch incorporates all Konstantin's review comments.
          Hide
          Hairong Kuang added a comment -

          This patch makes two more changes:
          1. FSDataSet.unfinalize returns if the block is not in ongoingCreates list;
          2. BlockReceiver initializes fields srcDataNode & datanode before intializing checksum. This is to avoid NullPointerException in removeBlock in case an exception is thrown when intializing field checksum.

          Show
          Hairong Kuang added a comment - This patch makes two more changes: 1. FSDataSet.unfinalize returns if the block is not in ongoingCreates list; 2. BlockReceiver initializes fields srcDataNode & datanode before intializing checksum. This is to avoid NullPointerException in removeBlock in case an exception is thrown when intializing field checksum.
          Hide
          Konstantin Shvachko added a comment -

          +1
          Let us create a separate issue for reusing delBlockFromDisk() where it should/can be used. This patch goes to 0.18 so we want to minimize changes.

          Show
          Konstantin Shvachko added a comment - +1 Let us create a separate issue for reusing delBlockFromDisk() where it should/can be used. This patch goes to 0.18 so we want to minimize changes.
          Hide
          Hairong Kuang added a comment -

          Ant test-core was successful:
          BUILD SUCCESSFUL
          Total time: 127 minutes 28 seconds

          Ant test-patch result:
          [exec] +1 overall.

          [exec] +1 @author. The patch does not contain any @author tags.

          [exec] +1 tests included. The patch appears to include 5 new or modified tests.

          [exec] +1 javadoc. The javadoc tool did not generate any warning messages.

          [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.

          [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.

          [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.

          Show
          Hairong Kuang added a comment - Ant test-core was successful: BUILD SUCCESSFUL Total time: 127 minutes 28 seconds Ant test-patch result: [exec] +1 overall. [exec] +1 @author. The patch does not contain any @author tags. [exec] +1 tests included. The patch appears to include 5 new or modified tests. [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] +1 Eclipse classpath. The patch retains Eclipse classpath integrity.
          Hide
          Hairong Kuang added a comment -

          > Let us create a separate issue for reusing delBlockFromDisk() where it should/can be used..
          I created HADOOP-4812.

          Show
          Hairong Kuang added a comment - > Let us create a separate issue for reusing delBlockFromDisk() where it should/can be used.. I created HADOOP-4812 .
          Hide
          Hairong Kuang added a comment -

          I've just committed this.

          Show
          Hairong Kuang added a comment - I've just committed this.
          Hide
          Hudson added a comment -

          Integrated in Hadoop-trunk #684 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/684/)
          . Failed block replication leaves an incomplete block in receiver's tmp data directory. Contributed by Hairong Kuang.

          Show
          Hudson added a comment - Integrated in Hadoop-trunk #684 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/684/ ) . Failed block replication leaves an incomplete block in receiver's tmp data directory. Contributed by Hairong Kuang.

            People

            • Assignee:
              Hairong Kuang
              Reporter:
              Hairong Kuang
            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development