Hadoop Common
HADOOP-3035

Data nodes should inform the name-node about block CRC errors.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Incompatible change
    • Release Note:
      Changed protocol for transferring blocks between data nodes to report corrupt blocks to data node for re-replication from a good replica.
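
      The note above, together with the "Incompatible change" flag, implies the on-the-wire header exchanged between data nodes changed: the receiving node now has to learn the sender's identity from the transfer itself so it can name that replica when reporting corruption. The sketch below is an invented illustration of why such a change breaks wire compatibility; the field names, layout, and version numbers are not the real DataTransferProtocol of this release.

          import java.io.ByteArrayInputStream;
          import java.io.ByteArrayOutputStream;
          import java.io.DataInputStream;
          import java.io.DataOutputStream;
          import java.io.IOException;

          /**
           * Illustration only: a transfer header that gains a "source data-node"
           * field. Old and new nodes cannot parse each other's headers, hence a
           * protocol version bump and the "Incompatible change" flag.
           */
          public class TransferHeaderSketch {

              static final int NEW_VERSION = 12;   // made-up version number

              /** New-style header: block id plus the source data-node's address. */
              static byte[] writeHeader(long blockId, String srcNode) throws IOException {
                  ByteArrayOutputStream buf = new ByteArrayOutputStream();
                  DataOutputStream out = new DataOutputStream(buf);
                  out.writeInt(NEW_VERSION);
                  out.writeLong(blockId);
                  out.writeUTF(srcNode);     // the new field: who sent this replica
                  out.flush();
                  return buf.toByteArray();
              }

              /** Receiver: refuse mismatched versions, otherwise read the new field. */
              static void readHeader(byte[] wire) throws IOException {
                  DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire));
                  int version = in.readInt();
                  if (version != NEW_VERSION) {
                      throw new IOException("Version mismatch: got " + version
                              + ", expected " + NEW_VERSION);
                  }
                  long blockId = in.readLong();
                  String srcNode = in.readUTF();
                  System.out.println("receiving blk_" + blockId + " from " + srcNode
                          + " (the identity a receiver would report on a CRC error)");
              }

              public static void main(String[] args) throws IOException {
                  readHeader(writeHeader(-1962819020391742554L, "127.0.0.1:3730"));
              }
          }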

      Description

      Currently, if a CRC error occurs while a data-node is replicating a block to another node, the receiving node throws an exception and the system simply continues:

          [junit] 2008-03-17 19:46:11,855 INFO  dfs.DataNode (DataNode.java:transferBlocks(811)) - 127.0.0.1:3730 Starting thread to transfer block blk_-1962819020391742554 to 127.0.0.1:3740
          [junit] 2008-03-17 19:46:11,855 INFO  dfs.DataNode (DataNode.java:writeBlock(1067)) - Receiving block blk_-1962819020391742554 src: /127.0.0.1:3791 dest: /127.0.0.1:3740
          [junit] 2008-03-17 19:46:11,855 INFO  dfs.DataNode (DataNode.java:receiveBlock(2504)) - Exception in receiveBlock for block blk_-1962819020391742554 java.io.IOException: Unexpected checksum mismatch while writing blk_-1962819020391742554 from /127.0.0.1
          [junit] 2008-03-17 19:46:11,871 INFO  dfs.DataNode (DataNode.java:run(2626)) - 127.0.0.1:3730:Transmitted block blk_-1962819020391742554 to /127.0.0.1:3740
          [junit] 2008-03-17 19:46:11,871 INFO  dfs.DataNode (DataNode.java:writeBlock(1192)) - writeBlock blk_-1962819020391742554 received exception java.io.IOException: Unexpected checksum mismatch while writing blk_-1962819020391742554 from /127.0.0.1
          [junit] 2008-03-17 19:46:11,871 ERROR dfs.DataNode (DataNode.java:run(979)) - 127.0.0.1:3740:DataXceiver: java.io.IOException: Unexpected checksum mismatch while writing blk_-1962819020391742554 from /127.0.0.1
          [junit]     at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveChunk(DataNode.java:2246)
          [junit]     at org.apache.hadoop.dfs.DataNode$BlockReceiver.receivePacket(DataNode.java:2416)
          [junit]     at org.apache.hadoop.dfs.DataNode$BlockReceiver.receiveBlock(DataNode.java:2474)
          [junit]     at org.apache.hadoop.dfs.DataNode$DataXceiver.writeBlock(DataNode.java:1173)
          [junit]     at org.apache.hadoop.dfs.DataNode$DataXceiver.run(DataNode.java:956)
          [junit]     at java.lang.Thread.run(Thread.java:595)
      

      The data-node should report the error to the name-node so that the corrupted replica can be removed and re-replicated.
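
      Below is a minimal, self-contained sketch of the behaviour being asked for, in plain Java. It is not the attached patch and does not use the real org.apache.hadoop.dfs classes; BadBlockReporter, reportCorruptReplica, and the receiveChunk signature are hypothetical stand-ins for the DatanodeProtocol machinery. What it illustrates is the ordering: on a checksum mismatch the receiving node first tells the name-node which replica (block plus source data-node) is corrupt, and only then fails the transfer as it does today.

          import java.io.IOException;
          import java.nio.charset.StandardCharsets;
          import java.util.zip.CRC32;

          /** Sketch only: the names below are hypothetical, not Hadoop 0.18 APIs. */
          public class ChunkReceiverSketch {

              /** Hypothetical stand-in for the data-node to name-node "bad block" RPC. */
              interface BadBlockReporter {
                  void reportCorruptReplica(long blockId, String srcNode) throws IOException;
              }

              private final BadBlockReporter reporter;

              ChunkReceiverSketch(BadBlockReporter reporter) {
                  this.reporter = reporter;
              }

              /** Verify one chunk received from srcNode; report, then fail, on mismatch. */
              void receiveChunk(long blockId, String srcNode, byte[] data, long expectedCrc)
                      throws IOException {
                  CRC32 crc = new CRC32();
                  crc.update(data, 0, data.length);
                  if (crc.getValue() != expectedCrc) {
                      try {
                          // The new step this issue asks for: tell the name-node which
                          // replica is corrupt so it can be invalidated and re-replicated.
                          reporter.reportCorruptReplica(blockId, srcNode);
                      } catch (IOException e) {
                          System.err.println("Could not report corrupt blk_" + blockId
                                  + " to the name-node: " + e.getMessage());
                      }
                      // Existing behaviour: the transfer itself still fails.
                      throw new IOException("Unexpected checksum mismatch while writing blk_"
                              + blockId + " from " + srcNode);
                  }
                  // Checksum ok: write the chunk to disk and ack upstream (omitted).
              }

              public static void main(String[] args) throws IOException {
                  ChunkReceiverSketch receiver = new ChunkReceiverSketch((blockId, srcNode) ->
                          System.out.println("name-node: mark blk_" + blockId + " on " + srcNode
                                  + " corrupt and schedule re-replication"));

                  byte[] chunk = "some block data".getBytes(StandardCharsets.UTF_8);
                  CRC32 good = new CRC32();
                  good.update(chunk, 0, chunk.length);

                  receiver.receiveChunk(42L, "127.0.0.1:3730", chunk, good.getValue());
                  try {
                      receiver.receiveChunk(42L, "127.0.0.1:3730", chunk, good.getValue() + 1);
                  } catch (IOException expected) {
                      System.out.println("transfer failed as before: " + expected.getMessage());
                  }
              }
          }

      Per the release note, the real fix also changes the inter-data-node transfer protocol so that the receiving node knows which source to name when it files the report; the sketch above simply assumes srcNode is already available.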

      Attachments

      1. HADOOP-3035-1.patch (10 kB, Lohit Vijayarenu)
      2. HADOOP-3035-2.patch (10 kB, Lohit Vijayarenu)
      3. HADOOP-3035-3.patch (12 kB, Lohit Vijayarenu)

        Issue Links

        • relates to HADOOP-3314

          Activity

          Owen O'Malley made changes -
              Component/s: dfs [ 12310710 ]
          Nigel Daley made changes -
              Status: Resolved [ 5 ] → Closed [ 6 ]
          Robert Chansler made changes -
              Release Note: "During block transfers between datanodes, the receiving datanode, now can report corrupt replicas received from src node to the namenode" → "Changed protocol for transferring blocks between data nodes to report corrupt blocks to data node for re-replication from a good replica."
          Robert Chansler made changes -
              Status: Patch Available [ 10002 ] → Resolved [ 5 ]
              Fix Version/s: 0.18.0 [ 12312972 ]
              Resolution: Fixed [ 1 ]
          Lohit Vijayarenu made changes -
              Status: Open [ 1 ] → Patch Available [ 10002 ]
          Lohit Vijayarenu made changes -
              Status: Patch Available [ 10002 ] → Open [ 1 ]
          Lohit Vijayarenu made changes -
              Status: Open [ 1 ] → Patch Available [ 10002 ]
          Lohit Vijayarenu made changes -
              Attachment: HADOOP-3035-3.patch [ 12382506 ]
          Lohit Vijayarenu made changes -
              Status: Patch Available [ 10002 ] → Open [ 1 ]
          Chris Douglas made changes -
              Assignee: lohit vijayarenu [ lohit ]
          Lohit Vijayarenu made changes -
              Status: Open [ 1 ] → Patch Available [ 10002 ]
              Release Note: During block transfers between datanodes, the receiving datanode, now can report corrupt replicas received from src node to the namenode
              Hadoop Flags: [Incompatible change]
          Lohit Vijayarenu made changes -
              Attachment: HADOOP-3035-2.patch [ 12382066 ]
          Lohit Vijayarenu made changes -
              Attachment: HADOOP-3035-1.patch [ 12382062 ]
          dhruba borthakur made changes -
              Link: This issue relates to HADOOP-3314 [ HADOOP-3314 ]
          Konstantin Shvachko created issue -

            People

            • Assignee: Lohit Vijayarenu
            • Reporter: Konstantin Shvachko
            • Votes: 0
            • Watchers: 0
