  1. Hadoop HDFS
  2. HDFS-17003

Erasure Coding: invalidate wrong block after reporting bad blocks from datanode


Details

    • Reviewed

    Description

      After receiving a reportBadBlocks RPC from a datanode, the NameNode computes the wrong block to invalidate. This is dangerous behaviour and may cause data loss. Some logs from our production cluster are shown below:

       

      NameNode log:

      2023-05-08 21:23:49,112 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627-xxxx-1680179358678:blk_-9223372036848404320_1471186 on datanode: datanode1:50010
      
      2023-05-08 21:23:49,183 INFO org.apache.hadoop.hdfs.StateChange: *DIR* reportBadBlocks for block: BP-932824627-xxxx-1680179358678:blk_-9223372036848404319_1471186 on datanode: datanode2:50010

      datanode1 log:

      2023-05-08 21:23:49,088 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-932824627-xxxx-1680179358678:blk_-9223372036848404320_1471186 on /data7/hadoop/hdfs/datanode
      
      2023-05-08 21:24:00,509 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Failed to delete replica blk_-9223372036848404319_1471186: ReplicaInfo not found.

       

      This problem is reproducible.
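
      The two block IDs in the logs above differ only in their lowest bits, which is consistent with them being internal blocks of the same erasure-coded block group. The following is a minimal sketch, not HDFS code: it assumes the striped block ID layout in which the low 4 bits of an ID carry the internal block index and the remaining bits identify the block group. The class and method names are hypothetical and chosen only for illustration.

```java
// Illustrative sketch only; not part of HDFS.
// Assumption: for striped (erasure-coded) blocks, the low 4 bits of the
// block ID hold the internal block index within the block group.
final class StripedBlockIdSketch {

    // Assumed mask covering the internal block index bits.
    static final long INDEX_MASK = 0xFL;

    // Block group ID: the block ID with the index bits cleared.
    static long blockGroupId(long blockId) {
        return blockId & ~INDEX_MASK;
    }

    // Internal block index within the group.
    static int blockIndex(long blockId) {
        return (int) (blockId & INDEX_MASK);
    }

    public static void main(String[] args) {
        long reportedByDatanode1 = -9223372036848404320L; // from the datanode1 VolumeScanner log
        long deleteSentToDatanode1 = -9223372036848404319L; // from the datanode1 FsDatasetImpl log

        // Both IDs map to the same block group.
        System.out.println(blockGroupId(reportedByDatanode1) == blockGroupId(deleteSentToDatanode1)); // true

        // The indices differ: 0 was reported, 1 was targeted for deletion.
        System.out.println(blockIndex(reportedByDatanode1));   // 0
        System.out.println(blockIndex(deleteSentToDatanode1)); // 1
    }
}
```

      On this reading, datanode1 reported the internal block with index 0 as bad but was then asked to delete the internal block with index 1, which it does not hold. That matches the "ReplicaInfo not found" message in the datanode1 log.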

          People

            zhanghaobo farmmamba
            zhanghaobo farmmamba