Hadoop HDFS / HDFS-16420

Avoid deleting unique data blocks when deleting redundancy striped blocks



    Description

      We have a problem similar to the one described in HDFS-16297.

      In our cluster we use EC (6+3) together with the balancer on version 3.1.0, and a missing block occurred.

      We got the info for block group blk_-9223372036824119008 from fsck: only 5 distinct live internal blocks, plus many redundant copies of one internal block. With RS(6,3) a block group needs any 6 of its 9 internal blocks to be readable, so Live_repl=5 means the data is unrecoverable.

      blk_-9223372036824119008_220037616 len=133370338 MISSING! Live_repl=5
      blk_-9223372036824119007:DatanodeInfoWithStorage,   
      blk_-9223372036824119002:DatanodeInfoWithStorage,    
      blk_-9223372036824119001:DatanodeInfoWithStorage,  
      blk_-9223372036824119000:DatanodeInfoWithStorage, 
      blk_-9223372036824119004:DatanodeInfoWithStorage,  
      blk_-9223372036824119004:DatanodeInfoWithStorage, 
      blk_-9223372036824119004:DatanodeInfoWithStorage, 
      blk_-9223372036824119004:DatanodeInfoWithStorage, 
      blk_-9223372036824119004:DatanodeInfoWithStorage, 
      blk_-9223372036824119004:DatanodeInfoWithStorage 
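
      For context, with RS(6,3) a block group is readable only if at least 6 of its 9 internal blocks (any mix of data and parity) are still live. A minimal sketch of that check, with illustrative names only (not HDFS code):

      import java.util.Set;

      // Minimal sketch: an RS(6,3) block group is recoverable only if at least
      // numDataUnits (6) distinct internal block indices are still live.
      class StripedRecoverySketch {
        static final int NUM_DATA_UNITS = 6;
        static final int NUM_PARITY_UNITS = 3;

        static boolean isRecoverable(Set<Integer> liveIndices) {
          return liveIndices.size() >= NUM_DATA_UNITS;
        }

        public static void main(String[] args) {
          // The fsck output above shows 5 distinct live indices -> MISSING.
          System.out.println(isRecoverable(Set.of(1, 4, 6, 7, 8)));    // false
          System.out.println(isRecoverable(Set.of(1, 4, 5, 6, 7, 8))); // true
        }
      }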

         

      We searched the logs of all datanodes and found that the internal blocks of blk_-9223372036824119008 were deleted at almost the same time.

       

      08:15:58,550 INFO  impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(333)) - Deleted BP-1606066499-xxxx-1606188026755 blk_-9223372036824119008_220037616 URI file:/data15/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755/current/finalized/subdir19/subdir9/blk_-9223372036824119008
      
      08:16:21,214 INFO  impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(333)) - Deleted BP-1606066499-xxxx-1606188026755 blk_-9223372036824119006_220037616 URI file:/data4/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755/current/finalized/subdir19/subdir9/blk_-9223372036824119006
      
      08:16:55,737 INFO  impl.FsDatasetAsyncDiskService (FsDatasetAsyncDiskService.java:run(333)) - Deleted BP-1606066499-xxxx-1606188026755 blk_-9223372036824119005_220037616 URI file:/data2/hadoop/hdfs/data/current/BP-1606066499-xxxx-1606188026755/current/finalized/subdir19/subdir9/blk_-9223372036824119005
      

       

      The number of deletions of each internal block during 08:15-08:17 is as follows:

      internal block                  index    delete num
      blk_-9223372036824119008        0        1
      blk_-9223372036824119006        2        1
      blk_-9223372036824119005        3        1
      blk_-9223372036824119004        4        50
      blk_-9223372036824119003        5        1
      blk_-9223372036824119000        8        1
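
      For reference, the index column above follows directly from the block IDs: the low 4 bits of a striped block ID encode the position inside the block group (0-5 data, 6-8 parity for RS(6,3)), and the group ID itself has those bits zeroed. A standalone sketch of that mapping (mirroring the ID layout HDFS uses for striped blocks; the class and method names here are ours):

      // Sketch only: recover the block group ID and internal index from a
      // striped internal block ID. The low 4 bits hold the index.
      public class StripedBlockIdSketch {
        private static final long BLOCK_GROUP_INDEX_MASK = 15;

        static long blockGroupId(long internalBlockId) {
          return internalBlockId & ~BLOCK_GROUP_INDEX_MASK;
        }

        static int blockIndex(long internalBlockId) {
          return (int) (internalBlockId & BLOCK_GROUP_INDEX_MASK);
        }

        public static void main(String[] args) {
          long[] ids = {
              -9223372036824119008L, // the group itself -> index 0
              -9223372036824119006L, // index 2
              -9223372036824119004L, // index 4 (deleted 50 times above)
              -9223372036824119000L  // index 8 (a parity block in RS(6,3))
          };
          for (long id : ids) {
            System.out.println(id + " -> group " + blockGroupId(id)
                + ", index " + blockIndex(id));
          }
        }
      }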

       

      Between 08:15 and 08:17, we restarted 2 datanodes and immediately triggered full block reports.

       

      There are 2 questions:
      1. Why are there so many copies of one internal block (blk_-9223372036824119004)?
      2. Why were internal blocks that had only one copy deleted?

      The likely reasons for the first problem are as follows:
      1. We set the full block report period of some datanodes to 168 hours.
      2. We performed a namenode HA failover.
      3. After the failover, the storages became stale, and they stayed stale until the next full block report.
      4. The balancer copied the replica without the source replica being deleted, because the source node still had a stale storage, so the deletion was postponed into postponedMisreplicatedBlocks (see the sketch after this list).
      5. The balancer kept copying the replica, eventually producing many copies of the same internal block.

      When the postponed blocks were rescanned, the rescannedMisreplicatedBlocks set therefore contained many blocks to remove.
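
      A simplified sketch of the postponement behavior in point 4, with illustrative types and names (not the actual BlockManager code): while any replica of an over-replicated block still sits on a stale storage after failover, the extra replicas are not invalidated; the block is parked in postponedMisreplicatedBlocks and re-checked after the next full block report.

      import java.util.ArrayList;
      import java.util.HashSet;
      import java.util.List;
      import java.util.Set;

      // Sketch only: postpone invalidation of excess replicas while any
      // storage holding the block is still stale after a failover.
      class StaleStoragePostponementSketch {
        static class Storage {
          final String id;
          boolean stale; // true until the first full block report after failover
          Storage(String id, boolean stale) { this.id = id; this.stale = stale; }
        }

        final Set<Long> postponedMisreplicatedBlocks = new HashSet<>();

        /** Returns the replicas that may safely be invalidated now. */
        List<Storage> chooseExcess(long blockId, List<Storage> replicas, int expected) {
          boolean anyStale = replicas.stream().anyMatch(s -> s.stale);
          if (anyStale && replicas.size() > expected) {
            // The replica count cannot be trusted yet: a stale storage may in
            // fact have lost its copy. Postpone and rescan after the next FBR.
            postponedMisreplicatedBlocks.add(blockId);
            return new ArrayList<>();
          }
          // Otherwise mark the surplus replicas as excess (simplistic policy).
          return new ArrayList<>(replicas.subList(Math.min(expected, replicas.size()), replicas.size()));
        }
      }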

      As for the second question, we checked the code of processExtraRedundancyBlock, but didn't find any problem.
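
      For clarity, the invariant this issue asks for (and that the title states) is: when trimming excess redundancy of a striped block group, only internal block indices that still have more than one live copy may be chosen for deletion. A minimal sketch of that check, with illustrative names only (not the actual processExtraRedundancyBlock / chooseExcessRedundancyStriped code):

      import java.util.ArrayList;
      import java.util.BitSet;
      import java.util.List;

      // Sketch only: an internal block of a striped group may be invalidated
      // only if another live copy of the same internal index remains.
      class ExcessStripedReplicaSketch {
        static class Replica {
          final String datanode;
          final int blockIndex; // 0..8 for RS(6,3): 0-5 data, 6-8 parity
          Replica(String datanode, int blockIndex) {
            this.datanode = datanode;
            this.blockIndex = blockIndex;
          }
        }

        /** Keep the first copy seen for every index; later copies are true excess. */
        static List<Replica> chooseExcess(List<Replica> replicas) {
          BitSet kept = new BitSet();
          List<Replica> excess = new ArrayList<>();
          for (Replica r : replicas) {
            if (kept.get(r.blockIndex)) {
              excess.add(r);          // duplicate index -> safe to delete
            } else {
              kept.set(r.blockIndex); // first (possibly only) copy -> must keep
            }
          }
          return excess;
        }
      }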

       

      Attachments

        1. image-2022-01-10-17-31-35-910.png
          169 kB
          qinyuren
        2. image-2022-01-10-17-32-56-981.png
          149 kB
          qinyuren


            People

              Jackson Wang
              qinyuren