Hadoop HDFS / HDFS-3493

Invalidate excess corrupted blocks as long as minimum replication is satisfied


Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.0.0-alpha, 2.0.5-alpha
    • Fix Version/s: 2.5.0
    • Component/s: namenode
    • Labels: None

    Description

      Replication factor = 3, block report interval = 1 min; start the NN and 3 DNs.

      Step 1: Write a file without closing it and do hflush (DN1, DN2, DN3 have blk_ts1)
      Step 2: Stop DN3
      Step 3: Recovery happens and the timestamp is updated (blk_ts2)
      Step 4: Close the file
      Step 5: blk_ts2 is finalized and available on DN1 and DN2
      Step 6: Now restart DN3 (which still has blk_ts1 in RBW)

      From the NN side, no command is ever issued to DN3 to delete blk_ts1; the NN only marks DN3's replica as corrupt. As a result, replication of blk_ts2 to DN3 never happens.
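      The fix named in the issue title can be sketched as a change to the NN-side invalidation condition: instead of keeping a corrupt replica around until the block reaches its full target replication, invalidate it as soon as the live replicas satisfy minimum replication, which frees the DN to receive a good copy. A minimal model of that decision, assuming illustrative names (this is not the actual BlockManager code):

      ```java
      // Minimal model of the NN-side decision described in this issue.
      // The class and method names are illustrative, not the real
      // org.apache.hadoop.hdfs.server.blockmanagement API.
      public class CorruptReplicaPolicy {
          static final int MIN_REPLICATION = 1; // dfs.namenode.replication.min default

          /**
           * Before the fix, a corrupt replica was effectively only removed once
           * the block had its full target replication elsewhere. With the fix,
           * it is enough that live replicas satisfy minimum replication.
           */
          static boolean shouldInvalidate(int liveReplicas, boolean fixed,
                                          int targetReplication) {
              if (fixed) {
                  return liveReplicas >= MIN_REPLICATION;
              }
              return liveReplicas >= targetReplication;
          }

          public static void main(String[] args) {
              // Scenario from the description: target = 3, DN1 and DN2 hold the
              // finalized genstamp (2 live replicas), DN3 holds the stale one.
              int live = 2, target = 3;
              // Old behavior: the stale replica on DN3 is never invalidated,
              // so DN3 stays occupied and re-replication to it stalls.
              System.out.println("before fix: invalidate="
                      + shouldInvalidate(live, false, target));
              // Fixed behavior: minReplication is satisfied, so the corrupt
              // replica is invalidated and DN3 becomes a re-replication target.
              System.out.println("after fix: invalidate="
                      + shouldInvalidate(live, true, target));
          }
      }
      ```

      With this condition, the stuck state in the description (2 live replicas, 1 stale replica, target 3) resolves itself: the stale replica on DN3 is deleted and the block can be re-replicated back to 3.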

      NN logs:
      ========

      INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by /XX.XX.XX.XX because reported RWR replica with genstamp 1007 does not match COMPLETE block's genstamp in block map 1008
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from DatanodeRegistration(XX.XX.XX.XX, storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, ipcPort=50277, storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0), blocks: 2, processing time: 1 msecs
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK* Removing block blk_3927215081484173742_1008 from neededReplications as it has enough replicas.
      
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK NameSystem.addToCorruptReplicasMap: duplicate requested for blk_3927215081484173742 to add as corrupt on XX.XX.XX.XX:50276 by /XX.XX.XX.XX because reported RWR replica with genstamp 1007 does not match COMPLETE block's genstamp in block map 1008
      INFO org.apache.hadoop.hdfs.StateChange: BLOCK* processReport: from DatanodeRegistration(XX.XX.XX.XX, storageID=DS-443871816-XX.XX.XX.XX-50276-1336829714197, infoPort=50275, ipcPort=50277, storageInfo=lv=-40;cid=CID-e654ac13-92dc-4f82-a22b-c0b6861d06d7;nsid=2063001898;c=0), blocks: 2, processing time: 1 msecs
      WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Not able to place enough replicas, still in need of 1 to reach 1
      For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
      

      fsck Report
      ===========

      /file21:  Under replicated BP-1008469586-XX.XX.XX.XX-1336829603103:blk_3927215081484173742_1008. Target Replicas is 3 but found 2 replica(s).
      .Status: HEALTHY
       Total size:	495 B
       Total dirs:	1
       Total files:	3
       Total blocks (validated):	3 (avg. block size 165 B)
       Minimally replicated blocks:	3 (100.0 %)
       Over-replicated blocks:	0 (0.0 %)
       Under-replicated blocks:	1 (33.333332 %)
       Mis-replicated blocks:		0 (0.0 %)
       Default replication factor:	1
       Average block replication:	2.0
       Corrupt blocks:		0
       Missing replicas:		1 (14.285714 %)
       Number of data-nodes:		3
       Number of racks:		1
      FSCK ended at Sun May 13 09:49:05 IST 2012 in 9 milliseconds
      The filesystem under path '/' is HEALTHY
      

      Attachments

        1. HDFS-3493.002.patch
          6 kB
          Juan Yu
        2. HDFS-3493.003.patch
          9 kB
          Juan Yu
        3. HDFS-3493.004.patch
          8 kB
          Juan Yu
        4. HDFS-3493.patch
          5 kB
          Vinayakumar B

      Issue Links

      Activity

      People

        Assignee: Juan Yu (jyu@cloudera.com)
        Reporter: J.Andreina (andreina)
        Votes: 0
        Watchers: 14

      Dates

        Created:
        Updated:
        Resolved: