Hadoop HDFS / HDFS-11553

Erasure Coding: Missing parity blocks in the block group are warned as corrupt blocks


    Details

      Description

      Currently, DFSStripedOutputStream only verifies that the number of allocated block locations is at least numDataBlocks. That is, for the EC policy RS-6-3-64K, although a full EC block group needs 9 DNs, clients can successfully create a DFSStripedOutputStream with just 6 DNs. Moreover, an output stream created with fewer DNs skips writing parity blocks entirely. HDFS-11552 tracks the improvement needed to accommodate parity blocks along with data blocks from the same block group.

      [Thread-5] WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block location for parity block, index=6
      [Thread-5] WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block location for parity block, index=7
      [Thread-5] WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:allocateNewBlock(497)) - Failed to get block location for parity block, index=8
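
      The arithmetic behind the case above can be sketched as follows. This is a minimal, standalone illustration and not the DFSStripedOutputStream code: the class and method names are hypothetical, and it assumes locations are handed out in block-group index order, which is consistent with the warnings for indices 6, 7 and 8.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the Hadoop code base.
public class StripedAllocationSketch {
    // RS-6-3 layout from the description: 6 data blocks + 3 parity blocks.
    static final int NUM_DATA_BLOCKS = 6;
    static final int NUM_PARITY_BLOCKS = 3;

    /** Models the check described above: only the data blocks need locations. */
    static boolean canCreateStream(int allocatedLocations) {
        return allocatedLocations >= NUM_DATA_BLOCKS;
    }

    /**
     * Block-group indices left without a location, assuming (for this sketch)
     * that locations are assigned to indices 0, 1, 2, ... in order.
     */
    static List<Integer> indicesWithoutLocation(int allocatedLocations) {
        List<Integer> missing = new ArrayList<>();
        int total = NUM_DATA_BLOCKS + NUM_PARITY_BLOCKS;
        for (int i = Math.max(0, allocatedLocations); i < total; i++) {
            missing.add(i);
        }
        return missing;
    }

    public static void main(String[] args) {
        int dataNodes = 6;
        System.out.println("stream can be created: " + canCreateStream(dataNodes));
        System.out.println("indices without location: " + indicesWithoutLocation(dataNodes));
    }
}
```

      With 6 DNs the check passes, and the three indices without a location are exactly the parity blocks of the group.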
      

      In the above case, closing the file stream produces the following warning message, because the parity blocks have not been written out. The warning claims that there are 3 corrupt blocks, which is incorrect. The EC redundancy is merely insufficient; no block is corrupt or lost yet. The warning message needs to be fixed for this use case.

      INFO  namenode.FSNamesystem (FSNamesystem.java:checkBlocksComplete(2726)) - BLOCK* blk_-9223372036854775792_1002 is COMMITTED but not COMPLETE(numNodes= 0 <  minimum = 6) in file /ec/test1
      INFO  hdfs.StateChange (FSNamesystem.java:completeFile(2679)) - DIR* completeFile: /ec/test1 is closed by DFSClient_NONMAPREDUCE_-1900076771_17
      WARN  hdfs.DFSOutputStream (DFSStripedOutputStream.java:logCorruptBlocks(1117)) - Block group <1> has 3 corrupt blocks. It's at high risk of losing data.
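
      One possible direction for the wording is sketched below. This is only an illustration under assumptions, not the change made for this issue: the class, method, parameters and message text are all hypothetical. The point is that blocks that were never written are reported as unwritten, not as corrupt.

```java
// Hypothetical sketch; the names and message text are illustrative only.
public class RedundancyWarningSketch {
    /**
     * Builds a warning for a block group in which some blocks were never
     * written. Returns an empty string when every block was written.
     */
    static String describe(int blockGroupIndex, int unwrittenBlocks, int numParityBlocks) {
        if (unwrittenBlocks <= 0) {
            return "";
        }
        StringBuilder msg = new StringBuilder();
        msg.append("Block group <").append(blockGroupIndex).append("> has ")
           .append(unwrittenBlocks).append(" blocks that were not written.");
        if (unwrittenBlocks >= numParityBlocks) {
            // RS can tolerate at most numParityBlocks missing blocks per group.
            msg.append(" No EC redundancy remains.");
        } else {
            msg.append(" EC redundancy is reduced.");
        }
        return msg.toString();
    }

    public static void main(String[] args) {
        // The case from the logs above: block group 1, 3 parity blocks unwritten, RS-6-3.
        System.out.println(describe(1, 3, 3));
    }
}
```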
      

            People

            • Assignee: Manoj Govindassamy (manojg)
            • Reporter: Manoj Govindassamy (manojg)
