  1. Hadoop HDFS
  2. HDFS-5809

BlockPoolSliceScanner and high-speed HDFS appending cause the DataNode to enter an infinite loop


Details

    Description

      BlockPoolSliceScanner#scan contains a "while" loop that keeps verifying (i.e. scanning) blocks until blockInfoSet is empty or another exit condition, such as a timeout, is met. On each iteration it calls BlockPoolSliceScanner#verifyFirstBlock, which is intended to take the first block in blockInfoSet, verify it, and remove it from that set. (blockInfoSet is sorted by last scan time.) Unfortunately, if we hit a certain bug in updateScanStatus, the block may never be removed from blockInfoSet. When this happens, we keep rescanning the exact same block until the timeout hits.

      The bug is triggered when a block winds up in blockInfoSet but not in blockMap. You can see it clearly in this code:

        private synchronized void updateScanStatus(Block block,
                                                   ScanType type,
                                                   boolean scanOk) {
          BlockScanInfo info = blockMap.get(block);

          if ( info != null ) {
            delBlockInfo(info);
          } else {
            // It might already be removed. Thats ok, it will be caught next time.
            info = new BlockScanInfo(block);
          }
      

      If info == null, we never call delBlockInfo, the function which is intended to remove the blockInfoSet entry.
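The failure mode can be reproduced in a small stand-alone model. The sketch below is a simplification, not the real HDFS code: StuckScanDemo, ScanInfo, infoSet, add, and updateScanStatusByLookup are hypothetical names, the iteration cap stands in for the timeout, and verification itself is omitted. It only shows that when an entry is present in the sorted set but missing from the map, a lookup-based removal never shrinks the set, so the loop keeps picking the same first element.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical, simplified model of the scanner state; not the real HDFS classes.
class StuckScanDemo {
  static final class ScanInfo {
    final long blockId;
    final long lastScanTime;

    ScanInfo(long blockId, long lastScanTime) {
      this.blockId = blockId;
      this.lastScanTime = lastScanTime;
    }
  }

  // Stand-in for blockInfoSet: ordered by last scan time, then block ID.
  final TreeSet<ScanInfo> infoSet = new TreeSet<>(
      Comparator.comparingLong((ScanInfo i) -> i.lastScanTime)
          .thenComparingLong(i -> i.blockId));

  // Stand-in for blockMap: block ID -> scan info.
  final Map<Long, ScanInfo> blockMap = new HashMap<>();

  void add(long blockId, long lastScanTime) {
    ScanInfo info = new ScanInfo(blockId, lastScanTime);
    infoSet.add(info);
    blockMap.put(blockId, info);
  }

  // Mirrors the lookup-based removal: the set entry is removed only if
  // the map still knows about the block.
  void updateScanStatusByLookup(long blockId) {
    ScanInfo info = blockMap.remove(blockId);
    if (info != null) {
      infoSet.remove(info);
    }
    // If info == null, nothing is removed from infoSet.
  }

  // Scan loop with an iteration cap standing in for the timeout.
  // Returns the number of iterations used.
  int scan(int maxIterations) {
    int iterations = 0;
    while (!infoSet.isEmpty() && iterations < maxIterations) {
      updateScanStatusByLookup(infoSet.first().blockId);
      iterations++;
    }
    return iterations;
  }

  public static void main(String[] args) {
    StuckScanDemo demo = new StuckScanDemo();
    demo.add(1L, 10L);
    demo.add(2L, 20L);
    demo.blockMap.remove(1L); // simulate the inconsistent state
    System.out.println("iterations used: " + demo.scan(1000));
  }
}
```

With consistent state the loop ends after one pass per block. With the first block missing from the map, the loop runs until the cap and the set is left unchanged.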

      Luckily, there is a simple fix: the variable passed to updateScanStatus is actually a BlockScanInfo object, so we can call delBlockInfo on it directly, without doing a lookup in blockMap. This is both faster and more robust.
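The idea behind the fix can be sketched in the same kind of simplified model. This is an illustration under assumptions, not the contents of HDFS-5809.001.patch: DirectRemovalSketch, ScanInfo, infoSet, and add are hypothetical names, and the real updateScanStatus also records scan results, which is left out here. The point is only that removing the entry through the object taken from the set no longer depends on the map being in sync.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical, simplified model; not the real HDFS classes or the actual patch.
class DirectRemovalSketch {
  static final class ScanInfo {
    final long blockId;
    final long lastScanTime;

    ScanInfo(long blockId, long lastScanTime) {
      this.blockId = blockId;
      this.lastScanTime = lastScanTime;
    }
  }

  // Stand-in for blockInfoSet: ordered by last scan time, then block ID.
  final TreeSet<ScanInfo> infoSet = new TreeSet<>(
      Comparator.comparingLong((ScanInfo i) -> i.lastScanTime)
          .thenComparingLong(i -> i.blockId));

  // Stand-in for blockMap: block ID -> scan info.
  final Map<Long, ScanInfo> blockMap = new HashMap<>();

  void add(long blockId, long lastScanTime) {
    ScanInfo info = new ScanInfo(blockId, lastScanTime);
    infoSet.add(info);
    blockMap.put(blockId, info);
  }

  // Takes the scan info itself, so the set entry is removed even when
  // the map no longer contains the block.
  void updateScanStatus(ScanInfo info) {
    infoSet.remove(info);
    blockMap.remove(info.blockId);
  }

  // Scan loop with an iteration cap standing in for the timeout.
  // Returns the number of iterations used.
  int scan(int maxIterations) {
    int iterations = 0;
    while (!infoSet.isEmpty() && iterations < maxIterations) {
      updateScanStatus(infoSet.first());
      iterations++;
    }
    return iterations;
  }
}
```

In this model the loop drains the set in one iteration per entry, whether or not the map was consistent beforehand, and it avoids the extra map lookup on each update.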

      Attachments

        1. HDFS-5809.001.patch
          3 kB
          Colin McCabe

        Activity

          People

            cmccabe Colin McCabe
            ikweesung ikweesung
            Votes: 2
            Watchers: 8

            Dates

              Created:
              Updated:
              Resolved: