Hadoop HDFS / HDFS-10512

VolumeScanner may terminate due to NPE in DataNode.reportBadBlocks

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 2.8.0, 2.7.4, 3.0.0-alpha1
    • Component/s: datanode
    • Labels: None
    • Hadoop Flags: Reviewed

    Description

      VolumeScanner may terminate due to an unexpected NullPointerException thrown in DataNode.reportBadBlocks(). This is different from HDFS-8850 and HDFS-9190.

      I observed this bug in a production CDH 5.5.1 cluster, and the same bug still persists in upstream trunk.

      2016-04-07 20:30:53,830 WARN org.apache.hadoop.hdfs.server.datanode.VolumeScanner: Reporting bad BP-1800173197-10.204.68.5-1444425156296:blk_1170134484_96468685 on /dfs/dn
      2016-04-07 20:30:53,831 ERROR org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting because of exception
      java.lang.NullPointerException
              at org.apache.hadoop.hdfs.server.datanode.DataNode.reportBadBlocks(DataNode.java:1018)
              at org.apache.hadoop.hdfs.server.datanode.VolumeScanner$ScanResultHandler.handle(VolumeScanner.java:287)
              at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.scanBlock(VolumeScanner.java:443)
              at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.runLoop(VolumeScanner.java:547)
              at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:621)
      2016-04-07 20:30:53,832 INFO org.apache.hadoop.hdfs.server.datanode.VolumeScanner: VolumeScanner(/dfs/dn, DS-89b72832-2a8c-48f3-8235-48e6c5eb5ab3) exiting.
      

      I think the NPE comes from the volume variable in the following code snippet. Somehow the volume scanner knows the volume, but the DataNode cannot look up the volume using the block, so getVolume(block) returns null and the subsequent volume.getStorageID() call throws.

      public void reportBadBlocks(ExtendedBlock block) throws IOException {
        BPOfferService bpos = getBPOSForBlock(block);
        FsVolumeSpi volume = getFSDataset().getVolume(block);
        bpos.reportBadBlocks(
            block, volume.getStorageID(), volume.getStorageType());
      }
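
      One way to avoid the scanner thread dying on this path is to guard against a missing volume before dereferencing it. The following is only a minimal, self-contained sketch of that idea, not the committed patch: the Volume interface, the volumesByBlock map, and the reported list are hypothetical stand-ins for FsVolumeSpi, the dataset's block-to-volume lookup, and the report sent through BPOfferService.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReportBadBlocksSketch {
    /** Hypothetical stand-in for FsVolumeSpi; only the storage ID is modeled. */
    interface Volume {
        String getStorageID();
    }

    /** Hypothetical stand-in for the dataset's block-to-volume lookup. */
    static final Map<String, Volume> volumesByBlock = new HashMap<>();

    /** Hypothetical record of the reports that would be sent to the NameNode. */
    static final List<String> reported = new ArrayList<>();

    /**
     * Reports a bad block only if its volume can be resolved.
     * Returns false (and logs a warning) instead of throwing an NPE
     * when the lookup yields null.
     */
    static boolean reportBadBlock(String blockId) {
        Volume volume = volumesByBlock.get(blockId);
        if (volume == null) {
            // Assumption: skipping the report is acceptable; the caller keeps running.
            System.err.println("Cannot find volume for bad block " + blockId
                + "; skipping report");
            return false;
        }
        reported.add(blockId + "@" + volume.getStorageID());
        return true;
    }

    public static void main(String[] args) {
        volumesByBlock.put("blk_1", () -> "DS-example");
        System.out.println(reportBadBlock("blk_1")); // volume known
        System.out.println(reportBadBlock("blk_2")); // volume lookup returns null
    }
}
```

      With a guard of this shape, a failed lookup costs one unreported block rather than the whole VolumeScanner thread. Whether to skip the report or to report without storage details is a design choice this sketch does not settle.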
      

      Attachments

        1. HDFS-10512.001.patch
          0.9 kB
          Yiqun Lin
        2. HDFS-10512.002.patch
          2 kB
          Yiqun Lin
        3. HDFS-10512.004.patch
          4 kB
          Wei-Chiu Chuang
        4. HDFS-10512.005.patch
          7 kB
          Yiqun Lin
        5. HDFS-10512.006.patch
          7 kB
          Yiqun Lin

            People

              Assignee: Yiqun Lin (linyiqun)
              Reporter: Wei-Chiu Chuang (weichiu)