
    Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: HA branch (HDFS-1623)
    • Fix Version/s: HA branch (HDFS-1623)
    • Component/s: ha, namenode
    • Labels:
      None

      Description

      I'm seeing an NPE when running HBase 0.92 unit tests against the HA branch. The test failure is: org.apache.hadoop.hbase.regionserver.wal.TestHLog.testAppendClose.

      Here is the backtrace:
      java.lang.NullPointerException
      at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:179)
      at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getActiveBlockCount(BlockManager.java:2465)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.doConsistencyCheck(FSNamesystem.java:3591)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3285)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$900(FSNamesystem.java:3196)
      at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3670)
      at org.apache.hadoop.hdfs.server.namenode.NameNode.isInSafeMode(NameNode.java:609)
      at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:1476)
      at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:1487)

      Here is the relevant section of the test:

        try {
          DistributedFileSystem dfs = (DistributedFileSystem) cluster.getFileSystem();
          dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_ENTER);
          cluster.shutdown();
          try {
            // wal.writer.close() will throw an exception,
            // but still call this since it closes the LogSyncer thread first
            wal.close();
          } catch (IOException e) {
            LOG.info(e);
          }
          fs.close(); // closing FS last so DFSOutputStream can't call close
          LOG.info("STOPPED first instance of the cluster");
        } finally {
          // Restart the cluster
          while (cluster.isClusterUp()) {
            LOG.error("Waiting for cluster to go down");
            Thread.sleep(1000);
          }

      The fix looks trivial; I will attach a patch shortly.
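The backtrace shows BlocksMap.size() throwing when MiniDFSCluster.isClusterUp() queries safe mode after cluster.shutdown() has already run. The following is a minimal sketch of that failure pattern and one possible guard. It is not the attached patch: SketchBlocksMap, sizeUnguarded() and NullGuardSketch are hypothetical names, and the assumption that shutdown releases the backing map by setting it to null is inferred from the backtrace rather than confirmed by the report.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a block map whose backing store is released on shutdown. */
class SketchBlocksMap {
    // Assumption: shutdown sets this field to null instead of just clearing it.
    private Map<Long, String> blocks = new HashMap<>();

    void add(long blockId, String info) {
        blocks.put(blockId, info);
    }

    /** Releases the backing map, as a shutdown path might. */
    void close() {
        blocks = null;
    }

    /** Unguarded variant: throws NullPointerException once close() has run. */
    int sizeUnguarded() {
        return blocks.size();
    }

    /** Guarded variant: reports 0 once the backing map has been released. */
    int size() {
        Map<Long, String> snapshot = blocks; // read the field once
        return snapshot == null ? 0 : snapshot.size();
    }
}

class NullGuardSketch {
    public static void main(String[] args) {
        SketchBlocksMap map = new SketchBlocksMap();
        map.add(1L, "blk_1");
        map.close(); // simulate the shutdown that precedes the safe-mode query

        System.out.println("guarded size after close: " + map.size());
        try {
            map.sizeUnguarded();
        } catch (NullPointerException e) {
            System.out.println("unguarded size threw NPE");
        }
    }
}
```

With a guard along these lines, a polling loop such as the `while (cluster.isClusterUp())` check in the test could keep querying a stopped NameNode without hitting the exception. The sketch does not address concurrent access to the field.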

        Attachments

        1. HDFS-2838-v2.patch
          3 kB
          Gregory Chanan
        2. HDFS-2838.patch
          0.9 kB
          Gregory Chanan

            People

            • Assignee:
              gchanan Gregory Chanan
              Reporter:
              gchanan Gregory Chanan
