[HDFS-2838] HA: NPE in FSNamesystem when in safe mode - ASF JIRA

Details

Type: Sub-task
Status: Resolved
Priority: Major
Resolution: Fixed
Affects Version/s: HA branch (HDFS-1623)
Fix Version/s: HA branch (HDFS-1623)
Component/s: ha, namenode
Labels: None
Target Version/s: HA branch (HDFS-1623)
Hadoop Flags: Reviewed

Description

I'm seeing an NPE when running HBase 0.92 unit tests against the HA branch. The test failure is: org.apache.hadoop.hbase.regionserver.wal.TestHLog.testAppendClose.

Here is the backtrace:

java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.size(BlocksMap.java:179)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getActiveBlockCount(BlockManager.java:2465)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.doConsistencyCheck(FSNamesystem.java:3591)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.isOn(FSNamesystem.java:3285)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$SafeModeInfo.access$900(FSNamesystem.java:3196)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isInSafeMode(FSNamesystem.java:3670)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.isInSafeMode(NameNode.java:609)
    at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:1476)
    at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:1487)

Here is the relevant section of the test:

    try {
      DistributedFileSystem dfs = (DistributedFileSystem) cluster.getFileSystem();
      dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_ENTER);
      cluster.shutdown();
      try {
        // wal.writer.close() will throw an exception,
        // but still call this since it closes the LogSyncer thread first
        wal.close();
      } catch (IOException e) {
        LOG.info(e);
      }
      fs.close(); // closing FS last so DFSOutputStream can't call close
      LOG.info("STOPPED first instance of the cluster");
    } finally {
      // Restart the cluster
      while (cluster.isClusterUp()){
        LOG.error("Waiting for cluster to go down");
        Thread.sleep(1000);
      }

Fix looks trivial, will include patch shortly.
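
The backtrace shows the test polling cluster.isClusterUp() after cluster.shutdown(), with the exception raised inside BlocksMap.size(). One plausible reading (an assumption, since the report does not state the cause) is that the backing block collection is no longer available once the NameNode has shut down. The class below is a minimal, hypothetical sketch of a null-tolerant size(); SketchBlocksMap and its members are illustrative names, not the Hadoop classes and not the attached patch.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical stand-in for a block map whose backing store is released on
 * shutdown. This is NOT org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.
 */
class SketchBlocksMap {
    // Assumption for illustration: close() drops the reference, so later
    // callers may observe null here.
    private volatile Map<Long, String> blocks = new HashMap<>();

    /** Records a block; silently ignored once the map has been closed. */
    void put(long blockId, String info) {
        Map<Long, String> current = blocks;
        if (current != null) {
            current.put(blockId, info);
        }
    }

    /** Releases the backing store, as a shutdown path might. */
    void close() {
        blocks = null;
    }

    /** Returns the number of blocks, or 0 if the map has been closed. */
    int size() {
        // Read the field once so a concurrent close() cannot null it
        // between the check and the call.
        Map<Long, String> current = blocks;
        return current == null ? 0 : current.size();
    }
}
```

With a guard of this kind, a status check that runs after shutdown sees a count of 0 instead of throwing, so a polling loop such as the one in the test can observe the cluster as down and exit.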

Attachments

HDFS-2838-v2.patch, 26/Jan/12 21:35, 3 kB, Gregory Chanan
HDFS-2838.patch, 25/Jan/12 02:19, 0.9 kB, Gregory Chanan

People

Assignee: Gregory Chanan

Reporter: Gregory Chanan

Votes: 0

Watchers: 5

Dates

Created: 25/Jan/12 02:11

Updated: 02/Mar/12 06:17

Resolved: 26/Jan/12 23:48
