  1. HBase
  2. HBASE-15992

Preserve original KeeperException when converted to external exceptions

Details

    • Type: Brainstorming
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.98.14
    • None
    • None

    Description

      While investigating unexpected NoServerForRegionException errors, we found that the root cause was a KeeperException that was lost along the way, which left a misleading top-level error.

      The underlying exception, with a partial stack trace, is:

      org.apache.zookeeper.KeeperException$AuthFailedException: KeeperErrorCode = AuthFailed for /hbase/meta-region-server
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:123)
      	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
      	at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1289)
      	at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359)
      	at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:684)
      	at org.apache.hadoop.hbase.zookeeper.ZKUtil.blockUntilAvailable(ZKUtil.java:2032)
      	at org.apache.hadoop.hbase.zookeeper.MetaRegionTracker.blockUntilAvailable(MetaRegionTracker.java:203)
      	at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getMetaRegionLocation(ZooKeeperRegistry.java:58)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateMeta(HConnectionManager.java:1209)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1175)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1301)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1178)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:1135)
      	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:976)
      

      Here is some additional information:

      • The exception first gets caught here
      • It gets logged and rethrown from here
      • It gets caught again, logged and rethrown here
      • This finally gets caught and rethrown as InterruptedException here

      Once the failure is rethrown as InterruptedException, the cause is lost, so the code catching it cannot (and currently does not) determine what went wrong. Perhaps the original exception should be preserved and passed on to the caller, so that it is available when the NoServerForRegionException is finally thrown here. Alternatively, a more meaningful exception could be thrown instead of the misleading NoServerForRegionException, especially when the failure indicates a more permanent condition, such as the AuthFailed error above.
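The first suggestion above could look roughly like the following sketch. It is not the HBase code: FakeKeeperException and FakeNoServerException are hypothetical stand-ins for KeeperException and NoServerForRegionException so that the example compiles without ZooKeeper or HBase on the classpath, and the helper names are made up for illustration. The one relevant JDK detail is that InterruptedException has no constructor that accepts a cause, so the cause has to be attached with Throwable.initCause.

```java
import java.io.IOException;

/** Sketch only: every class and method name here is hypothetical, not HBase or ZooKeeper API. */
class CausePreservationSketch {

    /** Stand-in for org.apache.zookeeper.KeeperException (assumption, so the sketch needs no ZooKeeper jar). */
    static class FakeKeeperException extends Exception {
        FakeKeeperException(String message) {
            super(message);
        }
    }

    /** Stand-in for the exception that the client finally surfaces to the caller. */
    static class FakeNoServerException extends IOException {
        FakeNoServerException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Converts the ZooKeeper failure to InterruptedException without dropping it.
     * InterruptedException has no (message, cause) constructor, hence initCause.
     */
    static InterruptedException toInterrupted(FakeKeeperException e) {
        InterruptedException ie =
            new InterruptedException("ZooKeeper call failed: " + e.getMessage());
        ie.initCause(e);
        return ie;
    }

    /** Top-level conversion that keeps the whole chain reachable through getCause(). */
    static FakeNoServerException toNoServer(String what, Throwable cause) {
        return new FakeNoServerException("Unable to locate " + what, cause);
    }

    /** Walks the cause chain and returns the deepest throwable. */
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        FakeKeeperException zk = new FakeKeeperException(
            "KeeperErrorCode = AuthFailed for /hbase/meta-region-server");
        FakeNoServerException top = toNoServer("meta region", toInterrupted(zk));
        // Prints whether the original ZooKeeper failure is still reachable from the top-level exception.
        System.out.println(rootCause(top) == zk);
    }
}
```

With the chain intact, a caller that catches the top-level exception can inspect the root cause and decide, for example, not to retry on an authentication failure. Whether the real fix should chain causes like this or throw a different exception type is the open question of this issue.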

          People

            Assignee: Unassigned
            Reporter: Hari Krishna Dara (haridsv)