HBase / HBASE-5733

AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE.

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: 0.94.1, 0.95.0
    • Component/s: master
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Found while going through the code...
      AssignmentManager#processDeadServersAndRegionsInTransition can fail with an NPE because it iterates directly over the nodes returned by listChildrenAndWatchForNewChildren without checking for null.

      As in other places, we need to add a null check here.
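
The failure mode described above can be illustrated with a minimal, self-contained Java sketch. It is not the code from the attached patch: the class `NullSafeChildren` and the method `countNodes` are hypothetical names, and a plain `List<String>` stands in for the value returned by `listChildrenAndWatchForNewChildren`. The only point is that a null listing has to be checked before the for-each loop.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; not the code from the attached patch.
final class NullSafeChildren {

    /**
     * Counts the non-empty node names in a child listing.
     * Returns -1 when the listing itself is null, so the caller can decide
     * how to react instead of hitting an NPE in the loop.
     */
    static int countNodes(List<String> nodes) {
        if (nodes == null) {
            return -1; // listing unavailable: iterating here would throw an NPE
        }
        int count = 0;
        for (String node : nodes) {
            if (node != null && !node.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countNodes(Arrays.asList("region-a", "region-b")));
        System.out.println(countNodes(null));
    }
}
```

How the caller should react to the null case (retry or abort) is a separate decision; the sketch only signals it with a sentinel value.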

      1. HBASE-5733.patch
        5 kB
        Uma Maheswara Rao G
      2. HBASE-5733.patch
        5 kB
        Uma Maheswara Rao G
      3. HBASE-5733.patch
        5 kB
        Uma Maheswara Rao G

        Activity

        umamaheswararao Uma Maheswara Rao G added a comment -

        When we cannot get the children because of a ZK problem, we may not be able to mark this as a failover, since there are no nodes.
        In fact, it currently throws an NPE. Do we need to shut down the master in this case, or can we retry?

        zhihyu@ebaysf.com Ted Yu added a comment -

        We should retry in this scenario.

        ram_krish ramkrishna.s.vasudevan added a comment -

        It is already a RecoverableZooKeeper, right? So retrying again may be redundant.

        stack stack added a comment -

        If we can't get to ZK, then all bets are off (as Ram says, on connection-loss issues RZK will retry under the covers).
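
For readers unfamiliar with the idea of a wrapper that retries "under the covers", here is a rough Java sketch of a bounded retry loop. It is a generic illustration, not RecoverableZooKeeper's actual implementation: `BoundedRetry`, `TransientException`, and `callWithRetries` are hypothetical names, and the backoff sleep a real wrapper would normally use is omitted to keep the example short.

```java
import java.util.function.Supplier;

// Generic bounded-retry sketch; all names here are hypothetical.
final class BoundedRetry {

    /** Hypothetical marker for a transient failure such as a connection loss. */
    static final class TransientException extends RuntimeException {
        TransientException(String message) {
            super(message);
        }
    }

    /**
     * Runs op up to maxAttempts times, retrying only on TransientException.
     * Rethrows the last transient failure once the attempts are used up.
     */
    static <T> T callWithRetries(Supplier<T> op, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.get();
            } catch (TransientException e) {
                last = e; // transient failure: fall through and try again
            }
        }
        throw last; // every attempt failed, so the caller has to give up
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetries(() -> {
            if (++calls[0] < 3) {
                throw new TransientException("connection loss");
            }
            return "children";
        }, 5);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

If a wrapper like this has already used up its attempts, a second retry loop in the caller adds little, which is the argument made in this thread for aborting instead.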

        zhihyu@ebaysf.com Ted Yu added a comment -

        @Uma:
        Can you generate a patch for trunk?
        I got the following when I tried to apply your patch to trunk:

        [ERROR] /Users/zhihyu/trunk-hbase/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java:[495,75] unreported exception com.google.protobuf.ServiceException; must be caught or declared to be thrown
        
        umamaheswararao Uma Maheswara Rao G added a comment -

        Thanks a lot, Ted, for taking a look!
        Yep, I accidentally uploaded a slightly older patch than today's. I have now uploaded the latest one, which I tested on a real cluster to confirm that the master aborts in this situation.

        zhihyu@ebaysf.com Ted Yu added a comment -

        testProcessDeadServersAndRegionsInTransitionShouldNotFailWithNPE failed without the patch and passes with the patch.

        zhihyu@ebaysf.com Ted Yu added a comment -

        Minor comment:
        A similar sentence appears three times below:

        +      LOG.fatal("Problem in getting the children from ZK. Going to abort");
        +      master.abort("Problem in getting the children from ZK", new IOException(
        +          "Failed to get the children from ZK"));
        +      return;
        

        Can "Failed to get the children from ZK" be shared?
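
One way to share the message is a single constant used by a small helper. The Java sketch below is a guess at the shape of such a refactoring, not the updated patch itself: `SharedAbortMessage`, `AbortHook`, and `abortOnMissingChildren` are hypothetical names, and `AbortHook` only stands in for whatever object exposes `abort(String, Throwable)` on the master.

```java
import java.io.IOException;

// Illustrative sketch only; names are hypothetical, not taken from the patch.
final class SharedAbortMessage {

    /** Single definition of the message that was repeated in the snippet above. */
    static final String FAILED_TO_GET_CHILDREN = "Failed to get the children from ZK";

    /** Hypothetical stand-in for the master's abort hook. */
    interface AbortHook {
        void abort(String why, Throwable cause);
    }

    /** Aborts with the shared message as both the reason and the exception text. */
    static void abortOnMissingChildren(AbortHook master) {
        master.abort(FAILED_TO_GET_CHILDREN, new IOException(FAILED_TO_GET_CHILDREN));
    }

    public static void main(String[] args) {
        abortOnMissingChildren((why, cause) ->
            System.out.println(why + " / " + cause.getMessage()));
    }
}
```

With the text defined once, each call site reduces to one call to the helper followed by `return`.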

        umamaheswararao Uma Maheswara Rao G added a comment -

        Thanks a lot Ted for the reviews!
        Updated the patch with your suggestion.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522805/HBASE-5733.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1538//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1538//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1538//console

        This message is automatically generated.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522828/HBASE-5733.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestServerCustomProtocol

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1540//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1540//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1540//console

        This message is automatically generated.

        umamaheswararao Uma Maheswara Rao G added a comment -

        The test failure and the findbugs warnings are unrelated to this change.

        I ran the test several times. It failed once in 10 runs without the patch.
        I will look into the test failure separately, as it is not related.

        stack stack added a comment -

        Patch looks good to me. I like the test. The LOG.fatal is redundant, since the master abort already logs at FATAL. Otherwise the patch is good.

        umamaheswararao Uma Maheswara Rao G added a comment -

        Yeah, I just saw that in the logs on a real cluster in this situation. I will remove the explicit FATAL log here.

        2012-04-17 11:18:39,353 FATAL org.apache.hadoop.hbase.master.AssignmentManager: Problem in getting the children from ZK. Going to abort
        2012-04-17 11:18:39,354 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
        2012-04-17 11:18:39,354 FATAL org.apache.hadoop.hbase.master.HMaster: Problem in getting the children from ZK
        java.io.IOException: Failed to get the children from ZK
        at org.apache.hadoop.hbase.master.AssignmentManager.processDeadServersAndRegionsInTransition(AssignmentManager.java:398)
        at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:347)
        at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:537)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:343)
        at java.lang.Thread.run(Thread.java:662)
        2012-04-17 11:18:39,355 INFO org.apache.hadoop.hbase.master.HMaster: Aborting

        umamaheswararao Uma Maheswara Rao G added a comment -

        Attached the same patch as before, with the FATAL log removed.

        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12522970/HBASE-5733.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 4 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1550//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1550//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1550//console

        This message is automatically generated.

        umamaheswararao Uma Maheswara Rao G added a comment -

        There are no test failures; some tests were skipped, but that is unrelated to this change. The findbugs warnings are also unrelated.

        zhihyu@ebaysf.com Ted Yu added a comment -

        From Hadoop QA test output, I didn't find the hanging test.

        Integrated to trunk.

        Thanks for the patch Uma.

        Thanks for the review, Stack.

        hudson Hudson added a comment -

        Integrated in HBase-TRUNK #2779 (See https://builds.apache.org/job/HBase-TRUNK/2779/)
        HBASE-5733 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE (Uma Maheswara Rao G) (Revision 1327364)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        hudson Hudson added a comment -

        Integrated in HBase-TRUNK-security #174 (See https://builds.apache.org/job/HBase-TRUNK-security/174/)
        HBASE-5733 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE (Uma Maheswara Rao G) (Revision 1327364)

        Result = FAILURE
        tedyu :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        umamaheswararao Uma Maheswara Rao G added a comment -

        Since it got committed, marking it as closed.

        ram_krish ramkrishna.s.vasudevan added a comment -

        I think it's better that we also commit it to 0.94.1 before Lars takes the RC.

        ram_krish ramkrishna.s.vasudevan added a comment -

        Reopening so that once committed to other versions we can close it.

        ram_krish ramkrishna.s.vasudevan added a comment -

        Committed to 0.94 and 0.92. Hence resolving it.

        hudson Hudson added a comment -

        Integrated in HBase-0.94 #233 (See https://builds.apache.org/job/HBase-0.94/233/)
        HBASE-5733 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE. (Uma) (Revision 1344352)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        hudson Hudson added a comment -

        Integrated in HBase-0.92 #433 (See https://builds.apache.org/job/HBase-0.92/433/)
        HBASE-5733 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE. (Uma) (Revision 1344354)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        hudson Hudson added a comment -

        Integrated in HBase-0.94-security #33 (See https://builds.apache.org/job/HBase-0.94-security/33/)
        HBASE-5733 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE. (Uma) (Revision 1344352)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        hudson Hudson added a comment -

        Integrated in HBase-0.92-security #109 (See https://builds.apache.org/job/HBase-0.92-security/109/)
        HBASE-5733 AssignmentManager#processDeadServersAndRegionsInTransition can fail with NPE. (Uma) (Revision 1344354)

        Result = SUCCESS
        ramkrishna :
        Files :

        • /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java

          People

          • Assignee: umamaheswararao Uma Maheswara Rao G
          • Reporter: umamaheswararao Uma Maheswara Rao G
          • Votes: 0
          • Watchers: 5
