HBASE-5882

Process RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.90.6, 0.92.1, 0.94.0
    • Fix Version/s: 0.95.0
    • Component/s: None
    • Labels: None
    • Hadoop Flags: Reviewed

      Description

      Currently, when the master restarts and runs processRIT, any region found on a dead server is not given a new assignment; it is left for the timeout monitor to take care of.
      This case is more prominent if the node is found in RS_ZK_REGION_OPENING state. I think we can handle this by triggering a new assignment with a new plan.
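
The proposal above can be sketched as a small decision function. This is a simplified, hypothetical model and not the HBase code: the class name, the Action enum, and the use of plain strings for server names are all assumptions made to keep the example self-contained.

```java
import java.util.Collections;
import java.util.Set;

/** Hypothetical, simplified model of the proposed behaviour; not the HBase implementation. */
public class RitRecoverySketch {

    /** Possible outcomes when the restarted master finds a region in transition. */
    public enum Action { REASSIGN_WITH_NEW_PLAN, LEAVE_IN_TRANSITION }

    /**
     * Decide what to do with a region whose ZooKeeper node names `server`
     * as the region server that was opening it.
     */
    public static Action onRegionInTransition(String server, Set<String> deadServers) {
        // A dead server can never finish the open, so there is nothing to
        // gain by waiting: assign the region again with a fresh plan.
        if (deadServers.contains(server)) {
            return Action.REASSIGN_WITH_NEW_PLAN;
        }
        // The server is alive, so the in-flight open is left alone.
        return Action.LEAVE_IN_TRANSITION;
    }

    public static void main(String[] args) {
        // Placeholder server names, chosen only for illustration.
        Set<String> dead = Collections.singleton("rs1");
        System.out.println(onRegionInTransition("rs1", dead));
        System.out.println(onRegionInTransition("rs2", dead));
    }
}
```

The point of the change is the first branch: without it, a region stuck on a dead server stays unassigned until the timeout monitor fires.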

      1. HBASE-5882_v6.patch (9 kB, ramkrishna.s.vasudevan)
      2. HBASE-5882_v5.patch (4 kB, ramkrishna.s.vasudevan)
      3. hbase_5882.patch (4 kB, Ashutosh Jindal)
      4. hbase_5882_V4.patch (4 kB, ramkrishna.s.vasudevan)
      5. hbase_5882_V3.patch (4 kB, Ashutosh Jindal)
      6. hbase_5882_V2.patch (4 kB, Ashutosh Jindal)

        Activity

        Ashutosh Jindal added a comment -

        Submitted patch for 0.96. Please review and provide your suggestions/comments.

        Ted Yu added a comment -

        Idea is good.

        +  private boolean wasOpeningOnDeadServer(ServerName sn,
        +      Map<ServerName, List<Pair<HRegionInfo, Result>>> deadServers) {
        +    if (deadServers.keySet().contains(sn)) {
        

        The above method doesn't check whether regionInfo is in opening state. So the name of method should be changed accordingly.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12527664/hbase_5882.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 31 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.master.TestSplitLogManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1891//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1891//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1891//console

        This message is automatically generated.

        ramkrishna.s.vasudevan added a comment -

        @Ted
        Is the name 'wasOnDeadServer' OK?
        The original name was chosen because this change is done for the RS_ZK_REGION_OPENING state. Based on your suggestion I can change it and commit it.

        ramkrishna.s.vasudevan added a comment -

        Will commit today unless there are objections. Will rename the method to 'wasOnDeadServer'. Please share any comments.

        Ted Yu added a comment -

        wasOnDeadServer is Okay.

        Ashutosh Jindal added a comment -

        Submitted updated patch. Please review and provide suggestions/comments.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12528004/hbase_5882_V2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.TestRegionRebalancing
        org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
        org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
        org.apache.hadoop.hbase.master.TestAssignmentManager

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1925//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1925//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1925//console

        This message is automatically generated.

        ramkrishna.s.vasudevan added a comment -

        The latest test case failure in TestAssignmentManager is due to the impact of the testcase that went in HBASE-5927. A small tweak will make it work.

        Ashutosh Jindal added a comment -

        Updated patch for 0.96.

        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12528025/hbase_5882_V3.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop23. The patch compiles against the hadoop 0.23.x profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        +1 javac. The applied patch does not increase the total number of javac compiler warnings.

        -1 findbugs. The patch appears to introduce 32 new Findbugs (version 1.3.9) warnings.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.master.TestSplitLogManager
        org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1928//testReport/
        Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1928//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1928//console

        This message is automatically generated.

        Ted Yu added a comment -
        +        } else if (wasOnDeadServer(sn, deadServers)){
        

        Since regionInfo is not one of the parameters to wasOnDeadServer(), the method name still doesn't make sense.
        I think we can place the check (deadServers.keySet().contains(sn)) directly in the condition above. That way there is no need to introduce a new method.

        ramkrishna.s.vasudevan added a comment -

        Updated patch addressing Ted's comments.
        I can commit this if the patch is ok.

        Ted Yu added a comment -

        I don't see what is different in patch v4 compared to patch v3.

        stack added a comment -

        Patch looks good to me.

        For the next time, instead of

        +    if (deadServers.keySet().contains(sn)) {
        +      return true;
        +    }
        +    return false;
        

        Why not just

        return deadServers.keySet().contains(sn);
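
As a self-contained illustration of the simplification above, the sketch below compares the two forms. It is not taken from the patch: String stands in for ServerName, List<String> stands in for the List<Pair<HRegionInfo, Result>> values, and the class and method names are hypothetical. It also shows Map.containsKey, a standard java.util.Map method that gives the same answer as keySet().contains.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in types; only the shape of the check mirrors the thread. */
public class DeadServerCheckSketch {

    /** The verbose form: an if/else that only forwards a boolean. */
    public static boolean verbose(String sn, Map<String, List<String>> deadServers) {
        if (deadServers.keySet().contains(sn)) {
            return true;
        }
        return false;
    }

    /** The direct form: return the boolean expression itself. */
    public static boolean direct(String sn, Map<String, List<String>> deadServers) {
        // containsKey is equivalent to keySet().contains for any Map.
        return deadServers.containsKey(sn);
    }

    public static void main(String[] args) {
        Map<String, List<String>> deadServers = new HashMap<>();
        deadServers.put("rs1", Collections.singletonList("region-a"));
        System.out.println(verbose("rs1", deadServers) == direct("rs1", deadServers));
        System.out.println(verbose("rs2", deadServers) == direct("rs2", deadServers));
    }
}
```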
        
        ramkrishna.s.vasudevan added a comment -

        Ah, sorry. I uploaded the wrong one from my machine.

        ramkrishna.s.vasudevan added a comment -

        @Ted
        If you are OK with v5, I can commit it tomorrow. Thanks. Going to bed now.

        Ted Yu added a comment -

        Patch v5 looks good.

        ramkrishna.s.vasudevan added a comment -

        Committed to trunk. Thanks for the patch Ashutosh.
        Thanks for the review Stack and Ted.

        Hudson added a comment -

        Integrated in HBase-TRUNK #2908 (See https://builds.apache.org/job/HBase-TRUNK/2908/)
        HBASE-5882 Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor (Ashutosh) (Revision 1340392)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #11 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/11/)
        HBASE-5882 Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor (Ashutosh) (Revision 1340392)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        ramkrishna.s.vasudevan added a comment -

        Currently reverted as TestAssignmentManager needs some clean up.

        Hudson added a comment -

        Integrated in HBase-TRUNK #2909 (See https://builds.apache.org/job/HBase-TRUNK/2909/)
        HBASE-5882 (Revert) TestAssginmentManager needs some cleanup (Revision 1340422)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #12 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/12/)
        HBASE-5882 (Revert) TestAssginmentManager needs some cleanup (Revision 1340422)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        ramkrishna.s.vasudevan added a comment -

        Just a clean up in TestAssignmentManager.java.
        Restored the default class of LoadBalancer.

        ramkrishna.s.vasudevan added a comment -

        Committed the patch. Hence resolving this.

        Hudson added a comment -

        Integrated in HBase-TRUNK #2910 (See https://builds.apache.org/job/HBase-TRUNK/2910/)
        HBASE-5882 Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor (Ashutosh) (Revision 1341110)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #13 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/13/)
        HBASE-5882 Prcoess RIT on master restart can try assigning the region if the region is found on a dead server instead of waiting for Timeout Monitor (Ashutosh) (Revision 1341110)

        Result = FAILURE
        ramkrishna :
        Files :

        • /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
        • /hbase/trunk/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
        ramkrishna.s.vasudevan added a comment -

        Committed to trunk only.

        stack added a comment -

        Marking closed.


          People

          • Assignee: Ashutosh Jindal
          • Reporter: ramkrishna.s.vasudevan
          • Votes: 0
          • Watchers: 10
