HBase
  1. HBase
  2. HBASE-6260

balancer state should be stored in ZK

    Details

    • Type: Task Task
    • Status: Resolved
    • Priority: Blocker Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.95.2
    • Fix Version/s: None
    • Component/s: master, Zookeeper
    • Labels:
      None
    1. HBASE-6260-v2.patch
      31 kB
      Gregory Chanan
    2. HBASE-6260-addendum2.patch
      0.7 kB
      Gregory Chanan
    3. HBASE-6260-addendum2.patch
      0.7 kB
      Gregory Chanan
    4. HBASE-6260-addendum.patch
      1 kB
      Gregory Chanan
    5. HBASE-6260.patch
      31 kB
      Gregory Chanan
    6. 6260-addendum-3.txt
      1 kB
      Ted Yu

      Issue Links

        Activity

        Gregory Chanan created issue -
        Hide
        Jonathan Hsieh added a comment -

        While we are at it, we should add a method that only queries and returns the current state of the balancer. (and a new command in the shell). Currently it seems like we have to toggle it to the old value and then toggle it back which is annoying and potentially race prone.

        Show
        Jonathan Hsieh added a comment - While we are at it, we should add a method that only queries and returns the current state of the balancer. (and a new command in the shell). Currently it seems like we have to toggle it to the old value and then toggle it back which is annoying and potentially race prone.
        Hide
        Gregory Chanan added a comment -

        @Jon:

        Have you seen HBASE-5953?

        Show
        Gregory Chanan added a comment - @Jon: Have you seen HBASE-5953 ?
        Hide
        Jonathan Hsieh added a comment -

        @Greg
        Did not. Now that I have, its great that it has been done!

        Show
        Jonathan Hsieh added a comment - @Greg Did not. Now that I have, its great that it has been done!
        Hide
        Gregory Chanan added a comment -

        Marking as blocker for now, if we can figure out how to do it later without breaking clients, it doesn't need to be a blocker.

        Show
        Gregory Chanan added a comment - Marking as blocker for now, if we can figure out how to do it later without breaking clients, it doesn't need to be a blocker.
        Gregory Chanan made changes -
        Field Original Value New Value
        Priority Major [ 3 ] Blocker [ 1 ]
        Hide
        Gregory Chanan added a comment -

        I haven't run this through the test suite yet, let's see what HadoopQA says.

        • Master reads balanceSwitch state from ZK
        • Adds two test cases:
          1. If balancer is on and master dies, and new master takes over, balancer is still running
          2. 1. If balancer is off and master dies, and new master takes over, balancer is still not running
        Show
        Gregory Chanan added a comment - Attached HBASE-6260 .patch * I haven't run this through the test suite yet, let's see what HadoopQA says. Master reads balanceSwitch state from ZK Adds two test cases: 1. If balancer is on and master dies, and new master takes over, balancer is still running 2. 1. If balancer is off and master dies, and new master takes over, balancer is still not running
        Gregory Chanan made changes -
        Attachment HBASE-6260.patch [ 12544574 ]
        Hide
        stack added a comment -

        +1 on patch.

        ClientProtos.java change looks gratuitous. Leave it out of patch on commit?

        Do you think we need to add the PB_MAGIC on the front of the data we write to zk? Its only a boolean so wouldn't be the end of the world if missing. Most of the other pb serializations to pb seem to have it though?

        Show
        stack added a comment - +1 on patch. ClientProtos.java change looks gratuitous. Leave it out of patch on commit? Do you think we need to add the PB_MAGIC on the front of the data we write to zk? Its only a boolean so wouldn't be the end of the world if missing. Most of the other pb serializations to pb seem to have it though?
        Gregory Chanan made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Hide
        Gregory Chanan added a comment -

        @stack:

        • I don't know what's going on with ClientProtos.java. I've noticed that a few times when I regenerate, the generated code changes. I'll leave it out.
        • I think PB_MAGIC is a good idea. I noticed without it, an empty (newly-created) znode is valid for the purposes of mergeFrom. Not likely to happen, but probably best to have it throw an error in that case.
        Show
        Gregory Chanan added a comment - @stack: I don't know what's going on with ClientProtos.java. I've noticed that a few times when I regenerate, the generated code changes. I'll leave it out. I think PB_MAGIC is a good idea. I noticed without it, an empty (newly-created) znode is valid for the purposes of mergeFrom. Not likely to happen, but probably best to have it throw an error in that case.
        Hide
        Gregory Chanan added a comment -

        Add PB_MAGIC prefix.

        Show
        Gregory Chanan added a comment - Attached HBASE-6260 -v2.patch * Add PB_MAGIC prefix.
        Gregory Chanan made changes -
        Attachment HBASE-6260-v2.patch [ 12545054 ]
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12545054/HBASE-6260-v2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        +1 tests included. The patch appears to include 3 new or modified tests.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 javac. The patch appears to cause mvn compile goal to fail.

        -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
        org.apache.hadoop.hbase.replication.TestReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2859//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2859//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545054/HBASE-6260-v2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2859//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2859//console This message is automatically generated.
        Hide
        Gregory Chanan added a comment -

        I ran TestForceCacheImportantBlocks locally and it passed. TestReplication has been failing for a long time.

        Show
        Gregory Chanan added a comment - I ran TestForceCacheImportantBlocks locally and it passed. TestReplication has been failing for a long time.
        Hide
        stack added a comment -

        +1

        Checked PB_MAGIC is included serializing and managed deserializing. Looks good.

        Show
        stack added a comment - +1 Checked PB_MAGIC is included serializing and managed deserializing. Looks good.
        Hide
        Gregory Chanan added a comment -

        Thanks for the review, stack. Committed to trunk.

        Show
        Gregory Chanan added a comment - Thanks for the review, stack. Committed to trunk.
        Gregory Chanan made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Ted Yu added a comment -
        + * Tracks the load balancer switch up in ZK
        

        Probably you should make the above comment more general - LoadBalancerTracker may cover more than switch status.

        +      // is data in ZK is null, use default of on.
        

        Typo: 'is' -> 'if'

        Show
        Ted Yu added a comment - + * Tracks the load balancer switch up in ZK Probably you should make the above comment more general - LoadBalancerTracker may cover more than switch status. + // is data in ZK is null , use default of on. Typo: 'is' -> 'if'
        Hide
        Gregory Chanan added a comment -

        @Ted:

        attached an addendum. Let me know what you think.

        Show
        Gregory Chanan added a comment - @Ted: attached an addendum. Let me know what you think.
        Gregory Chanan made changes -
        Attachment HBASE-6260-addendum.patch [ 12545079 ]
        Hide
        stack added a comment -

        Just commit G.

        Show
        stack added a comment - Just commit G.
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #172 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/172/)
        HBASE-6260 balancer state should be stored in ZK (Revision 1384593)

        Result = FAILURE
        gchanan :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/LoadBalancerProtos.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
        • /hbase/trunk/hbase-server/src/main/protobuf/LoadBalancer.proto
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #172 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/172/ ) HBASE-6260 balancer state should be stored in ZK (Revision 1384593) Result = FAILURE gchanan : Files : /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/LoadBalancerProtos.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java /hbase/trunk/hbase-server/src/main/protobuf/LoadBalancer.proto /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #3329 (See https://builds.apache.org/job/HBase-TRUNK/3329/)
        HBASE-6260 balancer state should be stored in ZK (Revision 1384593)

        Result = FAILURE
        gchanan :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/LoadBalancerProtos.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java
        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java
        • /hbase/trunk/hbase-server/src/main/protobuf/LoadBalancer.proto
        • /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #3329 (See https://builds.apache.org/job/HBase-TRUNK/3329/ ) HBASE-6260 balancer state should be stored in ZK (Revision 1384593) Result = FAILURE gchanan : Files : /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/protobuf/generated/LoadBalancerProtos.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java /hbase/trunk/hbase-server/src/main/protobuf/LoadBalancer.proto /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterFailover.java
        Hide
        Ted Yu added a comment -

        Addendum looks good.

        I saw the following in both trunk builds, including HBase-TRUNK-on-Hadoop-2.0.0:

        Failed tests:   testRegionTransitionOperations(org.apache.hadoop.hbase.coprocessor.TestMasterObserver): Coprocessor should be called on region rebalancing
        

        Please fix the above.

        Show
        Ted Yu added a comment - Addendum looks good. I saw the following in both trunk builds, including HBase-TRUNK-on-Hadoop-2.0.0: Failed tests: testRegionTransitionOperations(org.apache.hadoop.hbase.coprocessor.TestMasterObserver): Coprocessor should be called on region rebalancing Please fix the above.
        Gregory Chanan made changes -
        Attachment HBASE-6260-addendum2.patch [ 12545100 ]
        Hide
        Gregory Chanan added a comment -

        Attached addendum2 which fixes the test issue.

        Show
        Gregory Chanan added a comment - Attached addendum2 which fixes the test issue.
        Gregory Chanan made changes -
        Resolution Fixed [ 1 ]
        Status Resolved [ 5 ] Reopened [ 4 ]
        Hide
        Gregory Chanan added a comment -

        Reattaching to try to kick off HadoopQA again.

        Show
        Gregory Chanan added a comment - Reattaching to try to kick off HadoopQA again.
        Gregory Chanan made changes -
        Attachment HBASE-6260-addendum2.patch [ 12545107 ]
        Gregory Chanan made changes -
        Status Reopened [ 4 ] Patch Available [ 10002 ]
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12545107/HBASE-6260-addendum2.patch
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 javac. The patch appears to cause mvn compile goal to fail.

        -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:
        org.apache.hadoop.hbase.regionserver.TestAtomicOperation
        org.apache.hadoop.hbase.replication.TestReplication

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2867//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2867//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545107/HBASE-6260-addendum2.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.regionserver.TestAtomicOperation org.apache.hadoop.hbase.replication.TestReplication Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2867//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2867//console This message is automatically generated.
        Hide
        Ted Yu added a comment -

        Addendum v3 combines the first two addenda

        Show
        Ted Yu added a comment - Addendum v3 combines the first two addenda
        Ted Yu made changes -
        Attachment 6260-addendum-3.txt [ 12545168 ]
        Hide
        Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12545168/6260-addendum-3.txt
        against trunk revision .

        +1 @author. The patch does not contain any @author tags.

        -1 tests included. The patch doesn't appear to include any new or modified tests.
        Please justify why no new tests are needed for this patch.
        Also please list what manual steps were performed to verify this patch.

        +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.

        +1 javadoc. The javadoc tool did not generate any warning messages.

        -1 javac. The patch appears to cause mvn compile goal to fail.

        -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail.

        +1 release audit. The applied patch does not increase the total number of release audit warnings.

        -1 core tests. The patch failed these unit tests:

        Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2870//testReport/
        Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2870//console

        This message is automatically generated.

        Show
        Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545168/6260-addendum-3.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. +1 javadoc. The javadoc tool did not generate any warning messages. -1 javac. The patch appears to cause mvn compile goal to fail. -1 findbugs. The patch appears to cause Findbugs (version 1.3.9) to fail. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2870//testReport/ Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2870//console This message is automatically generated.
        Hide
        Gregory Chanan added a comment -

        HadoopQA hung in a couple tests:

        dev-support/findHangingTest.sh https://builds.apache.org/job/PreCommit-HBASE-Build/2870//console
          % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                         Dload  Upload   Total   Spent    Left  Speed
        100  159k    0  159k    0     0   563k      0 --:--:-- --:--:-- --:--:-- 1413k
        Hanging test: Running org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
        Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck
        

        Ran the unit tests locally and they passed:

        Running org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap
        Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.809 sec
        
        Running org.apache.hadoop.hbase.util.TestHBaseFsck
        Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 210.967 sec
        
        Show
        Gregory Chanan added a comment - HadoopQA hung in a couple tests: dev-support/findHangingTest.sh https://builds.apache.org/job/PreCommit-HBASE-Build/2870//console % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 159k 0 159k 0 0 563k 0 --:--:-- --:--:-- --:--:-- 1413k Hanging test: Running org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap Hanging test: Running org.apache.hadoop.hbase.util.TestHBaseFsck Ran the unit tests locally and they passed: Running org.apache.hadoop.hbase.util.hbck.TestOfflineMetaRebuildOverlap Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.809 sec Running org.apache.hadoop.hbase.util.TestHBaseFsck Tests run: 27, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 210.967 sec
        Hide
        Ted Yu added a comment -

        Thanks for being careful.

        I think addendum v3 is ready to go.

        Show
        Ted Yu added a comment - Thanks for being careful. I think addendum v3 is ready to go.
        Hide
        Gregory Chanan added a comment -

        Thanks for the review, Ted.

        Committed addendum-v3 to trunk.

        Show
        Gregory Chanan added a comment - Thanks for the review, Ted. Committed addendum-v3 to trunk.
        Gregory Chanan made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Resolution Fixed [ 1 ]
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #174 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/174/)
        HBASE-6260 ADDENDUM (fix test failure plus comment fixups) (Revision 1384954)

        Result = FAILURE
        gchanan :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #174 (See https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/174/ ) HBASE-6260 ADDENDUM (fix test failure plus comment fixups) (Revision 1384954) Result = FAILURE gchanan : Files : /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java
        Hide
        Hudson added a comment -

        Integrated in HBase-TRUNK #3333 (See https://builds.apache.org/job/HBase-TRUNK/3333/)
        HBASE-6260 ADDENDUM (fix test failure plus comment fixups) (Revision 1384954)

        Result = FAILURE
        gchanan :
        Files :

        • /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java
        Show
        Hudson added a comment - Integrated in HBase-TRUNK #3333 (See https://builds.apache.org/job/HBase-TRUNK/3333/ ) HBASE-6260 ADDENDUM (fix test failure plus comment fixups) (Revision 1384954) Result = FAILURE gchanan : Files : /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/LoadBalancerTracker.java
        Dave Latham made changes -
        Link This issue is duplicated by HBASE-3816 [ HBASE-3816 ]

          People

          • Assignee:
            Gregory Chanan
            Reporter:
            Gregory Chanan
          • Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development