Details

    • Type: Sub-task
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: master
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Before making the more significant changes to HMaster, the class would benefit greatly from some cleanup, commenting, and a bit of refactoring.

      One motivation is to nail down the initialization flow and comment each step. Another is to add a couple of new helper classes that break up functionality and reduce the size of HMaster (for example, pushing all filesystem operations into their own class). The last is to stop the practice of passing around references to HMaster everywhere and instead pass along only what is necessary.

      1. HBASE-2695-MasterStartupCleanup-v4.patch
        73 kB
        Jonathan Gray
      2. HBASE-2695-part1-masterstatus.patch
        74 kB
        Karthik Ranganathan
      3. HBASE-2695-part2.1-masterstatus.patch
        63 kB
        Karthik Ranganathan
      4. HBASE-2695-ZK-Master-FINAL-v4.patch
        146 kB
        Jonathan Gray

        Activity

        Karthik Ranganathan added a comment -

        This adds an interface called MasterStatus and another called ServerStatus.

        Right now, all methods of the HMaster used by other components live in one of the above interfaces. To make the review easier, this is a refactor-only change. The changes that follow will work on the actual logic and hence be much more contained (read: easier to review).
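
        A minimal sketch of the idea, not the code from the patch: the interface names MasterStatus and ServerStatus come from the comment above, but every method on them, the DemoMaster stand-in for HMaster, and the RegionBalancer consumer are assumptions made up for illustration. The point is that a component receives only the narrow interface it needs rather than a reference to the whole master.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Narrow views of the master. The method names are hypothetical.
interface ServerStatus {
    String getServerName();
}

interface MasterStatus {
    boolean isClusterShutdown();
}

// Stand-in for HMaster: implements both interfaces but also has
// methods that other components should not be able to reach.
class DemoMaster implements MasterStatus, ServerStatus {
    private final AtomicBoolean shutdownRequested = new AtomicBoolean(false);

    @Override
    public String getServerName() {
        return "master-1";
    }

    @Override
    public boolean isClusterShutdown() {
        return shutdownRequested.get();
    }

    // Not part of either interface, so consumers of MasterStatus cannot call it.
    void requestShutdown() {
        shutdownRequested.set(true);
    }
}

// Hypothetical consumer that depends on MasterStatus only, not on DemoMaster.
class RegionBalancer {
    private final MasterStatus status;

    RegionBalancer(MasterStatus status) {
        this.status = status;
    }

    boolean shouldRun() {
        return !status.isClusterShutdown();
    }
}
```

        Because RegionBalancer only sees MasterStatus, it can be unit-tested with a trivial implementation of that interface instead of a running master.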

        Jonathan Gray added a comment -

        HBASE-2695-part1-masterstatus.patch committed to 0.90_master_rewrite branch

        Karthik Ranganathan added a comment -

        Part 2 of master status... this patch removes many unnecessary methods from MasterStatus.

        Jonathan Gray added a comment -

        HBASE-2695-part2.1-masterstatus.patch committed to 0.90_master_rewrite branch

        Jonathan Gray added a comment -

        MasterStartupCleanup-v4 was committed to the branch.

        HBase Review Board added a comment -

        Message from: "Jonathan Gray" <jgray@apache.org>

        -----------------------------------------------------------
        This is an automatically generated e-mail. To reply, visit:
        http://review.hbase.org/r/387/
        -----------------------------------------------------------

        (Updated 2010-07-27 09:31:00.269382)

        Review request for hbase, stack and Karthik Ranganathan.

        Changes
        -------

        Just attaching to HBASE-2695 to see if it makes it to the lists now

        Summary
        -------

        This is the rest of the master cleanup and zookeeper cleanup. Everything is moved over to the new ZooKeeperWatcher, ZooKeeperListeners, ZKUtil/ZKAssign, etc...

        There is a second page to the diff linked at the bottom with lots of good stuff, don't miss it!

        Now on to the good stuff!

        This addresses bug HBASE-2695.
        http://issues.apache.org/jira/browse/HBASE-2695

        Diffs


        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/Abortable.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/MiniZooKeeperCluster.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/ServerController.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseEventHandler.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/HBaseExecutorService.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionData.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/executor/RegionTransitionEventData.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ActiveMasterManager.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/HMaster.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ProcessRegionOpen.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/RegionManager.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/ZKUnassignedWatcher.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterCloseRegionHandler.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/master/handler/MasterOpenRegionHandler.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/MasterAddressManager.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/regionserver/RSZookeeperUpdater.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ClusterStatusTracker.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RegionServerTracker.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/RootRegionTracker.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKAssign.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZKUtil.java 964617
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperNodeTracker.java PRE-CREATION
        branches/0.90_master_rewrite/src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java 964617
        branches/0.90_master_rewrite/src/main/resources/hbase-webapps/master/master.jsp 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/MiniHBaseCluster.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/TestMultiParallelPut.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/OOMEHMaster.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestMasterTransitions.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/master/TestRestartCluster.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/regionserver/TestMasterAddressManager.java 964617
        branches/0.90_master_rewrite/src/test/java/org/apache/hadoop/hbase/zookeeper/TestZooKeeperNodeTracker.java PRE-CREATION

        Diff: http://review.hbase.org/r/387/diff

        Testing
        -------

        Most unit tests passing. Still addressing remaining failures but most seem to be related to the fact that I was running multiple tests and ZK clusters were stomping on each other.

        Thanks,

        Jonathan

        Jonathan Gray added a comment -

        Committed HBASE-2695-ZK-Master-FINAL-v4 to master branch. This completes HBASE-2695 and HBASE-2696. Work now moving to HBASE-2697.

        Jonathan Gray added a comment -

        Complete in master rewrite branch, keeping as patch available until merge with trunk.

        stack added a comment -

        @Jon For what I've done so far, read the commit notes (do an svn log). There is still loads to do, but I think I'm up to speed now so we can tag-team hack. Here are some questions I have:

        + Q: Can I miss events like data changed znode data events?
        + Q: Why when we start a master do we assign root and meta?  That's a mistake?
        + Q: We let out IEs and KEs?  What are we going to do w/ them up in the higher levels of server?
        + ClusterStatusTracker is wrong.  Its znode is root region location rather than clusterStatusZNode (/hbase/shutdown)?
        + In zkassign, we need to fix how it takes regionname when it really wants encodedregionname -- error-prone; should just pass regioninfo every time.
        + I moved to always using event handlers to shut down regions inside the regionserver, rather than using handlers when the master asks that a region be shut down but some other code when the RS determines it needs to shut down.  For an RS-ordained shutdown (e.g. it noticed a cluster shutdown, or a manual shutdown was requested), the closeregionhandler does not report to zk, even though it's the close region handler that is running.  Any problem w/ that that you can see?  What if the master sends over a close region for a region whose close is already queued?
        
        Jonathan Gray added a comment -

        + Q: Can I miss events like data changed znode data events?

        What do you mean by "can you miss"? Do you mean is it possible, or do you mean will we still be okay if it happens? Which znode(s) specifically are you talking about? The unassigned ones? It's designed so that we never miss an end state: we can miss an individual intermediate event, but we are guaranteed to see the final state.
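
        A rough illustration of that guarantee, using an in-memory stand-in rather than a real ZooKeeper client. FakeZNode, StateFollower, and the state strings are all hypothetical; the only behaviour borrowed from ZooKeeper is that a watch is one-shot and the client re-reads the data (re-arming the watch) when it handles the notification.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for one znode with a one-shot watch.
class FakeZNode {
    private String data;
    private Runnable watch; // at most one armed watch in this sketch

    FakeZNode(String initial) {
        this.data = initial;
    }

    // Read the current data and arm a one-shot watch in the same step.
    String getDataAndWatch(Runnable watcher) {
        this.watch = watcher;
        return data;
    }

    // Change the data; fire the watch only if one is armed, then disarm it.
    void setData(String newData) {
        this.data = newData;
        Runnable armed = this.watch;
        this.watch = null;
        if (armed != null) {
            armed.run();
        }
    }
}

// Client that records every state it actually observes.
class StateFollower {
    final List<String> seen = new ArrayList<>();
    private final FakeZNode node;
    private boolean notified;

    StateFollower(FakeZNode node) {
        this.node = node;
        seen.add(node.getDataAndWatch(this::onEvent));
    }

    private void onEvent() {
        notified = true;
    }

    // Called whenever the client gets around to processing the notification.
    void handlePending() {
        if (!notified) {
            return;
        }
        notified = false;
        // Re-read and re-arm: this picks up the latest data, whatever it is.
        seen.add(node.getDataAndWatch(this::onEvent));
    }
}
```

        If the node changes twice before handlePending() runs, the follower never sees the first value, but it does see the second, which is the end state.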

        + Q: Why when we start a master do we assign root and meta? That's a mistake?

        Yeah, it's a mistake to do that on a failed-over master. What the assignRoot/assignMeta methods should do is check whether the region is already assigned and, if so, do nothing. If there is no current assignment, we should trigger one and then still wait as we do now.
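
        A sketch of the check-then-assign behaviour described above, under stated assumptions: RootLocation and RootAssigner are hypothetical classes, and setting the location directly stands in for "trigger an assignment and wait for it to complete". It is not the actual assignRoot implementation.

```java
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical record of which server currently hosts the root region.
class RootLocation {
    private final AtomicReference<String> server = new AtomicReference<>();

    Optional<String> current() {
        return Optional.ofNullable(server.get());
    }

    void set(String hostingServer) {
        server.set(hostingServer);
    }
}

class RootAssigner {
    private final RootLocation location;
    int assignmentsTriggered = 0;

    RootAssigner(RootLocation location) {
        this.location = location;
    }

    // Returns the server hosting root. Only triggers an assignment when
    // there is no existing one, so a failed-over master does nothing.
    String assignRoot(String candidateServer) {
        Optional<String> existing = location.current();
        if (existing.isPresent()) {
            return existing.get();
        }
        assignmentsTriggered++;
        // Stand-in for "trigger the assignment and wait until it completes".
        location.set(candidateServer);
        return candidateServer;
    }
}
```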

        + Q: We let out IEs and KEs? What are we going to do w/ them up in the higher levels of server?

        It becomes the responsibility of callers. In most cases, a KE results in an abort(). An IE needs to be handled accordingly (in general we don't expect most things to get interrupted; if they are, it's likely happening at shutdown).
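
        One way a caller could follow that policy, as a sketch only. It assumes KE and IE refer to KeeperException and InterruptedException. The KeeperException class below is a local stand-in so the example compiles without the ZooKeeper jar; Abortable.java appears in the diff list, but the method shape shown here is an assumption, and ZkOperation and ZkCaller are hypothetical.

```java
// Local stand-in for org.apache.zookeeper.KeeperException.
class KeeperException extends Exception {
    KeeperException(String message) {
        super(message);
    }
}

// Assumed shape of the Abortable interface.
interface Abortable {
    void abort(String why, Throwable cause);
}

// Hypothetical ZooKeeper operation that lets both exception types escape.
interface ZkOperation {
    void run() throws KeeperException, InterruptedException;
}

class ZkCaller {
    private final Abortable server;

    ZkCaller(Abortable server) {
        this.server = server;
    }

    // Returns true if the operation completed normally.
    boolean call(ZkOperation op) {
        try {
            op.run();
            return true;
        } catch (KeeperException ke) {
            // Unexpected ZooKeeper failure: abort the server.
            server.abort("Unexpected ZooKeeper error", ke);
            return false;
        } catch (InterruptedException ie) {
            // Most likely a shutdown: restore the flag so outer loops can exit.
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

        Restoring the interrupt flag rather than swallowing the exception lets whatever loop is running higher up notice the interruption and stop.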

        + ClusterStatusTracker is wrong. Its znode is root region location rather than clusterStatusZNode ( /hbase/shutdown)?

        Yes, it's wrong. We're not really doing anything with it, which is why that didn't matter, I guess. Like we've discussed, this should actually be used to trigger RS startup, rather than the presence of a master.

        + In zkassign, we need to fix how it takes regionname when it really wants encodedregionname -- error-prone; should just pass regioninfo every time.

        Sounds like a fine change to me. Just need to make sure we can always have a RegionInfo.
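
        To illustrate why passing the region object is less error-prone than passing a String, here is a small sketch. Everything in it is hypothetical: RegionInfo is a minimal stand-in (its encoded name is just a hex hash, not HBase's real encoding), and the /hbase/unassigned base path is an assumption.

```java
// Minimal stand-in for a region descriptor.
final class RegionInfo {
    private final String regionName;

    RegionInfo(String regionName) {
        this.regionName = regionName;
    }

    String getRegionName() {
        return regionName;
    }

    // Placeholder encoding: a hex hash of the full region name.
    String getEncodedName() {
        return Integer.toHexString(regionName.hashCode());
    }
}

final class UnassignedNodes {
    private static final String BASE = "/hbase/unassigned"; // assumed path

    private UnassignedNodes() {
    }

    // Taking RegionInfo instead of String means a caller cannot pass the
    // full region name where the encoded name is required.
    static String nodeFor(RegionInfo region) {
        return BASE + "/" + region.getEncodedName();
    }
}
```

        With a String parameter, both the region name and the encoded name would type-check, and only one of them would produce the right znode path.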

        + I moved to use event handlers to shutdown regions inside regionserver always.....

        Not sure I totally follow. Let's discuss further.

        Any chance of editing your previous comment and introducing some line breaks? I hate when these jiras go mega-wide.

        Good stuff. I'm still making my way through the svn log.

        stack added a comment -

        Committed as part of HBASE-2692.


          People

          • Assignee:
            Jonathan Gray
            Reporter:
            Jonathan Gray
          • Votes:
            0
            Watchers:
            1
