HBase / HBASE-1556

[testing] optimize minicluster based testing in the test suite

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.90.0
    • Component/s: test
    • Labels: None

      Description

      It is possible to tell junit to run all of the unit tests in a single forked JVM:

        <junit fork="yes" forkmode="once" ... >
        ...
        </junit>
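For a whole suite this might look something like the following in the build file (a sketch only: the target, property, and path names here are illustrative, not taken from the HBase build):

```xml
<target name="test" depends="compile-test">
  <!-- forkmode="once" runs every testcase in a single forked JVM,
       so statics survive from one test class to the next -->
  <junit fork="yes" forkmode="once" printsummary="yes" haltonfailure="no">
    <classpath refid="test.classpath"/>
    <formatter type="plain"/>
    <batchtest todir="${build.test}">
      <fileset dir="${src.test}" includes="**/Test*.java"/>
    </batchtest>
  </junit>
</target>
```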
      

      Then, use statics to manage miniclusters in background threads:

        protected static HBaseConfiguration conf = new HBaseConfiguration();
        protected static MiniZooKeeperCluster zooKeeperCluster;
        protected static MiniHBaseCluster hbaseCluster;
        protected static MiniDFSCluster dfsCluster;
        // directory used by the miniclusters; assigned in startDFS()
        protected static File testDir;
      
        public static boolean isMiniClusterRunning() {
          return hbaseCluster != null;
        }
      
        private static void startDFS() throws Exception {
          if (dfsCluster != null) {
            LOG.error("MiniDFSCluster already running");
            return;
          }
          Path path = new Path(
              conf.get("test.build.data", "test/build/data"), "MiniClusterTestCase");
          FileSystem testFS = FileSystem.get(conf);
          if (testFS.exists(path)) {
            testFS.delete(path, true);
          }
          testDir = new File(path.toString());
          dfsCluster = new MiniDFSCluster(conf, 2, true, (String[])null);
          FileSystem filesystem = dfsCluster.getFileSystem();
          conf.set("fs.default.name", filesystem.getUri().toString());     
          Path parentdir = filesystem.getHomeDirectory();
          conf.set(HConstants.HBASE_DIR, parentdir.toString());
          filesystem.mkdirs(parentdir);
          FSUtils.setVersion(filesystem, parentdir);
          LOG.info("started MiniDFSCluster in " + testDir.toString());
        }
      
        private static void stopDFS() {
          if (dfsCluster != null) try {
            dfsCluster.shutdown();
            dfsCluster = null;
          } catch (Exception e) {
            LOG.warn(StringUtils.stringifyException(e));
          }
        }
      
        private static void startZooKeeper() throws Exception {
          if (zooKeeperCluster != null) {
            LOG.error("ZooKeeper already running");
            return;
          }
          zooKeeperCluster = new MiniZooKeeperCluster();
          zooKeeperCluster.startup(testDir);
          LOG.info("started " + zooKeeperCluster.getClass().getName());
        }
      
        private static void stopZooKeeper() {
          if (zooKeeperCluster != null) try {
            zooKeeperCluster.shutdown();
            zooKeeperCluster = null;
          } catch (Exception e) {
            LOG.warn(StringUtils.stringifyException(e));
          }
        }
       
        private static void startHBase() throws Exception {
          if (hbaseCluster != null) {
            LOG.error("MiniHBaseCluster already running");
            return;
          }
          hbaseCluster = new MiniHBaseCluster(conf, 1);
          // opening the META table ensures that cluster is running
          new HTable(conf, HConstants.META_TABLE_NAME);
          LOG.info("started MiniHBaseCluster");
        }
       
        private static void stopHBase() {
          if (hbaseCluster != null) try {
            HConnectionManager.deleteConnectionInfo(conf, true);
            hbaseCluster.shutdown();
            hbaseCluster = null;
          } catch (Exception e) {
            LOG.warn(StringUtils.stringifyException(e));
          }
        }
      
        public static void startMiniCluster() throws Exception {
          try {
            startDFS();
            startZooKeeper();
            startHBase();
          } catch (Exception e) {
            stopHBase();
            stopZooKeeper();
            stopDFS();
            throw e;
          }
        }
      
        public static void stopMiniCluster() {
          stopHBase();
          stopZooKeeper();
          stopDFS();
        }
      

      The base class for cluster testing can do something like so in its setUp method:

        protected void setUp() throws Exception {
          // start the mini cluster if it is not running yet
          if (!isMiniClusterRunning()) {
            startMiniCluster();
          }
        }
      

      For example, when testing Stargate, it is clear that the minicluster startup costs are included in the run time of the first unit test, which checks if the miniclusters are all running, and subsequent tests do not incur those costs:

       
      test:
         [delete] Deleting directory /home/apurtell/src/stargate.git/build/test/logs
          [mkdir] Created dir: /home/apurtell/src/stargate.git/build/test/logs
          [junit] Running org.apache.hadoop.hbase.stargate.Test00MiniCluster
          [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 10.329 sec
          [junit] Running org.apache.hadoop.hbase.stargate.Test01VersionResource
          [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.243 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestCellModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.012 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestCellSetModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.018 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestRowModel
          [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.008 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestScannerModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.013 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestStorageClusterStatusModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.024 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestStorageClusterVersionModel
          [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.006 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestTableInfoModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.017 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestTableListModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.012 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestTableRegionModel
          [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.018 sec
          [junit] Running org.apache.hadoop.hbase.stargate.model.TestVersionModel
          [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.014 sec
      
      BUILD SUCCESSFUL
      Total time: 14 seconds
      

      This can obviously shave a lot of time off the current HBase test suite. However, the current suite would need to be heavily modified. Each test case has been written with the expectation that it starts up a pristine minicluster, so assumptions are made that would be invalidated, and many cases duplicate table creation done by others, etc.

        Activity

        stack added a comment -

        This issue will be addressed by hbase-410

        Andrew Purtell added a comment -

        Sure, close this issue.

        stack added a comment -

        @Andrew I was going to implement the general idea here of spinning up a cluster at the start of a test suite and keeping it up while the tests run, using the suggestion made in hbase-1276; i.e. JUnit 4's @BeforeClass and @AfterClass. If that's ok w/ you, can we close this issue as a subtask of hbase-410?
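        A minimal sketch of that @BeforeClass/@AfterClass arrangement, assuming the static startMiniCluster()/stopMiniCluster() helpers from the description (the class and method names here are illustrative, not from any actual patch):

```java
import org.junit.AfterClass;
import org.junit.BeforeClass;
import org.junit.Test;

public class TestWithSharedCluster {

  // Runs once, before the first @Test in this class.
  @BeforeClass
  public static void setUpCluster() throws Exception {
    startMiniCluster(); // hypothetical static helper from the description
  }

  // Runs once, after the last @Test in this class.
  @AfterClass
  public static void tearDownCluster() throws Exception {
    stopMiniCluster(); // hypothetical static helper from the description
  }

  @Test
  public void testAgainstRunningCluster() throws Exception {
    // ... exercise the shared minicluster here ...
  }
}
```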

        Andrew Purtell added a comment -

        HBASE-1828 is an issue where store level unit tests for filters pass but client side tests lock up. This is an indication that we need more client side tests for major functions. A persistent minicluster both helps and hurts here. It may make the suite fast enough to run the tests in a reasonable time frame, but bad state from a previous test case may manifest in a later one, causing confusion.

        stack added a comment -

        Making critical for 0.21. We have to have a better test story.

        stack added a comment -

        I'm ok with different framework. Your suggestion of per-package is good too. I was going to add some "judgements" made after spending 30 seconds comparing junit4, testng, and gridunit but have decided to withhold them till I've spent at least 5 minutes on each.

        Andrew Purtell added a comment -

        One way forkmode="once" and a parallel JUnit framework might work nicely together: execute each sub-package in the test suite on its own EC2 instance. Group related tests and their initializations together.

        Andrew Purtell added a comment -

        I do think a major refactoring if not what amounts to a rewrite is needed. One option is to continue to use JUnit3 as the test rig but look at the distributed extensions such as GridUnit or GridGain. However because so much would be changed anyway, moving to a different testing framework might make sense if there is something better out there.

        stack added a comment -

        OK. That works for me. How do you suggest we proceed? I'd actually like to rewrite the bulk of the unit tests so they are consistent, well-grouped, have redundancies removed, and are pertinent to current hbase.

        Andrew Purtell added a comment -

        No need to go to TestNG. (Not that I am opposed to that...) Currently I do this for the Stargate test suite:

          class MiniClusterShutdownThread extends Thread {
            public void run() {
              stopMiniCluster();
            }
          }
        
          protected void setUp() throws Exception {
            // start the mini cluster if it is not running yet
            if (!isMiniClusterRunning()) {
              startMiniCluster();
              Runtime.getRuntime().addShutdownHook(new MiniClusterShutdownThread());
            }
            // ...
          }
        

        Because the tests are run with forkmode="once", the initialization cost of spinning up the minicluster is borne by the first test only, and the shutdown thread runs when junit has no more testcases to execute to clean up everything.
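        The shutdown-hook mechanism itself can be seen in a self-contained demo (class name illustrative): the hook thread runs when the JVM begins an orderly shutdown, i.e. after the last testcase has finished under forkmode="once", which is what makes it a workable place for cluster teardown:

```java
public class ShutdownHookDemo {
  public static void main(String[] args) {
    // The hook runs when the JVM exits, after main() (or, under
    // forkmode="once", after the last junit testcase) has finished.
    Runtime.getRuntime().addShutdownHook(new Thread() {
      @Override
      public void run() {
        System.out.println("minicluster teardown");
      }
    });
    System.out.println("tests complete");
  }
}
```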

        stack added a comment -

        I can see how a unit test could start the mini cluster if not running, but how do you shut it down when the test suite is done? Do you know of a hook in the ant junit task to let you do this?

        I looked at junit 4.x. It has @Before and @After but seems good for the life of @Test only. Not good enough it would seem.

        This seems a little richer – http://testng.org/doc/documentation-main.html. See down a bit where it has annotations like @BeforeSuite and @AfterSuite.

        Andrew Purtell added a comment -

        @Nitay: Waiting for table delete is pretty expensive. It may be relatively clean to shut down the HBase minicluster, do the equivalent of 'hadoop fs -mv /hbase /hbase.<timestamp>' and then launch a new HBase minicluster. Trade space for time by using mv instead of rmr to avoid delete overhead.
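        In test code, that move-aside could look roughly like the following sketch (the method name and root path are illustrative, and the FileSystem handle is assumed to come from the running MiniDFSCluster):

```java
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Rename the HBase root out of the way rather than deleting it recursively,
// trading DFS space for the time a full recursive delete would take.
private static void moveAsideHBaseRoot(FileSystem fs) throws Exception {
  Path root = new Path("/hbase"); // illustrative root path
  if (fs.exists(root)) {
    fs.rename(root, new Path("/hbase." + System.currentTimeMillis()));
  }
}
```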

        Andrew Purtell added a comment - edited

        "GridGain" includes a distributed JUnit3 test framework: http://www.gridgainsystems.com/wiki/display/GG15UG/Distributed+JUnit+Overview

        There is also "GridUnit": http://sourceforge.net/projects/gridunit/

        Nitay Joffe added a comment -

        Great idea Andrew. On the assumptions side, what if we just delete all the data in DFS/HBase/ZK after each test? I'd expect the overhead from that shouldn't be as bad as re-spinning the servers as we currently do.

        Andrew Purtell added a comment -

        In addition we can consider some kind of "parallel/distributed junit" to divide up and assign unit tests to multiple runners on a cluster, if a test cluster environment is available. Also, EC2 friendliness, and maybe a driver script for EC2 that can spin up instances, assign out the tests, collect the results, tear down the instances, and mail the result to the tester or hbase-dev@.


          People

          • Assignee: Unassigned
          • Reporter: Andrew Purtell
          • Votes: 0
          • Watchers: 1
