Hadoop HDFS
  1. Hadoop HDFS
  2. HDFS-1456

Provide builder for constructing instances of MiniDFSCluster

    Details

    • Type: Improvement Improvement
    • Status: Closed
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.22.0
    • Component/s: test
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      Time to fix a broken window. Of the 293 occurences of "new MiniDFSCluster("... most look something like:

      cluster = new MiniDFSCluster(0, config, numDatanodes, true, false, true,  null, null, null, null);

      The largest constructor takes 10 parameters, and even the overloaded constructors can be difficult to read as their mainaly nulls or booleans.

      We should provide a Builder for constructing MiniDFSClusters to improve readability.

      1. HDFS-1456.patch
        10 kB
        Jakob Homan
      2. HDFS-1456-2.patch
        192 kB
        Jakob Homan
      3. HDFS-1456-for-1052-branch.patch
        191 kB
        Jakob Homan

        Activity

        Hide
        Jakob Homan added a comment -

        Patch creates a new Builder class for construction MiniDFSClusters. What before was:

            cluster = new MiniDFSCluster(0, conf, NUM_DATA_NODES, true, false, true, null,
                                          null, null, null);

        can now be expressed as

            cluster = new MiniDFSCluster.Builder(conf)
                                        .numDataNodes(NUM_DATA_NODES)
                                        .manageNameDfsDirs(false).build();

        I've converted a few instances to the new Builder. If people like this idea, I'll convert the rest, mainly through automation to avoid human error, but I wanted an easy-to-read patch before one filled with auto-refactoring. We can deprecate the MiniDFSConstructors as well.

        Show
        Jakob Homan added a comment - Patch creates a new Builder class for construction MiniDFSClusters. What before was: cluster = new MiniDFSCluster(0, conf, NUM_DATA_NODES, true, false, true, null, null, null, null); can now be expressed as cluster = new MiniDFSCluster.Builder(conf) .numDataNodes(NUM_DATA_NODES) .manageNameDfsDirs(false).build(); I've converted a few instances to the new Builder. If people like this idea, I'll convert the rest, mainly through automation to avoid human error, but I wanted an easy-to-read patch before one filled with auto-refactoring. We can deprecate the MiniDFSConstructors as well.
        Hide
        Philip Zeyliger added a comment -

        +1. I'm a big fan of this suggestion.

        One thing that might be interesting would be to "cache" MiniDFSClusters. I have a theory (not tested) that many of the tests would work just fine using the same cluster over and over again. If we had some sort of default cached instance, the tests might pass faster.

        Show
        Philip Zeyliger added a comment - +1. I'm a big fan of this suggestion. One thing that might be interesting would be to "cache" MiniDFSClusters. I have a theory (not tested) that many of the tests would work just fine using the same cluster over and over again. If we had some sort of default cached instance, the tests might pass faster.
        Hide
        Konstantin Boudnik added a comment -

        I'm yet to look at the patch but idea is +1 without any doubts.
        Speaking of faster running tests: if MiniDFSCluster can be fixed to be reusable by a set of consequent tests then we might be able to run these tests in same VM (think TestNG) and save on VM startup time for every test class.

        Last time I've tried to do this (almost 2 years ago) I failed because tests were leaving a lot of garbage after them (including semi-dead MiniDFSCluster in some cases) that clean exit was impossible.

        Show
        Konstantin Boudnik added a comment - I'm yet to look at the patch but idea is +1 without any doubts. Speaking of faster running tests: if MiniDFSCluster can be fixed to be reusable by a set of consequent tests then we might be able to run these tests in same VM (think TestNG) and save on VM startup time for every test class. Last time I've tried to do this (almost 2 years ago) I failed because tests were leaving a lot of garbage after them (including semi-dead MiniDFSCluster in some cases) that clean exit was impossible.
        Hide
        Konstantin Boudnik added a comment -

        At first I was about to suggest to name setters more traditionally but then I absolutely liked the

              cluster = new MiniDFSCluster.Builder(config)
                                          .manageDataDfsDirs(false)
                                          .manageNameDfsDirs(false).build();
        

        +1 on the patch.

        Show
        Konstantin Boudnik added a comment - At first I was about to suggest to name setters more traditionally but then I absolutely liked the cluster = new MiniDFSCluster.Builder(config) .manageDataDfsDirs(false) .manageNameDfsDirs(false).build(); +1 on the patch.
        Hide
        Jakob Homan added a comment -

        Here's a patch with a slightly refined builder and all the calls to the MiniDFSCluster constructors replaced with calls to the builder. Most of the refactoring was automatic since most of the calls to the MiniDFS follow a standard pattern. I noted that quite a large percentage of the calls weren't using the most efficient constructor available. This is a lot of code churn, but as they stands, the MiniDFSCluster constructors are so hideous, I think it's worth the change to get rid of them.
        All the tests pass, although TestBlockRecovery, TestBlockTokenWithDFS and TestPipelines are all pretty flaky both on OSX and Ubuntu.
        Test-patch:

             [exec] +1 overall.  
             [exec] 
             [exec]     +1 @author.  The patch does not contain any @author tags.
             [exec] 
             [exec]     +1 tests included.  The patch appears to include 424 new or modified tests.
             [exec] 
             [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
             [exec] 
             [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
             [exec] 
             [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
             [exec] 
             [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.
             [exec] 
             [exec]     +1 system tests framework.  The patch passed system tests framework compile.
        
        Show
        Jakob Homan added a comment - Here's a patch with a slightly refined builder and all the calls to the MiniDFSCluster constructors replaced with calls to the builder. Most of the refactoring was automatic since most of the calls to the MiniDFS follow a standard pattern. I noted that quite a large percentage of the calls weren't using the most efficient constructor available. This is a lot of code churn, but as they stands, the MiniDFSCluster constructors are so hideous, I think it's worth the change to get rid of them. All the tests pass, although TestBlockRecovery, TestBlockTokenWithDFS and TestPipelines are all pretty flaky both on OSX and Ubuntu. Test-patch: [exec] +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 424 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] [exec] +1 system tests framework. The patch passed system tests framework compile.
        Hide
        Boris Shkolnik added a comment -

        except for some leftover comments , like this (// cluster = new MiniDFSCluster.Builder(conf).numDataNodes(0).format(false))
        it looks good.

        I looked thru the patch at most of the substitutions , but since it so huge, I heavily rely on the fact that all the tests have passed.

        +1

        Show
        Boris Shkolnik added a comment - except for some leftover comments , like this (// cluster = new MiniDFSCluster.Builder(conf).numDataNodes(0).format(false)) it looks good. I looked thru the patch at most of the substitutions , but since it so huge, I heavily rely on the fact that all the tests have passed. +1
        Hide
        Jakob Homan added a comment -

        The commented-out calls were in the original code and got converted during the autorefactor. To minimize change - heh - I left them in. Thanks for the review. I've committed this. Resolving as fixed.

        Show
        Jakob Homan added a comment - The commented-out calls were in the original code and got converted during the autorefactor. To minimize change - heh - I left them in. Thanks for the review. I've committed this. Resolving as fixed.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Hdfs-trunk-Commit #411 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/411/)
        HDFS-1456. Provide builder for constructing instances of MiniDFSCluster.

        Show
        Hudson added a comment - Integrated in Hadoop-Hdfs-trunk-Commit #411 (See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/411/ ) HDFS-1456 . Provide builder for constructing instances of MiniDFSCluster.
        Hide
        Jakob Homan added a comment -

        Patch merged with the 1052 branch to be committed there. Same as trunk, just some lines didn't merge cleanly.

        Show
        Jakob Homan added a comment - Patch merged with the 1052 branch to be committed there. Same as trunk, just some lines didn't merge cleanly.

          People

          • Assignee:
            Jakob Homan
            Reporter:
            Jakob Homan
          • Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development