Hadoop Common / HADOOP-1436

Redesign Tool and ToolBase API and related functionality

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.15.0
    • Fix Version/s: 0.15.0
    • Component/s: util
    • Labels:
      None

      Description

      With the discussion from HADOOP-1425, we need better abstraction and better tool runner utilities.

      1. Classes do not need to extend ToolBase
      2. Functions for parsing generic Hadoop command-line options (-fs, -conf, -jt) should be public
      3. We need a ToolRunner, or similar functionality
      4. Also we need each class (implementing Tool) to be runnable (main method)
      5. CLI objects can be passed to run method of the Tool class (arguable)

      1. redesignToolAndRelated_v1.0.patch
        11 kB
        Enis Soztutar
      2. redesignToolAPI_v1.1.patch
        47 kB
        Enis Soztutar
      3. redesignToolAPI_v1.2.patch
        47 kB
        Enis Soztutar
      4. redesignToolAPI_v1.3.patch
        47 kB
        Enis Soztutar

        Activity

        Enis Soztutar added a comment -

        If I were a regular Hadoop user, I would prefer any issue that deprecates something to be listed under Incompatible changes. A Hadoop client could easily ignore the improvements, optimizations, and bug fixes sections and read only the new features and incompatible changes.

        I agree with the deprecation/removal cycle, but I do not see any counter-argument against listing both changes under incompatible changes. Note also that HADOOP-1621 is already listed in the incompatible changes section.

        Doug Cutting added a comment -

        > the changes.txt log should be in the INCOMPATIBLE CHANGES section rather than IMPROVEMENTS

        I don't think a deprecation is an incompatibility. It doesn't break user code.

        Our pattern is to try to deprecate things in one release, then remove them in a subsequent release. Then user code can run unchanged with each new release, so that the release may be easily evaluated. And, if after upgrading, you remove all uses of deprecated code, then you'll be able to upgrade to the next release. The question is, where in that cycle is the incompatible change?

        Arguably, for applications that play by the rules (removing use of deprecated features in the current release before upgrading to the next release) there are no incompatible changes in this cycle. Even if we do want to label such deprecation/removal cycles as incompatible changes, we should probably choose one event or the other, deprecation or removal, as the incompatible step. I'd probably opt for removal, not deprecation. Thoughts?

        Enis Soztutar added a comment -

        Doug, I think the changes.txt log entry should be in the INCOMPATIBLE CHANGES section rather than IMPROVEMENTS. The patch deprecates ToolBase, which is used heavily in Nutch.

        Doug Cutting added a comment -

        I just committed this. Thanks, Enis!

        Raghu Angadi added a comment -

        Looks good to me. I briefly went through the patch. We can commit.

        Doug Cutting added a comment -

        This looks reasonable to me. Does anyone have any objections, or should I commit this?

        Hadoop QA added a comment -

        +1

        http://issues.apache.org/jira/secure/attachment/12363918/redesignToolAPI_v1.3.patch applied and successfully tested against trunk revision r566467.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/561/testReport/
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/561/console

        Enis Soztutar added a comment -

        Patch updated to trunk.

        Hadoop QA added a comment -

        -1, build or testing failed

        2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12363572/redesignToolAPI_v1.2.patch against trunk revision r564804.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/542/testReport/
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/542/console

        Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

        Enis Soztutar added a comment -

        retriggering build for the last patch

        Enis Soztutar added a comment -

        OK, here is the latest version, with javadoc warnings cleared.

        Hadoop QA added a comment -

        -1, new javadoc warnings

        The javadoc tool appears to have generated warning messages when testing the latest attachment http://issues.apache.org/jira/secure/attachment/12363480/redesignToolAPI_v1.1.patch against trunk revision r564012.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/537/testReport/
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/537/console

        Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

        Enis Soztutar added a comment -

        Resubmitting the patch, since it passes all the tests on my local machine.

        Hadoop QA added a comment -

        -1, build or testing failed

        2 attempts failed to build and test the latest attachment http://issues.apache.org/jira/secure/attachment/12363480/redesignToolAPI_v1.1.patch against trunk revision r564012.

        Test results: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/536/testReport/
        Console output: http://lucene.zones.apache.org:8080/hudson/job/Hadoop-Patch/536/console

        Please note that this message is automatically generated and may represent a problem with the automation system and not the patch.

        Enis Soztutar added a comment -

        After no apparent feedback :) on the previous patch, here is finally the complete version.

        I've deprecated ToolBase, moved all functionality for parsing generic arguments to GenericOptionsParser, and reimplemented all the functionality of ToolBase#doMain() in ToolRunner. Classes that extend ToolBase can now implement Tool instead, and optionally extend Configured.
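
To illustrate the shape of the redesign described in this comment, here is a minimal, self-contained sketch. It does not use the Hadoop classes: `Conf`, `Tool`, `Configured`, `runTool`, and `EchoTool` are simplified, hypothetical stand-ins defined inline, and the option handling (storing `-fs`, `-jt`, and `-conf` values under the bare option name, and `run` declaring no checked exceptions) is an assumption made for brevity, not the behavior of the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ToolRunnerSketch {

    /** Stand-in for a configuration object: a plain string-to-string map. */
    static class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Stand-in for the Tool contract: receive a Conf, then run on the remaining args. */
    interface Tool {
        void setConf(Conf conf);
        Conf getConf();
        int run(String[] args);
    }

    /** Optional convenience base class that only stores the Conf. */
    static class Configured {
        private Conf conf;
        public void setConf(Conf conf) { this.conf = conf; }
        public Conf getConf() { return conf; }
    }

    /** Stand-in runner: strips generic options into the Conf, then calls the tool. */
    static int runTool(Conf conf, Tool tool, String[] args) {
        List<String> rest = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            boolean generic = args[i].equals("-fs") || args[i].equals("-jt")
                    || args[i].equals("-conf");
            if (generic && i + 1 < args.length) {
                // Assumption: store the value under the option name without the dash.
                conf.set(args[i].substring(1), args[++i]);
            } else {
                rest.add(args[i]);
            }
        }
        tool.setConf(conf);
        return tool.run(rest.toArray(new String[0]));
    }

    /** Example tool: implements Tool and extends Configured; no ToolBase-style superclass. */
    static class EchoTool extends Configured implements Tool {
        @Override
        public int run(String[] args) {
            System.out.println("fs=" + getConf().get("fs") + " args=" + String.join(",", args));
            return 0;
        }
    }

    /** Each tool can be made runnable with a main method that delegates to the runner. */
    public static void main(String[] argv) {
        int exit = runTool(new Conf(), new EchoTool(),
                new String[] {"-fs", "local", "input.txt"});
        System.out.println("exit=" + exit);
    }
}
```

The point of the split is that the tool only sees its own arguments, while generic option parsing lives in one shared place that any class can call without inheriting from a common base.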

        Enis Soztutar added a comment -

        redesignToolAndRelated_v1.0.patch

        This patch is intended for review. It implements the basic changes discussed in HADOOP-1425. This patch:

        1. deprecates ToolBase
        2. introduces a HadoopCommandLineParser class to parse command-line options
        3. adds a ToolRunner class to run classes implementing Tool

        Below are the points that I would be glad to have reviewed:
        1. naming of HadoopCommandLineParser (should it be CommandLineParser, ArgumentParser, etc.?)
        2. adding Tool.getOptions() and changing the signature to Tool#run(CommandLine). None of the actual use cases for ToolBase use Commons CLI.
        3. the -conf argument, as discussed in https://issues.apache.org/jira/browse/HADOOP-1425#action_12499035


          People

          • Assignee:
            Enis Soztutar
            Reporter:
            Enis Soztutar