
Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.0-alpha1
    • Fix Version/s: 3.0.0-alpha2
    • Component/s: scripts
    • Labels:
      None
    • Target Version/s:
    • Hadoop Flags:
      Incompatible change
    • Release Note:
      Users:
      * Ability to set per-command+sub-command options from the command line.
      * Makes daemon environment variable options consistent across the project. (See deprecation list below)
      * HADOOP\_CLIENT\_OPTS is now honored for every non-daemon sub-command. Prior to this change, many sub-commands did not use it.

      Developers:
      * No longer need to do custom handling for options in the case section of the shell scripts.
      * Consolidates all \_OPTS handling into hadoop-functions.sh to enable future projects.
      * All daemons running with secure mode features now get \_SECURE\_EXTRA\_OPTS support.

      \_OPTS Changes:

      | Old | New |
      |:---- |:---- |
      | HADOOP\_BALANCER\_OPTS | HDFS\_BALANCER\_OPTS |
      | HADOOP\_DATANODE\_OPTS | HDFS\_DATANODE\_OPTS |
      | HADOOP\_DN\_SECURE\_EXTRA\_OPTS | HDFS\_DATANODE\_SECURE\_EXTRA\_OPTS |
      | HADOOP\_JOB\_HISTORYSERVER\_OPTS | MAPRED\_HISTORYSERVER\_OPTS |
      | HADOOP\_JOURNALNODE\_OPTS | HDFS\_JOURNALNODE\_OPTS |
      | HADOOP\_MOVER\_OPTS | HDFS\_MOVER\_OPTS |
      | HADOOP\_NAMENODE\_OPTS | HDFS\_NAMENODE\_OPTS |
      | HADOOP\_NFS3\_OPTS | HDFS\_NFS3\_OPTS |
      | HADOOP\_NFS3\_SECURE\_EXTRA\_OPTS | HDFS\_NFS3\_SECURE\_EXTRA\_OPTS |
      | HADOOP\_PORTMAP\_OPTS | HDFS\_PORTMAP\_OPTS |
      | HADOOP\_SECONDARYNAMENODE\_OPTS | HDFS\_SECONDARYNAMENODE\_OPTS |
      | HADOOP\_ZKFC\_OPTS | HDFS\_ZKFC\_OPTS |

      Description

      Big features like YARN-2928 demonstrate that even senior level Hadoop developers forget that daemons need a custom _OPTS env var. We can replace all of the custom vars with generic handling just like we do for the username check.

      For example, with generic handling in place:

      | Old Var | New Var |
      |:---- |:---- |
      | HADOOP_NAMENODE_OPTS | HDFS_NAMENODE_OPTS |
      | YARN_RESOURCEMANAGER_OPTS | YARN_RESOURCEMANAGER_OPTS |
      | n/a | YARN_TIMELINEREADER_OPTS |
      | n/a | HADOOP_DISTCP_OPTS |
      | n/a | MAPRED_DISTCP_OPTS |
      | HADOOP_DN_SECURE_EXTRA_OPTS | HDFS_DATANODE_SECURE_EXTRA_OPTS |
      | HADOOP_NFS3_SECURE_EXTRA_OPTS | HDFS_NFS3_SECURE_EXTRA_OPTS |
      | HADOOP_JOB_HISTORYSERVER_OPTS | MAPRED_HISTORYSERVER_OPTS |

      This change:

      a) makes the variables consistent across the entire project
      b) makes them consistent for every subcommand
      c) eliminates almost all of the custom appending in the case statements

      It's worth pointing out that subcommands like distcp that sometimes need a higher than normal client-side heapsize or custom options are a huge win. Combined with .hadooprc and/or dynamic subcommands, it means users can easily do customizations based upon their needs without a lot of weirdo shell aliasing or one line shell scripts off to the side.
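As an illustration of the customization described above, here is a minimal sketch of a per-user fragment. It assumes the file is sourced by the Hadoop shell scripts before the subcommand options are assembled; the `-Xmx4g` value is purely illustrative, and only the variable names come from the table above.

```bash
# Hypothetical ~/.hadooprc fragment (values are illustrative only).
# Give distcp a larger client-side heap than other client commands,
# whether it is started as "hadoop distcp" or "mapred distcp".
HADOOP_DISTCP_OPTS="-Xmx4g"
MAPRED_DISTCP_OPTS="-Xmx4g"
```

The same variables can also be set for a single invocation by prefixing the command line, in the same way the `HADOOP_CLIENT_OPTS` example in the shell guide does.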

      1. HADOOP-13341.00.patch
        54 kB
        Allen Wittenauer

        Issue Links

          Activity

          aw Allen Wittenauer added a comment - - edited

          One of the unexpected challenges here is ordering of operations:

          • HADOOP_OPTS
          • HADOOP_(subcommand)_OPTS
          • HADOOP_CLIENT_OPTS

          ... is likely the ideal. But the way HADOOP_CLIENT_OPTS are appended makes this particularly tricky since it is done in the case statements. It might be better to pull that code out first, then deal with the various daemons.

          aw Allen Wittenauer added a comment -

          Turning this into an umbrella JIRA, will do the work in a branch.

          aw Allen Wittenauer added a comment -

          I'm opting instead to go with:

          • HADOOP_OPTS
          • HADOOP_CLIENT_OPTS
          • HADOOP_(subcommand)_OPTS

          with the thought that subcommand is more specific than CLIENT. It's trivial to flip (just swap two function calls in the commands) if anyone disagrees.
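The ordering chosen above can be sketched as follows. This is only an illustration under stated assumptions: `sketch_append_opts` is a hypothetical helper (not the function in hadoop-functions.sh), and `HDFS_DFS_OPTS` and all option values are made up for the example.

```bash
# Hypothetical helper: append a block of options to HADOOP_OPTS
# if, and only if, the block is non-empty.
sketch_append_opts() {
  local extra="$1"
  if [ -n "${extra}" ]; then
    HADOOP_OPTS="${HADOOP_OPTS:+${HADOOP_OPTS} }${extra}"
  fi
}

HADOOP_OPTS="-Djava.net.preferIPv4Stack=true"   # base options
HADOOP_CLIENT_OPTS="-Xmx1g"                     # generic client options
HDFS_DFS_OPTS="-Xmx2g"                          # illustrative per-subcommand options

sketch_append_opts "${HADOOP_CLIENT_OPTS}"      # less specific block first
sketch_append_opts "${HDFS_DFS_OPTS}"           # more specific block last

echo "${HADOOP_OPTS}"
# prints: -Djava.net.preferIPv4Stack=true -Xmx1g -Xmx2g
```

Swapping the two `sketch_append_opts` calls is all it would take to flip the order in this sketch, which mirrors the "swap two function calls" remark.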

          I've also hit an interesting dilemma in the form of the secure opts in HDFS. Opening a new subtask to keep track of that.

          aw Allen Wittenauer added a comment -

          Under MS-DOS/COMMAND.EXE, environment variables are not always case sensitive. It's going to be safer to uppercase subcommand in the env var. This also means that hdfs NAMENODE and hdfs namenode will need to be effectively the same command. :/
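A minimal sketch of the uppercasing idea, assuming the variable name is simply the command and subcommand joined with underscores plus an `_OPTS` suffix; `sketch_opts_varname` is a hypothetical name, not a function from the patch.

```bash
# Hypothetical sketch: derive the (command)_(subcommand)_OPTS variable
# name, uppercasing everything so that differently cased spellings of
# the same subcommand map to a single environment variable.
sketch_opts_varname() {
  local command="$1"
  local subcommand="$2"
  printf '%s_%s_OPTS\n' "${command}" "${subcommand}" | tr '[:lower:]' '[:upper:]'
}

sketch_opts_varname hdfs namenode   # prints: HDFS_NAMENODE_OPTS
sketch_opts_varname hdfs NAMENODE   # prints: HDFS_NAMENODE_OPTS
```

In bash, the value behind the derived name could then be read with indirect expansion (`${!varname}`).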

          aw Allen Wittenauer added a comment -

          I'm gonna pull the array-ification of the _OPTS vars into a separate patch. It's gonna end up being super huge.

          aw Allen Wittenauer added a comment -

          FWIW: this is basically down to documentation now. yay!

          aw Allen Wittenauer added a comment -

          -00:

          • test merge of branch against trunk
          hadoopqa Hadoop QA added a comment -
          +1 overall



          | Vote | Subsystem | Runtime | Comment |
          |:---- |:---- |:---- |:---- |
          | 0 | reexec | 0m 19s | Docker mode activated. |
          | +1 | @author | 0m 0s | The patch does not contain any @author tags. |
          | +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
          | 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
          | +1 | mvninstall | 6m 52s | trunk passed |
          | +1 | mvnsite | 9m 6s | trunk passed |
          | 0 | mvndep | 0m 15s | Maven dependency ordering for patch |
          | +1 | mvnsite | 8m 41s | the patch passed |
          | +1 | shellcheck | 0m 12s | There were no new shellcheck issues. |
          | +1 | shelldocs | 0m 9s | There were no new shelldocs issues. |
          | +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
          | +1 | unit | 2m 0s | hadoop-common in the patch passed. |
          | +1 | unit | 0m 48s | hadoop-hdfs in the patch passed. |
          | +1 | unit | 2m 41s | hadoop-yarn in the patch passed. |
          | +1 | unit | 0m 22s | hadoop-streaming in the patch passed. |
          | +1 | unit | 0m 20s | hadoop-distcp in the patch passed. |
          | +1 | unit | 0m 18s | hadoop-archive-logs in the patch passed. |
          | +1 | unit | 0m 22s | hadoop-rumen in the patch passed. |
          | +1 | unit | 0m 21s | hadoop-extras in the patch passed. |
          | +1 | unit | 0m 22s | hadoop-sls in the patch passed. |
          | +1 | unit | 2m 6s | hadoop-mapreduce-project in the patch passed. |
          | +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
          | | | 36m 28s | |



          | Subsystem | Report/Notes |
          |:---- |:---- |
          | Docker | Image:yetus/hadoop:9560f25 |
          | JIRA Issue | HADOOP-13341 |
          | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12826443/HADOOP-13341.00.patch |
          | Optional Tests | asflicense mvnsite unit shellcheck shelldocs |
          | uname | Linux f05c5a65d9d5 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
          | Build tool | maven |
          | Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
          | git revision | trunk / 20ae1fa |
          | shellcheck | v0.4.4 |
          | Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10429/testReport/ |
          | modules | C: hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs hadoop-yarn-project/hadoop-yarn hadoop-tools/hadoop-streaming hadoop-tools/hadoop-distcp hadoop-tools/hadoop-archive-logs hadoop-tools/hadoop-rumen hadoop-tools/hadoop-extras hadoop-tools/hadoop-sls hadoop-mapreduce-project U: . |
          | Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10429/console |
          | Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

          This message was automatically generated.

          stevel@apache.org Steve Loughran added a comment -

          I lack the bash skills to review it, but the overall design LGTM

          aw Allen Wittenauer added a comment -

          Thanks. I guess I'll just call for a vote and see what happens.

          githubbot ASF GitHub Bot added a comment -

          GitHub user aw-was-here opened a pull request:

          https://github.com/apache/hadoop/pull/126

          HADOOP-13341: Deprecate HADOOP_SERVERNAME_OPTS; replace with (command)_(subcommand)_OPTS

          You can merge this pull request into a Git repository by running:

          $ git pull https://github.com/apache/hadoop HADOOP-13341

          Alternatively you can review and apply these changes as the patch at:

          https://github.com/apache/hadoop/pull/126.patch

          To close this pull request, make a commit to your master/trunk branch
          with (at least) the following in the commit message:

          This closes #126


          commit f462c9d73467eec3be19830829005f1136577e0b
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-07-08T16:25:09Z

          HADOOP-13356. Add a function to handle command_subcommand_OPTS (aw)

          Signed-off-by: Allen Wittenauer <aw@apache.org>

          commit 0e011b79143e3fc2d9063b466bd17ecdfe7a9ac6
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-26T21:33:33Z

          HADOOP-13355. Handle HADOOP_CLIENT_OPTS in a function (aw)

          Signed-off-by: Allen Wittenauer <aw@apache.org>

          commit ff98786144ebc1d21127a13850bb943a76ceac01
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-28T17:45:00Z

          HADOOP-13358. Modify HDFS to use hadoop_subcommand_opts

          Signed-off-by: Allen Wittenauer <aw@apache.org>

          commit 6bc45fa90f509cae020eb70a4d540ae215893dc6
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T15:16:23Z

          HADOOP-13554. Add an equivalent of hadoop_subcmd_opts for secure opts (aw)

          commit d03c6f835dad5963e90d45e047077a7c270c1d5e
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T15:52:41Z

          HADOOP-13562. Change hadoop_subcommand_opts to use only uppercase

          commit 1a3e102cdf5be3b88c9101bfafe0189b9ed98165
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T19:42:32Z

          HADOOP-13357. Modify common to use hadoop_subcommand_opts

          commit 7f75b084ebb1630be935642f69033c91a242b171
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T20:03:45Z

          HADOOP-13359. Modify YARN to use hadoop_subcommand_opts

          commit 59b59ca8564a09c06e4c0b2aed73ef1e94c087dc
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T20:55:04Z

          HADOOP-13361. Modify hadoop_verify_user to be consistent with hadoop_subcommand_opts (ie more granularity)

          Signed-off-by: Allen Wittenauer <aw@apache.org>

          commit 930e03b71e779772bb75e15ed5503c7392a68fdd
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T22:11:24Z

          HADOOP-13564. modify mapred to use hadoop_subcommand_opts

          Signed-off-by: Allen Wittenauer <aw@apache.org>

          commit dc6d4900d270a714d05055b822983cd3198aba61
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-30T22:23:29Z

          HADOOP-13563. hadoop_subcommand_opts should print name not actual content during debug

          Signed-off-by: Allen Wittenauer <aw@apache.org>

          commit 5b7a3df75c6fc78793bf8638f3ddaa11f7e0658e
          Author: Allen Wittenauer <aw@apache.org>
          Date: 2016-08-31T14:39:34Z

          HADOOP-13360. Documentation for HADOOP_subcommand_OPTS

          Signed-off-by: Allen Wittenauer <aw@apache.org>


          chris.douglas Chris Douglas added a comment -

          Doesn't the base deprecation rewrite handle this case?

          +  hadoop_deprecate_envvar HADOOP_NFS3_SECURE_EXTRA_OPTS HDFS_NFS3_SECURE_EXTRA_OPTS
          

          +1 overall, though. Even if there are some corner cases it doesn't cover, this is a good base for alpha
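For readers unfamiliar with this kind of deprecation rewrite, here is a rough sketch of what such a shim might do. It is an assumption-laden illustration: `sketch_deprecate_envvar` is a hypothetical stand-in, and the real `hadoop_deprecate_envvar` in hadoop-functions.sh may behave differently.

```bash
# Hypothetical sketch of an environment-variable deprecation shim.
# If the old variable is set, warn on stderr and copy its value into
# the new variable; otherwise leave the new variable untouched.
sketch_deprecate_envvar() {
  local oldvar="$1"
  local newvar="$2"
  local oldval
  # Read the value of the variable whose name is stored in oldvar.
  eval "oldval=\${${oldvar}:-}"
  if [ -n "${oldval}" ]; then
    echo "WARNING: ${oldvar} has been replaced by ${newvar}. Using value of ${oldvar}." >&2
    # Assign to the variable whose name is stored in newvar.
    eval "${newvar}=\${oldval}"
  fi
}

HADOOP_NAMENODE_OPTS="-Xmx4g"   # illustrative value set under the old name
sketch_deprecate_envvar HADOOP_NAMENODE_OPTS HDFS_NAMENODE_OPTS
echo "${HDFS_NAMENODE_OPTS}"    # prints: -Xmx4g
```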

          githubbot ASF GitHub Bot added a comment -

          Github user steveloughran commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/126#discussion_r78158957

          --- Diff: hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md ---
          @@ -24,14 +24,26 @@ Apache Hadoop has many environment variables that control various aspects of the

           ### `HADOOP_CLIENT_OPTS`

          -This environment variable is used for almost all end-user operations. It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:
          +This environment variable is used for all end-user, non-daemon operations. It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:

          ```bash
          HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000" hadoop fs -ls /tmp
          ```

          will increase the memory and send this command via a SOCKS proxy server.

          +### `(command)_(subcommand)_OPTS`
          +
          +It is also possible to set options on a per subcommand basis. This allows for one to create special options for particular cases. The first part of the pattern is the command being used, but all uppercase. The second part of the command is the subcommand being used. Then finally followed by the string `_OPT`.
          +
          +For example, to configure `mapred distcp` to use a 2GB heap, one would use:
          +
          +```bash
          +MAPRED_DISTCP_OPTS="-Xmx2g"
          +```
          +
          +These options will appear after `HADOOP_CLIENT_OPTS` during execution and will generally take precedence.
          --- End diff --

          It might be good to add an example here, e.g. what the final options of distcp are going to be. Will there be two -Xmx options? If so, which wins? I suspect that's a JVM decision.

          stevel@apache.org Steve Loughran added a comment -

          +1

          githubbot ASF GitHub Bot added a comment -

          Github user aw-was-here commented on a diff in the pull request:

          https://github.com/apache/hadoop/pull/126#discussion_r78162457

          --- Diff: hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md ---
          @@ -24,14 +24,26 @@ Apache Hadoop has many environment variables that control various aspects of the

           ### `HADOOP_CLIENT_OPTS`

          -This environment variable is used for almost all end-user operations. It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:
          +This environment variable is used for all end-user, non-daemon operations. It can be used to set any Java options as well as any Apache Hadoop options via a system property definition. For example:

          ```bash
          HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000" hadoop fs -ls /tmp
          ```

          will increase the memory and send this command via a SOCKS proxy server.

          +### `(command)_(subcommand)_OPTS`
          +
          +It is also possible to set options on a per subcommand basis. This allows for one to create special options for particular cases. The first part of the pattern is the command being used, but all uppercase. The second part of the command is the subcommand being used. Then finally followed by the string `_OPT`.
          +
          +For example, to configure `mapred distcp` to use a 2GB heap, one would use:
          +
          +```bash
          +MAPRED_DISTCP_OPTS="-Xmx2g"
          +```
          +
          +These options will appear after `HADOOP_CLIENT_OPTS` during execution and will generally take precedence.
          --- End diff --

          If there is an Xmx in HADOOP_CLIENT_OPTS and an Xmx in MAPRED_DISTCP_OPTS, then the mapred distcp final HADOOP_OPTS will definitely have two Xmx flags. After HADOOP-13365, we'll be in a position to potentially de-dupe user provided settings like we do for other things. But until de-dupe, you're correct that it's a JVM decision. In the past, that decision has been last one wins and I doubt Oracle could change it if they wanted to at this point without major ramifications.
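The situation described in this comment can be sketched as follows. The sketch assumes the last-one-wins behavior reported above rather than verifying it against any JVM, and `sketch_last_xmx` is a hypothetical helper that is not part of Hadoop.

```bash
# Hypothetical helper: return the last -Xmx flag in an option string,
# i.e. the one that would take effect if the last occurrence wins.
sketch_last_xmx() {
  local opt
  local last=""
  # Intentional word splitting: options are separated by whitespace.
  for opt in $1; do
    case "${opt}" in
      -Xmx*) last="${opt}" ;;
    esac
  done
  printf '%s\n' "${last}"
}

HADOOP_CLIENT_OPTS="-Xmx1g -Dhadoop.socks.server=localhost:4000"
MAPRED_DISTCP_OPTS="-Xmx2g"

# Subcommand options are appended after the client options,
# so both -Xmx flags end up in the combined string.
combined="${HADOOP_CLIENT_OPTS} ${MAPRED_DISTCP_OPTS}"
echo "${combined}"
sketch_last_xmx "${combined}"   # prints: -Xmx2g
```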

          aw Allen Wittenauer added a comment -

          Doesn't the base deprecation rewrite handle this case?

          No, it doesn't. I opted to not cover the secure flags deprecation case at all because:

          • there are only two of them
          • they used different patterns
          • they only ever got used in HDFS

          This makes the secure opts handler a bit smaller than the insecure one as a result.

          aw Allen Wittenauer added a comment -

          FYI, rebased to fix merge conflicts with HDFS-10553.

          anu Anu Engineer added a comment -

          +1, LGTM

          githubbot ASF GitHub Bot added a comment -

          Github user asfgit closed the pull request at:

          https://github.com/apache/hadoop/pull/126

          aw Allen Wittenauer added a comment -

          Committed to trunk.

          Thanks all!

          hudson Hudson added a comment -

          SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10426 (See https://builds.apache.org/job/Hadoop-trunk-Commit/10426/)
          HADOOP-13341. Deprecate HADOOP_SERVERNAME_OPTS; replace with (aw: rev 58ed4fa5449872d65efd52d840f02dd60af2771a)

          • (edit) hadoop-yarn-project/hadoop-yarn/bin/yarn
          • (add) hadoop-common-project/hadoop-common/src/test/scripts/hadoop_add_client_opts.bats
          • (edit) hadoop-tools/hadoop-streaming/src/main/shellprofile.d/hadoop-streaming.sh
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs-config.sh
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HdfsNfsGateway.md
          • (add) hadoop-common-project/hadoop-common/src/test/scripts/hadoop_subcommand_opts.bats
          • (edit) hadoop-common-project/hadoop-common/src/site/markdown/ClusterSetup.md
          • (edit) hadoop-common-project/hadoop-common/src/main/conf/hadoop-env.sh
          • (edit) hadoop-mapreduce-project/conf/mapred-env.sh
          • (edit) hadoop-common-project/hadoop-common/src/main/bin/hadoop-functions.sh
          • (add) hadoop-common-project/hadoop-common/src/test/scripts/hadoop_subcommand_secure_opts.bats
          • (add) hadoop-common-project/hadoop-common/src/test/scripts/hadoop_verify_user.bats
          • (edit) hadoop-tools/hadoop-sls/src/main/bin/slsrun.sh
          • (edit) hadoop-tools/hadoop-archive-logs/src/main/shellprofile.d/hadoop-archive-logs.sh
          • (edit) hadoop-tools/hadoop-rumen/src/main/shellprofile.d/hadoop-rumen.sh
          • (edit) hadoop-common-project/hadoop-common/src/site/markdown/UnixShellGuide.md
          • (edit) hadoop-common-project/hadoop-common/src/main/bin/hadoop
          • (edit) hadoop-tools/hadoop-distcp/src/main/shellprofile.d/hadoop-distcp.sh
          • (edit) hadoop-tools/hadoop-extras/src/main/shellprofile.d/hadoop-extras.sh
          • (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/bin/hdfs
          • (edit) hadoop-mapreduce-project/bin/mapred
          • (edit) hadoop-mapreduce-project/bin/mapred-config.sh
          • (edit) hadoop-tools/hadoop-sls/src/main/bin/rumen2sls.sh

            People

            • Assignee:
              aw Allen Wittenauer
              Reporter:
              aw Allen Wittenauer
            • Votes:
              0
              Watchers:
              12
